Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
One-time payment vs. subscription 🔥 what actually makes more money?

I built a habit-tracking app and launched it six months ago. Initially, I made it a one-time purchase for $9.99. Sales were okay, but nothing crazy. Recently, I switched to a $3.99/month subscription model, and suddenly my revenue is way higher... even with fewer purchases.

But now I’m getting tons of complaints from users who bought it before and feel “cheated.” Some are leaving 1-star reviews, and I feel like I burned my early adopters.

Did I screw up? Should I have offered lifetime access at a higher price? If you’ve switched models before, what worked best for you?
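
For a rough sense of the numbers in the post, here is a minimal break-even sketch. It only uses the two prices quoted above and deliberately ignores store fees, churn, refunds, and trials, so treat it as back-of-the-envelope arithmetic rather than a revenue model. The `breakEvenMonths` helper is hypothetical.

```typescript
// Rough break-even sketch: how many whole months of subscription revenue
// it takes to reach the price of one one-time purchase.
// Assumption: no store fees, churn, refunds, or free trials.
function breakEvenMonths(oneTimePrice: number, monthlyPrice: number): number {
  if (oneTimePrice <= 0 || monthlyPrice <= 0) {
    throw new Error("prices must be positive");
  }
  // Smallest whole number of months whose revenue reaches the one-time price.
  return Math.ceil(oneTimePrice / monthlyPrice);
}

// Prices from the post: $9.99 once vs. $3.99 per month.
const months = breakEvenMonths(9.99, 3.99); // 3
```

Under these assumptions, a subscriber who stays three months has already paid more than a one-time buyer, which is consistent with revenue rising on fewer purchases.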

https://redd.it/1iwyszm
@r_devops
Why pay $150 per parallel e2e test, am I missing something?

Sharding Playwright across a few runners isn't particularly tricky, so I'm confused about how Sauce Labs and BrowserStack can charge $150 per parallel test in their virtual cloud. That's not even on real devices.

Is there something I'm missing that makes this appealing? Maybe it's only relevant for bigger test suites for reasons I haven't encountered yet.
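
To make "sharding across a few runners" concrete, here is a minimal sketch of the idea: split the test files into N groups and give each runner one group. Playwright handles this itself through its `--shard=x/y` CLI flag; the `shardFiles` helper and the file names below are hypothetical, and the round-robin split is an assumption for illustration, not a description of how Playwright divides tests internally.

```typescript
// Hypothetical helper: assign test files to shards round-robin.
// shardIndex is 1-based to mirror the "x/y" form of Playwright's --shard flag.
function shardFiles(files: string[], shardIndex: number, shardTotal: number): string[] {
  if (shardTotal < 1 || shardIndex < 1 || shardIndex > shardTotal) {
    throw new Error(`invalid shard ${shardIndex}/${shardTotal}`);
  }
  // Sort first so every runner computes the same split independently.
  const sorted = [...files].sort();
  return sorted.filter((_, i) => i % shardTotal === shardIndex - 1);
}

// Hypothetical test files split across two runners.
const specFiles = [
  "login.spec.ts",
  "cart.spec.ts",
  "search.spec.ts",
  "checkout.spec.ts",
  "profile.spec.ts",
];
const shard1 = shardFiles(specFiles, 1, 2); // cart, login, search
const shard2 = shardFiles(specFiles, 2, 2); // checkout, profile
```

Each runner only needs its own index and the total, so no coordination service is required.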

https://redd.it/1ix065e
@r_devops
Just tried a new profiler: what would you optimize first?

I was looking for better ways to debug performance bottlenecks and came across a new profiling tool that just dropped on GitHub. Decided to test it out on one of our services, and the results were... eye-opening.

The flame graph it generated (screenshot attached) revealed:

- A DB operation consuming way more resources than expected... we thought it was optimized, but apparently not.

- Some unexpected runtime garbage collection overhead that wasn’t on our radar at all.

For those who’ve worked with flame graphs before, where would you start optimizing? Do I tackle the DB queries first or look at memory management?

Screenshot is attached here: https://drive.google.com/file/d/1QZJHtEyRxDr2LfIW8VIDVD6sZwokCneo/view?usp=sharing

https://redd.it/1ix0wfc
@r_devops
GitHub Actions, Pulumi GCP, Artifact Registry and Docker - Cannot perform an interactive login from a non TTY device

Hi everyone! [I'm cross-posting](https://stackoverflow.com/questions/79463461/github-actions-pulumi-gcp-artifact-registry-and-docker-cannot-perform-an-int) from Stack Overflow.



I'm using Pulumi in GitHub Actions to deploy to GCP's Artifact Registry with Workload Identity Federation. When it reaches Pulumi's code to push to artifact registry I receive:


```
docker:image:Image temporal-worker-dev {"Client":{"Platform":{"Name":"Docker Engine - Community"},"Version":"26.1.3","ApiVersion":"1.45","DefaultAPIVersion":"1.45","GitCommit":"b72abbb","GoVersion":"go1.21.10","Os":"linux","Arch":"amd64","BuildTime":"Thu May 16 08:33:35 2024","Context":"default"},"Server":{"Platform":{"Name":"Docker Engine - Community"},"Components":[{"Name":"Engine","Version":"26.1.3","Details":{"ApiVersion":"1.45","Arch":"amd64","BuildTime":"Thu May 16 08:33:35 2024","Experimental":"false","GitCommit":"8e96db1","GoVersion":"go1.21.10","KernelVersion":"6.8.0-1021-azure","MinAPIVersion":"1.24","Os":"linux"}},{"Name":"containerd","Version":"1.7.25","Details":{"GitCommit":"bcc810d6b9066471b0b6fa75f557a15a1cbf31bb"}},{"Name":"runc","Version":"1.2.4","Details":{"GitCommit":"v1.2.4-0-g6c52b3f"}},{"Name":"docker-init","Version":"0.19.0","Details":{"GitCommit":"de40ad0"}}],"Version":"26.1.3","ApiVersion":"1.45","MinAPIVersion":"1.24","GitCommit":"8e96db1","GoVersion":"go1.21.10","Os":"linux","A
docker:image:Image temporal-worker-dev error: Error: Cannot perform an interactive login from a non TTY device
docker:image:Image temporal-worker-dev docker login failed
docker:image:Image remix-app-dev error: Error: Cannot perform an interactive login from a non TTY device
docker:image:Image remix-app-dev docker login failed
pulumi:pulumi:Stack alertdown-infra-dev running error: an unhandled error occurred: program failed:
docker:image:Image remix-app-dev **failed** 1 error
docker:image:Image temporal-worker-dev **failed** 1 error
pulumi:pulumi:Stack alertdown-infra-dev **failed** 1 error
Diagnostics:
docker:image:Image (remix-app-dev):
error: Error: Cannot perform an interactive login from a non TTY device
docker:image:Image (temporal-worker-dev):
error: Error: Cannot perform an interactive login from a non TTY device
pulumi:pulumi:Stack (alertdown-infra-dev):
error: an unhandled error occurred: program failed:
waiting for RPCs: docker login failed with error: exit status 1
```

I have two Docker containers, and this is my workflow YAML:

```
name: Deploy to Staging
on:
  push:
    branches:
      - main
permissions:
  actions: read
  contents: read
  id-token: write
jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: oven-sh/setup-bun@v2
      - uses: pnpm/action-setup@v4
        with:
          version: 9
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: 'pnpm'
      - name: Install dependencies
        run: pnpm install --frozen-lockfile
      - name: Build affected apps
        run: pnpm exec nx affected -t build

  deploy:
    runs-on: ubuntu-latest
    environment: staging
    needs: [ci]
    steps:
      - uses: actions/checkout@v4
      - name: Create .env file
        run: |
          cat << EOF > libs/infrastructure/src/pulumi/.env
          PULUMI_MAIN_SERVICE_ACCOUNT_STAGING="${{ secrets.PULUMI_MAIN_SERVICE_ACCOUNT_STAGING }}"
          PULUMI_WORKLOAD_IDENTITY_PROVIDER_ID_STAGING="${{ secrets.PULUMI_WORKLOAD_IDENTITY_PROVIDER_ID_STAGING }}"
          PULUMI_DOPPLER_REMIX_PROJECT="remix-app"
          PULUMI_DOPPLER_REMIX_STAGING_TOKEN="${{ secrets.PULUMI_DOPPLER_REMIX_STAGING_TOKEN }}"
          PULUMI_DOPPLER_REMIX_STAGING_BRANCH_NAME="stg"
          PULUMI_DOPPLER_TEMPORAL_PROJECT="temporal-worker"
          PULUMI_DOPPLER_TEMPORAL_STAGING_TOKEN="${{ secrets.PULUMI_DOPPLER_TEMPORAL_STAGING_TOKEN }}"
          PULUMI_DOPPLER_TEMPORAL_STAGING_BRANCH_NAME="stg"
          PULUMI_DOPPLER_CLOUD_RUN_REMIX_STAGING_TOKEN="${{ secrets.PULUMI_DOPPLER_CLOUD_RUN_REMIX_STAGING_TOKEN }}"
          PULUMI_DOPPLER_CLOUD_RUN_TEMPORAL_STAGING_TOKEN="${{ secrets.PULUMI_DOPPLER_CLOUD_RUN_TEMPORAL_STAGING_TOKEN }}"
          EOF

      - name: Configure Workload Identity Federation
        id: auth
        uses: google-github-actions/auth@v2
        with:
          workload_identity_provider: ${{ secrets.GCP_STAGING_WORKLOAD_IDENTITY_PROVIDER_ID }}
          project_id: ${{ secrets.GCP_STAGING_PROJECT_ID }}
          service_account: [email protected]
          token_format: 'access_token'

      - name: Set up Cloud SDK
        uses: google-github-actions/setup-gcloud@v2

      - name: Configure Docker for Artifact Registry
        run: |
          gcloud auth configure-docker us-east1-docker.pkg.dev

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to Artifact Registry
        uses: docker/login-action@v3
        with:
          registry: us-east1-docker.pkg.dev
          username: oauth2accesstoken
          password: ${{ steps.auth.outputs.access_token }}

      - name: Run Pulumi
        uses: pulumi/actions@v6
        with:
          work-dir: 'libs/infrastructure/src/pulumi'
          command: 'up'
          stack-name: 'alertdown/alertdown-infra/dev'
          comment-on-pr: true
        env:
          PULUMI_ACCESS_TOKEN: ${{ secrets.PULUMI_ACCESS_TOKEN }}

```

I've verified that my service account has the right permissions, and that the `google-github-actions/auth@v2` works correctly.



Any ideas? I don't know what else to try.
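
One thing that may be worth ruling out: Docker typically reports "Cannot perform an interactive login from a non TTY device" when `docker login` is invoked without a usable password and falls back to prompting. The Pulumi program is not shown in the post, so the following is only a sketch under that assumption. `registryFromEnv`, `RegistryCredentials`, and the `GCP_ACCESS_TOKEN` variable name are hypothetical; the `server`/`username`/`password` field names are modelled on what I believe the `registry` argument of `@pulumi/docker`'s `Image` resource accepts, and should be checked against the provider version in use.

```typescript
// Hypothetical shape for registry credentials; field names are an assumption
// modelled on the `registry` argument of @pulumi/docker's Image resource.
interface RegistryCredentials {
  server: string;
  username: string;
  password: string;
}

// Build credentials from an environment-like map and fail fast when the
// token is missing, instead of letting `docker login` fall back to a prompt.
function registryFromEnv(
  env: Record<string, string | undefined>,
  server: string,
): RegistryCredentials {
  // GCP_ACCESS_TOKEN is a hypothetical variable name for this sketch.
  const token = env["GCP_ACCESS_TOKEN"]?.trim();
  if (!token) {
    throw new Error("GCP_ACCESS_TOKEN is empty; docker login would try to prompt");
  }
  // "oauth2accesstoken" is the username the workflow above already uses.
  return { server, username: "oauth2accesstoken", password: token };
}

// Example with a placeholder token value.
const creds = registryFromEnv(
  { GCP_ACCESS_TOKEN: "placeholder-token" },
  "us-east1-docker.pkg.dev",
);
```

If this is the cause, the token from `steps.auth.outputs.access_token` (already used by the `docker/login-action` step) would also need to reach the Pulumi step, for example through its `env:` block, so that the program has a non-empty password to hand to the registry configuration.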

https://redd.it/1ix1bf8
@r_devops
Inexpensive managed code repos

Hey all,

I'm a CIO for a manufacturing firm. They have a couple of engineers who have asked me to spin up code repo infrastructure for storing some code and config files. Nothing serious, and they don't publish any public-facing apps. I have no intention of spending an inordinate amount of money.

That said, I want these repos to be owned by the organization while the engineers retain full control over creating and managing them. There should be very little IT support cost, if any; it just needs to be owned by IT so that a terminated employee's code and configs aren't lost to time.

We use exclusively Microsoft services, so I was thinking potentially Azure Repos or GitHub. I have an absolute requirement for Entra SSO, but otherwise this will be a simple Git server. That said, what solution would be best for us?

Sorry if I seem exceedingly unfamiliar - I am! I don't typically work with firms that have any software dev capabilities.

https://redd.it/1ix490z
@r_devops
Is Product Hunt rigged? Some products start with 50 votes, is that normal?

Hey everyone,
I posted my product today on Product Hunt and I’ve been working hard to create hype around it on X, LinkedIn, and Reddit. However, looking at the graph, I noticed something odd—some products seem to get 50 votes or more right from the start, while mine (and others) had to build up votes over time. It’s pretty clear that some products are boosting votes or starting with 50 votes out of nowhere.

Is this normal? How do some products get such a big initial push while others, like mine, don’t get the same? Any thoughts on this?

Thanks for your input!

https://drive.google.com/file/d/1QRt8PnAfN8lWeLD4S6v3TKbIyDwuL7hv/view?usp=sharing
This is the graph of the votes.

https://redd.it/1ix5xgc
@r_devops
Do companies hire freshers for DevOps?

Do companies hire newcomers who have no job experience in DevOps but have built some impressive DevOps projects?

https://redd.it/1ix5a7g
@r_devops
Has anyone taken the CKA after the 18th Feb changes?

Hello everyone, my exam is scheduled for 2nd March. Can anyone who has taken the new exam share their experience?
Thanks

https://redd.it/1ix17lw
@r_devops
Ente: Self Host the Google Photos Alternative and Own Your Privacy

Hey folks,

After seeing too many half-baked self-hosting guides that leave out crucial production details, I decided to write a comprehensive guide on deploying Ente (an end-to-end encrypted Google Photos alternative) using Kubernetes.

What's covered:

- Full K8s deployment manifests with Kustomize
- Automated Docker image builds with GitHub Actions
- Frontend deployment to GitHub Pages
- Proper secrets management with External Secrets Operator
- Production-ready PostgreSQL setup using CloudNative PG operator
- Complete IaC using OpenTofu (Terraform)

No fluff, no basic tutorials - just practical, production-ready code that you can adapt for your setup.

All configurations are available in the post, and I've included detailed explanations for the important bits.

https://developer-friendly.blog/blog/2025/02/24/ente-self-host-the-google-photos-alternative-and-own-your-privacy/

Happy to answer any questions or discuss alternative approaches!

https://redd.it/1ix6zo8
@r_devops
How are your API Manager instances managed from an organizational standpoint?

Loaded question, but I'm interested in how Azure API Management, API gateways, etc. are managed within your organization. I have the most experience with Azure APIM, so I may use APIM constructs that may or may not translate to the equivalent AWS or GCP services. Generally, I see two parts. One is onboarding the infrastructure: deploying APIM with Terraform and ensuring TLS and network connectivity are good to go. Then things get a bit spicy.


- Global Policies, subscriptions, and general architecture

- Application Team onboarding processes (API Ops)



Just curious if you have a single team that manages all aspects of APIM or if there's a shared responsibility model?



https://redd.it/1ix6rlo
@r_devops
Looking for an Open-Source Logging Solution with S3 + Parquet + Querying Support

Hey everyone!

We're currently using OpenSearch for logging, but we frequently need to access older logs. We're looking for an open-source solution that can store logs in AWS S3 in Parquet format while still allowing us to query them directly from S3.

Additionally, we sometimes need to perform upserts on logs, which is much easier in S3 compared to OpenSearch, where it can take days to process.

Any recommendations?
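
Whichever tool ends up writing the Parquet files, query engines that read directly from S3 generally benefit from a partitioned key layout so they can skip irrelevant objects. A minimal sketch of Hive-style partitioned keys follows; the `logObjectKey` helper, the `logs` prefix, and the `service`/`dt`/`hour` partition names are all assumptions for illustration, not something the post specifies.

```typescript
// Hypothetical helper: build a Hive-style partitioned S3 object key
// (key=value path segments) for one Parquet file of logs.
function logObjectKey(prefix: string, service: string, timestamp: Date, fileId: string): string {
  const iso = timestamp.toISOString(); // always UTC, e.g. 2025-02-24T13:05:00.000Z
  const dt = iso.slice(0, 10); // YYYY-MM-DD
  const hour = iso.slice(11, 13); // HH
  return `${prefix}/service=${service}/dt=${dt}/hour=${hour}/${fileId}.parquet`;
}

const key = logObjectKey("logs", "api", new Date("2025-02-24T13:05:00Z"), "part-0001");
// key === "logs/service=api/dt=2025-02-24/hour=13/part-0001.parquet"
```

With a layout like this, a query for one day of one service only has to list and read the objects under a single prefix.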

https://redd.it/1ixg9k7
@r_devops
Ephemeral environment companies that support docker compose and helm?

We have about 20-30 microservices and use k8s to run our production environment and a single staging environment. Sharing that staging environment has been an issue for a while, but it’s getting really bad: contention over the single DB is the biggest problem.

We tried to build our own ephemeral environments (preview apps) before, but my company didn’t give us enough time or resources for it, so we ended up with a half-finished project: a total waste of time and money.

Is anyone using release.com or Qovery? We met with both, and Release seems to support our more complicated environment: docker compose, microservices, and Helm. They also support production or staging database clones with RDS and Cloud SQL, which seems really cool compared with seed data or managing it ourselves.

Anyone have experience with either company? Trying to make a decision soon. Any other companies we should look at?

https://redd.it/1ixgkic
@r_devops
Just Got My GCP Professional Cloud DevOps Engineer Cert! 🎉🎉

Super hyped to share that I’m now a Google Cloud Professional Cloud DevOps Engineer!
It’s been a wild ride learning all things CI/CD, automation, and SRE, but totally worth it.

https://redd.it/1ixiute
@r_devops
Managing TLS cert on k8s with subdomain concern

Hello everyone,

I have a question regarding TLS certificate management in Kubernetes. I understand that Cert-Manager uses the ACME protocol to automate certificate issuance and renewal. However, I currently obtain my certificates manually—generating a private key, creating a CSR, and submitting it to a certificate provider for issuance.

# My Setup:

My team manages the domain [abc.example.com](https://abc.example.com), meaning we control everything under this subdomain.
I obtained a wildcard certificate for *.abc.example.com, which I use for all services and Ingress resources in my development environment.

# My Questions:

1. Can Cert-Manager effectively manage wildcard certificates for third-level subdomains?
Since my company uses Entrust as our certificate provider, I found that they offer an ACME server that supports HTTP-01 and DNS-01 challenges.
Would it make sense to configure a ClusterIssuer in Cert-Manager to handle my wildcard certificate (*.abc.example.com)?
2. How does the ACME challenge work for third-level subdomains?
If I create a ClusterIssuer for [abc.example.com](https://abc.example.com), can I request a certificate for [uptime.abc.example.com](https://uptime.abc.example.com) directly by listing that host under `tls.hosts` in the Ingress (snippet below)?
Or do I need to start at the root domain (example.com) level and work down from there?

  tls:
  - hosts:
    - "uptime.dev-k8s.med.ubc.ca"
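
On the wildcard question, one detail is easy to check in code: a wildcard name such as `*.abc.example.com` stands in for exactly one DNS label, so it covers `uptime.abc.example.com` but not `abc.example.com` itself or deeper names like `a.b.abc.example.com`. A minimal sketch of that matching rule (the `wildcardCovers` helper is hypothetical and not part of cert-manager):

```typescript
// Returns true when `host` is covered by a certificate name such as
// "*.abc.example.com". A wildcard stands in for exactly one DNS label.
function wildcardCovers(certName: string, host: string): boolean {
  const cert = certName.toLowerCase();
  const name = host.toLowerCase();
  if (!cert.startsWith("*.")) {
    return cert === name; // non-wildcard names must match exactly
  }
  const suffix = cert.slice(1); // e.g. ".abc.example.com"
  if (!name.endsWith(suffix)) {
    return false;
  }
  const label = name.slice(0, name.length - suffix.length);
  // The wildcard must match one non-empty label with no further dots.
  return label.length > 0 && !label.includes(".");
}

const coversUptime = wildcardCovers("*.abc.example.com", "uptime.abc.example.com"); // true
const coversNested = wildcardCovers("*.abc.example.com", "a.b.abc.example.com"); // false
const coversApex = wildcardCovers("*.abc.example.com", "abc.example.com"); // false
```

So a single wildcard certificate is enough for every service directly under the team's subdomain, whether it is issued manually or by an issuer.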



# Seeking Recommendations:

For those managing Kubernetes TLS certificates at a similar subdomain level, how do you approach this?

Do you use Cert-Manager to automate third-level subdomain certificates, or do you prefer manual wildcard certs?
What’s your recommended best practice to simplify certificate management for a company-wide setup?

Looking forward to your insights and discussion. Thanks in advance for your responses!

https://redd.it/1ixisb1
@r_devops
Why do you want to become a DevOps engineer?

To all of you out there who would like to become DevOps engineers: why so?

I’m seeing a lot of questions in this subreddit like “how do I switch from X to DevOps” or “how do I learn quickly how to do DevOps”, etc.

Are there any particular aspects you find tempting? Money? Cloud? Automation?

Let’s discuss!

https://redd.it/1ixpft7
@r_devops
GitHub Enterprise Service Account Management

Looking for some guidance on what others are doing in enterprise environments for the service accounts they use with GitHub. As a security leader, I would normally want to make service accounts passwordless and use keys I can auto-rotate, but GitHub doesn't seem to allow for this. Additionally, if we enable full SCIM and team sync with Okta, the options seem even more limited.

I feel like this particular situation must have been dealt with countless times, but I cannot find good documentation or guidance anywhere.

Basically this is what I am looking to do:

1) Have automated provisioning with SSO enabled for GitHub

2) Allow for use cases with Terraform scripts that need to use GitHub accounts, but secure these properly. Preferably with secrets/keys rather than passwords, but I'm open to any solution that can be auto-rotated to minimize credential exposure.

Has anyone found a good solution for this yet?

https://redd.it/1ixp6a7
@r_devops
Document pipelines

Hi,

I would like to know if it's possible to document CI/CD pipelines.

Are there any best practices?

How can they best be represented using standardized techniques?

I would like to hear your views, or about any practices you have in place.

https://redd.it/1ixrj8t
@r_devops
2,160 DevOps jobs scraped from corporate websites (hiring.cafe)

I got sick and tired of ghost jobs & 3rd party offshore agencies on LinkedIn & Indeed. So I wrote a script that fetches jobs from 30k+ company websites' career pages and uses ChatGPT to extract relevant information (e.g. salary, remote, etc.) from job descriptions. You can use it here: (HiringCafe). Here is a filter for DevOps jobs (2,160 and counting). I'm also scraping every company page 3x/day, so the results will stay fresh if you check back the next day.

Hope this tool is useful! Please lmk how I can improve it. You can follow my progress on r/hiringcafe

Here is my technique for doing so (for analytics nerds):

(1) Identify a list of verified companies: I use Apollo.io to identify companies that could be hiring. I wrote a web crawler that crawls each corporate page, and then used ChatGPT o1-mini to classify each page (binary classification) as containing a job description or not. If it contains a job description, I add it to a list and proceed to step 2.

(2) After playing with ChatGPT's API, I realized that you can effectively dump raw job descriptions (in HTML) and ask it to give you formatted information back in JSON (e.g. salary, yoe, etc.). I used this technique to scrape 1.6 million jobs (including ~50k remote jobs) and built powerful filters.

(3) Once I had the structured JSON data (containing salary, years of experience, remote status, job title, company name, location, and other relevant fields) from ChatGPT's extraction process, I needed a robust search engine to allow users to query and filter jobs efficiently. I chose Elasticsearch due to its powerful full-text search capabilities, filtering, and aggregation features. I built a simple UI on top of this using React/Tailwind.
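
Steps (2) and (3) can be sketched roughly as follows: take the JSON text the model returns for each job description, keep only records that have the expected fields, and filter on them. This is a minimal sketch, not the actual HiringCafe code; the `Job` fields, the `parseJob` helper, and the sample responses are all hypothetical, and the in-memory filter only stands in for what the post does with Elasticsearch.

```typescript
// Hypothetical record shape for one extracted job.
interface Job {
  title: string;
  company: string;
  remote: boolean;
  salaryMin: number | null;
}

// Parse one model response and keep it only if it has the expected fields.
function parseJob(raw: string): Job | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // the model returned something that is not valid JSON
  }
  if (typeof data !== "object" || data === null) {
    return null;
  }
  const record = data as Record<string, unknown>;
  const title = record["title"];
  const company = record["company"];
  const remote = record["remote"];
  const salaryMin = record["salaryMin"];
  if (typeof title !== "string" || typeof company !== "string" || typeof remote !== "boolean") {
    return null;
  }
  // Salary is optional in this sketch; anything non-numeric becomes null.
  return { title, company, remote, salaryMin: typeof salaryMin === "number" ? salaryMin : null };
}

// Made-up responses, including one that is not JSON at all.
const responses = [
  '{"title":"DevOps Engineer","company":"Acme","remote":true,"salaryMin":120000}',
  "not json",
  '{"title":"Designer","company":"Acme","remote":false}',
];
const jobs = responses.map(parseJob).filter((j): j is Job => j !== null);
const remoteDevOps = jobs.filter((j) => j.remote && /devops/i.test(j.title));
```

Validating before indexing keeps malformed model output from ever reaching the search layer.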

https://redd.it/1ixsdz1
@r_devops