Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
I made a TUI for OpenTofu (Terraform) provider registry

If you're like me, when developing Terraform code you often switch to your browser and google "terraform aws provider" or "terraform github provider" to browse available resources, their documentation, versions, etc. I hated that workflow and decided to fix it by creating a TUI that talks to the OpenTofu registry API (still compatible with Terraform). Now, whether you are a Vim, VSCode or IntelliJ user, you can use the terminal that's always nearby to look up exactly what you need.

GitHub: https://github.com/djetelina/tofuref
PyPI: https://pypi.org/project/tofuref/


Any feedback and suggestions are appreciated. While I was content enough with the current state to release it as 1.0, I'm sure there's more this tool could do :)

https://redd.it/1kqynmk
@r_devops
Feeling lost - don't know what to do with my career

Hi guys,
I am writing this post because I am lost about what to do with my career.

Small background:
I am 23, and 3 years ago, just after my first year at university, I started an internship at a big company, as I wanted to quickly gain some experience and internships at my college are obligatory anyway (studying Telecommunication engineering/CS).
As I was really devoted to the internship (Python developer), I took every extra task possible and tried to help with every interesting topic in sight, got very positive feedback, and stayed on.
With time my job quickly gravitated towards DevOps and more responsibilities, while I was still studying full time.

And here I am, after 3 years of studying full time, logging in to dailies and meetings in the breaks between lectures, and spending all my spare time doing homework after work or doing work after a day at university.
I barely finished my degree, after extending it by half a year.
Now, after pursuing my master's for half a year, I will probably have to start it again, as I have already failed most of my exams.
Things which used to be fun are now only a chore; I have to force myself to study anything after 8 hours at work. Even things that used to interest me.

Now I am staring at another failed Terraform pipeline, wondering how I ended up here. Something that was supposed to be a quick internship ended up being a full-time career.
But here is a trap which I don't know how to deal with: the job is well paid, much better than what any of my colleagues from uni make, the team is fine and I am really appreciated here. The problem is, I don't really like this kind of job. I always wanted to do something more "interesting", and this job is quite frustrating (continuous debugging, fixing pipelines and waiting ages for someone to do their tasks to unblock me (big company)).

I am feeling lost with next steps:

1. Taking a long break and focusing on uni.
2. Trying to focus on the job, hoping it will get better with more free time (but I am not sure if I will ever go for a master's degree if I skip it now...). Maybe DevOps isn't that bad and I would regret changing careers in the future?
3. Trying to join a company focused on my interests (space exploration, also programming), where I am past the first rounds of interviews and waiting for a decision. The catch is, it pays half the salary I make here.

https://redd.it/1kr0a51
@r_devops
Similar to cold start problem

My Spring Boot application takes 120s to start when a new pod is spawned in the Kubernetes cluster.

So I have to include a readiness probe, which is slowing down the load testing.

Am I missing something here? Can the Spring application start happen beforehand?
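
For readers sizing probes around a slow start like this: Kubernetes also offers a `startupProbe`, which holds back the other probes until it succeeds and tolerates roughly `failureThreshold * periodSeconds` of startup time. Below is a minimal sketch of that arithmetic in Python. The function name, the 5-second period and the 1.5x safety factor are assumptions for illustration, not values from the post, and this only sizes the probe; it does not make the application start any faster.

```python
import math


def startup_probe_settings(startup_seconds, period_seconds=5, safety_factor=1.5):
    """Return timing fields for a Kubernetes startupProbe.

    The probe allows about failureThreshold * periodSeconds of startup time
    before the container is restarted. The safety factor is an assumed margin.
    """
    budget = startup_seconds * safety_factor
    failure_threshold = math.ceil(budget / period_seconds)
    return {"periodSeconds": period_seconds, "failureThreshold": failure_threshold}


# 120 s observed start time, as described in the post
settings = startup_probe_settings(120)
print(settings)
```

The resulting values would go under `startupProbe` in the pod spec, which lets the readiness probe itself keep a short period once the application is up.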

https://redd.it/1kqxx9u
@r_devops
Built something to monitor and forecast API usage across providers like OpenAI — curious if other DevOps folks face this pain

Hey all,

I’ve been working on a side project to deal with a challenge I ran into while building with LLM APIs — tracking and forecasting usage across providers like OpenAI and Anthropic. Especially when running workloads at scale, it’s easy to lose visibility into token consumption, cost spikes, or quota limits.

The tool I’m building:
• Monitors real-time usage (tokens, credits, endpoint data)
• Alerts when you hit certain thresholds (like 80% of quota)
• Forecasts future usage based on historical trends
• And checks if providers are up/down before your workflows break
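
To make the threshold and forecasting bullets concrete, here is a rough sketch of how such checks could look. It is not the tool's actual implementation: the function names, the 80% default and the straight-line (least-squares) forecast are all assumptions chosen for illustration.

```python
def quota_alert(used, quota, threshold=0.8):
    """Return True once usage reaches the given fraction of the quota."""
    return used / quota >= threshold


def linear_forecast(daily_usage, days_ahead):
    """Extrapolate usage with a least-squares line through the daily values."""
    n = len(daily_usage)
    if n < 2:
        raise ValueError("need at least two data points to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_x
    # The last observed day has index n - 1
    return intercept + slope * (n - 1 + days_ahead)


tokens_per_day = [100, 200, 300, 400]  # made-up sample data
print(quota_alert(850, 1000))
print(linear_forecast(tokens_per_day, days_ahead=2))
```

A straight line is the simplest possible trend model; bursty or seasonal workloads would likely need something more robust.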

Would love to know:
Do any of you manage LLM or third-party API usage this way?
What tooling do you use today to keep track of spend and reliability?

Not trying to pitch anything — just genuinely curious how others are solving this in a DevOps environment, especially when infra teams are told to “make sure OpenAI doesn’t break production” 🙃

If you’re interested, I’m happy to share a link in the comments so you can try it out and give feedback. Thanks!

https://redd.it/1kr222o
@r_devops
Anybody here built their own K8s operator? If so, what was the use case?

I’m trying to expand my K8s knowledge and Go skills by figuring out some good use cases for creating my own operator.

So far, the only thing I could come up with is an operator that analyzes cluster event logs and offers up a report on security improvements, leveraging an AI API.

I would like to find something a bit more practical though.

https://redd.it/1kr2twg
@r_devops
Are my requests for compensation unreasonable?

Hello!

Looking to jump ship from a failing startup. I have 3.5 yrs of intimate DevOps experience and another 7ish with traditional Sysadmin/DBA knowledge. I'm the main IC of our team and also leading/managing. I'm looking for a new role (Senior DevOps, SRE or Cloud Platform), and my asks are:

* $170k or more (realistically it's a starting point and I would probably go down to $150k)
* 100% Remote
* Also my kube experience is somewhat limited outside of EKS :/

Am I asking for the world when I'm really not worth that? I haven't gotten a lot of traction on applications so far.

Here's a snip from my resume:

```
Core Competencies

Infrastructure Platforms: AWS, GCP, Linode, On-Premise & Co-Located Data Centers
IaC: Terraform, Terragrunt, CloudFormation, Ansible, Packer, AWS CLI/SDK
Monitoring & Observability: Datadog, Prometheus, Grafana, Loki, OpenSearch, ELK stack
Scripting & Automation: Python, Golang, Java, Bash, Lambda, Step Functions
Orchestration: EKS, Docker, Rancher, Helm, AWS ECS
CI/CD: CircleCI, GitHub Actions, AWS CodePipeline/Deploy/Build, Elastic Beanstalk, AWX, Packer
Web & Runtime Environments: Apache, PHP, Nginx, Traefik
Databases: PostgreSQL, MySQL, MongoDB, MSSQL, Oracle
Data Tools: Airflow (Astronomer), Snowflake, dbt
Compliance & Security: PCI, SOC2, AWS WAF, Cloudflare, Apache ModSecurity

Professional Experience
DevOps Engineering Manager | Oct 2024 – Present
DevOps Engineer | March 2022 – Oct 2024

Led and designed a full-scale cloud migration from a legacy hosting provider to AWS, establishing a secure, scalable multi-account architecture to support long-term growth and compliance.

Broke apart a tightly coupled monolith into containerized microservices deployed via Amazon ECS, improving deployment speed, fault isolation, and scalability.

Enabled developer self-service and infrastructure consistency by authoring reusable, opinionated Terraform modules for AWS resources.

Automated previously manual deployments by orchestrating CI/CD pipelines across CircleCI, GitHub Actions, and AWX, improving delivery speed and reliability.

Replaced a costly third-party WAF/CDN with a fully managed AWS WAF and CloudFront solution, saving over $125,000 annually without compromising security posture.

Reduced operational toil and unblocked engineering teams by writing targeted automation (scripts, Lambdas, monitoring hooks) to bridge platform gaps and streamline workflows.

Championed observability, compliance, and performance tuning efforts across dev, staging, and production environments, supporting both legacy systems and modern stacks.
```

https://redd.it/1kr4fia
@r_devops
Upcoming Grad wanting to get into Cloud or DevOps - I need resume help

Hey everyone!

I'm currently set to obtain a degree in Computer Science (Cloud Computing specialization) from my college, as I sought to direct my career trajectory towards IT roles related to cloud and DevOps (i.e. Cloud Support, SWE, DevOps Engineer, SRE, DevSecOps Engineer, etc.). Throughout my time, I've undertaken multiple projects that involved specific tools used by professionals (Terraform, Jenkins, Kubernetes, ArgoCD, AWS services, Prometheus, Grafana, etc.) or involved building different types of cloud infrastructures and web applications. I've added these projects to my resume which ran up to 2 pages, so I condensed it down to one page:

Resume: Current Resume

It's tough to gauge what the job market is like right now, but it seems quite tough to land interviews, despite the experience listed on my resume. For some reason, I feel as though both my work and project experiences appear to be... unimpressive, which has been pushing me to undertake more complex projects and even consider taking AWS certification exams. Networking is admittedly tough for me as well. The projects I've done were generally done with web servers launched from AWS, so I've been gradually rebuilding them so that I can include them in my GitHub repos.

Ultimately, I just feel stuck. I know resumes always have room for improvement, so I think there certainly must be something wrong with (or hindering) my resume. Can anyone help review my resume and share any suggestions, insights, or critiques you have? I would absolutely appreciate any advice!

https://redd.it/1kr2tr5
@r_devops
Backups for local code that devs might lose?

Before pushing to staging, which has to be authorized by Mr. Big Boss, these guys work on a trillion branches. I assume it's bad practice to push the non-CI branches; it seems like it would crowd the repo.

What happened is that one of our devs accidentally erased all his local files (git stash pop).

We went over his flow: he should first do git stash apply, and then manually dispose of the garbage at the end of the day. But these things can still happen.


So can you offer some best practices?

What I know so far:

1) git bundle, though I'm not sure exactly how to use it.

2) A backup repo for devs, without the whole code of the app, for tenacity (it contains sensitive code).

3) Simply toss the non-CI branches into the usual repo.
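
On option 1, `git bundle create <file> --all` packs the refs of a repository into a single file that can later be cloned or fetched from like a remote. A minimal Python wrapper might look like the sketch below; the function names and paths are hypothetical. Note that a bundle only contains committed work, so uncommitted changes in the working tree are not protected by it.

```python
import subprocess


def bundle_command(bundle_path):
    """Build the git command that packs all refs into one bundle file."""
    return ["git", "bundle", "create", str(bundle_path), "--all"]


def backup_repo(repo_dir, bundle_path):
    """Run the bundle command inside repo_dir; raises if git exits non-zero."""
    subprocess.run(bundle_command(bundle_path), cwd=repo_dir, check=True)


# Hypothetical usage, e.g. from a scheduled job on the dev machine:
# backup_repo("/home/dev/app", "/mnt/backup/app.bundle")
print(" ".join(bundle_command("app.bundle")))
```

Restoring is then a matter of `git clone app.bundle restored-app`, which keeps the backup separate from the shared repo and its branch list.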



https://redd.it/1kr747f
@r_devops
Is there demand in Europe for a tool that scans Kubernetes clusters for security and inefficiency?

I'm an engineer working on an idea for a new tool aimed at European companies running Kubernetes.

The goal is to automatically surface both security issues and inefficiencies in clusters. Things like overly permissive RBAC, missing network policies, or unsafe pod configurations. But also unused configmaps, idle workloads, or resource waste from overprovisioning.

Most of the tools I see today are US-based, which in the current climate can feel uneasy for European companies, e.g. looking at what happened with Microsoft banning accounts. What I have in mind is something you can self-host or run in a European cloud, with more focus on actionable findings and EU privacy laws.

I’m curious:
- What do you currently use to monitor this?
- Is this even a real problem in your day-to-day?
- Would you consider paying for something like this, or do you prefer building these checks in-house?

Happy to hear any and all feedback. Especially if you think this is already solved. That’s valuable input too.

https://redd.it/1krc7w5
@r_devops
What tools do you use for adhoc remote execution?

Question mainly concerned with cloud native deployments but could extend to onprem. For context, we have thousands of k8s and compute instances running in all public clouds, but this concerns orgs of any nontrivial scale.

Often in the course of automated or manual incident response, we'll want to run some (potentially distributed) operation, e.g.:

all clusters running workloadA --> execute shell command in a chosen pod, and potentially do something with the output (think lightweight dag workflow)
in all k8s where cluster name matches some pattern --> rollout restart sts in namespaceY
instances where cpu > 90% --> generate diagnostics and push to s3
list configmaps in aws us-east-1 with updated >= 7d

TLDR: query engine + workflow engine for cloud environments.
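
As a toy illustration of the "query engine + workflow engine" idea, the sketch below selects targets from an inventory with a predicate and applies an action to each match. Everything in it is hypothetical: the `Instance` fields, the in-memory inventory and the action are stand-ins for real cloud APIs and remote execution.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    cloud: str
    cpu_percent: float


def run_on_matching(inventory, predicate, action):
    """Apply action to every item matching predicate; return results by name."""
    return {item.name: action(item) for item in inventory if predicate(item)}


# Made-up inventory; a real one would come from cloud or cluster APIs
inventory = [
    Instance("web-1", "aws", 95.0),
    Instance("web-2", "gcp", 40.0),
    Instance("db-1", "aws", 91.5),
]

results = run_on_matching(
    inventory,
    predicate=lambda i: i.cpu_percent > 90,
    action=lambda i: f"diagnostics requested for {i.name}",
)
print(results)
```

The hard parts the post is asking about (credentials, fan-out across clouds, retries, audit trails) live inside the inventory source and the action, which is where the vendored tools differ.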

What tool(s) are you using to solve this? If vendored (Datadog Workflow Automation, PD Runbook Automation), is your team happy with it?

https://redd.it/1krdlb9
@r_devops
Discussion: On running Cypress tests when code is currently split into multiple repos (frontend and backend) & also for each pull request from those repos

Hello,

I am trying to fulfill a technical design requirement and I think I have a way but want to ask here (hoping I can find better options):

Current setup: I have frontend and backend repos. The code gets deployed on a k8s cluster, and then we update Cypress with the Ingress URL (after frontend and backend are deployed with ingress) for running the tests.

We use GitHub Action Workflows as our CI (and ArgoCD as CD, which is not a topic in this conversation).

Ask: We need ephemeral envs where, for each PR (from either repo), Cypress runs. But for Cypress to run the end-to-end tests, it needs both a working frontend and a working backend (with ingress).

What I came up with here is:

* For each PR (for example frontend PR), I can label with the {pr_name} and deploy a copy of the backend deployment and pass the payload to cypress and vice-versa.
* But with this approach, I need to add the kustomize yaml files of both frontend and backend into my GitHub Action workflows in the Cypress tests.
* Is this the best approach? Can I make it better than this approach?
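
One small piece of this that is easy to pin down is naming: if every PR gets a namespace and ingress host derived from the repo and PR number, the Cypress job can compute the URL on its own instead of receiving it as a payload. The sketch below assumes a wildcard domain (`preview.example.com`) and a naming scheme that are purely illustrative. Kubernetes namespace names must be valid DNS labels (lowercase alphanumerics and `-`, at most 63 characters), which is what the cleanup enforces.

```python
import re


def pr_namespace(repo, pr_number, max_len=63):
    """Derive a DNS-label-safe namespace name such as 'frontend-pr-42'."""
    suffix = f"-pr-{pr_number}"
    repo_part = re.sub(r"[^a-z0-9]+", "-", repo.lower()).strip("-")
    # Truncate the repo part, not the suffix, so PR numbers never collide
    repo_part = repo_part[: max_len - len(suffix)].rstrip("-")
    return f"{repo_part}{suffix}"


def cypress_base_url(namespace, domain="preview.example.com"):
    """Build the ingress URL Cypress should test (the domain is an assumption)."""
    return f"https://{namespace}.{domain}"


namespace = pr_namespace("Frontend_App", 42)
print(namespace, cypress_base_url(namespace))
```

The computed URL can then be handed to Cypress as its `baseUrl` for that run.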



**On the side (I also):**

I also have a working CI/CD integration with these separate repos: when a PR is created, a CI job in those repos hands over the built Docker SHA to the kustomize modules repo, and in that repo I have an ArgoCD Pull Request Generator waiting to consume it and deploy a new namespace based on the PR_LABEL that I already set.



I am all ears on how the community has approached these design setups 🙋🏻‍♂️🙋🏻‍♂️

Cheers!!

https://redd.it/1krf463
@r_devops
Need feedback on "Fantastic Job Finder 2000"

Hey r/devops,

I've been looking for work for almost a year now, and out of utter boredom, hacked together a tiny open-source "tool" (if you could call it that):

Parses a YAML profile → searches boards, google etc. → asks ChatGPT to re-order a résumé for each posting
Keeps facts honest by only re-phrasing what’s in the YAML,
Spits out an ATS-friendly Markdown/PDF.
Digs up any dirt it can find on a company and advises of it. Layoffs, high turnover, displeasure with management, etc.

Repo: **https://github.com/vsysio-bgould/jobhunt**

I’d love eyes on the prompt design / YAML schema.

What’s missing for a DevOps résumé?
Too opinionated on cloud separation? Would I even be considered for an Azure role, seeing as I only know AWS?
Ideas to slap a UI on this thing?
Does YAML make sense for this prompt?

Since I've been using it, my response rate has gone up ten-fold. I've had 3 interviews this week already. I was lucky to get one a month before.

And yeah, I know the name is cheesy. I'm bad with names.

Has anybody tried this approach before for their job search? Any suggestions to improve it?

Also, does it make sense for me to keep excluding US jobs, since I'm Canadian? Since all this tariffs nonsense began, I've had exactly 0 US employers or recruiters reach out to me, despite sending 300+ applications.

https://redd.it/1krfws2
@r_devops
Advantages of running own Kubernetes cluster on a rented server?

My organization is pushing for renting servers and installing and maintaining our own kubernetes cluster instead of paying for a managed kubernetes cluster. I simply don't see the point in installing and maintaining it ourselves, anyone?

https://redd.it/1krdpit
@r_devops
What does devops/ cloud infrastructure look like in the finance sector?

Curious as I’ve always wanted to work for a bank/ fintech

https://redd.it/1kri1t6
@r_devops
Collective Consciousness Simulator

The following Google Colab notebook contains the first Collective Consciousness Simulator. It can be used, distributed, improved, and expanded collectively in any way.

The collective expansion of this simulator could achieve a level of significance comparable to that of ChatGPT. But it is very hard to start the process, so please follow the link and leave me a comment.

Link: https://colab.research.google.com/drive/1t4GkKnlD3U43Hu0pwCderOVAEwz25hnn?usp=sharing

https://redd.it/1krny6f
@r_devops
Docker Command Tips & Tricks for Everyday DevOps Work!

Hey everyone 👋

If you're working with containers regularly and want to boost your Docker command-line game, I put together a collection of **handy Docker tricks** that can save time and reduce headaches.

🔹 What’s inside:

* 🔁 Re-run previous containers quickly
* 🧹 Clean up dangling images and volumes
* 🧪 Run one-off commands without writing Dockerfiles
* 📂 Copy files in/out of running containers
* 🚀 Performance tips for faster image builds

Whether you're a beginner or a seasoned DevOps engineer, I’m sure you’ll find at least one command that makes your workflow smoother.

📘 Check it out:
👉 https://devopshunter.blogspot.com/2022/07/docker-command-tricks-tips.html

Would love to hear what tricks you use that aren’t as well-known!

https://redd.it/1kropbg
@r_devops
Configuration Variables

All my company's applications are configuration driven. At the moment we use Azure DevOps for CI/CD.

However, the library groups are awful, have no auditing, and have grown out of hand. What are your methods for handling mass configuration? My idea was to have a configuration repo which the applications can pull in and use.

If you have any advice, please share!
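
One common shape for a configuration repo is a shared base file plus per-environment overrides, merged at deploy time. Git history then provides the audit trail the library groups lack. The sketch below shows only the merge step; the keys and values are invented, and a real setup would load these dictionaries from files in the repo.

```python
def deep_merge(base, override):
    """Return a new dict where override wins, merging nested dicts recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Invented example data standing in for base.yaml and prod.yaml
base = {"log_level": "info", "db": {"host": "localhost", "pool_size": 5}}
prod = {"db": {"host": "db.internal"}, "feature_x": True}

config = deep_merge(base, prod)
print(config)
```

Secrets are usually kept out of such a repo and referenced by name instead.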

https://redd.it/1krpa76
@r_devops
Elasticsearch Labs

Hi all, can someone point me in the right direction so I can prepare myself for an interview that wants Elasticsearch experience? Platforms like KodeKloud don't have labs for it, unfortunately. Thanks!

https://redd.it/1krqzg4
@r_devops
MS Teams chat bot

Hi guys,
We’re investigating if it’s possible to build a bot which communicates certain Kubernetes actions from Teams to a private AKS cluster.

In our current situation we have a Golang bot running in an Azure Container App which is connected to Slack, and this works perfectly. The communication works via WebSocket, which makes it quite easy to arrange. But to my understanding MS Teams does not support this. My knowledge of Teams is quite basic, so I’m wondering if it’s even possible to rewrite this for Teams.

Slack is being replaced by Teams in my organisation (unfortunately), hence the use case. I’m curious if someone has done this before and what their experience was like.

Thanks guys!

https://redd.it/1krr2qd
@r_devops
Vibe Coding is great until it's not... How are you tackling this challenge personally or in your team?

I promise I’m not turning this into a “back in my day” rant, but things just working is becoming rare. Only 3–4 years ago things were basic, but bugs were rare to experience. Yesterday, I was drafting an email in Gmail when suddenly the Send, BCC and Discard buttons just wouldn’t click, and entire lines of text duplicated themselves out of nowhere.

With the pace of software updates, shrinking dev cycles, and now this thing folks call “vibe coding,” it feels like on-call nightmares are staging a comeback.... only this time, nobody truly knows what they’re on call for 😭. Vibe coding can crank out features fast, but pushing it live without understanding its quirks (or owning up when something breaks) strikes me as downright reckless.

Back in the day, on-call meant a team of engineers who knew every corner of the codebase. Now? It feels like handing the keys to a car nobody’s test-driven. Sure, 100% unit test coverage looks great on paper, but it’s not the same as real world, black-box, user-centered validation.

So I’m curious: how are you folks testing or validating “vibe code” in your shops? Have you seen similar random tech gremlins, or is it just my luck? Let’s compare war stories—maybe there’s a better way to keep our digital lives from glitching into chaos.

https://redd.it/1krt8bo
@r_devops
I really hate working in tech but can't do anything else

I've been a Dev for over 20 years with some exposure to DevOps. I really hate everything about it - the people, the "culture", AI. I've gotten to the point where I can barely make myself go into work or even feign the slightest bit of interest / effort each day. Just doing the bare minimum to pass myself.

Anyone else feel like this? What are other potential careers where someone with a tech background can look to switch to? Literally anything would be better than this grey blandness.

https://redd.it/1krtx2h
@r_devops