Helm gets messy fast — how do you keep your charts maintainable at scale?
One day you're like “cool, I just need to override this value.”
Next thing, you're 12 layers deep into a chart you didn’t write… and staging is suddenly on fire.
I’ve seen teams try to standardize Helm across services — but it always turns into some kind of chart spaghetti over time.
Anyone out there found a sane way to work with Helm at scale in real teams?
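One thing that tends to help once you are many layers deep is seeing which layer actually set a value. Below is a minimal Python sketch, not Helm itself: it mimics the way layered values are combined (maps merged key by key, later layers winning, scalars and lists replaced whole) and records which layer last wrote each key. The layer names and values are hypothetical, and Helm details such as deleting a key with `null` are left out.

```python
import copy


def _forget(origin, path):
    """Drop origin records for `path` and anything nested under it."""
    for known in [p for p in origin if p == path or p.startswith(path + ".")]:
        del origin[known]


def _merge(dst, src, layer, prefix, origin):
    for key, value in src.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Maps are merged key by key; create the map if it is not there yet.
            if not isinstance(dst.get(key), dict):
                _forget(origin, path)
                dst[key] = {}
            _merge(dst[key], value, layer, path, origin)
        else:
            # Scalars and lists are replaced whole by the later layer.
            _forget(origin, path)
            dst[key] = copy.deepcopy(value)
            origin[path] = layer


def merge_layers(layers):
    """Merge (name, values) layers in order; return (merged values, origin per key)."""
    merged, origin = {}, {}
    for name, values in layers:
        _merge(merged, values, name, "", origin)
    return merged, origin


# Hypothetical layers: chart defaults, a shared org file, one environment file.
layers = [
    ("chart defaults", {"image": {"tag": "1.0", "pullPolicy": "IfNotPresent"}, "replicas": 1}),
    ("org base", {"image": {"pullPolicy": "Always"}, "replicas": 2}),
    ("staging", {"image": {"tag": "1.1-rc1"}}),
]
merged, origin = merge_layers(layers)
for path, layer in sorted(origin.items()):
    print(f"{path} <- {layer}")
```

Printing the origin map next to the rendered values makes "who overrode this?" a lookup instead of an archaeology session.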
https://redd.it/1mhben0
@r_devops
Reddit
From the devops community on Reddit
Explore this post and more from the devops community
Careers UK?
I've had a couple of job offers but nothing major in the past few months. With 2 years of experience, I reckon I could get around £60k.
LinkedIn and Indeed just aren't cutting it for me anymore. I've also found that applying directly to companies gives me more success than recruiters reaching out about FinTech jobs all the time. What do people in the UK use to look for jobs?
https://redd.it/1mhp1em
@r_devops
Generalize or Specialize?
There is a question I keep coming back to:
"Should I generalize or specialize as a developer?"
I say "developer" to cover all kinds of tech-related domains (I guess DevOps also counts :D just kidding). What is your point of view on that? Are you sticking more or less to your own domain? Or are you branching out to every interesting GitHub repo you can find and jumping right into it?
https://redd.it/1mhsle9
@r_devops
Anyone found a stable way to run GPU inference on AWS without spot interruptions?
We’re running LLM inference on AWS with a small team and hitting issues with spot reclaim events. We’ve tried capacity-optimized ASGs, fallbacks, even checkpointing, but it still breaks when latency matters.
Reserved Instances aren't flexible enough for us, and on-demand pricing is tough.
Just wondering: is there a way to stay on AWS, get some price relief, and still keep workloads stable?
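For the reclaim events themselves, one small piece that can be tested offline is the decision "do I still have time to finish this request?". The sketch below is Python and only parses the JSON document that the EC2 instance metadata service serves at `spot/instance-action` once an interruption is scheduled; fetching it, and wiring the result into a load balancer or queue, is left out. The function names and the sample payload are hypothetical.

```python
import json
from datetime import datetime, timezone


def parse_spot_notice(body: str):
    """Return (action, interruption time in UTC) from an instance-action document."""
    doc = json.loads(body)
    when = datetime.strptime(doc["time"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return doc["action"], when


def seconds_left(body: str, now: datetime) -> float:
    """Seconds between `now` and the scheduled interruption."""
    _, when = parse_spot_notice(body)
    return (when - now).total_seconds()


def can_finish(body: str, now: datetime, expected_request_seconds: float) -> bool:
    """True if a request of the given expected duration should complete in time."""
    return seconds_left(body, now) > expected_request_seconds


# Hypothetical notice and clock reading, for illustration only.
sample = '{"action": "terminate", "time": "2025-08-05T10:02:00Z"}'
now = datetime(2025, 8, 5, 10, 0, 30, tzinfo=timezone.utc)
print(seconds_left(sample, now))
```

A worker that polls for the notice can use a check like `can_finish` to stop pulling new inference requests and let in-flight ones drain instead of dying mid-response.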
https://redd.it/1mhu165
@r_devops
Most common startup problem: wanting to rotate a secret, but not knowing where that secret actually exists across the codebase.
Does any paid or free tool in the appsec space offer a solution for this?
We recently integrated this feature into the DefendStack-Suite asset inventory; we were just trying to solve a problem for one startup.
https://redd.it/1mi072h
@r_devops
GitHub
GitHub - Defendstack/DefendStack-Suite: Open source defense for your entire stack
Indexing issue on my Laravel website
Hey everyone, I’ve recently launched a website built with Laravel, but I'm facing issues with getting it indexed by Google. When I search, none of the pages appear in the search results. I’ve submitted the site in Google Search Console and even tried the URL inspection tool, but it still won’t index. I’ve checked my robots.txt file and meta tags to make sure I’m not accidentally blocking crawlers, and I’ve also generated a proper sitemap using Spatie’s Laravel Sitemap package. The site returns a 200 status code and appears to be mobile-friendly. Still, nothing shows up in the index. Has anyone faced similar issues with Laravel SEO or indexing? Any advice or fixes would be appreciated!
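A quick way to double-check the robots.txt part is to run the file's contents through a parser rather than reading it by eye. This is a minimal Python sketch using the standard library's `urllib.robotparser`; the robots.txt text and the example.com URLs are placeholders, and it only covers robots.txt rules, not `noindex` meta tags or response headers.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls, agent: str = "Googlebot"):
    """Return the URLs that the given robots.txt text disallows for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


# Placeholder robots.txt and URLs, for illustration only.
robots_txt = """\
User-agent: *
Disallow: /admin
Disallow: /storage
"""
urls = [
    "https://example.com/",
    "https://example.com/admin/login",
    "https://example.com/blog/post-1",
]
print(blocked_urls(robots_txt, urls))
```

If every public URL comes back as allowed, robots.txt is probably not the culprit and the next things to look at are the per-page signals.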
https://redd.it/1mi1fzd
@r_devops
Manager gave bad reviews for getting too involved in code level details
So basically what the title says: my manager gave me a 3/5 rating on satisfaction, and his remark was that I get involved in code-level details, which is the work of the developers. What even is DevOps then?? Why the fuck wouldn't I check the code to get an overall understanding of the project? Later, if anything goes wrong in deployment, they'll blame the DevOps people. Idk man, my company has a totally different understanding of what DevOps means and hardly includes me in regular project meetings. To be clear, I don't mess with the code; I just ask questions related to the app logic or something necessary for the pipeline or cloud infra.
https://redd.it/1mi275k
@r_devops
We built software that lets you shut down your unused non-prod environments!
I am so excited to introduce **ZopNight** to the Reddit community.
It's a simple tool that connects to your cloud accounts and lets you shut off your non-prod cloud environments when they're not in use (especially during non-working hours).
It's straightforward and can genuinely take a big chunk off your cloud bill.
I’ve seen so many teams running sandboxes, QA pipelines, demo stacks, and other infra that they only need during the day. But they keep them running 24/7. Nights, weekends, even holidays. It’s like paying full rent for an office that’s empty half the time.
A screenshot of ZopNight's resources screen
Most people try to fix it with cron jobs or the schedulers that come with their cloud provider. But they usually only cover some resources, they break easily, and no one wants to maintain them forever.
This is ZopNight's resource scheduler
That’s why we built **ZopNight**. No installs. No scripts.
Just connect your AWS or GCP account, group resources by app or team, and pick a schedule like “8am to 8pm weekdays.” You can drag and drop to adjust it, override manually when you need to, and even set budget guardrails so you never overspend.
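To make a schedule like "8am to 8pm weekdays" concrete, here is a minimal Python sketch of the underlying check. It is an illustration only and not how ZopNight is implemented; the function names are made up, and times are treated as naive local time.

```python
from datetime import datetime, time


def should_be_running(now: datetime, start: time = time(8), stop: time = time(20),
                      weekdays=range(0, 5)) -> bool:
    """True if `now` falls inside the schedule (Monday is weekday 0)."""
    return now.weekday() in weekdays and start <= now.time() < stop


def weekly_uptime_fraction(start_hour: int = 8, stop_hour: int = 20, days: int = 5) -> float:
    """Share of the 168-hour week during which resources stay on."""
    return (stop_hour - start_hour) * days / 168


print(should_be_running(datetime(2025, 8, 5, 9, 0)))   # a Tuesday morning
print(round(weekly_uptime_fraction(), 3))
```

With that schedule the resources are on for 60 of the 168 hours in a week, which is where the savings on anything billed by the hour would come from.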
Do comment if you want support for OCI and Azure. We would love to work with you to improve our product.
We're also proud to share that one of our first users, a huge FMCG company based in Asia, scheduled 192 resources across 34 groups and 12 teams with ZopNight. They're now saving around $166k every month, a whopping 30 percent of their entire cloud bill. That's about $2M a year in savings. It took them about 5 minutes to set up their first scheduler and about half a day to set up the whole thing.
This is a beta screen, coming soon for all users!
It doesn’t take more than 5 mins to connect your cloud account, sync up resources, and set up the first scheduler. The time needed to set up the entire thing depends on the complexity of your infra.
If you’ve got non-prod infra burning money while no one’s using it, I’d love for you to try ZopNight.
I’m here to answer any questions and hear your feedback.
We are currently running a waitlist that provides lifetime access to the first 100 users. Do try it. We would be happy for you to pick the tool apart, and help us improve! And if you can find value, well nothing could make us happier!
**Try ZopNight today!**
https://redd.it/1mi2gmx
@r_devops
zop.dev
ZopNight — Manage Smarter | ZopDev
Cut cloud costs by up to 60% with intelligent scheduling, resource optimization, and real-time visibility across AWS, GCP, and Azure.
DevOps just got easier: Free AI agent for system & cloud configs - try it out!
Hey everyone,
I’m excited to share Configen – a fully free AI agent designed to automate and simplify configuration across PCs and cloud environments. Configen acts as your personal AI assistant for managing configs, automating workflows, and keeping your system in top shape with minimal manual effort.
I’m looking for:
Feedback – what sucks, what’s missing, what’s cool?
A technical cofounder (if you’re into AI/automation)
Anyone who wants to test or help out!
Let's connect!
https://redd.it/1mi500f
@r_devops
Looking for a technical Co-Founder (AI Stripe Extension for startups)
I am looking for a co-founder.
Project: AI Stripe Extension for startups
Requirements:
- Over 25 years old
- From Europe or North America
- Software developer
- At least one presentable project with users
- Extensive experience with Stripe pricing integration
DM me for further details.
Thanks
https://redd.it/1mi6l3y
@r_devops
How to test serverless apps like AWS Lambda Functions
We have a data-syncing pipeline from Postgres (AWS Aurora) to AWS OpenSearch: Debezium (CDC) -> Kafka (MSK) -> AWS Lambda -> AWS OpenSearch.
We have some complex logic in the Lambda, which is written in Python. It contains multiple functions and connects to services like Postgres (AWS Aurora), AWS OpenSearch, and Kafka (MSK). Right now, whenever we update the Lambda code, we re-upload it. We want to add unit and integration tests for this Lambda code, but we are new to testing serverless applications.
From what I've gathered so far, we can test locally by mocking the other AWS services used in the code. Emulators are an option, but they might not be up to date and can differ from the actual production environment.
Is there a better way or process to unit and integration test these Lambda functions? Any suggestions would be helpful.
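One common pattern for the unit-test layer is to keep the handler logic free of real connections and pass the clients in, so a test can hand it a fake. The Python sketch below assumes the Lambda receives the usual MSK event shape (base64-encoded record values grouped under `records`) and that Debezium emits JSON with a `payload` envelope; the index name, topic name, and `handle` function are hypothetical, and the `index`/`delete` calls simply mirror what a search client typically exposes.

```python
import base64
import json
from unittest.mock import MagicMock


def handle(event: dict, search_client, index: str = "items") -> int:
    """Apply Debezium change events from an MSK-style event to a search index."""
    applied = 0
    for records in event["records"].values():
        for record in records:
            if record.get("value") is None:
                continue  # defensive skip for records without a body
            change = json.loads(base64.b64decode(record["value"]))["payload"]
            if change["op"] == "d":
                search_client.delete(index=index, id=change["before"]["id"])
            else:  # "c" create, "u" update, "r" snapshot read
                doc = change["after"]
                search_client.index(index=index, id=doc["id"], body=doc)
            applied += 1
    return applied


def msk_event(*changes) -> dict:
    """Build a fake MSK event carrying the given Debezium payloads."""
    def encode(change):
        return base64.b64encode(json.dumps({"payload": change}).encode()).decode()
    return {"eventSource": "aws:kafka",
            "records": {"db.public.items-0": [{"value": encode(c)} for c in changes]}}


# Unit test with a mocked client: no AWS access needed.
client = MagicMock()
event = msk_event(
    {"op": "c", "before": None, "after": {"id": 1, "name": "a"}},
    {"op": "d", "before": {"id": 2}, "after": None},
)
applied = handle(event, client)
client.index.assert_called_once_with(index="items", id=1, body={"id": 1, "name": "a"})
client.delete.assert_called_once_with(index="items", id=2)
```

The real `lambda_handler` then becomes a thin wrapper that builds the actual client and calls `handle`, and integration tests can focus on that wrapper against a real or containerized backend.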
https://redd.it/1mi7doj
@r_devops
We spent weeks debugging a Kubernetes issue that ended up being a “default” config
Sometimes the enemy is not complexity… it’s the defaults.
Spent 3 weeks chasing a weird DNS failure in our staging Kubernetes environment. Metrics were fine, pods healthy, logs clean. But some internal services randomly failed to resolve names.
Guess what? The root cause: kube-dns had a low CPU limit set by default, and under moderate load it silently choked. No alerts. No logs. Just random resolution failures.
Lesson: always check what’s “default” before assuming it's sane. Kubernetes gives you power, but it also assumes you know what you’re doing.
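For this kind of silent choking, the throttling counters the kernel keeps for a container are often the only place it shows up. Here is a minimal Python sketch that turns the contents of a cgroup `cpu.stat` file into a throttle ratio; the sample text is made up, and where the file lives (typically somewhere under `/sys/fs/cgroup` inside the container) depends on the cgroup version.

```python
def throttle_ratio(cpu_stat_text: str) -> float:
    """Fraction of CFS enforcement periods in which the container was throttled."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0


# Made-up sample in the cpu.stat layout, for illustration only.
sample = "nr_periods 1200\nnr_throttled 300\nthrottled_time 45000000000\n"
print(throttle_ratio(sample))
```

A ratio that stays well above zero while CPU usage graphs look "fine" is a hint that the limit, not the workload, is the problem.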
Anyone else lost weeks to a dumb default config?
https://redd.it/1mi8xp6
@r_devops
I started posting on youtube and an unexpected outcome has been some of the easiest interviews of my career
https://youtu.be/mE8Xmkk_qSw
https://redd.it/1mi6jdo
@r_devops
YouTube
The New Way To Land A Software Engineer Job in 2025
Cloud Engineering Freelancing: https://cloudzap.co or [email protected]
Subscribe: https://www.youtube.com/@joshgeissler?sub_confirmation=1
In this video I walk through my strategy for standing out from the rest of the software engineer industry, positioning…
Give me a real, structured roadmap for DevOps
So I know basic MERN, I'm in my 4th year, and I'm slowly realising how messed up the SDE and developer role is, so I'm thinking of quietly shifting towards a DevOps role.
I need a roadmap I can follow to learn it in about 2-3 months.
I am hardworking and have the time.
Help me PLEASE!
https://redd.it/1mic1ae
@r_devops
Anyone built dev self-service around Terraform and Atmos?
We are redoing our Terraform across our services, starting by creating centralized Terraform modules (instead of the copy-paste we have today).
I want to take it one step further and introduce Atmos to abstract the Terraform away as YAML, and then maybe build some sort of self-service utility that generates that YAML and a PR depending on what infrastructure the developer needs.
Is anyone doing something similar?
Thanks.
https://redd.it/1mib9xg
@r_devops
Sam Lambert (PlanetScale) is an interesting guy
Didn't expect to like his answers, but he talks about the free tier being deleted and some other interesting stuff
https://www.youtube.com/watch?v=BzKUm2pJchI
https://redd.it/1miidg4
@r_devops
YouTube
Sam Lambert (CEO @ PlanetScale) on building tools developers actually trust
What defines a truly great developer experience?
Sam Lambert is the CEO at PlanetScale, building the next-generation cloud database. Previously Sam was Vice President of Engineering at GitHub, where he was responsible for scaling the company and culture…
How do you all handle automatic version increments? (dev vs release)
Our company uses GitHub and has Branch Protection enabled across all of our organizations, enterprise-wide. Branch Protection is a new requirement, so the old versioning flow is broken. I've inherited a legacy Python application and I'm feeling REALLY stupid this morning for some reason.
Previously, Jenkins would kick off a release.sh script which would (in addition to lots of other stuff) run "bumpversion" (strips .dev from the version for the release), push to master, and then run bumpversion again to increment to the next .dev version. With BP enabled, this is no longer a workable flow, so I need to come up with a workaround.
I'd prefer not to do the versioning manually, but if I must, I must.
How have you all tackled semver increments during releases? I could write a custom app that would bump the release version, automatically create a new PR for master, and then bump it back to .dev, at which point I'd have to go approve the PR, but that seems like overkill for some reason.
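For reference, the two version steps described above are small enough to express as pure functions, which makes them easy to run from whatever job ends up opening the PR. This is a minimal Python sketch, not bumpversion's own logic; it assumes versions look like `X.Y.Z.dev` or `X.Y.Z.devN`, and the choice to bump the patch number and emit `.dev0` is an assumption.

```python
import re

# X.Y.Z with an optional ".dev" or ".devN" suffix (assumed format).
_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)(\.dev\d*)?$")


def release_version(dev_version: str) -> str:
    """Strip the .dev suffix: '1.4.2.dev0' -> '1.4.2'."""
    match = _VERSION.match(dev_version)
    if not match or not match.group(4):
        raise ValueError(f"not a dev version: {dev_version!r}")
    return ".".join(match.group(1, 2, 3))


def next_dev_version(released: str) -> str:
    """Bump the patch number and re-add the suffix: '1.4.2' -> '1.4.3.dev0'."""
    match = _VERSION.match(released)
    if not match or match.group(4):
        raise ValueError(f"not a release version: {released!r}")
    major, minor, patch = (int(part) for part in match.group(1, 2, 3))
    return f"{major}.{minor}.{patch + 1}.dev0"


print(release_version("1.4.2.dev0"), next_dev_version("1.4.2"))
```

With the logic isolated like this, the release job only has to commit the result to a branch and open a PR, rather than pushing to the protected branch directly.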
https://redd.it/1mijq97
@r_devops
Introducing the latest release of the tAI tool
It helps you quickly get the commands you don't remember for your daily work.
https://github.com/bjarneo/tAI
https://redd.it/1mijfmq
@r_devops
GitHub
GitHub - bjarneo/tAI: tAI is an AI terminal assistant CLI that helps you with Linux and macOS commands.
What makes devs happy
Curious, what keeps devs motivated and excited? Some devs aren’t as performant as others.
https://redd.it/1miumva
@r_devops
High latency serving my HF model on Kubernetes (NVIDIA T4)
To be honest, I don't know where else to take this issue.
First of all, my infra:
3 CPU workers - N4-Standard-4 (4 vCPUs / 16 GB)
1 GPU worker - NVIDIA T4 (4 vCPUs / 16 GB)
I'm running three microservices (grounding, API layer, and RabbitMQ management) plus a "GPU consumer" service backed by an NVIDIA T4.
• On dedicated VMs I see about 1.5 seconds round-trip per request, but when I move everything into Kubernetes it never drops below 2.5 seconds.
• I've already tried co-locating the API and inference containers in a single pod with hostNetwork, but the latency stays the same.
• There are no CPU or memory limits in play (pods barely reach 40-50%).
• On the first API call the GPU consumer loads the model, which takes around 8-10 seconds to get a response back (expected); after that it stabilizes at 2.5 seconds.
This happens on self-hosted k3s on GCP VMs as well as on GKE.
This is more or less how it looks:
Client > API > RabbitMQ > Grounding Consumer > RabbitMQ > GPU Consumer
Batch processing works wonders, since we don't care about latency there at all, but stw it seems impossible.
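With a chain like the one above, a per-hop breakdown usually narrows things down faster than the end-to-end number. Here is a minimal Python sketch: each service stamps the message as it passes through, and the deltas between consecutive stamps show where the extra second goes. The stage names and timings are invented for illustration, and stamps taken on different nodes are only comparable if their clocks are in sync.

```python
def hop_latencies(stamps: dict, order: list) -> dict:
    """Seconds spent between each pair of consecutive stages."""
    return {f"{a} -> {b}": round(stamps[b] - stamps[a], 3)
            for a, b in zip(order, order[1:])}


# Invented stage names and timestamps (seconds since the API received the request).
order = ["api_received", "grounding_start", "grounding_done",
         "gpu_start", "gpu_done", "api_replied"]
stamps = {"api_received": 0.00, "grounding_start": 0.35, "grounding_done": 0.60,
          "gpu_start": 1.05, "gpu_done": 2.30, "api_replied": 2.50}

latencies = hop_latencies(stamps, order)
slowest = max(latencies, key=latencies.get)
for hop, seconds in latencies.items():
    print(f"{hop}: {seconds}s")
```

Hops like `grounding_done -> gpu_start` are time spent waiting in the queue rather than computing, so comparing those between the VM and Kubernetes setups separates broker and network overhead from inference time.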
Thx!
https://redd.it/1mix0oi
@r_devops
DevOps Freelancing in Europe or US
I am curious to know what the market looks like currently for freelancing in the field of DevOps in Europe, especially Germany.
https://redd.it/1mix9wb
@r_devops