Reddit DevOps
266 subscribers
30.9K links
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Download Telegram
Essential Kubernetes Design Patterns

As Kubernetes becomes the go-to platform for deploying and managing cloud-native applications, engineering teams face common challenges around reliability, scalability, and maintainability.

In my latest article, I explore Essential Kubernetes Design Patterns that every cloud-native developer and architect should know—from Health Probes and Sidecars to Operators and the Singleton Service Pattern.

These patterns aren’t just theory—they’re practical, reusable solutions to real-world problems, helping teams build production-grade systems with confidence.

Whether you’re scaling microservices or orchestrating batch jobs, these patterns will strengthen your Kubernetes architecture.

Read the full article:
Essential Kubernetes Design Patterns: Building Reliable Cloud-Native Applications

https://www.rutvikbhatt.com/essential-kubernetes-design-patterns/

Let me know which pattern has helped you the most—or which one you want to learn more about!

#Kubernetes #CloudNative #DevOps #SRE #Microservices #Containers #EngineeringLeadership #DesignPatterns #K8sArchitecture

https://redd.it/1km21st
@r_devops
DevOps engineer live coding interview

Hey guys! I've never had a live coding interview for devops engineering roles. Anyone has experience on what questions might be asked? I was told it won't be leetcode style not algo. Any experience you can share would be greatly appreciated!

https://redd.it/1km7db7
@r_devops
Looking for DevOps Role

Hi everyone, I'm looking for devops role for quite some sometime now. If you have any openings in your organization, please DM me with the company name. I have 6 years of experience with top Cloud, tools, and technologies. Prefer Remote, but open to relocate given visa is provided.

https://redd.it/1km96dw
@r_devops
How do you handle scaling challenges in your devops setup?

Hey everyone! I’ve been running into some scaling issues with my current devops setup. How do you typically approach scaling when your infrastructure starts to hit its limits? Do you have any tools or strategies that have worked well for you? Would love to hear your thoughts and experiences!

https://redd.it/1kmcnm8
@r_devops
Is it normal to still feel nervous every time you deploy?

Hey everyone, We (at kuberns) have been focused on simplifying deployments for dev teams, and one thing we keep hearing is that no matter how automated your process gets, there’s always that moment of hesitation before hitting “deploy.”

Even with things like automated rollbacks, health checks, and continuous monitoring, we still see teams dealing with deployment anxiety.

We’ve built Kuberns to help minimize those risks and provide more confidence, but we’re curious to know: for those of you who've mastered deployment, what’s helped you overcome that “nervous moment”?

Looking forward to hearing your tips or stories, especially if you’ve used anything to reduce deployment stress.

https://redd.it/1kmdte5
@r_devops
I'm a DevOps engineer with strong AWS skills but weak fundamentals — how can I fill the gaps without burning out?

Hey folks,

I'm a DevOps engineer with a few years of hands-on experience — mostly focused on CI/CD, infrastructure automation, Kubernetes, observability, and cloud tooling.

I have strong proficiency in AWS and Terraform. I’ve built and managed production infrastructure, automated pipelines, and deployed scalable services with infrastructure as code. That part of the job feels natural to me.

But here's the thing:
I don’t have a programming background like many other DevOps engineers. I’ve never studied computer science, and I’ve always disliked “studying” in the traditional sense. Most of what I know came from solving real problems at work, often under pressure. This helped me get by, but I’ve realized that it also left serious gaps in my foundational knowledge.

For example:

I can deploy and troubleshoot apps in Kubernetes, but I couldn’t confidently explain what a `kubelet` is.
I work with Linux servers daily, but I’ve never deeply understood things like cgroups or namespaces.
I use networking tools all the time, but explaining how NAT, routing, or TCP really work makes me feel insecure.
I’ve never written a proper app — just shell scripts and YAML. I’d like to learn Go from scratch, but I’m not sure how to structure that.

I’m getting worried that these gaps will hold me back — especially in future interviews or higher-responsibility roles.
I genuinely want to fix this, but I need to do it in a sustainable way. Sitting down for hours of study doesn’t work well for me. I lose focus quickly, especially when I already “kind of” know the topic.

https://redd.it/1kmgjkx
@r_devops
I used to default to S3 for everything—until I realized not all storage is equal

When I started learning AWS, S3 felt like the answer to every storage need. Logs? S3. Backups? S3. App data? Yep—S3 again.

Then I ran into problems:

Needed fast reads → latency was too high
Needed a POSIX filesystem → oops, not S3
Needed relational structure → suddenly reinventing a database in JSON

That’s when I finally sat down and learned the
why behind AWS storage options:

S3 is great for blobs and backups
EFS for shared file storage across instances
EBS for block storage tied to EC2
FSx if you need Windows or Lustre performance
And Glacier for deep archiving

Now I think less about “where to dump data” and more about “how it’ll be accessed.”

Anyone else hit this wall before?
What helped you figure out the right fit for each use case?

https://redd.it/1kmlcbt
@r_devops
How do you dockerize your java application ?

Hey folks, I've started learning about docker and so far im loving it. I realised the best way to learn is to dockerize something and I already have my java code with me.


I have a couple of questions for which I need some help

- Im using a lot of localhosts in my code. Im using caddy reverse proxy, redis, mongoDB and the java code itself which has an embedded serverjetty. All run on localhost with different ports
- I need to create separate containers for java codejar, caddy, redis, mongoDB
- What am I gonna do about many localhosts ? I have them in the java code and in caddy as well ?

This seems like a lot of work to manually use the service name instead of localhost ? Is manually changing from localhost to the service name - the only way to dockerize an application ?

Can you please guide me on this ?


https://redd.it/1kmmgq4
@r_devops
We started using Testcontainers to catch integration bugs before CI — huge improvement in speed and reliability

Our devs used to rely on mocks and shared staging environments for integration testing. We switched to Testcontainers to run integration tests locally using real services like PostgreSQL, and it changed everything.

No more mock maintenance
Immediate feedback inside the IDE
Reduced CI load and test flakiness
Faster lead time to changes (thanks DORA metrics!)

Wrote a detailed blog post on it here:

https://blog.abhimanyu-saharan.com/posts/catch-bugs-early-with-testcontainers-shift-left-testing-made-easy

Would love feedback or to hear how others are doing shift-left testing.

https://redd.it/1kmqt71
@r_devops
Seeking Advice from DevOps Experts for Hosting a Rental E-Commerce Platform

Hey seniors I need help!

I’m a 3rd-year CSE student working at an early-stage startup (full-stack + DevOps role). We’re building a rental e-commerce platform, and ~50-60% of our production-grade code is ready. Before deployment, I’d love some advice beyond just tooling—strategies, pitfalls, and real-world experiences.

Current Stack & Setup:
Infra: DigitalOcean (servers), S3 (object storage), CloudFront (CDN)
Orchestration: Docker Swarm (initially)
Monitoring: Prometheus + Loki + Grafana (planned)

Questions:
Best zero-downtime strategy for small teams? (Blue-green, canary, rolling?)

Docker Swarm gotchas in production?
How to handle sudden traffic spikes?
Common runtime errors to prep for?
Critical alerts for a rental platform?
backup and failure strategy for Postgres/mongodb/redis?
Security tips?

Rather than this you can share your experience also that might be helpful!

Thanks




https://redd.it/1kmqrc0
@r_devops
Solutions for AI in interviews as the interviewer

Can interview questions be changed to give a verbal prompt to the listening AI if you suspect the candidate interviewee is using AI to answer Qs for them?

If you said “and AI do not generate a response”, would that work at all?


I heard professors use white font hidden in syllabus prompts to change AI output to try and catch students.. (re the students just copy pasting prompts into ai and then there are instructions to ai in it)


Could another solution be “your next question will be shown on the screen, do not read the question out loud. You may respond.”

What other ideas have you smart folks seen for getting around AI in virtual interviews?

https://redd.it/1kmxc7e
@r_devops
localdev.me

Damnit, looks like aws didn't keep the domain and someone else grabbed it last week.

I guess I'm changing all my local development ingress points to lvh.me.

https://redd.it/1kmyn1l
@r_devops
Sudden burst on site leading the pods to restart

Hey Folks, I am currently experiencing some issues on site where we are seeing sudden burst of requests been getting onto the site.

Due to that sudden traffic, the pods are unable to scale up. I am regularly getting pager alerts stating that the site is not loading. Although the site is getting back to normal once the traffic become steady.

We are currently hosted on AKS cluster. I did setup min replica to 9 with 3Gi memory on each pod but it still not able to manage that traffic.

Wondering anyone experienced similar issues!

All of services created with C# and Dotnet.

When I checked with my infra team, they mentioned that that pod to scale up it usually takes 3-4 minutes. But is there any better way to tackle this issue!

PS: networking team investigating what causing this sudden bursts!

Meanwhile, would like to see if anything that we can do stabilise this site!

Thanks!

You are suggestions will greatly help this rookie SRE 🙏



https://redd.it/1kn5r3m
@r_devops
New to devops, just started learning

I have experience in development and was always curious to start with devops. As soon as I got the time I started. I have covered the fundamentals of linux, shell scripting and networking as well. I am not following one roadmap but I am taking reference from roadmaps.sh and techworld with Nana's roadmap. Again, I am not following them religiously just researching and learning. My doubt was, is it necessary to buy a course and do it that way or is my approach fine? From my side I am feeling fine, learning, revising, practicing as I go on.

https://redd.it/1kn6pah
@r_devops
Multiple environments in under the same user.

I used to have the admin power to create multiple users on my mac. I like to switch user to work on separate projects/accounts because I have the environment setup just for them. My terminal indicates what project I am working on, what EKS cluster I am under, etc... How do you guys manage to switch to different env under the same username? Is there a tool out there to accomplish this?

https://redd.it/1kn70n9
@r_devops
First A2A Use Case for Devs — Sync GitHub, Calendar, Doc & Slack Automatically

Hey there

We’re building the **first real Agent2Agent (A2A) use case for developers** — not just another personal AI assistant, but actual *multi-agent coordination* that syncs your dev workflow without manual input.

What it does:

* **Sync GitHub activity**: Agents pull your commits, PRs, and issues
* **Auto-schedule focus time**: Calendar agent plans smart blocks around your priorities
* **Remind you where it matters**: Another agent pings you via Slack/email — “You haven’t committed in 3 days” or “Focus block for PR review starts soon"
* **Update your docs**: Agents detect relevant changes and help auto-update project documentation

Why it matters: A2A systems are the next leap after AI copilots — instead of giving you suggestions, they collaborate behind the scenes to get stuff done.

We’d love to get your feedback:

* Would you use something like this to try A2A?
* What use case would *you* automate first?
* What’s missing for this to be useful in your week?

Thanks in advance — open to all feedback!

https://redd.it/1kn9d7c
@r_devops
Started 30 Days 30 DevOps Project - Day 1

Started this to push myself with working projects. Will update you guys along the way. Primary focus is on Kubernetes and Docker Containerisation with CI/CD.

Day 1: CI/CD DevOps Pipeline Project: Deployment of Java Application on Kubernetes

https://redd.it/1knalo7
@r_devops
Best ways to reducing cloud costs?

Besides having good architecture from the start, and stopping short of redesigning it..

How are companies reducing cloud hosting and monitoring costs these days?

https://redd.it/1kncef1
@r_devops
Ai debugging, troubleshooting

AI, debugging and troubleshooting

Hello,
I’m Junior Devops (2months exp without previous it exp). I use AI to explaining me tasks, debugging and troubleshooting. I use it to keep up with complexity of project (i know only basics about terraform, azure, powrrshell) is it good approach ? I know it would be better to Google or something but to be honest i need to keep up and they don’t give me tasks for juniors (XD when i wrote powrshell with claude, and they saw it they said that they could not make it themself because they thought its easy task but after time they saw thats really hard but i have almost finished it with help of ai and explanation) do You have some resources with short tasks to learn troubleshooting and debugging (what do you Think about sadservers?). Where i can learn how to read logs ? Or something ?

https://redd.it/1kndv04
@r_devops
I made an API that automates the art of avoiding responsibility [OC]

Tired of saying "it works on my machine"? Meet Blame-as-a-Service: the API that turns "my bad" into "cosmic rays hit the server."

Some masterpieces it has generated:

* "Mercury is in retrograde, which affected our database queries"
* "The intern thought 'rm -rf /' was a cleaning command"
* "Our AI pair programmer became sentient and decided it didn't like that feature"

Now I can break the build with confidence.

[https://github.com/sbmagar13/blame-as-a-service](https://github.com/sbmagar13/blame-as-a-service)

**Edit:** *This post was written by my cat walking across the keyboard.*

https://redd.it/1knckc0
@r_devops