Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Interview questions!!

I am aware that DevOps is a vast field. You have to know the tools. You have to know the process. You have to show your engineering chops. You have to know yourself. It is not limited to certain tools, certain processes, or a certain organization.
But my question is for the people who do the interviewing: who is your ideal candidate?
A. Someone who can solve your problem
B. Someone who fits your team
C. Someone who fits the company culture
D. Someone who is affordable

F. Back door?

This is a vote on you! Please vote as you please. Treat this as a self-assessment test.

https://redd.it/1mnu192
@r_devops
What are the hardest tasks you had to complete during your career?

What are the hardest tasks you had to complete during your career? I am curious to know what one might expect as a DevOps engineer. I am not a DevOps engineer, but I did take a bunch of courses in case the job market becomes really competitive, since employers tend to prefer candidates with a wide array of skills.

https://redd.it/1mnvau9
@r_devops
Switching inter-service calls from HTTPS to STOMP over WebSockets - Bad idea for enterprise?

**TL;DR:** My team builds software for high-security clients (banks, government). We're considering replacing our inter-cluster HTTPS (REST) calls with STOMP over WebSockets (wss://) for a more message-driven architecture. I have some serious reservations and would love the community's opinion.

**Current Setup:** Multiple Kubernetes clusters, potentially in different regions, communicating via standard HTTPS.

**Proposed Change:** Move to persistent WebSocket connections running the STOMP messaging protocol, all secured by TLS.

**My Concerns:**

* **Security Inspection:** Our customers' Web Application Firewalls (WAFs) can inspect HTTP traffic for threats, which won't be true of the new approach.
* **Monitoring & Logging:** With HTTPS, customers get rich access logs (path, status code, etc.) from our ingress controllers and service mesh. With WebSockets, the logs will just show "connection opened" and "connection closed," making traffic less transparent.
* **Operational Overhead:** Routing and load balancing are harder due to persistent connections.

This change will make our application much more performant, but will it be a blocker for our customers? Is there anything we could do to mitigate these concerns? I was thinking we could limit persistent connections to a few minutes, which seems like it would at least help with the load balancing problem. What else can be done? Is this acceptable or a no-go?
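
To make the logging and load-balancing ideas concrete, here is a minimal Python sketch. It assumes the service itself can see each STOMP frame, emits one access-log-style record per frame, and picks a jittered connection lifetime so connections do not all reconnect at the same moment. The helper names (`parse_stomp_frame`, `access_log_record`, `connection_lifetime`) and the log fields are hypothetical. The parser only handles LF line endings and ignores header escaping and `content-length`, so it is an illustration rather than a full STOMP 1.2 implementation.

```python
import json
import random


def parse_stomp_frame(raw: bytes) -> dict:
    """Parse one frame: COMMAND line, header lines, blank line, body, NUL."""
    head, _, body = raw.partition(b"\n\n")
    lines = head.decode("utf-8").split("\n")
    headers = {}
    for line in lines[1:]:
        key, _, value = line.partition(":")
        # If a header is repeated, keep the first occurrence.
        headers.setdefault(key, value)
    # Drop the single trailing NUL that terminates the frame.
    if body.endswith(b"\x00"):
        body = body[:-1]
    return {"command": lines[0].strip(), "headers": headers, "body": body}


def access_log_record(frame: dict, peer: str) -> str:
    """Build a JSON log line comparable to an HTTP access-log entry."""
    return json.dumps({
        "peer": peer,
        "command": frame["command"],
        "destination": frame["headers"].get("destination"),
        "bytes": len(frame["body"]),
    })


def connection_lifetime(base_seconds: float = 300.0, jitter: float = 0.2) -> float:
    """Return base_seconds varied by +/- jitter (as a fraction of the base)."""
    return base_seconds * (1 + random.uniform(-jitter, jitter))


frame = parse_stomp_frame(b"SEND\ndestination:/queue/payments\n\nhello\x00")
print(access_log_record(frame, "10.0.0.5"))
print(round(connection_lifetime()), "seconds until planned reconnect")
```

Per-frame records like this restore visibility at the application layer. They do not restore WAF inspection, which would still need a separate answer.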

https://redd.it/1mnwl0r
@r_devops
Has anyone deployed AI generated stacks with Autocoder?

I’ve been exploring ways to streamline deployment workflows for small projects, and I've come across Autocoder cc, which auto-generates both code and infrastructure files. I’m curious how this compares to manual infrastructure setup from a DevOps perspective, especially in terms of reliability, security, and CI/CD integration. Has anyone here tried deploying something generated by tools like this? What was your experience?

https://redd.it/1mnxsjo
@r_devops
From QA to DevOps, is it possible and worth it?

Hello everyone! I need some advice because I feel a little confused. I work as a Python AQA and I’ve noticed that DevOps really interests me. I’ve talked to some people who work in it and liked their stories, and I feel I really need to dive into it. I’m not very experienced (1 year as a QA, 3-4 months as an AQA), so I have some questions. If someone can answer them and clear up my misunderstandings, I would be so thankful!

1. Does it matter if I have no development background? I studied it at university, but I have no real experience.
2. Does programming matter for DevOps, and what’s the best choice: Python, Go, or maybe something else?
3. Is it even possible to make such a switch, or is it better not to try? And is it worth it? I don’t think I like QA that much, but I’m afraid I wouldn’t be able to find a job in DevOps without development experience.
4. What are the best pet projects for a newbie in DevOps?

By the way, I have some experience with Docker (just the basics) and VMs, and a little with writing YAML files for CI/CD (I even tried using secrets). If there are people who made such a switch, please share your experience. It would be very interesting to hear!

https://redd.it/1mnvapa
@r_devops
DevOps. 10yrs from now

What do you think it's going to look like? The basic need to deploy code is probably going to stay, but perhaps it will be far more automated and diverse? Is it going to be focused on vis and governance?

What's your take and intuition?

https://redd.it/1mo04or
@r_devops
Want to Become a DevOps Engineer - Should I Start in TechOps or Dev Role as a fresher?

I’m a recent CSE graduate, currently deciding between waiting for a delayed software development offer and joining a TechOps role right now (L1/L2 production support, monitoring, incident/ticket handling, basic scripting). Which path is better if my goal is to become a DevOps engineer: starting in TechOps, or waiting for a dev role and transitioning from there?

My long-term goal is to move into a full-fledged DevOps role. I want to understand from people already in the field:

\- How feasible is it to move from TechOps to DevOps within 2–3 years?

\- What are the exact skills, tools, and certifications I should be focusing on from day 1 to make myself a strong DevOps candidate?

\- Which areas in my current TechOps work should I double down on to align with DevOps practices?

\- Are there common pitfalls I should avoid so I don’t get stuck in pure support work forever?

I’d really appreciate guidance from folks who’ve made this transition: what worked for you, what didn’t, and what you’d do differently if you were starting over today.

Any insights or advice would mean a lot to me right now.
Thanks in advance :)

https://redd.it/1mo21vp
@r_devops
Folder-level access in GitHub

Hi guys,
I'd like to understand how this is handled in GitHub.
GitHub doesn't offer folder-level access control.
What are the alternatives?
How do large enterprises manage this?
Can someone suggest ideas to explore?
Apart from submodules, what options do we have?
And if we use submodules, do we have to change our GitHub workflows too?
Thanks for your time.

https://redd.it/1mo4er9
@r_devops
Need suggestion on DevOps tech stack.

I recently switched companies. I have a total of 4 years of experience in AWS DevOps. The company I joined is in the retail industry, and the team I am part of works purely on AWS networking.

I want to know whether pure networking is beneficial for future growth. I am much more interested in working on the application side: Kubernetes, deployments, etc.

They do have some level of Terraform automation in place for the networking stuff, but I still feel pure networking is not my cup of tea, at least for now.

Can anyone advise on this?

https://redd.it/1mo7c3v
@r_devops
How do you measure automation effects?

I'm sure many of you are implementing agentic AI, AI-assisted workflows, copilots, and so on, with the goal of improving labor efficiency. How do you actually measure that for your teams? Is there anything more intelligent and somewhat automated, beyond manually capturing human effort and satisfaction rates before and after?

https://redd.it/1mo8b92
@r_devops
[Question] CI/CD Design - Architecture Book Request

Hello fellow DevOps enthusiasts,

I’m looking for a solid book (or even an eBook) that goes beyond CI/CD basics and covers design patterns and architecture for real-world setups. It should help me deal with the corporate BS I am facing from the infra and system teams about environments, security, and dev/prod segregation.

Ideally, it should include:

* Production vs development environment design.
* Jenkins agent-controller architecture and best practices.
* Patterns for scaling and securing Jenkins.
* Examples of integrating Jenkins with Git, Docker, Kubernetes, etc.

I’ve already read Continuous Delivery by Jez Humble, but I’m looking for something more practical. It doesn't matter if it covers GitLab Runner or GitHub Actions; tbh I'm more interested in the architecture and design aspects.


Thank you.

https://redd.it/1mo7p5q
@r_devops
A move to AgentOps Organizations

The focus of many organizations will shift from human-driven execution to human-driven orchestration of AI agents. Our jobs will become training and auditing these agents, fine-tuning them to fit the needs of the org. This will create flatter org charts, where small teams have dedicated agents as the nucleus of each department. Example:

Marketing agent,

Operations agent,

Research agent,

Finance agent,

DevOps agent,

Security agent,

HR agent,

Compliance Agent, etc,

We humans will be the crew members of those agents. Cross-department work will be done between the agents, not just by us picking up the phone or holding a Teams meeting with long, drawn-out 30-60-90 day plans. Our focus will be entirely on the agent, making sure it operates well in creativity, ethics, judgement, and relationship management.

We are at the cusp of this movement. Startups will have the easiest time establishing this organizational model, followed by companies with robust R&D departments; finally, older-model organizations like government/federal agencies will eventually migrate.

What are your thoughts?

https://redd.it/1moawl0
@r_devops
7 real S3 screw-ups I see all the time (and how to fix them)

My post in r/aws blew up and people found it valuable, so I'm sharing it here too!

S3 isn’t that expensive… until you ignore it for a few months. Then suddenly you’re explaining to finance why storage costs doubled.

Here’s the stuff I keep seeing over and over:

1. Data nobody touches - You’ve got objects sitting in Standard for years without a single access. Set up lifecycle rules to shove them into Glacier or Deep Archive automatically.
2. Intelligent-Tiering everywhere - Sounds great until you realize it has a per-object monitoring fee and moves to deep archive at a snail’s pace. Only worth it when access patterns are truly unpredictable.
3. API errors quietly eating your budget - 4xx and 5xx errors are way more common than people think. I’ve seen billions of them in a single day just from bad retry logic.
4. Versioning without cleanup - Turn it on without an expiration policy and you’ll pay to keep every single version forever.
5. Archiving thousands of tiny files - Those 1KB objects add up. Compact them before archiving; you can do it through the API without downloading them.
6. Backup graveyards - Backups that nobody touches but still sit in Standard storage. If you’re not reading them often, save them directly into a cheaper class. Worst case, you pay for the retrieval.
7. Pointless lifecycle transitions - Don’t store something in Standard for 1 day and then move it. Just put it in the right class from the start and skip the extra PUT fee.
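
As a rough illustration of points 1 and 4, the Python sketch below builds a lifecycle configuration that moves objects to Glacier and then Deep Archive, and expires old noncurrent versions. The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`. The rule ID, the day thresholds, and the `build_lifecycle_config` helper are made-up assumptions, not recommendations.

```python
import json


def build_lifecycle_config(archive_after_days: int = 90,
                           deep_archive_after_days: int = 365,
                           noncurrent_expire_days: int = 30) -> dict:
    """Return an S3 lifecycle configuration as a plain dictionary."""
    return {
        "Rules": [
            {
                "ID": "archive-cold-data",       # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},        # empty prefix: whole bucket
                # Point 1: move untouched data to cheaper classes over time.
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"},
                    {"Days": deep_archive_after_days,
                     "StorageClass": "DEEP_ARCHIVE"},
                ],
                # Point 4: stop paying for every old version forever.
                "NoncurrentVersionExpiration": {
                    "NoncurrentDays": noncurrent_expire_days,
                },
            }
        ]
    }


config = build_lifecycle_config()
print(json.dumps(config, indent=2))
```

With boto3 installed and credentials configured, the result could be applied with `boto3.client("s3").put_bucket_lifecycle_configuration(Bucket="my-bucket", LifecycleConfiguration=config)`, where `my-bucket` is a placeholder.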

Sounds obvious... but those fixes might be worth 50% of your S3 bill...

(Disclaimer: Not here to sell you anything, just sharing stuff I’ve learned working with a bunch of companies from small startups to huge enterprises after founding reCost. Hope it helps!)



https://redd.it/1moc5xo
@r_devops
Planning to Become a DevOps Engineer in 2025? Here’s What Actually Matters

I see a lot of people jumping straight into Docker and Kubernetes and then wondering why they feel lost. DevOps isn’t just “learn these 5 tools”; it’s a mix of mindset, fundamentals, and the right tools at the right time. Here’s a breakdown of how I’d start if I were new in 2025.

1. Learn the Fundamentals First
Before you even touch fancy automation tools, make sure you actually understand the stuff you’ll be automating. That means:

Linux basics (file system, processes, permissions, services)

Networking (IP, DNS, HTTP/S, ports, routing, NAT, firewalls)

System administration (users, groups, package management, logs)

Bash scripting for automating simple tasks

Basic Python scripting (log parsing, API calls, automation scripts)

If you can’t explain what happens when you curl a URL or why a service isn’t starting, you’ll struggle later.
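
For anyone who wants to see the curl question spelled out, here is a simplified Python sketch of the steps behind a plain GET: parse the URL, resolve the name, open a TCP connection, wrap it in TLS for https, send the request, and read the reply. The `plan_request` and `fetch` helpers are hypothetical names. Redirects, proxies, HTTP/2, and the `Host` header for non-default ports are left out, so treat it as a teaching aid rather than a curl replacement.

```python
import socket
import ssl
from urllib.parse import urlsplit


def plan_request(url: str) -> dict:
    """Work out host, port, TLS use, and the raw HTTP/1.1 request text."""
    parts = urlsplit(url)
    scheme = parts.scheme or "http"
    port = parts.port or (443 if scheme == "https" else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.hostname}\r\n"
        "Connection: close\r\n\r\n"
    )
    return {"host": parts.hostname, "port": port,
            "tls": scheme == "https", "request": request}


def fetch(url: str, timeout: float = 5.0) -> bytes:
    """Perform the request step by step and return the raw response bytes."""
    plan = plan_request(url)
    # 1. DNS: turn the hostname into one or more socket addresses.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        plan["host"], plan["port"], type=socket.SOCK_STREAM)[0]
    # 2. TCP: open a connection to the first address.
    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        sock.connect(addr)
        # 3. TLS: for https, negotiate encryption and verify the certificate.
        if plan["tls"]:
            context = ssl.create_default_context()
            sock = context.wrap_socket(sock, server_hostname=plan["host"])
        # 4. HTTP: send the request line and headers.
        sock.sendall(plan["request"].encode("ascii"))
        # 5. Read until the server closes the connection.
        chunks = []
        while chunk := sock.recv(4096):
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        sock.close()


print(plan_request("https://example.com/health?full=1"))
```

Calling `fetch("https://example.com/")` on a machine with network access runs all five steps. A failure at any one of them (name not resolving, port closed, certificate rejected) maps to a different kind of troubleshooting.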

2. Version Control and CI/CD Are Core Skills
Every DevOps pipeline starts with Git. Learn branching, merging, pull requests, and resolving conflicts.

Then move into CI/CD (Continuous Integration/Continuous Deployment). Popular tools:

Jenkins

GitLab CI

GitHub Actions

CircleCI

You don’t just need to “click a deploy button” — understand pipeline stages, automated testing, build artifacts, and how to roll back if something breaks.
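
To show what "stages" and "roll back" mean in practice, here is a toy Python model of a release: pre-deploy checks run in order, the new version goes live, and a failed smoke test restores the previous version. Everything here is a hypothetical stand-in (the `release` function, the `history` list acting as a deployment record, the lambda checks). Real CI/CD tools express the same flow in their own pipeline configuration.

```python
from typing import Callable, List, Optional, Tuple

Check = Tuple[str, Callable[[str], bool]]


def release(version: str, checks: List[Check],
            smoke_test: Callable[[str], bool], history: List[str]) -> dict:
    """Run pre-deploy checks, deploy, then roll back if the smoke test fails.

    `history` lists deployed versions, newest last.
    """
    previous: Optional[str] = history[-1] if history else None

    # Stages before deploy (build, unit tests, ...): a failure stops the
    # pipeline and leaves the currently live version untouched.
    for name, check in checks:
        if not check(version):
            return {"live": previous, "status": f"stopped at stage '{name}'"}

    # "Deploy": record the new version as live.
    history.append(version)

    # Post-deploy verification: on failure, drop the new version so the
    # previous artifact is live again.
    if not smoke_test(version):
        history.pop()
        return {"live": previous, "status": "rolled back"}

    return {"live": version, "status": "deployed"}


deployed = ["1.0.0"]
stages = [("build", lambda v: True), ("unit-tests", lambda v: True)]
print(release("1.1.0", stages, lambda v: True, deployed))
print(release("1.2.0", stages, lambda v: False, deployed))
```

The second call fails its smoke test, so version 1.1.0 stays live. Rollback only works this simply when the previous artifact was kept, not rebuilt.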

3. Containers and Orchestration
Containers are a big part of DevOps. Start with Docker:

Build images with Dockerfiles

Use volumes and networks

Work with multi-container apps via Docker Compose

Once you’re solid there, learn Kubernetes (K8s). Don’t rush this — it’s a lot. Focus on:

Pods, deployments, services

ConfigMaps and secrets

Scaling and rolling updates

Ingress and service discovery

You’ll also want to understand managed K8s services like AWS EKS, Azure AKS, or GCP GKE.

4. Cloud Skills Are Non-Negotiable
Pick one cloud provider to start: AWS, Azure, or GCP. AWS is the most common, but it’s fine to choose based on job market in your area.

Learn:

Compute (EC2)

Networking (VPC, subnets, security groups)

Storage (S3, EBS)

IAM (roles, policies, least privilege)

Then, learn how to deploy containers or Kubernetes clusters in the cloud.

5. Infrastructure as Code (IaC)
This is how you make cloud resources repeatable and version-controlled. Terraform is the most popular and works with all major clouds.

Learn how to:

Define infrastructure in .tf files

Use variables and modules

Apply and destroy infrastructure safely

Store state securely

6. Monitoring, Logging, and Alerting
If you build and deploy something but can’t see when it’s failing, you’re not doing DevOps.

Get hands-on with:

Prometheus + Grafana for metrics

ELK stack (Elasticsearch, Logstash, Kibana) for logging

Cloud-native tools like AWS CloudWatch or Google Cloud Monitoring (formerly Stackdriver)

7. Security (DevSecOps Basics)
Security is now a core part of DevOps, not an afterthought. Learn to:

Scan code for vulnerabilities (Snyk, Trivy)

Manage secrets (Vault, AWS Secrets Manager)

Secure Docker images

Apply IAM best practices

8. Build Real Projects
Don’t just follow tutorials. Build something end-to-end, like:

A microservice app with Docker

CI/CD pipeline → Docker → Kubernetes → Cloud deployment

Terraform for infra provisioning

Monitoring + logging setup

Push everything to GitHub with a README that explains your setup.

9. Network With the Community
Join DevOps communities:

Reddit (r/devops, r/kubernetes, r/aws)

CNCF Slack channels

DevOps Discord servers

Local meetups or conferences

Ask questions, share your progress, and help others.

10. Stay Consistent & Keep Learning
DevOps tools evolve fast. Even once you land a job, you’ll keep learning. Read blogs, watch KubeCon talks, experiment in your home lab.

If you start from zero and commit a few hours per week, you could be job-ready in 6–8 months. The key is not to try and master everything at once — build layer by layer, and make sure each new tool you learn connects to something you already understand.

If you want a well-structured course & resource suggestions to follow this roadmap step-by-step, DM me and I’ll share what worked for me and others breaking into DevOps.

https://redd.it/1moe2i9
@r_devops
Focus Career in DevOps

I have developed a strong interest in the world of DevOps, and I keep seeing these "road maps" posted on LinkedIn and other threads. I'd like to hear from those who actually work in a DevOps-focused role what a realistic development path looks like.

Currently, I am focused on the following areas:

\- Networking fundamentals (certified with CompTIA Net+ and hands on experience with Fortinet and some Cisco)

\- AWS cloud basics with hands on experience with EC2, S3, IAM and CloudWatch. I have noob level experience with Terraform as well.

\- Powershell scripting

\- Microsoft services (Exchange, MDM, SharePoint, Entra)

\- Windows Server

\- Active Directory

\- Linux basics


What areas should I add or consider learning besides the ones I am already dedicating time to? I heard Kubernetes and Docker are good areas, but I have zero experience with containers, so I have no idea where to even start.

https://redd.it/1moc4mq
@r_devops
Devops job market

Just curious how the DevOps job market compares to software engineering. Is it as bad as software engineering these days?

https://redd.it/1mok4we
@r_devops
Retraining into DevOps/cloud with no prior experience—Is “DevOps Beginners to Advanced with Projects” a solid starting point?

>Hey everyone, I’m looking to switch into a DevOps or cloud role for a better work–home balance and have zero background in IT or ops. I’ve found the Udemy course “DevOps Beginners to Advanced with Projects” (by Imran Teli). It’s a bestseller with a 4.6 rating, updated August 2025, with over 54 hours of lessons. Tools include Linux, scripting, AWS, Jenkins, GitHub Actions, Ansible, Docker, Kubernetes, Terraform, etc.

>The hands-on, project-based format seems promising, but I wonder whether it’s too broad. Have any of you taken this course (or something similar)? Does it give a solid foundation? What additional resources or next steps would you recommend to truly understand the why behind the tools, and start applying them effectively in real-world scenarios?

>I'd appreciate any advice. Pointers to hands-on labs, free resources, certification paths, or community groups would be really helpful too.

https://redd.it/1mokol5
@r_devops
Are LangGraph + Temporal a good combo for automating KYC/AML workflows to cut compliance overhead?

I’m designing a compliance-heavy SaaS platform (real estate transactions) where every user role—seller, investor, wholesaler, title officer—has to pass full KYC/KYB, sanctions/PEP screening, and milestone-based rescreening before they can act.

The goal:

* Automate onboarding checks, sanctions rescreens, and deal milestone gating
* Log everything immutably for audit readiness (no manual report compilation)
* Trigger alerts/escalations if compliance requirements aren’t met
* Reduce the human compliance team’s workload by \~70% so they only handle exceptions
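
For the "log everything immutably" goal, one common technique (named here as an assumption, not something this platform already uses) is a hash-chained, append-only log. Each entry stores the hash of the previous one, so any later edit breaks the chain and shows up during an audit. The Python sketch below uses hypothetical helper names and event fields. It only makes tampering detectable; actual immutability would still depend on where the log is stored.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with this event's content."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event, linking it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev_hash": prev_hash,
             "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; False means an entry was altered or reordered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, {"user": "seller-17", "check": "sanctions", "result": "pass"})
append_entry(audit_log, {"user": "seller-17", "check": "kyc", "result": "pass"})
print(verify_chain(audit_log))

audit_log[0]["event"]["result"] = "fail"  # simulate tampering
print(verify_chain(audit_log))
```
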

I’m considering using LangGraph to orchestrate AI agents for decisioning, document validation, and notifications, combined with Temporal to run deterministic workflows for onboarding, milestone checks, and partner webhooks (title/escrow updates).

Question to the community:

* Has anyone paired LangGraph (or similar LLM graph orchestration) with Temporal for production-grade compliance operations?
* Any pitfalls in using Temporal for long-lived KYC/AML processes (14-day onboarding timeouts, daily sanctions cron, etc.)?
* Does this combo make sense for reducing manual workload in a high-trust, regulated environment, or would you recommend another orchestration stack?

Looking for insights from anyone who’s run similar patterns in fintech, proptech, or other regulated SaaS.

https://redd.it/1mokg0f
@r_devops
Trading Support Engineer looking to transition into SRE/Devops after lay off. What are my chances?

I am currently weighing my options as I recently got laid off and I see no future in the support engineering role.


It really sucks to be in this position, as I know that having different titles on my resume can hurt my chances because it looks like I am not on a sensible trajectory.

My experience:

In the past I have worked as a Quality Analyst for Facebook (2 years) under contract with Wipro, a testing engineer (2 years) for Facebook under Wipro, and a quality assurance engineer for a year at a lesser-known company. In my current role as a Support Engineer, with 4 years of experience, I manage incidents, failovers, and config management, troubleshoot Kubernetes services, handle monitoring and alerting, approve releases, and do rollbacks. I support a low-latency trading platform at a hedge fund and often have to investigate networking problems using Grafana and look at logs from all types of services.

Transition into Devops/SRE:

As I did my research, I came across DevOps as the path to take when transitioning to SRE roles, but I don't have experience in the following: cloud, Linux, Terraform, deployments. I have basic experience with Python and SQL for data analytics projects, and I use Grafana and ELK, but I don't actually build the dashboards. I know how to use ArgoCD and have used Jenkins before, although I've forgotten most of it. I have exposure to most tools at a superficial level.

My plan:

I am considering doing the Cloud Computing and DevOps Certification Program from Purdue and Simplilearn to get experience in these areas. I think this will give me the guidance and structure I need, plus the hands-on experience I am lacking, as it's project-heavy. After finishing, I would take some AWS certs that are relevant to the roles I am applying for.

My questions:

\- Has anyone heard of or taken this certification?
\- Is this line of work affected by the tech lay offs?
\- What are my chances of entering a well known company with my experience and the Certifications?
\- Is Support engineering -> DevOps or SRE a good transition path or are these not related?
\- Any advice anyone can give me as I navigate my options in DevOps and SRE?

Side note: I know my work is reactive and DevOps/SRE is proactive. But could it help that I deal with live issues in production environments, where the goal is to reduce downtime?



https://redd.it/1mos4j2
@r_devops