Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
DevOps, CI/CD, Docker, etc. course

Hello,

I'm looking for a course that covers all DevOps concepts — both from a project-level perspective and, of course, the technical side like Docker, CI/CD, etc.


I found this course, which doesn’t seem bad:

https://www.coursera.org/professional-certificates/devops-and-software-engineering#courses

Plus, I could list an “IBM Certification” on LinkedIn.

What do you think?
Do you have any other course suggestions?

I’m also willing to pay, as long as it’s something well-structured and high quality.
Keep in mind that I work full time, so I don’t have time for 400,000-hour courses that explain things I’ll never use.

Thanks!

https://redd.it/1m333nw
@r_devops
How do you structure incident response in your team? Looking for real-world models

I recently wrote a blog post based on conversations with engineering leaders from Elastic, Amazon, Snyk, and others on how teams structure incident response as they scale.

We often hear about centralized vs. distributed models (i.e., a dedicated incident command team vs. letting service teams handle their own outages). But in practice, most orgs blend the two, adopting hybrid models that vary based on:

* Severity of the incident
* Who owns coordination vs. fixing
* How mature or experienced teams are
* Who handles communication (devs vs. support/comms)

I'd love to hear from you:

**How is incident response handled on your team?**

* Do you have rotating incident commanders or just whoever’s on call?
* How do you avoid knowledge silos when distributed teams run their own incidents?
* Have you built internal tooling to handle escalation or severity transitions?

Would love to hear how other teams think about this.

\---

ps: here's the full post if you're curious about hybrid models: [https://rootly.com/blog/owning-reliability-at-scale-inside-the-hybrid-incident-models](https://rootly.com/blog/owning-reliability-at-scale-inside-the-hybrid-incident-models)

https://redd.it/1m2yqbu
@r_devops
Spectral lint demo for APIs

Hey 👋

I’ve put together a GitHub repo that showcases Spectral linting, specifically for APIs.

It demonstrates how Spectral can help DevOps and dev teams identify OWASP violations in OpenAPI specs, and how it can help enforce your own organisational guardrails and governance for your APIs (operation naming conventions, for example). The repo has a good and a bad example you can run Spectral against to see how it works.

Additionally, I’ve put together a GitHub Action that triggers on PR to show how it can be used as part of your PR gates, as well as how you can shift left locally in VS Code for example.

Hopefully this helps those unaware of the tool, or aspiring DevOps people looking for a free, real-world demo they can run on their own machine to get to grips with it!

If you find it useful, feel free to star it!

https://github.com/riosengineer/spectral-demo



https://redd.it/1m329uc
@r_devops
Docker-BuildAgent: One Build Image for Node, Angular, .NET, and More!

I am having déjà vu... I thought I posted this, but now I cannot find it.

Hey devs! I just released a major update to Docker-BuildAgent – a flexible, all-in-one Docker image and build system for modern CI/CD pipelines.

What is it?

A pre-configured Docker image and build orchestrator (built on NUKE) for Node.js, Angular, .NET, and PowerShell projects.
Designed for GitHub Actions, but works with any CI/CD.
Handles Docker builds, Node/Angular builds, artifact packaging, versioning, and even Discord/GitHub notifications.

Key Features:

🐳 Docker image builds, tagging, and registry push
🟢 Node.js/Angular/React support (auto-detects package manager)
📝 Customizable build scripts and artifact copying
🔁 Reusable build logic via NUKE targets
💬 Discord & GitHub integration for notifications/releases
🧪 Dry-run mode for safe testing
Pre-installed: Node, Angular CLI, .NET 8 SDK, Docker CLI, PowerShell, Git, GitVersion, Nuke, and more

How do I use it?

Mount your project as `/workspace` and run `docker-build` or `node-build` (see Quick Start)
Customize with .build.scripts.build.copy, and env mapping files
Use the provided templates for Dockerfiles if you don’t have your own
Full CI/CD examples for GitHub Actions

Docs & More

Full Documentation
Customization options
Parameters & settings
Troubleshooting & FAQ

Why? I wanted a single, reproducible build environment for all my projects, with best practices and zero “works on my machine” issues. If you’re tired of maintaining separate build scripts and Dockerfiles for every stack, give it a try!

Feedback, questions, and PRs welcome! 🙌

https://github.com/The-Running-Dev/Docker-BuildAgent

https://redd.it/1m3e2qt
@r_devops
How would you deploy multiple clients in one k8s cluster using ArgoCD and kustomize?

I prefer kustomizations whenever possible, and I'm about to start using ArgoCD for the first time.

But how would you structure your Git repos in order to deploy multiple client instances of an application in k8s? Would you have one branch per client, one repo per client maybe? Other smart methods?

Let's say each client needs a Tomcat instance and a database instance from the MariaDB operator, and will use some shared services such as Valkey.

https://redd.it/1m3fjb8
@r_devops
AI-driven burnout?

I left my desk today having accomplished a lot I guess, but working with AI tooling feels hollow for some reason. I’m still making technical design-related decisions and “writing” code if you can even call it that anymore. I ship a bit faster now and can get up to speed on new tools much faster. But it feels really mechanical. This could also be that I’ve been doing this job a decade and a half and maybe this is just natural burnout. I’m approaching 40, and have a ways to go in my career but I don’t think I can keep doing the same thing for another 20 years.

Building everything for, and with, AI just has me questioning how useful this work is to society as a whole.

I’ve always loved computers and technology in and outside of work. But lately I’ve been really over it all.

https://redd.it/1m3nva7
@r_devops
My teenage son wants to learn DevOps

Hello Reddit! My teenage son wants to be a DevOps engineer and I need some tips or resources. My background is mostly software development for the first decade, then a move up to architecture, then lots of DevOps (mostly Azure and GCP, Terraform, and automation). Should I let him play with software development first and then move slowly into infra/DevOps like I did, or let him do systems networking/sysadmin stuff? My kid has some basic knowledge of coding from school and nothing else, other than playing chess all day. 😁

https://redd.it/1m3p8h4
@r_devops
Is Parallels Desktop the best option for DevOps on an M1 Mac?

Is Parallels Desktop the best option for DevOps on an M1 Mac?

Any alternatives?

https://redd.it/1m3qr32
@r_devops
(Newbie Deployer) NGINX- Docker-Compose or K8s?

I am currently running 2 different docker-compose services on the same CVM (using different docker-compose files).

One is a .NET service running on .../8080, another is a FastAPI running on .../8000

(some of the FastAPI endpoints also call the .NET endpoints)

I'm looking to add NGINX because I need SSL for both services.

However, I don't know which is the better option:

1) Consolidate everything into a single Docker Compose file, with NGINX included in that file
2) Set up the K8s NGINX Ingress Controller, and use K8s pods to route between the 2 different services based on outside traffic (?)

I'm not familiar with K8s at all (but I am interested to learn... just don't want to crash out because this project does have some sort of deadline).

Have only recently begun to feel a little teensy bit of confidence/familiarity with Docker.

Alternatively, are there any other options or progressions?

https://redd.it/1m3s0lp
@r_devops
How do you handle tagging repositories when it's time to release code?

One thing I've never really seen done, despite it always seeming like a good idea, is tagging repositories for releases. Part of the reason I've never implemented it myself is that I don't know how to work around the following issues:

1. How do you actually tag the designated commit? Just through the git CLI? In the browser? Do you have a job for it?
2. How do you manage ancient tags and the associated job for releasing them? Admittedly this is biased by the CI/CD tools I've used, but all of them so far feature a build per branch, so in my experience, with nothing tidying old tags up, there'd be hundreds of build/release jobs? Is it usually a case of ignoring them and manually tidying them up?

For context, everywhere I've worked usually either does some nonsense sort of git flow (much more about giving the developers a feeling of safety than actually making anything safer), or just releases from the top of main, following the principle that commits pushed to main should already have been validated as safe. Great principle in my experience if you can get everyone to follow it.

If you're doing git tags for releases and you've solved these issues could you explain what you did? Could you also provide context for how often releases are performed and who actually does them?
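
For the first question, one common pattern is a small release job that computes the next version from the existing tags, creates an annotated tag on the chosen commit, and pushes only that tag. A minimal sketch, assuming tags follow a `vMAJOR.MINOR.PATCH` scheme and the remote is called `origin`; `next_tag` and `tag_release` are hypothetical helper names, not part of git or any CI tool:

```python
import re
import subprocess

# Assumed tag scheme: vMAJOR.MINOR.PATCH (for example v1.4.2).
TAG_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def next_tag(existing_tags, bump="patch"):
    """Return the next version tag after the highest existing one."""
    versions = []
    for tag in existing_tags:
        match = TAG_RE.match(tag)
        if match:  # ignore tags that do not follow the scheme
            versions.append(tuple(int(part) for part in match.groups()))
    major, minor, patch = max(versions, default=(0, 0, 0))
    if bump == "major":
        return f"v{major + 1}.0.0"
    if bump == "minor":
        return f"v{major}.{minor + 1}.0"
    return f"v{major}.{minor}.{patch + 1}"


def tag_release(commit="HEAD", bump="patch", remote="origin"):
    """Create an annotated tag on `commit` and push only that tag."""
    listed = subprocess.run(
        ["git", "tag", "--list", "v*"],
        check=True, capture_output=True, text=True,
    ).stdout.split()
    tag = next_tag(listed, bump)
    subprocess.run(
        ["git", "tag", "-a", tag, "-m", f"Release {tag}", commit], check=True
    )
    subprocess.run(["git", "push", remote, tag], check=True)
    return tag
```

A manually triggered job could call `tag_release("HEAD", bump="minor")`. Whether old tags then spawn their own build jobs depends on how the CI tool is configured to react to tag pushes.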

https://redd.it/1m3rfgl
@r_devops
Can a container know the list of mounted volumes?

I have an app that’s distributed as a Docker image, and by default it uses SQLite for simplicity. The recommendation is to use an external DB like Postgres, but users who want to keep it simple can keep using SQLite.

The issue is that sometimes they forget to map the SQLite path to a host path, so when the container dies the data is lost.

Any suggestions on how to alert the user (other than in the documentation)?
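
One possible approach, sketched under assumptions: on Linux a process can read `/proc/self/mountinfo`, and a bind mount or volume shows up there as its own mount point, so the app could check at startup whether the database path sits on one and print a warning if not. The path `/data/app.db` and the helper names below are hypothetical, and the check cannot tell a named volume from an anonymous one:

```python
import sys
from pathlib import PurePosixPath


def mount_points(mountinfo_text):
    """Collect the mount points listed in /proc/<pid>/mountinfo content."""
    points = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 5:
            # The fifth field is the mount point as seen by this process.
            # Paths containing spaces are octal-escaped there; this sketch
            # does not decode them.
            points.add(fields[4])
    return points


def is_backed_by_mount(path, points):
    """True if `path` or a parent directory (except /) is a mount point."""
    target = PurePosixPath(path)
    candidates = [target, *target.parents]
    return any(str(c) in points for c in candidates if str(c) != "/")


if __name__ == "__main__":
    db_path = "/data/app.db"  # hypothetical SQLite location in the container
    try:
        with open("/proc/self/mountinfo") as handle:
            points = mount_points(handle.read())
    except OSError:
        points = None  # not Linux, or /proc unavailable: skip the check
    if points is not None and not is_backed_by_mount(db_path, points):
        print(
            f"WARNING: {db_path} is not on a mounted volume; "
            "data will be lost when the container is removed.",
            file=sys.stderr,
        )
```

The root mount is skipped on purpose, since the container's own writable layer is always mounted at `/` and would make every path look persistent.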

https://redd.it/1m3taue
@r_devops
I analyzed 50k+ LinkedIn job posts to build job-focused DevOps Roadmaps

Hi Folks,

We've been working on roadmaps https://prepare.sh/roadmaps and figured we'd share it here to get some thoughts from the community.

All data is based on LinkedIn job postings (Jan 2025 to present). The main angle here is to land jobs or increase salary/total comp, and imo the best way to do this was to use recent job market data rather than listing every possible DevOps tool.

We built a trends system and analyzed tons of LinkedIn job posts based on what companies are actually hiring for (the system is live on our site too). Instead of one generic roadmap, we made separate ones for SRE, SysAdmin, MLOps, DevSecOps, Cloud Engineer, and classic DevOps. Each has actual courses linked to the topics.

All the foundation courses are completely free. There's a small fee for advanced content to help cover server costs since they come with live environments - most are 1-click deployments of Kubernetes, Grafana, Prometheus, Postgres, Mongo, Kafka, Vault, etc.

Please lmk what you think!

https://redd.it/1m3vg3x
@r_devops
Can I work with DevOps?

I graduated last month and have an opportunity to study DevOps at a pretty good place. I know how to code using Python and JS (fullstack). What are your thoughts?

https://redd.it/1m3ya74
@r_devops
Suggestions for Observability & AIOps Projects Using OpenTelemetry and OSS Tools

Hey everyone,

I'm planning to build a portfolio of hands-on projects focused on Observability and AIOps, ideally using OpenTelemetry along with open source tools like Prometheus, Grafana, Loki, Jaeger, etc.

I'm looking for project ideas that range from basic to advanced and showcase real-world scenarios—things like anomaly detection, trace-based RCA, log correlation, SLO dashboards, etc.

Would love to hear what kind of projects you’ve built or seen that combine the above.

Any suggestions, repos, or patterns you've seen in the wild would be super helpful! 🙌

Happy to share back once I get some stuff built out!

https://redd.it/1m3xkwj
@r_devops
How to actually think as a DevOps and cloud engineer?

I'm new to this, 22 years old, graduated 2 weeks ago. I somehow managed to get my GCP Associate, AZ-104, SC-900, learned some tools and all, but I dunno... I still feel like I'm nothing.

I know you'll say "do projects and real things," but let's be honest, we all use AI or follow some tutorial built on an existing cloud architecture. Like, I dunno, I feel like I'm not a real engineer.

I want to actually think like a DevOps/cloud engineer, but I'm struggling with imposter syndrome here. How do you move from just following tutorials to actually understanding and creating solutions, and developing that real way of thinking?

https://redd.it/1m41z5q
@r_devops
multiple net interfaces handling

Hi, recently I was thinking about the following case:

I have a Linux desktop machine that is plugged into network A via an Ethernet cable and also has WLAN enabled, connected to network B. Both interfaces are up and running. How do I know which interface is currently used, e.g. when I open the browser and enter a site, or execute apt in a terminal?
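
On Linux the kernel, not the application, picks the interface: for each destination it consults the routing table, prefers the most specific matching route, and among equally specific routes prefers the lowest metric. The `ip route get <address>` command reports that decision directly. As a rough sketch of the same logic, assuming IPv4 only, a little-endian host, and no policy routing rules, the main table in `/proc/net/route` can be read like this (`parse_routes` and `pick_interface` are hypothetical names):

```python
import ipaddress
import socket
import struct


def _hex_to_ip(value):
    # /proc/net/route prints addresses as hex in host byte order;
    # little-endian (x86, most ARM) is assumed here.
    return socket.inet_ntoa(struct.pack("<I", int(value, 16)))


def parse_routes(text):
    """Parse /proc/net/route content into (network, metric, interface)."""
    routes = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 8:
            continue
        prefix_len = bin(int(fields[7], 16)).count("1")
        network = ipaddress.IPv4Network(
            f"{_hex_to_ip(fields[1])}/{prefix_len}", strict=False
        )
        routes.append((network, int(fields[6]), fields[0]))
    return routes


def pick_interface(routes, destination):
    """Return the interface of the best route, or None if nothing matches."""
    address = ipaddress.IPv4Address(destination)
    matches = [route for route in routes if address in route[0]]
    if not matches:
        return None
    # Longest prefix wins; among equal prefixes the lowest metric wins.
    best = max(matches, key=lambda route: (route[0].prefixlen, -route[1]))
    return best[2]


if __name__ == "__main__":
    try:
        with open("/proc/net/route") as handle:
            table = parse_routes(handle.read())
    except OSError:
        table = []  # not Linux, or /proc unavailable
    print(pick_interface(table, "8.8.8.8"))
```

Traffic to a host on network B would match B's subnet route and leave via WLAN, while everything else follows whichever default route has the lower metric.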

https://redd.it/1m46ebk
@r_devops
Seen a lot of good things about kodecraft, but the price is too high for an unemployed person from India

Hi,
I have been a lurker here, commenting here and there. There are two websites I see popping up in comments: Kodecloud and kodecraft. Kodecloud is good for learning, but I saw that kodecraft provides hands-on experience. Coming from an economically challenged background, $97 each month looks like too much given price parity. Is there any way to get a discount?

https://redd.it/1m48sks
@r_devops
How do you use Go for scripting?

Dear Problem Solvers,

I use Bash, Python and JS at work and I kinda like the ability to call an npx command for something I’ve scripted in nodejs. It personally helps me a lot with pipelines and automation.

But I’m rather new to Go, and I was wondering how I could use it for my tasks. Any tips or examples from your work?

Do you always need to do a “go build” in an earlier step on the pipeline to use that?

https://redd.it/1m49ro7
@r_devops
A growing wave of “AI SRE” tools - Are they production ready?

Recently, I met with a startup founder (through Rappo) who is working on an "AI SRE" platform. That led me down a rabbit hole of just how many tools are popping up in this space.

BACCA.AI – Is the first AI-native Site Reliability Engineer (SRE) to supercharge your on-call shift
OpsVerse – Aiden, an agentic copilot that demystifies your DevOps processes
TierZero – Your AI Infrastructure Engineer
Cleric – The first AI for application teams that investigates like a senior SRE
Traversal – Traversal is an AI-powered site reliability platform that automates root cause detection and remediation
OpsCompanion – Chat-based assistant that streamlines runbooks and suggests resolutions.
SRE.ai (YC F24) – AI agents automating DevOps workflows via natural language interfaces.
parity-sre (YC) – “World’s First AI SRE” for Kubernetes; auto‑investigates and triages alerts before engineers.
Deductive AI – Code-aware reasoning engine building unified graphs to find root causes in petabytes of logs.
Resolve AI – AI production engineer that cuts MTTR by 5x with autonomous troubleshooting.
Fiberplane – Collaborative incident response notebooks, now supercharged with AI.
RunWhen – 100x faster with Agentic AI

Curious to hear what the take is on these AI SRE tools.

Has anyone tried any of these? Also, are there any open-source alternatives out there?

https://redd.it/1m4egqq
@r_devops
Proxmox-GitOps - a Self-configuring GitOps Environment for Container Automation in Proxmox VE

Hi everyone, I wanted to share my GitOps project for my homelab, a self-configuring CI/CD environment for Proxmox:
https://github.com/stevius10/Proxmox-GitOps

Proxmox-GitOps is built to manage and deploy LXC containers in Proxmox, fully defined as code and easy to modify via Pull Request. Consistent, modular, and dynamically adapting to changing environments and base configurations.

A single command (and accepting the Pull Request in the Docker environment, ha) bootstraps the recursive deployment:

- The Docker-based environment pushes its own codebase as a monorepo, referencing modular components (containers you define are automatically integrated as submodules), each integrated into CI/CD. This triggers the pipeline.
- The pipeline then triggers itself — updating references, enforcing state, and continuing recursively.

Provisioning is handled via Ansible using the Proxmox API. Configuration is managed with Chef/Cinc cookbooks focused on application logic.
Shared configuration is applied consistently across all services. Changes to the base system propagate automatically.
It’s easily extensible, aiming to have all containers built the same way. There’s an explanation of how to do this in the README of the repository.

This project is still young and there are most likely some bugs. I built it primarily for my own homelab, but I’d like to develop it further. Would really appreciate your input – even (or especially) if you run into issues.
Thank you in advance for any interest or feedback you have 🙂


https://redd.it/1m4fwki
@r_devops