Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
🚀 Milestone Unlocked: 2K Stars! 🌟

My Cheat-Sheet Collection just hit 2,000 stars on GitHub!
Huge thanks to everyone who starred, shared, and contributed. Your support keeps this project growing. 🙌

If you haven't checked it out yet — it's a curated collection of high-quality PDF cheat sheets for developers, DevOps engineers, and tech enthusiasts. 📚💻

Feel free to explore, contribute, and share!
#DevOps #CheatSheet #GitHub #OpenSource #Infosec #DevSecOps #Kubernetes #Linux

https://redd.it/1kuxk2d
@r_devops
Using a really long password to SSH into a VPS: is it that bad?

If you generate a password with openssl like this:

openssl rand -base64 48

FyRFHjyJIgnl2g4DsDzv49ohmt7IQyKvGpv7UyAKwGLIJalPueMh9fxJVcGOTLsm


and use that to log in to a VPS, is it that bad?

I've checked the generated string here:

https://bitwarden.com/password-strength/#Password-Strength-Testing-Tool

- It says it will take centuries to crack.


In addition, when you enter a wrong password, the hosting company appears to add an artificial delay of a few seconds before it tells you the password is wrong.

I'm sure the hosting company will detect it if someone tries to break into your VM, after a dozen or so failed attempts, and call you.

I know the proper way of doing this is to create a new user on the VM, disable password login by changing a few files, and add your SSH keys. But compared with the single step of using passwd, it doesn't look (to me) like that will be more secure.

What's the "security" ratio here? Strong password vs SSH keys
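For a rough sense of the numbers behind the question above, here is a minimal Python sketch (not the poster's method) that estimates the entropy of a password built the way `openssl rand -base64 48` builds one: 48 random bytes, base64-encoded. The function names are hypothetical. It only measures resistance to guessing; it does not capture other differences, such as the fact that a password is sent to the server on every login while a private key never leaves the client.

```python
import base64
import secrets


def generate_password(num_bytes: int = 48) -> str:
    """Return num_bytes of OS randomness, base64-encoded (similar to `openssl rand -base64`)."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def entropy_bits(num_bytes: int) -> int:
    """Entropy comes from the random bytes (8 bits each), not from the encoded length."""
    return num_bytes * 8


password = generate_password(48)  # 48 bytes encode to 64 base64 characters
bits = entropy_bits(48)           # 48 * 8 = 384 bits
```

Under these assumptions the password carries 384 bits of entropy, so brute-force guessing is unlikely to be the realistic risk; how the secret is stored and transmitted probably matters more.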


https://redd.it/1kuz8kz
@r_devops
Spacebar Counter Using HTML, CSS and JavaScript (Free Source Code) - JV Codes 2025

With the Spacebar Counter, users can interactively count each time they press the spacebar on their keyboard. You can use this tool to check your speed or to enjoy yourself, and in each case, you’ll see a powerful example of how event handling works in JavaScript.

I have released all the source code for free, and I’ve built it using modern structure and best programming habits to enable beginners and developers to learn easily.

Source: Spacebar Counter

https://redd.it/1kuzrzm
@r_devops
🛠️ Building a No-Nonsense DevOps Course – What Would You Want In It?

Hey r/devops,

I’ve been in the DevOps space for a number of years now — led automation efforts, scaled infra, managed CI/CD pipelines, and trained engineers along the way. Now, I’m planning to build a DevOps course — but not just another course.

I want to create something that cuts through the fluff — something grounded in real-world challenges, production lessons, and what it actually takes to succeed in a DevOps role today.

The usual “install Jenkins/K8s and deploy a to-do app” just doesn’t cut it anymore. So here’s what I’m thinking:
• Production-grade examples with real troubleshooting
• Topics like GitOps, FinOps, Platform Engineering, and team workflows
• Focus on mindset: how to think like a DevOps/infra engineer, not just use tools
• Optional deep dives for those who want to go beyond “just enough to deploy”

If you were taking a course like this, what would you want to see?
What’s missing in today’s DevOps content that you wish someone taught properly?

https://redd.it/1kv43zr
@r_devops
Best Docker registry with image housekeeping support

Hi all,

We’re looking to set up a private Docker registry for our company and one of our must-have features is automatic housekeeping — we need to delete old or unused images to manage disk usage effectively.

We use Jenkins for CI/CD, which pushes images frequently, so over time our registry gets cluttered with outdated builds and untagged layers. We'd like a solution that can:

Run scheduled or on-demand cleanup jobs

Support retention policies (e.g., keep last N images or delete images older than X days)

Ideally offer a web UI and/or API for managing images

Integrate well with Jenkins or at least not get in the way
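The retention rules listed above can be sketched independently of any particular registry. The Python below is only an illustration under stated assumptions: `ImageTag` and `select_for_deletion` are hypothetical names, not part of the Harbor or Nexus APIs, and the policy shown combines both rules (a tag is deleted only if it is older than X days and not among the newest N).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ImageTag:
    """Hypothetical record for one pushed image tag."""
    name: str
    pushed_at: datetime


def select_for_deletion(tags, keep_last, max_age_days, now):
    """Return tags older than max_age_days that are not among the newest keep_last."""
    newest_first = sorted(tags, key=lambda t: t.pushed_at, reverse=True)
    protected = {t.name for t in newest_first[:keep_last]}
    cutoff = now - timedelta(days=max_age_days)
    return [
        t for t in newest_first
        if t.name not in protected and t.pushed_at < cutoff
    ]


now = datetime(2025, 5, 26)
tags = [
    ImageTag("build-101", datetime(2025, 5, 25)),
    ImageTag("build-100", datetime(2025, 5, 20)),
    ImageTag("build-099", datetime(2025, 3, 1)),
    ImageTag("build-098", datetime(2025, 2, 1)),
]
doomed = select_for_deletion(tags, keep_last=2, max_age_days=30, now=now)
```

Protecting the newest N tags even when they are old is a design choice that avoids wiping a rarely built repository; a real cleanup job would also need the registry's own garbage collection to reclaim disk space.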


We’re currently evaluating Harbor and Nexus, but open to other suggestions too. What are you using in production for this kind of setup? Any pros/cons we should know about?

Thanks!

https://redd.it/1kv5o1v
@r_devops
Transitioning to a DevOps career and the importance of certifications

I have experience in support and some infrastructure (networks and basic Linux). What would be an ideal schedule to follow to make the most of my career transition?



Another question: are certifications like LPI an important requirement when applying for these positions?

https://redd.it/1kv7rku
@r_devops
DevOps Buddy wanted! LeetCode, tech chats, open source & more!

Hey Reddit!

Looking for someone to team up with for DevOps stuff. I wanna get better at LeetCode, chat about cool tech, mess around with open-source projects, and just keep each other motivated.

I'm really into DevOps and trying to learn more about [mention something specific you're into, like Kubernetes or AWS]. LeetCode's on my list to boost my problem-solving.

If you're up for:
* LeetCode sessions: Let's tackle problems and share ideas.
* DevOps talks: Bouncing ideas around, discussing tools, or just complaining about YAML. 😉
* General tech chats: What's new? What's cool?
* Open source fun: Exploring or even contributing.
* Being accountability buddies: Keeping each other on track.

You don't have to be a guru, just enthusiastic about learning. We can link up online (Discord/Telegram, etc.) whenever works.

If this sounds like your jam, hit me up with a comment or a DM! Let's learn together.


https://redd.it/1kv8ryp
@r_devops
How I Automated My Infrastructure with Terraform

Hello everyone!
I wanted to share one of my more... questionable engineering decisions: I Terraformed my entire home network.

I've been managing my Mikrotik setup (router + switches + wireless) with Terraform for about a year now. Everything from VLANs to firewall rules is defined as code and version controlled.

All of the code is available here: https://github.com/mirceanton/mikrotik-terraform/

Why Terraform for networking?
Honestly, because it's the tool I know. When I found out the RouterOS provider existed, I just had to try it. Probably not the most practical approach, but it's been a great learning experience!

The state management situation is... creative. Can't exactly use S3 when you might accidentally terraform your own internet connection away! I ended up going with local state + SOPS encryption + Git. Works, I guess, but it's definitely not textbook.

Oh, and the amount of terraform state mv commands I've run during refactoring... SO many. I can't just destroy and recreate resources because they are, quite literally, my internet connection. I don't think I've ever had to do this much state surgery... even at work.

The whole thing taught me a lot about both Terraform and networking. Sometimes picking an overly complicated approach is the best way to learn!

I made a video about it too, if you're interested, where I go into my setup as well, not just the code: https://youtu.be/86LRoxuU5kg

Anyone else using Terraform in non-conventional ways? Would love to hear about other creative use cases or approaches!

https://redd.it/1kv99c6
@r_devops
Learn by doing

I'm looking to team up with some like-minded individuals who have a basic grasp of various tools and are ready to jump into some exciting projects! I've got a few cool ideas we could start working on together.

If you're interested in collaborating and bringing some of these ideas to life, let's create a Discord server and get started

https://redd.it/1kvdbhj
@r_devops
Hiring Managers

1) What are some of the skills with the most demand right now and will stay in demand for the next 30 or so years?

2) How is the job market right now for Cloud/DevOps and SRE roles?

https://redd.it/1kvesqr
@r_devops
Bare metal K8s Cluster Inherited


We inherited an infrastructure consisting of 5 physical servers that make up a k8s cluster: one master and four worker nodes. Workloads are also allowed to run on the master itself.

It is an ancient installation and the physical servers have either RAID-0 or single disk. They used OpenEBS Hostpath for persistent volumes for all the products.

Now, this is a development cluster but it contains important data. We have several small issues to fix, like:

- Migrate the PV to a distributed storage like NFS

- Make backups of relevant data

- Reinstall the servers and have proper RAID-1 ( at least )

We do not have many resources. We do not have (for now) a spare server.

We do have a NFS server. We can use that.

What are good options to implement to mitigate the problems we have? Our goal is to reinstall the servers using proper RAID-1 and migrate some PV to NFS so the data is not lost if we lose one node.

I listed some action points:

- Use the NFS, perform backups using Velero

- Migrate the PVs to the NFS storage


At least we would have backups and some safety.

But how could we start with the servers that do not have RAID-1? The very master itself is single disk. How could we reinstall it and bring it back to the cluster?

The ideal would be to reinstall server by server until all of them have RAID-1 (or RAID-6). But how could we start? We have only one master, and the PVs are attached to the nodes themselves.

Would be nice to convert this setup to proxmox or some virtualization system. But I think this is a second step.

Thanks!

https://redd.it/1kvdnb3
@r_devops
Scaling Postgres with Kubernetes: a guide on partitioning, sharding, and replication

I have written a guide on setting up a high-availability Postgres cluster with sharding, replication, and partitioning. I hope you find it helpful. 🐘



https://blog.sagyamthapa.com.np/scaling-postgresql-with-kubernetes

https://redd.it/1kvdc66
@r_devops
Developer to Devops resume review

I'm a backend developer with over 2.5 years of experience, and I’m looking to transition into a DevOps role. In my resume, the Developer and DevOps roles are listed under the same company. I’ve been involved in DevOps tasks for the past year, but there wasn’t much to learn beyond the tools I’ve already mentioned. That’s why I worked on personal projects to gain a deeper understanding.

Most of the DevOps skills I’ve acquired have been through these personal projects.

I’ve currently separated the Developer and DevOps roles into two parts on my resume, as I wasn’t sure how to present the experience correctly.

I would appreciate your guidance while keeping these points in mind. I’m open to omitting anything unnecessary and willing to add whatever is needed.

My resume below..
kindly review
https://i.postimg.cc/4x1BFCXw/IMG-20250523-225607.jpg

https://redd.it/1kviy4n
@r_devops
Cheaper Datadog alternative for APM?

Our Datadog bill is starting to get eye-watering for web APM purposes. We use Datadog for web APM because we need insight into site code for a couple of Python and Node.js services, and well... they were the safe choice. But our data volume has gone up quite a bit over the past 4 months, so I'm now tasked with evaluating other options.

We already use Elastic for an internal service and we're happy with that, so that could be an option for logging. I'm open to ideas: Honeycomb, Sentry, Sumo Logic, Splunk, New Relic, Dynatrace, Grafana, Groundcover, whatever works. Cloud metrics are cool, but that's not what we use DD for, so if it can't do traces it's automatically a non-starter. Preferably no deep dev integration (no code changes at all would be great); we just don't have the resources, as we've got other fires to fight. We're also open to database APM features that work well with PostgreSQL workloads and can tie web APM traces to DB traces.

Advice / input appreciated.

https://redd.it/1kvlssd
@r_devops
How I Blocked 95% of Web Attacks Using AWS WAF Blog


I recently wrote a blog post about securing web apps using AWS WAF, and how you can block up to 95% of common attacks (like SQL injection, XSS, bot traffic, and even basic DDoS) with just a few clicks in the AWS Console.

If you’re on AWS and haven’t tried WAF yet (or find it intimidating), this guide breaks it down step by step:

https://blog.prateekjain.dev/how-to-block-up-to-95-of-attacks-using-aws-waf-e2223efc1f55?sk=cc74156befaab48297655a00f352f4e6

https://redd.it/1kvm4gp
@r_devops
Best books/courses to transition from Developer to DevOps

Hello everyone,
I am a full-stack developer with 4 years of experience. I use Angular/TypeScript for the frontend and Spring Boot/Java for the backend.

I also have basic knowledge of Docker and Jenkins (using pipelines and writing basic templates), a Kubernetes developer certification, some cloud knowledge (basic AWS services and Azure Fundamentals), and some Linux basics.

I would like to transition from developer to DevOps, but I am a bit lost about which path to follow. So I would like recommendations for a couple of books or courses to help me make the transition.



PS: I know it depends and may be a bit subjective, but any guidance would help.

Thank you!


https://redd.it/1kvoyoz
@r_devops
Build an incident response workflow with n8n + Prometheus

Hey guys,

I’m working on a monitoring setup that automates basic incident resolutions.

This is the visualization of the flow:

https://drive.google.com/file/d/1HiobPj50VZp1VylyqLTXLAeqDoJtrG_x/view

I’m using Prometheus - Grafana for monitoring, Alertmanager to send alerts, and n8n to orchestrate a workflow, then an AWS Lambda function to restart the services. “Restart services” is a kind of demo action, you can customize it for your needs.

How does it work?

Prometheus: I configure some basic rules to alert when CPU/memory usage exceeds a threshold. When a threshold is exceeded, it sends a webhook to the n8n system.
n8n flow: Gets the information, analyzes the metrics, calculates the business hours or incident duration, and sends alerts to Discord or escalates to PagerDuty.
AI agent (in n8n): I define a prompt to check the input. It considers the metrics and current context to decide whether to restart the services or not.
Lambda function: Receives the commands from the AI agent and acts on them if necessary. Currently, I grant it permission to restart an EC2 instance to make the service available again when the system is overloaded.
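The routing step in the n8n flow (business hours, incident duration, Discord versus PagerDuty) could look roughly like the Python sketch below. This is not the actual workflow from the linked repositories: the business-hours window, the 15-minute escalation threshold, and the function names are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

BUSINESS_START_HOUR = 9                 # assumed, not from the post
BUSINESS_END_HOUR = 18                  # assumed, not from the post
ESCALATE_AFTER = timedelta(minutes=15)  # assumed escalation threshold


def is_business_hours(ts: datetime) -> bool:
    """True on Monday to Friday between the assumed start and end hours."""
    return ts.weekday() < 5 and BUSINESS_START_HOUR <= ts.hour < BUSINESS_END_HOUR


def route_alert(started_at: datetime, now: datetime) -> str:
    """Return 'pagerduty' for long-running or out-of-hours incidents, else 'discord'."""
    duration = now - started_at
    if duration >= ESCALATE_AFTER or not is_business_hours(now):
        return "pagerduty"
    return "discord"


# A five-minute incident on a Monday morning stays in Discord.
short_incident = route_alert(datetime(2025, 5, 26, 10, 0), datetime(2025, 5, 26, 10, 5))
```

Keeping this decision in plain, deterministic code and reserving the AI agent for the restart judgment is one possible split; the post does not say how the real flow divides the two.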

I hope this helps you set up an automated stack in your team. I've shared the example materials in these repositories:

One-click setup for Prometheus - Alertmanager - Grafana: https://github.com/Bubobot-Team/monitoring-stack/tree/main/stacks/prometheus-stack

N8n workflow in JSON format (just copy into your n8n dashboard): https://github.com/Bubobot-Team/automation-workflow-monitoring

Btw, just wondering, what recovery actions would you automate? (e.g., disk cleanup, rollback deployments). I would like to hear your feedback to improve the current flow.

https://redd.it/1kvqdph
@r_devops
Is a container an instance of an image, like an object is an instance of a class in code?

class Dog {
    String name;
    int age;

    Dog(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

// Creating multiple instances with different values
Dog dog1 = new Dog("James", 3);
Dog dog2 = new Dog("Bella", 5);

Docker

docker run -d --name app1 -e NAME=James -e AGE=3 mydogimage
docker run -d --name app2 -e NAME=Bella -e AGE=5 mydogimage



Is this true, or do I misunderstand?
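One way to model the analogy above is the small Python sketch below (Python rather than the Java used in the post, and not Docker's real API). It assumes the usual mental model: an image is a read-only template, and each container started from it gets its own environment and its own writable layer. The names `Image`, `Container`, and `run` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Image:
    """Read-only template, playing the role of the class definition."""
    name: str
    files: tuple


@dataclass
class Container:
    """Running instance of an image with its own env and writable layer."""
    image: Image
    name: str
    env: dict
    writable_layer: dict = field(default_factory=dict)

    def write_file(self, path: str, content: str) -> None:
        # Changes land in this container only; the image is never modified.
        self.writable_layer[path] = content


def run(image: Image, name: str, **env: str) -> Container:
    """Rough stand-in for `docker run --name ... -e KEY=VALUE image`."""
    return Container(image=image, name=name, env=dict(env))


dog_image = Image(name="mydogimage", files=("/app/dog.py",))
app1 = run(dog_image, "app1", NAME="James", AGE="3")
app2 = run(dog_image, "app2", NAME="Bella", AGE="5")
app1.write_file("/tmp/log.txt", "woof")
```

In this model both containers share one image, while a file written in `app1` does not appear in `app2`, which mirrors two objects built from the same class holding separate state.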

https://redd.it/1kvvp25
@r_devops
Atlassian Bamboo

Any devops who are still using this?

I’m 3 months into my promotion as devops engineer and have been given the keys to the bamboo kingdom.

It’s legacy and deprecated I believe. Also, with it being on premise it’s not the easiest to lab.

Interested in finding out who still uses this and how they find it?

I’m currently implementing a Snyk integration for our code.

Thanks and have a wonderful day!

https://redd.it/1kvx0mg
@r_devops
Migration from GCP to OCI instances

I have 10+ servers on GCP which I want to migrate to OCI. Some are production instances with live traffic and some are dev/testing servers. What is the best approach to migrate them along with all the data? Is it possible to transfer snapshots?
The GCP instances are running CentOS, while the OCI ones will run the Oracle Linux images.
Any leads would be helpful.

https://redd.it/1kvy85p
@r_devops
Questions about the LFS258 Kubernetes Course – Worth It for CKA Prep?

Hi everyone,

I'm looking into taking the **LFS258 - Kubernetes Fundamentals** course from the Linux Foundation, and I have a few questions for those who have taken it:

* Is the course mostly pre-recorded video lectures?
* Does it include hands-on labs and troubleshooting practice?
* Is it beginner-friendly for someone with **no prior Kubernetes experience**?
* Is it enough on its own to prepare for the **CKA (Certified Kubernetes Administrator)** exam?
* Would you recommend buying **just the course**, or going for the **bundle with the exam voucher**?
* Are there any known **discount codes or promotions** for this course?
* Lastly, would you say this course is a good choice for someone coming from a **Cloud Engineering background** and looking to transition into **DevOps**?

Appreciate any insights or advice you can share – thank you!

https://redd.it/1kw1ner
@r_devops