Reddit DevOps
266 subscribers
30.9K links
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Ultimate DevOps Roadmap 2025 for Absolute Beginners

I have created a detailed blog on how to start your DevOps journey in 2025, with free resources for each step and a suggested time frame. If you are a beginner looking to start your DevOps journey, this guide will help you a lot. Thanks.

DevOps Roadmap

https://redd.it/1iujyxy
@r_devops
embedz - Easy, dependency free embeds for Svelte and Vue.

Hey guys, just wanted to showcase a component library I've been working on for a few months. I have finally released a Svelte version. I'm open to feedback, as I'd love to improve and polish this project.

If you wanna check out the project, here's the repo. Also, a star would be awesome :3

GitHub - Playground

# Installation

# Supports only Svelte for now, requires Svelte 5 and above
npm i @embedz/svelte

<script>
  import { YouTube, Vimeo } from "@embedz/svelte";
</script>

<YouTube
  id="KRVnaN29GvM"
  posterquality="max"
/>

https://redd.it/1iuk5d2
@r_devops
Securing non-human identities, focusing on authorization - why and how

Hey devops people. There’s been quite a bit of talk about NHIs (non-human identities), especially around the security risks and vulnerabilities that NHIs present to orgs, which OWASP has highlighted.

That's why I wanted to share a potential solution to some of those risks with you all, in case it could be useful.

Of the issues mentioned by OWASP, several (e.g. overprivileged NHIs) can be avoided relatively easily through proper authorization of NHIs.

But it’s not that simple to authorize workloads in distributed systems if you don’t have a centralized solution. For example, each service might end up implementing its own authorization logic and defining implicit trust boundaries with dependent systems. This creates inconsistencies and increases the risk of security gaps.

Here's the solution my team and I have worked on. (Disclaimer: I work at Cerbos, an authorization implementation and management solution.)

Instead of scattering access rules across different services, Cerbos centralizes policy management, making authorization a scalable, maintainable, and secure process and minimizing the complications of managing authorization for non-human identities.

Here’s how it works:

1. Issue a unique identity to each workload. These identities are then passed in API requests, and used to determine authorization decisions.
2. Define authorization policies for non-human identities. 
3. Deploy Cerbos in your architecture (Cerbos supports multiple deployment models: sidecar, centralized PDP, serverless). Cerbos synchronizes policies across your environments, ensuring that every decision is consistent and up to date.
4. Access the Policy Decision Point (PDP) from anywhere in your stack to get authorization decisions.
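To make step 4 a bit more concrete, here is a minimal Python sketch of the request a workload would send to the PDP. The payload shape is loosely modeled on Cerbos's HTTP check API, but treat the field names, the SPIFFE-style identity, and the helper function as illustrative assumptions rather than the exact Cerbos contract; consult the Cerbos docs for the real API.

```python
def build_check_request(workload_id, roles, resource_kind, resource_id, actions):
    """Build a PDP check request for a non-human identity (workload).

    The workload's identity (e.g. a SPIFFE ID or service account name)
    becomes the principal; the PDP decides per resource and action.
    Field names are illustrative -- verify against current Cerbos docs.
    """
    return {
        "principal": {"id": workload_id, "roles": roles},
        "resources": [
            {
                "resource": {"kind": resource_kind, "id": resource_id},
                "actions": list(actions),
            }
        ],
    }


# Example: a billing worker asking whether it may read an invoice.
request = build_check_request(
    workload_id="spiffe://example.org/billing-worker",  # hypothetical identity
    roles=["workload"],
    resource_kind="invoice",
    resource_id="inv-42",
    actions=["read"],
)
```

In a real deployment this payload would be POSTed to the PDP (sidecar or central endpoint), and the service would enforce the returned allow/deny decision instead of hard-coding its own rules.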

The technical details on how to authorize NHIs with Cerbos can be found on this page.

If you think this type of solution would be helpful for you (or if it wouldn’t, for any reason), I'd love to hear why.

https://redd.it/1iuqbv7
@r_devops
too long; automated: learn to automate unit tests, git tagging, Docker image building & pushing, integration tests, and deployment to Cloud Run using GitHub Actions and Workload Identity Federation. Final part of the "one branch to rule them all" series

I couldn't find an in-depth guide on how to go from requirements gathering, through implementation and testing, to automation using a CI/CD approach, so I created one: https://www.toolongautomated.com/posts/2025/one-branch-to-rule-them-all-4.html

I've tried to make it as comprehensive as possible, while keeping it conversational and simply fun.

The project I've worked on is:

How to deploy an app to multiple environments so that each env can run a different version of the application?

The implementation is fully open-sourced here: https://github.com/toolongautomated/tutorial-1

Enjoy and let me know what you think guys!

https://redd.it/1iusife
@r_devops
Made our production scaling more than 9x faster and image pulling 290x faster

Should I blog it? I am an intern and I somehow managed to pull this off.

Image size ~2.6GB stored in ECR

Scaling / application starting time in EKS
Before: ~ 4 min and 20 sec
After: ~ 35 sec

Image pulling time
Before: ~ 1 min and 30 sec
After: ~ 280 ms

If you find it interesting, lemme know... I'll blog it this weekend or post it here.

https://redd.it/1iuud94
@r_devops
DevOps in Censorship: Lessons from the TopSec Leak


A data leak from TopSec provides insights into DevOps practices in censorship.

Understanding how advanced technologies, such as Kubernetes and Docker, are leveraged by companies engaged in censorship can inform better security practices within the industry.

This leak illustrates the need for ethical considerations in the deployment of such technologies, urging industry professionals to reflect on their roles.

- Discusses DevOps tools used within censorship operations.

- Explores the need for ethical guidelines in technology deployment.

- Encourages DevOps professionals to consider the broader societal implications of their work.

(View Details on PwnHub)


https://redd.it/1iuv7s0
@r_devops
Why have interviews become so one-sided nowadays?

I have been giving interviews lately and have encountered so many instances where the interviewers are not even trying to interact with the interviewee. They just start the process and begin grilling you as if they are facing their enemy, and then at the end, with very little interest, ask whether you have any questions.

I have given a lot of interviews in the past, but this time it feels completely different. They expect everything to be perfect in an hour-long call, and based on that they decide whether you're a fit or not.

Folks please add your thoughts.

https://redd.it/1iuwewc
@r_devops
On-Premise Minio Distributed Mode Deployment and Server Selection

Hi,

First of all, for our use case we are not allowed to use any public cloud, so AWS S3 and the like are not an option.

Let me give a brief overview of our use case. Users will upload files of roughly 5 GB each. Then we have a processing time of 5-10 hours. After that, we do not actually need the files; however, we offer download functionality, so we cannot just delete them. For this reason, we are considering a hybrid object store deployment: one hot object store on compute storage and one cold object store off-site. After processing is done, we will move the files to the off-site object store.

On the compute cluster, we use Longhorn and deploy MinIO via the MinIO Operator in distributed mode with erasure coding. This covers the hot object store.

However, we have not yet decided how our cold object store should look. Our questions:
1. Should we again use Kubernetes, as on the compute cluster, and deploy the cold object store on top of it, or should we just run the object store directly on the OS?
2. What hardware should we buy? Let's say we are OK with 100 TB of storage for now. There are storage server options that can hold 100 TB. Should we just go with a single physical server? In that case, deploying Kubernetes feels off.

Thanks in advance for any suggestion and feedback. I would be glad to answer any additional questions you might have.

https://redd.it/1iuy4xk
@r_devops
How does everyone handle versioning/releases with monorepos?

We are using Trunk Based Development & a monorepo setup for around 50 services.

Ideally, I would like each service to be individually versioned, as having one version for all of them doesn't scale well, mainly because it would trigger a release pipeline for every service, even those with no changes.

How does everyone approach this around releases?

It is not scalable either to have the developers or owners cut a release branch for every single service (release/service1/1.0.0 or release/service2/1.0.1, for example). It would take a while and would just be tedious.

How does everyone approach this situation?

I was thinking of some sort of pre-release pipeline that runs git diff to determine which release branches should be cut. The only issue is figuring out how to get the pipeline to determine which version component should be bumped; we are using semver.
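For what it's worth, the two halves of that idea (detecting touched services and deciding the bump) can be sketched in a few lines of Python. This is a hedged sketch, not battle-tested tooling: it assumes services live under a top-level services/ directory and that commit subjects follow conventional commits (feat:, fix:, ! for breaking changes); the function names are mine.

```python
def parse_changed_services(paths, root="services"):
    """Map changed file paths (from `git diff --name-only base...HEAD`)
    to the set of service directories that need a release."""
    services = set()
    for path in paths:
        parts = path.split("/")
        if len(parts) >= 2 and parts[0] == root:
            services.add(parts[1])
    return services


def bump_semver(version, commit_subjects):
    """Pick the semver bump from conventional-commit subjects:
    '!' after the type (or a BREAKING prefix) -> major,
    feat -> minor, anything else -> patch."""
    major, minor, patch = (int(x) for x in version.split("."))
    types = [s.split(":")[0] for s in commit_subjects]
    if any("!" in t or s.startswith("BREAKING") for t, s in zip(types, commit_subjects)):
        return f"{major + 1}.0.0"
    if any(t.startswith("feat") for t in types):
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

A pre-release job could feed `git diff --name-only <last-tag>...HEAD` into the first function and each changed service's commit subjects into the second, then cut tags or branches only for the services that actually changed.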

https://redd.it/1iuvs6y
@r_devops
Kubernetes Ingress Controller Guide

If you are interested in learning how to expose services in Kubernetes, read through my new blog article! It's a step-by-step guide on how to set up an NGINX Ingress Controller via Helm charts.

Medium Blog Article Link

https://redd.it/1iv23xw
@r_devops
Private tf module registry still a thing?

Long story short, we have tons of terraform module re-use and copy/paste across repos and services, so we are looking to create a central module registry/monorepo.

Is this still what most folks are doing? Is this still an adequate way of providing some self-service to product engineers without them having to worry about how their infrastructure is provisioned?

I know there's a lot of new tooling and platforms in this space, so I'm curious what others are doing. Things move so fast that it always feels like we are doing things incorrectly.

Thanks

https://redd.it/1iv6tfl
@r_devops
Windows vs Linux on enterprise level

In which case scenarios is Windows Server better than Linux?

https://redd.it/1iv9gfh
@r_devops
My first web server

I am configuring a web server for the first time, I literally have a physical server in my hands and I am deploying web apps and REST APIs.

This is my first experience using any server OS, so I chose Windows Server. I know it is probably not the safest or most efficient choice for a web server, but I thought it was the fastest way to start and learn server concepts in a practical way. This machine has 3 disks (1 TB each); I used one for the OS and configured a RAID 1 with the other two.

At the software level, I am just using a simple Express web server to deploy every single web application, and all the APIs that are deployed are also developed in Express, so yeah, Express everywhere. I am using PM2 to handle the Node processes. When there are any code changes, I pull the code from GitHub, perform any tasks needed (building, installing dependencies, etc.), and reload the process. As the applications are used on the same local network, I create rules in Windows Defender Firewall to open the ports on which the web services or web applications are listening.
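That pull, install, build, reload loop is easy to script so every deploy runs the same way. Here is a minimal sketch in Python; the PM2 process name and the npm build step are assumptions about a typical setup, and the function just shells out to the same commands you would run by hand.

```python
import subprocess

def deploy(app_dir, pm2_name, dry_run=False):
    """Pull the latest code, install deps, build, and reload the PM2 process.

    With dry_run=True the commands are returned instead of executed,
    which is handy for checking what would run.
    """
    steps = [
        ["git", "-C", app_dir, "pull"],
        ["npm", "--prefix", app_dir, "ci"],            # clean, lockfile-exact install
        ["npm", "--prefix", app_dir, "run", "build"],  # assumes a "build" script exists
        ["pm2", "reload", pm2_name],                   # reload the running process
    ]
    if not dry_run:
        for cmd in steps:
            subprocess.run(cmd, check=True)            # stop on the first failure
    return steps
```

Even a small script like this removes the "did I forget a step?" risk and is a natural stepping stone toward the CI/CD pipelines you will meet later.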

What should I do next to improve and learn at a good rhythm? What would be the next step? My main priority is to learn all the fundamental server concepts in a practical way.

https://redd.it/1iv9ezf
@r_devops
GitLab pipeline timeout when uploading security scans to DefectDojo

Hi Everyone,

I am facing an issue trying to integrate DefectDojo with my GitLab CI/CD.

Here is the breakdown:

I am using GitLab's built-in security scanning templates for dependency scanning and container scanning.

These templates generate JSON reports after scanning.

I am using a Python script to upload these JSON reports to DefectDojo.

From my local machine, we access mydomain.defectdojo.com via VPN.

I can curl it with the VPN enabled and upload the results.

But in the GitLab pipeline, the requests API call I use for the upload throws a connection timeout to mycompany.defectdojo.com.

I also tried running curl directly in the pipeline, but it couldn't connect to the server.

Is this due to the VPN not being available in the pipeline?

How can I fix this issue?



https://redd.it/1ivbcp2
@r_devops
Secure way to share flutter mobile app without sharing code

Hi, in my company we have to give our onboarding Flutter app to the vendor whose trading app we're using and integrate our app with theirs. Is there a way to share our APK so that they can integrate it but not get access to the code?

https://redd.it/1ivclgt
@r_devops
Azure RM API Deprecations in Q1 2025 – What It Means for Terraform Users

If you’re managing infrastructure with Terraform on Azure, Q1 2025 will bring preview API deprecations for Azure Resource Manager (Azure RM), including APIs for Azure Kubernetes Service (AKS) and other resources. Now is the time to check your provider versions and ensure compatibility.

# What’s Changing?

Azure RM provides a structured way to manage and deploy Azure resources. Microsoft frequently introduces preview APIs, but these can change, get deprecated, or be removed entirely. Terraform’s azurerm provider depends on these APIs, which means unexpected changes can break your infrastructure.

# What You Should Do

- Identify the Azure services in your Terraform-managed infrastructure. Whether it's AKS, Storage, App Services, or Databases, knowing what you rely on is the first step.

- Check the API versions your provider is using. Terraform's azurerm provider often includes preview APIs, making it important to track which ones are in use (for example, the containerservice APIs in version 3.105.0).

- Monitor upcoming API deprecations. Azure phases out older APIs regularly, and failing to update could lead to outages.

- Review your Terraform provider versions. New releases may introduce breaking changes, so read the release notes before upgrading.

- Test changes in a lower environment before deploying. Validate any updates in a controlled environment to avoid unexpected failures.
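On the version-review point, pinning the provider makes upgrades deliberate rather than accidental. A sketch of such a constraint is below; the version number is only an example (matching the 3.105.0 release mentioned above), so pin whatever release you have actually validated:

```hcl
terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      # "~>" allows patch releases of the validated minor,
      # but blocks surprise jumps to a new minor or major.
      version = "~> 3.105.0"
    }
  }
}
```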

Keeping up with API deprecations is key to maintaining reliable Terraform deployments. If you haven’t reviewed your setup yet, now is the time.

https://redd.it/1ivf6ae
@r_devops
Bootstrapping CD for Terraform + Docker

TLDR: What's the best practice for managing infra with custom Docker-based images using Terraform?

We primarily use GCP and for a lot of simple services we use Cloud Run with GAR (Google Artifact Registry) to store the Docker images.

To manage the infra, we generally use Terraform and we use GitHub Actions to do CI & CD.



Deployments to new environments consist of the following steps:

1) [Terraform] Create a new GAR repository that Docker can push to.

2) [Docker] Build and push the Docker image to the newly created GAR repository.

3) [Terraform] Deploy the Cloud Run service that uses the image from GAR, alongside any other infrastructure we might need.

This 3-step process is usually how our CD (GitHub Actions) is structured and how our "local" dev (i.e. personal dev projects) works, both usually running with "just" as the command runner.

Terraform needs to have a "bootstrap" environment which gets deployed in the first step, separate from the "main" one used in the third. Instead of a separate bootstrap environment, you can also use -target to apply just the GAR repository, but that has its own downsides imo (not a fan of partial applies, especially if bootstrap involves additional steps such as service account creation and IAM role assignment).


It's possible to avoid having two Terraform apply steps by doing one of the following:

- Deploy the Cloud Run services manually using the gcloud CLI. But then you cannot manage them well via Terraform, which can be problematic in certain situations.

- Perform the bootstrap separately (perhaps as manual operations?) so normal work doesn't require it. But this sounds like a recipe for non-reproducible infra and might make disaster recovery painful.

- Run the docker commands as part of some Terraform operator (using either a null_resource with local-exec or an existing provider such as kreuzwerker/terraform-provider-docker). But this might be slow for repetitive work and might just not integrate that well with Terraform.
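For the third option, a rough sketch with the kreuzwerker Docker provider might look like the following. The project, repository, and path names are placeholders, and attribute names vary between provider versions, so treat this as a shape to evaluate rather than working config:

```hcl
# Build the image locally and push it to the GAR repository that the
# bootstrap step created, all within one Terraform apply.
resource "docker_image" "app" {
  name = "europe-west1-docker.pkg.dev/my-project/my-repo/app:latest"

  build {
    context = "${path.module}/../app"
  }
}

resource "docker_registry_image" "app" {
  name          = docker_image.app.name
  keep_remotely = true # don't delete the remote tag on destroy
}
```

The usual caveat applies: builds inside Terraform make every apply slower and couple image lifecycle to infra lifecycle, which is exactly the trade-off described above.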



Any suggestions on how we can do this better? For trivial services there's a lot of boilerplate that needs to be written, and it just drains the fun out of it tbh. With some work I suppose it's possible to reuse some of the code, but we might introduce some unnecessary constraints, and abstracting it right might take some work.

In a totally different world from my day job, my hobby NextJS apps are trivial to develop and a lot more fun. I can focus on the app code instead of all this samey stuff which adds 0 business value.

https://redd.it/1ivepjr
@r_devops
Am I Ready for DevOps?

I started off learning about DevOps soon after I got into self-hosting and running my own homelab; fast forward a few years and this has become my addiction. I currently work with VoIP and play around with Linux a bit for work, but nothing with containers or DevOps tools, so I have just been learning with my homelab.

Anyways, I'm sick of VoIP and my current role and would like to start applying for some Jr DevOps roles, but I'm curious, from the people who actually do this as a job, whether you think I am prepared enough based on my homelab.

Personally, I think I need to get better with Ansible and Kubernetes, add more things to Terraform/OpenTofu, and learn more programming languages; this is what I am working on currently.

All of the config is located here: https://git.mafyuh.dev/mafyuh/iac, or on GitHub here: https://github.com/Mafyuh/iac

Please critique and let me know what you think. This is my first time ever posting in DevOps, so I don't really know what to expect, but I'd love to hear it all, good or bad. Thank you.

https://redd.it/1ivhjrn
@r_devops
SPRING BOOT MICROSERVICES ISSUE: even after deploying my Spring Boot microservices on a DigitalOcean droplet, I am not able to reach the IP address from Postman. Why? Is there a reason, or do I lack some information about this? For example, https://111.11.11.111:8082/register/user returns an error.

Help me please! Could not send request.
Error: connect ECONNREFUSED 111.11.11.1111:8082
I deployed all my microservices and they are running on the DigitalOcean droplet as .jar files, but this error still appears. Why?
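ECONNREFUSED means the TCP connection itself was rejected: from the outside, nothing is accepting connections on that port. Common causes are the app process not actually listening on 8082, or a firewall on the droplet (e.g. ufw) blocking the port; separately, using https:// against a plain-HTTP Spring Boot port will also fail, so try http:// once the port is reachable. A tiny check like the one below (host and port are placeholders) can narrow it down:

```python
import socket

def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    A False here matches Postman's ECONNREFUSED: either nothing is
    listening on that port or a firewall is rejecting the traffic.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns False from your machine but True when run on the droplet itself against 127.0.0.1, the service is up and the firewall is the problem; if it is False on the droplet too, the service is not listening at all.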

https://redd.it/1ivhhfz
@r_devops