Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Recommend an automated build and deployment system for a small company

I've recently accepted a developer role at a (very) small company that sells a niche software product, in both SaaS and run-on-your-desktop variants. The company has been around for ~20 years, and all of their practices are from that era - EVERYTHING is completely manual, and done directly from developer machines, up to and including production deployments. There's little to no visibility of which software versions are running in which environment, no centralised repository for configuration, and so on.

There are only 3 IT people in the org - me, the dev who originally wrote the software, and an "IT Ops" guy who manages servers, databases, networks, and so on.

I've managed to sell the concept of automated builds and releases to management, and the next step is to write up a proposal including costs and benefits.

Where I'm now stuck is which automated build/deployment product to put into the proposal. The basic requirements are:

* Automated builds - the codebase is 90% C#/.NET, with some exceptions: some C++ code for performance-intensive work, and about half of the web code (TypeScript / jQuery / React) is currently built using yarn.
* Support for ~30 applications, a few of which are software releases to customers, but most of which are backend APIs, web applications, or batch processing apps running on our (bare metal VM, no Kubernetes/Docker) infra.
* Support for a Windows-only environment, with apps running as a mix of console applications in the foreground, Windows services, or web applications hosted in IIS.
* Ideally a simple UI showing a matrix of environments, applications, and software versions - something suitable for e.g. a product owner.
* Selectable versions and deployment targets with manual release triggers. We're a long (loooooong) way from true CI or CD. One-click stop/start/upgrade for our IT Ops guy, with dropdowns for app versions ideally driven from git tags in the associated repo (or similar).
* Email notifications of software releases to the broader team ("Application A version 3.x.y has just been released to VM1 in Production, release notes here <insert text from release notes file>").
* Constrained targets for each project - Application A should only be able to run on VM3 or VM5 in the prod environment, etc.
* Scriptable deployments, or even something YAML-based, as long as custom plugins are possible in Python, C#, PowerShell, etc.
* Affordable - our operating budget is low, as you might imagine.
* Simple and maintainable - we don't have a dedicated DevOps person, and our IT Ops guy isn't going to spend weeks or months traversing a steep learning curve.
* Eventually, support for automated tests and code quality checks - none of these exist right now and the codebase is a spaghetti mess, but that's something that will now be improving over time.
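Two of the requirements above (constrained targets and release emails) are small enough to sketch independently of whichever product ends up in the proposal. The following is a minimal illustration only, not the configuration format of any particular tool; the `ALLOWED_TARGETS` table, the `Release` class, and the function names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical constraint table: (application, environment) -> permitted VMs.
# In a real setup this would live in a central config repository.
ALLOWED_TARGETS = {
    ("Application A", "Production"): {"VM3", "VM5"},
}


@dataclass(frozen=True)
class Release:
    app: str
    version: str
    environment: str
    target: str


def is_allowed(release: Release) -> bool:
    """Return True only if the target VM is listed for this app and environment.

    Anything not listed is rejected (deny by default).
    """
    allowed = ALLOWED_TARGETS.get((release.app, release.environment), set())
    return release.target in allowed


def notification_text(release: Release, release_notes: str) -> str:
    """Build the body of the release email sent to the broader team."""
    return (
        f"{release.app} version {release.version} has just been released "
        f"to {release.target} in {release.environment}, release notes here:\n"
        f"{release_notes}"
    )
```

A deployment script would call `is_allowed` before touching the target and send `notification_text` only after the upgrade succeeds.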

I'd previously stumbled across Octopus Deploy, which seemed to tick all of our boxes - but the recent price increases have now put it well out of our budget.

Any helpful recommendations gratefully received. And no, "find another job" isn't a helpful response in this instance :P. These folks are a joy to work for in many ways, just not this particular one - and at least they're open to improvement.

https://redd.it/1iubcm8
@r_devops
Cloud Provider that offers prepaid compute?

I want to host a pretty simple backend, plus a small SQL database, somewhere in the cloud. However, I am worried about hosting this on AWS or Google Cloud, as they ostensibly do not limit how much compute you can consume; they just auto-scale it and then hit you with a big bill. I'm still relatively new to this, so I do not want to end up like those students who accidentally set up some rogue EC2 instance that balloons to tens of thousands of dollars. I simply want a cloud provider where you prepay for how much compute you want to use, and if you hit your prepaid limit, it just shuts down, with no going into the red.

Or, given this small setup, would it make more sense to not bother with the cloud at all and spin up my own local server on a Raspberry Pi? Is all of the port forwarding, setup, etc. significantly more complex than using a cloud provider?

https://redd.it/1iuekh8
@r_devops
Should I use Terraform, AWS CDK, or bash scripts with the AWS CLI?

What are your thoughts? This doesn't need to be only about AWS; I'm also interested in hearing opinions from people working with GCP and Azure, and in comparing those APIs with Terraform.

https://redd.it/1iue7u8
@r_devops
Hyperping vs. Better Stack vs. OneUptime for observability

Which one is better? Pricing is not the problem.

I am specifically interested in synthetic monitoring with Playwright.

https://redd.it/1iugfm8
@r_devops
Community Powered Cloud based on TEEs

Since AMD SEV-SNP is now fairly easy to integrate on Linux, I believe that cloud will slowly start to move away from big centralized platforms. In order to start working with SNP, you need some Rust experience and I suggest starting with virtee: https://virtee.io/

AMD SEV-SNP is focused on creating virtual machines. VirTEE offers SNP integration for QEMU, and the older technology (SEV) is also well integrated with libvirtd. Intel offers alternative technologies: Intel SGX (which offers containers, and is older and more mature in terms of frameworks and implementations) and Intel TDX (which offers VMs and is very new).

We made the decision to go down this path for our cloud start-up. We just created a testnet and are looking for feedback. If you would like to know more, I wrote a blogpost about it: https://medium.com/detee-network/so-we-have-a-testnet-now-2950de897ec6

https://redd.it/1iug77w
@r_devops
What do Systems Development Engineers do? Are they JUST testers?

I recently got mail from recruiters: the EU sovereign cloud team is hiring a systems development engineer. I cleared the OA and then the phone interview, which was pretty easy, but now I am worried. I don't want to be some kind of tester. Can you please help?

It's at AWS, and the role is called Systems Development Engineer, Managed Operations. I don't understand what it is. I don't want to be a tester and a loser; I want to build stuff and do low-level design work 😵

Here's a link about the job and description [https://www.amazon.jobs/en/jobs/2874382/systems-development-engineer-managed-operations](https://www.amazon.jobs/en/jobs/2874382/systems-development-engineer-managed-operations)

Please help 🥺🙏

https://redd.it/1iuj5qd
@r_devops
Ultimate DevOps Roadmap 2025 for Absolute Beginners

I have created a detailed blog post on how to start your DevOps journey in 2025, with FREE resources at each step and a proper time frame. If you are a beginner and want to start your DevOps journey, this guide will help you a lot. Thanks.

DevOps Roadmap

https://redd.it/1iujyxy
@r_devops
embedz - Easy, dependency free embeds for Svelte and Vue.

Hey guys, I just wanted to showcase a component library I've been working on for a few months. I have finally released a Svelte version, and I'm open to feedback, as I'd love to improve and polish this project.

If you want to check out the project, here's the repo; a star would also be awesome :33333

GitHub - Playground

# Installation

# Supports only Svelte for now, requires Svelte 5 and above
npm i @embedz/svelte

<script>
import { YouTube, Vimeo } from "@embedz/svelte";
</script>

<YouTube
id="KRVnaN29GvM"
posterquality="max"
/>

https://redd.it/1iuk5d2
@r_devops
Securing non-human identities, focusing on authorization - why and how

Hey devops people. There’s been quite a bit of talk about NHIs (non-human identities), especially around the security risks and vulnerabilities that, according to OWASP, NHIs present to orgs.

That is why I wanted to share a potential solution to some of those risks with you all, in case it could be useful.

Several of the issues mentioned by OWASP (e.g. Overprivileged NHI) can be avoided relatively easily through proper authorization of NHIs.

But it’s not that simple to authorize workloads in distributed systems if you don’t have a centralized solution. For example, each service might end up implementing its own authorization logic and defining implicit trust boundaries with dependent systems. This would create inconsistencies and increase the risk of security gaps.

The solution I'd like to present is one that my team and I have worked on. (Disclaimer: I work at Cerbos, an authorization implementation and management solution.)

Instead of scattering access rules across different services, Cerbos centralizes policy management, making authorization a scalable, maintainable, and secure process. This minimizes the complications of managing authorization for non-human identities.

Here’s how it works:

1. Issue a unique identity to each workload. These identities are then passed in API requests, and used to determine authorization decisions.
2. Define authorization policies for non-human identities.
3. Deploy Cerbos in your architecture (Cerbos supports multiple deployment models - sidecar, centralized PDP, serverless). Cerbos synchronizes policies across your environments, ensuring that every decision is consistent and up to date.
4. Access the Policy Decision Point (PDP) from anywhere in your stack to get authorization decisions.
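To make the general idea behind these steps concrete, here is a minimal, generic sketch of a centralized decision point for workload identities. It is not the Cerbos API or policy format; the `POLICIES` table, the `Workload` class, the `check` function, and the role names are all hypothetical and only illustrate "one shared policy table, deny by default".

```python
from dataclasses import dataclass

# Hypothetical central policy table: (workload role, resource kind) -> allowed actions.
# A real PDP would load these from versioned policy files, not a literal dict.
POLICIES = {
    ("billing-worker", "invoice"): {"read", "create"},
    ("report-exporter", "invoice"): {"read"},
}


@dataclass(frozen=True)
class Workload:
    """A non-human identity: a unique ID plus the role its policies are keyed on."""
    id: str
    role: str


def check(principal: Workload, resource_kind: str, action: str) -> bool:
    """Return the authorization decision for one request.

    Deny by default: a role with no matching policy gets no access,
    which is what prevents an overprivileged workload from slipping through.
    """
    allowed_actions = POLICIES.get((principal.role, resource_kind), set())
    return action in allowed_actions
```

Every service asks the same `check` function instead of carrying its own rules, so a policy change takes effect everywhere at once.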

The technical details on how to authorize NHIs with Cerbos can be found on this page.

If you think this type of solution would be helpful for you (or if it wouldn’t for any reason) I'd love to understand why.

https://redd.it/1iuqbv7
@r_devops
too long; automated: learn to automate unit tests, git tagging, Docker image building & pushing, integration tests, and deployment to Cloud Run using GitHub Actions and Workload Identity Federation (final part of the "one branch to rule them all" series)

I couldn't find an in-depth guide on how to go from requirements gathering, through implementation and testing, to automation using a CI/CD approach, so I created one: https://www.toolongautomated.com/posts/2025/one-branch-to-rule-them-all-4.html

I've tried to make it as comprehensive as possible, while keeping it conversational and simply fun.

The project I've worked on is:

How to deploy an app to multiple environments so that each env can run a different version of the application?

The implementation is fully open-sourced here: https://github.com/toolongautomated/tutorial-1

Enjoy and let me know what you think guys!

https://redd.it/1iusife
@r_devops
Made our production scaling more than 9x faster and image pulling 290x faster

Should I blog it??? I am an intern and I somehow managed to pull this off.

Image size ~2.6GB stored in ECR

Scaling / application starting time in EKS
Before: ~ 4 min and 20 sec
After: ~ 35 sec

Image pulling time
Before: ~ 1 min and 30 sec
After: ~ 280 ms
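For reference, the speedup factors implied by the rounded figures above can be worked out directly. This is plain arithmetic on the approximate numbers as posted, so the results will not necessarily match the headline figures, which presumably come from exact measurements; the `speedup` helper is just an illustrative name.

```python
def speedup(before_seconds: float, after_seconds: float) -> float:
    """Ratio of old duration to new duration (higher means faster)."""
    return before_seconds / after_seconds


# Figures as posted, converted to seconds.
startup = speedup(4 * 60 + 20, 35)   # 260 s -> 35 s
pull = speedup(1 * 60 + 30, 0.280)   # 90 s -> 0.28 s

print(f"startup: {startup:.1f}x, image pull: {pull:.0f}x")
# prints "startup: 7.4x, image pull: 321x"
```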

If you find it interesting lemme know ... I'll blog this weekend or post it here.

https://redd.it/1iuud94
@r_devops
DevOps in Censorship: Lessons from the TopSec Leak


A data leak from TopSec provides insights into DevOps practices in censorship.

Understanding how advanced technologies, such as Kubernetes and Docker, are leveraged by companies engaged in censorship can inform better security practices within the industry.

This leak illustrates the need for ethical considerations in the deployment of such technologies, urging industry professionals to reflect on their roles.

- Discusses DevOps tools used within censorship operations.

- Explores the need for ethical guidelines in technology deployment.

- Encourages DevOps professionals to consider the broader societal implications of their work.

(View Details on PwnHub)


https://redd.it/1iuv7s0
@r_devops
Why have interviews become so one-sided nowadays?

I have been interviewing lately and have run into many cases where the interviewers are not even trying to interact with the interviewee. They just start the process and begin grilling as if they were facing an enemy, and then at the end, with very little interest, ask whether you have any questions.

I have done a lot of interviews in the past, but this time it feels completely different. They expect everything to be perfect in an hour-long call, and based on that they decide whether you're a fit or not.

Folks, please add your thoughts.

https://redd.it/1iuwewc
@r_devops
On-Premise MinIO Distributed Mode Deployment and Server Selection

Hi,

First of all, for our use case, we are not allowed to use any public cloud. Therefore, AWS S3 and the like are not an option.

Let me give a brief overview of our use case. Users will upload files of ~5 GB each. Processing then takes 5-10 hours. After that, we do not actually need the files; however, we offer download functionality, so we cannot just delete them. For this reason, we are thinking of a hybrid object store deployment: one hot object store on the compute cluster and one cold object store off-site. After processing is done, we will move the files to the off-site object store.

On the compute cluster, we use Longhorn and deploy MinIO with the MinIO operator in distributed mode with erasure coding. This solves the hot object store.

However, we have not yet decided what our cold object store should look like. The questions we have:
1. Should we again use Kubernetes, as in the compute cluster, and deploy the cold object store on top of it, or should we just run the object store directly on the OS?
2. What hardware should we buy? Let's say we are OK with 100 TB of storage for now. There are storage server options that can hold 100 TB. Should we just go with a single physical server? In that case, deploying Kubernetes feels off.

Thanks in advance for any suggestion and feedback. I would be glad to answer any additional questions you might have.

https://redd.it/1iuy4xk
@r_devops
How does everyone handle versioning/releases with monorepos?

We are using Trunk Based Development & a monorepo setup for around 50 services.

Ideally, I would like each service to be individually versioned, as a single version for everything doesn't scale well, mainly because it would trigger a release pipeline for every service, even those with no changes.

How does everyone approach this around releases?

It is also not scalable to have the developers or owner cut a release branch for every single service, e.g. release/service1/1.0.0 or release/service2/1.0.1. It would take a while and would just be a tedious job.

How does everyone approach this situation?

I was thinking of some sort of pre-release pipeline that runs git diff to determine which release branches should be cut. The only issue with this is figuring out how to get the pipeline to determine which version component should be bumped; we are using semver.
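The pre-release idea could be sketched roughly as follows. This is one possible approach under stated assumptions, not an established convention for this repo: it assumes each service lives under a top-level `services/<name>/` directory, that the list of changed paths comes from something like `git diff --name-only <last-release-tag>..HEAD`, and that commit subjects follow the Conventional Commits style so the bump kind can be inferred. All function names are hypothetical.

```python
import re


def changed_services(changed_paths, services_root="services"):
    """Map changed file paths to the sorted list of service directories they touch."""
    prefix = services_root.rstrip("/") + "/"
    found = set()
    for path in changed_paths:
        if path.startswith(prefix):
            parts = path[len(prefix):].split("/")
            # Require a file inside a service directory, not a file directly under the root.
            if len(parts) > 1 and parts[0]:
                found.add(parts[0])
    return sorted(found)


def bump_kind(commit_subjects):
    """Infer the semver bump from Conventional Commits style subjects (an assumption)."""
    kind = "patch"
    for subject in commit_subjects:
        if "BREAKING CHANGE" in subject or re.match(r"^\w+(\([^)]*\))?!:", subject):
            return "major"
        if re.match(r"^feat(\([^)]*\))?:", subject):
            kind = "minor"
    return kind


def bump(version, kind):
    """Apply a major/minor/patch bump to a plain MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

The pipeline would run this per service, using only the commits that touched that service's directory, and cut a release (branch or tag) only for services returned by `changed_services`.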

https://redd.it/1iuvs6y
@r_devops
Kubernetes Ingress Controller Guide

If you are interested in learning how to expose services in Kubernetes, read through my new blog article! It's a step-by-step guide on how to set up an NGINX Ingress Controller via Helm charts.

Medium Blog Article Link

https://redd.it/1iv23xw
@r_devops
Private tf module registry still a thing?

Long story short, we have tons of terraform module re-use and copy/paste across repos and services, so we are looking to create a central module registry/monorepo.

Is this still what most folks are doing? Is this still an adequate way of providing some degree of self-service to product engineers without them having to worry about how their infrastructure is being provisioned?

I know there's a lot of new tooling and platforms in this space, so I'm curious as to what others are doing. Things move so fast that it always feels like we are doing things incorrectly.

Thanks

https://redd.it/1iv6tfl
@r_devops
Windows vs Linux at the enterprise level

In which scenarios is Windows Server better than Linux?

https://redd.it/1iv9gfh
@r_devops
My first web server

I am configuring a web server for the first time. I literally have a physical server in my hands, and I am deploying web apps and REST APIs.

This is my first experience using any server OS, so I chose Windows Server. I know that it is probably not the safest or most efficient choice for a web server, but I thought it was the fastest way to start and learn server concepts in a practical way. This machine has 3 disks (1 TB each); I used one for the OS and configured RAID 1 for the other two.

At the software level, I am just using a simple Express web server to serve every web application, and all the deployed APIs are also developed in Express, so yeah, Express everywhere. I am using PM2 to manage the Node processes. When there are code changes, I pull the code from GitHub, perform any tasks needed (building, installing dependencies, etc.), and reload the process. As the applications are used on the same local network, I create rules in Windows Defender Firewall to open the ports on which the web services or web applications are listening.
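A natural next step is to turn the manual update routine described above into a script. The following is a minimal sketch under assumptions: the exact commands (`git pull --ff-only`, `npm ci`, `npm run build`) and the PM2 process name are guesses at what the manual steps look like, and `deploy_commands` and `deploy` are hypothetical names.

```python
import subprocess


def deploy_commands(process_name, build=True):
    """Return the ordered commands for one update: pull, install, build, reload."""
    commands = [
        ["git", "pull", "--ff-only"],  # refuse to create a merge commit on the server
        ["npm", "ci"],                 # clean install from the lockfile
    ]
    if build:
        commands.append(["npm", "run", "build"])
    commands.append(["pm2", "reload", process_name])
    return commands


def deploy(app_dir, process_name, run=subprocess.run, build=True):
    """Run each step inside app_dir and stop at the first failure (check=True)."""
    for command in deploy_commands(process_name, build):
        run(command, cwd=app_dir, check=True)
```

On Windows, `npm` and `pm2` are usually installed as `.cmd` shims, so the executable names may need to be `npm.cmd` and `pm2.cmd` (or the runner may need `shell=True`). The `run` parameter exists so the sequence can be tested without executing anything.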

What should I do next to improve and learn at a good rhythm? What would be the next step? My main priority is to learn all the fundamental concepts of a server in a practical way.

https://redd.it/1iv9ezf
@r_devops
GitLab pipeline timeout when uploading security scan to DefectDojo

Hi Everyone,

I am facing an issue trying to integrate DefectDojo with my GitLab CI/CD.

Here is the breakdown:

I am using GitLab's built-in security scanning templates for dependency scanning and container scanning.

These templates generate JSON reports after scanning.

I am using a Python script to upload these JSON reports to DefectDojo.

From my local machine, we access mydomain.defectdojo.com via VPN.

I can curl it with the VPN enabled and upload results.

But in the GitLab pipeline, the requests call I use for the upload throws a connection timeout to mycompany.defectdojo.com.

I also tried running curl directly in the pipeline, but it reported that it couldn't connect to the server.

Is this because the pipeline is not on the VPN?

How can I fix this issue?
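One way to narrow this down is to run a plain TCP reachability check as the first step of the job, before the upload script. If it fails from the runner but succeeds from a machine on the VPN, the problem is network access from the runner rather than the upload code. This is a diagnostic sketch only; `can_reach` is a hypothetical helper, and port 443 is an assumption about how the instance is exposed.

```python
import socket


def can_reach(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout.

    DNS failures, refused connections, and timeouts all raise OSError
    subclasses, so all of them are reported as "not reachable".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

In the pipeline, something like `can_reach("<your DefectDojo host>")` returning False would point to the runner sitting outside the VPN; typical remedies are a self-hosted runner inside the network or an allowlisted route, depending on what the environment permits.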



https://redd.it/1ivbcp2
@r_devops