Reddit DevOps

Managing browser-heavy CI/CD tests without heavy containers: any slick setups?

My CI pipeline relies heavily on browser-based end-to-end tests (OAuth flows, payment redirects, multi-session scenarios). Containers and headless browsers work, but they're resource-intensive and sometimes inaccurate due to fingerprint differences.
Has anyone used tools that provide isolated, local browser sessions you can script or profile-test with minimal overhead?

https://redd.it/1ljw1a6
@r_devops

From the devops community on Reddit

Explore this post and more from the devops community


Reddit DevOps

[Feedback Wanted] Container Platform Focused on Resource Efficiency, Simplicity, and Speed

Hey r/devops! I'm working on a cloud container platform and would love your thoughts and feedback on the concept. The objective is to make container deployment simpler while maximizing resource efficiency. My research shows that only 13% of provisioned cloud resources are actually utilized (I also used to work for AWS and can verify this number), so packing containers together should give higher utilization. I'm building a platform that will attempt to maintain \~80% node utilization, leaving 20% burst capacity without moving any workloads around. If a node does step into the high-pressure zone, we will move less-active pods to different nodes so that the very active nodes keep sufficient headroom to scale up.
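As a rough illustration of the rebalancing idea described above, here is a minimal Python sketch. It assumes CPU usage in millicores is the only pressure signal and that "less-active" means lowest recent CPU usage; the `Pod` and `Node` types, the field names, and the `pods_to_migrate` helper are all hypothetical and not part of the platform being described.

```python
from dataclasses import dataclass, field
from typing import List

TARGET_PERCENT = 80  # steady-state utilization target quoted in the post


@dataclass
class Pod:
    name: str
    cpu_millicores: int  # recent CPU usage; a hypothetical activity signal


@dataclass
class Node:
    name: str
    capacity_millicores: int
    pods: List[Pod] = field(default_factory=list)

    def used(self) -> int:
        return sum(p.cpu_millicores for p in self.pods)


def pods_to_migrate(node: Node, target_percent: int = TARGET_PERCENT) -> List[Pod]:
    """Pick the least-active pods until the node is back at or under the target."""
    budget = node.capacity_millicores * target_percent // 100
    excess = node.used() - budget
    chosen: List[Pod] = []
    for pod in sorted(node.pods, key=lambda p: p.cpu_millicores):
        if excess <= 0:
            break
        chosen.append(pod)
        excess -= pod.cpu_millicores
    return chosen


# Hypothetical node at 95% utilization (9,500 of 10,000 millicores).
node = Node("node-a", 10_000, [Pod("api", 5_000), Pod("db", 3_000),
                               Pod("cron", 1_000), Pod("docs", 500)])
print([p.name for p in pods_to_migrate(node)])  # ['docs', 'cron']
```

One consequence this sketch makes visible: moving only the least-active pods frees little capacity per migration, so a node far above the target may need many moves before it drops back under 80%.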

My main motivation was that I wanted to make edits to open source projects and deploy those edits to production without having to either self-host or use something like ECS or EKS, as they have a lot of overhead and are very expensive... Now I see that Cloudflare JUST came out with their own container hosting solution after I had already started working on this, but I don't think a little friendly competition ever hurt anyone!

I also wanted to build something faster than commodity AWS or DigitalOcean servers without giving up durability, so I am looking to use physical servers with the latest CPUs, a full refresh every 3 years (easy since we run containers!), and RAID 1 NVMe drives to power all the containers. The node's persistent volume, stored on the local NVMe drive, will be replicated asynchronously to replica node(s) to allow for fast failover. No more of this EBS powering our databases... Too slow.

Key Technical Features:

* True resource-based billing (per-second, pay for actual usage)
* Pod live migration and scale down to ZERO usage using [zeropod](https://github.com/ctrox/zeropod)
* Local NVMe storage (RAID 1) with cross-node backups via [piraeus](https://piraeus.io/)
* Zero vendor lock-in (standard Docker containers)
* Automatic HTTPS through Cloudflare.
* Support for port forwarding raw TCP ports with additional TLS certificate generated for you.

Core Technical Goals:

1. Deploy any Docker image within seconds.
2. Deploy Docker containers from the CLI by just pushing to our Docker registry (not real yet): `docker push ctcr.io/someuser/container:dev`
3. Cache common base images (redis, postgres, etc.) on nodes.
4. Support failover between regions/providers.

Container Selling Points:

* No VM overhead - containers use \~100MB instead of 4GB per app
* Fast cold starts and scaling - containers take seconds to start vs servers which take minutes
* No cloud vendor lock-in like AWS Lambda
* Simple pricing based on actual resource usage
* Focus on environmental impact through efficient resource usage

Questions for the Community:

1. Has anyone implemented similar container migration strategies? What challenges did you face?
2. Thoughts on using Piraeus + ZeroPod for this use case?
3. What issues do you foresee with the automated migration approach?
4. Any suggestions for improving the architecture?
5. What features would make this compelling for your use cases?

I'd really appreciate any feedback, suggestions, or concerns from the community. Thanks in advance!

https://redd.it/1ljvhjs
@r_devops


Reddit DevOps

DevOps Roadmap

Hello everyone, I really want to move into DevOps, but I'm struggling to find a job. Some background: I have 4+ years in IT, mainly working with networking equipment, Linux servers, firewalls, and IPS. I have taught myself Python and have worked with Git, Docker, and K8s in a home environment (obviously not a pro). Any tips at this point would be appreciated, and if you want to share the story of how you became a DevOps engineer, feel free. Thanks in advance!

https://redd.it/1ljy9lq
@r_devops


Reddit DevOps

Codeline, baseline, and mainline confusion, especially between codeline and baseline

Mainline seems to be something that will be released, but codeline and baseline sound similar. What is the difference? Context: Git flow workflow.

https://redd.it/1ljz6w8
@r_devops


Reddit DevOps

DevOps/SE Starter Guide

Business Management graduate here, working at a tech consulting company in the UK and looking to get into Project Management. My company does a lot of software engineering and DevOps work, but my technical background is very limited, so I understand the financial aspects of projects but not the service delivery side.

Does anybody have recommendations for free courses (or even YouTube videos) that start from the very beginning? Most that I have tried assume some prior knowledge, of which I have basically none. Thanks!

https://redd.it/1ljzzsk
@r_devops


Reddit DevOps

Am I literally the ONLY person who's hit this ArgoCD + Crossplane silent failure issue??

Okay, this is driving me absolutely insane. Just spent the better part of a week debugging what I can only describe as the most frustrating GitOps issue I've ever encountered.

The problem: ArgoCD showing resources as "Healthy" and "Synced" while Crossplane is ACTIVELY FAILING to provision AWS resources. Like, completely failing. AWS throwing 400 errors left and right, but ArgoCD? "Everything's fine! 🔥 This is fine! 🔥"

I'm talking about Lambda functions not updating, RDS instances stuck in limbo, IAM roles not getting created - all while our beautiful green ArgoCD dashboard mocks us with its lies.

The really weird part: I've been Googling this for DAYS and I'm finding basically NOTHING. Zero blog posts, zero Stack Overflow questions, zero GitHub issues that directly address this. It's like I'm living in some alternate dimension where I'm the only person running ArgoCD with Crossplane who's noticed that the health checks are fundamentally broken.

The issue is in the health check Lua logic - it processes status conditions in array order, so if Ready: True comes before Synced: False in the conditions array, ArgoCD just says "cool, we're healthy!" and completely ignores the fact that your cloud resources are on fire.

Seriously though - has NOBODY else hit this?

Are you all just... not using health checks with Crossplane?
Is everyone just monitoring AWS directly and ignoring ArgoCD status?
Am I the unluckiest person alive?
Did I stumble into some cursed configuration that nobody else uses?

I fixed it by reordering the condition checks (error conditions first, then healthy conditions), but I'm genuinely baffled that this isn't a known issue. The default Crossplane health checks that everyone copies around have this exact problem.
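To make the ordering problem concrete, here is a minimal Python sketch of the two evaluation strategies. ArgoCD health checks are actually written in Lua, so this is only an illustration of the logic, not the real check; the function names are hypothetical, and treating "Ready but not yet Synced" as `Progressing` is an assumption.

```python
def order_dependent_health(conditions):
    """Buggy pattern: returns on the first condition that matches, in array order."""
    for cond in conditions:
        if cond["type"] == "Ready" and cond["status"] == "True":
            return "Healthy"
        if cond["type"] == "Synced" and cond["status"] == "False":
            return "Degraded"
    return "Progressing"


def order_independent_health(conditions):
    """Fixed pattern: index conditions by type, then check failures first."""
    status = {cond["type"]: cond["status"] for cond in conditions}
    if status.get("Synced") == "False":
        return "Degraded"  # error conditions win regardless of position
    if status.get("Ready") == "True" and status.get("Synced") == "True":
        return "Healthy"
    return "Progressing"


# Ready: True listed before Synced: False, as described in the post.
conditions = [
    {"type": "Ready", "status": "True"},
    {"type": "Synced", "status": "False"},
]
print(order_dependent_health(conditions))    # Healthy
print(order_independent_health(conditions))  # Degraded
```

Building a lookup by condition type before deciding is what removes the dependence on array order; checking the failure case first is the same reordering the fix above describes.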

Either I'm missing something obvious, or the entire GitOps community is living in blissful ignorance of their deployments silently failing.

Please tell me I'm not alone here. PLEASE.

UPDATE: Fine, I wrote up the technical details and solution here because apparently I'm pioneering uncharted DevOps territory over here. If even ONE person hits this after me, at least there will be a record of it existing.

https://redd.it/1lk0wgz
@r_devops

Medium

The Silent Failure: Why Your ArgoCD + Crossplane Resources Show Healthy When They’re Not

How a subtle health check logic bug can make your GitOps pipeline lie to you


Reddit DevOps

nbuild, Yet Another Ci/Cd.

nbuild in action: https://nappgui.com/builds/en/builds/r6349.html

Oriented to C/C++ projects based on CMake.
Written in ANSI C90 with NAppGUI-SDK.
Runs as a command line tool: `nbuild -n network.json -w workflow.json`
Works on a local network, no cloud bills.
Monolithic design, no scripting.
Splits large build jobs into priority queues.
Threading. Multiple runners in parallel.
SSH is the only requirement on runners, apart from CMake and compilers.
Power on/off on demand. Supports VirtualBox, UTM, VMware, macOS bless.
Runners are preconfigured. No setup from scratch.
Supports legacy systems.
Generates HTML5/LaTeX/PDF project documentation with ndoc.
HTML5 build reports.
Open Source: https://github.com/frang75/nbuild

https://redd.it/1ljyfhz
@r_devops


Reddit DevOps

🛡️ RELIAKIT TL-15 Open-Source Chaos + Healing Framework for Planet-Grade Infrastructure

Built for resilience engineers, platform teams, and SREs who want more than just monitoring — they want autonomous recovery.

Let me know what you think — would love your input and improvements!

🔗 GitHub again:

https://github.com/zebadiee/reliakit-tl15

🤝 Looking For
• Feedback on architecture
• Contributors to test new zones
• Suggestions for AI drift detection features
• Adoption in real infrastructure setups

https://redd.it/1lk2lli
@r_devops

GitHub

GitHub - zebadiee/reliakit-tl15: ReliaKit TL-15 is an open-source, planet-grade resilience framework for distributed infrastructure.…

ReliaKit TL-15 is an open-source, planet-grade resilience framework for distributed infrastructure. It integrates automated DDoS protection, geo-aware routing, chaos engineering, and symbolic AI ho...


Reddit DevOps

Getting a Remote Job is hard – Returning After Maternity Break

I’ve been working in an office-based DevOps role for 10 years. After a brief 2-month maternity leave, I hope to work remotely for at least a year to care for my newborn.

However, reality has hit hard — I’ve been actively applying on LinkedIn and over 20 other platforms for the past two months with zero responses.

I’ve tried all the common remote job sites people recommend, even registered on Toptal, freelancer.com, and many others, but they seem overwhelmed right now.

I’m not outdated — I have solid experience with AWS, GCP, Kubernetes, Linux, Jenkins, Argo, Kafka, and many other widely used tools.

Not sure if I’m doing something wrong or if the market is just this tough. If anyone has any advice, leads, or referrals, I’d deeply appreciate it.

https://redd.it/1lk58xc
@r_devops


Reddit DevOps

Any DevOps podcasts / newsletters / LinkedIn people worth following?

Hey everyone!

Trying to find some good stuff to follow in the DevOps world — podcasts, newsletters, LinkedIn accounts, whatever.

Could be deep tech, memes, hot takes, personal stories — as long as it’s actually interesting

If you've got any favorites I'd love to hear about them!

https://redd.it/1lk4l7t
@r_devops


Reddit DevOps

Containerized PDF-OCR Workflow: Trying the New OCRFlux

Hey all, just wanted to share some notes after playing around with a containerized OCR workflow for parsing a batch of PDF documents - mix of scanned contracts, old academic papers, and some table-heavy reports. The goal was to automate converting these into plain Markdown or JSON, and make the output actually usable downstream.

Stack:
- Docker Compose setup with a few containers:
1. Self-hosted Tesseract (via tesseract-ocr/tesseract image)
2. A quick Nanonets test via API calls (not self-hosted, obviously, but just part of the pipeline)
3. Recently tried out OCRFlux - open source and runs on a 3B VLM, surprisingly lightweight to run locally

What I found:
- Tesseract
1. It's solid for raw text extraction from image-based PDFs.
2. Struggles badly with layout, especially multi-column text and anything involving tables.
3. Headers/footers bleed into the content frequently.
4. Works fine in Docker, barely uses any resources, but you'll need to write a ton of post-processing logic if you're doing anything beyond plain text.

- Nanonets (API)
1. Surprisingly good at detecting structure, but I found the formatting hit-or-miss when working with technical docs or documents with embedded figures.
2. Also not great at merging content across pages (e.g., tables or paragraph splits).
3. API is easy to use, but there’s always the concern around rate limits or vendor lock-in.
4. Not ideal if you want full control over the pipeline.

- OCRFlux
1. Was skeptical at first because it runs a VLM, but honestly it handled most of the pain points from the above two.
2. Deployed it locally on a 3090 box. Memory usage was high-ish (\~12-14GB VRAM during heavy parsing), but manageable.
3. What stood out:
- Much better page-reading order, even with weird layouts (e.g., 3-column, Chinese and English mixed PDFs). If the article has different levels of headings, the font size will be preserved.
- It merges tables and paragraphs across pages, which neither Tesseract nor Nanonets handled properly.
- Exports to Markdown that’s clean enough to feed into a downstream search/indexing pipeline without heavy postprocessing.

- Trade-offs / Notes:
1. Latency: Tesseract is fastest (obviously), OCRFlux was slower but tolerable (~5-6s per page). Nanonets varies depending on the queue/API delay.
2. Storage: OCRFlux’s container image is huge. Not a problem for my use, but could be for others.
3. Postprocessing effort: If you care about document structure, OCRFlux reduced the need for cleanup scripts by a lot.
4. GPU dependency: OCRFlux needs one. Tesseract doesn’t. That might rule it out for some people.

TL;DR: If you’re just OCRing receipts or invoices and want speed, Tesseract in a container is fine. If you want smarter structure handling (esp. for academic or legal documents), OCRFlux was way more capable than I expected. Still experimenting, but this might end up replacing a few things in my pipeline.
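For the Tesseract path, the header/footer bleed mentioned above is one of the post-processing steps that is easy to sketch. The following is a minimal example, assuming the OCR output is already available as one plain-text string per page; `strip_repeated_lines` and the `min_fraction` threshold are hypothetical choices, not part of any of the tools discussed.

```python
from collections import Counter
from typing import List


def strip_repeated_lines(pages: List[str], min_fraction: float = 0.6) -> List[str]:
    """Drop lines that recur on at least min_fraction of pages (likely headers/footers)."""
    if len(pages) < 2:
        return list(pages)  # nothing to compare against

    page_lines = [[line.strip() for line in page.splitlines()] for page in pages]

    # Count each distinct non-empty line once per page.
    counts: Counter = Counter()
    for lines in page_lines:
        counts.update({line for line in lines if line})

    threshold = min_fraction * len(pages)
    repeated = {line for line, n in counts.items() if n >= threshold}

    return ["\n".join(line for line in lines if line not in repeated)
            for lines in page_lines]


# Hypothetical OCR output for three pages of one document.
pages = [
    "ACME Corp Confidential\nFirst paragraph.\nDraft v2",
    "ACME Corp Confidential\nSecond paragraph.\nDraft v2",
    "ACME Corp Confidential\nThird paragraph.\nDraft v2",
]
print(strip_repeated_lines(pages))
```

This only catches lines that repeat verbatim; footers that change per page, such as page numbers, would need pattern matching on top.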

https://redd.it/1lk6qsx
@r_devops


Reddit DevOps

Quick wins to cut your CI/CD costs in half

We've worked with countless teams on optimizing their CI/CD costs. Here are the biggest wins that work regardless of your setup:

Immediate fixes:

Switch to spot instances (60-90% cheaper, works fine for most builds)
Audit your runner idle time - most teams have runners sitting unused 70% of the time
Cache everything aggressively (npm, pip, docker layers, etc.)
Right-size your runners - most builds don't need 8-core machines

Slightly harder but bigger impact:

Parallelize tests intelligently
Kill zombie jobs (set proper timeouts)
Use build matrices only when necessary
Optimize Docker layer ordering

The spot instances thing alone cut one team's AWS bill by 65%. Most CI workloads handle interruptions just fine.
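To see how the spot discount and idle-time figures combine, here is a back-of-the-envelope Python sketch. Every number in the example call is hypothetical (the hourly rate, the always-on month, the 30% busy fraction, the 65% discount), and the model assumes runners can be shut down completely while idle.

```python
from typing import Tuple


def estimate_monthly_cost(hourly_rate: float,
                          hours_provisioned: float,
                          busy_fraction: float,
                          spot_discount: float) -> Tuple[float, float]:
    """Return (baseline, optimized) monthly cost for one runner.

    baseline:  on-demand pricing, runner provisioned the whole time.
    optimized: pay only for busy hours, at the discounted spot price.
    """
    baseline = hourly_rate * hours_provisioned
    optimized = baseline * busy_fraction * (1.0 - spot_discount)
    return baseline, optimized


# Hypothetical runner: $0.40/hour, always on (720 h), busy 30% of the time,
# spot price 65% below on-demand.
baseline, optimized = estimate_monthly_cost(0.40, 720, 0.30, 0.65)
print(f"baseline ${baseline:.2f}, optimized ${optimized:.2f}")
```

The two factors multiply, which is why tackling idle time and instance pricing together tends to matter more than either one alone.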

We wrote up a more complete guide if you're interested: https://depot.dev/blog/how-to-reduce-cicd-costs-complete-optimization-guide

What's worked for you? Always looking for more optimization tricks.

https://redd.it/1lka3p0
@r_devops


Reddit DevOps

Apple Container: native support for containers on Mac is game changing, or 'meh'?

Apple recently released native support for containers. I've been trying it for local dev stuff like Postgres and Redis, and it is looking fast and lightweight.

Apple came late with this announcement, but I think it might be a big deal. Making the most of Macs for containerized apps in production could soon be a reality. I have seen big vendors like GitHub using Mac Minis to run production systems such as their CI/CD pipelines with GitHub Actions; maybe this will happen more now that containers are natively supported?

It still lacks support for many things we have in the Docker ecosystem (compose, orchestration tools, etc), but I hope they catch up with the latest docker compatible stuff soon.

What are your thoughts on it? Are you using it or planning to?

I built a terminal UI to make it easy to manage Apple containers. It is written in Go.
https://github.com/andreybleme/lazycontainer

https://redd.it/1lk5wmp
@r_devops

GitHub

GitHub - apple/container: A tool for creating and running Linux containers using lightweight virtual machines on a Mac. It is written…

A tool for creating and running Linux containers using lightweight virtual machines on a Mac. It is written in Swift, and optimized for Apple silicon. - apple/container


Reddit DevOps

OpsGenie shutting down, Pagerduty or Rootly?

I sure as hell will not switch my entire workflow / ticketing system over to Atlassian LOL. but i get it, most companies they're targeting probably already have Atlassian contracts.

Stuff I need:

\- integrations with ASPM / DSPM (crowdstrike/groundcover).. i'm not writing lambda functions to convert one alert into another.

\- not charged arm and leg for phone calls

\- slack integration would be a massive plus.

\- good team modelling.

\- different on-call schedules and overrides. if can integrate with HR management system that'd save me so much time LOL

\- don't really care about the UI much, hopefully don't have to log-in more than a few times a month

pricing obviously cheaper better.

looks like both have "easy" migration, where they'll do it for us

thoughts?

https://redd.it/1lkcjxg
@r_devops


Reddit DevOps

Azure - VMSS undergoing maintenance.

Anyone else seeing this over and over today? I'm in CentralUS and all my VMSSs have been going into maintenance on and off for the last few hours.

https://redd.it/1lkc9m1
@r_devops


Reddit DevOps

I have an interview for a Junior DevOps engineer position at EY, what to expect in interview?

So the interview is supposed to be strictly 30 minutes. My guess is it will mostly be behavioral questions about my background. Does anyone have any experience with this? It's with the IT Risk and Compliance Team.

https://redd.it/1lkfy4z
@r_devops


Reddit DevOps

what is the best way to learn helm charts?

i have completed a helm charts course on cloud guru and i feel like i get the concept well enough, but i wouldn't know where to even begin if i had to develop a helm chart for an application without using the public repo. which sucks, because i have been tasked to do exactly that at work.

to those who are proficient at Helm, what was your learning method? how did you go from watching or reading about it to actually developing working charts?

https://redd.it/1lkh9zi
@r_devops


Reddit DevOps

study course or book to learn DevOps from zero to hero

I was googling and there are so many offerings on learning devops i wanted to come on here and ask what is the preferred way to start my journey.

my background is a network engineer, i have used ansible and netmiko python library to run simple repetitive tasks like backing up config on network gear.

thanks

https://redd.it/1lki900
@r_devops


Reddit DevOps

Transitioning from Cybersecurity to Cloud Architecture — Advice Welcome

I recently transitioned from a cybersecurity role into a cloud architect position and received a $17K raise—bringing my total comp to $115K. I’ve got around 3 years of experience, hold a master’s degree, and currently work as a Lead Associate with a TS/SCI clearance.

That said… I can’t shake the feeling that I’m still underpaid given my background, skills, and clearance. I'm looking ahead and trying to figure out what’s next in my journey.

Reddit—has anyone made a similar leap or been in this position before? What advice would you give someone trying to level up from here?

https://redd.it/1lkllx0
@r_devops


Reddit DevOps

Introducing DockedUp: A Live, Interactive Docker Dashboard in Your Terminal 🐳

Hello r/devops!

I’ve been working on DockedUp, a CLI tool that makes monitoring Docker containers easier and more intuitive. If you’re tired of juggling docker ps, docker stats, and switching terminals to check logs or restart containers, this might be for you!

## What My Project Does
DockedUp is a real-time, interactive dashboard that displays your Docker containers’ status, health, CPU, and memory usage in a clean, color-coded terminal view. It automatically groups containers by docker-compose projects and uses emojis to make status (Up 🟢, Down 🔴) and health (Healthy ✅, Unhealthy ⚠️) instantly clear. Navigate containers with arrow keys and use hotkeys to:
- l: View live logs
- r: Restart a container
- x: Stop a container
- s: Open a shell inside a container
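The grouping by docker-compose project can be pictured with a short Python sketch. This is not DockedUp's actual implementation; it assumes container metadata is already available as plain dictionaries (the `name` and `labels` keys are a hypothetical shape) and relies on the `com.docker.compose.project` label that Docker Compose sets on the containers it creates.

```python
from collections import defaultdict
from typing import Dict, List

COMPOSE_PROJECT_LABEL = "com.docker.compose.project"


def group_by_compose_project(containers: List[dict]) -> Dict[str, List[str]]:
    """Group container names by Compose project; unlabeled ones share one bucket."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for container in containers:
        labels = container.get("labels") or {}
        project = labels.get(COMPOSE_PROJECT_LABEL, "(no project)")
        groups[project].append(container["name"])
    # Sort projects and names so the dashboard order is stable between refreshes.
    return {project: sorted(names) for project, names in sorted(groups.items())}


# Hypothetical container metadata.
containers = [
    {"name": "shop-web-1", "labels": {COMPOSE_PROJECT_LABEL: "shop"}},
    {"name": "shop-db-1", "labels": {COMPOSE_PROJECT_LABEL: "shop"}},
    {"name": "adhoc", "labels": {}},
]
print(group_by_compose_project(containers))
```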

## Target Audience
DockedUp is designed for developers and DevOps engineers who work with Docker containers and want a quick, unified view of their environment without leaving the terminal. It’s ideal for those managing docker-compose stacks in development or small-scale production setups. Whether you’re a Python enthusiast, a CLI lover, or a DevOps pro looking to streamline workflows, DockedUp is built to save you time and hassle.

## Comparison
Unlike docker ps and docker stats, which require multiple commands and terminal switching, DockedUp offers a single, live-updating dashboard with interactive controls. Compared to tools like Portainer (web-based) or lazydocker (another CLI), DockedUp is lightweight, focuses on docker-compose project grouping, and integrates emoji-based visual cues for quick status checks. It’s Python-based, easy to install via PyPI, and doesn’t need a web server, making it a great fit for terminal-centric workflows.

## Try It Out
It’s on PyPI and takes one command to install (I recommend pipx for CLI tools):

pipx install dockedup

Or:

pip install dockedup

Then run dockedup to start the monitor. Check out the GitHub repo for more details and setup instructions. If you like the project, I’d really appreciate a ⭐ on GitHub to help spread the word!

## Feedback Wanted!
I’d love to hear your thoughts—any features you’d like to see or issues you run into? Contributions are welcome (it’s MIT-licensed).

What’s your go-to way to monitor Docker containers?

Thanks for checking it out! 🚀

https://redd.it/1lkmf9d
@r_devops

GitHub

GitHub - anilrajrimal1/dockedup: A real-time, interactive CLI dashboard for monitoring Docker containers. View status, health,…

A real-time, interactive CLI dashboard for monitoring Docker containers. View status, health, CPU, and memory usage with a clean, color-coded interface. Supports docker-compose grouping and hotkeys...


Reddit DevOps

SysDE at AWS worth it?

I'm in an interview loop with AWS for the Systems Development Engineer role building a new region.

My current experience is mainly in AWS, K8s, Python & Shell. The learning opportunities in my current role are great, despite the pay being average. My goal is to maximise my earning potential by getting into big tech, while also having access to learning opportunities, especially on the dev side of DevOps.

Despite the pay at AWS being potentially great, the job description of the SysDE role seems very vague. I haven't been told much other than the fact that it involves Linux and some programming.

Anyone been a SysDE at AWS? What's the exact tech stack? How much dev work does it really involve? If it turns out to be mostly Linux administration, I'm not sure it's worth it, even with the great pay package.

https://redd.it/1lknnyf
@r_devops
