Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Share sensitive data securely (Yopass, PasswordPusher alternative)

Hey everyone,
I’ve been working on a small side project to solve a common pain point: sharing sensitive data securely.

Introducing SecureShare - Your Secret, Your Key, Our Link

🔐 Client-side encryption: Your data is encrypted in your browser using AES-256.
🧠 Zero-knowledge: The encryption key never touches the server.
🕓 Self-destruction: Choose between single-use or limited multiple views.

Get started:
https://secure.ardd.cloud


Feedback is appreciated :)

https://redd.it/1mjxoxa
@r_devops
Build a Smart Search App with LangChain and PostgreSQL on Google Cloud

This walkthrough covers enabling the pgvector extension in Google Cloud SQL for PostgreSQL, setting up a vector store, and using PostgreSQL data with LangChain to build a Retrieval-Augmented Generation (RAG) application powered by the Gemini model via Vertex AI. The application performs semantic searches on a sample dataset, using vector embeddings for context-aware responses. Finally, it is deployed as a scalable API on Cloud Run using FastAPI and LangServe.
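
As a rough illustration of the semantic search step, the sketch below ranks documents by cosine distance in plain Python. It is only a stand-in for what pgvector does inside PostgreSQL (its `<=>` operator computes cosine distance). The three-dimensional toy vectors, the `DOCS` table, and the `search` helper are made up for illustration; a real setup would get its embeddings from an embedding model.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical toy "embeddings"; real ones have hundreds of dimensions.
DOCS = {
    "postgres tuning guide": [0.9, 0.1, 0.0],
    "cloud run deployment": [0.1, 0.9, 0.1],
    "chocolate cake recipe": [0.0, 0.1, 0.9],
}


def search(query_vec, k=2):
    """Return the k document names closest to query_vec, nearest first."""
    ranked = sorted(DOCS, key=lambda name: cosine_distance(query_vec, DOCS[name]))
    return ranked[:k]
```

In a RAG application, the top-ranked documents would then be passed to the model as context for the answer.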

If you are interested, check it out:

https://medium.com/@rasvihostings/using-cloud-sql-for-postgresql-with-pgvector-and-langchain-for-semantic-search-b88a06a4e186

https://redd.it/1mjxwt6
@r_devops
How much of your job involves administering tools and user management?

My company has really thrown the kitchen sink at SaaS products. A new one seems to come up every week, and I'm struggling to keep track of them. We have SSO enabled for the majority of them, but there are some exceptions, and we still need to do work in Google Workspace when new ones need to be integrated or group memberships need to be changed.

It often feels like I'm doing office IT rather than DevOps. We used to have a security/office IT guy who was in charge of all this, but he had to scale his role back because he was too expensive, and most of his duties were dumped onto us.

Are things like this a common occurrence? Do you consider managing tools and users as just part of the job as a platform/DevOps engineer?

https://redd.it/1mk0s1l
@r_devops
Installing packages not available in Linux repos

How do you install packages such as OpenSSH on several machines when new versions are not available in the Linux repos (AlmaLinux, for example)? Compiling and installing on a few machines is not complicated, but with many machines, repeating the same process becomes time-consuming. I have looked into building an RPM package or using FPM. What options do you recommend?
I am using Chef; for previous versions of OpenSSH it was very easy for my recipe to install the package using the package manager.

https://redd.it/1mk0byh
@r_devops
Is CloudQuery usable on-premises?

I need a CMDB and a unified inventory for on-premises VMs and K8s pods.

Can CloudQuery be deployed on-premises to reach this goal?

https://redd.it/1mk4asv
@r_devops
Write & Test Scripts Faster -- Validate AI-generated scripts' execution before copy-pasting them

I created an AI script generator where you can create scripts (currently Python and Bash) and test their execution before copy-pasting them into your IDE or repo.

https://aiops.drdroid.io/script-generator

It’s free and no login is required. Would love to get feedback from folks here. :)

https://redd.it/1mk96sg
@r_devops
What are your biggest pain points and blockers

With everybody using AI and no-code these days, developing has gotten so easy. I'm curious to know what kinds of problems y'all run into now that many traditional problems are solved. Anything with developing, deployment, analytics, etc. My biggest blocker now is deployment.

https://redd.it/1mkc8h3
@r_devops
Following up on my 'Developer Toil' CLI: Your feedback helped shape v0.6.0, now with multi-service local envs.

Hey r/devops,

Thanks to everyone who weighed in on my post about tackling developer toil last week. Your real-world insights were invaluable.

Two main themes emerged from your feedback:

1. **Validation:** Yes, this is a real problem, and many of you have built similar, complex in-house solutions.
2. **The Challenge:** The hardest part isn't generating config; it's defining the "best practices" that go into it.

I took that to heart. While defining universal best practices is impossible, I realized I could build a flexible framework to help teams apply *their own*.

With that, I've just released **v0.6.0 of Open Workbench.** This update focuses on solving the local development piece of the puzzle for multi-service applications.

**Here’s how it addresses the workflow:**

* **Declarative Local Environments:** The new `workbench.yaml` acts as a single source of truth for defining all the services, components (e.g., gateways), and resources (DBs, caches) that make up your local development environment.
* **Automated Orchestration:** The `om compose` command reads the manifest and generates a full `docker-compose.yml` on the fly. This eliminates manual configuration and ensures consistency for every developer on the team.
* **Abstracted Dependencies:** The "Resource Blueprint" system allows developers to attach common infrastructure dependencies like PostgreSQL or Redis locally, with the system designed to target Terraform modules in the future.
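
To make the manifest-to-compose idea concrete, here is a minimal sketch of how such a translation could work. It is not Open Workbench's actual implementation: the manifest schema, the `BLUEPRINTS` table, and the `to_compose` function are all hypothetical, and the result is printed as JSON (which is also valid YAML) only to stay within the Python standard library.

```python
import json

# Hypothetical manifest, standing in for a parsed workbench.yaml.
# The real schema may look quite different.
MANIFEST = {
    "services": {
        "api": {"build": "./api", "port": 8000, "resources": ["db", "cache"]},
        "web": {"build": "./web", "port": 3000, "resources": []},
    },
    "resources": {
        "db": {"blueprint": "postgres"},
        "cache": {"blueprint": "redis"},
    },
}

# Hypothetical blueprint table: blueprint name -> container image.
BLUEPRINTS = {"postgres": "postgres:16", "redis": "redis:7"}


def to_compose(manifest):
    """Translate the manifest into a docker-compose style dictionary."""
    compose = {"services": {}}
    # Each resource becomes a container built from its blueprint image.
    for name, resource in manifest["resources"].items():
        compose["services"][name] = {"image": BLUEPRINTS[resource["blueprint"]]}
    # Each service is built locally and depends on the resources it attaches.
    for name, service in manifest["services"].items():
        entry = {
            "build": service["build"],
            "ports": [f"{service['port']}:{service['port']}"],
        }
        if service["resources"]:
            entry["depends_on"] = list(service["resources"])
        compose["services"][name] = entry
    return compose


if __name__ == "__main__":
    print(json.dumps(to_compose(MANIFEST), indent=2))
```

Keeping the blueprint lookup in its own table is what would let the same manifest later target something other than local containers, such as Terraform modules.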

**I'm looking for your operational insights on these changes:**

* Does this `workbench.yaml` approach seem like a scalable way to manage local environments?
* What operational blind spots or potential "gotchas" do you see in this workflow?
* How can this model better pave the way for a smooth transition to cloud deployments (e.g., Terraform generation)?

**Call for Contributors:**

Your feedback confirmed that many companies are solving this same problem internally. My goal is to build a robust, open-source alternative we can all share and improve. I'm looking for contributors interested in:

* **Platform Engineering:** Helping to shape the vision and architecture.
* **Infrastructure as Code:** Building out the Terraform generation capabilities.
* **Extensibility:** Defining more resource blueprints for tools like Kafka, RabbitMQ, or specific databases.

Let's build the tool we've all had to build in-house, but do it once, in the open.

**GitHub Repo:** [`https://github.com/jashkahar/open-workbench-platform`](https://github.com/jashkahar/open-workbench-platform)

Thanks for helping guide this project!

https://redd.it/1mkk5ll
@r_devops
What stack is just reliable and requires minimal ops?

Hi everyone

I am curious. What's a stack that requires minimal DevOps, hand-holding, and yak shaving?

Is it PHP?

Can I just set unattended upgrades and leave a site running for years?

https://redd.it/1mkoc60
@r_devops
Semantic Clinic — a reproducible map of AI failures (math-first, MIT, model-agnostic)

I’m publishing the Semantic Clinic as the canonical, MIT-licensed index for diagnosing and fixing AI failures with math, not folklore. It is a model-agnostic, pipeline-aware triage hub that you can apply to GPT, Claude, Gemini, local LLMs, single agents or multi-agent stacks. The single source of truth lives here:

Semantic Clinic (canonical link):
https://github.com/onestardao/WFGY/blob/main/ProblemMap/SemanticClinicIndex.md

OCR Legend Tesseract.js Author Starred my repo (WFGY on top now)
https://github.com/bijection?tab=stars

What it is.

Most failures are layered: OCR → parsing → chunking → embeddings → vector store → retriever → prompt assembly → LLM reasoning. One upstream distortion hides a downstream hallucination. The Clinic organizes these into reproducible failure families (prompting, retrieval/data, reasoning, memory/long-context, multi-agent/orchestration, infra/deploy, evaluation). Each family links to a precise fix page and acceptance criteria. No prompt tricks, no patchwork—every remedy is a structural intervention.

What we’ve shipped.

A field-tested Problem Map and Clinic that cover the common failure patterns devs actually hit in production (RAG drift, traceability gaps, logic collapse, memory fractures, agent conflicts, bootstrap/deploy deadlocks, etc.).
One-click sandboxes/Colabs (linked from the Clinic/Problem Map) that run the instruments without installation or private APIs.
A thin “TXT OS” operating layer (referenced from the Clinic) so any model can apply the engine with zero install.
Cold start to now: ~50+ days, ~360 from real users; growth driven by issue reports and fixes, not hype. We also maintain a running testimony of field saves: Hero Log → https://github.com/onestardao/WFGY/discussions/10

The mathematics (concise spec).
The Clinic is powered by three instruments and four repair operators. You don’t need to memorize the algebra to use them, but the math is public and consistent across pages.

ΔS (semantic stress). A scalar drift signal computed from embedding geometry; we use `ΔS = 1 − cos(I, G)` where I is the current view and G is the ground/anchor. Operational thresholds: `<0.40` stable, `0.40–0.60` transitional, `≥0.60` high risk. Probe question vs. retrieved context and context vs. expected anchor to localize where meaning tears.
λ_observe (layered observability). A finite-state tag per layer: convergent (→), divergent (←), recursive (<>), chaotic (×). If upstream λ is stable and downstream flips divergent, the fault is at the boundary between those layers.
E_resonance (coherence control). A rolling statistic on residual magnitude under correction; if E rises while ΔS stays high, perform a controlled reset and variance clamp.
Repair operators (WFGY modules).
BBMC — semantic residue minimization: reduce ‖B‖ with re-grounding and anchor re-specification.
BBPF — multi-path progression: explore/weight parallel semantic paths to avoid dead ends.
BBCR — collapse→rebirth control: detect failure at threshold and rebuild a safe bridge node.
BBAM — attention variance modulation: stabilize attention to prevent entropy melt in long or noisy contexts.
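
The ΔS definition and thresholds given above can be sketched in a few lines. This is a minimal illustration, not the project's own code: the function names `delta_s` and `risk_zone` are hypothetical, and the vectors are assumed to be plain lists of floats produced by whatever embedding model you use.

```python
import math


def delta_s(i_vec, g_vec):
    """Semantic stress: 1 - cos(I, G) for a current view I and an anchor G."""
    dot = sum(x * y for x, y in zip(i_vec, g_vec))
    norm_i = math.sqrt(sum(x * x for x in i_vec))
    norm_g = math.sqrt(sum(y * y for y in g_vec))
    return 1.0 - dot / (norm_i * norm_g)


def risk_zone(ds):
    """Map a ΔS value onto the thresholds stated in the post."""
    if ds < 0.40:
        return "stable"
    if ds < 0.60:
        return "transitional"
    return "high risk"
```

Computing ΔS once for the question against the retrieved context and once for the context against the expected anchor shows which of the two hops carries the drift.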

How you verify fixes.

Keep it falsifiable. Target ΔS ≤ 0.45 for direct QA after retrieval/prompt corrections; require λ to remain convergent across paraphrases; ensure E_resonance does not trend upward over longer windows; make retrieval traceable (cite lines/snippets). If those conditions do not hold, you don’t “tune” more prompts; you change the structure (index metric/normalization, schema lock, bridge nodes, agent boundaries, boot order).
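
One way to turn these acceptance criteria into an automated check is sketched below. It rests on assumptions the post does not spell out: "trending upward" is read here as a positive least-squares slope over the window, λ is represented as a list of string tags (one per paraphrase), and the names `trending_up` and `fix_accepted` are hypothetical.

```python
def trending_up(values):
    """Assumed definition: the least-squares slope over the window is positive."""
    n = len(values)
    if n < 2:
        return False
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    # The slope's denominator is always positive, so the numerator's sign decides.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    return numerator > 0


def fix_accepted(delta_s_values, lambda_tags, e_resonance_window):
    """Accept a fix only if all three measurable conditions hold."""
    return (
        all(ds <= 0.45 for ds in delta_s_values)
        and all(tag == "convergent" for tag in lambda_tags)
        and not trending_up(e_resonance_window)
    )
```

Traceability of retrieval (citing lines or snippets) is not covered here, since it is a property of the pipeline's output rather than a number to threshold.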

Reproducibility.
Everything in the Clinic is designed to run with fixed seeds and minimal prerequisites. The Colab tools referenced from the Clinic make the probes and resets observable end-to-end. If you only copy one thing, copy the Clinic link above; it fans out to the families, fixes, and sandboxes.

Why this belongs in open source.
Open source doesn’t need another glossy “best practices” PDF. It needs an operational map you can run in public, verify on your stack, and argue about in issues. The Clinic is that map: math-first, license-clean, reproducible, and written to be forked, critiqued, and extended.

If this saves you a day in vector-store purgatory or a night chasing phantom jailbreaks, star the repo and drop a note in the Hero Log. We read every case because the failure patterns are the dataset.

Canonical link (again):

https://github.com/onestardao/WFGY/blob/main/ProblemMap/SemanticClinicIndex.md

MIT-licensed. Contributions, counter-examples, and adversarial tests are very welcome.

https://redd.it/1mktxxc
@r_devops
Employers of DevOps Engineers

I love being a DevOps Engineer. I like solving problems, learning about new stuff, understanding big systems, helping people, and getting paid pretty well.

You know what kinda sucks though? There's only certain kinds of employers that hire DevOps Engineers. Sometimes I'll think about who else I could work for, and then I'll be reminded that they don't have my role at that company.

For example, I live in a small-mid-sized town, far away from any big city. I work remotely. If I wanted to find a job locally I surely could. But it would most likely be as a systems engineer or something and it wouldn't pay nearly as well as what I'm making now.

Another example, I see some big company that has a reputation for being a good member of the community, doing charitable works, etc. Wouldn't it be neat to work for them? Oh, but they're a traditional retailer. They have IT for sure, but probably not programmers, let alone DevOps.

To work as a DevOps Engineer you usually have to work for somewhere fairly sizeable and either in a big city or remote for a place in a big city.

#firstworldproblems

https://redd.it/1mkwgr2
@r_devops
Infragram: C4 style architecture diagrams for Terraform

Hello everyone,

I'm working on Infragram, an architecture diagram generator for Terraform. I thought I'd share it here and gather some early feedback from the community.

It's packaged as a VS Code extension you can install from the marketplace. Once installed, you can generate two types of diagrams:

1) An architecture diagram which is a source representation.

2) A plan diagram which is a visual representation of your plan diff.

The diagrams are interactive and allow you to zoom in and out to see varying levels of detail for your infrastructure, a la the C4 Model. It also runs completely offline; your code never leaves your machine.

I've put together a quick video to demo the concept, if you please.

You can also see these sample images 1, 2, 3, 4 to get an idea of what the diagrams look like.

Do check it out and share your feedback, would love to hear your thoughts.

https://redd.it/1mkxnc7
@r_devops
Daily Upskilling after office hours

I recently got into DevOps and I'm preparing for a certification, which definitely demands consistency and good practice.

I'd like to connect with people in the same field who can reliably show up daily and study for at least an hour.

We can study, do projects, or work on anything DevOps-related, depending on our timing and interests.

# Let's connect!

https://redd.it/1mkv9xt
@r_devops
Specializing in Kubernetes/OpenShift vs. going full DevOps

I see many “DevOps Engineer” roles mixing ops, dev, and tooling — feels like being spread too thin. I’m instead focusing on becoming highly skilled in Kubernetes/OpenShift (admin, architecture, security) while knowing enough tools like Git, CI/CD, automation, and monitoring to integrate with teams.

Do you think deep K8s/OpenShift specialization is a smart long-term move, or will it limit opportunities compared to a generalist DevOps path?

https://redd.it/1ml0gv4
@r_devops
Made an Azure DevOps Pipeline Visualization tool, since I couldn't find one

We are doing a lot of complex pipeline stuff at work recently, moving stages around in pipelines and it is VERY easy to get them wrong. One big annoyance I had is that I realized in Azure DevOps, there is no way to preview your pipeline while experimenting on them, if you're using YAML pipelines!

The only way to visualize your new pipeline layout is to run your pipeline! That is no good.

So I wrote this single-page app using the Konva JS library, which is awesome for drawing arrows and lines. It should work on any YAML file, but I made it primarily for ADO: FoxDeploy - ADO Visualizer. I literally spent about an hour trying to draw lines on my own in native Canvas and JavaScript before giving up and using the Konva JS package instead; they had good docs.
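
For a sense of what a visualizer has to work out before drawing anything, the sketch below derives a stage dependency graph and a valid run order. It is not the tool's code (which is JavaScript); the `STAGES` list is a made-up stand-in for a parsed `stages:` section, and `stage_graph` and `run_order` are hypothetical names. It relies on the documented Azure Pipelines behaviour that a stage without `dependsOn` implicitly depends on the stage defined just before it, while `dependsOn: []` removes that implicit dependency.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical stages, standing in for the parsed `stages:` list of a pipeline.
STAGES = [
    {"stage": "Build"},
    {"stage": "Test"},                   # no dependsOn: implicitly waits for Build
    {"stage": "Lint", "dependsOn": []},  # empty list: no dependencies at all
    {"stage": "Deploy", "dependsOn": ["Test", "Lint"]},
]


def stage_graph(stages):
    """Map each stage name to the list of stages it depends on."""
    graph = {}
    previous = None
    for stage in stages:
        name = stage["stage"]
        if "dependsOn" in stage:
            deps = stage["dependsOn"]
            # dependsOn may be a single name or a list of names.
            deps = [deps] if isinstance(deps, str) else list(deps)
        else:
            deps = [previous] if previous else []
        graph[name] = deps
        previous = name
    return graph


def run_order(stages):
    """Return one valid execution order (dependencies before dependents)."""
    return list(TopologicalSorter(stage_graph(stages)).static_order())
```

The edges in `stage_graph` are exactly the arrows a diagram would draw, and a cycle in `dependsOn` surfaces as a `graphlib.CycleError` rather than a failed pipeline run.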

I used some ChatGPT help to get this done in a reasonable amount of time, especially around parsing the YAML files, so feel free to burn me at the stake for that if you need to.

Open an issue on the repo if you find a bug or want me to add some more features. No metrics, and no data leaves the container or is saved.


https://redd.it/1ml2lk7
@r_devops
How do you guys categorize all of your skills and known services on your resume?

After working in DevOps for a couple years and going back to update my resume, I realize I'm struggling to fit all of my relevant skills and finding a way to categorize them properly. There are so many services you pick up and learn that it really clogs up your resume. For context I'm talking about the Skills section of a resume, usually displayed at the top. I previously just had 'Languages:' and 'Technologies:' but I feel that I need to split up the technologies subheader into a couple of different things.

My tentative list is something like:
CI/CD: AWS, Kubernetes, Terraform, Helm, ArgoCD, Jenkins, Linux, Docker, GitHub Actions
Monitoring: Prometheus, Grafana, Fluent Bit, Elasticsearch, Kibana
Languages: JavaScript, Python, SQL, Bash, Go, C/C++

Leftover skills I would like to include if possible:
IT/compliance skills: Entra ID (Azure AD), OAuth2, SSO, IAM
Other: Networking, Kafka, GCP (minimal), CDNs

Dev related things that I used previously but not anymore and already cut from my resume:
MySQL, MariaDB, MongoDB, Node.js, .NET, C#

Then there are other things I see in job descriptions that I use and know but wouldn't really think to highlight on my resume, like Bitbucket, Jira, YAML, JSON, and Agile.

Are any of the things I cut worth keeping?
I already decided not to include AWS services individually or else the list would get too long, but are things like EKS, VPC, EC2, IAM, Lambda too important not to include or should those be relegated to the experience section? Just because I do see quite a few named AWS services in job descriptions.

Let me know your experience with this and what you guys would suggest, thanks!

https://redd.it/1ml2fx5
@r_devops
DevOps isn’t a job title, is it?

Most of the stuff I read here talks of “DevOps engineers”. We hired a DevOps engineer at my company, though he has since left. I’ve been reading up on the concepts a lot and it seems to me that it’s an approach/methodology or something like that. It doesn’t seem like a “job”, per se, just as “Agile engineer” doesn’t seem like a job to me.

A DevOps approach seems to view development and operations as part of the same effort, and not separate or opposing activities. Obviously there’s a lot of tooling and particular practice needed to do DevOps, lots of automation and monitoring, but it’s not clear to me that there is a role that should be called DevOps engineer. Am I thinking about this wrong?

https://redd.it/1ml4pxd
@r_devops
Reverse Proxy Deep Dive: Why Load Balancing at Scale is Hard

This is Part 4 in my deep dive series on reverse proxies in production. This post explores the real challenges of load balancing at scale: why simple round robin often falls short, handling uneven request loads, dynamic upstream changes, sticky sessions, and the complexities of proxy architecture.

It covers key topics like warm-up periods for hosts, local vs global load balancing views, common algorithms like least connections and consistent hashing, and practical challenges in large-scale environments.
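
To illustrate why round robin can fall short under uneven request loads, here is a small simulation comparing it with least connections. This is a toy model under stated assumptions, not taken from the linked article: one request arrives per tick, each request holds a connection for a fixed number of ticks, and the class and function names are hypothetical.

```python
import itertools


class RoundRobin:
    """Hand out hosts in a fixed rotation, ignoring current load."""

    def __init__(self, hosts):
        self._cycle = itertools.cycle(hosts)

    def pick(self, active):
        return next(self._cycle)


class LeastConnections:
    """Pick the host with the fewest open connections (first host wins ties)."""

    def __init__(self, hosts):
        self._hosts = list(hosts)

    def pick(self, active):
        return min(self._hosts, key=lambda host: active[host])


def simulate(balancer, hosts, durations):
    """Return the peak number of concurrent connections on any single host.

    One request arrives per tick; durations[i] is how many ticks request i
    keeps its connection open.
    """
    finish_times = {host: [] for host in hosts}
    peak = 0
    for tick, duration in enumerate(durations):
        # Drop connections that have finished by this tick.
        for host in hosts:
            finish_times[host] = [t for t in finish_times[host] if t > tick]
        active = {host: len(finish_times[host]) for host in hosts}
        chosen = balancer.pick(active)
        finish_times[chosen].append(tick + duration)
        peak = max(peak, len(finish_times[chosen]))
    return peak


HOSTS = ["a", "b"]
# Alternating slow and fast requests: round robin sends every slow one to "a".
DURATIONS = [10, 1, 10, 1, 10, 1]

if __name__ == "__main__":
    print("round robin peak:", simulate(RoundRobin(HOSTS), HOSTS, DURATIONS))
    print("least connections peak:", simulate(LeastConnections(HOSTS), HOSTS, DURATIONS))
```

With this workload, round robin stacks all the slow requests on one host while least connections spreads them out. Note that this model gives the balancer a perfect global view of connection counts, which a real proxy fleet with only local views does not have.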

If you manage load balancing or proxy infrastructure, I’d love to hear your thoughts or experiences with these challenges.

10-minute read here: https://startwithawhy.com/reverseproxy/2025/08/08/ReverseProxy-Deep-Dive-Part4.html
Previous parts cover connection management, HTTP parsing, and service discovery.

https://redd.it/1ml6vdu
@r_devops