Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
What’s your workflow for tracking upstream updates for internal tools?

I believe regular version upgrades are important. Our team uses a lot of third-party tools internally, and some are even integrated into our product.

Curious how you guys are tracking their versions efficiently. Or is it just a manual check?

https://redd.it/1mjnzaq
@r_devops
Snowflake is ending password only logins. What is your team switching to?

Heads up for anyone working with Snowflake.

Password only authentication is being deprecated and if your org has not moved to SSO, OAuth, or key pair access, it is time.

This is not just a policy update. It is part of a broader move toward stronger cloud access security and zero trust.

Key takeaways

• Password only access is no longer supported

• Snowflake is recommending secure alternatives like OAuth and key pair auth

• Deadlines are fast approaching

• The transition is not automatic and needs coordination with identity and cloud teams

What is your plan for the transition, and how do you feel about the change?

For a breakdown of timelines and auth options, here’s a resource that helped:
https://data-sleek.com/blog/snowflake-password-only-access-deprecation/

https://redd.it/1mjnwlt
@r_devops
Need a partner to practise and learn DevOps after my office hours

I'm currently in a data analytics role and I'm looking to break into roles like DevOps/SRE/cloud. I need a friend with whom I can build projects and share a learning journey. I'd like to do this after my office hours, i.e. between 6pm and 12am (IST). I need someone to share my projects with, get feedback and help on them, and learn.

https://redd.it/1mjr5is
@r_devops
Can you give me some recommendations regarding certifications?

Hello group, I want to get a DevOps-related certificate. Can you share your opinion on which technologies and certificates have real-world value? I was considering AWS DevOps, but before I start I want to see which would be better. Keep in mind that I don't have much experience with the role; I'm more of a sysadmin / network security guy.

https://redd.it/1mjtkv8
@r_devops
The Jira use (or misuse)

Do you find it funny that the engineers or senior managers who advocate for tools like Jira are the ones who use it least, while the engineers who use it most hate it?

What I mean is, senior managers or PMs, for example, usually only deal with setting milestones and writing epics, then every now and then pull some reports, and that's about it. Engineers, meanwhile, have to deal with setting up boards, sprints, labels, views, queries and whatnot, which can be frustrating to say the least.

I just don't understand how this tool became the industry standard when nobody uses 80% of its features. It's so bloated, and now AI is being pushed into it, of course.

I'd be willing to bet other tools would achieve the same just fine, for a fraction of the cost. Now, of course, fighting that fight with a whole company is another story...

https://redd.it/1mjuehw
@r_devops
Follow up on "How to not be shitty at DevOps" a few months into the role.

Hello Everyone..

Using my alt account as Reddit doesn't seem to like users using VPNs and throwaway email addresses...

Anyhow, a while back I asked how to not be shitty at DevOps, since it was a new adventure for me (I was a Linux sysadmin with K8s and scripting skills): https://www.reddit.com/r/devops/comments/1klkh3e/how_to_not_be_shitty_at_devops/

I thought I owed it to the community to come back and follow up...

Initially I had some major concerns about "oops" moments and whether I measured up. I am happy to say that I landed in a great environment with a great team and good leadership. They didn't pay me to say that, honest! That said, it's a hardcore environment and results are important (but not in an at-all-costs way).

The first few days were "OMG, what have I done?" but after that, once all the accounts worked as expected and I got to know the people, it turned out to be a very good experience. I *thought* I knew the tools and tech but it was a whole new level. That said, they have been kind and patient with me, and my boss is overflowing with praise because he is getting really good feedback from all quarters.

As for "oops" moments, sure, I made a few mistakes but haven't taken anything down (yet). The thing with DevOps is that this is why you have multiple environments, and when pushing to prod it's triple check, dry run, triple check again. You learn how to minimize oops issues.

As for the pay, yes, it was very worth it. :D

I got headhunted so I can't really advise on getting positions, but I am glad I made the jump. If you get the offer, consider it.

https://redd.it/1mjulx0
@r_devops
Dealing with a bad brand new manager

I was working as a Backend-Platform Engineer in a very famous scale-up company. And, you know, things got reorged and an SRE got promoted to EM. This EM (a brand new, fresh manager) has a bad management style:

- Writes "hello" without any context (thus not following https://nohello.net/en/)
- Asks you to just click the Apply Terraform button instead of just doing it himself
- We don't have any doc summarizing our 1:1s
- No plans for promotion and no feedback given to me, and this is important: I was a Senior Eng previously but I'm not considered Senior here
- When projects are rushed, he doesn't show up in meetings while we (the soldiers) are working late nights

I already got an offer from another company, but my current job pays REALLY well and I will not get the same TC anywhere. I was already in a quiet-quitting mood, but I would like to hear your opinions and suggestions. THANKS!

https://redd.it/1mjupew
@r_devops
Why do no-code tools often fail to scale in real world use cases?

I've been burned by no-code tools a few times now. They're amazing for building a quick prototype or a simple internal app. But as soon as you try to scale it up, add more complex logic, or integrate with real production systems, they just seem to fall apart. Why does this happen? Is there something fundamentally limited about the no-code approach or am I just picking the wrong tools? It feels like you always end up needing to write actual code.

https://redd.it/1mjw5kr
@r_devops
Share sensitive data securely (Yopass, PasswordPusher alternative)

Hey everyone,
I’ve been working on a small side project to solve a common pain point: sharing sensitive data securely.

Introducing SecureShare - Your Secret, Your Key, Our Link

🔐 Client-side encryption: Your data is encrypted in your browser using AES-256.
🧠 Zero-knowledge: The encryption key never touches the server.
🕓 Self-destruction: Choose between single-use or limited multiple views.
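
To make the "single-use or limited multiple views" idea concrete, here is a minimal sketch of a view-limited store. It is not SecureShare's actual implementation, which the post does not describe. `SecretStore`, `StoredSecret`, `put`, and `fetch` are hypothetical names, and the store is assumed to receive only ciphertext that was already encrypted in the browser.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StoredSecret:
    ciphertext: bytes  # opaque blob, assumed to be encrypted client-side
    views_left: int    # remaining number of allowed fetches


class SecretStore:
    """Hypothetical in-memory store that never handles an encryption key."""

    def __init__(self) -> None:
        self._items: Dict[str, StoredSecret] = {}

    def put(self, ciphertext: bytes, max_views: int = 1) -> str:
        """Store a blob and return a random ID to embed in the share link."""
        if max_views < 1:
            raise ValueError("max_views must be at least 1")
        secret_id = secrets.token_urlsafe(16)
        self._items[secret_id] = StoredSecret(ciphertext, max_views)
        return secret_id

    def fetch(self, secret_id: str) -> Optional[bytes]:
        """Return the blob and delete it once the view budget is used up."""
        item = self._items.get(secret_id)
        if item is None:
            return None
        item.views_left -= 1
        if item.views_left == 0:
            del self._items[secret_id]
        return item.ciphertext
```

With `max_views=1` the first `fetch` returns the blob and every later call returns `None`, which is the single-use behaviour.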

Get started:
https://secure.ardd.cloud


Feedback is appreciated :)

https://redd.it/1mjxoxa
@r_devops
Build a Smart Search App with LangChain and PostgreSQL on Google Cloud

The article walks through enabling the pgvector extension in Google Cloud SQL for PostgreSQL, setting up a vector store, and using PostgreSQL data with LangChain to build a Retrieval-Augmented Generation (RAG) application powered by the Gemini model via Vertex AI. The application performs semantic searches on a sample dataset, leveraging vector embeddings for context-aware responses. Finally, it is deployed as a scalable API on Cloud Run using FastAPI and LangServe.
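
For readers new to semantic search, the retrieval step boils down to ranking stored rows by how close their embedding is to the query embedding. The toy sketch below shows only that ranking, in memory, with hand-made 3-dimensional vectors. It does not use pgvector, LangChain, or Vertex AI, and `DOCS` and `top_k` are made-up names for illustration. In the setup the article describes, the ordering would happen inside PostgreSQL and the vectors would come from an embedding model.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("zero-length vector")
    return dot / norm


# Toy "table": text mapped to a hand-made embedding (illustrative values only).
DOCS: Dict[str, List[float]] = {
    "reset your password": [0.9, 0.1, 0.0],
    "rotate an API key": [0.7, 0.3, 0.1],
    "order a pizza": [0.0, 0.1, 0.9],
}


def top_k(query_vec: Sequence[float],
          docs: Dict[str, List[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Return the k documents most similar to the query, best first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Calling `top_k([1.0, 0.0, 0.0], DOCS)` ranks the two credential-related entries ahead of the unrelated one.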

If you are interested, check it out:

https://medium.com/@rasvihostings/using-cloud-sql-for-postgresql-with-pgvector-and-langchain-for-semantic-search-b88a06a4e186

https://redd.it/1mjxwt6
@r_devops
How much of your job involves administering tools and user management?

My company has really thrown the kitchen sink at SaaS products. Every week a new one seems to be coming up and I'm struggling to keep track of them. We have SSO enabled for the majority of them, but there are some exceptions, and we still need to do work in Google Workspace when new ones need to be integrated or some group memberships need to be changed, etc.

It often feels like I'm doing office IT rather than DevOps. We used to have a security/office IT guy who was in charge of all this, but he had to scale his role back because he was too expensive, and most of his duties were dumped onto us.

Are things like this a common occurrence? Do you consider managing tools and users as just part of the job as a platform/DevOps engineer?

https://redd.it/1mk0s1l
@r_devops
Installing packages not available in Linux repos

How do you install packages such as OpenSSH on several machines when new versions are not available in Linux repos (AlmaLinux, for example)? Compiling and installing on a few machines is not complicated, but across many machines repeating the same process becomes time-consuming. I have looked into creating an RPM package or using FPM. What options do you recommend?
I am using Chef; for previous versions of OpenSSH it was very easy for my recipe to install the package using the package manager.

https://redd.it/1mk0byh
@r_devops
Is CloudQuery usable on-premises?

I need a CMDB and a unified inventory for on-premises VMs and K8s pods.

Can CloudQuery be deployed on-premises to reach this goal?

https://redd.it/1mk4asv
@r_devops
Write & Test Scripts faster -- Validate AI-generated scripts' execution before copy-pasting them

I created an AI script generator where you can create scripts (currently supports Python / Bash scripts) and test their execution before copy-pasting them to your IDE / repo.

https://aiops.drdroid.io/script-generator

It’s free and no login is required. Would love to get feedback from folks here. :)

https://redd.it/1mk96sg
@r_devops
What are your biggest pain points and blockers?

With everybody using AI and no-code these days, developing has gotten so easy. Curious to know what type of problems y'all run into these days now that many traditional problems are solved. Anything with development, deployment, analytics, etc. My biggest blocker now is deployment.

https://redd.it/1mkc8h3
@r_devops
Following up on my 'Developer Toil' CLI: Your feedback helped shape v0.6.0, now with multi-service local envs.

Hey r/devops,

Thanks to everyone who weighed in on my post about tackling developer toil last week. Your real-world insights were invaluable.

Two main themes emerged from your feedback:

1. **Validation:** Yes, this is a real problem, and many of you have built similar, complex in-house solutions.
2. **The Challenge:** The hardest part isn't generating config; it's defining the "best practices" that go into it.

I took that to heart. While defining universal best practices is impossible, I realized I could build a flexible framework to help teams apply *their own*.

With that, I've just released **v0.6.0 of Open Workbench.** This update focuses on solving the local development piece of the puzzle for multi-service applications.

**Here’s how it addresses the workflow:**

* **Declarative Local Environments:** The new `workbench.yaml` acts as a single source of truth for defining all the services, components (e.g., gateways), and resources (DBs, caches) that make up your local development environment.
* **Automated Orchestration:** The `om compose` command reads the manifest and generates a full `docker-compose.yml` on the fly. This eliminates manual configuration and ensures consistency for every developer on the team.
* **Abstracted Dependencies:** The "Resource Blueprint" system allows developers to attach common infrastructure dependencies like PostgreSQL or Redis locally, with the system designed to target Terraform modules in the future.
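
To illustrate the manifest-to-Compose idea in general terms, here is a minimal sketch that turns a manifest dictionary into a Compose-shaped dictionary. It is not Open Workbench's code or schema: the manifest keys (`services`, `resources`, `path`, `port`, `uses`), the `RESOURCE_IMAGES` table, and `manifest_to_compose` are all assumptions made up for this example. Only the output keys (`image`, `build`, `ports`, `depends_on`) are standard Compose fields.

```python
from typing import Any, Dict

# Hypothetical default images for local resources; not taken from Open Workbench.
RESOURCE_IMAGES = {"postgres": "postgres:16", "redis": "redis:7"}


def manifest_to_compose(manifest: Dict[str, Any]) -> Dict[str, Any]:
    """Build a Compose-shaped dict from an (assumed) manifest layout."""
    compose: Dict[str, Any] = {"services": {}}

    # Each resource becomes a container running a stock image.
    for name, kind in manifest.get("resources", {}).items():
        if kind not in RESOURCE_IMAGES:
            raise ValueError(f"unknown resource type: {kind}")
        compose["services"][name] = {"image": RESOURCE_IMAGES[kind]}

    # Each service is built from its source directory and wired to its resources.
    for name, spec in manifest.get("services", {}).items():
        service: Dict[str, Any] = {"build": spec["path"]}
        if "port" in spec:
            service["ports"] = [f"{spec['port']}:{spec['port']}"]
        if spec.get("uses"):
            service["depends_on"] = list(spec["uses"])
        compose["services"][name] = service

    return compose
```

The resulting dictionary could then be serialized to YAML with whatever library the team already uses. Keeping the transformation a pure function makes it easy to unit-test the generated output without starting any containers.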

**I'm looking for your operational insights on these changes:**

* Does this `workbench.yaml` approach seem like a scalable way to manage local environments?
* What operational blind spots or potential "gotchas" do you see in this workflow?
* How can this model better pave the way for a smooth transition to cloud deployments (e.g., Terraform generation)?

**Call for Contributors:**

Your feedback confirmed that many companies are solving this same problem internally. My goal is to build a robust, open-source alternative we can all share and improve. I'm looking for contributors interested in:

* **Platform Engineering:** Helping to shape the vision and architecture.
* **Infrastructure as Code:** Building out the Terraform generation capabilities.
* **Extensibility:** Defining more resource blueprints for tools like Kafka, RabbitMQ, or specific databases.

Let's build the tool we've all had to build in-house, but do it once, in the open.

**GitHub Repo:** [`https://github.com/jashkahar/open-workbench-platform`](https://github.com/jashkahar/open-workbench-platform)

Thanks for helping guide this project!

https://redd.it/1mkk5ll
@r_devops
What stack is just reliable and requires minimal ops?

Hi everyone

I am curious. What's a stack that requires minimal devops, hand-holding and yak shaving?

Is it PHP?

Can I just set unattended upgrades and leave a site running for years?

https://redd.it/1mkoc60
@r_devops
Semantic Clinic — a reproducible map of AI failures (math-first, MIT, model-agnostic)

I’m publishing the Semantic Clinic as the canonical, MIT-licensed index for diagnosing and fixing AI failures with math, not folklore. It is a model-agnostic, pipeline-aware triage hub that you can apply to GPT, Claude, Gemini, local LLMs, single agents or multi-agent stacks. The single source of truth lives here:

Semantic Clinic (canonical link):
https://github.com/onestardao/WFGY/blob/main/ProblemMap/SemanticClinicIndex.md

OCR Legend Tesseract.js Author Starred my repo (WFGY on top now)
https://github.com/bijection?tab=stars

What it is.

Most failures are layered: OCR → parsing → chunking → embeddings → vector store → retriever → prompt assembly → LLM reasoning. One upstream distortion hides a downstream hallucination. The Clinic organizes these into reproducible failure families (prompting, retrieval/data, reasoning, memory/long-context, multi-agent/orchestration, infra/deploy, evaluation). Each family links to a precise fix page and acceptance criteria. No prompt tricks, no patchwork—every remedy is a structural intervention.

What we’ve shipped.

* A field-tested Problem Map and Clinic that cover the common failure patterns devs actually hit in production (RAG drift, traceability gaps, logic collapse, memory fractures, agent conflicts, bootstrap/deploy deadlocks, etc.).
* One-click sandboxes/Colabs (linked from the Clinic/Problem Map) that run the instruments without installation or private APIs.
* A thin “TXT OS” operating layer (referenced from the Clinic) so any model can apply the engine with zero install.
* Cold start to now: ~50+ days, ~360 from real users; growth driven by issue reports and fixes, not hype. We also maintain a running testimony of field saves: Hero Log → https://github.com/onestardao/WFGY/discussions/10

The mathematics (concise spec).
The Clinic is powered by three instruments and four repair operators. You don’t need to memorize the algebra to use them, but the math is public and consistent across pages.

ΔS (semantic stress). A scalar drift signal computed from embedding geometry; we use `ΔS = 1 − cos(I, G)` where I is the current view and G is the ground/anchor. Operational thresholds: `<0.40` stable, `0.40–0.60` transitional, `≥0.60` high risk. Probe question vs. retrieved context and context vs. expected anchor to localize where meaning tears.
λ_observe (layered observability). A finite-state tag per layer: convergent (→), divergent (←), recursive (<>), chaotic (×). If upstream λ is stable and downstream flips divergent, the fault is at the boundary between those layers.
E_resonance (coherence control). A rolling statistic on residual magnitude under correction; if E rises while ΔS stays high, perform a controlled reset and variance clamp.
Repair operators (WFGY modules).
BBMC — semantic residue minimization: reduce ‖B‖ with re-grounding and anchor re-specification.
BBPF — multi-path progression: explore/weight parallel semantic paths to avoid dead ends.
BBCR — collapse→rebirth control: detect failure at threshold and rebuild a safe bridge node.
BBAM — attention variance modulation: stabilize attention to prevent entropy melt in long or noisy contexts.
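
As a rough illustration of the first two instruments, the sketch below computes ΔS from two embedding vectors, maps it onto the thresholds quoted above, and scans per-layer λ tags for the first convergent-to-divergent flip. It is not code from the WFGY repo. The function names are invented here, and treating "stable" upstream as the `convergent` tag is an assumption.

```python
import math
from typing import Optional, Sequence, Tuple


def delta_s(view: Sequence[float], anchor: Sequence[float]) -> float:
    """Semantic stress: 1 - cos(I, G) for two embedding vectors."""
    dot = sum(x * y for x, y in zip(view, anchor))
    norm = math.sqrt(sum(x * x for x in view)) * math.sqrt(sum(y * y for y in anchor))
    if norm == 0:
        raise ValueError("zero-length vector")
    return 1.0 - dot / norm


def risk_band(ds: float) -> str:
    """Map a ΔS value onto the bands from the post."""
    if ds < 0.40:
        return "stable"
    if ds < 0.60:
        return "transitional"
    return "high risk"


def first_divergent_boundary(layers: Sequence[str],
                             states: Sequence[str]) -> Optional[Tuple[str, str]]:
    """Return the (upstream, downstream) pair where λ flips to divergent.

    Assumes one tag per layer, in pipeline order, using the tag names
    'convergent', 'divergent', 'recursive', 'chaotic'.
    """
    for i in range(1, len(layers)):
        if states[i - 1] == "convergent" and states[i] == "divergent":
            return layers[i - 1], layers[i]
    return None
```

Running `delta_s` on question vs. retrieved context and again on context vs. expected anchor gives two numbers to compare, which is one way to read the "probe" step described above.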

How you verify fixes.

Keep it falsifiable. Target ΔS ≤ 0.45 for direct QA after retrieval/prompt corrections; require λ to remain convergent across paraphrases; ensure E_resonance does not trend upward over longer windows; make retrieval traceable (cite lines/snippets). If those conditions do not hold, you don’t “tune” more prompts—you change the structure (index metric/normalization, schema lock, bridge nodes, agent boundaries, boot order).
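
The acceptance conditions above can be expressed as a single pass/fail check. This is only a sketch under stated assumptions: "does not trend upward" is interpreted here as a non-positive least-squares slope over the sampled window, and `fix_accepted` and `trending_up` are hypothetical names, not part of the Clinic.

```python
from typing import Sequence


def trending_up(samples: Sequence[float]) -> bool:
    """True if a least-squares line through the samples has positive slope."""
    n = len(samples)
    if n < 2:
        return False
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den > 0


def fix_accepted(delta_s_value: float,
                 lambda_states: Sequence[str],
                 e_resonance_samples: Sequence[float],
                 cited_snippets: Sequence[str],
                 max_delta_s: float = 0.45) -> bool:
    """Check the four acceptance conditions listed in the post."""
    return (
        delta_s_value <= max_delta_s                           # ΔS target
        and all(s == "convergent" for s in lambda_states)      # λ across paraphrases
        and not trending_up(e_resonance_samples)               # E_resonance window
        and len(cited_snippets) > 0                            # traceable retrieval
    )
```

A check like this fits naturally into a regression test: if it returns `False` after a prompt tweak, that is the signal to change the structure rather than tune further.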

Reproducibility.
Everything in the Clinic is designed to run with fixed seeds and minimal prerequisites. The Colab tools referenced from the Clinic make the probes and resets observable end-to-end. If you only copy one thing, copy the Clinic link above; it fans out to the families, fixes, and sandboxes.

Why this belongs in open source.
Open source doesn’t need another glossy “best practices” PDF. It needs an operational map you can run in public, verify on your stack, and argue about in issues. The Clinic is that map: math-first, license-clean, reproducible, and written to be forked, critiqued, and extended.

If this saves you a day in vector-store purgatory or a night chasing phantom jailbreaks, star the repo and drop a note in the Hero Log. We read every case because the failure patterns are the dataset.

Canonical link (again):

https://github.com/onestardao/WFGY/blob/main/ProblemMap/SemanticClinicIndex.md

MIT-licensed. Contributions, counter-examples, and adversarial tests are very welcome.

https://redd.it/1mktxxc
@r_devops
Employers of DevOps Engineers

I love being a DevOps Engineer. I like solving problems, learning about new stuff, understanding big systems, helping people, and getting paid pretty well.

You know what kinda sucks though? There's only certain kinds of employers that hire DevOps Engineers. Sometimes I'll think about who else I could work for, and then I'll be reminded that they don't have my role at that company.

For example, I live in a small-mid-sized town, far away from any big city. I work remotely. If I wanted to find a job locally I surely could. But it would most likely be as a systems engineer or something and it wouldn't pay nearly as well as what I'm making now.

Another example, I see some big company that has a reputation for being a good member of the community, doing charitable works, etc. Wouldn't it be neat to work for them? Oh, but they're a traditional retailer. They have IT for sure, but probably not programmers, let alone DevOps.

To work as a DevOps Engineer you usually have to work for somewhere fairly sizeable and either in a big city or remote for a place in a big city.

#firstworldproblems

https://redd.it/1mkwgr2
@r_devops