Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Reducing Infrastructure Friction: Web Hosting with Free Migration for Teams That Can’t Afford Downtime

Hey DevOps folks,

We know how critical stability, portability, and repeatability are when managing infrastructure, especially in production environments. That’s why at UltaHost, we’ve doubled down on something simple but often neglected: offering Web Hosting with Free, Fully-Managed Migration, without compromising uptime or system integrity.

Too many engineering teams delay migration due to perceived complexity, potential downtime, or lack of internal bandwidth. We've worked with DevOps engineers across multiple verticals who were stuck on bloated legacy providers or hosting setups they’d long outgrown, not because they wanted to stay, but because migrating without incident felt like a luxury.

Here’s what we offer:

White-glove migration of complete stacks, databases, configs, cron jobs, SSLs, and custom setups (Docker, reverse proxies, etc.)
Pre-deployment testing to avoid post-move regression issues
Optimized environments for PHP, Node.js, Python, and static JAMstack workloads
No migration fees, ever, because vendor lock-in through friction isn't our style

We’re not trying to replace your CI/CD pipeline or rewrite your infrastructure-as-code, but if you're hosting client-facing apps, dashboards, staging sites, or smaller services that still matter, we’re here to help you move them without pain.

If you’ve held back migrating because you’ve been burned before or just don’t want the operational hassle, let’s talk. We’ve built this service around actual use cases from engineers like you.

Would love to hear: What’s your biggest blocker when it comes to hosting transitions?

https://redd.it/1lcqrbv
@r_devops
Open Source Warp alternative for.. Everyone

Hi Good people of this subreddit.

We have recently created NTerm: Open Source Alternative to Warp.

Here's the gh: https://github.com/Neural-Nirvana/iota

Looking forward to your feedback and pulls. XOXO

https://redd.it/1lcsv84
@r_devops
Free CI/CD services

Hey there, I'm in the process of starting a dev agency, and I'm facing the age old problem that you can't get any clients without testimonials, and you can't get testimonials without clients :D


So, to fix this, I'm offering some free CI/CD services. Need a pipeline built to automatically deploy your webapp when you push to master? Do you have a pipeline that barely works and breaks every few days? I'm open to taking a look at it and fixing it for free as long as you're open to giving me an honest testimonial at the end of it.

About myself. I have 10+ years of experience as a dotnet developer. My frontend framework of choice is Angular but I've dabbled in React. I've built multiple stable pipelines in Gitlab and Github.



My startup has the following pipelines:

\- Automatic deployment of Prod/Test (automatically deploys a dotnet api to ubuntu server when code is pushed)

\- Automatic deployment of mobile app into Android store

\- Automatic deployment of mobile app into iOS store (yes this was a huge pain to setup)



https://redd.it/1lcwvcp
@r_devops
Docker volume

I am studying up on Docker and can't fully grasp the difference between Docker volumes and COPY/WORKDIR entries in the Dockerfile. Don't they do the same thing? The only difference I can think of is that Dockerfiles are created before containers, whereas volumes are inserted into existing containers. Is that right, and are there other differences?

https://redd.it/1lcy6j0
@r_devops
Is your 1st level ops outsourced? Where and what do they do?

Hello,
As the title says, is your 1st level operations outsourced? Where and what do they do?

I have heard of public cloud accounts with hundreds of nodes. They must be monitored 24/7 (on-call), alerts must be provisioned (whatever the monitoring tool), dashboards built, reporting done, new customers onboarded, maybe some IaC provisioning, .... How are these done in your team? I guess it also depends on the infrastructure size. Are these activities outsourced to other companies? If yes, what else do these 1st level ops teams do (besides the ones mentioned above)?

https://redd.it/1lcz9si
@r_devops
What fatal mistake do you see in my resume? I am getting 0 ( ZERO ) response to any job applications

Hi there,

https://imgur.com/a/JbkWDs2

My resume ^^

I've applied to 100+ jobs and have actually had only 1 callback. I am using a resume template that has worked very well for me before, and I've looked over my resume to see if there are any mistakes in it, and I'm not seeing any.

I think it's OK. Any reason why I'm not even getting calls for a junior position?


Please don't nitpick some random thing, I'm aware of the job market right now.

https://redd.it/1ld3vbj
@r_devops
Does anyone else get annoyed asking GPT for command syntax all the time?

Like when you need to remember if it's terraform plan -out=file or --out file and you have to open another tab and ask GPT?
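
For reference, the form documented for Terraform is `-out=FILE`. A minimal sketch of pinning that syntax in a small Python helper so it does not need to be looked up each time; the helper name `terraform_plan_command` and the file names are hypothetical, and the helper only builds the argument list without running Terraform:

```python
import shlex
from typing import List, Optional


def terraform_plan_command(plan_file: str, var_file: Optional[str] = None) -> List[str]:
    """Build the argv for `terraform plan` that saves the plan to plan_file."""
    argv = ["terraform", "plan", f"-out={plan_file}"]
    if var_file is not None:
        # -var-file is the documented flag for loading a variables file.
        argv.append(f"-var-file={var_file}")
    return argv


# Hypothetical file names, shown as a copy-pasteable shell line.
print(shlex.join(terraform_plan_command("prod.tfplan", "prod.tfvars")))
```

The list can be passed straight to `subprocess.run(...)` when Terraform is installed; returning a list instead of a string avoids shell quoting problems.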

Been using this tool called ops0-cli where you just say "plan terraform for production" and it gives you the actual command. Pretty neat for Ansible, AWS, and other tools too.

Do you guys use GPT for command lookups or just suffer through the docs?

https://redd.it/1ld74y2
@r_devops
Career progression

Hi everyone, a couple months ago I was lucky enough to land a devops/infrastructure job at a F500 company. While I love the job, in this day and age you can never be too careful, and I wanna make sure that I am setting myself up correctly in case something were to happen.

Our current stack is Microsoft ADO for CI/CD, git and so on, AWS for our DBs and a bunch of other stuff, and some misc stuff here and there.

I have two major questions for you

1. Is it worth it to get certs? I would be looking at the CKA/CKAD for Kubernetes, or AWS certifications.

2. Is it worth it to keep my LinkedIn/resume up-to-date on things that I do at the company, or should I do a mass update when I am ready to start looking for a new job?


Tyia

https://redd.it/1ld8wql
@r_devops
DevOps team in the AI era

It feels like in the near future DevOps teams will be busy building, supporting, and maintaining remote MCP servers across different teams. They kinda become AI tool enablers.

I can imagine the requests will be “team, we are starting a new project, so we need support for a new tool in the MCP server” or “please fix a bug in this MCP because our AI client recently got a wrong response”. CI/CD for MCP 😅 hallucination monitoring dashboards

https://redd.it/1ldebv1
@r_devops
We reduced our Kubernetes costs by 40% using automation — here’s what helped most

In our Kubernetes clusters, we've been focusing a lot on cost optimisation. We wanted to share a few minor yet significant adjustments that we found to be effective (we'd love to know what else is working as well):
* Developer namespaces were automatically scaled down after business hours.
* Pod requests and limits were right-sized according to actual usage (no more 2Gi on idle jobs 😅)
* Leftover debug pods, outdated replicas, and unused PVCs were cleaned up.
* To cut down on noise, usage-based triggers were used in place of always-on alerts.
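
The after-hours reduction of developer namespaces can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the setup from the post: the business-hours window, the namespace name, and the single daytime replica count are all hypothetical, and the function only builds a `kubectl scale` command without running it.

```python
from datetime import datetime, time
from typing import List

# Assumed business hours (local time, Monday to Friday); adjust to your team.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)


def is_business_hours(now: datetime) -> bool:
    """Return True on weekdays between BUSINESS_START and BUSINESS_END."""
    if now.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    return BUSINESS_START <= now.time() < BUSINESS_END


def scale_command(namespace: str, now: datetime, daytime_replicas: int = 1) -> List[str]:
    """Build a kubectl command scaling every Deployment in the namespace."""
    replicas = daytime_replicas if is_business_hours(now) else 0
    return ["kubectl", "scale", "deployment", "--all",
            f"--replicas={replicas}", "-n", namespace]


# Hypothetical namespace; a scheduler (e.g. a CronJob) would run this periodically.
print(scale_command("dev-team-a", datetime.now()))
```

One limitation of this sketch: scaling everything back to a single `daytime_replicas` value discards each Deployment's original replica count, so a real workflow would need to record and restore those counts.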

In addition to saving a tonne of engineering hours, Alertmend (https://alertmend.io/) helped us reduce idle resources by tying Prometheus metrics to cost insights and automatically running cleanup/scale workflows.
I'm curious about what other people are doing to save money over time, particularly if you're automating using Prometheus, scripts, or third-party tools.

https://redd.it/1ldfnsw
@r_devops
SREs – got 2 mins?

Working on a blog post about how (or if) AI is actually useful in incident management and observability. Trying to include thoughts from folks.

If you're an SRE or work on infra/on-call stuff, would love to hear from you. Even if your team hasn't touched AI tools yet, that’s super relevant.

**Form’s here (3-5 mins tops):**
👉 [https://docs.google.com/forms/d/e/1FAIpQLSc5Sxwv8ebPJD943xNKTZPKSkb0ECozEqrZzmjRy7K2AvRH4A/viewform](https://docs.google.com/forms/d/e/1FAIpQLSc5Sxwv8ebPJD943xNKTZPKSkb0ECozEqrZzmjRy7K2AvRH4A/viewform)

# A few things:

* No spam, no sales, just writing a blog.
* You can stay anonymous; there’s also an option to be quoted if you're cool with that.
* Not asking for any infra details. Just your takes.

Will share the post here once it's live if folks are curious. Appreciate any responses 🙏

https://redd.it/1ldhrno
@r_devops
Who's using Backstage? What are your use cases?

Hey everyone,

I’m curious to hear if anyone is actively using [Backstage](https://backstage.io/) in production. I'm evaluating it for internal developer portals and wanted to get a better sense of real-world use cases.

* What are you using Backstage for?
* Which plugins do you rely on most?
* Any gotchas, lessons learned, or things you’d do differently?

Would really appreciate hearing about your setups — from solo dev projects to large orgs!

Thanks in advance 🙌

https://redd.it/1ldjjcu
@r_devops
Automation VS SOX Compliance - any insights?

I have been automating a lot of financial reporting for my employer using a variety of tools like Power Platform, ETL/ELT (Informatica, Snowflake, Azure Analysis Services, i.e. AAS), etc.

Our accounting suite is SAP ECC (will likely migrate to S/4HANA by 2027).

And then our auditors yelped "SOX ITGCs/ITACs!"

(Sarbanes-Oxley Act Information Technology General/Application Controls, basically publicly traded companies need to disclose every single step in the data flow to auditors to guarantee data integrity between source and target.)

And they made it abundantly clear that automation cannot be done in case there is any sort of data flow that can affect data integrity, as it would have to be re-reviewed step by step each audit.

They (EY) make it seem like a black and white thing and frankly in a patronising manner. For instance, quarterly exports from SAP supported by printscreens from the moment of capture.

So what to do?

I am mainly looking for general insights, so do share. Sources on ITAC controls would be even better (ITGCs are straightforward, ISO 27001), but my issue in particular focuses on two parts:

1. SOX Compliance with middleware

We use both Informatica and Snowflake. Both offer SOX Compliance controls. None are set up yet.

But our issue is that we were previously working on Informatica - SQL Datawarehouse (AAS).

Now we are moving to Snowflake, but we are still using Informatica to move data from SAP to Snowflake.

I feel that is a step too many as it would require the same controls in both Informatica and Snowflake.

I also understand this is the only way to have continuous monitoring in place (as opposed to snapshots), which is where SOX 404 is heading, from what I understand.

2. SOX Compliance without middleware

Limiting the data lineage from source (SAP) to target (audit report) is an obvious answer.

But now I want to play Devil's Advocate:

Do I have to do these repeatable steps manually?

Or:

Can't RPA do it?

Hypothetically (seriously, I have NOT done this... yet), SUPPOSE I were to implement automation through a mix of Python and maybe some Excel; then on the surface it would still look like I manually exported a quarterly report.

That way it is just a few repeatable steps automated through a form of RPA (Robotic Process Automation) under my username and without touching data integrity (no change to the source data).

And it could save the company hours. Seriously, we have one guy losing half a day each time he needs to do a datadump of SAP's ACDOCA table.

Auditors would not see the difference.

Okay I could also have the Python code audited, but is that really necessary when a process is automated on a user level?

SOX is supposed to be about controls, not manual tedium. That's not what they (EY) are having us believe however.

https://redd.it/1ldklhc
@r_devops
Critical Python Package Vulnerability Now Actively Exploited – CVE-2025-3248

There's a critical unauthenticated RCE vulnerability (CVSS 9.8) in Langflow (<1.3.0), a widely-used Python framework for building AI apps (70k+ GitHub stars, 21k+ PyPI downloads/week).

Link to blog post:
https://cloudsmith.com/blog/cve-2025-3248-serious-vulnerability-found-in-popular-python-ai-package

Attackers are actively exploiting this flaw to install the Flodrix DDoS botnet via the /api/v1/validate/code endpoint, which (incredibly) uses ast.parse() + compile() + exec() without auth.
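
To see why that combination is dangerous, here is a simplified illustration of the pattern; it is not Langflow's actual code, and `validate_by_exec` and `record` are hypothetical names. Executing a function definition evaluates its default arguments (and decorators) immediately, so submitted code runs even though the function itself is never called:

```python
import ast


def validate_by_exec(source: str, namespace: dict) -> None:
    """Mimic the unsafe pattern: parse the input, then exec each function definition."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            module = ast.Module(body=[node], type_ignores=[])
            # Executing a `def` statement evaluates defaults and decorators right away.
            exec(compile(module, "<submitted>", "exec"), namespace)


calls = []
# The default argument is an arbitrary expression chosen by the submitter.
submitted = "def harmless(x=record('default argument evaluated')):\n    return x\n"
validate_by_exec(submitted, {"record": calls.append})
print(calls)  # non-empty: the expression ran although harmless() was never called
```

Here the side effect is a harmless list append; on an unauthenticated endpoint the same slot can hold any expression the attacker likes.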

If you're pulling anything from PyPI or running Langflow-based AI services exposed to the internet, you should check your versions now.

https://redd.it/1ldlfhg
@r_devops
Share your idea for my setup.

Hey r/devops!

I have my own freelancing company, and I would like to offer hosting to my clients. After studying options and considering my budget, I settled on Oracle Cloud and found that I can even have a free K8s cluster with 4 nodes. If you were in my position and had to set this up, while also serving some applications from this cluster, running CI/CD for them, and monitoring their status, how would you tackle this?

https://redd.it/1ldm8dz
@r_devops
Flutter Developer Thinking of Switching to Cloud Engineering – Is It Worth It? Where to Start?

Hey everyone,

I’m currently working as a Flutter developer and have been in mobile app development for a while now. Lately, I’ve been really curious about Cloud Engineering — the idea of building scalable infrastructure, working with DevOps tools, and understanding cloud platforms like AWS, Azure, or GCP sounds exciting.

But honestly, I have no idea where to start.

Is it worth making the switch from Flutter to Cloud Engineering? How steep is the learning curve? And if I do want to start exploring, are there any beginner-friendly tutorials or roadmaps you’d recommend?

I’m not planning to completely abandon mobile development just yet, but I’d love to eventually land a role in cloud or DevOps. Any advice, insights, or resources would be super appreciated.

Thanks in advance!

https://redd.it/1ldoc40
@r_devops
How to commit a bugfix for PROD in main when few commits should not get transported?

Hello everyone,

Let's say there is a main branch that has been deployed to PROD. Additional commits are then pushed to main via pull requests, so main is now ahead of production by 2 commits. A bug is then found in PROD that requires an urgent fix. The fix is ready but not yet merged to main. The condition is that those 2 commits must not reach PROD; only the fix, which came after them, should be deployed. How can this work?


The stack looks like this (read from bottom to top):

\---BugFix (I want only this to get deployed and not 2 commits from wave1)

\---wave1 feature code

\---wave1 enhancement

\---main (that's where wave0 exists and got deployed to PROD)

========================================

One possible solution is to comment out the code from the 2 commits in a new commit that also contains the fix, and then deploy.

The other one is to create branches specific to releases such as release/wave0 and continue with main. At the end, create release/wave1 from main and start working on wave3 in main.
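
The release-branch option could look roughly like this. It is a sketch under assumptions rather than a prescribed workflow: `deployed_ref` (the commit or tag currently in PROD), `fix_commit`, and the remote name `origin` are hypothetical placeholders, and the helper only lists the git commands without running them.

```python
from typing import List


def hotfix_commands(deployed_ref: str, fix_commit: str,
                    release_branch: str = "release/wave0") -> List[List[str]]:
    """List git commands that put only the fix on top of what PROD already runs."""
    return [
        # Start the release branch at the deployed commit, so the wave1 commits are excluded.
        ["git", "checkout", "-b", release_branch, deployed_ref],
        # Apply only the bugfix commit onto the release branch.
        ["git", "cherry-pick", fix_commit],
        # Publish the branch so the pipeline can deploy from it (remote name assumed).
        ["git", "push", "-u", "origin", release_branch],
    ]


# Hypothetical tag and commit id.
for cmd in hotfix_commands("wave0-deployed", "abc1234"):
    print(" ".join(cmd))
```

If the fix touches code changed by the two wave1 commits, the cherry-pick will stop on conflicts that have to be resolved by hand. The fix still needs to be merged into main afterwards so the next release includes it.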

Are there any alternatives?
Thanks

https://redd.it/1ldp3bz
@r_devops
Infisical vs others

Thoughts on infisical.com?

Anyone using it in production?

Seems to me that it compares with AWS Parameter Store and HashiCorp Vault



https://redd.it/1ldqikc
@r_devops
severe grafana CVE: patch now or forever hold your peace (CVE-2025-4123 Grafana)

there's a pretty significant cross-site scripting vulnerability in many versions of grafana...

'''
A cross-site scripting (XSS) vulnerability exists in Grafana caused by combining a client path traversal and open redirect. This allows attackers to redirect users to a website that hosts a frontend plugin that will execute arbitrary JavaScript. This vulnerability does not require editor permissions and if anonymous access is enabled, the XSS will work. If the Grafana Image Renderer plugin is installed, it is possible to exploit the open redirect to achieve a full read SSRF. The default Content-Security-Policy (CSP) in Grafana will block the XSS through the connect-src directive. This vulnerability is fixed in v10.4.18+security-01, v11.2.9+security-01, v11.3.6+security-01, v11.4.4+security-01, v11.5.4+security-01, v11.6.1+security-01, and v12.0.0+security-01
'''

https://nvd.nist.gov/vuln/detail/CVE-2025-4123
https://grafana.com/security/security-advisories/cve-2025-4123/
https://www.bleepingcomputer.com/news/security/over-46-000-grafana-instances-exposed-to-account-takeover-bug/

https://redd.it/1ldsg2x
@r_devops
I addressed the Fatal Mistake in my resume I got roasted for yesterday. Ty for 100+ responses

Hi everyone.

https://i.imgur.com/seBld3F.jpeg < - My new streamlined resume

---

Thank you for the 100+ constructive comments I got on my post yesterday.

Here -> What fatal mistake do you see in my resume? I am getting 0 ( ZERO ) response to any job applications


I think I've addressed most of it. I agree with the comments about it being an essay. We live in a weird time where I expect the AI machine to process my resume well before a human gets to it, so I was trying to load as much info as possible into a 2-page resume. DevOps is a field where we are doing new things basically every week, and I feel like 50% of the stuff I've worked with isn't even on the resume lol.

But yes, you guys are correct. Hope my new resume is better.

Is it a bit too light? Looking forward to feedback, thank you

https://redd.it/1ldu6tp
@r_devops
DB scripts! How do you handle that?

Hi guys good day. Hope you're doing well.

So I have worked on multiple projects and it seems that DB scripts are the one thing that requires a lot of attention and human intervention. Would love to know:

1. How do you handle DB scripts using pipelines?
2. What is the most challenging part of the implementation?
3. How do you take care of rollback if required?
4. What's the trickiest thing that you have ever done while designing DB script pipelines?

https://redd.it/1lduujd
@r_devops