Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Anyone looking for a part-time devops/consultant with previous startup experience?

Hi,

If anyone based in the US or UK is looking for a part-time DevOps engineer, I'm open to a discussion.

I was part of several startups: one skyrocketed and got acquired, and another raised some big, fancy investments. We can talk more if anyone is interested in having me on their team.

I’m open to both engineering and consulting.

Best regards!

https://redd.it/1k7km7v
@r_devops
Looking for DevOps feedback

Hey all, I'm a developer at Korbit AI, and I was hoping to get some feedback from QA and DevOps engineers on how we can make our reviews even more useful for this kind of work.


Currently we focus on these 8 categories: Functionality, Security, Performance, Error Handling, Readability, Logging, Design and Documentation.


My question is: as a DevOps or QA engineer, what specific things could our reviews focus on to save you time in this area? We're planning to release a new feature called Korbit Policies, which lets you tell Korbit specific things to flag (for example, refactoring from one class to another and enforcing usage of the new class).


Let me know, and thank you in advance.

https://redd.it/1k7r4n1
@r_devops
API Sprawl - issue for you or na?

Do y'all's bosses see API sprawl as a real problem? Or is it just your problem? We need more discoverability for our APIs for sure; too many people are doing too many things off in the corner. But I also need to make sure my boss sees it as a legit issue so that I can do something about it.

https://redd.it/1k7w38d
@r_devops
Hiring: Cold Email Deliverability & DevOps Specialist (High-Scale Infrastructure)

We're looking for a DevOps/Deliverability expert who lives and breathes cold email infrastructure.
Not just someone who's familiar — but someone who's built high-deliverability SMTP servers, optimized inbox placement at scale, and knows how to get emails delivered no matter what.

This is a right-hand role — not just a task-based position.
You'll be working directly with the founder to build, scale, and optimize our email infrastructure.

Who we're looking for:
Deep experience managing cold email SMTP infrastructure (PowerMTA, Postal, Mailcow, etc.)
Proven ability to hit the inbox at scale — across 100+ VMs and IPs
Strong DevOps/sysadmin background — building scalable, redundant systems
Hands-on experience managing IP reputation, rDNS, DKIM, SPF, DMARC, smart routing, etc.
Creative problem solver — can build and adapt systems to changing deliverability challenges
BONUS: You've built email warm-up and/or email verification tools (especially catch-all detection)

What you'll be responsible for:

Architecting and managing a large-scale cold email infrastructure
Developing an internal warm-up tool (we'll provide resources to accelerate it)
Building an internal email verification tool (with catch-all logic, bounce detection, etc.)
Managing 100+ VMs/IPs and continuously improving deliverability rates
Innovating ways to stay ahead of spam filters, blacklists, and reputation risks
Making sure we scale without deliverability or infrastructure breakdowns

What we are:
We're an email infrastructure software company built specifically for cold email at scale.
We care deeply about quality infrastructure — and we need someone who gets it, fast, and can build it right.

What we’re NOT looking for:
Someone who's only run basic transactional email servers
Someone who needs step-by-step instructions
Someone without experience running cold email systems at serious scale

If you're a builder who loves solving tough deliverability problems and wants to create something massive — let's talk.

https://redd.it/1k7wok3
@r_devops
What are your pain points in debugging kubernetes deployments?

The biggest pain point I've seen is the frustrating scenario where "everything looks healthy" but the system isn't working (for example, services not talking to each other properly or data not flowing correctly).

Would love to hear your debugging pain points and how we could make this more useful. Is this something you'd find valuable?

https://redd.it/1k7y9zi
@r_devops
Learn how to debug SQS consumers in Kubernetes without rebuilds

Debugging SQS consumers in Kubernetes isn't for the faint of heart. This guide shows how you can debug them locally using mirrord's queue-splitting model, without disrupting production consumers.

Hope it will help you save some precious time =)

https://metalbear.co/guides/how-to-debug-sqs-consumers/?utm_source=organic_social&utm_medium=reddit_organic&utm_campaign=reddit_post

https://redd.it/1k7zrx5
@r_devops
Created DevOps Project... real-world, hands-on, esp. useful for people who look for a job.

I created a hands-on DevOps project to help people who are looking for a job or upskilling fill the gaps in their practical knowledge.

I recently did a bunch of interviews and I think it will help a lot. Even if you don't have time to do it, just go through the content; it is free. I know some things are not covered there, but it is still a great foundation for about 70% of daily tasks.

It is close to what is used at most of the companies I have worked at (but trimmed down to save resources). It is fully hands-on: you build an app, containerise it, deploy it, create CI/CD, template with Helm, use Kubernetes, use Terraform and AWS, create monitoring, and the list goes on.

Here is the video where I talk about it: https://youtu.be/vtCW5IgJ9-A?si=8nfBu4vgN4uhdX-2

Here is the project itself: https://prepare.sh/project/devops-foundational-project

https://redd.it/1k80zlj
@r_devops
Tool for generating and hosting docs

What tool do you use to publish documentation, the way Docker, Kubernetes, etc. do?

Right now I have a CI/CD job that copies readme.md into one central docs project, with versions by tag.
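
A minimal sketch of the copy step described above, assuming a layout of `<docs_root>/<project>/<tag>/readme.md`. The function name, its arguments and that layout are hypothetical, chosen only for illustration; the post does not say how the actual pipeline is organised.

```python
import shutil
from pathlib import Path


def publish_readme(project_dir: Path, docs_root: Path, project: str, tag: str) -> Path:
    """Copy a project's readme.md into docs_root/<project>/<tag>/readme.md.

    The directory layout is an assumption made for this sketch.
    """
    source = project_dir / "readme.md"
    target_dir = docs_root / project / tag
    # Create the versioned folder if this is the first publish for the tag.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "readme.md"
    shutil.copyfile(source, target)
    return target
```

A CI job could call this once per tagged build, passing the tag name from the pipeline's environment, so each release keeps its own copy of the docs.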

https://redd.it/1k81lmy
@r_devops
Blind posts are crazy

Guys, have you checked the recent Blind posts about job offers? I just went through some of the very recent ones and felt like we live in different dimensions. Here I see a lot of people struggling for a long time just to land an interview, some for 2 years despite being experienced, while those guys are on the fence between gargantuan TC offers. One guy posted about having 3 offers (Databricks, Meta, Google) on the table, with tremendous TC, and was looking for some second opinions, etc. It’s really crazy.
Of course, I’m happy for every single person who gets an offer, but at the same time, I feel sad for others who are struggling.
What is this gap about? There is no balance. Why do we have such a huge abyss between the communities in the same geolocation? What do you think about it?

https://redd.it/1k84mq9
@r_devops
From mobile dev to devops

Hello, I’m new here. Lately, I’ve been browsing Reddit to understand how hard the transition from software developer to DevOps is. I noticed that most people making the switch come from a backend background. I’m a native mobile developer with 2 years of experience, and I’m wondering—how difficult would it be for someone like me to move into DevOps? Would my experience be considered valuable, especially if I build DevOps projects on the side? Would HR see me as a good fit? I’d love to hear your thoughts.


https://redd.it/1k87xj4
@r_devops
Kubetail: Real-time Kubernetes logging dashboard, now with Search

Kubetail is an open-source, general-purpose logging dashboard for Kubernetes, optimized for tailing logs across multi-container workloads in real-time. The primary entry point for Kubetail is the kubetail CLI tool, which can launch a local web dashboard on your desktop or stream raw logs directly to your terminal.

I started working on this project two years ago after getting frustrated with the Kubernetes Dashboard's log viewer and I'm excited to share that we’ve added some new features, including search!

# What's new

# 🔍 Search

Now you can grep/search your container logs in real-time, right from the Kubetail web dashboard. Under the hood, search uses a super fast Rust executable that scans your raw log files on-disk in your cluster, then sends only relevant results back to your browser. You no longer have to download all your log records just to grep them locally. The feature is live in our latest release candidate - try it out now here: https://www.kubetail.com/demo.

# 🖥️/🌐 Run on Desktop or in Cluster

Kubetail can run locally or inside your cluster. For local use, we built a simple CLI that starts the dashboard on your desktop (quick-start):

    # Install
    $ brew install kubetail

    # Run
    $ kubetail serve

It uses your local kubeconfig file to connect to your clusters and you can easily switch between them. You can also install Kubetail inside a cluster itself and access it from a web browser using kubectl proxy or kubectl port-forward (quick-start).

# 💻 Tail logs in the terminal

Sometimes you can't beat tailing logs in the terminal, so we added a powerful logs sub-command to the kubetail CLI tool that you can use to follow container logs or even fetch all the records in a given time window to analyze them in more detail locally (quick-start):

    # Follow example
    $ kubetail logs deployments/web --follow

    # Fetch example
    $ kubetail logs deployments/web \
        --since 2025-04-20T00:00:00Z \
        --until 2025-04-21T00:00:00Z \
        --all > logs.txt

# 📐 Clean UI

We’ve worked hard to make Kubetail feel fast and intuitive. One feature that our users love is that multi-container logs are merged into a single timeline, color-coded by container—so you can track what’s happening across pods at a glance. Using simple controls you can quickly go to the beginning of the merged timeline, tail the ending, or scroll through the event timeline. Our goal is to make the most user-friendly Kubernetes logging tool so if you’re passionate about design and you love logs, we’d love your help! (Thanks victorchrollo14 and HarshDeep61034 for your recent contributions!)
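
To illustrate the merged-timeline idea in general terms, here is a small sketch of a k-way merge over per-container log streams. It is not Kubetail's implementation; the `LogLine` type and `merge_timelines` function are hypothetical names. It assumes each stream is already sorted by time and that all timestamps share one fixed-width UTC format, so comparing them as strings matches chronological order.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class LogLine(NamedTuple):
    timestamp: str  # fixed-width UTC timestamp, e.g. "2025-04-20T00:00:01Z"
    container: str
    message: str


def merge_timelines(*streams: Iterable[LogLine]) -> Iterator[LogLine]:
    """Lazily merge per-container streams (each sorted by time) into one timeline."""
    # heapq.merge only looks at the head of each stream, so it also works
    # for long or unbounded iterators without loading everything in memory.
    return heapq.merge(*streams, key=lambda line: line.timestamp)


web = [
    LogLine("2025-04-20T00:00:01Z", "web", "GET /orders"),
    LogLine("2025-04-20T00:00:04Z", "web", "200 OK"),
]
db = [LogLine("2025-04-20T00:00:02Z", "db", "SELECT * FROM orders")]

for line in merge_timelines(web, db):
    print(line.timestamp, f"[{line.container}]", line.message)
```

A UI could then color each printed row by its `container` field to get the at-a-glance view described above.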

# 🎯 Easy filtering

When something’s on fire in your cluster, you need to quickly isolate the issue—whether it’s tied to a specific region, node, or pod – so we added quick filters to help you narrow the log sources you're looking at. You can also filter by time to quickly narrow your debugging window to around the time an incident occurred. Soon we're planning on adding more filtering options like labels too so you can create your own groups of pods to filter on.

# ⏱️ Real-time

One of my original frustrations with the Kubernetes Dashboard is that it refreshes container logs every few seconds instead of just streaming data as it comes in, so we built Kubetail to be able to handle data in real-time. In the Kubetail web dashboard you can see messages as soon as they get written to your cluster. Kubetail also subscribes to messages from new containers automatically as soon as the container is started so you can track requests seamlessly as they jump between ephemeral containers even across workloads. That means I don’t need to keep multiple Kubernetes Dashboard logging windows open any more!

# 🌙 Dark Mode

We didn't want users to get blinded when they opened up Kubetail, so we added a dark mode theme that picks up on your system preferences automatically. Hopefully streaming log lines will be easier on the eyes now.

---

If Kubetail has been useful to you, take a moment to add a star on GitHub and leave a comment. Your feedback will help others discover it and help us improve the project!

---

Join our community on Discord for real-time support or just to say hi!

https://redd.it/1k8arks
@r_devops
How difficult is the process for publishing an app to the Android and Apple Store?

Hello All,

I've been working on a mobile game and am going to release it to the app store at some point.

I had a couple of questions about app publishing.

1. How much time does app publishing process take? Is it a lot of work? Seeing compliance lists such as https://developer.android.com/docs/quality-guidelines/core-app-quality#sc intimidates me.

Are they actually enforcing all these rules?

2. I see there are tools available like Runway, Tramline, FastLane that claim to make the deployment and publishing process easy.

Have any of you used these tools?

Do they help reduce time to publish and update or would I be better off writing scripts/github actions for this?

3. Do you know any tools that automate all this compliance stuff away?

Thanks a lot :)



https://redd.it/1k8bft1
@r_devops
The Easiest Way to Manage Multi-Container Apps (Perfect for Small Projects!)

Hey everyone! This is #4 in my 60-Day ReadList Series: Simplifying Docker & Kubernetes.

This time, I break down Docker Compose: how it simplifies managing multi-container applications, why it’s so useful, how to structure a docker-compose.yml, and some bonus tips like scaling, using environment variables, and networks.

Covered topics include:
1. Why Docker Compose is a must-have tool
2. Breakdown of docker-compose.yml structure
3. How volumes help persist container data
4. Scaling services with a single command
5. Managing environment-specific configs
6. Networking between containers

Perfect for someone who’s starting out with Docker and building small projects. Docker Compose handles things surprisingly well without the heavy lifting!

If you’ve been wanting to get more comfortable with Docker and want a beginner-friendly guide that’s actually practical, check it out: Docker Compose Made Simple: Deploying Multi-Container Applications in Minutes

Thanks for reading and supporting the series!

https://redd.it/1k8bdzu
@r_devops
Is it normal to feel overwhelmed at a new DevOps Job?

Hello, I just joined a multinational company. Their infra is already set up and fully mature. I feel overwhelmed by the stuff I have to learn and the teams I have to route requests through, not to mention transitioning from Unix terminals (I used to live in the terminal) to Windows (with restrictions).




Some info about me: I previously worked at a startup and at a mid-sized company (which also grew out of a startup). It was easy to learn and build the infra at both. And right now, I feel so weak.




Lemme know if you guys have any advice, I would highly appreciate it.



https://redd.it/1k8di8q
@r_devops
Seeking ideas for uni project for scalable and distributed systems course

Hi everyone,
I'm looking for some advice, as the title suggests.
I recently completed a course where we are now required to create a project, but my group and I have no idea what to work on.
I'm not sure if this is the right subreddit, but I'm hoping you all might have some suggestions!

Here are some of the tools and technologies we covered during the course: Spark, Apache Hadoop, Raft, Paxos, GraphX, TLAV, Spark SQL, Kafka

We're not limited to only these tools — we can use anything we want.
If you have any project ideas or suggestions, we would be extremely grateful! Any input is welcomed!


Thanks so much in advance!

https://redd.it/1k8dfho
@r_devops
Is it hard to become a DevOps engineer? I have started my training. Am I heading down the wrong path? My background is electrical engineering. I need a lot of motivation from you guys. Please help and give me as many suggestions as possible.

Thanks

https://redd.it/1k8mrch
@r_devops
Question about excessive liability clause in B2B contract

Hey everyone,

I'm about to start my first freelance contract as a DevOps engineer. While reviewing the contract I noticed one clause that set off some alarm bells. I was wondering if this is common, or rather a red flag that should make me think again.
It goes like this:

The Provider (me) agrees to indemnify and hold the Client harmless in full from and against all Losses arising from or in connection with:
...
...
5.3. any failure to provide the Services to the satisfaction of the Client and/or End User.

There are, of course, quite a few other more specific clauses in addition to 5.3 that refer to omission and infringement of whatever, which I can accept since they are specific, but a clause referring to unlimited liability related to 'satisfaction' seems to me a bit too much.

Many thanks for the advice.

PS: I do already have Professional Liability Insurance

https://redd.it/1k8oswp
@r_devops
How to find industry best practices for rightsizing cloud resources based on usage metrics?

Hi everyone,

I'm currently trying to better understand how to rightsize cloud resources across different types of services — not just compute instances (VMs, containers), but also databases, caches, storage services, networking components, API gateways, and other PaaS offerings.

The main challenge I'm facing is:

How to decide, based on real usage metrics (CPU, memory, network throughput, requests, connections, etc.), when it makes sense to recommend downsizing or optimization?
In other words: What thresholds or best practices should be applied across different resource types?

For example:

For a PostgreSQL database: if average CPU usage stays consistently below X%, and connection counts remain below Y, downsizing might be appropriate.
For a Redis cache: if memory and CPU utilization are low over time, a smaller SKU or plan could be justified.
For load balancers or API gateways: if request volume and network throughput are much lower than provisioned capacity, resizing or tier adjustment could be considered.
For storage services: if IO or access rates are minimal, moving to a lower-cost tier could make sense.
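
The kind of rule sketched in these examples could be expressed roughly as follows. This is only an illustration, not a standard: the function names, metric names and limits are hypothetical. It uses the 95th percentile rather than the average as one possible reading of "stays consistently below", and it refuses to recommend anything when a metric has no data.

```python
import math


def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return ordered[index]


def should_downsize(metrics: dict[str, list[float]], limits: dict[str, float]) -> bool:
    """Return True only if every limited metric has data and its p95 is below the limit."""
    for name, limit in limits.items():
        samples = metrics.get(name)
        if not samples:
            return False  # missing data: never recommend a change blindly
        if p95(samples) >= limit:
            return False  # usage touches the limit too often to be "consistently low"
    return True


# Placeholder limits for illustration only; they stand in for the X and Y above.
limits = {"cpu_percent": 40.0, "connections": 20.0}
quiet_db = {"cpu_percent": [10, 12, 15, 11], "connections": [5, 6, 4, 7]}
print(should_downsize(quiet_db, limits))
```

Requiring every metric to pass keeps the rule conservative: a database with idle CPU but a high connection count would not be flagged.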

My Questions:

1. Are there any reliable standards, best practice frameworks, or internal methodologies that define rightsizing thresholds for cloud services?
2. How do you determine safe and reasonable criteria for optimization across different service types?
3. Are there common "rules of thumb" that you or your organization use (e.g., "CPU usage consistently under 60% over 30 days → recommend downgrade")?
4. (Bonus) If you have cloud-provider-specific insights (AWS, Azure, GCP), I'd love to hear those too!

I've seen tools like Azure Advisor, AWS Compute Optimizer, and GCP Recommender, but they seem to mostly focus on compute resources (VMs, autoscaling groups) rather than PaaS services like managed databases, caches, networking, etc.

Any experiences, whitepapers, blog posts, internal heuristics, or rules of thumb would be highly appreciated!

Thanks a lot in advance! 🙏

https://redd.it/1k8pk92
@r_devops
Did Buildkite remove their developer plan (aka free plan)?

My previous employer used Buildkite and I liked it, so I set up some personal projects and used Buildkite to play around with things. They used to have a free "developer" plan that allowed like 3 pipelines.

I hadn't touched it in a while, and when I went to test some things the other day it wanted me to pay for a plan. It looks like they consolidated to just a "pro" plan at like $30/month and an enterprise plan.

Anyone have any details on this?

https://redd.it/1k8r1zk
@r_devops