SRE / DevOps more exciting than full stack development?
Looking for some vibes-based career advice.
I'm currently a web dev at a f5000, 3 yoe, and kinda bored. Lately, I feel most engaged and satisfied when a production bug gets me into the zone, and I have to use all my mental energy to resolve the bug ASAP and make a meaningful difference to a user.
This happens about once a week for a few hours at a time. The rest of the time I'm babysitting GitHub Copilot to do some CRUD ticket.
I know it's a pretty nice gig, grass is greener on the other side, etc etc. I am still interested in hearing some perspectives:
If you've moved from full stack web dev to SRE or DevOps, do you find the work more engaging? More secure? More lucrative? Is there downtime?
For more context, my company does not have dedicated SRE / DevOps roles. I'm planning ahead for if I get laid off, or decide to commit to upskilling for a 'better' job.
To be honest, I have a limited understanding of what SRE and DevOps roles involve. I imagine working with Kubernetes, Terraform, being on call a lot, etc. Do let me know if there's something I'm missing. TIA
https://redd.it/1mbv64v
@r_devops
Reddit
From the devops community on Reddit
Explore this post and more from the devops community
Started a newsletter digging into real infra outages - first post: Reddit’s Pi Day incident
Hey guys, I just launched a newsletter where I’ll be breaking down real-world infrastructure outages - postmortem-style.
These won’t just be summaries; I’m digging into how complex systems fail even when everything looks healthy. Things like monitoring blind spots, hidden dependencies, rollback horror stories, etc.
The first post is a deep dive into Reddit’s 314-minute Pi Day outage - how three harmless changes turned into a $2.3M failure:
Read it here
If you're into SRE, infra engineering, or just love a good forensic breakdown, I'd love for you to check it out.
https://redd.it/1mbo3oq
@r_devops
Substack
The Reddit Pi Day Incident
How three innocent changes conspired to create a $2.3M disaster
DevOps Projects Feedback
Hi Reddit Fam!
I have been trying to build a portal of realistic projects that people can work through to get hands-on experience.
Building the portal was not the hard part; gathering quality projects in one place is. The best approach I could think of was to target various certification exams and build projects around them.
I have added a few projects, and I'd appreciate your feedback on them. What other types of projects should I add? Any recommendations would be appreciated.
Website: https://bartman.ai/
Coupon code: DOCKERSEC
If something doesn’t work then let me know.
For now, I am focused on CKA certification for this week.
https://redd.it/1mc4uky
@r_devops
BartMan
BartMan - AI Career Development Platform | Interview Prep & Job Matching
AI-powered career platform with video interview analysis, personalized job matching, and skill development. Get interview-ready and land your dream job with AI coaching.
Anyone integrated an AI code reviewer into your CI/CD?
We just rolled out CARE — an AI-powered plugin that performs code reviews directly in your CI/CD pipelines or locally.
It’s tailored for Guidewire/Gosu (but also supports Java or any other popular programming language) and integrates with Bitbucket/Git/Azure DevOps.
Instead of static rule checks, CARE does:
✅ Real-time feedback in MRs
✅ Unit test/code generation
✅ Inline responses to dev comments
✅ Seamless updates with new best practices
Trying to gauge: is DevOps moving toward proactive QA with AI, or is this still too early for most teams?
https://redd.it/1mc5obe
@r_devops
sollers.eu
CARE – Code AI Review Excellence | Sollers
Do DevOps teams at newer companies still choose Terraform for IaC, or native IaC services (like CloudFormation/Bicep)?
Terraform has been the go-to for companies with cloud resources across multiple platforms or migrating from on-prem, because of its great cross-platform support. But for newer startups or organisations starting out in the cloud, I’d say using platform-specific IaC services is usually easier than picking up Terraform, and the platform integration is probably better too. Native tools also don’t require installing extra CLIs or managing state files.
If you're at a newer company or helping clients spin up infra, what are you using for IaC? Are platform-native tools good enough now, or is Terraform still the default?
https://redd.it/1mc7p46
@r_devops
Free DevOps Tool Developer Experience Audit
I'm offering free developer experience audits specifically focused on DevOps tools.
My background: Helped dyrectorio (deployment orchestration and container management) and Gimlet (GitOps deployment) gain significant GitHub adoption through improved developer onboarding and documentation. Not affiliated with them anymore.
I specialize in identifying friction points in CI/CD pipelines, infrastructure tooling adoption, and developer-facing automation workflows.
What I'll analyze:
* Developer onboarding for your DevOps tools
* CI/CD pipeline user experience and documentation
* Infrastructure-as-code developer workflows
* Tool integration friction points
DM me if you'd like an audit of your developer-facing DevOps processes.
https://redd.it/1mc8qna
@r_devops
Problem when fetching image via api gateway
I'm trying to use KrakenD as an API gateway. I have this endpoint on a Flask microservice (both the gateway and the microservice are containerized):
/images/<date>/<hour>/<filename>
When I fetch the image with a direct connection there are no errors. When I use the endpoint on the gateway it gives back a 404 error. This is the endpoint. I have other endpoints but those work.
{
  "endpoint": "/api/images/{date}/{hour}/{filename}",
  "method": "GET",
  "input_params": [
    "date",
    "hour",
    "filename"
  ],
  "backend": [
    {
      "url_pattern": "/images/{date}/{hour}/{filename}",
      "host": "https://data_processor:8080"
    }
  ]
}
This is the configuration of the endpoint.
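A quick sanity check on an endpoint definition like the one above can rule out a couple of configuration slips before digging into the network side. The sketch below is only an illustration, not a diagnosis of this particular 404: it assumes KrakenD's documented backend keys `url_pattern` and `host` (where `host` is a list of base URLs), and `check_endpoint` is a hypothetical helper name, not part of KrakenD.

```python
import re

# Matches {placeholder} segments in an endpoint or backend path.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def check_endpoint(cfg):
    """Return a list of human-readable problems found in one endpoint definition.

    Assumption: backends use the keys 'url_pattern' (string) and 'host' (list of URLs).
    """
    problems = []
    exposed = set(PLACEHOLDER.findall(cfg.get("endpoint", "")))
    for i, backend in enumerate(cfg.get("backend", [])):
        pattern = backend.get("url_pattern")
        if pattern is None:
            problems.append(f"backend[{i}]: missing 'url_pattern'")
        else:
            # Every placeholder used by the backend must be declared by the endpoint.
            unknown = set(PLACEHOLDER.findall(pattern)) - exposed
            if unknown:
                problems.append(
                    f"backend[{i}]: placeholders not in endpoint: {sorted(unknown)}"
                )
        if not isinstance(backend.get("host"), list):
            problems.append(f"backend[{i}]: 'host' should be a list of URLs")
    return problems


# Hypothetical example mirroring the post, with 'host' written as a list.
EXAMPLE = {
    "endpoint": "/api/images/{date}/{hour}/{filename}",
    "method": "GET",
    "backend": [
        {
            "url_pattern": "/images/{date}/{hour}/{filename}",
            "host": ["https://data_processor:8080"],
        }
    ],
}

for problem in check_endpoint(EXAMPLE):
    print(problem)
```

If a check like this comes back clean, the next things worth comparing are the scheme and port the gateway uses versus the ones used for the direct request that works.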
https://redd.it/1mc9nq4
@r_devops
Rollouts
Hello folks,
I want to understand how you guys handle rollouts.
We are hosting services on Azure.
During a rollout, we make a few manual changes to app config, KV, DB, etc., and then push services one by one to AKS. How do you handle this? I'm asking so that everybody can learn about different approaches and implement them.
https://redd.it/1mc7v24
@r_devops
If I hear "treat your platform as a product" one more time...
Let's just admit it, we've all been there:
You start with a clean slate. You build a platform tailored perfectly to your org.
Custom pipelines. Custom tooling. A CI/CD “stack” that makes sense to you.
And it works… until it doesn’t.
Suddenly, your internal platform is this black box only you and your team understand.
It’s brittle, hard to onboard new people to, impossible to scale cleanly, and when something breaks, you’re reinventing the wheel again.
We all say things like “our business is unique”, “our scale is different”, “our use case is too complex”. But in reality, the foundations are the same across the board.
https://redd.it/1mcc78f
@r_devops
Tried Jenkins again, it was not as bad as I had in mind!
Hi everyone,
as the title says, I gave Jenkins another shot. The last time I used it was at my former company, with a pretty archaic setup: several VMs running Docker Engine, the Docker plugin to spin up workers, and some static servers for on-site deployments in a local datacenter. All of it glued together with some cool Ansible playbooks (still proud of those, ngl). The goal back then was to avoid the classic pet server scenario. If you know me personally, you probably know the company I worked for!
Now I gave it a fresh spin and approached it with a Kubernetes-first mindset. I deployed everything via Helm charts and used the Kubernetes plugin. And since I like working with Pulumi (and have worked for them since then), I used that too. You could likely do the same with Terraform and the Kubernetes/Helm provider.
I wrote it all down here: https://www.pulumi.com/blog/jenkins-pulumi-2025-experience/
Any "old" DevOps tech you also gave a new look/try?
https://redd.it/1mcg2kk
@r_devops
pulumi
I Tried Jenkins in 2025 with Pulumi: Here's How It Went
My hands-on experience using Jenkins with Pulumi in 2025. Learn about the setup, challenges, and key takeaways from this modern CI/CD approach.
Farewell to my dad
https://blog.mattsbit.co.uk/2025/07/23/dad/
I originally wrote the speech in my blog repo, just for writing purposes.
My dad's funeral was a couple of days ago and I wondered whether, maybe, someone might appreciate it, either because they've lost their dad or because it makes them appreciate their dad a little more.
Particularly in this community, as I assume you probably grew up messing with computers and/or servers and probably had a similar influence from your dads.
https://redd.it/1mcheuv
@r_devops
Is there a proper way to get depot sizes on Perforce?
I wrote a script for our Perforce server, but soon after I ran it, the server crashed.
The server (Linux) was a stable system with 4 CPUs and 8 GB of RAM, but it crashed after I ran my script. After the crash I doubled the CPUs to 8 and the RAM to 16 GB.
I'm still wary of using my script below, so I'm asking how Perforce admins query depot sizes safely.
depot_sizes.sh
—————————————————
#!/bin/bash
for depot in $(p4 depots | awk '{print $2}'); do
    echo "Depot: $depot"
    p4 sizes "//$depot/..." | awk '{total += $4} END {print " Total Size: " total " bytes\n"}'
done
—————————————————
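One variation on the script above is to ask the server for a single summary line per depot with `p4 sizes -s` instead of streaming one line per file through `awk`. The sketch below is a rough illustration under stated assumptions: it assumes the summary line has the form `//depot/... <N> files <M> bytes` and that `p4 depots` lines start with `Depot <name>`; the helper names are made up for this example. Whether this is light enough for a given server is something to verify off-hours, since the server still has to walk the depot metadata.

```python
import re
import subprocess

# Assumed format of a `p4 sizes -s` summary line: "//depot/... 12 files 3456 bytes"
SUMMARY = re.compile(r"^(?P<path>\S+) (?P<files>\d+) files (?P<bytes>\d+) bytes")


def parse_sizes_summary(line):
    """Parse one summary line into (file_count, total_bytes)."""
    match = SUMMARY.match(line.strip())
    if match is None:
        raise ValueError(f"unexpected p4 sizes output: {line!r}")
    return int(match["files"]), int(match["bytes"])


def depot_names():
    """List depot names, taken from the second field of each `p4 depots` line."""
    out = subprocess.run(
        ["p4", "depots"], capture_output=True, text=True, check=True
    ).stdout
    return [line.split()[1] for line in out.splitlines() if line.startswith("Depot ")]


def depot_size(depot):
    """Return (file_count, total_bytes) for one depot, one depot per server request."""
    out = subprocess.run(
        ["p4", "sizes", "-s", f"//{depot}/..."],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sizes_summary(out.splitlines()[0])
```

A caller would loop over `depot_names()` and call `depot_size(name)` for each, ideally with a pause between depots so the requests do not pile up on the server.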
https://redd.it/1mcherq
@r_devops
DevOps Confessions
Hey guys, just ran into something funny on YouTube, thought you might enjoy it.
Plus, AI videos are terrifying.
https://www.youtube.com/watch?v=Y1xIRAjzTjM
https://redd.it/1mcgq4j
@r_devops
YouTube
DevOps Anonymous - David S.
Check out more testimonials on https://zesty.co/lp/devops-anonymous
Test your database backups before they fail you in production
Hey devs! 👋
Just shipped BackupGuardian - tired of backup validation tools that only check syntax but don't actually test restoration.
This one spins up Docker containers and actually restores your entire backup to see what breaks. Supports PostgreSQL/MySQL + has a CLI for CI/CD.
Built it after a 3 AM incident where a "validated" backup was missing half the constraints 😅
Demo: https://www.backupguardian.org
Anyone else been burned by "good" backups before?
https://redd.it/1mctf97
@r_devops
Backup Guardian
Backup Guardian | Database Backup Monitoring & Alerts
Monitor your database backups with real-time alerts, health checks, and automated validation. Ensure your data is always protected.
Good tip
I came across this tip and couldn't help but share it; it's so good and useful.
I think we should follow him and help him reach 10 followers
https://x.com/username_husan/status/1950368472258793538?s=46
https://redd.it/1mcwusn
@r_devops
Are these real-time DevOps projects?
I came across this website which has DevOps projects. The author mentioned that those are real-time DevOps projects. There are 7 pages on this website and 37 projects(as of now). Could some experienced DevOps engineers please visit the link below and confirm if these are real-time projects if possible?
https://projects.prodevopsguytech.com/blog
Thanks for your valuable time.
https://redd.it/1mcyeig
@r_devops
DevOps & Cloud Projects Showcase
A curated collection of real-time DevOps and Cloud projects, ranging from beginner to advanced levels. Built using Next.js and styled with Tailwind CSS, this showcase leverages modern web technologies to provide a fast, responsive, and interactive experience.
Using Vector search for Log monitoring / incident report management?
Hi, I wanted to know if anyone in the DevOps community has used vector search / Agentic RAG for the following:
🔹 Log monitoring + triage
Some setups use agents to scan logs in real time, highlight anomalies, and even suggest likely root causes based on past patterns. Haven’t tried this myself yet, but sounds promising for reducing alert fatigue.
This agent could help reduce Mean Time to Recovery (MTTR) by analyzing logs, traces, and metrics to suggest root causes and remediation steps. It continuously learns from past incidents to improve future diagnostics. It stores structured incident metadata and unstructured logs as JSON documents, embeds and indexes the logs using vector search for similarity-based retrieval, and relies on high-throughput data ingestion plus sub-millisecond querying for real-time analysis.
One might argue: why do you need a vector database for this? Storing logs as vectors doesn't make sense. But I just wanted to see if anyone has a different opinion or even an open source repository.
Also would love to know if we could use vector search for some other use-case apart from log monitoring - like incident management reporting
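To make the similarity-based retrieval idea concrete, here is a minimal sketch of matching a new log line against past incidents. It is deliberately a toy: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database, and the incident records and field names are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase token counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def most_similar(query, incidents, top_k=1):
    """Return the top_k past incidents whose log text is closest to the query."""
    q = embed(query)
    ranked = sorted(incidents, key=lambda inc: cosine(q, embed(inc["log"])), reverse=True)
    return ranked[:top_k]


# Hypothetical incident history; in practice this would live in a vector store.
PAST_INCIDENTS = [
    {"id": "INC-1", "log": "connection pool exhausted timeout waiting for database connection", "fix": "raise pool size"},
    {"id": "INC-2", "log": "disk usage above 95 percent on volume data", "fix": "rotate logs"},
    {"id": "INC-3", "log": "tls certificate expired for upstream host", "fix": "renew certificate"},
]

new_line = "ERROR timeout waiting for database connection from pool"
for incident in most_similar(new_line, PAST_INCIDENTS):
    print(incident["id"], "->", incident["fix"])
```

The same retrieval step would apply to incident reports: embed the report text instead of raw log lines and look up the closest earlier reports.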
https://redd.it/1mczb0a
@r_devops
Project N1netails
# 🧠 Story time:
I started building N1netails after a moment at work that really stuck with me. One of my production support teammates started flipping tables (literally) after getting a Splunk alert 15 minutes too late. By the time we were notified, the issue had already escalated. That experience got me thinking:
I actually like Splunk, but I also think there are some real problems with it:
1. High learning curve – You basically need to take a course just to be productive with Splunk. Because of this, most of our production support folks weren’t using it properly — or even at all.
2. Poor context – I’d get notified by a Splunk alert, but then I had to spend valuable time digging to figure out what actually went wrong. The alert itself wasn’t enough.
3. Query throttling – In big organizations, querying Splunk often means getting throttled. You’re hunting down a bug, and suddenly your queries stop loading. It’s frustrating and slows everything down.
4. Centralization – Again, great for security teams. But as a developer, I just want to be alerted on issues related to my services. Competing for Splunk resources across a large org is overkill if all I want is simple service-level alerting.
So that’s why I built N1netails.
The name comes from three ideas:
* N1 = Think “Big O” notation — O(1), O(n), etc. — but the goal is to get fast, direct insights. N=1.
* ne = Any
* Tails = Like tail -f, watching logs in real-time.
Put it all together and you get N1netails.
The goal? Get notified ASAP when something breaks in the systems that matter to me and my team.
As a developer, I don’t need a full-blown SIEM to monitor the entire company. I just want to know when my stuff is broken — and ideally have some help understanding what happened.
That’s why N1netails includes:
* A prebuilt dashboard (no setup required)
* Stack trace capture
* LLM assistance for debugging (through a helper named Inari)
I also made it easy to self-host. You can check it out here:
* SaaS: [https://app.n1netails.com](https://app.n1netails.com/)
* Docs: [https://n1netails.com](https://n1netails.com/)
* GitHub: [https://github.com/n1netails](https://github.com/n1netails)
Right now, it’s optimized for Java and Spring Boot, but I’m working on expanding support to other languages and platforms.
I know people will probably say, “Why make this? There are tools for this already.” And that’s fair. But I’m building this because I’ve used those tools, and I still believe there’s room for something better — or at least something simpler.
I’m not trying to replace Splunk. N1netails can supplement the tools you already use and help with the day-to-day debugging, triage, and monitoring that’s often overlooked.
N1netails is an open-source project that provides practical alerting and monitoring for applications. If you’re tired of relying on overly complex SIEM tools to identify issues — or if your app lacks alerting altogether — N1netails gives you a straightforward way to get notified when things break.
Thanks for reading. If you want to try it, give feedback, or contribute, check out the repo.
And feel free to leave your hate comments or tell me why you love Splunk. I don’t care. I’m building this because I believe there’s a better way to handle alerts — and I want to help others who feel the same.
https://redd.it/1mczyji
@r_devops
N1Netails
N1netails provides powerful alert management and monitoring tools to help you streamline incident response. Integrate with Slack, Teams, Discord, and Telegram for real-time notifications.
Building a Game
I'm looking for devs to design a game from A to Z. HTML-based, with crypto wallet connection and remote playing. Contact me for more details.
https://redd.it/1mcynfo
@r_devops
Why Git Branching Strategy Matters in Database DevOps?
Hey folks,
I've been working a lot with CI/CD and GitOps lately, especially around databases, and wanted to share some thoughts on Git branching strategies that often cause more harm than good when managing schema changes across environments.
🔹 The problem:
Most teams use a separate Git branch for each environment (like `dev`, `qa`, `prod`). While it seems structured, it often leads to merge conflicts, missed hotfixes, and environment drift, which is especially painful in DB deployments where rollback isn’t trivial.
🔹 What works better:
A trunk-based model with a single `main` branch and declarative promotion through pipelines. Instead of splitting branches per environment, you can use tools to define environment-specific logic in the changelog itself.
🔹 GitOps and DBs:
Applying GitOps principles to database deployments (version-controlled, auditable, automated via CI/CD) goes a long way toward reducing fragility, especially in teams scaling fast or operating in regulated environments.
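To make the "environment-specific logic in the changelog" idea a bit more concrete, here's a rough Python sketch. It is not any particular tool's API: `ChangeSet`, `CHANGELOG`, and `plan` are made-up names for illustration, and the SQL is placeholder. The point is just that one changelog on `main` can serve every environment if each change declares where it applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeSet:
    """One schema change. All names here are hypothetical, for illustration only."""
    id: str
    sql: str
    # Environments this change applies to; empty means "every environment".
    contexts: frozenset = frozenset()


# A single ordered changelog kept on the main branch (placeholder SQL).
CHANGELOG = [
    ChangeSet("001-create-users", "CREATE TABLE users (id INT PRIMARY KEY, email TEXT)"),
    ChangeSet("002-seed-test-users", "INSERT INTO users VALUES (1, 'test@example.com')",
              frozenset({"dev", "qa"})),
    ChangeSet("003-add-email-index", "CREATE INDEX idx_users_email ON users (email)"),
]


def plan(changelog, environment, applied):
    """Return the changes still to run in `environment`, in changelog order.

    `applied` is the set of change ids already recorded for that environment.
    """
    return [
        cs for cs in changelog
        if cs.id not in applied
        and (not cs.contexts or environment in cs.contexts)
    ]


if __name__ == "__main__":
    # prod skips the test-data seed; dev would get all three changes.
    print([cs.id for cs in plan(CHANGELOG, "prod", set())])
    # prints ['001-create-users', '003-add-email-index']
```

Promotion then becomes a pipeline running the same changelog against the next environment, rather than a merge between long-lived branches.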
If you're curious, I wrote a deeper blog post that outlines common pitfalls and tactical takeaways:
👉 Choosing the Right Branching Strategy for Database GitOps
Would love to hear how others are managing DB schemas in Git and your experience with GitOps for databases.
https://redd.it/1md2gw1
@r_devops
Harness.io
Database DevOps: Fix Git Before It Breaks Production
Learn why per-environment Git branching breaks database deployments and how a trunk-based, context-driven GitOps approach restores reliability, speed, and confidence.
I built a local AI assistant like ChatGPT that runs 100% offline – No data leaks, no internet, just private intelligence
Hey folks,
I’m a developer and I recently built a **fully offline AI assistant** that you can run on your local computer — kind of like a private Jarvis, but with zero cloud dependencies.
It uses a locally hosted LLM (like LLaMA 3 or Mistral) + offline voice support + a simple desktop interface. You can talk to it, ask it to code, explain stuff, summarize files, or even build dev tools — all **without your data ever leaving your computer**.
# Features:
* 100% local: no internet or cloud access needed
* Can understand and generate code
* Voice input and output (like Jarvis)
* No OpenAI, no Google, no Gemini — just your own secure AI
* Perfect for devs, sysadmins, and companies that banned ChatGPT
I made this because I saw so many developers being blocked from using ChatGPT at work due to **data privacy rules**. This solves that.
Would love to hear what you think — and happy to give early access or even walk you through setting it up.
https://redd.it/1md3zkd
@r_devops