A simple, self-hosted Sentry alternative you can install in 5 minutes (with just one command!)
Hey folks 👋
I got fed up with monthly bills and SaaS lock-in, and I needed a better way to track errors in my apps, so I built Telebugs. It’s an error tracker you pay for once, host yourself, and actually own. It took me 3.5 months of solo Rails work, and I’m really happy with the results.
It’s compatible with Sentry SDKs, so it probably supports your language or framework of choice.
It’s built for people who just want something that works without the headache. Setup is dead simple: one command and you’re rolling in 5 minutes. It automatically sets up your server with an SSL certificate. All you need to do is specify the domain you want it to run on.
It catches your errors, keeps everything on your machine, and doesn’t bug you with upsells or surprise fees.
**Tech stack:**
* Rails 8 + Hotwire + TailwindCSS
* SQLite (yep)
* Runs in a single Docker container
* Compatible with Sentry SDKs
* Push + email alerts (needs to be enabled explicitly)
* Rule-based data cleanup
* No analytics, no third-party calls
Happy to answer any questions here, or over email. Cheers!
[https://telebugs.com/](https://telebugs.com/)
https://redd.it/1kcc8qg
@r_devops
Telebugs
Telebugs is a privacy-friendly self-hosted error tracking tool. Track errors and exceptions in your apps, get instant notifications, and keep data secure.
Running Virtual Desktops on AWS or Azure.
What are some options for running virtual desktops to test desktop applications on AWS?
This could potentially scale up to hundreds or thousands of virtual environments. The main use-case is to test our desktop application.
I know that AWS offers WorkSpaces and Azure has AVD.
What are some other potential solutions?
https://redd.it/1kce4yr
@r_devops
Reddit
From the devops community on Reddit
Explore this post and more from the devops community
Meta: How do you all use AI? I'm totally not trying to find ideas for a startup
To not appear too suspicious, I'm going to start this post by talking a little bit about how I, too, am slightly suspect of AI, but that any "reasonable person" would at least give it a try. (And, we all want to be considered reasonable, right?) I've also clearly never searched for similar topics in this subreddit, and don't really have any interest in engaging with the subreddit community at all aside from making this post.
Then, I'll talk a little bit about how I want AI to do some "simple tasks" for me, like... well... literally all of my job. But the existing tools are a little bit piecemeal, leading me to...
...my super awesome tech demo that's just a wrapper for ChatGPT, and a totally coy call-for-action for people to try it out, along with a request for suggestions.
Oh, and I really like to sprinkle emojis into my post, like these: ✨💻🔎🙅♂️
\---------
/s
Seriously, can we get some moderation on this kind of nonsense? Our subreddit was already being invaded by people with 0 YOE who couldn't hack SWE interviews and thought that devops would be an "easy" alternative, and now it's being invaded by people who think they can AI-away everything and want to pitch their "one tool to rule them all" idea.
edit: the number of people thinking that I'm seriously asking how they use AI, rather than trying to point out the flood of AI-related spam we're getting, is somewhat bemusing.
https://redd.it/1kce06y
@r_devops
DevOps Related Conferences?
My boss wants to send me to a conference or two this year. Initially I suggested MS Ignite, but the timing didn't work out. What are some other conferences that would be of value to a DevSecOps engineer whose background leans harder on the ops side than the others?
https://redd.it/1kcgglf
@r_devops
Virtual Desktop Testing Environment on AWS / Azure
I'm currently researching solutions for running virtual desktop environments specifically to test desktop applications on AWS. We're looking at potentially scaling up to hundreds or even thousands of concurrent virtual desktop environments, so scalability, manageability, and cost-effectiveness are key considerations.
We're aware of solutions like AWS WorkSpaces and Azure Virtual Desktop (AVD), but I'm curious about other viable options or alternative approaches that teams here might be using successfully.
Specifically:
* What solutions have you successfully deployed for high-volume desktop application testing?
* Are there effective alternatives to AWS WorkSpaces or Azure Virtual Desktop?
* How do these solutions handle provisioning, automation (e.g., Terraform, Ansible, CircleCI integration), and multi-OS support (Windows, Linux, macOS)?
* Are there particular tools or third-party services you've found effective for automating large-scale testing environments?
Any insights, experiences, or recommendations would be greatly appreciated.
Thanks in advance!
https://redd.it/1kcg6gl
@r_devops
Anyone running .http test files in their pipes?
I've got a load of tests already written as .http files, and I'd like a way to run them when I release. So I'm after something like Newman.
Anyone got anything please?
https://redd.it/1kcg5w9
@r_devops
Audit tool using ebpf
Hey folks,
I'm building an open-core tool that uses eBPF to generate audit-grade logs from Linux systems and containers — primarily for companies that need to comply with SOC 2, PCI-DSS, or HIPAA.
It traces kernel-level events such as process execution, file access, and network connections, and it can export compliance reports. I see it as a modern version of auditd.
It's a hobby project in Rust for now. I would like to know if any of you would find this type of tool useful.
Thanks!
https://redd.it/1kcl49l
@r_devops
Experience Setting Up a High-Availability Private Cloud with MinIO Clusters
I recently wrote about my experience building a private cloud storage solution using MinIO in clustered mode. The goal was to achieve S3-compatible, highly available object storage for internal workloads — without relying on public cloud vendors.
The article covers setup, replication, scalability, and some operational lessons learned around HA, persistence, and bucket policies.
If you’re exploring self-hosted alternatives to S3 or interested in resilient storage for on-prem DevOps, I’d love to hear your thoughts or experiences.
Read the article 👉🏻 https://medium.com/@yassine.ramzi2010/revolutionizing-private-cloud-storage-with-minio-clusters-3cc4bd87c6c9
https://redd.it/1kcmb8g
@r_devops
Medium
🚀 Revolutionizing Private Cloud Storage with MinIO Clusters
How MinIO is Redefining Object Storage for Large-Scale Enterprises
Guide: Hardening Docker Images with Trivy, seccomp, and Linux Capabilities
As part of a DevSecOps initiative, I explored practical ways to secure Docker images in CI/CD pipelines. This post walks through using Trivy for vulnerability scanning, applying seccomp profiles, and minimizing Linux capabilities to reduce attack surfaces.
It’s a hands-on guide focused on security without compromising portability or automation.
If you’re working on container hardening, DevSecOps practices, or simply tightening security
https://medium.com/@yassine.ramzi2010/%EF%B8%8F-devsecops-in-action-hardening-your-docker-images-with-trivy-seccomp-and-capabilities-292365a5bd79
https://redd.it/1kcm9qi
@r_devops
Medium
🛡️ DevSecOps in Action: Hardening Your Docker Images with Trivy, Seccomp, and Capabilities
In today’s DevSecOps world, securing your Docker images is not just a nice-to-have — it’s a critical step in delivering secure…
Asking for help in implementing a monitoring application?
I'm a junior software dev and I want to build semi-real-time monitoring for my application (minor delays of under 15 minutes are acceptable). My application produces a stream of events, each in one of the following states: queued, error, processed, to_be_requeued. I want to track when an order goes to the error state. At the same time, I want to track when an order got queued but never reached the processed state (maybe due to an application bug). Such an order should be flagged as an error once the time since it was queued exceeds some threshold.
I'm stumped on how to approach this problem. My initial PoC implementation dumps raw events into a Timescale database, and a web API then polls and processes them at a set interval. The implementation is not as performant as I expected, and I want to improve it.
After browsing the internet, I've read that the ELK stack is commonly used for alerting and monitoring, but I'm wondering whether it applies to my situation. AFAIK Elasticsearch is just a key-value store and Kibana is just a visualization tool/dashboard for that data.
Can this be done with ELK? If not, what other approaches or architectures should I consider?
Links to resources would be helpful, and I would also appreciate input from anyone who has done a similar task before. Thank you!
Sample events:
{
  "user": "mel",
  "order_id": "0001",
  "event-type": "queued",
  "message": {
    "timestamp": "<unix_time>"
  }
},
{
  "user": "mel",
  "order_id": "0002",
  "event-type": "queued",
  "message": {
    "timestamp": "<unix_time>"
  }
},
{
  "user": "mel",
  "order_id": "0003",
  "event-type": "queued",
  "message": {
    "timestamp": "<unix_time>"
  }
},
{
  "user": "mel",
  "order_id": "0001",
  "event-type": "error",
  "message": {
    "timestamp": "<unix_time>"
  }
},
{
  "user": "mel",
  "order_id": "0002",
  "event-type": "processed",
  "message": {
    "timestamp": "<unix_time>"
  }
},
{
  "user": "mel",
  "order_id": "0003",
  "event-type": "to_be_requeued",
  "message": {
    "timestamp": "<unix_time>"
  }
},
{
  "user": "mel",
  "order_id": "0003",
  "event-type": "queued",
  "message": {
    "timestamp": "<unix_time>"
  }
},
{
  "user": "mel",
  "order_id": "0003",
  "event-type": "processed",
  "message": {
    "timestamp": "<unix_time>"
  }
}
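To make the two checks in the question concrete, here is a minimal Python sketch of the flagging logic, independent of any storage or alerting stack. It is only an illustration under stated assumptions: the `Event` class, the `find_problem_orders` function, the flattened field names, the example timestamps, and the 900-second threshold are all made up for this example and are not part of the original application. It also assumes that only the latest event per order matters.

```python
from dataclasses import dataclass


@dataclass
class Event:
    order_id: str
    event_type: str  # "queued", "error", "processed", or "to_be_requeued"
    timestamp: int   # Unix time in seconds (hypothetical values below)


def find_problem_orders(events, now, stuck_after_seconds):
    """Return (errored, stuck) order-id lists based on each order's latest event."""
    latest = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        latest[event.order_id] = event  # later events overwrite earlier ones

    # Orders whose most recent state is "error".
    errored = sorted(o for o, e in latest.items() if e.event_type == "error")
    # Orders still sitting in "queued" for longer than the threshold.
    stuck = sorted(
        o for o, e in latest.items()
        if e.event_type == "queued" and now - e.timestamp > stuck_after_seconds
    )
    return errored, stuck


events = [
    Event("0001", "queued", 1000),
    Event("0002", "queued", 1010),
    Event("0003", "queued", 1020),
    Event("0001", "error", 1030),
    Event("0002", "processed", 1040),
    Event("0003", "to_be_requeued", 1050),
    Event("0003", "queued", 1060),
]

errored, stuck = find_problem_orders(events, now=2000, stuck_after_seconds=900)
print(errored)  # ['0001']
print(stuck)    # ['0003']
```

Whether an order left in `to_be_requeued` should also count as stuck is a design decision the question leaves open; the sketch only flags `queued`.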
https://redd.it/1kcru5c
@r_devops
Tech Support to DevOps?
I'm currently working at a software development company that owns its products/solutions, as a Tech-Functional support engineer for one of them. This was my first real job, and I've been here around 3 years.
Right now, I'm looking to move into a more technical role. I'm very interested in networking (CCNA in progress), programming, scripting, server management, and automation. I'm just wondering how hard it is to land a DevOps job: I've applied to some vacancies, but HR simply says that although I meet some of the requirements, the managers wouldn't consider me due to my lack of experience in a DevOps role.
I'd love to land a job as a DevOps Engineer some day; I don't mind working for it and treating it as a medium/long-term objective. I'm looking for advice or suggestions from people who know the field. What role or job would help me at this point? What would be a good next step to start pointing my career toward DevOps? Also, in your experience, how feasible is the jump I'm trying to make?
https://redd.it/1kcshr5
@r_devops
Exploring An AI‑Powered DevOps Copilot Enabling One‑Click Production Deployments for Startups and Scale‑Ups
Hey r/devops 👋🏻
**TL;DR** – I’m hacking on *DevOps Agent*, an AI‑driven ChatOps tool that turns “deploy my app” into a one‑line command for lean teams. I’m still at prototype / wait‑list stage and would love feedback from anyone who’s felt the pain of getting an MVP into a reliable production environment.
# Why I’m building this
After a few tours as a DevOps engineer, I noticed the same pattern at scale‑ups:
* Spinning up a prototype is easy; wiring prod‑grade CI/CD takes days (or weeks).
* DevOps talent is scarce/expensive, and outsourcing often adds more complexity.
* A single mis‑configured Helm chart on Friday = sleeper‑cell outage on Monday.
I wondered: **what if ChatGPT‑style natural language could drive infra?**
# What the agent does (early prototype)
    # Slack / terminal demo
    > @DevOpsAgent deploy --auto --env=staging
    🔎 Scanning repo…
    📦 Generating Docker & Helm manifests
    ☁️ Provisioning GKE cluster (europe-west1)
    🚀 Deployed in 3m42s | cost est: $12.10/mo
**Under the hood**
* Reads GitHub/GitLab repo → detects language, DB, queue, etc.
* Generates Dockerfiles + Kubernetes/Helm manifests.
* Uses Terraform to spin up AWS / GCP / Azure (your choice).
* Streams cost + health metrics back into chat.
* Lets you roll back or scale via `@DevOpsAgent scale redis 2x`
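The post does not say how the agent detects a repo's stack, so the following is only a guess at the simplest possible version of that first step: checking for well-known marker files. The `MARKERS` table and the `detect_stack` function are hypothetical names invented for this sketch, and a real detector would need far more rules (and would also have to find databases and queues, which this does not attempt).

```python
import tempfile
from pathlib import Path

# Hypothetical marker files; a real detector would need many more rules.
MARKERS = {
    "package.json": "Node.js",
    "Gemfile": "Ruby",
    "requirements.txt": "Python",
    "go.mod": "Go",
}


def detect_stack(repo_dir):
    """Return the sorted list of languages whose marker file exists in repo_dir."""
    root = Path(repo_dir)
    return sorted(
        lang for marker, lang in MARKERS.items() if (root / marker).is_file()
    )


# Usage: build a throwaway "repo" containing only a package.json.
with tempfile.TemporaryDirectory() as repo:
    Path(repo, "package.json").write_text("{}")
    print(detect_stack(repo))  # ['Node.js']
```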
**Current status**
* Early Proof‑of‑concept in Encore + VoltAgent + WebContainers + Pulumi
* Can deploy a Node.js / Mongo demo app to GKE & tear it down.
* Private wait‑list live at [**devopsagent.dev**](https://devopsagent.dev) (very bare‑bones)
**Stuff I’m stuck on / would love input**
1. **Ephemeral environments** – What’s the nicest UX you’ve seen for per‑PR previews?
2. **Security guardrails** – Which “sane defaults” would you enable first? (IAM, image scanning, …)
3. **Pricing** – If this saved you a DevOps hire, what’s a sensible monthly tier?
4. **Interface** – Slack/Teams bot vs CLI plugin vs web dashboard: which would you actually use?
# How you can help
* **Tear the idea apart** – What’s missing / unrealistic?
* **Share horror stories** – Your worst deploy nightmares help me design guardrails.
Thanks for reading! Any feedback—brutal or kind—totally welcome. 🙏
Alex – [devopsagent.dev](https://devopsagent.dev)
https://redd.it/1kcxee3
@r_devops
devopsagent.dev
DevOps Agent - Ship to Prod in Minutes
AI-powered DevOps platform that turns any GitHub repo into a production deployment on Google Cloud in minutes.
Business scaling up - what cloud provider should we use?
Our business is scaling rapidly — we’re currently handling millions of unique requests per week, and this number continues to grow. At the moment, we’re hosted on DigitalOcean, paying approximately €400 per month for the following infrastructure:
* One small Redis server for caching
* Four medium ARM nodes in two data centers
* One MySQL database with two replicas
However, we’re now facing significant performance issues due to unoptimized application code. Our stack includes Symfony (backend), MySQL (database), and a partially VueJS-powered frontend.
# Key Problems
1. **Blocking Requests:** When User A and User B make simultaneous requests, User B is delayed until User A's request completes. If our code executes a long-running operation (e.g., 20 seconds), the server is locked during that time, triggering Cloudflare’s load balancer to mark it as unhealthy. I initially suspected this was related to MySQL’s transaction isolation level (TIL), but DigitalOcean doesn’t allow us to change this setting. Regardless, with our current code inefficiencies, this issue is likely to worsen.
2. **Lack of Scalable Architecture:** We're not using Kubernetes or any dynamic scaling solution. Our infrastructure consists of a fixed number of servers behind Cloudflare’s load balancer. This will likely become a bottleneck as we grow.
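The blocking symptom in problem 1 is what you would see if the pool of request workers on a node were effectively limited to one (for PHP-FPM, the pool size is governed by `pm.max_children`). The post does not confirm this is the cause, so treat the following as an illustration of that one hypothesis, not a diagnosis. The `finish_times` function and the request durations are made up for the example; it simulates a fixed worker pool without actually sleeping.

```python
import heapq


def finish_times(requests, workers):
    """Simulate a fixed pool of workers serving requests in arrival order.

    requests: list of (name, arrival_time, duration) tuples, sorted by arrival.
    Returns {name: finish_time} in seconds.
    """
    free_at = [0.0] * workers  # min-heap of times at which each worker is free
    heapq.heapify(free_at)
    finished = {}
    for name, arrival, duration in requests:
        worker_free = heapq.heappop(free_at)  # earliest available worker
        start = max(arrival, worker_free)     # wait if every worker is busy
        end = start + duration
        finished[name] = end
        heapq.heappush(free_at, end)
    return finished


# User A runs a 20-second operation; User B's request needs only 0.5 seconds.
requests = [("A", 0.0, 20.0), ("B", 0.0, 0.5)]
print(finish_times(requests, workers=1))  # {'A': 20.0, 'B': 20.5}
print(finish_times(requests, workers=2))  # {'A': 20.0, 'B': 0.5}
```

With a single worker, B waits behind A's full 20 seconds; with two, B completes independently. If the real pool is already larger than one, the cause lies elsewhere (for example, session or database locks).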
# What We Need to Do
1. **Optimize the Application Code:** We need to refactor our backend to avoid inefficient loops and rely more on optimized database queries. **Question:** Does Symfony block concurrent requests by design? Is there a way to configure Symfony or PHP-FPM to handle multiple requests more efficiently? Or is it more likely that MySQL's transaction behavior is the real bottleneck? Would it be hard to migrate to PostgreSQL and is it really that much faster?
2. **Improve Infrastructure & Scalability:** We need a more robust and flexible server architecture with proper failover and autoscaling capabilities. **Question:** Which cloud providers would you recommend for scalable and reliable database hosting? Our primary concern is database performance and availability. Thanks to Cloudflare’s load balancer, we’re flexible with server location and even open to transitioning to Kubernetes.
We’re aiming to stay ahead of any major issues that could impact our platform’s stability. Any advice or insights would be greatly appreciated.
https://redd.it/1kcx1iw
@r_devops
Should we use Grafana open source in a medium company
I work at a medium-sized company using New Relic for observability. We ingest over 80GB of data monthly, run 20+ services across production and staging, and use MongoDB. While New Relic covers logs, metrics, traces and MongoDB well, it’s getting too expensive.
We’re considering switching to Grafana, Prometheus, and OpenTelemetry to handle all our monitoring needs, including MongoDB. But setting up Grafana has been a lot of manual work. There aren’t many good, maintained open-source dashboards—especially for MongoDB—and building them from scratch takes time.
I also read that as data and dashboards grow, Grafana can slow down and require more powerful machines, which adds cost and complexity. That makes us question if it’s worth switching. For a medium-sized company, is moving to open source really viable, or are the long-term setup and maintenance costs just as high?
Is anyone running Grafana OSS at scale? Does it handle large volumes well in practice?
https://redd.it/1kcz9e5
@r_devops
Is OpenTelemetry ready to monitor my (and your) infra today?
OpenTelemetry has come a long way in distributed tracing, and it also provides strong correlation across logs, traces, and metrics. But OTel as a project has kept growing and is now far more powerful than a distributed tracing tool.
Awareness of OTel for infra monitoring is still low. Folks mostly use Prometheus, which is great, but if you are already using OTel for traces, logs, etc., maybe you should give it a shot for infra monitoring as well.
That said, OTel for infra is still expanding with new receivers etc being added.
As a medium to spread awareness on this, and to help anyone looking for a shift from prom or already using OTel trying to decrease the silos, I wrote a blog that broadly discusses,
1/ how you can use OTel for monitoring your VMs, K8s clusters and pods easily
2/ if OTel is ready to monitor your infra
3/ how to switch to OTel from Prometheus [pretty easy with the prometheus receiver\]
Link to the blog here
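To make point 3/ a bit more concrete, here is a rough sketch (not taken from the blog) of what a Collector config using the Prometheus receiver might look like, built in Python and printed as JSON. The job name, scrape target and OTLP endpoint are made-up placeholders, and it assumes a Collector distribution that ships the Prometheus receiver (for example the contrib build). Since JSON is a subset of YAML, the printed output should be usable as a Collector config file.

```python
import json


def build_collector_config(job_name, targets, otlp_endpoint):
    """Build a Collector config that scrapes Prometheus targets and forwards them over OTLP.

    All argument values are supplied by the caller; nothing here is a default
    of the Collector itself.
    """
    return {
        "receivers": {
            "prometheus": {
                # This block uses the same layout as a regular Prometheus scrape config.
                "config": {
                    "scrape_configs": [
                        {
                            "job_name": job_name,
                            "scrape_interval": "30s",
                            "static_configs": [{"targets": list(targets)}],
                        }
                    ]
                }
            }
        },
        "processors": {"batch": {}},
        "exporters": {"otlp": {"endpoint": otlp_endpoint}},
        "service": {
            "pipelines": {
                "metrics": {
                    "receivers": ["prometheus"],
                    "processors": ["batch"],
                    "exporters": ["otlp"],
                }
            }
        },
    }


# Hypothetical values: a node exporter on localhost and a placeholder backend.
config = build_collector_config(
    job_name="node",
    targets=["localhost:9100"],
    otlp_endpoint="otel-backend.example.com:4317",
)
print(json.dumps(config, indent=2))
```

The idea is that existing scrape jobs move under `receivers.prometheus.config` mostly unchanged, so the exporters you already scrape keep working while the metrics flow through the same pipeline as your traces and logs.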
https://redd.it/1kcye6b
@r_devops
SigNoz
Is OpenTelemetry ready for Infra Monitoring?
OpenTelemetry has made infrastructure monitoring easy to get started with and comes with options for Kubernetes cluster and pod monitoring. It also makes it possible to correlate infrastructure data with application monitoring.
AWS SAA-C03 Exam Traps That Almost Failed Me (And How to Dodge Them)
Hello comrades!
I cleared my AWS SAA exam recently and wrote an article about my journey and the common pitfalls to avoid :)
I hope this helps anyone who's planning to take the exam soon :)
Please feel free to add anything I might have missed :)
https://medium.com/@nageshrajcodes/aws-saa-c03-exam-traps-that-almost-failed-me-and-how-to-dodge-them-08c41ed73e2a?sk=cea7f9606ce910a723b4064b2a48c8d9
I wish you all the very best :')
Thank you :)
https://redd.it/1kd0ghv
@r_devops
Medium
AWS SAA-C03 Exam Traps That Almost Failed Me (And How to Dodge Them)
I scored 825/1000 on my AWS SAA-C03 exam — but only after falling face-first into every trap AWS could throw at me. Here’s how to avoid…
Help creating a WhatsApp bot
Hi, I'm trying to create a bot for my company that grabs files from a SharePoint folder and sends them through WhatsApp when asked. I have zero experience; what's the easiest way to do it? My job kind of depends on this.
Edit: I can only use Copilot AI, because of privacy policies.
https://redd.it/1kd2t6z
@r_devops
Which DevOps repositories need contributions?
I don't think I am the only one who has a little bit of spare time in their life and would love to help out on a DevOps project in need.
What are your favorite ones? Which repositories need just a little bit more love, whether writing documentation, improving runtime or adding features?
https://redd.it/1kd41pq
@r_devops
Thoughts on asdf
I ran into this tool a few years back and didn't give it much thought (I ended up using pyenv at the time).
But now I am juggling a few projects that require different versions for different things. Enter asdf. It is not ultra intuitive but in a nutshell:
1. list and get the plugins you need
2. list and install the versions you need
3. set the required versions for your project
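The three steps above can be sketched roughly like this. asdf records a project's pinned versions in a `.tool-versions` file (one `name version` pair per line), and the small Python helper below turns such a file into the matching `asdf plugin add` and `asdf install` commands. The helper name and the example pins are made up for illustration, and the parsing is simplified: it skips `system` and `path:` entries since there is nothing to install for them.

```python
def asdf_commands(tool_versions_text):
    """Turn .tool-versions content into the asdf commands that would satisfy it."""
    commands = []
    seen_plugins = set()
    for raw_line in tool_versions_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        name, *versions = line.split()
        # Step 1: make sure the plugin for this tool is present.
        if name not in seen_plugins:
            commands.append(f"asdf plugin add {name}")
            seen_plugins.add(name)
        # Step 2: install each pinned version (later entries are fallbacks).
        for version in versions:
            if version == "system" or version.startswith("path:"):
                continue  # nothing for asdf to install
            commands.append(f"asdf install {name} {version}")
    return commands


# Hypothetical pins; step 3 is simply having this file in the project root.
EXAMPLE = """\
# project pins
python 3.12.3
nodejs 20.12.2
"""

for command in asdf_commands(EXAMPLE):
    print(command)
```

In practice, running `asdf install` with no arguments inside the project directory should install everything listed in `.tool-versions`, so a helper like this is mostly useful for seeing what will happen or for scripting CI images.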
You can use it to build images in CI. Talk to databases of different versions. Install pesky tools that require a specific version of Python. The world is your oyster.
If you haven't tried it, I highly recommend it. If you are new/junior, definitely learn it!
Question to the seniors: Do you use asdf? Any alternatives? Cautionary tales? Suggestions?
https://redd.it/1kd4m8y
@r_devops
How do you manage upgrades in a multi-tenant environment where every team does their own thing and "dev downtime" is treated like a production outage?
We support dozens of tenant teams (with more being added every quarter), each running multiple apps with wildly different languages, package versions, and levels of testing. There's very little standardization, and even where we're able to create some, inevitably some team comes along with a requirement and leadership authorizes a one-off, alternatively deployed solution with little thought given to the long-term maintenance and suitability of said solution. The org's mantra is "don't get in the developers' way," which often ends up meaning: no enforcement, very few guardrails, and no appetite for upgrades or maintenance work that might introduce any friction.
Our platform team is just two people (down from seven a year ago), responsible for everything from cost savings to network improvements to platform upgrades. What happens, over and over again, is this:
1. We test an upgrade thoroughly against our own infrastructure apps and roll it out.
2. Some tenant apps break—often because they're using ancient libraries, make assumptions about networking, or haven’t been tested in years.
3. We get blamed, the upgrade gets rolled back, and now we're on the hook to fix it.
4. We try to schedule time with the tenant teams to reproduce issues in a lower environment, but even their "dev" environments are treated like production. Any interruption is considered "blocking development."
5. Scheduling across dozens of tenants takes weeks or months. The upgrade gets deprioritized as "too expensive" in terms of engineer hours. We get a new top-down initiative and the last one is dropped into tech debt purgatory.
6. A few months later, we try again—but now we have even more tenants and more variables. Rinse and repeat.
It’s exhausting. We’re barely keeping the lights on, constantly writing docs and tickets for upgrades we never actually deliver. Meanwhile, many of these tenant teams have been around for a decade and are just migrating onto our systems. Leadership has promised them we won’t “get in their way,” which leaves us with zero leverage to enforce even basic testing or compatibility standards.
We’re stuck between being responsible for reliability and improvement… and having no authority to actually enforce the practices that would lead to either.
How do you manage upgrades in environments like this? Is there a way out of this loop, or is the answer just "wait for enough systems to break that someone finally cares"?
https://redd.it/1kd6srk
@r_devops
Memcached Docker Images (as small as 124 KB!) – Feedback Wanted
I wanted to share a project I’ve been working on: a suite of Docker images for Memcached 1.6.38 that I’ve stripped down to the bare minimum and optimized specifically for containerized environments. These images are scratch-based, TCP-only, and fully configurable through environment variables via patched code (no CLI args needed, though they are still supported).
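For anyone who wants to give feedback, a quick way to check that a TCP-only container actually answers is to speak the memcached text protocol over a plain socket. The sketch below is not part of the project; the function names are made up, and it assumes the container is reachable on memcached's default port 11211, which your configuration may change.

```python
import socket


def build_set(key, value, exptime=0):
    """Encode a text-protocol set command: set <key> <flags> <exptime> <bytes>."""
    header = f"set {key} 0 {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key):
    """Encode a text-protocol get command for a single key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get(response):
    """Return the value from a single-key get reply, or None on a miss."""
    if response.startswith(b"END"):
        return None
    header, _, rest = response.partition(b"\r\n")
    parts = header.split()
    # A hit looks like: VALUE <key> <flags> <bytes>, then the data block.
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError(f"unexpected reply: {header!r}")
    return rest[: int(parts[3])]


def smoke_test(host="127.0.0.1", port=11211, timeout=2.0):
    """Store one key and read it back; True if the round trip works.

    Simplification: assumes each small reply arrives in a single recv call.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_set("smoke", b"ok"))
        if not conn.recv(1024).startswith(b"STORED"):
            return False
        conn.sendall(build_get("smoke"))
        return parse_get(conn.recv(1024)) == b"ok"
```

With a container running and its port published, calling `smoke_test()` should return `True`; a connection error most likely means the port mapping or listen address differs from the assumption above.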
Thanks.
🔗 GitHub: https://github.com/johnnyjoy/memcached-docker
🔗 Docker Hub: https://hub.docker.com/r/tigersmile/memcached
https://redd.it/1kd6quk
@r_devops
GitHub
GitHub - johnnyjoy/memcached-docker: Dockerized memcached 393kb on AMD64
Dockerized memcached 393kb on AMD64. Contribute to johnnyjoy/memcached-docker development by creating an account on GitHub.