Wasps With Bazookas v2: a distributed HTTP/HTTPS load testing system
# What the Heck is This?
Wasps With Bazookas is a distributed swarm-based load testing tool made up of two parts:
Hive: the central coordinator (think: command center)
Wasps: individual agents that generate HTTP/S traffic from wherever you deploy them
You can install wasps on as many machines as you want — across your LAN, across the world — and aim the swarm at any API or infrastructure you want to stress test.
It’s built to help you measure actual performance limits, find real bottlenecks, and uncover high-overhead services in your stack — without the testing tool becoming the bottleneck itself.
# Why I built it
As you can tell, the name is a nod to its inspiration, Bees with Machine Guns.
I spent months debugging performance bottlenecks in production systems. Every time I thought I found the issue, it turned out the load testing tool itself was the bottleneck, not my infrastructure.
This project actually started 6+ years ago as a Node.js wrapper around wrk, but that had limits. I eventually rewrote it entirely in Rust, ditched wrk, and built the load engine natively into the tool for better control and raw speed.
# What Makes This Special?
# The Hive Architecture
🏠 HIVE (Command Center)
↕️
🐝🐝🐝🐝🐝🐝🐝🐝
Wasp Army Spread Out Across the World (or not)
↕️
🎯 TARGET SERVER
Hive: Your command center that coordinates all wasps
Wasps: Individual load testing agents that do the heavy lifting
Distributed: Each wasp runs independently, maximizing throughput
Millions of RPS: Scale to millions of requests per second
Sub-microsecond Latency: Precise timing measurements
Real-time Reporting: Get results as they happen
I hope you enjoy WaspsWithBazookas! I frequently create open-source projects to simplify my life and, ideally, help others simplify theirs as well. Right now, the interface is quite basic, and there's plenty of room for improvement. I'm excited to share this project with the community in hopes that others will contribute and help enhance it further. Thanks for checking it out and I truly appreciate your support!
https://redd.it/1lv5r5q
@r_devops
GitHub
GitHub - Phara0h/WaspsWithBazookas: Its like bees with machine guns but way more power
Release cycles, CI/CD, and branching strategies
For all mid sized companies out there with monolithic and legacy code, how do you release?
I work at a company where the release cycle is daily releases with a confusing branching strategy (a combination of trunk-based and Gitflow). A release will often include hotfixes alongside ready-to-deploy features, and the release process has been tedious lately.
For now, we mainly have two main branches (apart from feature and bugfix branches). Code changes are first merged to dev after unit tests run (and QA tests if necessary); then we deploy the changes to an environment daily, run E2Es, and create a PR to the release branch. If the PR is reviewed and all is well with the tests and the code exceptions, we merge the PR, deploy to staging (where we run E2Es again), and then deploy to prod.
Is there a way to improve this process? I'm curious about the release cycle of big companies
https://redd.it/1lv6brv
@r_devops
Advice Needed: Robust PII Detection Directly in the Browser (WASM / JS)
Hi everyone,
I'm currently building a feature where we execute SQL queries using DuckDB-WASM directly in the user's browser. Before displaying or sending the results, I want to detect any potential PII (Personally Identifiable Information) and warn the user accordingly.
Current Goal:
- Run PII detection entirely on the client-side, without sending data to the server.
- Integrate seamlessly into existing confirmation dialogs to warn users if potential PII is detected.
Issue I'm facing:
My existing codebase is primarily Node.js/TypeScript. I initially attempted integrating Microsoft Presidio (Python library) via Pyodide in-browser, but this approach failed due to Presidio’s native dependencies and reliance on large spaCy models, making it impractical for browser usage.
Given this context (Node.js/TypeScript-based environment), how could I achieve robust, accurate, client-side PII detection directly in the browser?
Thanks in advance for your advice!
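As a stopgap while a full in-browser NER solution is found, a lightweight regex screen over the query results can catch the obvious cases. The sketch below is in Python for brevity; the patterns port directly to JavaScript's RegExp. Both the patterns and the `detect_pii` helper are illustrative assumptions, not an exhaustive detector.

```python
import re

# Lightweight regex-based PII screen: a fallback when NER models
# (e.g. Presidio + spaCy) are too heavy for the browser. These
# patterns are illustrative, not exhaustive.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"),
}

def detect_pii(rows):
    """Return the set of PII kinds found in any cell of a result set."""
    found = set()
    for row in rows:
        for cell in row:
            text = str(cell)
            for kind, pattern in PII_PATTERNS.items():
                if pattern.search(text):
                    found.add(kind)
    return found
```

In the DuckDB-WASM flow, the equivalent check would run over the result batch before rendering, popping the confirmation dialog whenever the returned set is non-empty.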
https://redd.it/1lv72bs
@r_devops
DataDog Synthetics is the best but way overpriced. Made something better and free
After seeing DataDog Synthetics pricing, I built a distributed synthetic monitoring solution that we've been using internally for about a year. It's scalable, performant, and completely free.
Current features:
Distributed monitoring nodes
Multi-step browser checks
API monitoring
Custom assertions
Coming soon:
Email notifications (next few days)
Internal network synthetics
Additional integrations
Open sourcing most of the codebase
If you need synthetic monitoring but can't justify enterprise pricing, check it out: https://synthmon.io/
Would love feedback from the community on what features you'd find most useful.
https://redd.it/1lv8xlz
@r_devops
Best way to continue moving into DevOps from helpdesk?
I've looked over some of the roadmaps, and I know I already have some of the knowledge, so I'm curious what I have already done, and what I should do next, to continue down the career path into DevOps. Below are some of the things I'm weighing as I move down this path.
1) I graduated about a year ago with a degree in computer science. During that time I was exposed to several languages, including C, Java, and, most importantly (in my opinion), Python.
2) I have an A+ certification and am almost finished studying for my Network+.
3) As stated in the title, I currently work in a helpdesk position. I have only been there about four months, but during that time I have been writing basic PowerShell scripts to help automate tasks in Active Directory, and I've written one major Python script that makes ticket creation go a bit smoother (nothing fancy; it's really just a way to format text, as a lot of what we do is copying and pasting information, but it works).
4) I currently have a homelab. A lot of what I do is based around Docker containers that each run their own web application. I won't pretend I'm super familiar with Docker, but it is something I have used a decent amount.
5) I have used SQL, as well as some NoSQL query languages such as Neo4j's Cypher. I've also hosted a SQL database on AWS, but that was a while ago and it would take me a while to do it again.
Is there anything else I could do to further my knowledge? Any other certifications or intermediate career jumps I could make before landing a DevOps position? I'm a little bit lost, so any help would be appreciated.
https://redd.it/1lvbncd
@r_devops
My AWS Ubuntu instance status checks failed twice
I did not set up any CloudWatch restart actions. Last week, all of a sudden, my AWS instance's status checks failed.
After restarting the instance, it started working again.
When I checked the logs, I found this:
```
amazon-ssm-agent[405]: ... dial tcp 169.254.169.254:80: connect: network is unreachable
systemd-networkd-wait-online: Timeout occurred while waiting for network connectivity
```
It was working fine. Then last night the same instance failed again. This time, the errors were:
```
Jul 8 15:36:25 systemd-networkd[352]: ens5: Could not set DHCPv4 address: Connection timed out
Jul 8 15:36:25 systemd-networkd[352]: ens5: Failed
```
This is the command i used to get the logs:
grep -iE "oom|panic|killed process|segfault|unreachable|network|link down|i/o error|xfs|ext4|nvme" /var/log/syslog | tail -n 100
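For reference, the same filter can be expressed in Python if you want to reuse it in a monitoring script. The keyword list mirrors the `grep -iE` pattern above; the `failure_lines` helper is just an illustrative sketch.

```python
import re

# Pure-Python equivalent of the grep above: keep syslog lines that
# mention common failure signatures. Case-insensitive, like grep -i.
FAILURE_RE = re.compile(
    r"oom|panic|killed process|segfault|unreachable|network|link down"
    r"|i/o error|xfs|ext4|nvme",
    re.IGNORECASE,
)

def failure_lines(log_lines, limit=100):
    """Return up to `limit` of the most recent matching lines (like | tail)."""
    matches = [line for line in log_lines if FAILURE_RE.search(line)]
    return matches[-limit:]
```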
Why is this happening?
https://redd.it/1lvbqq3
@r_devops
Do you prefer fixed-cost cloud services or a hybrid pay-as-you-grow model?
Hey everyone,
I’m curious about how people feel when it comes to pricing models for cloud services.
For context:
Some platforms offer a fixed-cost, SaaS-like approach. You pay a predictable monthly fee that covers a set amount of resources (CPU, RAM, bandwidth, storage, etc.), and you don’t have to think much about scaling until you hit hard limits.
Others may offer a hybrid model. You pay a base fee for a certain resource allocation, but you can add more resources on demand (extra CPU, RAM, storage, bandwidth, etc.), and pay for that usage incrementally.
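To make the trade-off concrete, here is a toy Python comparison of the two models; all prices and allowances are made-up numbers for illustration.

```python
# Toy comparison of the two pricing models above. Invented numbers:
# a flat plan at $40/mo covering up to 8 vCPUs, vs. a hybrid plan
# with a $15/mo base (2 vCPUs included) plus $8/mo per extra vCPU.
FIXED_FEE = 40.0                          # flat monthly fee, up to 8 vCPUs
BASE_FEE, INCLUDED, PER_EXTRA = 15.0, 2, 8.0

def hybrid_cost(vcpus):
    """Monthly cost under the pay-as-you-grow plan."""
    return BASE_FEE + max(0, vcpus - INCLUDED) * PER_EXTRA

# Below the crossover the hybrid plan is cheaper; above it, the flat
# plan wins (until you hit its resource ceiling and must upgrade).
crossover = next(n for n in range(1, 9) if hybrid_cost(n) > FIXED_FEE)
```

With these made-up numbers the crossover lands at 6 vCPUs; the practical question is which side of that line your steady-state usage sits on, and how spiky it is.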
My questions:
As a developer or business owner, which model do you prefer and why?
Any horror stories or success stories with either approach?
I’d love to hear real-world experiences - whether you’re running personal projects, SaaS apps, or large-scale deployments.
Thanks in advance for your thoughts!
https://redd.it/1lvdtd1
@r_devops
What do cloud infrastructure costs look like at every stage of a startup?
So, I am writing a blog about what happens to infrastructure costs as startups scale up. That's not the exact topic, as I'm still researching and exploring, but I need your help to understand what infrastructure costs look like for a startup at every stage: early, growth, and mature. It would be great if I could get a detailed explanation of everything that happened.
Also, if you know of any research on this topic, please share it with me.
And if someone is willing, help me structure this blog properly and suggest other sections that should definitely be there.
https://redd.it/1lvf23u
@r_devops
Has anyone taken this AI-readiness infra quiz?
Found this 10-question quiz that gives you a report on how AI-ready your infrastructure is.
Questionnaire link: https://lnk.ink/bKmPl
It touches on things like developer self-service and platform engineering — felt like it's leaning a bit in that direction. Curious if anyone else took it and what you thought of your results. Are these kinds of frameworks useful or just more trend-chasing?
https://redd.it/1lvhaea
@r_devops
Any tools to automatically diagram cloud infra?
Are there any tools that will automatically scan AWS, GCP, Azure and diagram what is deployed?
So far, I have found Cloudcraft from Datadog, but it only supports AWS, and its automatic diagramming is still in beta (AFAIK).
I am considering building something custom for this, but judging from the lack of tools that support multi-cloud, or that offer more than manual diagramming, I wonder if I am missing some technical limitation that prevents such tools from being possible.
https://redd.it/1lvjpwo
@r_devops
Terraform at Scale: Smart Practices That Save You Headaches Later
https://medium.com/@DynamoDevOps/terraform-at-scale-smart-practices-that-save-you-headaches-later-part-1-7054a11e99db
https://redd.it/1lvkwa0
@r_devops
Medium
Terraform at Scale: Smart Practices That Save You Headaches Later (Part 1)
You don’t need more theory; what you really need is the practical stuff that counts when you’re building and scaling infrastructure with…
What are your tips for long running migrations and how to handle zero downtime deployments with migrations that transform data in the database or data warehouse?
Suppose you're running CD to deploy with zero downtime, and you're deploying a Laravel app proxied by NGINX.
Usually this can be done by writing the new files to a new directory under ./releases, like ./releases/1001, and then symlinking the new directory so that NGINX serves requests from its PHP code.
This works well, but if you need to transform millions of rows with some complex, long-running queries, what approach would you use to keep the app online yet avoid any conflicts?
Do large-scale apps have some toggle for a read-only mode? If so, is each account locked, transformed, then unlocked? Any best practices or stories from real-world experience are appreciated.
Thanks
https://redd.it/1lvix7m
@r_devops
Why is drift detection/correction so important?
Coming from a programming background, I'm struggling to understand why Terraform, Pulumi and friends are explicitly designed to detect and correct so-called cloud drift.
Please help me understand: why is cloud drift such a big deal for companies these days?
Back in the day (still today) database migrations were the hottest thing since sliced bread, and they assumed that all schema changes would happen through the tool (no manual changes through the GUI). Why is the expectation any different for cloud infrastructure deployment?
Thank you for your time.
https://redd.it/1lvn6pj
@r_devops
Tiny statically-linked nginx Docker image (~432KB, multi-arch, FROM scratch)
Hey all,
I wanted to share a project I’ve been working on: [nginx-micro](https://github.com/johnnyjoy/nginx-micro). It’s an ultra-minimal, statically-linked nginx build, packaged in a Docker image FROM scratch. On amd64, it’s just **~432KB**, compared to nearly 70MB for the official image. Multi-arch builds (arm64, arm/v7, 386, ppc64le, s390x, riscv64) are supported.
**Key points:**
* Built for container-native environments (Kubernetes, Compose, CI/CD, etc.)
* No shell, package manager, or writable FS—just the nginx binary and config
* *Only* HTTP and FastCGI (for PHP-FPM) are included—no SSL, gzip, or proxy modules
* Runs as root (for port 80), but worker processes drop to `nginx` user
* Default config and usage examples provided; custom configs are supported via mount
* Container-native logging (stdout/stderr)
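Since custom configs are mounted in, a minimal static-serving config for an image like this might look like the following. This is a sketch inferred from the feature list (HTTP only, no gzip/SSL/proxy directives); the mount path and filenames are assumptions, so check the repo's README for the actual defaults.

```nginx
# Minimal HTTP-only config: static assets under /www, no TLS (the
# reverse proxy in front terminates SSL). Paths are illustrative.
worker_processes  1;

events {
    worker_connections  1024;
}

http {
    server {
        listen 80;
        root   /www;

        location / {
            try_files $uri $uri/ =404;
        }
    }
}
```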
**Intended use:**
For internal use behind a real SSL reverse proxy (Caddy, Traefik, HAProxy, or another nginx). Not intended for public-facing or SSL-terminating deployments.
**Use-cases:**
* Static file/asset serving in microservices
* FastCGI for PHP (WordPress, Drupal, etc.)
* Health checks and smoke tests
* CI/CD or demo environments where you want minimal surface area
**Security notes:**
* No shell/interpreter = much lower risk of “container escape”
* Runs as root by default for port 80, but easily switched to unprivileged user and/or high ports
I’d love feedback from the nginx/devops crowd:
* Any features you wish were included?
* Use-cases where a tiny nginx would be *too* limited?
* Is there interest in an image like this for other internal protocols?
Full README and build details here: [https://github.com/johnnyjoy/nginx-micro](https://github.com/johnnyjoy/nginx-micro)
Happy to answer questions, take suggestions, or discuss internals!
https://redd.it/1lvptij
@r_devops
Real Consulting Example: Refactoring FinTech Project to use Terraform and ArgoCD
https://lukasniessen.medium.com/real-consulting-example-refactoring-fintech-project-to-use-terraform-and-argocd-1180594b071a
https://redd.it/1lvribg
@r_devops
Medium
Real Consulting Example: Refactoring FinTech Project to use Terraform and ArgoCD
This is a FinTech project from my consulting career as a software architect. It involves refactoring a project to use Infrastructure as…
Need advice: Centralized logging in GCP with low cost?
Hi everyone,
I’m working on a task to centralize logging for our infrastructure. We’re using GCP, and we already have Cloud Logging enabled. Currently, logs are stored in GCP Logging with a storage cost of around $0.50/GB.
I had an idea to reduce long-term costs:
• Create a sink to export logs to Google Cloud Storage (GCS)
• Enable Autoclass on the bucket to optimize storage cost over time
• Then, periodically import logs to BigQuery for querying/visualization in Grafana
I’m still a junior and trying to find the best solution that balances functionality and cost in the long term.
Is this a good idea? Or are there better practices you would recommend?
https://redd.it/1lvsura
@r_devops
AWS Freelance Project Pricing Help
I recently got my first gig setting up some cloud infra on AWS. The problem is I don't know how much is usually charged for this kind of project-based work. The infra I set up took about two days: I came up with the cloud architecture for the web app, set up CloudFront hosting and S3 buckets for storage, and wrote some Lambda functions for basic PIN-based security. This is all just proof of concept.
The final project will have:
- proper password access (it doesn't have to be super secure; it's just so a large group of select people can view some images)
- a database added for scalability
- changes to the CloudFront behaviors
(It's pretty much an image gallery website with flair.)
How should I price this?
https://redd.it/1lvsmnw
@r_devops
Announcing Factor House Local v2.0: A Unified & Persistent Data Platform!
We're excited to launch a major update to our local development suite. While retaining our powerful Apache Kafka and Apache Pinot environments for real-time processing and analytics, this release introduces our biggest enhancement yet: a new Unified Analytics Platform.
Key Highlights:
🚀 Unified Analytics Platform: We've merged our Flink (streaming) and Spark (batch) environments. Develop end-to-end pipelines on a single Apache Iceberg lakehouse, simplifying management and eliminating data silos.
🧠 Centralized Catalog with Hive Metastore: The new system of record for the platform. It saves not just your tables, but your analytical logic—permanent SQL views and custom functions (UDFs)—making them instantly reusable across all Flink and Spark jobs.
💾 Enhanced Flink Reliability: Flink checkpoints and savepoints are now persisted directly to MinIO (S3-compatible storage), ensuring robust state management and reliable recovery for your streaming applications.
🌊 CDC-Ready Database: The included PostgreSQL instance is pre-configured for Change Data Capture (CDC), allowing you to easily prototype real-time data synchronization from an operational database to your lakehouse.
This update provides a more powerful, streamlined, and stateful local development experience across the entire data lifecycle.
Ready to dive in?
⭐️ Explore the project on GitHub: https://github.com/factorhouse/factorhouse-local
🧪 Try our new hands-on labs: https://github.com/factorhouse/examples/tree/main/fh-local-labs
https://redd.it/1lvubnm
@r_devops
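The "centralized catalog" idea above — Spark (and Flink) sharing one Iceberg catalog backed by a Hive Metastore — boils down to a handful of session settings. The sketch below shows the standard Iceberg-on-Spark configuration keys; the catalog name, hostname, and port are assumptions for illustration, not Factor House's actual defaults.

```python
# Hypothetical Spark settings for an Iceberg catalog backed by a
# Hive Metastore (catalog name "demo" and the thrift URI are assumptions).
ICEBERG_SPARK_CONF = {
    # Register an Iceberg catalog named "demo" in Spark SQL.
    "spark.sql.catalog.demo": "org.apache.iceberg.spark.SparkCatalog",
    # Back it with a Hive Metastore so tables, views, and UDFs registered
    # there are visible to every engine pointing at the same metastore.
    "spark.sql.catalog.demo.type": "hive",
    "spark.sql.catalog.demo.uri": "thrift://hive-metastore:9083",
    # Enable Iceberg's SQL extensions (MERGE INTO, time travel, etc.).
    "spark.sql.extensions":
        "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
}


def to_submit_args(conf: dict) -> list:
    """Render the settings as spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args
```

Flink's Hive catalog takes the same metastore URI, which is what makes a view defined from one engine queryable from the other.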
I’m stumped- how do Mac application developers test and deploy their code?
I’ve mainly worked with devs who write code for websites, so it’s pretty easy for me to suggest how they should build their pipelines. However, I’m going to be working with a developer who wants to deploy code to a separate Mac using GitLab CI, and my brain is just not processing it. Won’t they ideally be writing their code on a Mac itself? How does one even deploy code to another Mac other than a tar/pkg file with an installer? How does local testing not fit the use case? I’m feeling super new to this and I definitely don’t want to guide them in the wrong direction, but the best ideas I came up with were 1) local testing or 2) a macOS-like Docker image, which it appears is not really a thing that Apple supports, for obvious reasons.
https://redd.it/1lw0pay
@r_devops
ELK Alternative: With Distributed tracing using OpenSearch, OpenTelemetry & Jaeger
I have been a huge fan of OpenTelemetry. Love how easy it is to use and configure. I wrote this article about an ELK alternative stack we built using OpenSearch and OpenTelemetry at the core. I operate similar stacks with Jaeger added for tracing.
I would like to say that OpenSearch isn't as inefficient as Elastic likes to claim. We ingest close to a billion spans and logs daily at a small overall cost.
PS: I am not affiliated with AWS in any way. I just think OpenSearch is awesome for this use case. But AWS's OpenSearch offering is egregiously priced, so don't use that.
https://osuite.io/articles/alternative-to-elk-with-tracing
Let me know if you have any feedback to improve the article.
https://redd.it/1lw3ovq
@r_devops
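For readers new to this stack: what actually lands in OpenSearch for tracing are span documents, correlated by a shared trace ID. The sketch below builds records roughly in the shape OTLP JSON exporters emit — field names follow the OTLP convention, but the attribute values and service name are illustrative, not the exact documents any given collector writes.

```python
import secrets
import time


def make_span(name: str, trace_id: str = None,
              parent_span_id: str = None) -> dict:
    """Build a span record roughly in the OTLP JSON shape.

    IDs are random hex (traceId: 16 bytes, spanId: 8 bytes); children
    share the parent's traceId, which is how a backend like OpenSearch
    or Jaeger stitches a request back together.
    """
    now = time.time_ns()
    return {
        "traceId": trace_id or secrets.token_hex(16),  # 32 hex chars
        "spanId": secrets.token_hex(8),                # 16 hex chars
        "parentSpanId": parent_span_id,
        "name": name,
        "kind": "SPAN_KIND_SERVER",
        "startTimeUnixNano": now,
        "endTimeUnixNano": now,  # set when the span actually ends
        "attributes": [{"key": "service.name",
                        "value": {"stringValue": "checkout"}}],
    }


# A request span and a child database span in the same trace.
root = make_span("GET /orders")
child = make_span("db.query", trace_id=root["traceId"],
                  parent_span_id=root["spanId"])
```

At a billion spans/logs a day, most of the cost conversation ends up being about index rollover and retention on documents like these, not ingestion itself.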
I Found a Roadmap for DevOps—Can You Confirm if it's Right?
Hello People,
I have been looking into DevOps for a while now, and I just found a roadmap for it. Would you guys be kind enough to let me know if it's a well-written roadmap worth following?
The roadmap: https://roadmap.sh/devops
Thank you in advance.
https://redd.it/1lw4h1p
@r_devops