DevOps&SRE Library

YamlQL

Query YAML files with SQL. Transform any YAML structure into a queryable database instantly.

https://github.com/AKSarav/YamlQL


kube-composer

A modern, intuitive Kubernetes YAML generator that simplifies deployment configuration for developers and DevOps teams.

https://github.com/same7ammar/kube-composer


What Is OTLP and Why It's the Future of Observability

You're probably reading this because you don't want to sink time or money into proprietary protocols and agents anymore. Why would you? They tie you to a single vendor, force you to adapt to their quirks, and make it painful to change direction later.

What you really need is an open, consistent way to instrument, collect, and move your telemetry without worrying about compatibility or lock-in. That's exactly what OpenTelemetry (OTel) gives you. And at the center of it all is the OpenTelemetry Protocol (OTLP), the common language that makes your services, collectors, and backends speak fluently with each other.

This guide will walk you through OTLP in detail: what it is, why it matters, and how to use it in real pipelines. By the end, you'll see how embracing OTLP and pairing it with an OTel-native backend helps you solve the challenges of modern observability while keeping your stack open, reliable, and free of lock-in.

https://www.dash0.com/knowledge/opentelemetry-protocol-otlp


What are metrics in OpenTelemetry: A Complete Guide

A comprehensive guide to understanding metrics in OpenTelemetry. What they are, how they work, and how to implement them effectively with practical code examples.

https://oneuptime.com/blog/post/2025-08-26-what-are-metrics-in-opentelemetry/view


Cloudreve

Self-hosted file management system with multi-cloud support.

https://github.com/cloudreve/Cloudreve


Building a Unified OpenTelemetry Pipeline in Kubernetes

https://fatihkoc.net/posts/opentelemetry-kubernetes-pipeline


velld

A self-hosted database backup management tool. Schedule automated backups, monitor status, and manage multiple databases from one place.

https://github.com/dendianugerah/velld


PrivateCaptcha

Private Captcha is an independent, privacy-first, self-hostable Proof-of-Work CAPTCHA service made in the EU.

https://github.com/PrivateCaptcha/PrivateCaptcha


flint

A single <11MB binary with a modern Web UI, CLI, and API for KVM.
No XML. No bloat. Just VMs.

https://github.com/volantvm/flint


Optimising Kubernetes deployment with local continuous development tooling

https://gawbul.medium.com/optimising-kubernetes-deployment-with-local-continuous-development-tooling-15b1fbf7a722


cluster-bare-autoscaler

Cluster Bare Autoscaler (CBA) automatically adjusts the size of a bare-metal Kubernetes cluster by powering nodes off or on based on real-time resource usage, while safely cordoning and draining nodes before shutdown.

https://github.com/docent-net/cluster-bare-autoscaler


volcano-vgpu-device-plugin

The Volcano vGPU device plugin provides a device-sharing mechanism for NVIDIA devices managed by Volcano.

https://github.com/Project-HAMi/volcano-vgpu-device-plugin


KAI-Scheduler

KAI Scheduler is a robust, efficient, and scalable Kubernetes scheduler that optimizes GPU resource allocation for AI and machine learning workloads.

https://github.com/NVIDIA/KAI-Scheduler


kubezonnet

Monitor cross-zone network traffic in Kubernetes.

https://github.com/polarsignals/kubezonnet


k3k

K3k, Kubernetes in Kubernetes, is a tool that empowers you to create and manage isolated K3s clusters within your existing Kubernetes environment.

https://github.com/rancher/k3k


ramalama

RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of containers.

https://github.com/containers/ramalama


hwameistor

HwameiStor is an HA local storage system for cloud-native stateful workloads. It creates a local storage resource pool for centrally managing all disks such as HDD, SSD, and NVMe. It uses the CSI architecture to provide distributed services with local volumes and provides data persistence capabilities for stateful cloud-native workloads or components.

https://github.com/hwameistor/hwameistor


Why Environments Beat Clusters For Dev Experience

The cloud ecosystem has reached a turning point. Tools for operators/administrators are now mature and can handle most day-to-day operations that deal with Kubernetes clusters. Finally, we can turn our focus to application developers and their needs.

If you look at all the Kubernetes tools available, you’ll understand that most of them treat Kubernetes as another form of infrastructure. You can easily find tools that install Kubernetes, monitor Kubernetes, secure Kubernetes, do cost estimations for Kubernetes, etc. But how many Kubernetes tools can you find that target application developers and their day-to-day responsibilities?

Several companies even try to hide Kubernetes completely from developers by using leaky abstractions or so-called developer portals. These adoption efforts almost always fail simply because nobody asked the developers what they really need. Don’t fall into this trap.

In this article, we look at some common examples of what companies “think” developers need versus what developers need in practice, in the context of application development for Kubernetes.

https://medium.com/containers-101/why-environments-beat-clusters-for-dev-experience-f6eef0cd928b


Terraform state locking explained (and why it hurts at scale)

Terraform state locking is a textbook example of solving a distributed coordination problem with the wrong primitive. You have concurrent actors, partial modifications, and dependency graphs—and the solution is a global mutex on a JSON blob. The scaling characteristics are exactly what you'd predict from this mismatch.
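
The "global mutex" point can be illustrated with a toy model. The sketch below is not Terraform's implementation and not necessarily what the linked article proposes. The class names, actor names, and resource addresses are all made up for illustration, and the per-resource variant is only a hypothetical contrast. It shows two actors that touch disjoint resources: under a single global lock the second one is blocked anyway, while finer-grained locking would let both proceed.

```python
class GlobalLock:
    """One lock for the whole state file: any holder blocks every other actor."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, actor, resources):
        # The requested resources are ignored on purpose: the lock is global.
        if self.holder is not None:
            return False
        self.holder = actor
        return True

    def release(self, actor):
        if self.holder == actor:
            self.holder = None


class PerResourceLocks:
    """Hypothetical contrast: lock only the resources an actor will modify."""

    def __init__(self):
        self.holders = {}  # resource address -> actor currently holding it

    def try_acquire(self, actor, resources):
        # Refuse if any requested resource is already held by someone.
        if any(r in self.holders for r in resources):
            return False
        for r in resources:
            self.holders[r] = actor
        return True

    def release(self, actor):
        self.holders = {r: a for r, a in self.holders.items() if a != actor}


def run(lock):
    # Two actors changing disjoint resources (illustrative addresses).
    first = lock.try_acquire("alice", {"aws_vpc.main"})
    second = lock.try_acquire("bob", {"aws_s3_bucket.logs"})
    return first, second


print(run(GlobalLock()))        # (True, False): bob waits despite no overlap
print(run(PerResourceLocks()))  # (True, True): disjoint changes proceed
```

In the toy model, the global lock serializes every change regardless of what it touches, which is the scaling behavior the blurb refers to.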

https://stategraph.dev/blog/terraform-state-locking-explained


How to write and rightsize Terraform modules