DevOps&SRE Library

Kubernetes observability from day one - Mixins on Grafana, Mimir and Alloy

One of the things we quickly find out when using Kubernetes is that it’s hard to know what is going on in our cluster. In most cases, we implement monitoring and alerting after we’ve dealt with problems, but there is a better way.

We don’t need to wait for the explosions; we can reuse the community’s knowledge and implement observability from the beginning.

https://www.amazinglyabstract.it/kubernetes/observability/2025/06/26/kubernetes-mixins.html

Troubleshooting Packet Drops in a Kubernetes Cluster

https://medium.com/@teyyubismayilov/troubleshooting-packet-drops-in-a-kubernetes-based-observability-platform-3f5d29cfa69b

How We Migrated 30+ Kubernetes Clusters to Terraform

https://medium.com/learnings-from-the-paas/how-we-migrated-30-kubernetes-clusters-to-terraform-cd2b1cef8b84

Gateway API v1.3.0: Advancements in Request Mirroring, CORS, Gateway Merging, and Retry Budgets

https://kubernetes.io/blog/2025/06/02/gateway-api-v1-3

Kubernetes Node Stability and Performance: Tuning Kubelet for Better Resource Management

https://medium.com/@jfpucheu/kubernetes-node-stability-and-performance-tuning-kubelet-for-better-resource-management-e0f95ccfefe9

A Crash Course in Running Kubernetes Locally

Kubernetes clusters on your very own machine

https://jc1175.medium.com/a-crash-course-in-running-kubernetes-locally-7c573dd64933

kubectl-ai

kubectl-ai acts as an intelligent interface, translating user intent into precise Kubernetes operations, making Kubernetes management more accessible and efficient.

https://github.com/GoogleCloudPlatform/kubectl-ai

ut

A fast, lightweight CLI utility toolkit for developers and IT professionals. ut provides a comprehensive set of commonly-used tools in a single binary, eliminating the need to install and remember multiple utilities or search for random websites to perform simple tasks.

https://github.com/ksdme/ut

shiori

Shiori is a simple bookmarks manager written in Go, intended as a simple clone of Pocket. You can use it as a command-line application or as a web application. It is distributed as a single binary, so it is easy to install and use.

https://github.com/go-shiori/shiori

Terraforming With AI

This article covers using a team of AI agents together with the Terraform MCP server and Docker's cagent tool to clean up some rather gnarly autogenerated Terraform without writing any code.

https://dev.to/zloeber/terraforming-with-ai-g0o

cagent

A powerful, easy-to-use, customizable multi-agent runtime that orchestrates AI agents with specialized capabilities and tools, as well as the interactions between agents.

https://github.com/docker/cagent

Postgres Migrations Using Logical Replication

Moving a Postgres database isn’t a small task. For Postgres users, it is typically one of the biggest projects you’ll undertake.

https://www.crunchydata.com/blog/postgres-migrations-using-logical-replication

mathesar

An intuitive spreadsheet-like interface that lets users of all technical skill levels view, edit, query, and collaborate on Postgres data directly. It is self-hosted, with native Postgres access control.

https://github.com/mathesar-foundation/mathesar

A Journey Through Kafkian SplitDNS in a Multitenant Kubernetes Offering

https://medium.com/learnings-from-the-paas/a-journey-through-kafkian-splitdns-in-a-multitenant-kubernetes-offering-d5fd274f676f

Non-HA Kubernetes Gotchas: Downtime and Autoscaling Pitfalls with Single Replica Workloads

https://eng.zemosolabs.com/non-ha-kubernetes-gotchas-downtime-and-autoscaling-pitfalls-with-single-replica-workloads-812ac4150d70

kor

Kor is a tool to discover unused Kubernetes resources.

https://github.com/yonahd/kor

homelab

After rebuilding my homelab one too many times, I committed to managing it entirely with GitOps. This repository is the result: a blueprint for a resilient, production-inspired Kubernetes cluster.

https://github.com/theepicsaxguy/homelab

mcp-server-kubernetes

MCP Server that can connect to a Kubernetes cluster and manage it. Supports loading kubeconfig from multiple sources in priority order.

https://github.com/Flux159/mcp-server-kubernetes

mysql-operator

The MySQL Operator for Kubernetes is an operator for managing MySQL InnoDB Cluster setups inside a Kubernetes cluster. It manages the full lifecycle, covering setup and maintenance, including automated upgrades and backups.

https://github.com/mysql/mysql-operator

How do reliability engineers work in 2025?

SRE engineers are the people who keep production alive, set up monitoring, catch incidents, and answer for uptime.

The team at DevCrowd, who specialize in concise, open reports on various IT professions, are launching their first study of SRE and DevOps practices to understand how things work from the inside: who is responsible for what, which tools actually work, and where the line between SRE and DevOps lies.

💡 Why take part

– see how your experience compares with that of other engineers: processes, team maturity, tools;

– find out which reliability practices your colleagues are adopting;

– help make the SRE role clearer and more visible on the market.

🛠 The survey covers tasks, tools, monitoring, alerting, CI/CD, postmortem culture, and how roles interact.

🕐 It takes about 10 minutes to complete.

📝 Take the survey → https://survey.alchemer.eu/s3/90909470/SRE-2025

📊 Results will be published in November on devcrowd.ru
