DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

Roskomnadzor (RKN): https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
Termix

Termix is an open-source, forever-free, self-hosted all-in-one server management platform. It provides a web-based solution for managing your servers and infrastructure through a single, intuitive interface. Termix offers SSH terminal access, SSH tunneling capabilities, and remote file configuration editing, with many more tools to come.


https://github.com/LukeGus/Termix
arbok

Secure HTTP tunnels to localhost using WireGuard - Share your local dev server instantly


https://github.com/mr-karan/arbok
OpenArchiver

A secure, sovereign, and open-source platform for email archiving and eDiscovery.

Open Archiver provides a robust, self-hosted solution for archiving, storing, indexing, and searching emails from major platforms, including Google Workspace (Gmail), Microsoft 365, PST files, as well as generic IMAP-enabled email inboxes. Use Open Archiver to keep a permanent, tamper-proof record of your communication history, free from vendor lock-in.


https://github.com/LogicLabs-OU/OpenArchiver
uptimemonitor

Self-hosted uptime monitor for your websites


https://github.com/airlabspl/uptimemonitor
uptime-watchdog

Lightweight uptime monitoring tool written in Go.


https://github.com/seponik/uptime-watchdog
reviewboard

An extensible and friendly code review tool for projects and companies of all sizes.


https://github.com/reviewboard/reviewboard
iam-convert

CLI and Node Library to convert JSON IAM Policy Documents to other formats for Infrastructure as Code.


https://github.com/cloud-copilot/iam-convert
Diving deep into distributed microservices with OpenSearch and OpenTelemetry

https://opensearch.org/blog/diving-deep-into-distributed-microservices-with-opensearch-and-opentelemetry
Top 10 Status Page Examples: What We Like and What’s Missing

https://www.checklyhq.com/blog/top-10-status-page-examples
Redesigning Workers KV for increased availability and faster performance

https://blog.cloudflare.com/rearchitecting-workers-kv-for-redundancy
6 Reasons You Don't Need an SRE Team

The model of large SRE teams covering many services in a vague and nebulous way that's open to repeated re-interpretation is mostly a side-effect of (a) cargo-culting the building of these large groups, or (b) retrofitting SRE/DevOps onto existing groups without the company-wide reliability focus required (or the fortitude to decide you didn't need such a large group to do SRE).


https://log.andvari.net/6reasons.html
Choosing the right OpenTelemetry Collector distribution

https://www.datadoghq.com/blog/otel-collector-distributions
Setting Up OpenTelemetry on the Frontend Because I Hate Myself

Frontend developers deserve so much better from OpenTelemetry, especially since they stand to benefit so much.


https://thenewstack.io/setting-up-opentelemetry-on-the-frontend-because-i-hate-myself
OpenTelemetry configuration gotchas

https://blog.frankel.ch/opentelemetry-gotchas
Achieving High Availability with distributed database on Kubernetes at Airbnb

We chose an innovative strategy of deploying a distributed database cluster across multiple Kubernetes clusters in a cloud environment. Although currently an uncommon design pattern due to its complexity, this strategy allowed us to achieve target system reliability and operability.

In this post, we’ll share how we overcame challenges and the best practices we’ve developed for this strategy. We believe these best practices should be applicable to any other strongly consistent, distributed storage system.


https://medium.com/airbnb-engineering/achieving-high-availability-with-distributed-database-on-kubernetes-at-airbnb-58cc2e9856f4
Introducing Off-CPU Profiling

How Off-CPU profiling works and how to get the most out of it


https://www.polarsignals.com/blog/posts/2025/07/30/introducing-off-cpu-profiling