DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

Roskomnadzor (RKN) registration: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
Service Level Indicators

An introduction to SLIs: examples, counterexamples, and tips


https://blog.alexewerlof.com/p/sli
On Error Budgets

An error budget is the amount of risk or failure a service can tolerate while still meeting its objectives. It is tied directly to Service Level Objectives (SLOs), which define the expected level of service reliability. For instance, if an SLO requires 99.9% uptime, the error budget is the remaining 0.1%: the share of the measurement window during which the service may be down or failing.
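
The arithmetic behind that 0.1% can be sketched in a few lines. This is a minimal illustration, not taken from the linked article: the function names and the 30-day window are assumptions chosen for the example, and it treats the SLO as a simple time-based availability target.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Downtime allowed by an availability SLO over a window, in minutes.

    Assumption: the SLO is time-based (percentage of minutes the service is up).
    """
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in the range (0, 100]")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


def budget_remaining_minutes(slo_percent: float,
                             downtime_minutes: float,
                             window_days: int = 30) -> float:
    """Budget left after the downtime already spent; negative means the SLO is breached."""
    return error_budget_minutes(slo_percent, window_days) - downtime_minutes


if __name__ == "__main__":
    # 30 days = 43,200 minutes; 0.1% of that is the budget for a 99.9% SLO.
    print(f"{error_budget_minutes(99.9):.1f}")            # prints 43.2
    print(f"{budget_remaining_minutes(99.9, 30.0):.1f}")  # prints 13.2
```

With a 99.9% SLO over 30 days, a single 30-minute outage consumes most of the budget, which is why teams often slow down risky releases once the remaining budget gets small.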


https://www.codereliant.io/on-error-budgets
Upgrading GitHub.com to MySQL 8.0

GitHub uses MySQL to store vast amounts of relational data. This is the story of how we seamlessly upgraded our production fleet to MySQL 8.0.


https://github.blog/2023-12-07-upgrading-github-com-to-mysql-8-0
AWS CDK vs Terraform

Infrastructure as Code (IaC) is one of the key DevOps practices, and AWS CDK and Terraform are both great IaC tools for managing your AWS infrastructure. Having used both extensively, let me share my experience with the two.


https://medium.com/@kansvignesh/aws-cdk-vs-terraform-738c39d91f7a
Testing Framework in Terraform 1.6: A deep-dive

https://mattias.engineer/posts/terraform-testing-deep-dive
terraform-github-actions

This is a suite of Terraform- and OpenTofu-related GitHub Actions that can be used together to build effective Infrastructure as Code workflows.


https://github.com/dflook/terraform-github-actions
Incident severity levels for online platforms

Defining clear incident severity levels is a key component of an efficient incident management process: it helps engineering teams respond quickly to outages and mitigate customer impact.


https://argoday.medium.com/incident-severity-levels-78bfe7dd7e0d
From RSS to WSS: Navigating the Depths of Kubernetes Memory Metrics

Beyond the basics: an in-depth look at memory metrics in Kubernetes


https://itnext.io/from-rss-to-wss-navigating-the-depths-of-kubernetes-memory-metrics-4d7d77d8fdcb
dufs

A file server that supports static serving, uploading, searching, access control, and WebDAV.


https://github.com/sigoden/dufs
Secure Secret Management in Kubernetes: Exploring Different Approaches

https://adityaoo7.hashnode.dev/secure-secret-management-in-kubernetes-exploring-different-approaches
k8s-event-logger

This tool watches Kubernetes Events and logs them to stdout as JSON, to be collected and stored by your logging solution, e.g. fluentd, fluent-bit, Filebeat, or Promtail. Other tools exist for persisting Kubernetes Events, such as Sysdig, Datadog, or Google's event-exporter, but this tool is open and works with any logging solution.
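
The "one JSON object per line on stdout" pattern described here can be sketched as follows. This is a hypothetical illustration, not the tool's own code: the function names are invented for the example, and a plain list of dictionaries stands in for a real Kubernetes watch stream. The field names (`type`, `reason`, `message`, `involvedObject`) follow the Kubernetes Event object, while the values are made up.

```python
import json
import sys
from typing import Iterable, Mapping, Optional, TextIO


def event_to_json_line(event: Mapping[str, object]) -> str:
    """Serialize one event as a compact, single-line JSON document."""
    # default=str is a fallback for values json cannot encode (e.g. datetimes).
    return json.dumps(event, sort_keys=True, separators=(",", ":"), default=str)


def log_events(events: Iterable[Mapping[str, object]],
               stream: Optional[TextIO] = None) -> int:
    """Write each event as one line to the stream (stdout by default); return the count."""
    out = stream if stream is not None else sys.stdout
    count = 0
    for event in events:
        out.write(event_to_json_line(event) + "\n")
        count += 1
    out.flush()
    return count


if __name__ == "__main__":
    # Made-up sample event; a real deployment would read these from the cluster.
    sample = [{
        "type": "Warning",
        "reason": "BackOff",
        "message": "Back-off restarting failed container",
        "involvedObject": {"kind": "Pod", "name": "example-pod"},
    }]
    log_events(sample)
```

Because each event is a single self-contained line, a log shipper tailing the container's stdout can parse it without multi-line handling.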


https://github.com/max-rocket-internet/k8s-event-logger
helm-drift

A Helm plugin that helps identify configuration drift (mostly caused by in-place edits) in deployed Helm charts.


https://github.com/nikhilsbhat/helm-drift
loxilb

loxilb is an open-source, hyper-scale software load balancer for cloud-native workloads. It uses eBPF as its core engine and is written in Go. It is designed to power on-premises, edge, and public-cloud Kubernetes cluster deployments.


https://github.com/loxilb-io/loxilb