DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor) registration: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
Debugging Running Pods on Kubernetes

Exploring Kubernetes's debugging feature, kubectl debug, and introducing kubectl superdebug, an enhanced kubectl debug that supports volume mounts.


https://medium.com/datamindedbe/debugging-running-pods-on-kubernetes-2ba160c47ef5
Backblaze Drive Stats for 2023

As of December 31, 2023, we had 274,622 drives under management. Of that number, there were 4,400 boot drives and 270,222 data drives. This report will focus on our data drives. We will review the hard drive failure rates for 2023, compare those rates to previous years, and present the lifetime failure statistics for all the hard drive models active in our data center as of the end of 2023. Along the way we share our observations and insights on the data presented and, as always, we look forward to you doing the same in the comments section at the end of the post.


https://www.backblaze.com/blog/backblaze-drive-stats-for-2023
Troubleshooting the Connection Reset Incident

TL;DR: Check the HTTP Keep-Alive idle timeout settings on both the client and the server side.
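
A common way mismatched Keep-Alive idle timeouts cause resets is that the client reuses a pooled connection the server has already closed as idle. The sketch below is only an illustration of that check, not code from the linked article; the names `KeepAliveConfig` and `reset_risk` and the one-second safety margin are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeepAliveConfig:
    """Idle timeouts, in seconds, for one client/server hop (hypothetical type)."""
    client_idle_timeout_s: float
    server_idle_timeout_s: float


def reset_risk(cfg: KeepAliveConfig, margin_s: float = 1.0) -> bool:
    """Return True when the client may reuse a connection the server already closed.

    The client should drop idle connections at least `margin_s` seconds
    before the server does; the margin value is an assumption.
    """
    return cfg.client_idle_timeout_s + margin_s > cfg.server_idle_timeout_s
```

For example, a client that keeps idle connections for 60 seconds in front of a server that closes them after 5 seconds is flagged, while a client timeout of 4 seconds against a server timeout of 65 seconds is not.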


https://blog.wtcx.dev/2023/09/23/troubleshooting-connection-reset-incident
Kubernetes logging series

How It Works: Cluster Log Shipper as a DaemonSet: https://blog.wtcx.dev/2022/04/29/how-it-works-cluster-log-shipper-as-a-daemonset

Getting Started with Grafana Loki, Part 1: The Concepts: https://blog.wtcx.dev/2022/05/02/getting-started-with-grafana-loki-the-concepts

Getting Started with Grafana Loki, Part 2: Up and Running: https://blog.wtcx.dev/2023/07/15/getting-started-with-grafana-loki-up-and-running
Writing an Excellent Postmortem

In the world of software engineering, incidents happen every once in a while. The uninitiated fix them in isolation and move on, while the enlightened dig into the root cause, prevent them from recurring, and share the findings widely across the org.


https://medium.com/@vincesackschen/writing-an-excellent-postmortem-8534409f6e0d
Multi-Service Progressive Delivery with Argo Rollouts

https://codefresh.io/blog/multi-service-progressive-delivery-with-argo-rollouts
GitHub Actions, self-hosted runners on Amazon EKS & spot instances

How to spin up ephemeral runners in Kubernetes.


https://levelup.gitconnected.com/github-actions-self-hosted-runners-on-amazon-eks-spot-instances-bc3abcd5d38f
Legacy CLIs No More

Linux CLIs are a part of every software engineer's daily workflow. But I still see many developers rely on legacy tools that have been around for decades. It's time to upgrade your CLI toolbelt and switch to faster, more powerful, and flexible tools.


https://www.codereliant.io/legacy-cli-no-more
Lessons From Our 8 Years Of Kubernetes In Production — Two Major Cluster Crashes, Ditching Self-Managed, Cutting Cluster Costs, Tooling, And More

https://medium.com/@.anders/learnings-from-our-8-years-of-kubernetes-in-production-two-major-cluster-crashes-ditching-self-0257c09d36cd
inbucket

Inbucket is an email testing service; it will accept messages for any email address and make them available via web, REST and POP3 interfaces. Once compiled, Inbucket does not have any external dependencies - HTTP, SMTP, POP3 and storage are all built in.
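
Because Inbucket accepts mail for any address over SMTP, a test can hand it a message with nothing but the standard library. This is a rough sketch, not taken from the Inbucket docs: the helper names are hypothetical, and the host `localhost` and SMTP port `2500` are assumptions that should be checked against your own Inbucket configuration.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(recipient: str, subject: str, body: str) -> EmailMessage:
    """Build a plain-text message; the sender address is a placeholder."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_to_inbucket(msg: EmailMessage, host: str = "localhost", port: int = 2500) -> None:
    """Deliver the message over plain SMTP.

    The default host and port are assumptions about a local Inbucket instance.
    """
    with smtplib.SMTP(host, port, timeout=5) as smtp:
        smtp.send_message(msg)
```

Calling `send_to_inbucket(build_test_message("anyone@example.test", "hello", "body"))` against a running instance should make the message appear in the `anyone` mailbox of the web UI.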


https://github.com/inbucket/inbucket
Alerts Are Fundamentally Messy

Good alerting hygiene consists of a few components: chasing down alert conditions, reflecting on incidents, and thinking of what makes a signal good or bad. The hope is that we can get our alerts to the stage where they will page us when they should, and they won’t when they shouldn’t.

However, the reality of alerting in a socio-technical system must cater not only to the mess around the signal, but also to the longer term interpretation of alerts by people and automation acting on them. This post will expand on this messiness and why Honeycomb favors an iterative approach to setting our alerts.


https://www.honeycomb.io/blog/alerts-are-fundamentally-messy
glasskube

Using traditional package managers or applying manifests directly can be confusing and doesn't scale. Glasskube helps you install your favorite Kubernetes packages through the Glasskube UI, for reduced complexity and increased transparency. We also provide a brew-inspired CLI for advanced users. Our packages are dependency-aware, as you would expect from a package manager. Glasskube is designed as a cloud-native application, so it fits into your GitOps approach.


https://github.com/glasskube/glasskube
apisix

Apache APISIX is a dynamic, real-time, high-performance API Gateway.

APISIX API Gateway provides rich traffic management features such as load balancing, dynamic upstream, canary release, circuit breaking, authentication, observability, and more.

You can use APISIX API Gateway to handle traditional north-south traffic, as well as east-west traffic between services. It can also be used as a k8s ingress controller.


https://github.com/apache/apisix
Multiple Terraform projects in a mono-repo. How to survive a mess?

Do you have a set of projects sitting in a mono-repo, with various workspaces, file structures, and Terraform versions? Is it a pain to switch versions and remember every path/workspace combination? Are you unsure whether the workspace or plan file is correct before applying it?
I feel you! I'll share my experience managing such projects, an approach that makes it much easier, and a simple tool I wrote a few years ago for that. How is it related to Docker Compose? I'll tell you…


https://tech.westwing.de/multiple-terraform-projects-in-a-mono-repo-how-to-survive-a-mess-e1ec5a136d17
k8s-cleaner

Cleaner is a Kubernetes controller that identifies unused or unhealthy resources, helping you maintain a streamlined and efficient Kubernetes cluster. It provides flexible scheduling, label filtering, Lua-based selection criteria, resource removal or update, and notifications via Slack, Webex, and Discord.


https://github.com/gianlucam76/k8s-cleaner
gitbutler

The GitButler version control client, backed by Git and powered by Tauri/Rust/Svelte.


https://github.com/gitbutlerapp/gitbutler