DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor) registration: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
The Karpenter Effect: Redefining Our Kubernetes Operations

A reflection on our journey towards AWS Karpenter, improving our Upgrades, Flexibility, and Cost-Efficiency in a 2,000+ Nodes Fleet


https://medium.com/adevinta-tech-blog/the-karpenter-effect-redefining-our-kubernetes-operations-80c7ba90a599
terraform-aws-clickops-notifier

Get notified when actions are taken in the AWS Console.


https://github.com/cloudandthings/terraform-aws-clickops-notifier
Kubernetes networking: service, kube-proxy, load balancing

TL;DR: This article explores Kubernetes networking, focusing on Services, kube-proxy, and load balancing.


https://learnk8s.io/kubernetes-services-and-load-balancing
How Agoda Handles Load Shedding in Private Cloud

In this article, we’ll explore load shedding, which involves deciding which traffic to serve when you can’t handle all of it. The reason for having insufficient capacity can vary. We might face unexpected high traffic from a promotion, a malicious attempt to take our service offline, or maybe we’ve rolled out a change that doesn’t scale properly despite our best efforts to catch it in testing.


https://medium.com/agoda-engineering/load-shedding-private-cloud-first-81ddd5ab53ac
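
The idea of deciding which traffic to serve under insufficient capacity can be illustrated with a minimal sketch. This is not Agoda's implementation; the `Request` type, the priority convention (lower number means more important), and the `shed_load` helper are all hypothetical names chosen for illustration, assuming each request has a known cost and the service has a fixed capacity budget.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    priority: int  # hypothetical convention: lower number = more important
    cost: int = 1  # assumed capacity units this request consumes


def shed_load(requests, capacity):
    """Split requests into (served, shed) under a fixed capacity budget.

    Requests are considered in priority order; a request is served only
    if its cost still fits in the remaining capacity, otherwise it is shed.
    """
    served, shed = [], []
    remaining = capacity
    # sorted() is stable, so requests with equal priority keep arrival order.
    for req in sorted(requests, key=lambda r: r.priority):
        if req.cost <= remaining:
            served.append(req)
            remaining -= req.cost
        else:
            shed.append(req)
    return served, shed
```

With a capacity of 2 and four single-cost requests, only the two most important ones would be served and the rest rejected early, which is the trade-off the article discusses. A real system would likely decide per request as it arrives rather than over a batch.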
A Hands-On Guide to Kubernetes Endpoints & EndpointSlices

Understanding Kubernetes Endpoints and Endpoint Slices: A Comprehensive Guide


https://medium.com/@muppedaanvesh/a-hands-on-guide-to-kubernetes-endpoints-endpointslices-%EF%B8%8F-1375dfc9075c
Amazon EKS - managing and fixing ETCD database size

A story detailing how to investigate and fix etcd database issues when using EKS. You will find out how I managed to completely break our EKS cluster because of an overloaded etcd.


https://marcincuber.medium.com/amazon-eks-managing-and-fixing-etcd-database-size-b6fb875888cb
A Hands-On Guide to Kubernetes QoS Classes

Understanding Quality of Service Classes in Kubernetes: A Practical Example


https://medium.com/@muppedaanvesh/a-hands-on-guide-to-kubernetes-qos-classes-%EF%B8%8F-571b5f8f7e58
Scaling Strategies on AWS EKS: Understanding HPA, VPA, and Cluster Autoscaler

https://towardsaws.com/scaling-strategies-on-aws-eks-understanding-hpa-vpa-and-cluster-autoscaler-12b88758d1d5
zeropod

Zeropod is a Kubernetes runtime (more specifically, a containerd shim) that automatically checkpoints containers to disk after a certain amount of time since the last TCP connection. While in the scaled-down state, it listens on the same port the application inside the container was listening on and restores the container on the first incoming connection. Depending on the memory size of the checkpointed program, this happens in tens to a few hundred milliseconds, virtually unnoticeable to the user. As all the memory contents are stored to disk during checkpointing, all state of the application is restored.


https://github.com/ctrox/zeropod
AWS Controllers for Kubernetes

Manage AWS services using Kubernetes


https://aws-controllers-k8s.github.io/community
helmper

A little helper that pushes Helm Charts and images to your registries, easily configured with a declarative spec.


https://github.com/ChristofferNissen/helmper
contrast

Contrast runs confidential container deployments on Kubernetes at scale.


https://github.com/edgelesssys/contrast
prom-analytics-proxy

prom-analytics-proxy is a lightweight proxy application designed to sit between your Prometheus server and its clients. It provides valuable insights by collecting detailed analytics on PromQL queries, helping you understand query performance, resource usage, and overall system behavior. This can significantly improve observability for Prometheus users, providing actionable data to optimize query execution and infrastructure.


https://github.com/nicolastakashi/prom-analytics-proxy