DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor): https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
rotz

Fully cross platform dotfile manager and dev environment bootstrapper written in Rust.


https://github.com/volllly/rotz
Moving fast breaks things: the importance of a staging environment

https://graphite.dev/blog/staging-environment
Terragrunt Reference Architecture

This repository embodies a structured approach to organizing Terraform code with Terragrunt, focusing on reusability, ease of management, and scalability across multiple environments and cloud providers. It's crafted to guide teams in building robust cloud infrastructure that adheres to best practices and principles.


https://github.com/Excoriate/terragrunt-ref-arch
oneuptime

OneUptime is a comprehensive solution for monitoring and managing your online services. Whether you need to check the availability of your website, dashboard, API, or any other online resource, OneUptime can alert your team when downtime happens and keep your customers informed with a status page. OneUptime also helps you handle incidents, set up on-call rotations, run tests, secure your services, analyze logs, track performance, and debug errors.


https://github.com/oneuptime/oneuptime
Understanding Kubernetes emptyDir — With 3 Practical Use-cases

Learn how to effectively implement emptyDir memory for pods, with hands-on use cases for temporary data handling in Kubernetes.


https://decisivedevops.com/understanding-kubernetes-emptydir-with-3-practical-use-cases-960f550e0e34
Mastering Kubernetes: Journey with Cluster API

Let’s talk about how at Hepsiburada, we efficiently manage hundreds of Kubernetes clusters that directly handle about 95% of our over 100 million monthly visitor traffic. We’ll delve into the complexities of managing multiple clusters and discuss the strategies we employ to tackle these challenges.


https://medium.com/hepsiburadatech/mastering-kubernetes-journey-with-cluster-api-2fb779ee7177
Horizontal Autoscaling in Kubernetes

In this article I will write about horizontal autoscaling in Kubernetes. The intended audience is software developers and DevOps/SRE engineers with at least some elementary background in Kubernetes who are interested in learning about autoscaling. When I was learning this topic, I didn’t find a single straightforward article that explains all the relevant concepts, so I took the challenge and rolled one myself.


https://medium.com/@aharon.haravon/horizontal-autoscaling-in-kubernetes-b9ef7a9f067a
Testing Service Mesh Performance in Multi-Cluster Scenario: Istio vs Kuma vs NSM

This article may be useful for those who are aware of service meshes and probably trying to improve scalability and connectivity between applications in Kubernetes and other container orchestration systems, e.g., adding encryption and authorization for application connections.


https://dev.to/pragmagic/testing-service-mesh-performance-in-multi-cluster-scenario-istio-vs-kuma-vs-nsm-4agj
Maximizing the Utility of Scarce AI Resources: A Kubernetes Approach

Optimizing the use of limited AI training accelerators


https://towardsdatascience.com/maximizing-the-utility-of-scarce-ai-resources-a-kubernetes-approach-0230ba53965b
Kubernetes — Cost optimisation and savings on AWS

Around 4 years ago, the ELMO Infrastructure team began its Kubernetes journey, which involved building out multiple production clusters across multiple AWS regions and multiple AWS accounts. Since then we have been able to migrate almost all our applications into Kubernetes from various places such as Amazon ECS, AWS OpsWorks and datacenters. One of the biggest challenges we faced, and I’m sure everyone has faced, is ensuring that we didn’t blow out the AWS bill with our Kubernetes costs. The idea is to have the cheapest but highest performing cluster possible… it’s important to not compromise performance for cost.


https://medium.com/elmo-software/kubernetes-cost-optimisation-and-savings-on-aws-88a7cf8e7469
Whoami — The quest of understanding GKE Workload Identity Federation

If you’re anything like me then using product features that you don’t fully understand always leaves you with a feeling of unease. Sure, using the feature might even be easy and cheerful at least as long as everything works as expected. We could even leave it at that. However, somewhere in between intrinsic engineering curiosity and the life experience that at some point in the future a deeper understanding will come in handy, we still have the desire to understand and debunk the magic.


https://medium.com/google-cloud/whoami-the-quest-of-understanding-gke-workload-identity-federation-e951e5e4a03f
Kubernetes Pod Policies — imagePullPolicy

When a pod is launched in Kubernetes, it starts with several policies. In this series, we will understand these policies, starting with imagePullPolicy.


https://decisivedevops.com/kubernetes-pod-policies-imagepullpolicy-fd939057a93f
Kubernetes Pod Policies — terminationMessagePolicy

Learn practical uses of terminationMessagePolicy in Kubernetes for efficient container debugging and error diagnostics.


https://decisivedevops.com/kubernetes-pod-policies-terminationmessagepolicy-c073eb936ef2
Kubernetes Pod Policies — dnsPolicy

Learn key aspects of Kubernetes Pod Policies, focusing on dnsPolicy, including practical insights into configurations like ClusterFirst, Default, and more.


https://decisivedevops.com/kubernetes-pod-policies-dnspolicy-1a70064ec590
Monitor your K8S Cluster costs with kubecost

Let’s install kubecost in 1 minute and get a fine-grained report of your K8S expenses


https://medium.com/@chaisarfati/monitor-your-k8s-cluster-costs-with-kubecost-4a9d64050466
Waiting for hooks in ArgoCD

ArgoCD is a fantastic tool to deploy applications via GitOps. You can define all your Kubernetes manifests in Git and have ArgoCD watch them for changes. It’s a very popular product used to manage resources in Kubernetes.

There are a couple of syncing options that you can use: automated, self-heal, or manual sync. I would love to see some kind of approval process in the future. Let’s build one.


https://systemweakness.com/waiting-for-hooks-in-argocd-e5329ec0436c
podinfo

Podinfo is a tiny web application made with Go that showcases best practices of running microservices in Kubernetes. Podinfo is used by CNCF projects like Flux and Flagger for end-to-end testing and workshops.


https://github.com/stefanprodan/podinfo