DevOps&SRE Library
Library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

Roskomnadzor (RKN): https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
k8s-device-plugin

The NVIDIA device plugin for Kubernetes is a DaemonSet that allows you to automatically:

- Expose the number of GPUs on each node of your cluster
- Keep track of the health of your GPUs
- Run GPU-enabled containers in your Kubernetes cluster
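
Once the plugin is running, GPUs are advertised as an extended resource that a pod requests through its resource limits. The sketch below builds such a Pod manifest, assuming the `nvidia.com/gpu` resource name. The helper `gpu_pod_manifest`, the pod name, and the image are hypothetical and are not part of the plugin:

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Pod manifest that asks the scheduler for `gpus` NVIDIA GPUs.

    Extended resources such as nvidia.com/gpu are set under limits and
    must be whole numbers.
    """
    if gpus < 1:
        raise ValueError("gpus must be a positive integer")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,  # hypothetical image, replace with your own
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = gpu_pod_manifest("gpu-demo", "example.com/cuda-app:latest")
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest can be piped to `kubectl apply -f -`.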


https://github.com/NVIDIA/k8s-device-plugin
kluctl

Kluctl is the missing glue that puts together your (and any third-party) deployments into one large declarative Kubernetes deployment, while making it fully manageable (deploy, diff, prune, delete, ...) via one unified command line interface.


https://github.com/kluctl/kluctl
k8e

Kubernetes Easy Engine (k8e) 🚀 is a lightweight, scalable enterprise-grade Kubernetes distribution that allows users to manage, protect and obtain out-of-the-box Kubernetes clusters in a unified manner. It is suitable for enterprise environments.


https://github.com/xiaods/k8e
zed

Code at the speed of thought – Zed is a high-performance, multiplayer code editor from the creators of Atom and Tree-sitter.


https://github.com/zed-industries/zed
heynote

Heynote is a dedicated scratchpad for developers. It functions as a large persistent text buffer where you can write down anything you like. Works great for that Slack message you don't want to accidentally send, a JSON response from an API you're working with, notes from a meeting, your daily to-do list, etc.


https://github.com/heyman/heynote
The Bun Shell

The Bun Shell is a new experimental embedded language and interpreter in Bun that allows you to run cross-platform shell scripts in JavaScript & TypeScript.


https://bun.sh/blog/the-bun-shell
wal-listener

A service that helps implement the Event-Driven architecture.

To keep data in the system consistent, we use transactional messaging: events are published in the same transaction as the domain model change.

The service allows you to subscribe to changes in the PostgreSQL database using its logical decoding capability and publish them to the NATS Streaming server.
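
The atomicity that transactional messaging relies on can be illustrated with a small sketch. It is not wal-listener's implementation: wal-listener reads committed changes from PostgreSQL through logical decoding, while this example uses SQLite and an explicit outbox table only to show that the domain change and its event commit or roll back together. The table names, topic name, and functions are all hypothetical:

```python
import json
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a domain table and an outbox table for pending events."""
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "topic TEXT NOT NULL, payload TEXT NOT NULL)"
    )


def place_order(conn: sqlite3.Connection, order_id: int, fail: bool = False) -> None:
    """Write the order and its event in one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "INSERT INTO orders (id, status) VALUES (?, ?)", (order_id, "placed")
        )
        if fail:
            # Simulated crash between the two writes: neither must survive.
            raise RuntimeError("simulated failure before commit")
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("orders.placed", json.dumps({"order_id": order_id})),
        )


def pending_events(conn: sqlite3.Connection) -> list:
    """Return (topic, payload) pairs that a relay would publish to a broker."""
    return conn.execute("SELECT topic, payload FROM outbox ORDER BY id").fetchall()


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    create_schema(db)
    place_order(db, 1)
    try:
        place_order(db, 2, fail=True)
    except RuntimeError:
        pass
    print(pending_events(db))
```

With logical decoding the separate relay step disappears: a change becomes visible to subscribers only once its transaction has committed.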


https://github.com/ihippik/wal-listener
The state of Kubernetes jobs in 2023 Q4

Kubernetes job market trends for Q4 2023.


https://kube.careers/state-of-kubernetes-jobs-2023-q4
42 things I learned from building a production database

https://maheshba.bitbucket.io/blog/2021/10/19/42Things.html
12 Factor CLI Apps

At Heroku, we’ve come up with a methodology called the 12 factor app. It’s a set of principles designed to make great web applications that are easy to maintain. In that spirit, here are 12 CLI factors to keep in mind when building your next CLI application. Following these principles will offer CLI UX that users will love.


https://medium.com/@jdxcode/12-factor-cli-apps-dd3c227a0e46
Viacheslav Biriukov - SRE deep dive into Linux Page Cache

In this series of articles, I would like to talk about Linux Page Cache. I believe that the following knowledge of the theory and tools is essential and crucial for every SRE. This understanding can help both in usual and routine everyday DevOps-like tasks and in emergency debugging and firefighting.


https://biriukov.dev/docs/page-cache/0-linux-page-cache-for-sre
Loki's new TSDB Index

https://lokidex.com/posts/tsdb
uptrace

Open source APM: OpenTelemetry traces, metrics, and logs


https://github.com/uptrace/uptrace
kubernetes-image-puller

Kubernetes Image Puller is used for caching images on a cluster. It creates a DaemonSet downloading and running the relevant container images on each node.


https://github.com/che-incubator/kubernetes-image-puller
Why Distributed Systems Fail?

Distributed systems are tricky: it's easy to make wrong assumptions that lead to problems down the road. Back in the 90s, computer scientist L. Peter Deutsch identified several common misconceptions, or "fallacies," that trip up engineers working on distributed systems. Surprisingly, these fallacies are still relevant today:

1. The Network is Reliable: It's risky to assume networks are 100% reliable. Networks can and do fail in various ways.
2. Latency is Zero: While we might wish our networks had no latency, that's simply not physically possible - even light takes time to travel distances. Ignoring the inevitable delay in data transmission can lead to unrealistic expectations of system performance.
3. Bandwidth is Infinite: This overlooks the physical and practical limitations on data transfer rates.
4. The Network is Secure: No wonder security is a growing industry. Assuming inherent security can lead to vulnerabilities and oversights in protective measures.
5. Topology Doesn't Change: This neglects the dynamic nature of network configurations.
6. There is One Administrator: A simplification that fails to consider the complexity of managing distributed systems.
7. Transport Cost is Zero: Overlooking the resources required for data movement.
8. The Network is Homogeneous: Ignoring the diversity in network systems and standards.

These fallacies, if not recognized and addressed, can lead to design flaws, performance issues, and security vulnerabilities in distributed systems. In the following sections, we will break down each of these misconceptions, exploring their implications and how to mitigate the risks they pose in real-world applications.
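
As one illustration of designing against the first two fallacies, a remote call can be wrapped in bounded retries with exponential backoff and jitter. This is a minimal sketch rather than anything taken from the linked articles; it assumes the operation is idempotent, and `TransientNetworkError` and `call_with_retries` are hypothetical names:

```python
import random
import time


class TransientNetworkError(Exception):
    """Stands in for a timeout, connection reset, or similar failure."""


def call_with_retries(operation, attempts=4, base_delay=0.1, max_delay=2.0,
                      sleep=time.sleep):
    """Call `operation`, retrying transient failures with backoff and jitter.

    The delay cap doubles on each attempt up to `max_delay`; the actual
    delay is drawn uniformly below the cap so that many clients do not
    retry in lockstep. The last failure is re-raised to the caller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except TransientNetworkError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky():
        # Fails twice, then succeeds, to mimic a briefly unreliable network.
        state["calls"] += 1
        if state["calls"] < 3:
            raise TransientNetworkError("connection reset")
        return "ok"

    print(call_with_retries(flaky), "after", state["calls"], "calls")
```

Retries are only safe when repeating the operation has no extra side effects; non-idempotent calls need deduplication on the receiving side as well.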


P1: https://www.codereliant.io/why-distributed-systems-fail-1

P2: https://www.codereliant.io/why-distributed-systems-fail-2
Terragrunt root selector: automatically select the best root directory base on file changed

https://medium.com/@bill.nz/terragrunt-root-selector-automatically-select-the-best-root-directory-base-on-file-changed-8f0b4147a8a3
mcfly

McFly replaces your default ctrl-r shell history search with an intelligent search engine that takes into account your working directory and the context of recently executed commands. McFly's suggestions are prioritized in real time with a small neural network.


https://github.com/cantino/mcfly
crd-to-sample-yaml

Generate a sample YAML file from a CRD definition.


https://github.com/Skarlso/crd-to-sample-yaml