DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor): https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
acme-dns

A simplified DNS server with a RESTful HTTP API to provide a simple way to automate ACME DNS challenges.
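
As a rough illustration of what automating a challenge over such an HTTP API can look like, the sketch below builds (but does not send) a TXT record update request. The `/update` path, the `X-Api-User` and `X-Api-Key` headers, and the `subdomain`/`txt` body fields reflect my understanding of the acme-dns README and should be checked against it; the server URL, credentials, and token are hypothetical placeholders.

```python
import json
import urllib.request


def build_update_request(base_url, username, api_key, subdomain, txt_value):
    """Build, without sending, a POST request that sets a challenge TXT value.

    Endpoint path, header names, and body fields are assumptions based on
    the acme-dns README; verify them against the project documentation.
    """
    body = json.dumps({"subdomain": subdomain, "txt": txt_value}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/update",
        data=body,
        method="POST",
        headers={
            "X-Api-User": username,
            "X-Api-Key": api_key,
            "Content-Type": "application/json",
        },
    )


# Hypothetical placeholder values; real ones come from registering
# an account with your own acme-dns instance.
request = build_update_request(
    "https://auth.example.org",
    "hypothetical-user",
    "hypothetical-key",
    "hypothetical-subdomain",
    "x" * 43,  # stand-in for the challenge token
)

# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the example free of network access and makes the payload easy to inspect.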

https://github.com/joohoi/acme-dns
Building and operating a pretty big storage system called S3

Today, I am publishing a guest post from Andy Warfield, VP and distinguished engineer at S3. I asked him to write it based on the keynote address he gave at USENIX FAST '23, which covers three distinct perspectives on scale that come with building and operating a storage system the size of S3.

https://www.allthingsdistributed.com/2023/07/building-and-operating-a-pretty-big-storage-system.html
Bridging the gap between IaC and Schema Management

When we started building Atlas a couple of years ago, we noticed that there was a substantial gap between what was then considered state-of-the-art in managing database schemas and the recent strides from Infrastructure-as-Code (IaC) to managing cloud infrastructure.

In this post, we review that gap and show how Atlas – along with its Terraform provider – can bridge the two domains.

https://atlasgo.io/blog/2023/07/19/bridging-the-gap-between-iac-and-schema-management
A misadventure with Terraform Sets & PagerDuty Schedules

How Terraform's setunion() disregards ordering.
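
The underlying pitfall is general: a set has no defined order, so merging ordered lists through a set loses the sequence they were written in. The sketch below is a Python analogy, not Terraform code, and the names in it are hypothetical. It contrasts a plain set union with an order-preserving merge.

```python
def ordered_union(*lists):
    """Merge lists, dropping duplicates but keeping first-seen order."""
    seen = set()
    merged = []
    for items in lists:
        for item in items:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


# Hypothetical on-call rotations where the order of people matters.
primary = ["zoe", "adam"]
secondary = ["mia", "adam"]

# A set union keeps the right members but carries no ordering guarantee.
as_set = set(primary) | set(secondary)

# An explicit ordered merge keeps the sequence the lists were written in.
as_ordered = ordered_union(primary, secondary)
```

When the order carries meaning, as it does for a rotation, the merge has to be expressed over lists rather than sets.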

https://tratnayake.dev/a-misadventure-with-terraform-sets-pagerduty-schedules
Stop using IAM User Credentials with Terraform Cloud

I recently started using Terraform Cloud but discovered that the getting started tutorial which describes how to integrate it with Amazon Web Services (AWS) suggested using IAM user credentials. This is not ideal as these credentials are long-lived and can lead to security issues.

https://www.wolfe.id.au/2023/07/17/stop-using-iam-user-credentials-with-terraform-cloud
Secure Your AWS Environments with Terraform, Vault, and Veeam

https://julia.hashnode.dev/secure-your-aws-environments-with-terraform-vault-and-veeam
sre-checklist

A checklist for anyone practicing Site Reliability Engineering.

https://github.com/bregman-arie/sre-checklist
Why bother with SLI and SLO?

Is there really any value in setting service level indicators and objectives?

https://blog.alexewerlof.com/p/why-bother-with-sli-and-slo
Traffic Jams in the Cloud: Are Overloads Sabotaging Your Application's Reliability?

https://blog.fluxninja.com/blog/traffic-jams-in-the-cloud-unveiling-the-true-enemy-of-reliability
PostgreSQL: No More VACUUM, No More Bloat

PostgreSQL, a powerful open-source object-relational database system, has been lauded for its robustness, functionality, and flexibility. It is not without its challenges, though, and one of them is the notorious VACUUM process. OrioleDB, a new storage engine designed for PostgreSQL, promises to eliminate the need for the resource-consuming VACUUM.

https://www.orioledata.com/blog/no-more-vacuum-in-postgresql
Identifying GCP’s Hidden Network Inter-Zone Egress Costs

Learn how to identify your Inter-Zone Egress costs in a few easy steps, using commonly available methods.

Ever wondered where those Inter-Zone Egress costs are coming from? Found yourself looking at GCP’s network pricing page many times to break it down? Me too. So I thought I might as well try to help clear things up.

https://www.doit.com/identifying-gcps-hidden-network-inter-zone-egress-costs
faasd

faasd is OpenFaaS reimagined, but without the cost and complexity of Kubernetes. It runs on a single host with very modest requirements, making it fast and easy to manage. Under the hood it uses containerd and Container Networking Interface (CNI) along with the same core OpenFaaS components from the main project.

https://github.com/openfaas/faasd
blazingmq

BlazingMQ is an open source distributed message queueing framework, which focuses on efficiency, reliability, and a rich feature set for modern-day workflows.

At its core, BlazingMQ provides durable, fault-tolerant, highly performant, and highly available queues, along with features like various message routing strategies (e.g., work queues, priority, fan-out, broadcast, etc.), compression, strong consistency, poison pill detection, etc.

https://github.com/bloomberg/blazingmq
Scaling Terraform with Terramate

At CWISE we use Terraform a lot. Our most common use cases for it are cloud resource provisioning, Kubernetes configuration management, and management of SaaS services (like GitHub/GitLab).

We prefer Terraform over its competitors for several reasons:

- It is a tried and tested tool that has been around for a long time, and HashiCorp is doing great work developing it. It can be described as mature, even boring, technology;

- A large number of community resources like providers, modules, and documentation;

- Good developer experience thanks to support in IDEs and supporting tools;

- It keeps a state (a database) of the configuration.

https://www.cwise.eu/post/scaling-terraform-with-terramate
terraform-tui

TFTUI is a powerful textual UI that lets users view and interact with their Terraform state.

With its latest version you can easily visualize the complete state tree, gaining deeper insights into your infrastructure's current configuration. Additionally, the ability to inspect individual resource states allows you to focus on specific details for better analysis and management. Lastly, it's now possible to select resources and perform actions such as tainting and untainting.
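
To give a feel for what a state tree contains, here is a minimal sketch that walks the JSON produced by `terraform show -json` and collects every resource address, including those in nested modules. It is not how TFTUI is implemented, and the sample state below is a hypothetical, heavily trimmed stand-in for real output.

```python
def resource_addresses(module):
    """Collect resource addresses from a module and all of its child modules."""
    addresses = [resource["address"] for resource in module.get("resources", [])]
    for child in module.get("child_modules", []):
        addresses.extend(resource_addresses(child))
    return addresses


# Hypothetical, trimmed example shaped like `terraform show -json` output.
state = {
    "values": {
        "root_module": {
            "resources": [{"address": "aws_s3_bucket.logs"}],
            "child_modules": [
                {
                    "address": "module.network",
                    "resources": [{"address": "module.network.aws_vpc.main"}],
                }
            ],
        }
    }
}

addresses = resource_addresses(state["values"]["root_module"])
for address in addresses:
    print(address)
```

With real data, the dictionary would come from `json.load` applied to the command's output.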

https://github.com/idoavrah/terraform-tui