Healthchecks.io Hosting Setup, 2022 Edition
https://blog.healthchecks.io/2022/02/healthchecks-io-hosting-setup-2022-edition
Healthchecks.io Hosting, Questions and Answers
https://blog.healthchecks.io/2022/05/healthchecks-io-hosting-questions-and-answers
Martian Kubernetes Kit: a smooth-sailing toolkit from our SRE team
https://evilmartians.com/chronicles/martian-kubernetes-kit-a-smooth-sailing-toolkit-from-our-sre-team
We’ve been using Kubernetes since before it was a “thing”, and as of 2023, we believe that it is still underutilized. In fact, it’s the best (and basically only real “at-scale”) solution for orchestrating Docker containers—or containers in general, after you’ve outgrown services like Heroku or Fly.io! That’s a bold claim, but it’s a belief backed up by our years of SRE experience. In this post, we’ll expand on that, and we’ll introduce a Kubernetes toolkit we already use and support for our clients, which simultaneously de-complexifies and highlights the benefits of Kubernetes.
Service Level Indicators
https://blog.alexewerlof.com/p/sli
Introduction to SLI, examples, counterexamples and tips
On Error Budgets
https://www.codereliant.io/on-error-budgets
An error budget is essentially the permissible limit of risk or failure that a service can tolerate while still meeting its objectives. It is closely tied to Service Level Objectives, which define the expected level of service reliability. For instance, if an SLO dictates 99.9% uptime, the error budget allows for a 0.1% margin of error or downtime.
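As a rough illustration of the arithmetic in the summary above, the sketch below converts an availability SLO into an allowed-downtime budget. The 30-day window and the function name `error_budget` are assumptions made for this example; they are not taken from the linked article.

```python
from datetime import timedelta


def error_budget(slo_percent: float, window: timedelta) -> timedelta:
    """Return the downtime allowed within `window` for an availability SLO.

    `slo_percent` is the target expressed in percent, e.g. 99.9.
    """
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in the range (0, 100]")
    # The budget is whatever fraction of the window the SLO does not cover.
    return window * ((100 - slo_percent) / 100)


if __name__ == "__main__":
    # Assumed measurement window: 30 days.
    budget = error_budget(99.9, timedelta(days=30))
    print(f"Allowed downtime: {budget.total_seconds() / 60:.1f} minutes")
```

With a 99.9% target over 30 days, the 0.1% margin works out to roughly 43 minutes of tolerated downtime; a shorter window or a stricter target shrinks the budget proportionally.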
Upgrading GitHub.com to MySQL 8.0
https://github.blog/2023-12-07-upgrading-github-com-to-mysql-8-0
GitHub uses MySQL to store vast amounts of relational data. This is the story of how we seamlessly upgraded our production fleet to MySQL 8.0.
AWS CDK vs Terraform
https://medium.com/@kansvignesh/aws-cdk-vs-terraform-738c39d91f7a
IaC is one of the key DevOps practices, and AWS CDK & Terraform are both great IaC tools to manage your AWS infrastructure. Having used both extensively, let me share my experience with the 2 IaC tools.
Testing Framework in Terraform 1.6: A deep-dive
https://mattias.engineer/posts/terraform-testing-deep-dive
terraform-github-actions
https://github.com/dflook/terraform-github-actions
This is a suite of Terraform- and OpenTofu-related GitHub Actions that can be used together to build effective Infrastructure as Code workflows.
Incident severity levels for online platforms
https://argoday.medium.com/incident-severity-levels-78bfe7dd7e0d
Defining clear incident severity levels is a key component of an efficient incident management process, one that helps engineering teams respond quickly to outages and mitigate customer impact.
From RSS to WSS: Navigating the Depths of Kubernetes Memory Metrics
https://itnext.io/from-rss-to-wss-navigating-the-depths-of-kubernetes-memory-metrics-4d7d77d8fdcb
Beyond the basics: an in-depth look at memory metrics in Kubernetes
dufs
https://github.com/sigoden/dufs
A file server that supports static serving, uploading, searching, access control, and WebDAV.
Kubernetes 101: Assigning Pod to Nodes
https://hwchiu.medium.com/kubernetes-101-assigning-pod-to-nodes-e52eebb4bc38
Validation WebHook troubleshooting, How low can you go?
https://medium.com/@movergan/validation-webhook-troubleshooting-how-low-can-you-go-b1d435635ec7
The internals and the latest trends of container runtimes (2023)
https://medium.com/nttlabs/the-internals-and-the-latest-trends-of-container-runtimes-2023-22aa111d7a93
Secure Secret Management in Kubernetes: Exploring Different Approaches
https://adityaoo7.hashnode.dev/secure-secret-management-in-kubernetes-exploring-different-approaches
Argo Workflow — A Pipeline to Build and Deploy Containers
https://medium.com/@chukmunnlee/argo-workflow-a-pipeline-to-build-and-deploy-containers-f03775d8e01b
ArgoWorkflows for Distributed MongoDB Logical Backup
https://yossicohn.medium.com/argoworkflows-for-distributed-mongodb-logical-backup-1a5d8147c3bf