CatOps
DevOps and other issues by Yurii Rochniak (@grem1in) - SRE @ Preply && Maksym Vlasov (@MaxymVlasov) - Engineer @ Star. Opinions are our own.

We do not post ads, including event announcements. Please do not bother us with such requests!
​​A friend of my close friends is raising funds for a vehicle for the 50th Separate Storm Brigade.

https://send.monobank.ua/jar/3CYuCnWww7

Let’s help him to make that happen!

#donations #Ukraine
Spotify has released a postmortem for their outage on the 16th of April, which was almost global.

In a nutshell, it was a combination of a bug and a cascading issue caused by user retries. Here's an interesting bit:

> This change was deemed low risk and as such we applied it to all regions at the same time.

This is something that has burned a lot of engineers. So, the takeaway is probably to never consider any change low-risk, especially if you already have the architecture for gradual rollouts. However, it's much easier said than done.
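On the retries part: a common client-side mitigation for retry storms is exponential backoff with jitter. The sketch below is a generic illustration, not something taken from Spotify's postmortem; the function name and the default values for `base` and `cap` are assumptions made for the example.

```python
import random


def backoff_delays(attempts, base=0.5, cap=30.0, rng=random.random):
    """Yield one delay (in seconds) per retry attempt.

    Hypothetical helper: the retry window doubles on every attempt,
    is capped at `cap`, and the actual delay is picked at random inside
    the window ("full jitter") so that clients do not retry in lockstep.
    """
    for attempt in range(attempts):
        window = min(cap, base * 2 ** attempt)
        yield rng() * window


# Example: print the delays a client would sleep before each of 5 retries.
for delay in backoff_delays(5):
    print(f"retry after {delay:.2f}s")
```

The randomization matters as much as the exponential growth: without it, all clients that failed at the same moment would come back at the same moment too.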

#postmortem #sre
Kubernetes v1.33 Fixes a 10-Year-Old Image Pull Loophole.

While technically a loophole, I wouldn't say that its impact was too high. It would be concerning only if you run multi-tenant clusters, where customers' pods run on shared nodes. And even then, it could have been mitigated with imagePullPolicy: Always. While I have never encountered this, I could imagine such a setup in some PaaS company.

The gist is that previously (or still, depending on your K8s version), the kubelet doesn't check whether a pod has the permissions to use a container image if that image is already present on the node.

#kubernetes #security
If you have some time today and you feel like watching some videos, here is a playlist from KubeCon Europe 2025 (the one that was in London).

https://www.youtube.com/playlist?list=PLj6h78yzYM2MP0QhYFK8HOb8UqgbIkLMc

#slides #event
​​Let's help to close a fundraiser from a member of our community.
This one is from a colleague of mine from my very first paid job. His wife is raising funds for a vehicle.

Here's a link to the Monobank jar:

https://send.monobank.ua/jar/5axqiosSrT

More information is in this Instagram post

#donations #Ukraine
A super-short article about Rate Limiting.

Also, it comes from yet another Substack blog about system design, if you're into such things.

This article doesn't show all the details, but it lists some of the most common algorithms, so you can continue your journey from there.
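To give a taste of what such an algorithm looks like, here is a minimal token bucket sketch. I picked the token bucket because it is one of the best-known rate limiting algorithms, not because it is taken from the article; the class name, parameters, and the injectable `clock` are assumptions made for this example.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second.

    Hypothetical helper for illustration; not thread-safe.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if the request fits, else False."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Example: 5 requests per second on average, bursts of up to 10.
bucket = TokenBucket(rate=5, capacity=10)
print(bucket.allow())
```

In a real service you would keep one bucket per client (or per API key) and add locking or move the state into a shared store.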

#systems #networking
​​🎉 On this day, 8 years ago, this channel was created 🎉

I find it to be a big accomplishment: being able to take care of it for so long and also keep a somewhat consistent posting schedule! In these 8 years, CatOps grew to more than 5k subscribers, we had our voice chats (although irregular), and a newsletter.

I've led CatOps longer than I stayed at any job. Heck! In these years, I've changed jobs 3 times and moved countries. Yet, this channel is still here. This is cool, but also a bit weird at the same time.

It is all possible because of you! Thank you for continuing to read CatOps, reacting to the posts, and sharing them. For real, I've had the idea of abandoning it for good many times. Each time though, I thought: well, but at least someone finds it interesting.

If you enjoy CatOps, and you want to make us a small present, you can do it by donating to Hospitallers using this Monobank Jar:

https://send.monobank.ua/jar/9aHg73XmQm

#catops #birthday
A great concise explainer-article about PostgreSQL.

Needless to say, Postgres is very popular in the industry. This article covers the following topics:

- Connection management
- WAL
- MVCC
- Query execution
- Indexing
- Table partitioning
- Logical decoding
- Extensions
- Statistics collector

So, quite an extensive list actually. My only two nitpicks are:

- When talking about MVCC, there’s a phrase that sounds as if locks do not exist in Postgres. They very much do! Moreover, it’s crucial to pay attention to which locks each operation acquires. I usually use this reference to double-check.
- When talking about query planning, the article doesn’t explain the subtle difference between EXPLAIN and EXPLAIN ANALYZE. The latter actually runs the query under the hood, which may be ok for SELECT queries, but likely not for inserts and updates.
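If you do need real timings for a write query, the usual approach (it is also described in the PostgreSQL documentation for EXPLAIN) is to wrap EXPLAIN ANALYZE in a transaction and roll it back. Below is a small sketch that only builds the statements; the helper name and the `users` table are made up for the example.

```python
def explain_analyze_safely(statement):
    """Return the SQL statements that profile `statement` and then discard its changes.

    Hypothetical helper: it only builds strings and does not talk to a database.
    """
    body = statement.strip().rstrip(";")
    return [
        "BEGIN;",
        f"EXPLAIN ANALYZE {body};",
        "ROLLBACK;",
    ]


# Example with a made-up table: print the statements to run in psql.
for line in explain_analyze_safely("UPDATE users SET active = false WHERE id = 42;"):
    print(line)
```

Keep in mind that the statement still executes before the rollback, so it takes its locks and costs real time on a busy table.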

Apart from these small things, this is a very good article!

#databases #postgres
A friend of mine's recon team is getting a Shark complex, but they need a trailer to move it!

This powerful UAV needs a two-axle trailer for transport. Let's help them get it.

Donate to get us closer to giving them the mobility they need:

- Monobank jar: https://send.monobank.ua/jar/9hNbCnoiN1
- Card: 4441 1111 2429 2776

#donations #Ukraine
On Describing Not Explaining is a neat life-story that unveils a way of reasoning about incident investigations.

The gist is that instead of guessing what could have happened (an instinctive approach), you try to describe what exactly happened and in what order. Just saying this out loud can help you rule out many unlikely causes, and may also help you remember some less obvious recent changes.

#sre #incidents
I think I first encountered this tool in Den Vasyliev's channel. Kubeshark - a network observability tool for Kubernetes.

Network observability comes in handy at times. So, here are some other tools and articles you can use to capture packets in your system.

- ksniff - a kubectl plugin to capture traffic
- Hubble - an observability tool for Cilium
- How to use debug containers to capture the traffic - basically running tcpdump inside a pod
- A hands on lab on how to run tcpdump in a pod


Happy capturing!

#kubernetes #networking
​​Let’s close the last week’s fundraiser today for good! There’s not that much left.

​A friend of mine's recon team is getting a Shark complex, but they need a trailer to move it!

This powerful UAV needs a two-axle trailer for transport. Let's help them get it.

Donate to get us closer to giving them the mobility they need:

- Monobank jar: https://send.monobank.ua/jar/9hNbCnoiN1
- Card: 4441 1111 2429 2776

#donations #Ukraine
All talks today are about AI: models, agents, RAGs, MCPs, editors, etc.

In this article, Arseniy Zinchenko explains what MCP (Model Context Protocol) is, with an example.

And in the follow-up article, he expands the example by writing a basic MCP server for VictoriaLogs.

BTW, if you're still not subscribed to his Substack, make sure to subscribe! Arseniy posts some great technical content there and does so quite regularly.

#ai
To analyze the data, one has to collect it first. So, I'd like to invite you to participate in two ongoing surveys:

- 2025 Stack Overflow Developer Survey - an annual survey from a very important (albeit not so popular anymore) engineering resource (in English).
- DOU Salary survey - an annual survey of the Ukrainian community (in Ukrainian).

#random
Figma runs in Kubernetes. How can I be sure? By reading their blog post How we migrated onto K8s in less than 12 months.

This blog post doesn't dive deep into technical details, but it provides a glimpse of what technologies are used by Figma to manage their infrastructure.

What I liked about this article is that they have "in less than 12 months" right in the title! I think more articles should provide realistic timelines, especially when talking about production systems under load. "Kubernetes up & running in 30 minutes" has its own merit, but not in prod.

#kubernetes
This article is quite old, but it's interesting nonetheless, since it describes an approach rather than a specific technology.

Moreover, it describes a phenomenon that was identified a long time ago. However, here Slack shows how they used it to adopt (or discard) software within the company. Sure, such an approach would work better in larger organizations, but it's still interesting to read about.

#culture