Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Prometheus Alerting with Slack

Wondering if anyone has tips on how to make the slack alerts from alertmanager prettier?

If you've got good looking alerts can you share the templates?

https://redd.it/yzcop2
@r_devops
Deploying to AWS from GitHub actions: is this something Fortune 500 security reviews will cry about?

We have many large customers so we go through typical security reviews (archaic generic spreadsheet of questions etc)

For a few reasons, it would be helpful to move our deployment from AWS CodePipeline to GitHub actions.

Is this going to be a major issue? Should I be aware of any common critiques of this architecture security wise?

It’s not like CodePipeline was in a private VPC or anything anyway…

https://redd.it/yzft2b
@r_devops
Testing/mocking Customer IDP integrations

We provide Auth0 to our customers for authenticating into client apps. A common problem we run into is being unable to test their authentication, because they provide additional properties for authorization via SAML. What I mean is that, due to policies on their end, we cannot test what the experience will be for them when authenticating into our apps. In the end, beyond authentication itself, what we are concerned with are the additional properties they are required to provide to us via SAML.

Has anyone set up a test IDP to simulate scenarios such as this?

Thanks

https://redd.it/yzigz0
@r_devops
Kubectl plugin to display OOMKilled pods/containers

This was something which remedied a work pain for me. It became quite a chore to sift through the output of

kubectl describe pod <name>

or to fall back on `grep`, when there are a lot of multi-container pods in a cluster.

I wrote a plugin to solve the problem, and I hope it is useful for others as well! It provides a quick way to check previously killed containers via a known interface in `kubectl`.

https://github.com/jdockerty/kubectl-oomd
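
For readers who want to see the idea without installing anything, the same check can be roughed out in a few lines of Python. This is a minimal sketch and not how kubectl-oomd itself is implemented: it assumes you feed it the parsed JSON from `kubectl get pods -o json` and it relies on the `lastState.terminated.reason` field of each container status. The function name `find_oom_killed` and the sample data are made up for illustration.

```python
from typing import Dict, Iterator, Tuple


def find_oom_killed(pod_list: Dict) -> Iterator[Tuple[str, str, str]]:
    """Yield (namespace, pod, container) for containers whose previous run was OOMKilled."""
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for status in statuses:
            # lastState is empty when the container has never been restarted.
            terminated = status.get("lastState", {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                yield (
                    meta.get("namespace", ""),
                    meta.get("name", ""),
                    status.get("name", ""),
                )


# Hypothetical sample shaped like `kubectl get pods -o json` output.
SAMPLE = {
    "items": [
        {
            "metadata": {"namespace": "default", "name": "api-7d9f"},
            "status": {
                "containerStatuses": [
                    {"name": "web", "lastState": {}},
                    {
                        "name": "worker",
                        "lastState": {
                            "terminated": {"reason": "OOMKilled", "exitCode": 137}
                        },
                    },
                ]
            },
        },
        {
            "metadata": {"namespace": "default", "name": "cron-1a2b"},
            "status": {
                "containerStatuses": [
                    {
                        "name": "job",
                        "lastState": {"terminated": {"reason": "Error", "exitCode": 1}},
                    }
                ]
            },
        },
    ]
}

oom_containers = list(find_oom_killed(SAMPLE))
```

Against a real cluster you would load the JSON with `json.load` from the output of `kubectl get pods --all-namespaces -o json` instead of using `SAMPLE`.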

https://redd.it/yzgjah
@r_devops
Suggestions for dealing with airgapped registries and promoting from dev to prod?

Preface: Our compute environment is entirely on prem based on VMware Tanzu with the exception of our GitHub Enterprise which is SaaS.

We have our Dev and QA environments in a “lab” network environment and our staging and prod environments in our production network. Our security department has very specific rules and one of them is that there is no communication allowed between lab and prod networks. Currently, to get anything into or out of prod requires using a Citrix environment and manually copying files.

In each environment we have an airgapped registry and we are facing pain due to having to export images from the lab and manually copy them into the prod environment when we are ready to promote the code.

Since we are trying to build pipelines, we need to automate the process, but our security rules are standing in the way. Their stance is that they won't support “promoting” from lab to prod as they have no visibility into what goes on in the lab.

My point was: we’re already doing it manually, so how is this any different if we automate it?

So as a complete CICD noob I am here to ask for advice. How do you deal with this scenario in your shop?

https://redd.it/yzpsxt
@r_devops
Did you receive any LeetCode (aka Data Structures / Algorithms) type questions during the interview process?

title.

https://redd.it/yx98i9
@r_devops
DevOps Role.. but not really DevOps?

Hi all.

I graduated from university in April of this year and fortunately landed a job at a big Canadian bank in a DevOps-centric role. Although I am very grateful for this opportunity, as I come from a mechanical engineering background, I can't help but feel like I am not getting the exposure to true DevOps tools that I should be.


The role is a new-graduate role, so I am not exposed to everything, but even my leads, my managers, and most people in the organization don't use the tools commonly mentioned in this subreddit, such as Ansible, Kubernetes, and Docker. My team says I am learning DevOps, but mostly all I've been doing is coordinating deployments, using Jenkins to automate deployments, and working with the UNIX command line.

I feel like this is great experience, considering I don't come from a software engineering background, but I really do want to take the next step and learn/use some of the more advanced tools in the industry. How should I approach this? Should I begin applying for jobs in different industries? Should I look into getting some AWS certifications? Any guidance from someone more experienced would be great.

https://redd.it/yx92wk
@r_devops
What are some good resources(repositories, youtube channels) to practice building different DevOps projects

https://redd.it/yzt73k
@r_devops
How do I deal with latency-sensitive workloads in a multi-cluster Kubernetes-based platform?

I want to deploy an application globally in a way that keeps the effect of network latency to a minimum. Can I get help on how to start researching whether this is possible?

https://redd.it/ywudpx
@r_devops
What will the benefit be? AWS Security.

Hello there,


So earlier today I finished up my required tasks to begin the transition to DevOps. The last time I spoke with my boss about it, I was supposed to be moving on to Docker, etc. However, my boss told me to focus on getting the AWS Security cert. What benefit would there be to an aspiring DevOps engineer having a security cert? I was expecting him to tell me to get the Solutions Architect or DevOps cert from Amazon.


What do you all think?

https://redd.it/yx24zb
@r_devops
Should I switch Jobs?

Hi All,

I've received and verbally accepted a job offer but haven't signed the contract yet or handed in my notice.

I'm getting cold feet and thinking it isn't the right decision after receiving the contract.

I'm currently a senior site reliability engineer at a scale-up tech company and like my work and team. I only looked for a new role because it was announced there were going to be redundancies. It turns out I'm not on the to-be-axed list.

The job offer I accepted is to be a team leader at a subsidiary of a big tech company (I could put the big tech company on my CV though). They are offering an additional £5k a year in salary and £20k in RSUs that vest after 4 years. However, they require me to be in the office 50% of the time and their working hours are longer. On average, with the longer working hours and the commute, I would be doing an additional 8 hours a week. They are also offering 10 fewer days of holiday a year.

The new job seems a good opportunity in terms of career progression as I would be jumping up to a team leader position but in terms of finance and benefits it seems much worse. Should I back out?

I would feel bad backing out as I did negotiate with them and had a celebratory dinner with the recruiter.

https://redd.it/yzwlw9
@r_devops
All-inclusive online game: Redis, kafka, both?

So I probably got the general idea: redis is more of an in-memory db, while kafka is more of an event bus/queue (among others). I know this is not the whole description, but it's all I know atm.

Now, I have a Unity game (online, multiplayer, turn-based) with different dev environments/stages, each with its own Postgres instance, plus all the CI/CD infra for the game (which runs on PC, WebAssembly, mobile), the editor (PC), and the back end (.NET 5, cross-platform, but mostly run on Linux under Docker).

There are also 2 systems for live content: one for game assets (asset bundles in Unity terminology), one for gameplay configs (general game rules and per-level configs), and coming soon a liveops tool to manage the game's economy, schedule events and such, which will need its own backend.

Question is simple: could I hit many of those birds with 1 stone, in terms of the communication protocol?

Here are a few examples:

- when a game client web build (web assembly) was uploaded to a certain bucket in google cloud, there's an event fired to a messaging system (e.g. by a github action) and a python script which waits for that event: stops the current live WebGL webserver, downloads and replaces it, then starts the new one, so 'play.game.com' always reflects the latest build, and ofc 'play-dev.game.com' uses a different build.

- general-purpose messaging between back-end components

- anyone can host a game server, so a game server has to register itself with the master server (this already happens, but it might need extending in the future and a messaging system seems better for faster development. I'm currently using websockets for Everything, which is overkill)

- player x invites player y to his game

- chat-like functionalities (I already have a third-party solution for this, but other chat-like messages would be needed over which I have full control - or even a completely separate, rudimentary chat as a backup system in case the main one stops for some reason)

- logging/analytics (I know third-parties have analytics packages and Unity Analytics will be used here, but it's not a silver bullet, and ofc I'll need backend analytics as well, to understand how all of the systems communicate and identify abnormalities)
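
For the examples above, the distinction that tends to matter most is whether a message should survive when nobody is listening. The sketch below is a toy in-memory model, not the API of either product: the class names `FireAndForgetBus` and `RetainedLog` and the build names are hypothetical. It assumes Redis Pub/Sub style delivery (only currently connected subscribers receive a message) on one side and a Kafka-style retained log (each consumer group reads from its own offset) on the other.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple


class FireAndForgetBus:
    """Pub/sub in the style of Redis Pub/Sub: only current subscribers see a message."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: str) -> int:
        handlers = self._subscribers[channel]
        for handler in handlers:
            handler(message)
        return len(handlers)  # number of subscribers that received the message


class RetainedLog:
    """Append-only log in the style of a Kafka topic: consumers track their own offset."""

    def __init__(self) -> None:
        self._records: DefaultDict[str, List[str]] = defaultdict(list)
        self._offsets: DefaultDict[Tuple[str, str], int] = defaultdict(int)

    def append(self, topic: str, message: str) -> int:
        self._records[topic].append(message)
        return len(self._records[topic]) - 1  # offset of the new record

    def poll(self, topic: str, group: str) -> List[str]:
        start = self._offsets[(topic, group)]
        batch = self._records[topic][start:]
        self._offsets[(topic, group)] = start + len(batch)
        return batch


# The deploy script is down when build 41 is published, then comes back.
bus = FireAndForgetBus()
lost = bus.publish("builds", "web-build-41")  # nobody is subscribed yet
received: List[str] = []
bus.subscribe("builds", received.append)
bus.publish("builds", "web-build-42")

log = RetainedLog()
log.append("builds", "web-build-41")
log.append("builds", "web-build-42")
first_poll = log.poll("builds", "deployer")   # catches up on everything retained
second_poll = log.poll("builds", "deployer")  # nothing new since the last poll
```

In this model the "new web build uploaded" event and backend analytics look like retained-log cases, because a deploy script that was offline can catch up later, while invites and chat-like messages to online players fit fire-and-forget delivery.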

I know how to 'just do it', but the story of my career is: "I didn't know that tool existed - what an easier life I could've had, had I known about it"

Other details: this is currently a 1-man project (excluding art) that will soon expand to probably 5-10 people, and I foresee at least 8-10y of runtime. Game is not yet available to masses, and I imagine it'll roll out slowly: 100 ppl in 1st month, 1000 in second month etc. Anything can happen, but we don't expect massive sudden loads on the infrastructure, so performance is currently secondary

https://redd.it/yzyalq
@r_devops
Introduction to Docker: A Beginners Guide

Blog post on Introduction to Docker: A Beginners Guide that covers all the basic concepts of Docker, its Architecture, and how it can be used for real-world applications!

Link: https://karanjagtiani.medium.com/introduction-to-docker-a-beginners-guide-for-2023-cbf9be911352

https://redd.it/yzwg2m
@r_devops
Awesome CoreDNS

https://github.com/mariuskimmina/awesome-coredns

I started an awesome list for CoreDNS resources. There are probably still a lot of things missing, but I thought it was already worth publishing so that other people can add to it as well (PRs welcome).

https://redd.it/z00siu
@r_devops
docker container to redirect host traffic to proxy server

I have an app (https://pawns.app/cli-download/) that does not support a proxy server on Ubuntu. Is it possible to create a Docker container that redirects all of its traffic to a proxy server, or how else should I solve this?

https://redd.it/yzxqsl
@r_devops
Looking for a PaaS with mTLS

Hi everyone,

We’re starting a new project that needs to be really stable, with fast networking and autoscaling.

Another requirement is to be as cloud-agnostic as we can, and since we’re a small team it also needs to be as simple as we can make it. And… stored in Canada.

I know it sounds like a contradiction, but back in the day I worked on Pivotal Cloud Foundry releases and it was not so bad.
Now when I look for a PaaS, I see a lot of solutions, but they require us to keep an up-to-date Kubernetes cluster to run the PaaS on, which is something we want to avoid if possible. Do you know of something that fits?

https://redd.it/z035d8
@r_devops
What according to you is a developer environment?

Hello Folks.

I have been hearing this term for some time now and wanted to understand what it means.

In our organization, everyone installs vscode on their laptop, and the required language. Developer environment is kind of ready, is there anything else?..

Pls share your thoughts

https://redd.it/z02e3b
@r_devops
How to organize E2E testing?

I am in an org dealing with financial data (SQL). A lot of the functionality is just CRUD operations but there are some more complex parts such as FIX trading and dynamic PDF generation.

In this environment we have limited unit tests as most functions are to update or select from the database and a lot of calculations are using aggregate functions from the DB.

Truly complex calculations are scripted as functions and they have unit tests; they also don't need to be updated and aren't the places we've run into issues where functionality doesn't meet expectations.

Integration/E2E testing is where we've found tests are valuable; however, if we set up the environment for each test, it'll take forever to test the entire system (800+ routes, 50+ jobs, 50+ reports).

Instead, our approach is to create a "data story": throughout the tests, data is created by the tests, then used and updated as the tests execute. However, this does mean at least some test suites are based on others, and when earlier tests are changed they break later tests, which starts to become unwieldy.
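
One way to keep a chain like this manageable is to write the dependencies between suites down explicitly, so the run order and the blast radius of a change can be computed rather than remembered. The following is a minimal sketch under assumptions: the suite names and the `DEPENDS_ON` map are invented for illustration, and it uses Python's standard `graphlib` module (Python 3.9+) rather than any particular test framework.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical suites: each key lists the suites whose data it builds on.
DEPENDS_ON: Dict[str, Set[str]] = {
    "accounts": set(),
    "orders": {"accounts"},
    "fix_trading": {"orders"},
    "reports": {"orders"},
    "pdf_statements": {"reports"},
}


def run_order(depends_on: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every suite runs after the suites it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


def affected_by(changed: str, depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Return every suite that directly or transitively builds on `changed`."""
    affected: Set[str] = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for suite, deps in depends_on.items():
            if current in deps and suite not in affected:
                affected.add(suite)
                frontier.append(suite)
    return affected


order = run_order(DEPENDS_ON)
downstream_of_orders = affected_by("orders", DEPENDS_ON)
```

With the map in place, a change to the `orders` suite immediately tells you which later suites to re-check, and a cycle in the declared dependencies makes `TopologicalSorter` raise `CycleError` instead of silently producing a broken order.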

Thoughts?

https://redd.it/z04aqw
@r_devops
TraceView - OpenTelemetry UI released

We have released an early version of our OpenTelemetry UI "TraceView"

https://github.com/asynkron/TraceViewDeploy

TraceView is intended as a tool for developers, to pinpoint issues in microservice architectures.
There are plenty of good and scalable tools for observability out there, DataDog, Grafana, etc.

But they do tend to focus on DevOps and SRE, e.g. focus on latency, focus on huge logs.

TraceView rather tries to combine and analyze this data to show meaningful views where you can reason about what is going on.

e.g. as a replacement for Jaeger or similar on local dev machines.

Any feedback is welcome

https://redd.it/z05bcc
@r_devops
Please criticize my SaaS architecture

https://drive.google.com/file/d/1JNnqSIbkSikTjtmjQYNtsDHbclL_Pnot/view?usp=sharing

Monthly budget: 150 USD

I have also used the AWS calculator to roughly estimate how much it will cost monthly; see here: https://drive.google.com/file/d/1V-ZQrEyBYYl1PRaLJ5L_SLjOeyRD5U81/view?usp=sharing

Idea: It's a serverless WordPress hosting infrastructure

1. The site owner sends a request to set up WordPress
2. There are two containers in the Fargate instance (WordPress and DB container); data is persisted in the DB
3. EFS for volume mount
4. Lambda function to deploy the static site to S3
5. Same as 4
6. Managing end-user requests to the S3 bucket

This is just a high-level overview; I didn't go in depth on security or handling DDoS. However, I welcome all suggestions.

GOAL: Keep the monthly cost as low as possible, within 120-150 USD

Thank you

https://redd.it/z08vmc
@r_devops
A poor man's API

Creating a full-fledged API requires resources, both time and money. You need to think about the model, the design, the REST principles, etc., without writing a single line of code. Most of the time, you don’t know whether it’s worth it: you’d like to offer a Minimum Viable Product and iterate from there. I want to show how you can achieve it without writing a single line of code.

https://redd.it/z0a1zx
@r_devops