Reddit DevOps
What are your strategies or advice on a new devops engineer learning to support an application?

I recently got a new role as a devops/sre engineer, coming from a network engineering background, and I'm wondering how people "learn" applications well enough to provide operations support.

In networking, we engineers just need a solid understanding of standard networking protocols and we can start troubleshooting in most environments. However, in the app world, there are no "standard" protocols and we seem to need to troubleshoot incidents with little understanding of what is happening in the application. I'm kind of lost at the moment and would appreciate some advice on how to get started.

https://redd.it/ug0ll3
@r_devops
ansible - Not sure where to begin

Undergrad student here -- I have a devops project I'm working on and need some guidance. My professors and TAs are confusing me even more so I've turned to reddit.

I'm building, testing (running project test cases), and deploying an open source project by having a specific job for each using ansible.

I'm starting off with the build job. The open source project repo defines steps in the readme on how to get it started but I'm confused on how to translate that into lines in ansible. Any help is appreciated.
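
As a rough illustration of how README build steps can map onto Ansible tasks, here is a minimal sketch. The post does not name the project, so the repository URL, checkout path, branch, package list, inventory group, and `make build` command are all hypothetical placeholders, and the `apt` task assumes a Debian or Ubuntu build host. The general idea is that each command in the README becomes one task.

```yaml
# build.yml - hypothetical build playbook; every name below is a placeholder
- name: Build the open source project
  hosts: build                      # assumed inventory group
  vars:
    project_repo: https://github.com/example/project.git  # placeholder URL
    project_dir: /opt/project                              # placeholder path
  tasks:
    - name: Clone the project (README step like "git clone ...")
      ansible.builtin.git:
        repo: "{{ project_repo }}"
        dest: "{{ project_dir }}"
        version: main               # assumed branch name

    - name: Install build dependencies (README step like "apt install ...")
      ansible.builtin.apt:
        name:
          - build-essential         # placeholder package
        state: present
        update_cache: true
      become: true

    - name: Run the build command from the README
      ansible.builtin.command:
        cmd: make build             # placeholder build command
        chdir: "{{ project_dir }}"
```

A playbook like this would typically be run with `ansible-playbook -i inventory build.yml`; the test and deploy jobs could follow the same pattern in their own playbooks.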

https://redd.it/ug5ubs
@r_devops
For those looking to make a career switch to devops, did it ever feel overwhelming with all the things you need to know?

Right now I'm learning Python because I feel like many IT careers (not just devops) would benefit from learning it, but when I start to see everything else you need to know it feels like it's neverending.

How do you usually manage not to jump between things? I know you can't learn everything at once, but at what point do you go from learning one thing to learning several things simultaneously?

https://redd.it/ug939g
@r_devops
Ansible for windows?

Are there any programs I can use to configure ~40 computers? My company updates its software (.exe) once a week, which requires:

1. Updating ~5 programs (.exe), which means uninstalling and reinstalling each one
2. Pulling changes from 3 repositories on GitHub
3. Reinstalling the python virtual environment (lots of pip install commands)

I currently use Jenkins, but have been running into significant configuration drift because some pipelines may error out. I heard Ansible is a good program, but noticed it is mostly for linux. Are there any alternatives for Windows?
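
For reference, Ansible can manage Windows hosts over WinRM through the `ansible.windows` collection, so the three steps listed above could be sketched roughly as follows. This is only an illustration: the inventory group, product ID, installer path, silent-install flag, repository path, and virtual environment path are hypothetical and depend on the actual installers and machines.

```yaml
# update-workstations.yml - sketch only; all paths and IDs are placeholders
- name: Weekly software refresh on Windows machines
  hosts: windows_workstations       # assumed group reached via WinRM
  tasks:
    - name: Uninstall the old version of one application
      ansible.windows.win_package:
        product_id: "{00000000-0000-0000-0000-000000000000}"  # placeholder ID
        state: absent

    - name: Install the new version from a file share
      ansible.windows.win_package:
        path: '\\fileserver\releases\app-setup.exe'           # placeholder path
        product_id: "{00000000-0000-0000-0000-000000000000}"  # placeholder ID
        arguments: /S               # silent flag varies by installer
        state: present

    - name: Pull the latest changes for one repository
      ansible.windows.win_command: git pull --ff-only
      args:
        chdir: 'C:\repos\example-repo'                        # placeholder path

    - name: Reinstall Python dependencies into the virtual environment
      ansible.windows.win_command: 'C:\venvs\app\Scripts\python.exe -m pip install -r requirements.txt'
      args:
        chdir: 'C:\repos\example-repo'
```

Because a playbook run reports success or failure per host and can be re-run against only the failed hosts, it may help with the drift caused by pipelines that error out partway.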

https://redd.it/ugfd8w
@r_devops
Some practical learnings while participating in on-call rotations

I wrote a short blog post on some practical learnings I gained while participating in on-call rotations. Read it here: https://ernestas.me/on-call-leave-it-better-than-you-found-it

https://redd.it/ugjs4r
@r_devops
Monthly 'Getting into DevOps' thread - 2022/05

What is DevOps?

[AWS has a great article](https://aws.amazon.com/devops/what-is-devops/) that outlines DevOps as a work environment where development and operations teams are no longer "siloed", but instead work together across the entire application lifecycle -- from development and test to deployment to operations -- and automate processes that historically have been manual and slow.

Books to Read

The Phoenix Project - one of the original books to delve into DevOps culture, explained through the story of a fictional company on the brink of failure.
[The DevOps Handbook](https://www.amazon.com/dp/1942788002) - a practical "sequel" to The Phoenix Project.
Google's Site Reliability Engineering - Google engineers explain how they build, deploy, monitor, and maintain their systems.
[The Site Reliability Workbook](https://landing.google.com/sre/workbook/toc/) - the practical companion to Google's Site Reliability Engineering book.
The Unicorn Project - the "sequel" to The Phoenix Project.
[DevOps for Dummies](https://www.amazon.com/DevOps-Dummies-Computer-Tech-ebook/dp/B07VXMLK3J/) - don't let the name fool you.

What Should I Learn?

Emily Wood's essay - why infrastructure as code is so important in today's world.
[2019 DevOps Roadmap](https://github.com/kamranahmedse/developer-roadmap#devops-roadmap) - one developer's ideas for which skills are needed in the DevOps world. This roadmap is controversial, as it may be too use-case specific, but serves as a good starting point for what tools are currently in use by companies.
This comment by /u/mdaffin - just remember, DevOps is a mindset for solving problems. It's less about the specific tools you know or the certificates you have than about the way you approach problem solving.
[This comment by /u/jpswade](https://gist.github.com/jpswade/4135841363e72ece8086146bd7bb5d91) - what is DevOps and associated terminology.
Roadmap.sh - Step by step guide for DevOps or any other Operations Role

Remember: DevOps as a term and as a practice is still in flux, and is more about culture change than it is specific tooling. As such, specific skills and tool-sets are not universal, and recommendations for them should be taken only as suggestions.

Previous Threads
https://www.reddit.com/r/devops/comments/tv01vk/monthlygettingintodevopsthread202203/

https://www.reddit.com/r/devops/comments/t4fozq/monthlygettingintodevopsthread202203/

https://www.reddit.com/r/devops/comments/ru3zhm/monthlygettingintodevopsthread202201/

https://www.reddit.com/r/devops/comments/r6myz4/monthlygettingintodevopsthread202112/

https://www.reddit.com/r/devops/comments/qkgv5r/monthlygettingintodevopsthread202111/

https://www.reddit.com/r/devops/comments/pza4yc/monthlygettingintodevopsthread2021010/

https://www.reddit.com/r/devops/comments/pfwn3g/monthlygettingintodevopsthread202109/

https://www.reddit.com/r/devops/comments/ow45jd/monthlygettingintodevopsthread202108/

https://www.reddit.com/r/devops/comments/obssx3/monthlygettingintodevopsthread202107/

https://www.reddit.com/r/devops/comments/npua0y/monthlygettingintodevopsthread202106/

https://www.reddit.com/r/devops/comments/n2n1jk/monthlygettingintodevopsthread202105/

https://www.reddit.com/r/devops/comments/mhx15t/monthlygettingintodevopsthread202104/

Please keep this on topic (as a reference for those new to devops).

https://redd.it/ugqrkn
@r_devops
Monthly 'Shameless Self Promotion' thread - 2022/05

Feel free to post your personal projects here. Just keep it to one project per comment thread.

https://redd.it/ugqs3a
@r_devops
Which IDE/Editor is Your Daily driver?

In the last few years I've tried Vim with a bunch of plugins, NeoVim, Emacs (vanilla, Spacemacs and Doom), VS Code (also with neovim), Acme (from Plan 9), IntelliJ GoLand, Sublime Text... I'm curious which IDE/editor with external tooling is best for you.


https://redd.it/ugstjr
@r_devops
Those of you using prometheus as part of your observability stack, what approach did you take to scaling to scrape 25+ clusters, and why? Is Thanos the answer to my problems?

Hi everybody,

First post, longtime lurker :) .

So recently I've been tasked with implementing prometheus/grafana in our org as part of our preparations to get good devops strategies in place, as we are now aiming for zero-downtime deployments in production.

We've got almost 50 clusters to scrape, and I'm thinking about the most efficient way to scale prometheus to handle all this ingestion, while having some breathing space for future growth.

Particularly as there is a non-zero possibility of another ~100 clusters being provisioned in less than 24 months.

My ideal outcome is that all metrics will be available via a single grafana instance, and we can manage alerting from there for all clusters, but I'm worried about whether this will be doable at scale.

I understand that Thanos (https://github.com/thanos-io/thanos) was built with the idea of improving prom's scalability and availability, and I will be testing it shortly, but I would love to hear from others who have tried various approaches to solve this challenge.

At the bottom of the GitHub link above there are two architecture diagrams for possible implementations. Has anyone used either of them before, and how did it pan out?

Before learning of thanos, I was thinking of running prom on each cluster and then having a "master" instance of grafana to offer us a centralised view, but I'm sure there's a better approach to this.

In terms of configuration, I'm installing the prometheus/grafana operator using the helm charts in the community repo: https://github.com/prometheus-community/helm-charts
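
One common shape for the setup described above is a Prometheus per cluster, each tagged with a unique external label and forwarding samples to a central store that a single Grafana queries. The fragment below is a minimal sketch of the per-cluster side only. The cluster name and receiver URL are hypothetical, and using a remote-write receiver (for example Thanos Receive) rather than the sidecar-based layout is an assumption, not a recommendation drawn from the post.

```yaml
# prometheus.yml fragment for one cluster; name and URL are placeholders
global:
  scrape_interval: 30s
  external_labels:
    cluster: cluster-eu-01    # unique per cluster so series stay distinguishable centrally

remote_write:
  - url: https://metrics.example.internal/api/v1/receive   # hypothetical central receiver
```

When Prometheus is deployed through the operator, the equivalent settings are the `externalLabels` and `remoteWrite` fields of the Prometheus custom resource rather than a hand-written `prometheus.yml`.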

https://redd.it/uhmk77
@r_devops
How do I find out which container is responsible for a specific docker overlay folder?

I'm inside a VM which runs multiple docker containers. We have inspected some logs and noticed that in the VM's /app/docker/overlay directory there's a bunch of folders (presumably from the containers). They have hashed names like "jsdnjenljf8239ujsdkaldkoksdjo". Anyway, how do I determine which container is responsible for a particular directory like "jsdnjenljf8239ujsdkaldkoksdjo"?


https://redd.it/uhcerd
@r_devops
For 2022 - Sites for learning Azure & DevOps Tools? Pluralsight vs aCloudGuru vs Youtube / Udemy vs Books?

I'm somewhat fucked - I actually got a job for a Cloud / DevOps position - without any experience in DevOps and only some experience with Azure.
They just like me a bit too much.
I'm a sysadmin with a knack for linux, automation, containers and monitoring. Nothing else.
I have a month and 10 days until it begins and I lost the last three weeks just understanding concepts and "playing" with the tools.

I have the problem that I'm tutorial hopping, switching between books, Youtube, aCloudGuru, blogposts, etc... without making real progress.
I never know if the course I start is even in-depth enough to be worth my time.
The hopping is mainly caused because I'm not able to decide if I really want to go deep and learn everything or if I should work on projects without understanding shit so I can show them what I "prepared".
But I need the basics. And the basics are worth years of content.

Everyone is saying something else... aCloudGuru is bad, it's good, it's not deep enough, they make awesome content, go back to reading books, find another job or do some goat farming.

It's just too much tbh. The breadth of possibilities is immense.

I need some guide... some structure... I'm working on my own curriculum with questions, why's and projects right now, but it's permanent trial & error and I really don't have the time for this.

So, are there any good courses (or sites) on building solutions with Azure and CICD pipelines that stretch quite far and are deep... or should I go back to reading books and accept that this will take a long, long time?

https://redd.it/ui05m1
@r_devops
GH actions - how to change trigger branch without a code change?

**Context**

We try to use git-flow.

During our sprint, we work on the develop branch, and towards the end of the sprint we cut a release branch and make any last-minute changes there. Meanwhile, the rest of the team keeps merging their PRs into develop. The GH Actions workflow watches the specified branch and deploys the code when we have a merge (into develop, for example).

**Problem**

We want to change our trigger branch (the branch to watch & deploy) towards the end of the sprint. We can do this by changing the branch specified in the yaml file, but that feels like overkill. We would ideally like to do it from somewhere else, like the workflow_dispatch GUI, a label, or something else that requires minimal effort from the team.

Here is our stripped down yaml:

    name: Build & Deploy
    on:
      workflow_dispatch:
      push:
        branches:
          - develop # I want to change this dynamically
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - name: Install & build
            run: |
              echo "Deploy code..."

Is this a GH actions limitation or can our workflow be improved?
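
One possible way to avoid editing the file each sprint is to trigger on both branch patterns and gate the job on a repository configuration variable, which can be changed in the repository settings without a commit. The sketch below assumes a hypothetical variable named `DEPLOY_BRANCH` and a `release/**` naming pattern for release branches; neither comes from the original post.

```yaml
name: Build & Deploy
on:
  workflow_dispatch:
  push:
    branches:
      - develop
      - 'release/**'          # assumed naming pattern for release branches
jobs:
  build:
    # DEPLOY_BRANCH is a hypothetical repository variable holding the branch to deploy,
    # e.g. "develop" during the sprint and "release/1.4" near the end of it.
    if: github.event_name == 'workflow_dispatch' || github.ref_name == vars.DEPLOY_BRANCH
    runs-on: ubuntu-latest
    steps:
      - name: Install & build
        run: |
          echo "Deploy code..."
```

With this layout, pushes to the non-selected branch still start a workflow run, but the job is skipped, so switching the deploy target becomes a settings change rather than a code change.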

https://redd.it/ui2vk8
@r_devops
KUBECON EU 2022

KubeCon EU 2022 is just around the corner and LitmusChaos is all set for its Project Meeting on 16th May (Monday) from 13:00 to 17:00 CEST in Valencia, Spain.

Register now to book your seat (limited seats available)!

https://linuxfoundation.surveymonkey.com/r/WCPMX6R

https://redd.it/uid5vm
@r_devops
Alert individuals when an item changes lanes in an ADO board

Is there a way to alert a group of people when a card changes lanes on an ADO board? For example, if a card is moved from the Development lane to the Testing lane, the members of the Testers group are alerted. If so, where is this configured?

https://redd.it/uiczgq
@r_devops
Is there a good place to hear devops STARs stories, especially cloud ones?

Hi guys,

I've always learned best through example, and I was wondering if there are any online resources where I can read or watch people explaining their devops STARs (situation, task, action, result) with good specifics. I'm just looking for examples so that I know how to better talk about my own experiences. Thanks!

https://redd.it/uif51j
@r_devops
Using HaProxy on Nginx server. Not listening to port 80

I recently set up a server using haproxy. Everything else runs smoothly, but port 80 is not connecting. This is especially a problem when certbot tries to renew. Here is the haproxy config file. What am I missing?

    frontend backend.sample.com
        bind *:80

        # Test URI to see if its a letsencrypt request
        acl letsencrypt-acl path_beg /.well-known/acme-challenge/
        use_backend letsencrypt-backend if letsencrypt-acl
        bind 64.123.456.124:6684 ssl crt /etc/haproxy/certs/backend.sample.com.pem
        default_backend webapps

    backend webapps
        balance roundrobin
        server app01 64.123.456.124:5684

    backend letsencrypt-backend
        server letsencrypt 127.0.0.1:54321

https://redd.it/uiqjq4
@r_devops
How do I get my newrelic app string code?

I created a web app and got the long code to paste into the newrelic.js file, but I didn't commit it, and I need to copy the code again.

Where can I find it?

https://redd.it/uin9fg
@r_devops
When did it become standard for every job interview to require homework on top of 4-5 rounds of interviews?

I've been hearing the market is hot so I went to look for myself. I've gone through quite a few interviews and every one of them has required a take home skills assessment on Hacker Rank. I went through the first few but after that I just started declining the tests. I just don't have the time/capacity to dedicate 6 hours of interviews/testing per job right now. I can understand the appeal to weed out people but there has to be a better way. This used to be unique to FAANG companies but it seems to have caught on. Are there any DevOps/SRE jobs out there that don't structure their interviews like this? What has your experience been like?

https://redd.it/uix61t
@r_devops
How many (AWS) accounts do you have?

How many AWS accounts does your organisation have? Why that number, and what's the organisation strategy behind it? When do you add a new account? What are the best practices with accounts?

I'm curious because where I work we have, very roughly, one account per environment (dev, staging, prod) per repository. As a dev I find it hard to figure out where everything is, but then I'm not a devops person.

(Assuming you're using AWS. Question is specific to AWS, but probably applies to other platforms too.)

https://redd.it/uj8yzi
@r_devops