Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Overwhelmed with so much to learn :(

I’m an Azure cloud engineer with a Windows systems engineer background. I’m about 6 months into the job and have learned quite a lot about Azure DevOps, but I still feel I’m missing a lot of the skills needed for a fully DevOps-focused engineering role.

There are just so many things to learn that I don’t know how it’s possible to learn them all: AWS, Python, PowerShell, Jenkins, Ansible, Terraform, Kubernetes, the list goes on…

How do you upskill when there is so much to learn? Do I simply start with one focus point and then move on to the next?

To summarise, I only have exposure to Azure, M365, and Azure DevOps, and I haven’t done much creating of my own pipelines or automation. I also don’t have a strong coding background.

Appreciate any advice!

https://redd.it/qtlhjo
@r_devops
Dockerless containers with Podman on macOS

Docker the company has lately been throwing wrenches into what used to be a smooth user experience, with its new Terms of Service for Docker Desktop and various limits on image pulls from Docker Hub. Understandably, many people have been looking for alternatives, which are easy to adopt on Linux but not so much on macOS.

Here we show a way to build and use containers on a Mac without Docker, using Podman.

https://gokloudtech.com/index.php/2021/11/13/dockerless-containers-with-podman-on-macos/

https://redd.it/qthao4
@r_devops
Python Selenium Debugging

I have the following [webpage](https://i.imgur.com/gZLwTgh.png). I'm trying to double-click the "Blocked_IPs" text.

This is the code that interacts with it:

blocked_ips = driver.find_elements_by_xpath('//td[contains(.,"Blocked_IPs")]')
print(len(blocked_ips), blocked_ips)
action = ActionChains(driver)
action.double_click(blocked_ips[0])

The problem is that it just doesn't seem to double-click it. When I do it manually, it works. When I execute the code, it doesn't. There's only one occurrence of the text "Blocked_IPs". This is the output in the terminal:

1 [<selenium.webdriver.remote.webelement.WebElement (session="82b277a5f85cbb202f5cd57c0b800f3b", element="530b1a15-a190-401c-8495-921777f8fa84")>]

Does anyone happen to know why it's not working? How can I test it? Thanks in advance!
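
One thing worth checking in the snippet above: Selenium's `ActionChains` only queues actions, and nothing reaches the browser until `perform()` is called. The snippet never calls it, which may be all that is missing. A minimal sketch of a helper that does; the name `double_click_first` and the `make_chain` parameter are made up here so the helper can be tried without a browser:

```python
def double_click_first(driver, xpath, make_chain=None):
    """Double-click the first element matching `xpath`.

    Returns True if an element was found and clicked, False otherwise.
    `make_chain` is a hypothetical hook for swapping in a stand-in chain.
    """
    if make_chain is None:
        # Imported lazily so the helper can be exercised without Selenium.
        from selenium.webdriver import ActionChains
        make_chain = ActionChains

    # "xpath" is the string value of selenium's By.XPATH locator constant.
    matches = driver.find_elements("xpath", xpath)
    if not matches:
        return False

    # double_click() only queues the action; perform() actually sends it.
    make_chain(driver).double_click(matches[0]).perform()
    return True
```

With a real driver this would be called as `double_click_first(driver, '//td[contains(.,"Blocked_IPs")]')`. If the click still does nothing, the `td` may not be the element that handles the event, so targeting a child element is another thing to try.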

https://redd.it/qtq1ok
@r_devops
If you could automate any troubleshooting workflow for k8s errors, what would it be?

You all have a series of checks and processes you go through to find the root cause every time there's a recurring error. Which of your workflows would you like to automate? Imagine every time you got XXXX error it resolved on its own.
For example, when I get an ImagePullBackOff error, I check for changes in the image tag and repo, then I verify that the repo is specified, then, if a pull secret is involved, I check whether it's expired or misconfigured, etc. Each of those checks can branch out and spiral into endless steps and iterations until I find the root cause. And that's one of the (relatively) simple errors.


If I had to choose I'd go with OOMKilled, since it's so recurring and annoying to debug. What would you choose?
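
To make the idea concrete, the ImagePullBackOff checks described above could be written as an ordered list of predicates. This is only a sketch: every field name in `facts` is hypothetical, and a real tool would have to fill them in from the Kubernetes API and the registry.

```python
from datetime import datetime, timezone

# Each entry is (description, predicate over the gathered facts).
# All keys used in `facts` are made up for this illustration.
CHECKS = [
    ("image reference changed since the last successful pull",
     lambda f: f["image"] != f["last_pulled_image"]),
    # Crude stand-in for "is the repo specified?": no "/" in the reference.
    ("image reference has no repository part",
     lambda f: "/" not in f["image"]),
    ("pull secret is missing",
     lambda f: f["needs_secret"] and f["secret_expires_at"] is None),
    ("pull secret has expired",
     lambda f: f["needs_secret"]
     and f["secret_expires_at"] is not None
     and f["secret_expires_at"] <= f["now"]),
]


def diagnose_image_pull(facts):
    """Return the description of every check that flags a possible cause."""
    return [description for description, is_suspect in CHECKS if is_suspect(facts)]


example = {
    "image": "registry.example.com/team/app:2.0",
    "last_pulled_image": "registry.example.com/team/app:1.9",
    "needs_secret": True,
    "secret_expires_at": datetime(2021, 1, 1, tzinfo=timezone.utc),
    "now": datetime(2021, 11, 14, tzinfo=timezone.utc),
}
print(diagnose_image_pull(example))
```

Keeping the checks in a flat list makes it easy to add the branches that show up in practice without turning the workflow into nested conditionals.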

https://redd.it/qtpu8f
@r_devops
CI/CD for low-level Linux and kernel modules

Hi all,

How do you guys do CI/CD with kernel modules?

Do you just set up VMs with the proper kernel configurations and run your jobs on them through Jenkins or GitLab CI?

Did you manage to do it with Docker containers and somehow make it work? If so, how did you do it?

I would love to hear from all of you DevOps people who work in embedded Linux and kernel development. How do you guys work on such projects?

https://redd.it/qto0ly
@r_devops
One platform, one tool.

What if you could do everything from one platform, from CI/CD to cluster deployments, and perhaps even deploying your regular infra tools?

https://redd.it/qt995v
@r_devops
My Containers Learning Path

Following up on the feedback I got here last time, I decided to share my containers learning path as a blog post. The following learning order turned out to be particularly helpful for me in understanding containers (Docker and beyond):

1. Linux Containers - learn low-level implementation details.
2. Container Images - learn what images are and why you need them.
3. Container Managers - learn how Docker helps containers coexist on a single host.
4. Container Orchestrators - learn how Kubernetes coordinates containers in clusters.
5. Non-Linux Containers - learn about alternative implementations to complete the circle.

I've been digging into the internals of containerization tech for the past few years, meticulously documenting my findings. However, I came up with the above order only recently. So, this blog post is an attempt to organize my in-depth but ad hoc write-ups into a structured way to learn container fundamentals.

I hope someone finds it useful.

https://redd.it/qtw0qo
@r_devops
Homelab CI/CD Pipeline

I am looking for a tutorial on building a simple CI/CD pipeline with open-source tools in my homelab. Everything I have found so far has been overkill for what I want to accomplish. I appreciate the help!

https://redd.it/qtw75h
@r_devops
Looking for hosting recommendations for a static site with some form of access control (ideally free or cheap)

Hi, I am looking for recommendations on the best and cheapest option for hosting a small website with some form of access control.

Essentially, I have collated all my personal notes from different formats, ported them into markdown files, and saved them to a private repo. What I am now looking to do is turn these into a knowledge base / wiki website for personal use that I can easily refer to when I am not on my personal computer. However, I do not want to make them publicly accessible without some form of authentication (basic auth would do).

Currently I am looking at using something like Hugo, Jekyll, or MkDocs to generate the site every time there is a commit to the git repo (using the free tier of GitHub Actions) and then copy the output to the web server.

I have been looking at a few options (ideally free, as money is a bit tight atm, but I expect I might have to pay for something). However, none of the free options I have seen so far offer any form of access control on the website (i.e. GitHub Pages, cloudfront pages, etc.). Does anyone have any recommendations on what hosting to use? I would be fine with a cheap VM that I can run nginx/Apache on and use .htaccess files to control access.

Thanks

https://redd.it/qu0ntm
@r_devops
Can therapy lower the chances of getting hired for government positions?

Not really a DevOps question, just related, and I’m not sure where else to ask, so sorry if this isn’t the right place. I’m asking in case anyone can help. I have a friend who needs therapy; he’s a DevOps engineer. He is avoiding therapy because he doesn’t want the fact that he went to therapy to show up on his background check. He says he might get rejected for US government positions if they see that on his background check. Can that happen? Does anyone have any experience or knowledge about this? Any answers would be appreciated, thank you!

https://redd.it/qtxmqz
@r_devops
Is Nana's DevOps bootcamp worth it?

I am planning to take the DevOps bootcamp by Nana. Can anyone who has taken it let me know whether it's worth the investment?
Thanks

https://redd.it/qu8dw3
@r_devops
DevOps Bulletin Newsletter - Issue 26

DevOps Bulletin - Digest #26 is out. The following topics are covered:

Keeping K8s clusters clean and tidy
Git techniques to get out of hairy situations
Efficient communication during an incident
Mini-projects in Python for DevOps

Complete issue: https://issues.devopsbulletin.com/issues/git-techniques.html

Feedback is welcome :)

https://redd.it/que8hh
@r_devops
Unified or automated timeouts and contracts between microservices

Hello,

I wanted to ask about best practices, and maybe tools available on the market, for my purpose.

So, let's say that I have a product consisting of multiple services. Different services define different SLAs, and in particular different timeouts. I'd like to somehow automate the process of reading those SLAs and adjusting timeouts dynamically if the agreement of one of the downstream services changes.

Consider these services:


A -> B -> C & D

and a situation where, for example, service B defines a timeout of 5s, but a request that takes 7s to process in service C is still a valid response from service C's point of view.

I'd like to have some kind of automation, or maybe semi-automation, where service B can ask service C what its SLAs or timeouts are and set its own timeouts accordingly.
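
To illustrate the kind of derivation this would involve, here is a minimal sketch under several assumptions: each service publishes its own processing budget, B calls C and D in parallel, and the call graph has no cycles. The numbers for A, B and D, and the names `OWN_BUDGET_S`, `CALLS` and `required_timeout`, are hypothetical; only C's 7s comes from the example above.

```python
# Hypothetical published data: each service's own processing budget in
# seconds, and the downstream services it calls. In practice this could be
# served by each service or kept in a shared registry.
OWN_BUDGET_S = {"A": 0.5, "B": 1.0, "C": 7.0, "D": 3.0}
CALLS = {"A": ["B"], "B": ["C", "D"], "C": [], "D": []}


def required_timeout(service, own_budget=OWN_BUDGET_S, calls=CALLS):
    """Time a caller must allow for `service` to answer.

    Assumes downstream calls run in parallel (so the slowest one dominates)
    and that the call graph is acyclic.
    """
    downstream = [required_timeout(dep, own_budget, calls) for dep in calls[service]]
    return own_budget[service] + max(downstream, default=0.0)


for name in CALLS:
    print(name, required_timeout(name))
```

With these numbers, B has to allow 7s for C, so a 5s timeout on calls to B is guaranteed to cut off valid responses; anything calling B needs at least 8s. If C's published budget changes, rerunning the derivation updates every upstream timeout. Sequential downstream calls would need `sum` instead of `max`.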

https://redd.it/qup48x
@r_devops
Interactive Architecture Diagrams

Does anyone have experience or recommendations for a tool that would allow an engineer to create a multi-layered, interactive/explorable infrastructure diagram? I'm looking to create a diagram that encompasses everything from VPC, subnets, security groups, EKS, statefulsets, deployments, etc. etc.

I did some Google searching, and the only product I found that seemed to fit the bill was Terrastruct. Are there other alternatives? Is Terrastruct a good fit for this use case? Does anyone have experience with Terrastruct, a similar tool, or creating this kind of infrastructure diagram?

https://redd.it/quv52y
@r_devops
Webhooks in Kraken CI for GitHub, GitLab and Gitea

Hello, I have extended webhook support in Kraken CI in the latest release, 0.753. Besides GitHub, there is now support for GitLab and Gitea.
A guide to webhooks in Kraken can be found here:
https://kraken.ci/docs/guide-webhooks/

https://redd.it/qv16g5
@r_devops
How to handle cloud resources in your application while running localhost

Hi (non-)binary people,

When you are developing an application and you run it on your own device, how do you handle cloud resources that it relies on, like API Gateway, Cloud Map, S3, or a secret store like KMS? For RDS or DocumentDB you can run an additional PostgreSQL/MySQL or MongoDB service on your device. (Although I mention several AWS services, I mean this in general.)

Take the Application Load Balancer, for example: I keep its configuration to a minimum and basically configure it to send everything to a reverse proxy. The reverse proxy has the actual configuration, with retries, setting headers, etc. I understand that the ALB can do this as well (some things maybe better, some not; that's not relevant here), but when I run my application locally, I still need something running that sets these headers, for example. If I cannot set them, what am I really testing locally? When I have a nice feature, I want to be sure I can test it locally before committing and creating a PR. I cannot run a local ALB, but I can run a reverse proxy locally. So by keeping the ALB configuration to a minimum and using a reverse proxy, I can, if needed, run almost everything on my device.

How do you do this? I am eager to hear about your experience.

"May the force be with you"

https://redd.it/qv8ynq
@r_devops
Prometheus alerts in Grafana

Hi all, I have installed kube-prometheus-stack in a Kubernetes cluster.

This Helm chart provides a lot of alerting rules for Kubernetes resources. The cluster is offline, so I can't send alerts to an external destination using Alertmanager. Instead, the idea is to observe the cluster using Grafana. Is there a way to see Prometheus alerts in a Grafana dashboard? I tried alertmanager-datasource, but it does not work, maybe because it has not been updated for a while.
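
One option that needs no extra plugin: Prometheus exposes pending and firing alerts as a synthetic `ALERTS` time series, so a Grafana panel on the regular Prometheus data source can query `ALERTS{alertstate="firing"}`. The same query can be checked by hand against the Prometheus HTTP API first. A minimal sketch; the base URL is an assumption and the helper names are made up:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# PromQL selecting only alerts that are currently firing.
FIRING_QUERY = 'ALERTS{alertstate="firing"}'


def query_url(base_url, promql=FIRING_QUERY):
    """Build the instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def firing_alerts(payload):
    """Extract sorted (alertname, severity) pairs from an instant-query response."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return sorted(
        (series["metric"].get("alertname", "?"), series["metric"].get("severity", ""))
        for series in payload["data"]["result"]
    )


def fetch_firing_alerts(base_url):
    """Query a reachable Prometheus, e.g. http://localhost:9090 (assumed address)."""
    with urlopen(query_url(base_url), timeout=10) as response:
        return firing_alerts(json.load(response))
```

If this returns the expected alerts, the same `ALERTS{alertstate="firing"}` expression can go into a table panel in Grafana.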

https://redd.it/qvbdh5
@r_devops
How incidents made me a better engineer

My colleague has written a great post about how handling incidents has made her a better engineer, and how people can make the most of that opportunity.

I think it's a great post for people in this subreddit!

https://incident.io/blog/incidents-made-me-a-better-engineer

https://redd.it/qv94gf
@r_devops
Buildpacks pack CLI for FaaS

Does anyone know if there is something similar to Buildpacks for FaaS?

Buildpacks with the pack CLI let you create Docker images from the code alone, which is great. However, I'm looking for something that can look at your code and create a Lambda (AWS) or a Cloud Function (GCP) from it, or, even better, something that lets you choose the cloud provider where the FaaS will run and generates the function for you.

https://redd.it/qvjfhk
@r_devops
Need some guidance on building out monitoring/observability from the ground up

So I've semi-recently started a new job as the company's only DevOps person, where a lot of infra things are greenfield and I'm tasked with getting things built out. I am generally pretty comfortable with most areas of infra (including Kubernetes, which I've been using in prod since 2015), but I feel a huge gap of mine is monitoring/observability.

I've set up many metrics/logging/alerting systems before in different environments (Prometheus, DataDog, Sensu, etc.), but I've not really done much beyond that, as in the past other teammates have usually taken on actually working with dev teams, identifying metrics to gather, and getting dashboards/alerting set up for those.

In this new job, one of the first big projects is to get monitoring going and to help the dev teams get started on it. I've already got Prometheus/Grafana set up, and they've got the client library going in the main service. However, I'm a bit overwhelmed about how to help them beyond this point. They ask questions about Prometheus and monitoring that I'm not able to answer, and I don't feel equipped to lead here since I'm still trying to fill in this gap in my experience.

What are some good resources for me, as an infra person, on observability/monitoring as well as Prometheus best practices? And are there good resources I can send to the dev teams on how to effectively use Prometheus to monitor their services?

https://redd.it/qvmrn0
@r_devops