DevOps to Pipeline Engineer: Need resources for GitHub Actions best practices and optimization
Hi all, I've worked in DevOps for a couple of years, but in my new position I've been tasked with helping developers with their GitHub Actions pipelines. I'm not very proficient at it because in all my past DevOps roles I was mostly into AWS (cloud engineer), and this company is a bigger one: they have a CloudOps team for the AWS side and SREs for Kubernetes and monitoring, so this DevOps team is really a "pipeline" team.
The thing is that the developers are actually very capable of building their own pipelines, so every time I get on a call with them they ask "so, what can you do for us?" and I don't really know what to respond.
Could someone point me to a book or something that can help me compile a checklist of best practices for a pipeline? For example, I was reading a nice application security book that ended with checklists of things to perform, and I'm sure there is something similar for pipelines that I just can't seem to find.
Through personal research I found that there are things like Scorecards, SBOMs, attestations, container signing, etc., but I'm sure there are more, like "how to make a pipeline run faster". I'd appreciate it if someone could point me in the right direction, thanks!
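On the "make a pipeline run faster" side, dependency caching is usually the first win worth putting on such a checklist. A minimal sketch of a GitHub Actions job that caches npm downloads (the paths and key are illustrative; adjust for your ecosystem):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Cache the npm download cache keyed on the lockfile, so a cache hit
      # skips the slow dependency download on repeat runs.
      - uses: actions/cache@v4
        with:
          path: ~/.npm
          key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
          restore-keys: npm-${{ runner.os }}-
      - run: npm ci
      - run: npm test
```

The same pattern applies to pip, Maven, Go modules, and Docker layer caches; measuring before/after run times per step is a good habit to bake into the checklist too.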
https://redd.it/1epg2ly
@r_devops
SpecFlow UI Tests Failing on Azure Pipelines Despite Extended Wait Times and Workarounds – Need Advice!
I'm facing a frustrating issue with some of my UI SpecFlow test cases running in Azure Pipelines. They keep failing with a NoSuchElementException, even after trying several fixes.
I've already:
Increased the ExpectedCondition wait from 12 to 30 seconds.
Changed the implicit wait polling from 1 second to 3 seconds.
Added a hardcoded wait of 3 seconds before interacting with the element.
To address this, I suggested a few changes to our DevOps team, and they were implemented:
Enabled running tests in parallel.
Set failedrerun to true.
Set UI test to true.
Unfortunately, none of these solutions have resolved the issue, and the tests still fail intermittently. I suspect it might be due to the slower performance of the pipeline server/MS agents, but I can't seem to find a solution that works consistently. Has anyone else encountered this issue or have any suggestions on how to tackle it?
Any advice would be greatly appreciated!
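Piling on longer fixed waits usually just slows the suite without fixing the flakiness; the underlying pattern that helps is an explicit poll-until-ready loop with a hard deadline, retrying through transient lookup failures. SpecFlow itself is C#, so this Python sketch is only to illustrate the shape of the pattern (the `driver.find_element` usage in the comment is a stand-in for whatever lookup your driver provides):

```python
import time

def wait_until(condition, timeout=30.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value, or raise on timeout.

    Exceptions from `condition` (e.g. NoSuchElementException while the page
    is still rendering) are swallowed and retried until the deadline.
    """
    deadline = time.monotonic() + timeout
    last_error = None
    while time.monotonic() < deadline:
        try:
            result = condition()
            if result:
                return result
        except Exception as exc:
            last_error = exc
        time.sleep(poll_interval)
    raise TimeoutError(f"condition not met within {timeout}s") from last_error

# Usage sketch (hypothetical): wait_until(lambda: driver.find_element(By.ID, "submit"))
```

In Selenium terms this is what `WebDriverWait` with an `ExpectedCondition` does; the key is to wait on the *specific* readiness condition (element visible and clickable) rather than a fixed sleep, and to avoid mixing implicit and explicit waits, which interact badly.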
https://redd.it/1eph3hf
@r_devops
Azure vs. AWS
If you had to pick one of these cloud providers for the long run, which one would you pick, and why?
https://redd.it/1epk8mz
@r_devops
Strategy/Tech for managing/building docker images on a larger scale?
Hi, we are slowly moving towards more widespread use of containers for our applications, and we're already seeing much faster turnaround on the dev cycle and all the fun DevOps stuff.
Anyway, we are an on-prem company with the usual quirks that follow. For example, proxies here and there in the network, and SSL inspection whose certs need to be shimmed into our images.
So to make the process easier, the devs have "shimmed" a bunch of the basic official images, i.e. added the CA certs and the config that goes with them, pointed them at the internal artifact repository, and the like.
The images are currently built with YAML shenanigans and docker-compose, but I'm not really happy with the setup, even more so as the number of "base" images we maintain grows.
So, anyone got any good tips for managing a growing volume of images?
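One common pattern for exactly this problem is to stop hand-maintaining one Dockerfile per base image and generate them all from a single template, so the CA-cert/proxy boilerplate lives in one place. A minimal sketch in Python (the image list, cert path, and proxy URL are made-up placeholders; Alpine-based images would also need the `ca-certificates` package installed):

```python
# Render a "shimmed" Dockerfile for each upstream base image from one template.
TEMPLATE = """FROM {upstream}
COPY corp-ca.crt /usr/local/share/ca-certificates/corp-ca.crt
RUN update-ca-certificates
ENV HTTPS_PROXY=http://proxy.internal:3128
"""

UPSTREAM_IMAGES = ["python:3.12-slim", "node:20-alpine", "eclipse-temurin:21"]

def render_dockerfiles(images):
    """Return a {image_name: dockerfile_text} mapping for every upstream image."""
    return {img: TEMPLATE.format(upstream=img) for img in images}

if __name__ == "__main__":
    for name, content in render_dockerfiles(UPSTREAM_IMAGES).items():
        print(f"--- {name} ---\n{content}")
```

The rendered files can be fed straight into `docker build` in a CI matrix job, which also gives you a single place to trigger rebuilds when upstream images publish security updates.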
https://redd.it/1epk2t7
@r_devops
Microservices in an on-premises installer?
Hello, I'd like to ask if any of you could help me with ideas for packaging microservices into a distributable offline installer.
The idea is that the customer installs this on their own server with a couple of clicks, but the end product consists of microservices, so they have to be "installed" and managed somehow. I'm not allowed to use Docker, but I'd still be open to hearing ideas that use it.
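Without Docker, the usual answer on Linux targets is to ship each microservice as a native package (deb/rpm) plus a systemd unit, so the OS handles start order, restarts, and logging, and the "installer" is just a script that installs the packages and enables the units. A hypothetical unit for one service (names and paths are made up):

```ini
# /etc/systemd/system/orders.service -- one unit per microservice
[Unit]
Description=Orders microservice
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/opt/acme/orders/orders --config /etc/acme/orders.yaml
Restart=on-failure
User=acme
# Logs go to journald; view with: journalctl -u orders

[Install]
WantedBy=multi-user.target
```

Dependencies between services can be expressed with `After=`/`Requires=` on other units, which covers most of what an orchestrator would otherwise do for a single-server install.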
https://redd.it/1epnd2y
@r_devops
I might need to "breakout" of DevOps
So, I am a DevOps student and I have been studying for 3 months trying to "break in", and I'm starting to feel discouraged, or more like the field is too "technical" for my interests, so I need advice on what I could do to help (or whether I'm a lost cause).
Context: my background is in medicine, and I have been interested in joining the tech/computer science world since literally finishing my BS in medical professions. I didn't really want to become a doctor like I set out to be maybe 2 years into undergrad, but I still loved STEM and wasn't really sure of my other interests, so I continued and kept my major (I was also young, burnt out, and not interested in spending more time in school). For the past 3 years, I really observed the tech field and found myself genuinely interested in joining (due to other peers making the switch from medicine, the pay, and because I started super basic coding and thought it was pretty fun). Maybe the boosts to the field during COVID sparked my interest too. I enjoy the lifestyle and work-life balance that come with tech, and I prefer working alone rather than in teams.
Now, I am 3 months into studying DevOps. I've learned (but not mastered) Linux, AWS, Ansible, and GitHub, and barely any Docker, but I just feel so lost. I'm not sure if it's my study methods, but I feel like I'm not going to do well in the field or I'll just be super lost (and that scares me!!!!). I have no idea how to answer interview questions, and I feel less passionate about learning and studying since I feel like I'm not really grasping everything "technically". I feel like this field is extremely hard to break into if your technical background was nonexistent before studying, like mine. What do you guys think? Should I take my interests to another tech field?
https://redd.it/1eple0x
@r_devops
Can’t resolve a sonatype vulnerability
Every time I run my pipeline in Azure DevOps, Sonatype flags a vulnerability for a DLL that needs a version update. I update the file to the needed version and then run the pipeline, but the file keeps reverting to the vulnerable version. This file is in the Release folder.
How do I fix the file version permanently? This is a .NET project.
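If that DLL is restored as a NuGet dependency, editing the file in the Release folder will never stick: the build regenerates the output from whatever version the project restores. The version has to be pinned in the project file instead, so every pipeline run restores the patched one. A hypothetical example (package name and versions are placeholders):

```xml
<!-- In the .csproj: pin the dependency to the patched version -->
<ItemGroup>
  <PackageReference Include="Vulnerable.Package" Version="4.2.1" />
  <!-- If the vulnerable DLL arrives transitively via another package,
       adding a direct reference at the fixed version like this
       overrides the transitive version during restore. -->
</ItemGroup>
```

After changing it, commit the .csproj (and lock file, if you use one) so the pipeline picks it up; local file edits never reach the build agent.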
https://redd.it/1epq787
@r_devops
Need Guidance (Student)
Hey there, I am an engineering student and I have a pretty good interest in DevOps. I have started some lectures about AWS; if there is any small amount of guidance you can provide for getting started, that would be helpful.
https://redd.it/1eprn6m
@r_devops
What do you think of platform engineering?
Is it the next generation of DevOps?
https://redd.it/1eps6p9
@r_devops
Is there a book/course/podcast/tool-suite that helped you go from fundamental understanding to mastery?
15-year software developer entering the DevOps space.
https://redd.it/1epsrj2
@r_devops
How do you handle disaster recovery?
As a best practice, let's say.
And on that note, what does HA mean to you?
https://redd.it/1epvu41
@r_devops
Distributed Tracing Weekend Project with Grafana Tempo
🌟 This weekend, I got curious about exploring Distributed Tracing in a Microservice architecture using containers in my Homelab. I decided to dive in and simulate an order processing system with multiple services: order-service, inventory-service, payment-service, warehouse-service, and fraud-service.
🔍 One of the challenges with microservices is debugging when a single request traverses multiple services. It can be tricky to trace and understand what’s happening across the system. To tackle this, I implemented distributed tracing using OpenTelemetry, Grafana Tempo, and Grafana Loki. The setup is designed to provide a seamless way to view traces directly from logs, making it easier to debug and monitor the entire process.
🚀 The project includes Docker Compose with auto-configuration, so you can easily spin it up and explore the architecture yourself. If you're interested, feel free to check out the repo, and don't forget to give it a ⭐️ if you find it useful!
Github Repo:
https://github.com/ruanbekker/grafana-tempo-loki-tracing
https://redd.it/1epww6b
@r_devops
Risks of running 2nd Express server with health check port?
I have a simple app running on a Node.js/Express backend on an AWS EC2 Ubuntu instance, with free monitoring via UptimeRobot (UTR). I decided I didn't want to leave the health check API exposed publicly, so I stood up a second Express instance in my server.js on a second port (4431), and configured port 4431 to host only my health check route. I then locked down access to port 4431 via the Security Group to only the IP ranges owned by UTR (https://uptimerobot.com/help/locations). It all works as intended: UTR can monitor successfully while the port and health check remain closed to the public. Just curious: are there any risks or critical tradeoffs with this approach? Something like "a second Express server drastically increases resource consumption", etc.?
https://redd.it/1epy6cw
@r_devops
DevOps Testing Tools For 2024 Compared
The article discusses various testing tools that are commonly used in DevOps workflows. It provides an overview of the following popular tools for different types of testing (unit, integration, performance, security, monitoring) to help readers choose the right testing tools for their specific needs and integrate them: [9 Best DevOps Testing Tools For 2024](https://www.codium.ai/blog/best-devops-testing-tools/)
* QA Wolf
* k6
* Opkey
* Parasoft
* Typemock
* EMMA
* SimpleTest
* Tricentis Tosca
* AppVerify
https://redd.it/1eq4pyv
@r_devops
Pragmatic scaling of a small self-hosted CI runner fleet
Hi,
I'm managing the self-hosted CI infrastructure for a small software dev team. Mostly, I'm ensuring we have enough runners for the team's needs. We don't host apps; the runners just build code, run tests, etc. (in Docker, so the runners only need Docker).
We have a couple of small servers, and what I had so far was a basic Linux distro plus a homemade script to keep the runners up and running.
Now I'm facing two issues: we're leaving GitLab for GitHub, where a runner can only execute one job at a time, so while parallelizing a dozen jobs was just one simple param in a GitLab config file, now I actually need to instantiate a dozen runners. Plus, the team is growing and the CI does more and more, so I need to be able to add runners from time to time.
Now I'm looking for the most pragmatic way to scale this runner fleet, given that I've never played with k8s, Proxmox, Ansible, and the like. It should be easy to maintain and scale, and not too hard to set up.
I'm thinking about putting Proxmox on each node with 4 VMs, each hosting a runner (set up through a script), but managing all this manually already feels hard.
What are the best options, simple and not overkill yet efficient, given that I don't yet know k8s and we don't have a cluster or anything like that?
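Since the machines already run Docker, one low-ceremony option is to run the runners themselves as containers and scale with Compose. A sketch using a community runner image (`myoung34/github-runner` is one popular choice; the image name, env vars, and token handling are assumptions to verify against its docs):

```yaml
# docker-compose.yml -- one service, scaled to N replicas of a runner
services:
  runner:
    image: myoung34/github-runner:latest
    environment:
      REPO_URL: https://github.com/your-org/your-repo
      ACCESS_TOKEN: ${GH_RUNNER_TOKEN}   # token used to register the runner
      EPHEMERAL: "true"                  # one job per runner, then re-register
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```

Then `docker compose up -d --scale runner=12` gives you a dozen runners on a host, and adding capacity later is a single flag change rather than provisioning VMs by hand.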
https://redd.it/1eq7sqy
@r_devops
HAProxy and hostPort on EKS
Hello everyone,
I have a situation at work where I need to change how we expose our sandboxed environments to clients. Our current infrastructure runs on EKS, and we provision pods to clients on demand using NodePort as the service type, with one publicly exposed node in the cluster acting as the entry point for client connections. We run this setup because all of the client connections are TCP-based, and the person who designed the original infra obviously didn't put much thought into the user base growing or the NodePort range limitation we'd eventually run into (the default 30000-32767 range allows only about 2,768 simultaneous ports).
Now I'm thinking about using the HAProxy ingress controller and hostPort to map client connections directly to pods, but I have no idea how that would work; it's just an initial idea. I'd love to hear solution suggestions and/or pointers on how to start implementing it. All the application pods are TCP-based and I need a dedicated pod for each client; those are the only two constraints.
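For reference, plain TCP passthrough in HAProxy looks like the fragment below; an ingress controller would generate something equivalent from Service definitions rather than you writing it by hand. This is a hand-written sketch with placeholder names, ports, and addresses:

```
# haproxy.cfg -- raw TCP frontend routing one listening port to one client's pod
frontend client_a_tcp
    bind *:9001
    mode tcp
    default_backend client_a

backend client_a
    mode tcp
    server pod1 10.0.12.34:8080 check
```

The per-client-port model scales past the NodePort range but still burns one port per client, so it's worth also evaluating a TCP load balancer (e.g. NLB) per client or SNI-based routing if the protocol allows TLS.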
https://redd.it/1eqayb4
@r_devops
Organizing & minimizing cloud costs in AWS
We're running a bunch of workloads on AWS across different accounts. Every now and then (usually when we have a big spike in expenses), we find ourselves trying to figure out where our main expenses are coming from, what kind of workloads are currently running and wasting money, and whether we have redundant workloads that we should get rid of.
In general, we are constantly trying to add tags to workloads and educate the team to add the relevant tags to any workloads they start (often developers starting EC2 machines, snapshots, SageMaker pipelines, S3 buckets, etc.).
Needless to say, sometimes people don't add tags at all or don't add the appropriate ones, and sometimes people leave their expensive instances running idle over the weekend.
How do you handle monitoring your workloads (what asset belongs to what project), tracking expenses, keeping redundant workloads to a minimum, and generally maintaining good hygiene so that a lot of money isn't spent unnecessarily?
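Tag hygiene gets much easier once it's automated: a scheduled job lists resources and flags anything missing the required tags, instead of relying on people remembering. The core check is trivial; a sketch of it in Python (the tag schema and resource shape here are made up, and in practice the `resources` list would come from `boto3` describe calls or the Resource Groups Tagging API):

```python
# Required tag keys every billable resource must carry (illustrative schema).
REQUIRED_TAGS = {"project", "owner", "environment"}

def find_untagged(resources):
    """Return (resource_id, missing_tags) for every resource missing required tags."""
    violations = []
    for res in resources:
        tags = {t["Key"].lower() for t in res.get("Tags", [])}
        missing = REQUIRED_TAGS - tags
        if missing:
            violations.append((res["ResourceId"], sorted(missing)))
    return violations

# In practice: feed this from e.g. ec2.describe_instances() on a schedule and
# post violations to Slack, or escalate to auto-stopping untagged instances.
```

AWS also has managed versions of this idea (tag policies in Organizations and the `required-tags` AWS Config rule) that are worth evaluating before building your own.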
https://redd.it/1eqf1h3
@r_devops
New to Devops - How do I find where our grafana instance is installed in our EKS cluster?
Good day folks. I was tasked with troubleshooting a Grafana/Loki issue, but I don't know where to start. I looked at our console and tried to verify that the Loki data connection was good to go, but it isn't: it can't reach the resource. I was told that it worked once upon a time and then stopped a few weeks ago. I didn't configure Grafana or Loki myself, so I don't know the details.
At this point I am just trying to find where the Grafana/Loki configuration is located. The lead for our particular section of the project is out sick. And even when he gets back, I hate asking him stuff like this, because I get the notion that 1. he feels like I should know it already, or 2. he just hates being bothered. He never voiced this, but his tone isn't really inviting lol.
I have been a systems admin for quite a while, and just this year I got the opportunity to get deep into DevOps. So, sorry if my responses aren't as educated as one would expect lol. Our environment seems very intricate, and not only me but a few other new hires with 10-15 years in IT are saying the way these guys are going about getting us accustomed to this environment isn't optimal lol.
Thanks in advance.
https://redd.it/1eqgonz
@r_devops
Where to store Py script run as part of GH Actions Workflow?
I have a GitHub Actions workflow which orchestrates Terraform resources across a few different platforms. One step of the process runs a Python script which queries one of our platforms for a few key pieces of info before appending them to tfvars. Currently that script lives in the module root folder. This is part of a template that is cloned to create multiple services, so each repo has its own copy of the script, which is probably bad practice: we'd have to update each individual script in each repo if we ever make changes. What is the best way to make this script available to the workflow as a single source of truth?
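GitHub Actions has two built-in answers to this: publish the script as a composite action in its own repo, or simply check out a shared tooling repo as a second checkout step. The second is the smaller change; a sketch (the repo name, tag, and script path are placeholders):

```yaml
steps:
  - uses: actions/checkout@v4          # the service repo itself
  - uses: actions/checkout@v4          # shared scripts repo, pinned to a tag
    with:
      repository: your-org/pipeline-tools
      ref: v1
      path: tools
  - run: python tools/scripts/build_tfvars.py
```

Pinning `ref` to a tag gives you controlled rollout of script changes across all service repos; for a private tooling repo you'd also pass a `token` with read access.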
https://redd.it/1eqgpp8
@r_devops
Sharing a Kubernetes + Azure Key Vault Integration Guide - Feedback Welcome!
Hey r/DevOps community,
I've been working on integrating Kubernetes with Azure Key Vault using OIDC, and I thought I'd share what I've learned in case it's helpful to others.
I've put together a detailed video guide (about 2 hours long) covering:
Setting up a K3d cluster
Establishing OIDC trust between Kubernetes and Azure Key Vault
Implementing the External Secrets Operator
Practical secret management in a production-like environment
Here's the link: https://youtu.be/JFJJWB7neIg?si=auHt3HF0wqZT5ZC7
I'm not here to promote myself, just to share knowledge. I've learned so much from this community, and I hope this can give something back.
If you do check it out, I'd be incredibly grateful for any feedback, corrections, or suggestions for improvement. There's always more to learn, and I'm sure many of you have tackled similar challenges in different (probably better) ways.
Some questions I'm particularly interested in:
Have you faced any specific challenges with Kubernetes-Azure integration that aren't covered here?
Are there any best practices or security considerations you think are crucial for this kind of setup?
How do you handle secret management in your organizations?
Whether you watch the video or not, I'd love to hear your thoughts and experiences on this topic. Thanks for being such a great community for learning and sharing!
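For readers who just want the shape of the External Secrets Operator piece from the video, the core object is an `ExternalSecret` that maps a Key Vault entry to a Kubernetes Secret. A minimal sketch (store name, vault key, and target names are placeholders; the matching `SecretStore` with the Azure/OIDC config is described in the ESO docs):

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: azure-kv-store      # a SecretStore configured for the Key Vault
    kind: SecretStore
  target:
    name: db-credentials      # the Kubernetes Secret ESO will create/sync
  data:
    - secretKey: password
      remoteRef:
        key: db-password      # the secret's name in Azure Key Vault
```

The operator then keeps the Kubernetes Secret in sync with the vault on the refresh interval, so rotation happens without redeploying workloads.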
https://redd.it/1eqj2gl
@r_devops
EKS for dev teams
I got a task to build an EKS cluster for software developers. While the EKS setup is clear, I have a question: what would developers prefer for deploying their stuff (including observability, logs, etc.)? I am looking at tools like ArgoCD, but I've also heard some not-so-favorable comments about it. Some prefer pure pipelines, but in my opinion still being able to see "something" in the cluster is nice.
https://redd.it/1eql1s9
@r_devops