Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Windows for work?

Hi there!

I've been working in the DevOps field for two years. I've been using macOS, and it's been working well for me, and I've enjoyed it. However, I'm about to start a new job.

I've been given a Windows laptop, and I'm aware it can cause issues, like the base64 encoding in Terraform (for instance, if the VM is generated with Linux, it can make you recreate the machine).


What would you recommend I do?

On one hand, I feel it's not right to change the OS on a company-issued laptop. But on the other hand, I'm not sure if Windows will be able to handle the job...
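One plausible cause of the base64 issue mentioned above is line endings. This is an assumption, since the post does not say what differs. A script saved with Windows CRLF line endings encodes to a different base64 string than the same script with LF endings, and a tool comparing the two would see a change. A minimal sketch, with made-up script contents and hypothetical function names:

```python
import base64


def encode_user_data(script: str) -> str:
    """Base64-encode a startup script, roughly as a provisioning tool might."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


def normalize_newlines(script: str) -> str:
    """Convert Windows CRLF line endings to Unix LF."""
    return script.replace("\r\n", "\n")


# The same script as it might be checked out on Linux/macOS and on Windows.
unix_script = "#!/bin/bash\necho hello\n"
windows_script = unix_script.replace("\n", "\r\n")

# The encoded values differ, so a diff on the encoded value reports a change.
print(encode_user_data(unix_script) == encode_user_data(windows_script))

# Normalizing line endings before encoding makes them match again.
print(encode_user_data(unix_script)
      == encode_user_data(normalize_newlines(windows_script)))
```

If line endings do turn out to be the cause, pinning them in the repository (for example a `.gitattributes` entry such as `*.sh text eol=lf`) may be enough to avoid it without changing the OS.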

https://redd.it/1ek5cbk
@r_devops
Backend dev here. How many Dockerfiles and docker-compose files should a microservice have?

I am learning about CI/CD and I would like to know if the same Dockerfile and docker-compose file should be shared across testing, staging, production, or if you need multiple files for these?

https://redd.it/1ekfhlh
@r_devops
Is there any open source replacement for the full MuleSoft Studio web-based platform which is free to use?

Hello,

I need to test complex API endpoints and would like to find a way to quickly build such tests. It can be pure HTTP, REST, WebSockets, web services, or plugin-based and extendable. I am aware of some minimalistic and crippled versions of Mule, but I need one with a web-based GUI.

Thanks.

https://redd.it/1ekh534
@r_devops
How do you manage a "large" amount of docker environments and containers?

I did not want this.

We just produce the software for our customers and deploy it manually, or with the tooling of the customer's choosing (like their Jenkins), on servers that they control. That's because access is secured via VPN (and/or the server is 'managed' by another provider), so our Jenkins instance doesn't have access to the customer's systems for deployment.

Yes, we're using Jenkins. Yes, our customers don't care if their services aren't available for 2 days.

The bar is so brutally low, you won't believe it. Monitoring for PROD? Nonono, only if the customer wants it and pays for it (which, I mean, makes sense).

Now we have over two dozen servers to manage (seven of them are our customers') and I don't even know how many containers, running on Docker.

Every container gets its own folder for its volumes, the .env file and the docker compose file.

One service per file. On every server.

If we want to deploy a new version (automatically), we use Jenkins to run a script or to directly replace the VERSION variable and then run the compose.

* GitOps? Nah, what if someone changes the config on the server? (wtf) I have to save/back up the configs MANUALLY (really funny if I have to edit 20 f\*\*\*\*\* compose files).
* Secrets? PLAINTEXT.
* Docker Swarm (for the secrets)? Isn't compatible with Spring - Tomcat hates the swarm host naming convention.
* When we decide that we have to do xyz another way I have to connect to every goddamn system that exists and DO THE CHANGES MANUALLY.

Whyyyyyyy.

So, now, let's ̵t̸r̷y̴ ̶t̸o̶ ̶s̵m̵i̸l̴e̷ again.

Ok. How do you guys manage - let's say - between 50 and 100 containers (just the beginning) that don't have to scale and are hosted on many different systems?
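The deploy step described above (replacing the VERSION variable in a service's .env file before running compose) could be sketched roughly as follows. This is only an illustration: the function name and the sample file contents are hypothetical, and it assumes the variable appears as a plain `VERSION=...` line.

```python
import re


def set_version(env_text: str, new_version: str) -> str:
    """Return env_text with its VERSION=... line set to new_version.

    If no VERSION line exists, one is appended.
    """
    pattern = re.compile(r"^VERSION=.*$", flags=re.MULTILINE)
    if pattern.search(env_text):
        # A callable replacement avoids backslash/group handling surprises.
        return pattern.sub(lambda _match: f"VERSION={new_version}", env_text)
    separator = "" if env_text.endswith("\n") or not env_text else "\n"
    return f"{env_text}{separator}VERSION={new_version}\n"


# Hypothetical .env contents for one service.
sample = "APP=web\nVERSION=1.0.0\nPORT=8080\n"
print(set_version(sample, "1.1.0"))
```

Applying a change like this to a Git-tracked copy of each service folder, rather than to the file on the server, would at least give the configs a history, which touches on the GitOps and manual-backup complaints in the list above.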



https://redd.it/1eki570
@r_devops
Devops Engineer here, unsure about future

Hi Everyone,



I’ve been working in the DevOps field for about four years, focusing on tools such as Jenkins, Terraform, Kubernetes, and Docker, primarily within Google Cloud Platform. As I look to expand my skill set, I’m considering exploring new areas such as security or data. I’m interested in hearing your thoughts on which direction might be most beneficial for future growth and how best to get started. Any suggestions or advice would be greatly appreciated!



Thank you!

https://redd.it/1ekifox
@r_devops
How do I use Tetragon to get notifications when someone performs actions in our environment?

When I started testing Tetragon I imagined I'd be able to get alerts when someone kubectl exec'ed into a pod and did some things, but it seems like it's not as straightforward.

Tetragon seems to expose a few metrics that I thought would help, like tetragon_events_total or tetragon_policy_events_total, but neither provides any information on what command was executed.

For example, following their setup docs I was able to run cat /etc/shadow, which got a SIGKILL, and that event shows up in the above metrics, but I don't see how I can use this information to get alerts.

Am I doing this wrong? How did you implement this or a similar eBPF tool in your environment?
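One possible direction, since the metrics above carry no command details, is to alert from the exported event stream rather than from the counters. The sketch below assumes events arrive as one JSON object per line, and that process details sit under keys such as `process_exec` / `process_kprobe` with `binary`, `arguments`, and `pod` fields. Those field names are an assumption and should be checked against the events your installation actually emits. The function name and sample lines are hypothetical.

```python
import json
from typing import Iterable, Iterator

# Event kinds to inspect (assumed key names, verify against real events).
EVENT_KINDS = ("process_exec", "process_kprobe")


def suspicious_execs(
    lines: Iterable[str],
    watched_args: tuple = ("/etc/shadow",),
) -> Iterator[dict]:
    """Yield a small summary for each event whose arguments mention a watched path."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(event, dict):
            continue
        for kind in EVENT_KINDS:
            payload = event.get(kind)
            if not isinstance(payload, dict):
                continue
            process = payload.get("process") or {}
            arguments = process.get("arguments", "")
            if any(watched in arguments for watched in watched_args):
                pod = process.get("pod") or {}
                yield {
                    "kind": kind,
                    "binary": process.get("binary", ""),
                    "arguments": arguments,
                    "namespace": pod.get("namespace", ""),
                    "pod": pod.get("name", ""),
                }
            break  # only the first matching kind per event is inspected


# Hypothetical sample lines in the assumed format.
sample_lines = [
    '{"process_exec": {"process": {"binary": "/usr/bin/cat", '
    '"arguments": "/etc/shadow", "pod": {"namespace": "default", "name": "xwing"}}}}',
    "not json",
    '{"process_exit": {"process": {"binary": "/usr/bin/ls", "arguments": "/tmp"}}}',
]
for hit in suspicious_execs(sample_lines):
    print(hit)
```

Each yielded summary could then be forwarded to whatever alerting channel is already in use. A log pipeline with alert rules on the same fields would be an equivalent approach.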

https://redd.it/1ekj0cq
@r_devops
How Do You Prefer to Use a CLI Tool?

Hey everyone!

I just made a migration tool that helps you move from Nexus and Artifactory to my new platform, RepoFlow. It's all in TypeScript, and I’m trying to figure out the best way to make it available for everyone.

How do you like to install CLI tools? Would you prefer:

An npm package?
A Docker image?
A yum package?
Or should I just open-source it and let you run it straight from the code?

https://redd.it/1ekl36t
@r_devops
MMORPG/Games streaming architecture help

Hi all,


Lately I have been fairly curious about what the infrastructure of most MMO games and game streaming services like Xbox Game Pass looks like under the hood: how sessions are managed, server provisioning/scaling, etc. Unfortunately, I was able to find little to no reference architecture in that regard. Do you know of any good references/projects/books/articles etc. I could look into to get a better view of how they work?

Thanks in advance!

https://redd.it/1ekkz15
@r_devops
Junior fullstack developer -> appsec or devops?

Hello, I was wondering what a more natural career progression would be for a junior fullstack developer working on a web app. As part of my job I have very limited interactions with the CI/CD pipeline in Azure DevOps, and I was curious to get to know more about it.

This got me a little interested in DevOps, and I was wondering if this was a natural career progression to take. I am also very curious about AppSec, as I'm interested in cybersecurity and do reverse engineering as a hobby (but not reverse engineering malware or anything like that), and I was told that is a valuable skill for AppSec.

As a junior fullstack webdev, what would be a more natural, or even more lucrative, career progression for someone interested in both DevOps and AppSec? I imagine I only have time to go in one direction, right?

https://redd.it/1ekmqt6
@r_devops
What is the best Git branching strategy for managing Ansible CIS (hardening) roles?

We currently have one AWX server and a GitLab instance in our environment to develop and test automation. I was tasked with testing the roles as a proof of concept for multiple OSs/applications (MS SQL servers, web, RHEL 7-9, etc.). Once we knew the roles worked and we were satisfied with our compliance results, our lead said that we needed to build an automated testing process to ensure code quality. We ended up building something that works in theory, but would probably be a disaster to manage in practice unless I can guarantee that our pipeline process is rigidly enforced.

To manage inventory, they put each ENV:OS type in its own file. For example, we have a Dev<type>Server.yml, Test<type>Server.yml, Prod<type>Server.yml, and the same pattern of .yml files in this one repository for any other type of server (RHEL, SQL, etc.) you can think of. Why did we do this? Because we thought we could not keep the inventory file in the same repository as the role, since that repository has three separate branches, one per environment. So now I am able to keep each hardening deployment separated, because there is a .CI file that essentially forces an upstream code promotion pattern: commits are made, linted, tested in the corresponding environment, and merged to the next branch.

But there is literally an inventory file for each environment per OS/server function type living in a separate repository. Each inventory file corresponds to an inventory object in AWX, which we tie to a job template. When a developer makes a commit to the development branch in the role’s repository, we call AWX’s API to launch the development job template (after linting the commit on the role’s development branch). If the development job template runs successfully in AWX, the pipeline creates an MR and randomly assigns it to some reviewers so we can build an audit trail; the next merge then restarts the same .CI process for the upstream environments.

This works fine in theory, but I foresee a situation where we have TONS of job templates in our Ansible server for the same role, one per environment. I am also wondering how we are going to treat each application’s hardening process differently. For example, I think all application teams who use RHEL servers should use a golden hardened image before they even build their app on top, because we are starting to see issues occur when we harden a system that belongs to another team and they say the server is unreachable or something breaks. Having a separate version of the role for each team to satisfy each application sounds horribly unmanageable. I just don’t see how I can maintain separate environments, for each server type, FOR EACH SEPARATE TEAM.
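To make the branch-to-job-template step above concrete, here is a rough sketch of how a pipeline might build the AWX launch call from the branch name. It only builds the request and does not send it. The template IDs, the base URL, and the idea of passing the server type as an extra variable (which would need prompting for variables on launch to be enabled on the template) are all assumptions for illustration, not a description of the setup in the post.

```python
from dataclasses import dataclass, field

# Hypothetical mapping: one job template per environment branch.
BRANCH_TO_TEMPLATE = {
    "development": 10,
    "test": 11,
    "production": 12,
}


@dataclass
class LaunchRequest:
    """What the pipeline would POST to AWX to start a job."""
    url: str
    body: dict = field(default_factory=dict)


def build_launch_request(awx_base_url: str, branch: str, server_type: str) -> LaunchRequest:
    """Build the launch URL and body for the job template tied to a branch."""
    template_id = BRANCH_TO_TEMPLATE[branch]  # KeyError for unknown branches
    url = f"{awx_base_url.rstrip('/')}/api/v2/job_templates/{template_id}/launch/"
    # The server type travels as a variable instead of a separate template.
    return LaunchRequest(url=url, body={"extra_vars": {"server_type": server_type}})


request = build_launch_request("https://awx.example.com/", "development", "rhel")
print(request.url)
print(request.body)
```

Keeping the server type as a launch-time variable is one way the number of templates could stay proportional to the number of environments rather than environments times server types. Whether that fits depends on how the inventories are split.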

https://redd.it/1eknzsl
@r_devops
Greetings fellow newly unemployed people. How can we apply to jobs more efficiently?

A lot of the popular auto-complete forms are absolute trash. There must be a better way.

https://redd.it/1ekoef4
@r_devops
Branching strategy and environments.

I'm a little confused about how branching strategies relate to environments for development, testing, and production. Can someone explain to me how they do it in practice?

https://redd.it/1ekq3de
@r_devops
Supercharge Monorepo CI/CD: Unlock Selective Builds

Hey DevOps community,

I've been battling with slow CI/CD pipelines in our monorepo setup for months, and I finally found a solution that's been a game-changer for us. Thought I'd share in case anyone else is pulling their hair out over this.

TL;DR: Implemented selective builds in our monorepo, and it's cut our build costs by ~70%.

I wrote up a detailed guide on how we did it, including:

- The concept behind selective builds
- How to implement it using GitHub Actions and Redis
- Code snippets and real-world examples
- Pitfalls we encountered and how to avoid them
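As a rough illustration of the selective-build concept (not the implementation from the linked guide, which uses GitHub Actions and Redis), the core decision can be sketched as mapping changed files to the services that own them. The sketch assumes each service lives in its own top-level directory and that a change under a shared directory rebuilds everything. The directory names and the function name are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Set


def services_to_build(
    changed_files: Iterable[str],
    services: Set[str],
    shared_dirs: frozenset = frozenset({"libs"}),
) -> List[str]:
    """Return the sorted list of services affected by the changed files."""
    affected = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        if not parts:
            continue
        top_level = parts[0]
        if top_level in shared_dirs:
            # Shared code changed: every service may be affected.
            return sorted(services)
        if top_level in services:
            affected.add(top_level)
    return sorted(affected)


# Changed files could come from something like `git diff --name-only`.
print(services_to_build(["api/main.py", "README.md"], {"api", "web"}))
print(services_to_build(["libs/common/util.py"], {"api", "web"}))
```

Files outside any service or shared directory (such as the README here) trigger no build at all, which is where the savings would come from.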

It's not a silver bullet, but it's made a huge difference for our team. If you're dealing with monorepo headaches, especially in larger codebases, you might find this useful.

https://developer-friendly.blog/2024/08/05/supercharge-monorepo-cicd-unlock-selective-builds/

Happy to answer any questions or hear about your own monorepo war stories. What's worked (or spectacularly failed) for you?

https://redd.it/1ekszx8
@r_devops
Noob here. Should I build my project source code into an executable in my Dockerfile? Or should I copy the executable from host machine into container directly?

I am asking because I want to know what the best practice is and, most importantly, why:

1) Copy source code into the image and build the program executable there
2) Copy the executable directly from the host machine into the image (skip build)

What is best? And why? Thanks!

https://redd.it/1ekrnag
@r_devops
A Blockchain ETL and efficient data pipeline management webinar

Blockchain ETL has unique challenges for DevOps teams managing data pipelines. This webinar explores practical solutions and best practices for handling blockchain data at scale.

Webinar: Optimizing DevOps for Blockchain ETL Pipelines

Date: August 8th, 12 PM EDT

Topics:

1. Blockchain data architecture for high-throughput systems
2. Containerization and orchestration strategies for blockchain nodes
3. Monitoring and alerting for blockchain-specific metrics
4. CI/CD pipelines for blockchain data services
5. Live demo: Real-time blockchain data synchronization and indexing

Speakers:

Andrei Terentiev, CTO of [Bitcoin.com](https://Bitcoin.com)
Seb Melendez, ETL Software Engineer at Artemis

Key takeaways:

Strategies for maintaining data consistency across distributed ledgers
Performance tuning for blockchain data ingestion and processing
Security considerations in blockchain data pipelines
Q&A session addressing DevOps-specific blockchain challenges

Target audience: DevOps engineers, SREs, and technical leads working with blockchain infrastructure

Registration: Webinar Registration Link

https://redd.it/1ekusu0
@r_devops
RESUME REVIEW

Hello Everyone,

I need some feedback on my resume. I created it with a specific focus on achievements and improvements at the product/business level.

In particular, I need serious suggestions for point number 3 under the work experience section. I want to highlight my achievement of adding KEDA to the entire data warehouse pipeline, which significantly improved data processing efficiency. However, I'm struggling with how to word this effectively as an achievement in 2 lines to match the theme of the overall resume.

If you have any suggestions, please share them as they will help me a lot.

Thanks!



https://imgur.com/a/ec9Gptt

https://redd.it/1ekzo6c
@r_devops
New boss says I should be OK with being on call every other week

Had an interesting conversation with my new boss today that I'd love to get some perspective on. I work on a two-person DevOps team supporting an application used in a critical role by some fairly large players in the transportation industry. This is an application that has SLAs with associated financial penalties, and to be honest, I think our customers expect that we have more invested in our operational capabilities than we actually do, considering how little revenue we make a year from the whole thing.

Currently, a junior engineer and I split an on-call rotation that I set up 'voluntarily'. Previously, our alerts were just coming in to email or SNS, which obviously wasn't effective, so, not having an easy way to get phone alerts, I set up a free PagerDuty account. Thus began our 26 weeks each of 'official' on-call a year, for which I am the escalation point, so functionally speaking I've been on call 24/7/365 for the last few years. This has led to some pretty great uptime compared to what things were looking like previously, but I never had a formal conversation about what should be expected of me in regard to on call.

This past Saturday, we had an issue where a pet reporting service (Jasper Reporting Server, biggest pain in the ass ever, I do not recommend) that had recently been updated to a new version became unresponsive due to a thread issue, and unfortunately it was not detected before a support ticket was raised. My co-worker wasn't available when support contacted her, and I was out for a walk and didn't have my phone, so users were unable to generate reports for about 3 hours until I was back home.

This incident prompted a retrospective today where I raised the point that we needed an incident response strategy in place for these types of situations, because it was unreasonable to expect two people to split an on-call rotation like this and say to our transportation customers that we're taking incident response seriously. I personally want to open up the on-call rotation to the development team as well and roll out some runbook automation for common tasks (such as restarting a service, although my boss was incredulous that I'd have to train people to do this). I can still be an escalation point, but I don't need to be, and cannot be, on call 24/7.

My boss responded by making what I perceived to be a kind of shitty comment: that two people managed the DevOps program at his previous job and that being on call, even every other week or all the time, isn't that big of a deal. It was kind of a shitty comment because the way it was said implied that we're lesser than the two people he worked with previously, that being lesser engineers is why we have more operational issues, and that the only reason we don't like on call is because of our own problems. There was a lot to unpack in that statement, especially given that I am on a team with a non-existent tooling budget, but whatever, I won't get sour because of some difficult talk after what was basically an undetected service outage.

However, I do not personally agree with his position that being on call every other week is acceptable. Having to plan to have a laptop with me is a non-trivial thing, and the stress of knowing I could get an alert while I'm out at dinner is a lot, even if I don't get 'that many' alerts. I'm curious what other people's thoughts are on frequent on call for small teams?

It's probably time for me to move on (I've already wasted too much time not learning Kubernetes), but I wasn't sure if I was overreacting to his position about on-call because of the perceived slight.

TL;DR: Is expecting someone to take an on-call rotation every other week reasonable, given that they're on a two-person team with one person being significantly more junior?


*edit* we are not compensated for on call hours worked outside of our yearly salaries

https://redd.it/1el1bfq
@r_devops
Flyway with Jenkins

Has anybody here tried using this stack before? How was your experience? Does anyone have a use case I can use as a reference? We're currently trying out Flyway to see whether we can adopt it in our dev environment and whether we should get the subscription. Any insight is appreciated, thanks.

https://redd.it/1el21aa
@r_devops
Configure EC2 in GitHub Actions workflow via SSH or use Ansible?

I'm working on a GitHub Actions workflow, part of which deploys an AWS EC2 instance via Terraform. To configure the EC2 instance for a Node.js application, I could theoretically SSH in or remotely run commands on the instance in the workflow, but is there an advantage to running an Ansible playbook from the Actions workflow instead? One reason that may be in favor of Ansible: it increases the modularity of the pipeline, meaning I could more easily port it to another workflow or even another CI/CD platform (Jenkins, etc.), as the Ansible playbook is agnostic to the CI/CD platform on which it runs. Any other thoughts?
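To illustrate the portability point, the configuration step can be reduced to a single command that any CI platform could run once the instance address is known. A minimal sketch, where the host, user, key path, and playbook name are all placeholders and the helper function is hypothetical:

```python
from typing import List


def ansible_command(
    host: str,
    playbook: str,
    user: str = "ubuntu",
    key_path: str = "~/.ssh/id_rsa",
) -> List[str]:
    """Build an ansible-playbook invocation targeting a single host.

    The trailing comma after the host makes Ansible treat the value as an
    inline inventory list rather than a path to an inventory file.
    """
    return [
        "ansible-playbook",
        "-i", f"{host},",
        "-u", user,
        "--private-key", key_path,
        playbook,
    ]


# Placeholder values; in a pipeline the host would come from Terraform output.
command = ansible_command("203.0.113.10", "configure_node.yml")
print(" ".join(command))
```

The same list could be passed to `subprocess.run(command, check=True)` from a job on any CI system, so only the step that obtains the host address is platform-specific.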

https://redd.it/1el1ryf
@r_devops
Careers after DevOps - experience or suggestions?

The economy is awful, and "DevOps Engineer" covers a stupidly wide range of roles that are almost impossible to fulfil. So what are good exit careers after DevOps?

obviously development (if your programming skills are up to scratch)
what else?



https://redd.it/1elav9p
@r_devops