Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Is this job really worth the salary?

I'm UK based so can't speak for the rest of the world.
In the UK, I am seeing many DevOps jobs going for around the 50k - 60k mark. Some positions are even lower than this!
I know there are positions which rise higher than this, and senior roles can even go a good bit higher. But, my question is, do you guys think it's worth it?

I am just shy of 2 years into a DevOps role. I worked as a dev for a year beforehand and, very luckily, managed to upskill and become a DevOps engineer. However, my salary is 36k, up from 34k as a developer. But the stress I feel as a DevOps engineer is 10x worse than when I was a developer.

I work in a consulting firm, so maybe this is part of it. But I have customers messaging me non-stop all day. Things are faulty here and there, and tickets are raised all over the place. I'm constantly meeting with clients to discuss SOWs, project updates, etc. I know my pay will bump up, but even at 50k+ it seems like a whole lot of responsibility and knowledge you need to have!

Comparing myself to my friends who are still part of the developer team: by no means are their jobs relaxed, but I feel like it's so much more peaceful than the constant bombardment I am under.
Granted, I know a lot of this is due to me having to learn a whole bunch of new things and be there to answer questions over a whole range of topics. But when you compare the knowledge expected of you with other professions that can get to 50k... it is quite wild.

https://redd.it/1e5m1dy
@r_devops
How does your org handle dozens of tool versions across dozens of repos?

I'm curious about how everyone handles versioning across multiple repositories in their organizations. At my company, we deal with a mix of third-party and home-spun container images, machine images, software packages, software libraries, Terraform providers, Terraform modules, Helm charts, Kubernetes CRDs, Crossplane provider packages, and probably a few other things I'm forgetting.

With so many moving parts, we're trying to nail down a solid approach to versioning. For instance, when we have a Dockerfile in a repo that downloads a specific version of a tool, is it better to keep that tool updated regularly or stick to the pinned version that's known to work?

Where do you all draw the line between "maintain this version and try to keep it up to date as much as possible to get security and bug fixes" and "this pinned version is working, so let's not mess with it until necessary"? We're struggling with finding this line because we have so many versioned tools, but only 6 people on our platform team to manage it all.

I'm really interested in hearing how other teams handle this. Do you have specific policies or best practices that you follow? Are there tools or methods that help you manage and track versions efficiently? How do you deal with dependencies and ensure everything remains compatible?
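
For the Dockerfile part of the question, a first step could be simply knowing which base images are floating and which are pinned. Here is a minimal Python sketch of that inventory, assuming plain `FROM` lines; the function names `parse_from_line` and `floating_images` are made up for illustration and are not from any existing tool:

```python
def parse_from_line(line: str):
    """Split a Dockerfile FROM line into image, tag, digest and stage name."""
    parts = line.split()
    if len(parts) < 2 or parts[0].upper() != "FROM":
        return None
    upper = [p.upper() for p in parts]
    # "FROM image AS name" declares a build stage called "name".
    stage = parts[upper.index("AS") + 1] if "AS" in upper[:-1] else None
    # Skip flags such as --platform=linux/amd64.
    refs = [p for p in parts[1:] if not p.startswith("--")]
    if not refs:
        return None
    name, _, digest = refs[0].partition("@")
    tag = None
    # Only a colon in the last path segment is a tag; "localhost:5000/x" has none.
    if ":" in name.rsplit("/", 1)[-1]:
        name, _, tag = name.rpartition(":")
    return {"image": name, "tag": tag, "digest": digest or None, "stage": stage}


def floating_images(dockerfile_text: str) -> list[str]:
    """Return base images that are not pinned to a specific tag or digest."""
    stages = set()
    floating = []
    for line in dockerfile_text.splitlines():
        ref = parse_from_line(line)
        if ref is None:
            continue
        refers_to_stage = ref["image"] in stages
        if ref["stage"]:
            stages.add(ref["stage"])
        if refers_to_stage or ref["image"] == "scratch" or ref["digest"]:
            continue
        if ref["tag"] in (None, "latest"):
            floating.append(ref["image"])
    return floating
```

Running `floating_images` over every Dockerfile in a repo gives a list to review. The sketch does not resolve `ARG` substitutions in `FROM` lines and says nothing about tools downloaded inside `RUN` steps, so treat it as a starting point only.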


https://redd.it/1e5ohvf
@r_devops
GitHub Actions reusable templates not really that usable?

Hi all!

What am I missing? I wanted to make a reusable workflow and then use it a bunch of times, passing different inputs.

Of course, I would get those inputs from something like env vars, inputs to the workflow, or any such thing. But that does not seem to be possible, because in the context of 'uses' none of those are available. See this link: https://docs.github.com/en/actions/learn-github-actions/contexts#context-availability. Am I using this thing wrong? I have used things like Azure DevOps pipelines and others, each with their own quirks... but this seems very strange to me, unless I am just trying to use a hammer to drill a hole, in which case I would gladly hear where the drill is!

https://redd.it/1e5msj7
@r_devops
Terminating Elegantly: A Guide to Graceful Shutdowns

For applications deployed in orchestrated environments (e.g., Kubernetes), graceful handling of termination signals is crucial.

I prepared this repo to demonstrate how to do it in Go/Kubernetes to make sure there is no loss of requests/data - https://github.com/plutov/packagemain/tree/master/graceful-shutdown
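
The linked repo is in Go. Purely as an illustration of the same idea (not the repo's code), here is a minimal Python sketch: the signal handler only sets a flag, the worker stops taking new jobs once the flag is set, and the job already in flight runs to completion. The class name `GracefulWorker` and the simulated job are assumptions made for the example.

```python
import signal
import threading
import time


class GracefulWorker:
    """Processes jobs until asked to stop, then finishes the job in flight."""

    def __init__(self):
        self.stop_requested = threading.Event()
        self.completed = []

    def request_stop(self, signum=None, frame=None):
        # Keep the handler tiny: just record that shutdown was requested.
        self.stop_requested.set()

    def install_signal_handlers(self):
        # Kubernetes sends SIGTERM first and SIGKILL only after the pod's
        # termination grace period has elapsed. Call this from the main thread.
        signal.signal(signal.SIGTERM, self.request_stop)
        signal.signal(signal.SIGINT, self.request_stop)

    def run(self, jobs):
        for job in jobs:
            if self.stop_requested.is_set():
                break  # stop accepting new work
            self.process(job)  # a started job is never interrupted
        return self.completed

    def process(self, job):
        time.sleep(0.01)  # stand-in for real work (assumption)
        self.completed.append(job)
```

In a real service you would call `install_signal_handlers()` once at startup and make sure the remaining work fits inside the grace period configured for the pod.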

https://redd.it/1e5sdp4
@r_devops
What was the most challenging bug you ever fixed?

What's the most challenging bug you ever fixed?

Share your toughest debugging stories! 🚀

https://redd.it/1e5pffl
@r_devops
I am so baffled

Can someone explain to me what DevOps is? I am starting a DevOps role this September, which is a 4-year apprenticeship with uni, but I have nooo idea what DevOps is. I've been reading the thread and see a lot of different versions of what responsibilities they have. Is DevOps a support role like IT where you’re just a fixer? Am I working on the cloud using AWS? What the flip is a docker? Is DevOps a mix between being a support engineer and a cloud engineer? Any help would be appreciated!!

https://redd.it/1e60ae4
@r_devops
We have a "code sync up" meeting after our standup that I find useless..

Some of our devs want to discuss our code more and requested an additional daily meeting an hour after standup..

I kind of got a bit flustered and said something along the lines of..

- We're all senior+ devs here.. if you have an issue just bring it up after standup..

- Put a PR together and i/we can review your code and provide feedback

- 9AM - 10AM standup then 11AM code review standup kills my entire morning. What. the. fuck my dudes.. figure it tf out or gtfo..

- If we're all actually senior devs then we do not need an additional meeting to dive deeper into stories/code.. (honestly i got quite flustered and pushed this point lol)

..and yes i'm actively looking for an internal transfer.. love the company but this team is just odd

https://redd.it/1e5zvd9
@r_devops
Build server specifications

Hi everyone,

I'm planning to set up a new on-premise build server for our development team and could use some advice on the specifications. Here are the key details and requirements:

Project Details:

Type of Projects: A mix of C++ and C#
Number of Developers: Around 15 developers.
Build Frequency: Multiple builds per day, with CI/CD pipelines (AzureDevOps).
Expected Load: Simultaneous builds for different PRs

Current Specifications:

vCPU: VM_1 24, VM_2 24, VM_3 24
RAM: VM_1 24, VM_2 24, VM_3 16
OS: Windows Server

The 3 VMs run in a vSphere cluster under VMware. The physical machine is shared with testing and PO environments. We would like to build a dedicated build server.

Currently the total build process of a PR takes 1 hour. Some applications build on VM_1, some on VM_2 and others on VM_3.


In my wettest dreams I'd love a Docker configuration. Anyway, we would like to bring the PR build time down to 10-20 minutes.

Additional Information:

Budget: Open to suggestions, but looking for a balance between performance and cost-efficiency. They actually asked for 2 tiers: a mid-tier solution and a beefy solution.
Scalability: Should be able to scale with increased load in the future.
Other Requirements: Suggestions for backup solutions, redundancy, or any other considerations would be appreciated.

Any recommendations or experiences you can share would be incredibly helpful.

Thanks in advance!

https://redd.it/1e65bmf
@r_devops
Dependency Track not showing components (and vulnerabilities) for some SBOMs made with Syft

I'm using Dependency Track to monitor for vulnerabilities on multiple systems. I create an SBOM in CycloneDX 1.6 format using Syft and then import the SBOM into Dependency Track. The problem is that for some systems I upload the SBOM and the system accepts it without complaining that something is wrong but then nothing happens. The component list just stays empty and nothing is shown.

For other systems doing the same works just fine.

Any ideas what could be wrong?
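
Two things that might be worth ruling out first (assumptions, not a diagnosis): that the `components` array in the generated file is actually populated, and that its `specVersion` is one your Dependency Track release can parse. A minimal Python sketch that reports both for a CycloneDX JSON document; `summarize_sbom` is a name made up for this example, and it only looks at top-level components, not nested ones:

```python
import json


def summarize_sbom(bom: dict) -> dict:
    """Report format, spec version and component counts of a CycloneDX JSON BOM."""
    components = bom.get("components") or []
    # Components without a package URL are harder to match against advisories.
    missing_purl = [c.get("name", "<unnamed>") for c in components if not c.get("purl")]
    return {
        "bomFormat": bom.get("bomFormat"),
        "specVersion": bom.get("specVersion"),
        "component_count": len(components),
        "missing_purl": missing_purl,
    }


def summarize_sbom_file(path: str) -> dict:
    """Load a CycloneDX JSON file from disk and summarize it."""
    with open(path, encoding="utf-8") as fh:
        return summarize_sbom(json.load(fh))
```

Comparing the summary of an SBOM that imports fine with one that stays empty may show whether the difference is in the file itself or on the Dependency Track side.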

https://redd.it/1e65jfu
@r_devops
9LPA TO 8LPA developer to cloud engineer Pune

Hi friends. I currently earn 9 LPA as a front-end developer with 2.7 years of experience. I want to transition to a cloud/infra role for a good career in the future.

I got an opportunity to interview for a cloud engineer position requiring 2-3 years of experience. I have acquired all the skills the company needs.

The actual company is a big tech giant, but the consultant is telling me that the budget is 8 LPA.

Will this drop in pay make my future better, or should I continue with development?

https://redd.it/1e66mq5
@r_devops
How to get real time work experience as a Devops engineer?

Hello, I'm from Hyderabad, Telangana, India. I completed DevOps training two months ago and have been practicing for 3-4 months. Previously, I worked for 2 years in a finance company and left because of the low salary and increasing targets. Prior to that, I worked in a web hosting company until the COVID-19 pandemic and gained experience working with servers and troubleshooting errors related to hosting and websites.

So after learning about DevOps and considering my previous experience with servers, I joined a training institute to learn DevOps with AWS and Linux. The topics I've learnt are: Git, GitHub, Maven, Jenkins, Nexus, Tomcat, Ansible, Docker, K8s, Prometheus, Grafana, Argo CD, Helm and Terraform.

Now, before applying for DevOps jobs, I want to experience what real-time work looks like. What should I do? I'm ready to work full-time for free in exchange for real-time experience. Any suggestions or help are appreciated. Sorry for the lengthy post.

Thanks.

https://redd.it/1e66g9e
@r_devops
Why deploy Argo components (Workflows, ArgoCD, Events, Rollouts) in different namespaces?

I'm looking to stand up the full Argo ecosystem in a test cluster to try out the full Argo-flavoured CI/CD system. In the Argo docs, each component is deployed in its own namespace:

- ArgoCD -> `argocd` namespace
- Argo Workflows -> `argo` namespace
- Argo Events -> `argo-events` namespace
- Argo Rollouts -> `argo-rollouts` namespace

Why is this?

Wouldn't it make more sense to deploy them all in an `argo-system` ns for example?

If anybody has deployed the different components to a single common namespace, I would love to hear about your experience.

https://redd.it/1e67txc
@r_devops
What are some of your life hacks when it comes to DevOps? Share your tricks

It was only recently that I learnt about Firefox containers. Really cool feature that allows me to have multiple different AWS accounts open at the same time. I used to need different browsers open for this.

Good documentation is also a good one. I try to document pretty much everything I do. That way, whenever I get stuck, I hopefully have a note somewhere that helps.
I also always have a tab open on the far left of my browser for ChatGPT.

Really interested to hear any tips you all have for getting a tiny bit further in your day to day work.

https://redd.it/1e6b4al
@r_devops
Who deploys and manages API Gateway

Folks - I have a question on API gateway usage. Who actually uses API gateways? Who sets them up and manages them? Is it an infrastructure engineer who sets one up and manages it, while devs use it to configure routes?

https://redd.it/1e6asow
@r_devops
What are some practices you follow to reduce cloud infra cost ?


I have been advised to look at cloud cost across AWS and GCP and it’s wild. Is there anything you do that helps you control infra costs? These are demo or test envs, by the way. The resources with the highest cost are:

Compute engines
Cloud filestore
Kubernetes engine
RDS

https://redd.it/1e6ca8v
@r_devops
Triggering alerts for PrometheusRules in a multi cluster setup

We deployed kube-prom-stack in a multi cluster setup where Thanos is deployed on the observability cluster and can see all of the rules we configure. Thanos does this by reading a path we provide it. Just for context we're doing something like this:

Deploy Thanos with Terraform and provide it with a values file:

    ...
    ...
    values = [
      "${templatefile("chart_thanos_values.yml", {
        cluster_name = local.cluster_name,
        ...
        ...
        ...
        alerts = join("\n", [
          for fn in fileset("", "./monitoring-rules/prometheus/*.yml") : file(fn)
        ])
      })}"
    ]
    }

Then in the values file we pass alerts:

    ruler:
      enabled: true
      clusterName: ${env}-ruler
      alertmanagers:
        - https://prometheus-kube-prometheus-alertmanager:9093
      config: |-
        groups:
          ${indent(6, alerts)}


Up until recently we never created PrometheusRules manually; we only defined them in that ./monitoring-rules/prometheus/ folder.

For testing purposes I created a rule manually on our staging cluster. The rule is visible in Thanos (obsrv cluster) and it's even in firing state, but Alertmanager doesn't pick up the rule, so we're not getting the alert.

I'm pretty new to Prometheus and maybe I'm missing something, but how do I make my Alertmanager see these rules? Eventually we're planning on creating multiple rules on different clusters, but either it's not possible with our current config, or I'm just not doing something right.

I tried moving to Grafana Alerts. It sees all of the alerts, including the manual ones, and it sees that they're firing, but I wasn't able to make them fire from Grafana's alert manager. It seems like it's not possible for Grafana to alert on non-Grafana-managed rules.

Any help would be appreciated.

https://redd.it/1e66d4j
@r_devops
Career Advice Network Engineer -> Software / CloudDev / DevOps

Good day,

Looking for advice on the above.

Essentially I am currently in a Helpdesk role with a company and looking at paths to further my career.

Preferably, the end goal would be for a remote position, however, that is not a requirement.

My main current certification is the CCNA, and I am pursuing Cisco DevNet as well.

I've played around a bit with software development in a small number of languages, as well as Docker, which I find rather fascinating. I'm not 100% sure which path would work best for me; I am still researching what each position entails and would, if possible, like to hear from people already in similar roles who wouldn't mind offering some guidance.

I have considered looking into a BSc in Computer Science from the University of London; however, at my current age (31), I'm not sure how feasible that would be.

Any and all advice, suggestions, opinions are welcome.



https://redd.it/1e6gifa
@r_devops
Documentation

Shout out to y'all who spent hours writing those support documentation tasks which will never be read and will stay stashed away in Confluence until the end of time. Peace out homies.

https://redd.it/1e6ii75
@r_devops
Sysadmin here - do you manage your software yourself or let admins do it?

Hello,

Sysadmin here, currently updating software via SCCM, to get rid of some vulnerabilities. I've noticed that a lot of dev & devops users do not update their software (docker, python etc).

Since I'm a sysadmin, I'm more than happy to do it for you in bulk, but I'm aware that developer apps are very delicate and can break when updating.

So my question is - would you rather receive an email giving you a month to update your apps (after that time, it's my time to shine), or do you not care and want admins to do it for you?

I realize the first option may not work, as probably a lot of people would just ignore an email.

All thoughts appreciated, thanks.

https://redd.it/1e6l38a
@r_devops
DevOps for industrial automation - SCADA, PLC controllers and the like (rant and a question)

/ ===== OPENING RANT =====
Hope you enjoy my writing
It provides context for the question
But it is not required to understand it
Skip to the next comment like this if you don't care :(
======================== /

I got hired as the IT team at a small company two weeks ago. I'm not even out of university and I'm already an entire engineering department, cool. We do mainly PV substations, construction and maintenance; but also home automation, power grid connections and the like. Since the company is small (less than 10 people) I also do the gritty industrial stuff, both in the office and on-site, in addition to being a code monkey and the like.

Hailing from the software engineering world, I have a very particular take on the process of creating stuff. I have a nice modern code editor with bells and whistles, variable names are long, I write tests, commit changes to a VCS, run tests, maybe even automate running tests. Sometimes I even automate deployments! There's also the project management side - GitHub issues, projects, checklists, TODOs in code and out of code. Libraries are well documented (usually), or at the very least, I can look at the code.

Imagine the whiplash I got when I opened the SCADA software we use. It's older than me, the documentation is impenetrable (or maybe I just don't get it), and one of the main protocols is broken (though we don't know who's responsible, both implementors blame each other). Support for automating away boilerplate is almost non-existent. Did you know you can use AutoHotkey as an ad-hoc "code" generator? It's really neat! The SCADA uses JS as its scripting language. The engine has probably not been updated since I was born. It does not even have standard types, so you have to learn a custom `String` type. The system also uses a proprietary data format - it's a bunch of XML nodes glued with binary data. You can manually edit that, but only to a limited extent.
Okay, so maybe it'll get better when we get to safety-critical systems. After all, they better not fail. It would be quite unwise to, I dunno, not disconnect a 500 kW PV plant when the protection and management controller loses power, wouldn't it? This is not an edge case, right? You probably don't want the grid to exist in an unmanaged state. Wouldn't it be unfortunate if this specific scenario- who am I kidding, this happened today. No damage done, beyond a reputation hit, because it was done during project hand-off to the owner. Testing is done by finishing the contract and hoping nothing will explode or catch fire. I don't think the editor even supports tests, nor can I really check because it's proprietary software and a second copy has not yet been bought.
Version management does not exist. It's just not a thing. I copy the SCADA design file I'm working on to my local computer, rename it to include the feature I'm working on, and upload it to the main server when I'm done for the day, with a README.txt in which I describe what's been done and what's left to do. I fuck something up? I better remember what I did, or revert to yesterday's copy entirely. Editor history is sketchy at best. Merging changes from two different people to two different things? You wish. One of us will need to manually copy the changes to their project. What changes? God himself could not diff those files and neither can I.
Project management? Done with notebooks. Sometimes. By some people. Usually we just wing it. I don't know what others are doing, and they don't know what I'm doing. I started writing READMEs, but I don't think this will catch on. 


/ ===== QUESTION TIME =====
I apologize for any incoherence
I am currently kinda sick
And also falling asleep
Hope it was enjoyable anyway
========================= /

How would you go about DevOps for industrial automation? I can use Git with a self-hosted frontend for tracking changes, but that's not really enough. Files are in a semi-binary format, so a standard diff isn't the right tool, and merging will basically have to not be done. I'm thinking of rolling a custom tool, specifically for working with those files, both for diffing and merging, but that will require reverse-engineering the file format and also plenty of time. Is there anything that can be done in the meantime? What about testing? I've read a paper about a similar situation, and, inspired by it, I'm considering hooking up the controller to a simulated IO device and either rolling my own or adapting an existing test harness to use real world hardware.
What about deployment? I think it's done rarely enough that doing it manually is fine, but still, not having to do that would be cool. Would something like Ansible work here?
If you have had experience dealing with similar systems, could you share any tips, mistakes you have made and the like?
Feel free to be imaginative - I have very few workflows and people to fight. Securing funding might be difficult, but I'm willing to give wacky ideas a shot.
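
One possible stopgap for the diffing part, short of reverse-engineering the format: pull out only the printable runs (which should include the XML nodes) and diff that text instead of the raw file. A minimal Python sketch, assuming the XML is stored as plain ASCII/UTF-8 rather than UTF-16; the names `printable_runs` and `to_text` are made up for this example:

```python
import re
import sys

# Runs of at least MIN_LEN printable ASCII bytes (tab, space through tilde).
MIN_LEN = 4
PRINTABLE_RUN = re.compile(rb"[\t\x20-\x7e]{%d,}" % MIN_LEN)


def printable_runs(data: bytes) -> list[str]:
    """Return the printable text fragments embedded in a mixed binary blob."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]


def to_text(data: bytes) -> str:
    """One fragment per line, so line-based diff tools can compare versions."""
    return "\n".join(printable_runs(data)) + "\n"


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python extract_text.py design_file > design_file.txt
    with open(sys.argv[1], "rb") as fh:
        sys.stdout.write(to_text(fh.read()))
```

Saved as a script, this can be wired into Git as a `textconv` diff driver (a `diff=<name>` attribute in `.gitattributes` plus `git config diff.<name>.textconv "<command>"`), so `git diff` shows changes to the readable parts while the binary file itself is stored untouched. It does nothing for merging, and changes inside the binary sections stay invisible.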

This isn't really fitting here, so feel free to point me in the direction of more fitting subreddits, but I already wrote an essay so who cares if it's a bit longer. Any suggestions regarding automating some parts of the workflow? AutoHotkey really does help, but it's flimsy. One wrong press of a button and you entered a string of commands that did god knows what. Changing parameters requires opening a code editor and replacing strings. Is there anything short of rev-engineering the custom file format that would allow me to not do the same thing 30 times in a row?

https://redd.it/1e6lq7j
@r_devops