Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Interview advice

I have a technical screening later this week that covers “cloud operations and logging”.

Any advice on programmatic things to keep in mind or any sample interviews in this area would be greatly appreciated.

https://redd.it/ud3aef
@r_devops
vkv: recursively list key-value entries from Vault's KV2 engine in various formats

Hi, I wrote this small utility to view large and nested entries in Vault's KVv2 engine. Maybe it is interesting for you guys. I plan to add more formats, for instance displaying the capabilities of the currently used token on each path and entry:

https://github.com/FalcoSuessgott/vkv

https://redd.it/ud9jcm
@r_devops
There is no such thing as too much logging - or is there?

What’s a modern-day best practice around logging? How do you approach this? I like to log as much as possible; my boss swears there should only be debug and error logs.
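One way to frame the disagreement: how much logging is written into the code and how much is actually emitted are separate knobs. Below is a minimal sketch using Python's standard `logging` module. The logger name `orders` and the messages are made up for illustration, and this is not a claim about what the right policy is.

```python
import io
import logging

# Capture output in memory so the effect of the level setting is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("orders")  # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False              # keep the root logger out of the picture

# Quiet configuration: the code logs a lot, but only errors are emitted.
logger.setLevel(logging.ERROR)
logger.debug("cart contents: %s", ["sku-1", "sku-2"])  # dropped
logger.info("order accepted")                          # dropped
logger.error("payment gateway timeout")                # emitted

# Verbose configuration, e.g. switched on while investigating an incident.
logger.setLevel(logging.DEBUG)
logger.debug("retrying payment")                       # emitted

print(stream.getvalue())
```

The same call sites serve both the "log everything" and the "errors only" camp; the difference is the level chosen at runtime.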

https://redd.it/udgohy
@r_devops
Is DevOps in my environment possible?

My company uses a bunch of different .NET apps, some are APIs hosted on IIS, some are Windows Services. They talk to each other using MSMQ (Windows Message Queue), and SQL Server is used for all data storage. Setting up a new client requires manually configuring all these things.


I'm trying to wrap my head around how I can create CI/CD pipelines for our software/services/database. Is CI/CD only meant for single apps/websites or can it be used for entire environments? Thanks in advance.

https://redd.it/udj8s8
@r_devops
AWS sanity check

I've just discovered something in my workplace's AWS systems that feels unusual to me, and I'm hoping you guys can help me check my sanity.

For non-production environments, we secure our public-facing services so only the people working on them (developers, QAs, stakeholders etc.) can use them. No problems there, I've done that before. But as part of their approach to this, they put the load balancers into private subnets. For production, the load balancers are in public subnets instead.

This feels wrong to me. It means there is a very different architecture in production and non-production environments. It seems to go against the principle of having environments as similar as possible. When I've done this in the past, we've always put our public LBs in public subnets and we secure non-prod environments with a VPN.

Of course this is just based off my own experience and I can't be sure if my worries are unfounded. What do you guys think? Thanks.

Edit: Our service in this case is running in ECS Fargate containers with an Application Load Balancer and RDS.

https://redd.it/udo6tk
@r_devops
If I use release steps in my build pipeline, can I make it show up on the Azure Release page with green dots where it was released?

If I use release steps in my build pipeline, can I make it show up on the Azure Release page with green dots where it was released?

https://redd.it/udvj4o
@r_devops
Why developers hate shift left and related automation buzzterms

Here's a question. I work (in a business, not a technical role) at a company that develops a solution to help engineering teams manage security without the overhead (it helps developers manage the OSS jungle). When I speak with developers, they explain that they resent shift-left buzzwords, that they don't feel shift-left solutions are helping them, and that they don't find the promise of automation encouraging. I wanted to hear your feedback and learn more. Shift left means that engineering is becoming the center, the heart, where everything happens, doesn't it? Automation is always a blessing because it removes manual work, isn't it? What am I missing here?

https://redd.it/ue066o
@r_devops
What are the best cloud-agnostic tools you use?

What are the best cloud-agnostic tools for handling your infra that you have tested and approved?

https://redd.it/udz5bl
@r_devops
Will we move away from DSLs?

We had a discussion about Pulumi last week that got me thinking.

We have so many tools orientated around DSLs (usually based on YAML), such as Ansible, CloudFormation, Dockerfiles and all the different flavours of CI/CD, that integrate poorly with each other and are yet another thing to learn.

Would it not make more sense to just use a general purpose language, such as Python or Go, then have all these exist as libraries/modules?

The advantages being:

- All tools can be easily integrated, in any way you wish. Want to integrate docker build and configuration management? No problem.

- We'd have to learn fewer DSLs, e.g. one way of writing if statements, not several with varying syntax.

- Everything can be tested as a single unit, using tried and tested methods/tooling.
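To make the first and third points concrete, here is a toy sketch in Python. The functions `build_image` and `render_config` are hypothetical stand-ins for what a Docker or configuration-management library might expose; they do not call any real tool, and the image name and version are invented.

```python
def build_image(name, tag):
    """Pretend to build a container image and return its reference."""
    return f"{name}:{tag}"


def render_config(image_ref, replicas, env):
    """Produce deployment settings as plain data, using ordinary if statements."""
    config = {"image": image_ref, "replicas": replicas}
    if env == "prod":
        # One `if` syntax for every tool, instead of one per DSL.
        config["replicas"] = max(replicas, 2)
    return config


def pipeline(env):
    """Build and configure in one program, so one step's output feeds the next."""
    image_ref = build_image("billing-api", "1.4.0")
    return render_config(image_ref, replicas=1, env=env)


print(pipeline("prod"))
```

Because the whole pipeline is an ordinary function, it can be exercised with a normal unit test such as `assert pipeline("dev")["replicas"] == 1`.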

I don't know whether it will actually happen or not, as vendors probably have too much vested interest. It feels like the way things should go, though.

https://redd.it/uekb77
@r_devops
Process metrics of a workflow run of GitHub Action

Hi everyone,

We use GitHub Actions heavily for our operations. Last week, one of our workflows started running 2x slower than expected. The code hasn't changed, and neither has our workflow definition. We couldn't work out why from the metrics that GitHub Actions provides.

One approach is to capture the processes along with their start and end times, arguments, and paths. By "process" I mean every tiny operation performed during the whole workflow execution.

Would it make sense to track the process metrics inside of the GitHub Actions?
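A minimal sketch of the idea in Python, assuming the workflow's commands can be launched through a small wrapper. `ProcessRecord` and `run_timed` are hypothetical names, not part of GitHub Actions, and this only captures commands you wrap explicitly rather than every process on the runner.

```python
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class ProcessRecord:
    """What was run, when it started, how long it took, and how it exited."""
    args: list
    started_at: float  # wall-clock start, seconds since the epoch
    duration: float    # elapsed seconds, measured with a monotonic clock
    returncode: int


def run_timed(args):
    """Run a command and record its arguments, start time, duration and exit code."""
    started_at = time.time()
    t0 = time.monotonic()
    completed = subprocess.run(args, check=False)
    duration = time.monotonic() - t0
    return ProcessRecord(list(args), started_at, duration, completed.returncode)


# Stand-in for a real build command: a trivial Python one-liner.
record = run_timed([sys.executable, "-c", "print('build step')"])
print(f"args={record.args[1:]} duration={record.duration:.3f}s exit={record.returncode}")
```

Comparing such records between a fast run and a slow run would show which command accounts for the extra time.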

https://redd.it/uemoia
@r_devops
Merge static application metrics in Grafana

Hi, I need some advice from the devops community.

I am mostly a software developer with some DevOps tasks too, and I am trying to find the best way to show application data metrics in Grafana.

We have some microservices/backends running in Kubernetes, and I configured them to expose the default metrics to Grafana, which works just fine. The Prometheus framework we are using provides them, and Prometheus scrapes them automatically from the cluster.

Now I want to add some metrics of the application data (values from the database) to Grafana as well. The nature of those metrics is a little different because they do not depend on the application instance; every instance reports the same value. Let's say, for instance, the number of registered users, or the number of certain interactions with the application that result in data saved in the database. I added a background worker that fetches this data from the database and provides it to the Prometheus integration framework we're using. Prometheus scrapes it, but of course it shows up once per instance. I am struggling to find a way to merge those into a single value in the Grafana dashboards. I am currently using a workaround: I added the instance to the Grafana dashboard variables and force the viewer to choose one. I would like to know if there is a better way to do it.

As I see it, I have the following possibilities:

- Build and run an extra pod that only collects those database metrics, and let Prometheus scrape it instead of the app instances. This could lead to duplicate code, since it would use some of the business logic of the actual backend to interact with the database, and it would be another place that needs the database credentials (via Kubernetes secrets).

- Somehow prevent the background worker from running in more than one pod. This could require complicated logic that constantly checks whether one of the pods is currently running the background service and starts or stops it automatically based on the other pods.

- The approach I am currently using: showing data for a single instance in the dashboard, where the user must choose an instance even though it doesn't really matter for the data.

- A kind of workaround using max() in Prometheus queries. This works for simple metrics but gets complicated.

Which would be the best way? Is there some way to do this with a Prometheus query that I missed when reading through the Prometheus docs?
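The max() workaround from the list can be illustrated outside Prometheus. The sketch below models per-instance samples as plain Python data and collapses them by dropping the `instance` label and keeping the maximum per remaining label set, which is roughly what a PromQL expression such as `max without (instance) (registered_users)` does. The metric name `registered_users`, the `job` label and the pod names are made up for illustration.

```python
from collections import defaultdict


def max_without(samples, drop_labels):
    """Group samples by their labels minus drop_labels; keep the max value per group."""
    groups = defaultdict(list)
    for labels, value in samples:
        key = tuple(sorted((k, v) for k, v in labels.items() if k not in drop_labels))
        groups[key].append(value)
    return {key: max(values) for key, values in groups.items()}


# One sample per pod for the same database-derived value.
samples = [
    ({"job": "backend", "instance": "pod-a"}, 1520.0),
    ({"job": "backend", "instance": "pod-b"}, 1520.0),
    ({"job": "backend", "instance": "pod-c"}, 1518.0),  # worker that refreshed slightly earlier
]

merged = max_without(samples, drop_labels={"instance"})
print(merged)
```

Taking the maximum assumes the freshest value is also the largest, which holds for ever-growing counts such as registered users but not for values that can go down.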

https://redd.it/uem5p1
@r_devops
As a working professional, what certificates would you recommend to have?

I am a working professional, I would like to finish some certificate courses so that I can "Jump the Queue" if needed in the eyes of HR if I need to change jobs or ask for higher pay.

I live a modest, minimal life and happen to save a decent chunk to invest back into my career. I know that certificates may not be everything but I think having some may help my prospects for finding something better.

https://redd.it/uepjr9
@r_devops
Do you have personal wikis, websites or blogs full of your notes & documentation you like to share?

I'm kinda obsessed with documenting everything I do and use Zettelkasten to never ever find myself in the situation of "I knew this once, I also wrote it down, but I don't fucking know where to find it" again.
I also love to extract and combine the ideas of others (and shamelessly copy the knowledge of some people).

So, if you're new, experienced, or whatever, do any of you want to share something of that kind?
Like GitHub repos full of your projects and documentation of how you did it, or personal wikis or Obsidian Publish sites full of notes with detailed documentation, concepts or fresh ideas about DevOps?
Maybe you even wrote a book?

If you don't wanna share it with the people here you could also DM me.
I just don't want to miss the crispy experience of indulging in ideas and alien ways of thinking.

https://redd.it/ueswl1
@r_devops
Pipeline security question in interview

I recently had an interview that I wasn’t successful in, I think partly because my response to a security question wasn’t good enough.

The question was to describe the security steps present in a CI pipeline.

I talked about SAST and DAST, and mentioned having policies in place to protect things in deployment (I realised I probably went there because I didn’t feel confident in my response and felt I was missing something).

I am now spending the weekend researching the nuances of the answer, but wondered how you fine people would respond to such a question.

https://redd.it/ueyb3v
@r_devops
Use multiple cloud-init datasources in VMware

I have a little trouble understanding how cloud-init datasources work and hope for your advice.

I deploy a VM using Terraform from the Ubuntu cloud image OVA.
This image supports userdata via parameters which are picked up by cloud-init on boot. This works as expected and solves my problem of bootstrapping a VM.

Cloud-init also has the VMware datasource, which is provisioned by extra parameters on the VM itself. By default, this datasource comes after the OVF datasource in cloud-init. Side note: the VMware datasource is now available in cloud-init by default. This was an extra step in earlier versions.

So now I'm curious how cloud-init should behave when I provision userdata on both datasources. When I debug the available datasources in the VM, both are detected, with OVF in first place, followed by VMware.

In the logs I can see that the OVF datasource is executed. VMware seems to be ignored.

So should cloud-init even consider the VMware datasource if OVF is present?

https://redd.it/uf7bvi
@r_devops
Vault HA mode (OSS) vs Vault Enterprise

HashiCorp Vault Enterprise provides three main features: performance replicas, disaster recovery, and namespaces. My use case does not require disaster recovery, and for performance replicas I can set up Vault OSS with a Consul backend and run many active clusters, which would be the equivalent of a performance replica. Is my understanding correct? Is it feasible to skip the license and still get the same as Vault Enterprise?

https://redd.it/ufdrv4
@r_devops
Are my expectations for candidates too high?

So we’re conducting interviews for a senior DevOps role.

Most of the candidates are good with just tools: they know how to make a declarative pipeline with Jenkins, Kubernetes deployments, daemonsets etc., Prometheus, Grafana.

When we start to talk about systems, most candidates have no idea. For example, when I ask a candidate who holds a CKA certification about the function of kube-proxy, they really have no idea. Is just saying that kube-proxy takes care of networking a good answer? Is expecting a candidate to know which Linux subsystem kube-proxy uses too much?

Is expecting a candidate to know which OSI layer SSH belongs to irrelevant to DevOps?

Some basics about SSH/DNS/HTTP/TLS are essential for any DevOps/systems engineering role imho, but candidates pursuing DevOps as a career lack these.

Sorry for the rant. What I’m coming to ask is: is just knowing how to operate tools, without giving a damn about the internals, good enough? How do you select people?

Edit: I see a lot of engineers don’t like the idea of asking about the OSI reference model. Just to let you know, I don’t cling to OSI and reject candidates for not knowing it. All I’m talking about is a lack of systems basics like DNS, ports, HTTP, etc. I’m sorry if I didn’t make that clear in my initial attempt.

https://redd.it/uffq2u
@r_devops
Out the Womb, Straight into DevOps

I will be joining all you guys as an associate devops engineer after graduation. If there is 1 piece of advice you could give me, what would it be?

What tool or concepts do you wish you knew when starting out? Any advice is much appreciated!

UPDATE: Thank you to everyone who commented! Everyone gave good advice and I really appreciate it!

https://redd.it/uerrww
@r_devops
How important are Design Patterns in a non-SWE role?

I am a self-taught sysadmin who has worked decades in the field and is trying to gain dev competency, although I don't intend to be a full-blown SWE. I know there are massive gaps in my knowledge and I am looking into "Design Patterns" and woah... the rabbit hole is deep. I am starting to think I can't even say I know Python now.

My question is: how important are "Design Patterns" for a DevOps/SRE role? Would you expect someone in this role to know them? Would a job that requires Python knowledge also require a good appreciation of "Design Patterns"?

https://redd.it/uf1ufp
@r_devops