Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Secret Management Across Environments / Vault

My team is growing, and managing secrets is getting out of hand. Several times now, a secret deployed to our integration environment didn't exist in production, and that halted the release. We want to do a release to staging? Forget it... We'll have to wade through all the secrets added since the last deploy.

I was thinking of creating a tool for uploading secrets to our environments that forces you to specify a secret for all environments whenever you upload it to one. Then I realized this is too common a problem, and surely there is a better solution.

1. I started looking into Vault. I'm not sure what to think of it though. I also still don't feel like I'm getting it. It sounds like Vault wants you to deploy an instance of it per environment, instead of having a single instance over all environments. If I have an instance of Vault for every single environment, it seems like Vault doesn't really solve my problem.
2. I'm not an ops guy, but this is going to fall to me to champion it. I'm not really finding a full explanation of how this is all going to work together on GCP.
3. If a secret expires from Vault (because that's a thing apparently?), how does the server get a new value? Is retrieving the new value manual or automatic?
4. Is it OK to deploy Vault to a serverless environment like Cloud Run? This means that the container won't get CPU cycles unless there is an active request. Will this cause me issues?
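
Whatever tool ends up storing the secrets, the release-blocking problem described above (a secret present in one environment but absent in another) can be checked mechanically before a deploy. A minimal Python sketch, assuming you can list the secret names per environment from your store; the environment names and secret names below are hypothetical, and only names are compared, never values:

```python
from typing import Dict, Set


def missing_secrets(envs: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, per environment, the secret names defined elsewhere but absent there."""
    all_names: Set[str] = set().union(*envs.values())
    gaps = {env: all_names - names for env, names in envs.items()}
    # Keep only the environments that are actually missing something.
    return {env: missing for env, missing in gaps.items() if missing}


if __name__ == "__main__":
    # Hypothetical inventory of secret names per environment.
    inventory = {
        "integration": {"DB_PASSWORD", "API_KEY", "NEW_FEATURE_TOKEN"},
        "staging": {"DB_PASSWORD", "API_KEY"},
        "production": {"DB_PASSWORD", "API_KEY"},
    }
    for env, names in sorted(missing_secrets(inventory).items()):
        print(f"{env}: missing {sorted(names)}")
```

Run as a pipeline step, a non-empty result could fail the build early instead of halting the release later.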

https://redd.it/10hipa7
@r_devops
Do you use Intune?

What do you think about Intune? Are you using it? Or do you have specific reasons not to? Are you using something else? Does Intune fit into a complete DevOps solution for your job? I'm curious how common Intune is among people in DevOps.

https://redd.it/10hkl0e
@r_devops
Git merge from development to production

The content of our file in the dev branch:

>server: dev-server
parameters - 200|300

We change the parameters to 200, and that change needs to be moved to production. However, the server portion should not change during the merge. How can I achieve this using Git?

prod configuration

>server: prod-server
parameters - 200|300

I know if we do a merge this will change the server portion as well.

As a side note, we use Git to maintain the server configuration files for a data quality software tool. This repository contains the configuration and files needed for that tool to be deployed properly.

There is a shell script which takes this code from Git and deploys it to the server where the tool is hosted. After deployment and a restart of the server, the changes take effect in the tool.
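
One common way around this kind of merge problem is to stop storing environment-specific values in the merged file at all, and to render the final file at deploy time from shared settings plus per-environment overrides. A minimal Python sketch under that assumption (this is an alternative layout, not a Git feature; the `SHARED` and `PER_ENV` names are hypothetical, while the server names and file format come from the post):

```python
from typing import Dict

# Shared settings that are promoted from dev to prod through normal merges.
SHARED: Dict[str, str] = {"parameters": "200"}

# Environment-specific settings that live outside the merged file.
PER_ENV: Dict[str, Dict[str, str]] = {
    "dev": {"server": "dev-server"},
    "prod": {"server": "prod-server"},
}


def render_config(env: str) -> str:
    """Combine shared and per-environment values into the tool's config text."""
    merged = {**SHARED, **PER_ENV[env]}
    return f"server: {merged['server']}\nparameters - {merged['parameters']}\n"


if __name__ == "__main__":
    print(render_config("prod"))
```

The deploy script would then call something like `render_config` for the target environment, so a merge only ever touches the shared values.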

https://redd.it/10hl665
@r_devops
Containers

I'm pretty new to containers and currently learning. In 2023 and going forward, is it worth learning Docker concepts at all, or even using Docker with an orchestration technology such as Kubernetes? I know that Docker manages containers for apps, while the container technology can be swapped for another one like RKT or CRI-O. Is Docker a solid choice today, for example for greenfield projects? I haven't heard much about RKT or CRI-O. It seems the buzzword around containers is Docker, but I'm here asking you. Any feedback appreciated, thanks!

https://redd.it/10hja1t
@r_devops
Devops or Full Stack Engineer - Career Path

Hi, I'm at a standstill as to what direction I should take my career. I recently got laid off from my technical support role and want to change careers. I have the option to enroll in a very good full-stack coding bootcamp or to do a program preparing me to become a DevOps engineer.

I have a friend who does DevOps and he said it is not a bad job; however, he is going to do something else, as he does not like the odd hours. I also fear that DevOps will change fast in the next 10-20 years. I want something high paying but stable. Please advise. Thanks

https://redd.it/10got1o
@r_devops
🚨 Terraform from 0 to Hero Blog Series

In the following weeks, I will be releasing a series around Terraform with beginner-friendly content that engages juniors and even non-technical people. I am going to take you through my 6-year journey with Terraform and how I believe you should learn it.

The first 3 episodes are already up and you can use this article as a table of contents: https://techblog.flaviusdinu.com/terraform-from-0-to-hero-0-i-like-to-start-counting-from-0-maybe-i-enjoy-lists-too-much-72cd0b86ebcd

Hope this will help beginners get a better grasp on the concepts and on what they should learn in order to get better.

https://redd.it/10hrk2s
@r_devops
junior dev ops here - need to configure Linux and Windows build/dev workstations on demand, for CI/CD pipelines and on-premise developers with special drivers/install processes that sometimes take 2-3 days manually. ML/AI. What tech stacks would you advise for config?

small shop. i'm currently working with devs in Machine Learning/AI and often we need to configure computers that utilize GPU/CUDA manually.

i'm in the process of setting up our build pipelines on gitlab with on prem workstations, but even that is taking quite a bit of time - we need both windows and linux runners and whenever a developer wants to integrate a new tool, we're going into each runner and going through the manual install process - AND ensuring each dev workstation is also updated. it just seems to be getting worse and worse each time and i'm struggling to keep up.

my knowledge of devops is really limited to automated testing/build of applications, and now it's going into IT infrastructure and I'm not exactly sure what tools I should be using. I'm manually installing drivers and configs on each computer (linux, windows) and there are so many areas for human error, or for just losing track of what is installed on what.

on linux, i'm writing these extensive bash scripts that check and install the necessary dependencies (even downloading from our local nas ...) which devs can easily run to update their workstations (or our runners), and I don't even know where to start on windows (the idea of maintaining a separate set of powershell scripts that replicate the same purpose sounds insane to me in the long run).

Am I missing something? What tools should I be looking into?
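
To illustrate the "losing track of what is installed on what" part: one declarative manifest that both Linux and Windows machines check against avoids maintaining two parallel sets of scripts for the audit step. A minimal Python sketch using only the standard library; the manifest contents are hypothetical, and it only reports what is missing from `PATH`, it does not install anything:

```python
import platform
import shutil
from typing import Dict, List

# Hypothetical manifest: required command-line tools per operating system,
# keyed by the value that platform.system() returns.
MANIFEST: Dict[str, List[str]] = {
    "Linux": ["git", "nvidia-smi", "docker"],
    "Windows": ["git", "nvidia-smi"],
}


def missing_tools(manifest: Dict[str, List[str]], system: str) -> List[str]:
    """Return the tools listed for `system` that cannot be found on PATH."""
    return [tool for tool in manifest.get(system, []) if shutil.which(tool) is None]


if __name__ == "__main__":
    for tool in missing_tools(MANIFEST, platform.system()):
        print(f"missing: {tool}")
```

The same file runs unchanged on a runner or a dev workstation, so the manifest becomes the single record of what each machine should have.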

https://redd.it/10ht39r
@r_devops
Fullstack DevOps is real and this is what it really means. And why you're probably not one..

DevOps is just a collaboration between developers and system administrators to help speed up the development process. It's NOT a mindset or culture, as some of the people here like to say. Yes, this closer collaboration can help create a culture, but it's inaccurate to define it as such. True DevOps engineers are highly experienced full-stack developers, meaning they know both the Dev side of things as well as the Ops side. Most people know only Dev or only Ops. It's just that simple.

https://redd.it/10hvcim
@r_devops
Does anyone know the current status of Chick-fil-A’s per-restaurant Kubernetes cluster?

In 2018, CFA published a Medium post describing how they put a Kubernetes cluster in every restaurant to cache IoT events, auth, and a few other things.

Does anyone know if this is still running, and if so, what’s changed since this post?

https://medium.com/@cfatechblog/edge-computing-at-chick-fil-a-7d67242675e2

https://redd.it/10hw3yt
@r_devops
I created an open source secrets manager and Y Combinator just invested in it!

Super pumped to continue working on this and reduce some of the common pain points with secrets management that we devs face. It's end-to-end encrypted like Vault but much easier to use, with a growing list of integrations. Check it out! https://github.com/Infisical/infisical

https://redd.it/10i6ra1
@r_devops
Does trunk-based development still work for mlops and data science / AI heavy teams?

If you google trunk-based development + mlops, you get very few hits. I'm curious whether anyone here works with teams that build and publish machine learning models with decent success using trunk-based development. In the ML teams I've worked with, the predominant model was branch per environment (dev/stage/prod branches), and we all know the challenges that style brings.

The reasoning I was always given was that data science / ml is much messier than pure software dev and therefore doesn't map well. I'm unconvinced.

So it was a surprise to see it recommended as the approach here by a thought leader in the ML world: https://www.databricks.com/explore/data-science-machine-learning/big-book-of-MLOps#page=1.

If you practice trunk based development on an ML team, please can you share how your team does it?

https://redd.it/10i2ixz
@r_devops
Hashicorp terraform on psionline for non-English speakers

I have a question. I've already taken online exams through Pearson Vue, and I know they offer a text chat for people who are not fluent in English.

Does PSI online have the same tool for those taking an online exam without being fluent in English?

https://redd.it/10ic13p
@r_devops
how to automate AWS marketplace publishing with Ansible - A beginner's guide

Hello everyone,

I've been a long-time subscriber to this subreddit, but this is my first post. I recently published an article on automating AWS marketplace publishing using Ansible. If you're new to Ansible or are looking to streamline your AWS marketplace publishing process, this article is for you!

In this article, I cover the basics of Ansible, how to create an EC2 instance, create an Amazon Machine Image (AMI), and how to use Ansible to automate the publishing process on the AWS marketplace.

I also share some tips and best practices for using Ansible to automate your AWS marketplace publishing.

You can find the article here: https://medium.com/@arshad.zameer/getting-started-with-ansible-for-aws-marketplace-publishing-a547cc13d182

I hope the article is helpful to you. If you have any questions or feedback, feel free to comment.

Thanks for reading!

#Ansible #AWS #AWSMarketplace #Automation

https://redd.it/10iaq9a
@r_devops
Salary Sharing Thread January 2023

This thread is for sharing recent offers you've gotten or current salaries.

Please only post an offer if you're including hard numbers, but feel free to use a throwaway account if you're concerned about anonymity.

Education:

Company/Industry:

Title:

Years of technical experience:

Location:

Base Pay:

Relocation/Signing Bonus:

Stock and/or recurring bonuses:

Total comp:

Tech Stack:

The last thread was a huge success, so I'm bringing it back by popular demand.

https://redd.it/10i1hq5
@r_devops
What are your thoughts on Crossplane?

Hello,

I am trying to get into IaC, and it seems there are three options in terms of technologies: Terraform, Pulumi and Crossplane.

I definitely like the Kubernetesque-way of handling things (like Crossplane does).

But my questions are these:

What’s your experience/opinion on Crossplane so far (having in mind the other tools as well)?

Why should one use Crossplane instead of Pulumi or Terraform?

Any opinion or recommendation would be much appreciated.

Thanks

https://redd.it/10iix3j
@r_devops
Automating lambda functions

We have around 20 Python Lambda functions. So far, whenever a function changes, I manually update it in all three environments (dev, UAT and prod), so I am looking for a way to automate this.

The first problem that comes to mind is whether I should create a single repo for all of them or separate repos. I also thought of a single repo with a separate branch for each function. Separate repos would be a pain to manage, and for small functions it seems unnecessary. I prefer a single repo, but I do not want to trigger all the builds when only one function changes. I came across the Git submodules feature, which sounds like exactly what I am looking for, and CodeBuild even has a "Use Git Submodules" toggle, but I do not understand how CodeBuild will know which build to trigger. I am not very clear on this point.
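
One way to keep a single repo without rebuilding everything is to derive the affected functions from the changed file paths. A minimal Python sketch, assuming a hypothetical layout of `functions/<name>/...` and a list of changed paths obtained separately (for example from `git diff --name-only`); the directory and function names are illustrative, not from the post:

```python
from pathlib import PurePosixPath
from typing import Iterable, Set


def changed_functions(changed_paths: Iterable[str], root: str = "functions") -> Set[str]:
    """Map changed file paths to the function directories directly under `root`."""
    names: Set[str] = set()
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        # Expect at least root/<function-name>/<file>; ignore everything else.
        if len(parts) >= 3 and parts[0] == root:
            names.add(parts[1])
    return names


if __name__ == "__main__":
    diff = [
        "functions/resize_image/app.py",
        "functions/notify/requirements.txt",
        "README.md",
    ]
    print(sorted(changed_functions(diff)))
```

A build step could then package and deploy only the functions in the returned set.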

Now, once I have versioning in place, I want to replicate a change across environments. I thought of using SAM/CloudFormation, but how do I change the account number in ARNs? For example, some functions have SNS ARNs in environment variables. How do I change the account ID for each environment?
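
Inside a CloudFormation or SAM template, the `AWS::AccountId` pseudo parameter can stand in for a hard-coded account number. Where ARNs are handled in a script instead, the account ID is simply the fifth colon-separated field (`arn:partition:service:region:account-id:resource`), so it can be swapped per environment. A minimal Python sketch; the function name, account IDs and topic name are hypothetical:

```python
def with_account(arn: str, account_id: str) -> str:
    """Return `arn` with its account-ID field replaced by `account_id`."""
    # Split into at most six fields so colons inside the resource part survive.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    parts[4] = account_id
    return ":".join(parts)


if __name__ == "__main__":
    dev_arn = "arn:aws:sns:us-east-1:111111111111:alerts"
    print(with_account(dev_arn, "222222222222"))
```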

https://redd.it/10ii89f
@r_devops
What can we do better?

I work at a small startup and we offer a system on the web. We currently have 500 subscribers.

Our most pressing issue is that we don't always deliver updates with actual quality and end up hurting our clients in the process.

A few of our latest issues have been:

A guy dropped an index on our database and thought his query to create another index to replace it had run successfully, but it hadn't. Our database was overloaded for about an hour, until he realized his mistake.

A developer was using an ORM to generate queries but one of the generated queries ended up using the wrong date field, which had no index. Many clients reported the system was slow as a result of that.

A front-end developer fixed a bug but ended up bringing back another bug, which had already been fixed before.

I updated our Redis cluster (which changed its hostname) and forgot that there was a pretty important Lambda function which used that hostname. Only found out a week later.
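
For the dropped-index incident, one guard is to create the replacement first and drop the old index only after confirming the new one exists. A sketch using Python's built-in `sqlite3` purely as a stand-in (the post does not name the database engine, and the catalog query differs per engine); the table and index names are hypothetical:

```python
import sqlite3


def index_exists(conn: sqlite3.Connection, name: str) -> bool:
    """Check SQLite's catalog for an index with the given name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def replace_index(conn: sqlite3.Connection, old: str, new: str, create_sql: str) -> None:
    """Create the new index first; drop the old one only once the new one is confirmed."""
    conn.execute(create_sql)  # raises on failure, so the old index is never dropped
    if not index_exists(conn, new):
        raise RuntimeError(f"index {new} was not created; keeping {old}")
    # Identifiers cannot be bound as parameters; `old` must come from trusted code.
    conn.execute(f"DROP INDEX {old}")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, created_at TEXT)")
    conn.execute("CREATE INDEX idx_orders_id ON orders (id)")
    replace_index(
        conn,
        "idx_orders_id",
        "idx_orders_created_at",
        "CREATE INDEX idx_orders_created_at ON orders (created_at)",
    )
    print(index_exists(conn, "idx_orders_created_at"))
```

The ordering is the point: the table is never left without a usable index, even if the create step fails.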

Our main system is in Java, with a mix of Spring and Struts. It's also pretty monolithic. We're currently in the process of migrating most of our SSR pages to Angular and we're also making our back-end available as a public API through AWS API Gateway.

All our developers get full dumps of our production database whenever they need it. The downside of that (security aside) is that they have to wait for hours for their local database to be ready. The back-end developers can also connect directly to our production database (running on AWS RDS) when they need to debug.

Our back-end has a few tests, but they're all very basic and they were only introduced because someone said "hey, we need tests". No new tests have been added since September, even though we've made a lot of updates since then.

Our front-end has zero tests.

We create a different Git branch for every new issue. After the developer finishes their work, they send it to the staging girl to test it. The staging girl manually checks if everything is working as intended. A big issue with that is that she cannot catch any bugs that demand a lot of traffic to reproduce, which end up being the most serious bugs, since they affect everyone.

After staging is done, the developer opens a pull request. Another guy reviews the PR and approves it. The code will, then, go through a pipeline which builds it using Maven and uploads it to Elastic Beanstalk. When traffic isn't too high, we start the updates manually, selecting the version we want and rolling back if there was any issue.

The infrastructure for our main system was created manually, through the AWS console. I've been using IaC (AWS CDK) for new micro-services and when I need to move a service to a different kind of infrastructure. There is no pipeline for infrastructure; updates are performed manually.

Whenever there's a performance/stability issue, I use CloudWatch metrics and logs, as well as VisualVM, to diagnose it. One problem we have is that we don't have a history of our JVM metrics. If we don't happen to be at the office at the time of the issue, we have no way of telling what went wrong until the problem resurfaces.

https://redd.it/10inb1i
@r_devops
How do y'all do self-service / ease of setup for observability with devs?

I am becoming the observability guy for a larger company. We are getting better at DevOps patterns, but our observability really sucks.

I am trying to set up new standards and make things easier for our devs, platform-engineering style.

So I'm seeking input on how you all did it, or what you would do differently. (We have to use ELK but are willing to implement new tech.)

The part I can't really figure out is how to make this easier for the devs without putting a lot of extra demand on them.

We mostly use ELK and have logs, metrics, and traces for the areas that are willing to take the time to implement them, but those are rare. I'm looking to remove the obstacles for the other devs.

https://redd.it/10io87o
@r_devops
Take home assignments during recruitment (Poll)

I got a take-home assignment, and tbh it's not difficult. I estimate 8 hours of work to finish and test it (to make sure all is OK). We are talking about a fully automated deployment. In the end I refused to complete it (I did most of it, except the CI/CD part and VPC peering), as I think it is a waste of time for a senior DevOps engineer, and those questions can easily be asked during a technical interview.

I'm quite frustrated that I spent 4 hours on a useless thing. Is this the norm in the industry?


Here is the assignment:
Create 3 vpc (database, application and public) with multi az
Create an application loadbalancer in the public subnet
Create an RDS database for the application
Create an ECS or Kubernetes application with a simple NGINX with any kind of hello world
Create a way for developers to push changes and have that deployed to AWS.

So, as I read it, what they want is:
3 VPCs, VPC peering (as you can't link security groups otherwise), LBs, a target group, ECS (it is faster than making a full-blown k8s cluster), ECR, IAM roles, a CloudWatch log group, RDS setup, CI/CD deployment (most likely AWS CodeCommit/CodeBuild/CodeDeploy/CodePipeline), all coded in Terraform, done nicely with modules and variables, and ideally remote-exec to build the image and upload it to ECR.

Is this the norm in the industry? Could we just vote to get the general opinion on this?

Thanks

TL;DR What is your opinion about take-home assignments during recruitment?

https://redd.it/10iohr4
@r_devops
Where have you had secrets leaked?

Git is the obvious place for secrets to get leaked, with accidental files or changes being committed, etc.

There has been some research recently into places like PyPI, where secrets in code get pushed up inside packages.

I was just wondering where else people have seen this kind of issue happen. I guess docs systems are another candidate?
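
Wherever the text ends up (a repo, a package, a docs page), the same kind of pattern scan can be pointed at it. A minimal Python sketch with two illustrative patterns, the `AKIA` prefix format of AWS access key IDs and PEM private-key headers; this is nowhere near exhaustive, and the `scan` helper is hypothetical rather than taken from any existing tool:

```python
import re
from typing import List, Tuple

# Illustrative patterns only; real scanners ship far larger rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


def scan(text: str) -> List[Tuple[int, str]]:
    """Return (line number, pattern label) for every line that matches a pattern."""
    findings: List[Tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
    sample = "key = AKIAIOSFODNN7EXAMPLE\nnothing to see here\n"
    for lineno, label in scan(sample):
        print(f"line {lineno}: {label}")
```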

https://redd.it/10ir2n0
@r_devops