Reddit DevOps
269 subscribers
2 photos
31K links
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Devops or Full Stack Engineer - Career Path

Hi, I'm at a standstill as to what direction I should take my career. I recently got laid off from my technical support role and want to change careers. I have the option to enroll in a very good full-stack coding bootcamp or a program preparing me to become a DevOps engineer.


I have a friend who does DevOps, and he said it is not a bad job; however, he is going to do something else, as he does not like the odd hours. I also fear that DevOps will change fast in the next 10-20 years. I want something high-paying but stable. Please advise. Thanks

https://redd.it/10got1o
@r_devops
🚨 Terraform from 0 to Hero Blog Series

In the following weeks, I will be releasing a series around Terraform with beginner-friendly content that engages juniors and even non-technical people. I am going to take you through my 6-year journey with Terraform and how I believe you should learn it.

The first 3 episodes are already up and you can use this article as a table of contents: https://techblog.flaviusdinu.com/terraform-from-0-to-hero-0-i-like-to-start-counting-from-0-maybe-i-enjoy-lists-too-much-72cd0b86ebcd

I hope this will help beginners get a better grasp of the concepts and of what they should learn in order to improve.

https://redd.it/10hrk2s
@r_devops
Junior DevOps here - need to configure Linux and Windows build/dev workstations on demand for CI/CD pipelines and on-premise developers, with special drivers/install processes that sometimes take 2-3 days manually. ML/AI. What tech stacks would you advise for config?

Small shop. I'm currently working with devs in machine learning/AI, and we often need to manually configure computers that use GPU/CUDA.

I'm in the process of setting up our build pipelines on GitLab with on-prem workstations, but even that is taking quite a bit of time. We need both Windows and Linux runners, and whenever a developer wants to integrate a new tool, we go into each runner and walk through the manual install process, and then ensure each dev workstation is also updated. It just seems to get worse each time, and I'm struggling to keep up.

My knowledge of DevOps really only extends to automated testing/builds of applications, and now it's moving into IT infrastructure, and I'm not exactly sure what tools I should be using. I'm manually installing drivers and configs on each computer (Linux, Windows), and there are so many opportunities for human error, or for simply losing track of what is installed where.

On Linux, I'm writing extensive bash scripts that check for and install the necessary dependencies (even downloading from our local NAS), which devs can easily run to update their workstations (or our runners). I don't even know where to start on Windows; the idea of maintaining a separate set of PowerShell scripts that replicate the same purpose sounds insane to me in the long run.
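For context on the pattern those bash scripts are reaching for: configuration-management tools (Ansible, Salt, Chef) are built around an idempotent check-then-install loop, so reruns only touch what's missing. A minimal Python sketch of that loop (the manifest entries are made-up placeholders, not your actual tooling):

```python
import shutil
import subprocess

# Hypothetical manifest: tool expected on PATH -> command that installs it.
# A real manifest would mirror your NAS packages and driver installers.
MANIFEST = {
    "git": ["apt-get", "install", "-y", "git"],
    "docker": ["apt-get", "install", "-y", "docker.io"],
}

def missing_tools(manifest):
    """Return only the tools not already on PATH, so reruns are no-ops."""
    return sorted(t for t in manifest if shutil.which(t) is None)

def converge(manifest, dry_run=True):
    """Install whatever is missing; with dry_run=True, just report the plan."""
    plan = missing_tools(manifest)
    if not dry_run:
        for tool in plan:
            subprocess.run(manifest[tool], check=True)
    return plan
```

Ansible's package modules do this check internally, and it covers Windows hosts too (via WinRM and the win_* modules), which is the usual answer to "maintain parallel PowerShell scripts forever".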

Am I missing something? What tools should I be looking into?

https://redd.it/10ht39r
@r_devops
Fullstack DevOps is real, and this is what it really means. And why you're probably not one.

DevOps is just a collaboration between developers and system administrators to help speed up the development process. It's NOT a mindset or culture, as some of the people here like to say. Yes, this closer working collaboration can help to create a culture, but it's inaccurate to define it as such. True DevOps engineers are highly experienced full-stack developers, meaning they know the Dev side of things as well as the Ops side. Most people only know either Dev or Ops. It's just that simple.

https://redd.it/10hvcim
@r_devops
Does anyone know the current status of Chick-fil-A’s per-restaurant Kubernetes cluster?

In 2018, CFA published a Medium post describing how they put a Kubernetes cluster in every restaurant to cache IoT events, auth, and a few other things.

Does anyone know if this is still running, and if so, what’s changed since this post?

https://medium.com/@cfatechblog/edge-computing-at-chick-fil-a-7d67242675e2

https://redd.it/10hw3yt
@r_devops
I created an open source secrets manager and Y Combinator just invested in it!

Super pumped to continue working on this and reduce some of the common pain points we devs face with secrets management. It's end-to-end encrypted like Vault but much easier to use, with a growing list of integrations. Check it out! https://github.com/Infisical/infisical

https://redd.it/10i6ra1
@r_devops
Does trunk-based development still work for mlops and data science / AI heavy teams?

If you google trunk-based development + MLOps, you get very few hits. I'm curious whether anyone here works with teams that build and publish machine learning models with decent success using trunk-based development. As far as I know, the predominant model in the ML teams I've worked with was branch-per-environment (dev/stage/prod branches), but we all know the challenges that style brings.

The reasoning I was always given was that data science / ml is much messier than pure software dev and therefore doesn't map well. I'm unconvinced.

So it was a surprise to see it recommended as the approach here by a thought leader in the ML world: https://www.databricks.com/explore/data-science-machine-learning/big-book-of-MLOps#page=1.

If you practice trunk-based development on an ML team, can you please share how your team does it?

https://redd.it/10i2ixz
@r_devops
HashiCorp Terraform on PSI Online for non-English speakers

I have a question: I've already taken online exams through Pearson VUE, and I know they offer a text chat for people who are not fluent in English.


Does PSI Online have the same tool for people taking an online exam who are not fluent in English?

https://redd.it/10ic13p
@r_devops
How to automate AWS Marketplace publishing with Ansible - a beginner's guide

Hello everyone,

I've been a long-time subscriber to this subreddit, but this is my first post. I recently published an article on automating AWS marketplace publishing using Ansible. If you're new to Ansible or are looking to streamline your AWS marketplace publishing process, this article is for you!

In this article, I cover the basics of Ansible, how to create an EC2 instance, how to create an Amazon Machine Image (AMI), and how to use Ansible to automate the publishing process on the AWS Marketplace.

I also share some tips and best practices for using Ansible to automate your AWS marketplace publishing.

You can find the article here: https://medium.com/@arshad.zameer/getting-started-with-ansible-for-aws-marketplace-publishing-a547cc13d182

I hope the article is helpful to you. If you have any questions or feedback, feel free to comment.

Thanks for reading!

#Ansible #AWS #AWSMarketplace #Automation

https://redd.it/10iaq9a
@r_devops
Salary Sharing Thread January 2023

This thread is for sharing recent offers you've gotten or current salaries.

Please only post an offer if you're including hard numbers, but feel free to use a throwaway account if you're concerned about anonymity.

Education:

Company/Industry:

Title:

Years of technical experience:

Location:

Base Pay

Relocation/Signing Bonus:

Stock and/or recurring bonuses:

Total comp:

Tech Stack:

The last thread was a huge success, so we're bringing it back by popular demand.

https://redd.it/10i1hq5
@r_devops
What are your thoughts on Crossplane?

Hello,

I'm trying to get into IaC, and it seems there are three main options in terms of technologies: Terraform, Pulumi, and Crossplane.


I definitely like the Kubernetes-style way of handling things (which is what Crossplane does).

But my questions are these:


What's your experience with/opinion of Crossplane so far (keeping the other tools in mind as well)?

Why should one use Crossplane instead of Pulumi or Terraform?


Any opinion or recommendation would be much appreciated.

Thanks

https://redd.it/10iix3j
@r_devops
Automating lambda functions

We have around 20 Python lambda functions. So far, whenever there is a change in a function, I manually go and change it in all three envs (dev, uat, and prod), so I'm looking for a way to automate this.

The first problem that comes to mind is whether I should create a single repo for all of them or separate repos. I also thought of a single repo with a separate branch for each function. Separate repos would be a pain to manage, and for small functions it seems unnecessary. I prefer a single repo, but I don't want to trigger all the builds when there is a change in one function. I came across the Git submodules feature, which sounds like exactly what I'm looking for, and CodeBuild even has a "Use Git Submodules" toggle, but I don't understand how CodeBuild would know which build to trigger. I'm not very clear on this point.
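One common way to solve the "which build to trigger" problem in a single repo (with or without submodules) is to diff the changed paths of a push and map them to function directories. A sketch of the mapping step, assuming a hypothetical layout where each function lives under functions/<name>/; the build step would feed it the output of `git diff --name-only`:

```python
from pathlib import PurePosixPath

def affected_functions(changed_files, root="functions"):
    """Map changed file paths (e.g. `git diff --name-only HEAD~1` output)
    to the lambda directories that actually need a rebuild/redeploy."""
    hits = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        # Anything under functions/<name>/... marks <name> as affected.
        if len(parts) >= 2 and parts[0] == root:
            hits.add(parts[1])
    return sorted(hits)
```

The build then iterates only over the returned names. CodePipeline and GitHub Actions also offer built-in path filters that do this per-directory triggering declaratively.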


Now, once I have versioning, I want to replicate a change across envs. I thought of using SAM/CloudFormation, but how do I change my account number in ARNs? For example, some functions have SNS ARNs in env variables; how do I change that account ID per environment?
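On the account-ID question specifically: CloudFormation (and SAM) templates can build ARNs from the AWS::AccountId and AWS::Region pseudo parameters, so the same template resolves to the right account in each environment at deploy time. A minimal SAM fragment (the function and topic names are placeholders):

```yaml
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.9
      Environment:
        Variables:
          # Resolved per account/region at deploy time; no hardcoded IDs.
          TOPIC_ARN: !Sub "arn:aws:sns:${AWS::Region}:${AWS::AccountId}:my-topic"
```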

https://redd.it/10ii89f
@r_devops
What can we do better?

I work at a small startup and we offer a system on the web. We currently have 500 subscribers.

Our most pressing issue is that we don't always deliver updates with actual quality and end up hurting our clients in the process.

A few of our latest issues have been:

A guy dropped an index on our database and thought his query to create a replacement index had run successfully, but it hadn't. Our database was overloaded for about an hour, until he realized his mistake.

A developer was using an ORM to generate queries but one of the generated queries ended up using the wrong date field, which had no index. Many clients reported the system was slow as a result of that.

A front-end developer fixed a bug but ended up bringing back another bug, which had already been fixed before.

I updated our Redis cluster (which changed its hostname) and forgot that there was a pretty important Lambda function which used that hostname. Only found out a week later.

Our main system is in Java, with a mix of Spring and Struts. It's also pretty monolithic. We're currently in the process of migrating most of our SSR pages to Angular and we're also making our back-end available as a public API through AWS API Gateway.

All our developers get full dumps of our production database whenever they need it. The downside of that (security aside) is that they have to wait for hours for their local database to be ready. The back-end developers can also connect directly to our production database (running on AWS RDS) when they need to debug.

Our back-end has a few tests, but they're all very basic and they were only introduced because someone said "hey, we need tests". No new tests have been added since September, even though we've made a lot of updates since then.

Our front-end has zero tests.

We create a different Git branch for every new issue. After the developer finishes their work, they send it to our staging tester. She manually checks that everything is working as intended. A big issue with that is that she cannot catch bugs that require a lot of traffic to reproduce, which end up being the most serious bugs, since they affect everyone.

After staging is done, the developer opens a pull request. Another guy reviews the PR and approves it. The code will then go through a pipeline that builds it using Maven and uploads it to Elastic Beanstalk. When traffic isn't too high, we start the updates manually, selecting the version we want and rolling back if there is any issue.

The infrastructure for our main system was created manually, through the AWS console. I've been using IaC (AWS CDK) for new micro-services and when I need to move a service to a different kind of infrastructure. There is no pipeline for infrastructure; updates are performed manually.

Whenever there's a performance/stability issue, I use CloudWatch metrics and logs, as well as VisualVM, to diagnose it. One problem we have is that we don't keep a history of our JVM metrics. If we don't happen to be at the office at the time of the issue, we have no way of telling what went wrong until the problem resurfaces.

https://redd.it/10inb1i
@r_devops
How do y'all do self-service/ease of setup for observability with devs?

I am becoming the observability guy at a larger company. We are getting better at DevOps patterns, but our observability really sucks.

I am trying to set up new standards and make things easier for our devs, platform-engineer style.

So I'm seeking input on how you all did it, or what you would do differently (we have to use ELK, but we're willing to implement new tech).

The part I can't really figure out is how to make this easier for the devs without placing a lot of extra demand on them.

We mostly use ELK and have logs, metrics, and traces for the areas willing to take the time to implement them, but those are rare. I'm looking to remove the obstacles for the other devs.

https://redd.it/10io87o
@r_devops
Take home assignments during recruitment (Poll)

I got a take-home assignment, and tbh it's not difficult. I estimate 8 hours of work to finish and test it (to make sure everything is OK). We are talking about a fully automated deployment. I eventually refused to complete it (I did most of it, except the CI/CD part and VPC peering), as I think it is a waste of time for a senior DevOps engineer, and those questions can easily be asked during a technical interview.

I'm quite frustrated that I spent 4 hours on a useless thing. Is this the norm in the industry?


Here is the assignment:
Create 3 VPCs (database, application, and public) with multi-AZ
Create an application load balancer in the public subnet
Create an RDS database for the application
Create an ECS or Kubernetes application with a simple NGINX serving any kind of hello world
Create a way for developers to push changes and have them deployed to AWS.

So what I think they want is:
3 VPCs, VPC peering (as you can't link security groups otherwise), load balancers, a target group, ECS (it's faster than standing up a full-blown k8s cluster), ECR, IAM roles, a CloudWatch log group, an RDS setup, and CI/CD deployment (most likely AWS CodeCommit/CodeBuild/CodeDeploy/CodePipeline), all coded in Terraform, nicely done with modules and variables, and ideally a remote-exec to build the image and upload it to ECR.

Is this the norm in the industry? Could we just vote to gauge the general opinion on this?

Thanks


TL;DR What is your opinion about take-home assignments during recruitment?

View Poll

https://redd.it/10iohr4
@r_devops
Where have you had secrets leaked?

Git is the obvious place for secrets to get leaked, with accidental files/changes being committed, etc.

There has been some research recently into places like PyPI, with secrets being pushed up in package code.

I was just wondering where else people have seen this kind of issue happen. I guess docs systems are another candidate?
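For anyone wanting to audit their own repos or docs, the core of most secret scanners is just a set of high-signal regexes run over every file. A minimal sketch (the AWS access-key-ID pattern is the well-known AKIA... format; real scanners like gitleaks and trufflehog ship hundreds of patterns plus entropy checks):

```python
import re

# High-signal patterns; expand as needed for your own credential formats.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan_text(text):
    """Return (pattern_name, matched_string) pairs found in a blob of text."""
    return [(name, m.group(0))
            for name, rx in PATTERNS.items()
            for m in rx.finditer(text)]
```

Running something like this in a pre-commit hook catches the "accidental commit" case before it ever reaches the remote.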

https://redd.it/10ir2n0
@r_devops
Rating for my two clusters and their storage IOPS



I'd appreciate it if you could rate the disk IOPS (read/write) on my two clusters.

Tier one plus (4 hosts) has 24 VMs in total (a mix of local datastores and NVMe storage)

Highest disk read: 7,144, with an average of 6,000

Highest disk write: 5,363, with an average of 4,000

Tier one (6 hosts) has 87 VMs in total (a mix of local datastores and NVMe storage)

Highest disk read: 49,879, with an average of 35,000

Highest disk write: 11,820, with an average of 9,500

Is this normal, or should more tweaking be done to limit IOPS?

https://redd.it/10is4bo
@r_devops
Challenging myself with DevOps - want to see if I’m on the right track

Hey DevOps friends - I'm a hobbyist who has historically used Heroku for all my app deployments. I'm using their recent pricing changes as a good excuse to push my deployment skills further, and I'd love some guidance if I'm veering off track. I'm comfortable with building a frontend, spinning up a backend server/API, and general DB management, and now I really want to dive headfirst into the DevOps world. Heads up: it's a lot of questions/information! Here's the current setup:


1. I have a monorepo set up with NPM workspaces. One workspace has two packages (a Svelte frontend and an Express backend); the other is a "common" workspace with shared schemas, env configurations, etc. General structure:

deploy/
  backend.dockerfile
  frontend.dockerfile
packages/
  backend/
    index.js
    package.json
    ...routers, controllers, db, etc.
  frontend/
    build/
    src/
    esbuild.config.js
    package.json
common/
  schemas/
    zod-schema-files/
    package.json
  env/
    env-configs/
    package.json
package.json
package-lock.json
docker-compose.yml
nginx.conf



2. I use esbuild to compile and bundle all my frontend assets into a build/ folder. I have an NGINX file set up to serve these assets.


3. There are two Dockerfiles in a deploy/ folder (one for the frontend, one for the backend). The frontend file uses the NGINX image, copies the frontend assets, then copies and installs everything from the frontend repo as well as the shared repo.


This is the config of my frontend.dockerfile, which installs the packages from the frontend repo as well as both common ones, runs my build script, then copies the appropriate files over for my NGINX config:

# Stage 1: install workspace deps and build the frontend bundle
FROM node:alpine AS web

WORKDIR /build

COPY ./package*.json ./
COPY ./packages/frontend/ ./packages/frontend/
COPY ./common/schemas/ ./common/schemas/
COPY ./common/env/ ./common/env/
RUN npm install
RUN npm run build

# Stage 2: serve the built assets with NGINX
FROM nginx:latest

# COPY paths resolve against the build context (the repo root),
# so this should be ./nginx.conf rather than ../nginx.conf
COPY ./nginx.conf /etc/nginx/conf.d/default.conf
COPY --from=web /build/packages/frontend/build /usr/share/nginx/html/



And the backend file, which copies the backend repo and the common ones again (this seemed extraneous, but I couldn't get it running without copying them in both places). It also generates my Prisma client:

FROM node:alpine

WORKDIR /usr/src/app

# Copy the workspace manifests plus the backend and shared packages
COPY ./package*.json ./
COPY ./packages/backend/ ./packages/backend/
COPY ./common/schemas/ ./common/schemas/
COPY ./common/env/ ./common/env/
RUN npm install
# Generate the Prisma client inside the backend package
RUN cd ./packages/backend/ && npx prisma generate

EXPOSE 3003

CMD ["npm", "start"]



These are the contents of my docker-compose file:

version: "3"
services:
  web:
    build:
      context: .
      dockerfile: ./deploy/frontend.dockerfile
    ports:
      - 8000:80
  node:
    build:
      context: .
      dockerfile: ./deploy/backend.dockerfile
    ports:
      - 49160:3003
    depends_on:
      - web



And my nginx.conf file:

server {
    listen 80;
    root /usr/share/nginx/html;
    gzip on;
    gzip_types text/css application/javascript application/json image/svg+xml;
    gzip_comp_level 9;
    etag on;
    index index.html index.htm;

    error_page 404 /404.html;
    error_page 500 502 503 504 /50x.html;

    location ~* \.(?:css|js|map|jpe?g|gif|png|ico)$ { }

    location / {
        autoindex on;
        try_files $uri $uri/ /index.html;
    }
}


Running things locally works fine (I can open the app in localhost, navigate around, hit my API, query my DB, etc.), but I have some questions and confusion on next steps:

1. I see some setups include things like Postgres and Redis images in their Docker setup. Is this a best practice, or typical just for local testing? Both my prod and dev DBs are set up through AWS, I have my .env file pointing to my dev DB URL, and I'm planning to add Redis for caching. I'm struggling to see the benefit of including images in Docker here for either.
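For what it's worth, the usual motivation for a database image in Compose is a disposable local instance, so every dev gets a throwaway DB instead of sharing (or risking) the AWS-hosted ones. A minimal sketch of such a service (service name and credentials are placeholders for local dev only):

```yaml
services:
  db:
    image: postgres:15
    environment:
      POSTGRES_USER: app        # placeholder credentials, local dev only
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app
    ports:
      - 5432:5432
    volumes:
      - pgdata:/var/lib/postgresql/data   # survives container restarts
volumes:
  pgdata:
```

If you're happy pointing local dev at the AWS dev DB, you can skip this; the local image mainly buys isolation and offline work.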


2. My plan is to host everything via AWS. My proposed end state is:
- Frontend routed to my-domain.com, served via NGINX
- Backend routed to api.my-domain.com, also served via NGINX

Should the NGINX config have server blocks for both my top-level domain and the subdomain? Or should there be two separate NGINX configurations? If one file, does it matter that the NGINX configuration ships with my frontend image?
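One common answer: a single config file can hold both vhosts as separate server blocks, with server_name picking the match per request. A rough sketch (the proxy target assumes the backend is reachable as the Compose service "node" on port 3003, matching the compose file above):

```nginx
server {
    listen 80;
    server_name my-domain.com;
    root /usr/share/nginx/html;
    try_files $uri $uri/ /index.html;
}

server {
    listen 80;
    server_name api.my-domain.com;
    location / {
        proxy_pass http://node:3003;   # Compose service name resolves via DNS
        proxy_set_header Host $host;
    }
}
```

Whether it lives in the frontend image or its own reverse-proxy container is mostly a packaging choice; a dedicated proxy container keeps the frontend image purely static.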


3. If I'm planning to use AWS for deployment, what's the best way to do it with Docker Compose? I saw a tutorial where they deployed a web build image via ECS, but I've also seen tutorials recommending EC2. Or could/would they be used in conjunction?


4. I see a lot of different recommendations online about setting up SSL certs (note: I have my domain and certificates purchased through AWS already). From what I gather, it'll be its own Docker image, though I think it largely depends on how the project is ultimately deployed.


5. Presumably the CI/CD pipeline should manage the updates after all tests/checks have passed. It would update the images/containers wherever this is ultimately hosted in AWS, and also take any updates made to my dev DB schema and apply them to my prod DB, correct?


Again: this was a lot, but any guidance for a newbie in the DevOps world would be great. I know I'm boiling the ocean to a degree, but that's part of the fun.

https://redd.it/10iwm68
@r_devops
Which monitoring system do you use in your company?

Please explain why you think it is good or bad!

https://redd.it/10iztux
@r_devops