Junior DevOps here - need to configure Linux and Windows build/dev workstations on demand, for CI/CD pipelines and on-premise developers, with special drivers/install processes that can take 2-3 days manually. ML/AI shop. What tech stacks would you advise for config management?
Small shop. I'm currently working with devs in machine learning/AI, and we often need to manually configure computers that use GPU/CUDA.
I'm in the process of setting up our build pipelines on GitLab with on-prem workstations, but even that is taking quite a bit of time - we need both Windows and Linux runners, and whenever a developer wants to integrate a new tool, we go into each runner and repeat the manual install process - AND make sure each dev workstation is also updated. It just keeps getting worse each time, and I'm struggling to keep up.
My knowledge of DevOps really only extends to automated testing/builds of applications, and now it's expanding into IT infrastructure, and I'm not exactly sure what tools I should be using. I'm manually installing drivers and configs on each computer (Linux, Windows), and there are so many opportunities for human error - or for just losing track of what is installed where.
On Linux, I'm writing these extensive bash scripts that check for and install the necessary dependencies (even downloading from our local NAS ...), which devs can easily run to update their workstations (or our runners). And I don't even know where to start on Windows - the idea of maintaining a separate set of PowerShell scripts for the same purpose sounds insane in the long run.
Am I missing something? What tools should I be looking into?
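The check-then-install pattern those bash scripts implement is exactly what configuration-management tools (Ansible, Chef, or on Windows, DSC plus a package manager like Chocolatey) formalize as idempotent desired state. As a rough illustration of the idea in a single cross-platform script - the tool list and install commands below are hypothetical placeholders, not a recommendation:

```python
# Sketch of an idempotent "ensure installed" step that works on both
# Linux and Windows -- the pattern config-management tools generalize.
# The TOOLS mapping and install commands are illustrative only.
import platform
import shutil
import subprocess

# Desired state: tool name -> install command per OS (hypothetical).
TOOLS = {
    "git": {
        "Linux": ["apt-get", "install", "-y", "git"],
        "Windows": ["choco", "install", "-y", "git"],
    },
}

def ensure_installed(tool: str, dry_run: bool = False) -> str:
    """Return 'present' if the tool is already on PATH, else install it."""
    if shutil.which(tool):
        return "present"          # idempotent: already satisfied, do nothing
    cmd = TOOLS[tool][platform.system()]
    if not dry_run:
        subprocess.run(cmd, check=True)
    return "installed"
```

Running the same script twice is safe, which is the property the hand-rolled bash versions usually have to re-invent; a tool like Ansible gives you that plus an inventory of which machine has what.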
https://redd.it/10ht39r
@r_devops
Fullstack DevOps is real, and this is what it really means - and why you're probably not one.
DevOps is just a collaboration between developers and system administrators to speed up the development process. It's NOT a mindset or culture, as some of the people here like to say. Yes, this closer working collaboration can help create a culture, but it's inaccurate to define DevOps as such. True DevOps engineers are highly experienced full-stack developers, meaning they know the Dev side of things as well as the Ops side. Most people know only Dev or only Ops. It's just that simple.
https://redd.it/10hvcim
@r_devops
Does anyone know the current status of Chick-fil-A’s per-restaurant Kubernetes cluster?
In 2018, CFA published a Medium post describing how they put a Kubernetes cluster in every restaurant to cache IoT events, handle auth, and do a few other things.
Does anyone know if this is still running, and if so, what’s changed since this post?
https://medium.com/@cfatechblog/edge-computing-at-chick-fil-a-7d67242675e2
https://redd.it/10hw3yt
@r_devops
I created an open source secrets manager and Y Combinator just invested in it!
Super pumped to continue working on this and reduce some of the common pain points we devs face with secrets management. It's end-to-end encrypted like Vault, but much easier to use, with a growing list of integrations. Check it out! https://github.com/Infisical/infisical
https://redd.it/10i6ra1
@r_devops
Does trunk-based development still work for mlops and data science / AI heavy teams?
If you google trunk-based development + MLOps, you get very few hits. I'm curious whether anyone here works with teams that build and publish machine learning models with decent success using trunk-based development. As far as I know, the predominant model in the ML teams I've worked with was branch-per-environment - dev/stage/prod branches - but we all know the challenges that style brings.
The reasoning I was always given was that data science/ML is much messier than pure software dev and therefore doesn't map well. I'm unconvinced.
So it was a surprise to see it recommended as the approach by a thought leader in the ML world: https://www.databricks.com/explore/data-science-machine-learning/big-book-of-MLOps#page=1.
If you practice trunk-based development on an ML team, can you please share how your team does it?
https://redd.it/10i2ixz
@r_devops
[humor] AWS CDK – Proposed Slogans
We are switching from CloudFormation (SAM) to AWS CDK.
It feels quite productive. This is my way of expressing gratitude.
https://ilya-sher.org/2023/01/19/aws-cdk-proposed-slogans/
https://redd.it/10ib1u1
@r_devops
HashiCorp Terraform on PSI Online for non-English speakers
I have a question. I've already taken online exams through Pearson VUE, and I know they offer a text chat for people who aren't fluent in English.
Does PSI Online have the same tool for people taking an online exam who aren't fluent in English?
https://redd.it/10ic13p
@r_devops
How to automate AWS Marketplace publishing with Ansible - a beginner's guide
Hello everyone,
I've been a long-time subscriber to this subreddit, but this is my first post. I recently published an article on automating AWS marketplace publishing using Ansible. If you're new to Ansible or are looking to streamline your AWS marketplace publishing process, this article is for you!
In this article, I cover the basics of Ansible, how to create an EC2 instance, create an Amazon Machine Image (AMI), and how to use Ansible to automate the publishing process on the AWS marketplace.
I also share some tips and best practices for using Ansible to automate your AWS marketplace publishing.
You can find the article here: https://medium.com/@arshad.zameer/getting-started-with-ansible-for-aws-marketplace-publishing-a547cc13d182
I hope the article is helpful to you. If you have any questions or feedback, feel free to comment.
Thanks for reading!
#Ansible #AWS #AWSMarketplace #Automation
https://redd.it/10iaq9a
@r_devops
Salary Sharing Thread January 2023
This thread is for sharing recent offers you've gotten or current salaries.
Please only post an offer if you're including hard numbers, but feel free to use a throwaway account if you're concerned about anonymity.
Education:
Company/Industry:
Title:
Years of technical experience:
Location:
Base Pay:
Relocation/Signing Bonus:
Stock and/or recurring bonuses:
Total comp:
Tech Stack:
The last thread was a huge success, so I'm bringing it back by popular demand.
https://redd.it/10i1hq5
@r_devops
What's your thoughts on Crossplane ?
Hello,
I'm trying to get into IaC, and it seems there are three main options in terms of technologies: Terraform, Pulumi, and Crossplane.
I definitely like the Kubernetes-style way of handling things (like Crossplane does). But my questions are these:
What's your experience with/opinion of Crossplane so far (with the other tools in mind)?
Why should one use Crossplane instead of Pulumi or Terraform?
Any opinion or recommendation would be much appreciated.
Thanks
https://redd.it/10iix3j
@r_devops
Automating lambda functions
We have around 20 Python Lambda functions. So far, whenever there is a change in a function, I manually go and change it in all three envs (dev, UAT, and prod), so I'm looking for a way to automate this.
The first problem that comes to mind is whether I should create a single repo for all of them or separate ones. I also thought of a single repo with a separate branch for each function. Separate repos would be a pain to manage, and for small functions they seem unnecessary. I prefer a single repo, but I don't want to trigger all of them when one function changes, so I came across Git submodules, which sound like exactly what I'm looking for - even CodeBuild has a "Use Git Submodules" toggle - but I don't understand how CodeBuild would know which build to trigger. I'm not very clear on this point.
Now, once I have versioning, I want to replicate changes across envs. I thought of using SAM/CloudFormation, but how do I change the account number in ARNs? For example, some functions have SNS ARNs in env variables - how do I change that account ID for each env?
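On the account-ID question, the usual answer is to never hard-code it: in CloudFormation/SAM you can reference the `AWS::AccountId` pseudo parameter inside `!Sub`, and in a script you can ask STS (`aws sts get-caller-identity`) which account you're deploying into and assemble the ARN from parts. A minimal sketch of the string assembly - the topic and account values below are hypothetical:

```python
# Build ARNs from deploy-time parameters instead of hard-coding account IDs.
# At deploy time the account ID can come from CloudFormation's AWS::AccountId
# pseudo parameter, or from `aws sts get-caller-identity` in a script.

def sns_topic_arn(region: str, account_id: str, topic: str) -> str:
    return f"arn:aws:sns:{region}:{account_id}:{topic}"

# Each environment supplies only its own account ID (values illustrative):
envs = {"dev": "111111111111", "prod": "222222222222"}
arns = {name: sns_topic_arn("us-east-1", acct, "order-events")
        for name, acct in envs.items()}
```

With this indirection, the env variable each Lambda receives is rendered per environment at deploy time, and nothing in the repo mentions a specific account.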
https://redd.it/10ii89f
@r_devops
What can we do better?
I work at a small startup and we offer a system on the web. We currently have 500 subscribers.
Our most pressing issue is that we don't always deliver updates with actual quality and end up hurting our clients in the process.
A few of our latest issues have been:
A guy dropped an index on our database and thought his query to create a replacement index had run successfully, but it hadn't. Our database was overloaded for about an hour, until he realized his mistake.
A developer was using an ORM to generate queries, but one of the generated queries used the wrong date field, which had no index. Many clients reported the system was slow as a result.
A front-end developer fixed a bug but ended up reintroducing another bug that had already been fixed before.
I updated our Redis cluster (which changed its hostname) and forgot that a pretty important Lambda function used that hostname. We only found out a week later.
Our main system is in Java, with a mix of Spring and Struts. It's also pretty monolithic. We're currently in the process of migrating most of our SSR pages to Angular and we're also making our back-end available as a public API through AWS API Gateway.
All our developers get full dumps of our production database whenever they need it. The downside of that (security aside) is that they have to wait for hours for their local database to be ready. The back-end developers can also connect directly to our production database (running on AWS RDS) when they need to debug.
Our back-end has a few tests, but they're all very basic and they were only introduced because someone said "hey, we need tests". No new tests have been added since September, even though we've made a lot of updates since then.
Our front-end has zero tests.
We create a different Git branch for every new issue. After the developer finishes their work, they hand it to our staging tester, who manually checks that everything works as intended. A big issue with that is that she cannot catch bugs that require a lot of traffic to reproduce - which end up being the most serious ones, since they affect everyone.
After staging is done, the developer opens a pull request. Another guy reviews the PR and approves it. The code then goes through a pipeline that builds it using Maven and uploads it to Elastic Beanstalk. When traffic isn't too high, we start the update manually, selecting the version we want and rolling back if there is any issue.
The infrastructure for our main system was created manually, through the AWS console. I've been using IaC (AWS CDK) for new micro-services and when I need to move a service to a different kind of infrastructure. There is no pipeline for infrastructure; updates are performed manually.
Whenever there's a performance/stability issue, I use CloudWatch metrics and logs, as well as VisualVM, to diagnose it. One problem we have is that we don't have a history of our JVM metrics. If we don't happen to be at the office at the time of the issue, we have no way of telling what went wrong until the problem resurfaces.
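For what it's worth, the Redis-hostname incident above is the textbook argument for a single source of truth for endpoints: every consumer (app servers and Lambdas alike) resolves the hostname through one configuration key - SSM Parameter Store, or at minimum a per-environment variable - instead of baking it in. A toy sketch of the indirection, with hypothetical names:

```python
# Resolve service endpoints through one well-known configuration key
# instead of hard-coding hostnames in each consumer. If the Redis
# cluster is rebuilt, only the config value changes -- every consumer
# picks it up, including the Lambda that was forgotten last time.
import os

def redis_host(default: str = "localhost") -> str:
    # In AWS the same lookup could hit SSM Parameter Store; an
    # environment variable is the minimal version of the pattern.
    return os.environ.get("REDIS_HOST", default)
```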
https://redd.it/10inb1i
@r_devops
How do y'all do self-service / ease of setup for observability with devs?
I am becoming the observability guy at a larger company. We are getting better at DevOps patterns, but our observability really sucks.
I am trying to set up new standards and make things easier for our devs, platform-engineering style.
So I'm seeking input on how you all did it, or would do differently (we have to use ELK but are willing to implement new tech).
The part I can't really figure out is how to make this easier for the devs without putting a lot of extra demand on them.
We mostly use ELK and have logs, metrics, and traces for the areas willing to take the time to implement them, but those are rare. Looking to remove the obstacles for the other devs.
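One low-friction pattern for the "without extra demand on them" part is shipping a tiny shared logging helper so every service emits one JSON object per line, which Elasticsearch can index without per-team Logstash/grok work. A hypothetical sketch - the field names are an illustrative convention, not an ELK requirement:

```python
# Minimal shared helper: every service logs one JSON object per line,
# so ELK can index fields without custom parsing. Devs call one
# function instead of learning the logging pipeline.
import json
import sys
import time

def log_event(service: str, level: str, message: str, **fields) -> str:
    record = {"ts": time.time(), "service": service,
              "level": level, "message": message, **fields}
    line = json.dumps(record, sort_keys=True)
    print(line, file=sys.stderr)   # stderr -> picked up by the log shipper
    return line
```

The design point is that self-service observability usually means moving the integration cost into a library or sidecar the platform team owns, so a dev's obligation shrinks to "call the helper."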
https://redd.it/10io87o
@r_devops
Take home assignments during recruitment (Poll)
Got a take-home assignment, and tbh it's not difficult. I estimate 8 hours of work to finish and test it (to make sure everything is OK). We're talking about a fully automated deployment. I eventually refused to complete it (did most of it except the CI/CD part and VPC peering), as I think it's a waste of time for a senior DevOps engineer - these questions can easily be asked during a technical interview.
I'm quite frustrated that I spent 4 hours on something useless. Is this the norm in the industry?
Here is the assignment:
Create 3 VPCs (database, application, and public) with multi-AZ
Create an application load balancer in the public subnet
Create an RDS database for the application
Create an ECS or Kubernetes application running a simple NGINX with any kind of hello world
Create a way for developers to push changes and have that deployed to AWS.
So, as I understand it, what they want is:
3 VPCs, VPC peering (as you can't link security groups otherwise), LBs, a target group, ECS (it's faster than standing up a full-blown k8s cluster), ECR, IAM roles, a CloudWatch log group, an RDS setup, and CI/CD deployment (most likely AWS CodeCommit/CodeBuild/CodeDeploy/CodePipeline) - all coded in Terraform, done nicely with modules and variables, and ideally remote-exec to build the image and upload it to ECR.
Is this the norm in the industry? Could we vote to get a general opinion on this?
Thanks
TL;DR: What is your opinion of take-home assignments during recruitment?
View Poll
https://redd.it/10iohr4
@r_devops
Where have you had secrets leaked?
Git is the obvious place for secrets to get leaked, with accidental files/changes being committed, etc.
There has also been some research recently into places like PyPI, with secrets being pushed up inside packages.
I was just wondering where else people have seen this kind of issue happen. I guess docs systems are another candidate?
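Worth noting that scanners like gitleaks and truffleHog mostly boil down to running token-shaped regexes over any text - which is why the same secrets also turn up in wikis, ticket systems, CI logs, and container image layers. A toy version of the idea, with patterns simplified from real rulesets:

```python
# Toy secret scanner: the same regex pass works on a git diff, a PyPI
# sdist, a CI log, or a wiki export. Patterns below are simplified
# versions of the shapes real tools match, not production rules.
import re

PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}

def scan(text: str) -> list[tuple[str, str]]:
    """Return (rule_name, matched_token) pairs found in the text."""
    hits = []
    for name, pattern in PATTERNS.items():
        hits.extend((name, m) for m in pattern.findall(text))
    return hits
```

The takeaway: anywhere text flows (docs, logs, packages), the same scan applies - and so does the same leak risk.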
https://redd.it/10ir2n0
@r_devops
Rating for my two clusters and their storage IOPS
I'd appreciate it if you could rate the disk read/write IOPS on my two clusters.
Tier one plus (4 hosts) has 24 VMs (a mix of local datastores and NVMe storage):
Highest disk read: 7144, average around 6000
Highest disk write: 5363, average around 4000
Tier one (6 hosts) has 87 VMs (a mix of local datastores and NVMe storage):
Highest disk read: 49879, average around 35000
Highest disk write: 11820, average around 9500
Is this normal, or should more tweaking be done to limit IOPS?
https://redd.it/10is4bo
@r_devops
Challenging myself with DevOps - want to see if I’m on the right track
Hey DevOps friends - I’m a hobbyist who historically used Heroku for all my app deployment. I’m using their recent pricing changes as a good excuse to push my deployment skills further, and would love some guidance if I’m veering off track. I'm comfortable with building a frontend, spinning up a backend server/API, and general DB management, and now I really want to dive headfirst into the DevOps world. Heads up: it's a lot of questions/information! Here's the current setup:
1. I have a monorepo set up with NPM workspaces. One workspace has two packages (Svelte frontend and Express backend); the other is a “common” workspace with shared schemas, env configurations, etc. General structure:
-deploy
  -backend.dockerfile
  -frontend.dockerfile
-packages
  -backend
    -index.js
    -package.json
    -...routers, controllers, db, etc.
  -frontend
    -build/
    -src/
    -esbuild.config.js
    -package.json
-common
  -schemas
    -zod-schema-files/
    -package.json
  -env
    -env-configs/
    -package.json
-package.json
-package-lock.json
-docker-compose.yml
-nginx.conf
2. I use esbuild to compile and bundle all my frontend assets into a build/ folder. I have an NGINX file set up to serve these assets.
3. There are two Dockerfiles in a deploy/ folder (one for the frontend, one for the backend). The frontend file uses the NGINX image, copies the frontend assets, then copies and installs everything from the frontend repo as well as the shared repo.
This is the config of my frontend.dockerfile, which installs the packages from the frontend repo as well as both common ones, runs my build script, then copies the appropriate files over for my NGINX config:
# Stage 1: install the workspaces this image needs and build the bundle
FROM node:alpine AS web
WORKDIR /build
COPY ./package*.json ./
COPY ./packages/frontend/ ./packages/frontend/
COPY ./common/schemas/ ./common/schemas/
COPY ./common/env/ ./common/env/
RUN npm install
RUN npm run build

# Stage 2: serve the built assets with NGINX
FROM nginx:latest
# Note: ../nginx.conf would point outside the build context (the context is the
# repo root), which Docker rejects - the path must stay inside the context.
COPY ./nginx.conf /etc/nginx/conf.d/default.conf
COPY --from=web /build/packages/frontend/build /usr/share/nginx/html/
And the backend file, which copies the backend repo and the common ones again (this seemed extraneous but I couldn't get it running without copying in both places). It also generates my Prisma instance:
FROM node:alpine
WORKDIR /usr/src/app
COPY ./package*.json ./
COPY ./packages/backend/ ./packages/backend/
COPY ./common/schemas/ ./common/schemas/
COPY ./common/env/ ./common/env/
RUN npm install
# Generate the Prisma client inside the image so it matches this platform
RUN cd ./packages/backend/ && npx prisma generate
EXPOSE 3003
# Relies on the "start" script in the root package.json delegating to the backend workspace
CMD ["npm", "start"]
These are the contents of my docker-compose file:
version: "3"
services:
  web:
    build:
      context: .
      dockerfile: ./deploy/frontend.dockerfile
    ports:
      - "8000:80"
  node:
    build:
      context: .
      dockerfile: ./deploy/backend.dockerfile
    ports:
      - "49160:3003"
    depends_on:
      - web
And my nginx.conf file:
server {
    listen 80;
    root /usr/share/nginx/html;

    gzip on;
    gzip_types text/css application/javascript application/json image/svg+xml;
    gzip_comp_level 9;
    etag on;

    index index.html index.htm;
    error_page 404 /404.html;
    error_page 500 502 503 504 /50x.html;

    # Static assets are served directly, bypassing the SPA fallback below
    location ~* \.(?:css|js|map|jpe?g|gif|png|ico)$ { }

    location / {
        autoindex on;
        try_files $uri $uri/ /index.html;
    }
}
Running things locally works fine (I can open the app in localhost, navigate around, hit my API, query my DB, etc.), but I have some questions and confusion on next steps:
1. I see some setups include things like Postgres and Redis images in their Docker setup. Is this a best practice, or typical just for local testing? Both my prod and dev DBs are set up through AWS, I have my .env file pointing to my dev DB URL, and I'm planning to add Redis for caching. I'm struggling to see the benefit of including images in Docker here for either.
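(For context on what those setups usually look like: local DB/cache containers are typically for development and CI, so work doesn't depend on network access to AWS, while prod keeps the managed services. A sketch of such an addition to the compose file above - the service name, image tag, and port are illustrative, not from the post:)

```yaml
# Hypothetical extra service in docker-compose.yml: a throwaway local Redis
# for development; production would still point at a managed instance.
services:
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
```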
2. My plan is to host everything via AWS. If my proposed end state is:
-Frontend is routed to my-domain.com, served via NGINX
-Backend is routed to api.my-domain.com, also served via NGINX
Should the NGINX config have locations for both my top-level and subdomain? Or should there be two separate NGINX configurations? If one file, does it matter if the NGINX configuration is with my frontend image?
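(A single config file can hold both: NGINX picks a `server` block by `server_name`. A minimal sketch under the domain names from the post - the `proxy_pass` target `node:3003` is an assumption based on the compose service name and port, and only applies while both containers share a Docker network:)

```nginx
server {
    listen 80;
    server_name my-domain.com;
    root /usr/share/nginx/html;
    try_files $uri $uri/ /index.html;
}

server {
    listen 80;
    server_name api.my-domain.com;
    location / {
        proxy_pass http://node:3003;
        proxy_set_header Host $host;
    }
}
```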
3. If planning to use AWS for deployment, what's the best way to deploy if I'm using Docker Compose? I saw a tutorial where they deployed a web build image via ECS, but I've also seen tutorials recommending EC2. Or could/would they be used in conjunction?
4. I see a lot of different recommendations online about setting up SSL certs (note: I have my domain and certificates purchased through AWS already). From what I gather, it'll be its own Docker image, though I think it's largely dependent on how the project will ultimately be deployed.
5. Presumably the CI/CD should manage the updates after all tests/checks have passed. This would update the images/containers wherever this is ultimately hosted in AWS, as well as apply any updates made to my dev DB schema to my prod DB, correct?
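(The deploy stage in point 5 might look roughly like this in a CI config - the job layout and image are assumptions for illustration, though `npx prisma migrate deploy` is Prisma's real non-interactive command for applying committed migrations to the database in `DATABASE_URL`:)

```yaml
# Hypothetical deploy job (GitLab-CI-style syntax, shown as a sketch)
deploy:
  stage: deploy
  image: node:alpine
  script:
    - npm ci
    - npx prisma migrate deploy   # apply pending schema migrations
    - docker compose build
    # ...then push images / update the AWS service here
  only:
    - main
```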
Again: This was a lot, but any guidance for a newbie in DevOps world would be great. I know I'm boiling the ocean to a degree, but that's part of the fun.
https://redd.it/10iwm68
@r_devops
Which monitoring system do you use in your company?
Please explain why you think it is good or bad!
https://redd.it/10iztux
@r_devops
Is there a GitHub Actions equivalent to CircleCI dynamic config?
I’m using a monorepo and only want to run workflows for the affected projects. In CircleCI, this was pretty simple using https://circleci.com/docs/dynamic-config/. GHA's documentation doesn’t seem to mention anything similar, though it seems some people have done something like it: https://stackoverflow.com/questions/65384420/how-to-make-a-github-action-matrix-element-conditional.
Have you guys done dynamic workflows in GHA, if so, how’d you do it?
Best answer gets a beer 🍻
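(One widely used approach - not a built-in GHA feature, but a sketch using the third-party `dorny/paths-filter` action - gates downstream jobs on which paths changed; the filter name and path are illustrative:)

```yaml
jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      frontend: ${{ steps.filter.outputs.frontend }}
    steps:
      - uses: actions/checkout@v3
      - uses: dorny/paths-filter@v2
        id: filter
        with:
          filters: |
            frontend:
              - 'packages/frontend/**'
  build-frontend:
    needs: changes
    # runs only when files under packages/frontend/ changed
    if: needs.changes.outputs.frontend == 'true'
    runs-on: ubuntu-latest
    steps:
      - run: echo "build frontend here"
```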
https://redd.it/10j4100
@r_devops
Location of cloud builder workspace?
When you clone a repo in gcloud Cloud Build, where is the actual local workspace located?
https://redd.it/10iue0o
@r_devops
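(For what it's worth: Cloud Build mounts the cloned source at `/workspace` by default, shared across build steps. A sketch of a `cloudbuild.yaml` step to inspect it - the `ubuntu` builder image is just an illustrative choice:)

```yaml
steps:
  - name: 'ubuntu'
    entrypoint: 'bash'
    args: ['-c', 'ls -la /workspace']  # the cloned repo lives here
```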