Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Project Roadmap for Learning DevOps practices and tooling - Looking for feedback

Background: I am a Network Security Engineer with good knowledge of Python, JS, Bash, Linux, Azure, and AWS, and I am interested in DevOps. I am not looking to get a new job. I am just fascinated by the field and want to learn, fill in the blanks, and understand the assets and processes I am protecting on the network.

Challenges: After googling around, and searching Reddit, Pluralsight, and Github, I noticed that most project ideas are very small in scale which is understandable because most are focused on beginners who are looking to get into the field so they need quick wins. In my case, I need something bigger with more functionality and integrations to understand the whole picture.

So, based on what I have been reading and watching, I came up with a project roadmap. Before I start, I would appreciate some feedback on whether I am heading in the right direction. I have no immediate contacts with DevOps engineers in my life, so this subreddit is the next best thing. Please feel free to leave a comment with any tips on making this roadmap better in terms of tooling, services, or overall design. I want this to be as close to a production environment as possible. My goal is to learn as much as possible about DevOps before the AI gods take over...

# Project Roadmap:

Phase 1: Simple Static Web Application Deployment
Objective: I will deploy a simple, containerized web application to Azure.
Key Focus Areas: Learning Docker for containerization, getting familiar with Azure services, and beginning to use Terraform for Infrastructure as Code.
Azure Services: Azure Container Instances (ACI) for hosting the containerized application, Azure Container Registry (ACR) for storing Docker images.
Outcome: My web application is accessible over the internet and deployed using a basic CI/CD pipeline.

Phase 2: Expand to a 3-tier Application with Auto-Scaling and High Availability
Objective: I will evolve the application into a full-stack solution with a frontend, backend, and database. I'll implement auto-scaling and ensure high availability.
Key Focus Areas: Architecting a full-stack application, utilizing Azure database services, implementing auto-scaling, and achieving high availability across multiple regions.
Azure Services: Azure App Service for hosting web applications with auto-scaling capabilities, Azure SQL Database or Cosmos DB for data persistence, Azure Traffic Manager or Azure Front Door for high availability and traffic management.
Outcome: My application is resilient and scalable, capable of handling variable loads and maintaining availability during infrastructural changes.

Phase 3: Implement Infrastructure Monitoring and Logging
Objective: I will integrate monitoring and logging solutions to maintain visibility into the application's performance and infrastructure health.
Key Focus Areas: Setting up and configuring monitoring and logging tools, integrating these with the existing Azure infrastructure.
Azure Services: Azure Monitor and Azure Log Analytics for monitoring and logging, Application Insights for application performance monitoring.
Outcome: I have comprehensive monitoring and logging, supporting proactive issue detection and efficient troubleshooting.

Phase 4: Implement Security and Compliance Automation
Objective: I will enhance the project with automated security scans, compliance checks, and vulnerability assessments.
Key Focus Areas: Integrating security tools in the CI/CD pipeline, adopting compliance as code practices, and conducting regular vulnerability scans.
Azure Services: Microsoft Defender for Cloud (formerly Azure Security Center) for security management and threat protection, Azure Policy for enforcing compliance policies.
Outcome: My infrastructure and applications are secure, minimizing the risk of security breaches and data

Reverse proxy options with dynamic lookups

Context: I need a reverse proxy that can dynamically look up a key in DynamoDB and then use the value to interpolate the backend DNS entry. For example, customer “foo” has a “backend” value mapped to “99”, so I proxy the request to alb-99.internal. I can’t load one giant config into the proxy because there are about 100 million accounts. What are my options? It needs to handle about 100k RPS.

I was thinking of using Caddy and writing a custom Go module, but our company is a Node company, so this would be the first non-TypeScript app. I could also just write it in TypeScript, but I would rather start with a well-known reverse proxy and a plugin than build another app. Envoy has xDS, but I think the config would be too large to pull. Something like nginx with Lua might work too, but I haven’t seen Lua work at that scale, so I’m not sure. Can Traefik do dynamic lookups like this? Any other suggestions out there?
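Whichever proxy ends up hosting it, the lookup step described above could be sketched roughly as follows. This is only an illustration in Python, not a recommendation of a stack: `BackendResolver` and its `lookup` callable are hypothetical names, `lookup` stands in for the DynamoDB read, and the TTL and cache size are assumed values. It caches each account's resolved host so that most requests never hit the table.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class BackendResolver:
    """Resolve an account to a backend host, caching results with a TTL and an LRU bound."""

    def __init__(
        self,
        lookup: Callable[[str], Optional[str]],  # hypothetical: wraps the DynamoDB read
        max_entries: int = 1_000_000,            # assumed cache size
        ttl_seconds: float = 60.0,               # assumed freshness window
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: "OrderedDict[str, tuple]" = OrderedDict()

    def resolve(self, account: str) -> Optional[str]:
        now = self._clock()
        hit = self._cache.get(account)
        if hit is not None and hit[1] > now:
            self._cache.move_to_end(account)  # mark as recently used
            return hit[0]

        backend = self._lookup(account)  # cache miss or expired entry
        if backend is None:
            self._cache.pop(account, None)
            return None

        # Interpolate the backend value into the DNS name from the post.
        host = f"alb-{backend}.internal"
        self._cache[account] = (host, now + self._ttl)
        self._cache.move_to_end(account)
        if len(self._cache) > self._max_entries:
            self._cache.popitem(last=False)  # evict the least recently used entry
        return host


# Usage with an in-memory dict standing in for the table.
table = {"foo": "99"}
resolver = BackendResolver(lookup=table.get)
print(resolver.resolve("foo"))  # alb-99.internal
```

The cache only needs to hold the accounts that are active within the TTL window, not all 100 million, which is what keeps the per-proxy state small.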

https://redd.it/1bs0n4j
@r_devops
How to get good at MLOps

Hi fellows, I am a chemical engineer turned DevOps engineer. I have been working in DevOps for a year and a half and I am loving it, but I am worried about all the fuss about AI taking over jobs, so I am trying to enhance my skill set and learn MLOps. Has anyone here tried getting into MLOps? Any advice is appreciated.

https://redd.it/1bs3eyu
@r_devops
Need career advice

Hello guys,
I am from Bangalore, India.
I have around 4 1/2 years of experience in Azure. In my first company I only learned about VMs, storage accounts, and Linux.
In my current organisation I have exposure to all sorts of services like App Gateway, load balancers, Azure Firewall, APIM, VMs, etc., but we don't have Docker/containerization or Kubernetes. They don't have Linux :( That is a bit of background on my technical knowledge. I do write small PowerShell scripts.
Right now I am confused about whether I should pursue my career further as a DevOps engineer or switch to development. I know the basics of Python and Java, and I have an interest in development.
What would be a good path for me to take further down the road, especially in India?
I have heard that DevOps jobs are not very consistent across companies and that it is just another name for sysadmins. In terms of pay and peace of mind at work, which one is better as my career progresses?
Please ignore this post if you feel it is a low-quality one. Any advice would be appreciated. I am just confused.

https://redd.it/1bs4961
@r_devops
How to promote AWS Terraform from staging to prod?

Hello,

I have a small development fully deployed on AWS...

I manage the infrastructure via Terraform and the code deployment via Ansible.

I have two folders in my project for Terraform:
- one for staging
- one for prod

Once in a blue moon I have some infra changes/updates, which I test in staging before going to production...

How do you promote the Terraform code from staging to prod? Do I copy and paste?

Sorry, trying to get the best practice here
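
Before any copy and paste, it can help to see exactly where the two folders differ. The following is a minimal Python sketch, assuming the folders are named `staging/` and `prod/` as in the post; the helper names are hypothetical, and it only reports differences between `.tf` files, it does not promote anything.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Tuple


def tf_file_hashes(folder: Path) -> Dict[str, str]:
    """Map each .tf file's path (relative to folder) to a SHA-256 of its contents."""
    hashes: Dict[str, str] = {}
    for path in sorted(folder.rglob("*.tf")):
        relative = path.relative_to(folder)
        # Skip directories and anything under Terraform's local .terraform cache.
        if not path.is_file() or ".terraform" in relative.parts:
            continue
        hashes[relative.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes


def compare_environments(staging: Path, prod: Path) -> Tuple[List[str], List[str], List[str]]:
    """Return (only_in_staging, only_in_prod, changed) as sorted relative paths."""
    s, p = tf_file_hashes(staging), tf_file_hashes(prod)
    only_staging = sorted(s.keys() - p.keys())
    only_prod = sorted(p.keys() - s.keys())
    changed = sorted(name for name in s.keys() & p.keys() if s[name] != p[name])
    return only_staging, only_prod, changed


if __name__ == "__main__":
    new, removed, changed = compare_environments(Path("staging"), Path("prod"))
    print("only in staging:", new)
    print("only in prod:", removed)
    print("changed:", changed)
```

If the two folders differ only in a handful of values, that is usually a hint that the shared part could live in one place with per-environment variables, rather than being copied by hand.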

https://redd.it/1bs7vx8
@r_devops
CypherMate

**🌟 Introducing CypherMate: A Leap Towards Secure Corporate Communications**


Today, I am incredibly proud to present CypherMate, an open-source project created by me, designed to revolutionize the way corporations handle secure communications within Slack. In our digital age, the protection of sensitive information is not just a necessity but a cornerstone of successful business operations.


**What is CypherMate?**


CypherMate is a cutting-edge Slack bot designed to make password sharing and sensitive information exchange both secure and effortless. With just a few simple commands, you can encrypt messages, generate one-time secure links, and ensure that your data is accessible only to the intended recipients.
Key Features:

* Encrypt & Decrypt Messages: Securely share encrypted information right within Slack, with easy decryption for the recipient.
* One-Time Secure Links: Share sensitive documents or messages through links that expire after a single use, adding an extra layer of security.
* User-Friendly: CypherMate simplifies complex encryption processes, making secure communication accessible to everyone in your organization.


**Why CypherMate? 🛡**


In an era where data breaches can have catastrophic consequences, ensuring the security of your corporate communications is paramount. CypherMate offers:

* Enhanced Data Security: By encrypting your messages and using one-time links, CypherMate significantly reduces the risk of data leaks and unauthorized access.
* Streamlined Workflow: Securely share information without disrupting your team’s workflow. CypherMate’s seamless integration with Slack means no more switching between apps or complicated encryption tools.
* Peace of Mind: Know that your sensitive information is protected with state-of-the-art security measures, giving you the confidence to share what’s important.


**Ideal for Every Corporation**


Whether you’re a startup or a Fortune 500 company, CypherMate is the tool you need to secure your Slack communications. It’s not just about protecting data; it’s about fostering a culture of security and responsibility.





[https://github.com/Pyshios/CypherMate/tree/main](https://github.com/Pyshios/CypherMate/tree/main)





https://redd.it/1bs8u8n
@r_devops
How to start a "DevOps advocacy project"?

Hi, we've decided to try and start a DevOps advocacy project because we've had issues with "organic" learning among developers.

We need to give them a basic understanding of the DevOps principles and the tools and platform we use to run the apps.

I'm not looking for any technical advice but for organizational stuff. How do you go about the "training", how to do it for frontend or backend developers, ideal scope size for the trainings, how often, does pair programming work, etc.?

Thank you all for your insights.

https://redd.it/1bsayfe
@r_devops
AWS hourly spend cost bot

At a former job, we had an AWS cost bot that posted a graph of our spend to Slack roughly every hour, so we could see at a glance if there was some weird spike.
Does anyone know what this tool is? I'd like to set one up at my current job. Or do you think it was just something custom, maybe a Lambda calling the Cost Explorer APIs?
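
If it was a custom job, the "weird spike" check itself is small. Here is a rough Python sketch of just that part, assuming the hourly cost figures have already been fetched from wherever they come from; `find_spikes`, the 24-hour window, and the 2x factor are illustrative choices, not what the original bot did.

```python
from statistics import mean
from typing import List, Sequence


def find_spikes(hourly_costs: Sequence[float], window: int = 24, factor: float = 2.0) -> List[int]:
    """Return indexes of hours whose cost exceeds `factor` times the mean of the previous `window` hours."""
    spikes = []
    for i in range(window, len(hourly_costs)):
        baseline = mean(hourly_costs[i - window:i])
        if baseline > 0 and hourly_costs[i] > factor * baseline:
            spikes.append(i)
    return spikes


# A flat day of spend, then one hour that jumps to 5x.
costs = [1.0] * 24 + [1.1, 5.0, 1.0]
print(find_spikes(costs))  # [25]
```

Comparing against a trailing average rather than a fixed threshold means the check keeps working as normal spend grows over time.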

https://redd.it/1bscioc
@r_devops
Failed to connect to your instance after deploying MERN app on AWS EC2 instance

I dockerized my MERN app (Next.js, Node.js, MongoDB) and am trying to deploy it on an AWS EC2 instance. When I access my backend on port 5000 via the AWS public IP, it works fine. When I try to access the frontend, the terminal gets stuck, and if I try to reload the terminal, SSH gives this error:
Failed to connect to your instance
Error establishing SSH connection to your instance. Try again later.
I then have to stop and start the instance. After that the backend works fine again, but accessing the frontend gives the error again. This is what my folder structure looks like:

The myecommerce folder contains the folders backend, frontend, and nginx (nginx has two files: a Dockerfile and nginx.conf), plus docker-compose.yml.

This is what my nginx Dockerfile looks like:

FROM nginx:latest


RUN rm /etc/nginx/conf.d/*

COPY ./nginx.conf /etc/nginx/conf.d/

CMD ["nginx", "-g", "daemon off;"]

This is what my nginx.conf file looks like:

events {}

http {
    server {
        listen 80;
        server_name <my AWS public IP>;

        location / {
            root /usr/share/nginx/html;
            index index.html index.htm;
            try_files $uri $uri/ /index.html;
        }
    }
}

This is my frontend folder Dockerfile:

FROM node:20-alpine

WORKDIR /app

COPY package*.json ./

RUN npm install

COPY . .

EXPOSE 3000

CMD npm run dev

This is my backend folder Dockerfile:

FROM node:20-alpine

RUN npm install -g nodemon


WORKDIR /app

COPY package*.json ./
RUN npm install

COPY . .

EXPOSE 5000

CMD ["npm", "run", "dev"]

This is what my docker-compose.yml looks like:

version: '3'
services:
  frontend:
    image: <my frontend image from Docker Hub>
    ports:
      - "3000:3000"

  backend:
    image: <my backend image from Docker Hub>
    ports:
      - "5000:5000"

  nginx:
    image: <my nginx image from Docker Hub>
    ports:
      - "80:80"

Later I want to set up GitHub CI/CD pipelines for it and use a custom domain to access the website. I am not sure whether I still need to set up pm2 if I am using docker-compose. I am also posting my inbound rules. I don't know why the frontend is not working. Guys, I am a beginner in AWS deployment and dockerization and I am improving my skills. Please help me. I have been stuck on this for many days. I have read a lot of articles and watched multiple videos, but not a single one does what I am actually trying to do. Thanks in advance.


https://redd.it/1bseijj
@r_devops
Container orchestration vs. VM orchestration in the cloud.

I'm trying to understand the specific use cases where we'd prefer to use container orchestration (Kubernetes) as opposed to VM orchestration (Nomad) in a cloud setting.

It seems clear to me that if you're focused on batch jobs, you're working with single-purpose VMs that are started and then destroyed after doing their specific bit of work. Setting up a VM image that provisions them with everything they need would seem to introduce less overhead into the cluster, so it wouldn't make much sense to use Kubernetes for a case like this. The distinguishing properties of the cloud that make it easy to find one or more VMs matching the required scaling seem to me to make it as elastic and malleable as container-level orchestration.

In what specific cases would you prefer to use Kubernetes?

https://redd.it/1bshdqx
@r_devops
Coursera Plus at 90% off

I will invite you by email (corporate invite) to use Coursera Plus for a year (worth $399) for $39. Obviously, you won't pay me until I have provided whatever proof you require and you are satisfied. If anyone genuinely needs it, DM me and I'll help!

https://redd.it/1bsj1nc
@r_devops
Vulnerability Management Lifecycle in DevSecOps

This is the first entry in a series on a technology-driven, automated approach to DevSecOps architecture! This post helps you set up your teams for success in making sense of all the noise that comes from various vulnerability scanners.

https://blog.gitguardian.com/vulnerability-management-lifecycle-in-devsecops/

https://redd.it/1bskn2k
@r_devops
How do you monitor the uptime of different microservices in k8s?

tl;dr: Got a bunch of third party cybersecurity tools/services running in our k8s cluster, I need to figure out a way to measure/benchmark the uptime of different microservices that these tools spin up so we can establish some SLOs.

Bit of background: I am on a small DevOps team that supports my company's internal security team, which in turn supports 5000+ devs.

Almost everything we run is vendor tooling for all kinds of different security scanners, using their helm charts/manifests/w.e. Some of these tools have their own monitoring, but they are ok at best for our needs.

I am looking for a solution to help me monitor the uptime of all the different microservices that get spun up by these tools. We do have grafana/prometheus setup, and I've got prometheus blackbox exporter running for probing HTTP endpoints without too much logic built into it, but that doesn't always paint the whole picture.

It'd be nice to aim for 99% uptime, but 95% as a start is also acceptable. The stuff we run isn't super critical except for a few times per year, but we keep a close eye on the cluster during that time anyways. So whatever solution I come up with, it needs to check every 5-10 minutes to give a good enough granularity for measuring up to 99%. Two main options that I am considering and one kind of crazy one:

- Expand upon Blackbox exporter, try and get it to hit as many API endpoints as possible. I think most things we run have some kind of an API that I can use to check whether a service is up or not. I'd want to avoid this though because I am personally not a fan of writing too much logic in YAML
- Add service specific labels to each pod, so if ALL pods with that label go down, I know the particular service is degraded.
- Write a custom operator? Never wrote one before, but maybe this is the answer?
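
For the granularity question above, the arithmetic can be checked with a short Python sketch. It assumes a 30-day window and a fixed probe interval; the function names are illustrative and not part of Prometheus or the Blackbox exporter.

```python
import math
from typing import Sequence


def probes_per_window(interval_minutes: int, days: int = 30) -> int:
    """Number of probes taken in the window at a fixed probing interval."""
    return days * 24 * 60 // interval_minutes


def allowed_failures(slo_percent: float, interval_minutes: int, days: int = 30) -> int:
    """How many failed probes the window tolerates before the SLO is missed."""
    total = probes_per_window(interval_minutes, days)
    # Round first so floating-point noise cannot push an exact value below its integer.
    return math.floor(round(total * (100.0 - slo_percent) / 100.0, 9))


def uptime_percent(probe_results: Sequence[bool]) -> float:
    """Share of successful probes, as a percentage."""
    if not probe_results:
        raise ValueError("no probe results")
    return 100.0 * sum(probe_results) / len(probe_results)


print(probes_per_window(5))      # 8640 probes in 30 days at 5-minute intervals
print(allowed_failures(99, 5))   # 86
print(allowed_failures(95, 10))  # 216
```

So at a 5-minute interval, a 99% target over 30 days leaves room for 86 failed probes, which is enough resolution to tell 99% from 95%.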

https://redd.it/1bsmhdx
@r_devops
Need advice about end to end testing

Hi all,

I’m new to the world of dev ops and while there is a lot to learn, I am enjoying it so far. In particular, I like that dev ops allows me to increase confidence in my deployments, and have better control over quality.

One of the areas in which I’d like to improve is my frontend deployment. My stack consists of a backend in one repository and several decoupled React frontends, each of which lives in its own repository. I want to have full confidence that I don’t accidentally break the integration between the frontend and backend when deploying new frontend code, i.e. that the frontend successfully calls the API of my backend every time I deploy.

The way I am thinking of approaching this is:

In my GitHub Actions workflow, add a build step for an end-to-end test in my frontend repository. This build step accesses the repository for my backend, deploys a production-like environment, and then runs end-to-end tests against this environment. Once the tests are over, it tears down the test environment.

I am wondering if this is a valid approach? I’m curious how mature organizations handle this sort of thing.
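
The deploy, test, and tear-down flow described above can be sketched in Python as follows. This is only an outline under assumptions: `deploy`, `teardown`, and the test callables are hypothetical stand-ins for whatever the workflow actually runs. The point it shows is that the teardown should happen even when a test fails or raises.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, Sequence


@contextmanager
def ephemeral_environment(
    deploy: Callable[[], str],         # hypothetical: deploys the backend, returns its base URL
    teardown: Callable[[str], None],   # hypothetical: destroys the environment at that URL
) -> Iterator[str]:
    """Deploy a test environment, yield its base URL, and always tear it down."""
    base_url = deploy()
    try:
        yield base_url
    finally:
        teardown(base_url)


def run_e2e(
    deploy: Callable[[], str],
    teardown: Callable[[str], None],
    tests: Sequence[Callable[[str], bool]],
) -> bool:
    """Run every test against a fresh environment; True only if all of them pass."""
    with ephemeral_environment(deploy, teardown) as base_url:
        # Build the full list first so one failure does not skip the remaining tests.
        results = [test(base_url) for test in tests]
        return all(results)
```

In a CI job, the boolean result would decide the exit code of the step, while the `finally` block keeps failed runs from leaving environments behind.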

https://redd.it/1bsn6l1
@r_devops
AWS cost limit.

I’m an absolute beginner with AWS and I only have on-premise or private cloud experience.
I like experimenting with new technologies and I’m not afraid to break things. However, this never applied to AWS because I was afraid of financial ruin.
This situation sucks. I would like to learn AWS in a safe environment, knowing that whatever happens I will never be charged more than e.g. $30 per month.
Does such an option exist?

https://redd.it/1bslxt7
@r_devops
What types of SLOs are you creating?

Do you guys have service-level SLOs? On my team, we only have SLOs for critical user journeys (CUJs), with a couple of SLOs per CUJ that encompass all services involved in that CUJ. There are no SLOs on any individual service.

A lot of documentation that I read seems to talk about service-level SLOs. If you use these, do you alert on them? What CUJ do you group them into, given that a single service could belong to multiple CUJs? Do you use both CUJ and service-level SLOs?

I am trying to figure out if we are doing things incorrectly and should create SLOs per service as well

Also this seems to point to doing more product-level SLOs: https://sre.google/resources/practices-and-processes/product-focused-reliability-for-sre/#measure-performance
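
One way to relate the two levels is a back-of-the-envelope calculation. The Python sketch below assumes that a CUJ needs every service on its path and that the services fail independently, which real systems only approximate; the function names are illustrative.

```python
from math import prod
from typing import Sequence


def journey_availability(service_availabilities: Sequence[float]) -> float:
    """Availability of a journey that needs every listed service, assuming independent failures."""
    return prod(service_availabilities)


def error_budget_minutes(slo: float, days: int = 30) -> float:
    """Minutes of allowed unavailability in the window for an SLO given as a fraction."""
    return (1.0 - slo) * days * 24 * 60


# Three services at 99.9% each give a journey slightly below 99.9%.
print(round(journey_availability([0.999, 0.999, 0.999]), 6))  # 0.997003
print(round(error_budget_minutes(0.99), 1))                   # 432.0
```

Under these assumptions, a CUJ-level target implies tighter targets for each service on its path, which is one argument for tracking both levels.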



https://redd.it/1bsrbpu
@r_devops
Introducing Templater: A Simple CLI, inspired by helm, for Text File Templating for Developers

Today, I'm both excited and humbled to share a project that's been a labor of love and necessity: Templater.

This journey started with a personal frustration I encountered in my development work. The need for a simple, powerful way to template text data without diving into the deep end of another programming language led me to create Templater, drawing inspiration from Helm's templating capabilities.
Check out Templater on GitHub
Templater is an open-source tool that leverages the Sprig library, allowing you to template not just individual text files but entire directories. It's designed to be intuitively familiar for those of you who've worked with Helm. The main difference is that you can feed it any file or directory.
I've included a practical example that demonstrates Templater's real-world application: a multi-region Packer build. This example, found in the examples directory, illustrates how Templater can streamline and simplify complex tasks, making it an invaluable tool in your development arsenal.
I warmly invite you to explore Templater, try it in your projects, and share your feedback. Your insights and contributions will be invaluable as we continue to refine and expand Templater's capabilities together.
Thank you for your support and curiosity. Let's make the development process a bit easier for everyone.
Warm regards,

Rajesh Rajendran

https://redd.it/1bsow5n
@r_devops