Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
If you could define your own responsibilities...

I've had the golden offer: a job where I define my own responsibilities in a senior role within a large tech organisation.

If you were in this position, what would you do?

https://redd.it/10x66dd
@r_devops
I want to get certified on GCP, Azure or AWS. Which organisation would give me the best future prospects?

Title says it all really. What do you think?

It feels like AWS is more popular.

https://redd.it/10x62ur
@r_devops
Master's Grad in 2022. Cannot land a first Round interview.

Hello y'all, I'm trying to get into mid-to-senior SRE/DevOps roles.

I've been having a tough time since I graduated and can't seem to land even first-round interviews. I've spent 100+ hours on this resume thinking something I am doing is wrong. Still unable to land even a SINGLE interview round!

I know my DSA and have 3+ years of legit experience, but things are getting bad very steeply. Shooting a shot here to see if someone can critique my resume.

XXX-XX-XX | XXXX, Open to relocation | LinkedIn Profile | GitHub | [email protected]

An AWS Certified Solutions Architect with a master’s degree in Computer Engineering and 4+ years of hands-on experience in developing event-driven, cloud-native applications in public clouds like AWS with services like Lambda, Fargate, ECS, S3, etc. I am highly skilled in platform reliability, microservices design, serverless technologies, test automation and DevOps practices.

SKILLS:

Languages (4+ years): Python (boto3, NumPy, Requests), Bash (Linux/CentOS), JavaScript (React/Redux/Node).

AWS services (5+ years): AWS CLI/SDK | Lambda, Fargate, CloudFormation, ECS, CloudWatch, S3, RDS, Kinesis.

DevOps (4+ years): Network Administration, NetApp, Docker, Kubernetes, Jenkins, CircleCI, Git, GitHub.

Amazon Web Services (5+ years): AWS CLI/SDK, Serverless Framework, AWS Backups, VPC, API Gateways, Lambda, EC2, EBS, EKS, CloudFormation, CloudWatch, S3, DynamoDB.

Web Development / Others: Authentication/Authorization (OAuth, JWT, RBAC, SSO), Microsoft Power BI, Databases (MySQL, PostgreSQL, Redis).

PROFESSIONAL ACHIEVEMENTS

Software Engineer – DevOps | 2018 - 2021 | Organization | Durham, NC

· Raised the KPIs expected in a DevOps role, consistently delivered 99.99% SLA, took ownership of all infrastructure for release/change management, performed zero-downtime deployments, and completed a fully automated software testing system in under 2 months.

· Spearheaded agile teams in implementing well architected frameworks/best practices in AWS which led to successful ISO 9001:2015 Quality Management certification for XXXX.

· Patched 500+ issues in React/Redux front end systems, Relational Databases and configuration management.

· Created/Managed custom AMIs, Docker Images and Kubelets to complete deployments. Performed patch management for EC2 instances, shifted security and QA practices to the left of SDLC.

· Authored reusable Infrastructure as Code templates (CloudFormation, Makefiles) for dynamic provisioning of AWS resources like EKS, EC2, Lambda, S3 within private/public VPCs on AWS cloud.

· Authored workflows that preprocessed and analyzed code (CircleCI, Jenkins, Veracode) to identify many software vulnerabilities in early stages of SDLC. Competitively performed Code Reviews for Python, ReactJS and Infra.

· Managed Public/Private VPCs. Ensured highly available and resilient architectures for enterprise software platforms.

EDUCATION

Master’s in Computer Science Jan 2021 - Dec 2022

(Some research information here) GPA: 3.69

Bachelor’s in Computer Science and Engineering | Aug 2013 – May 2017 | GPA: 3.75

https://redd.it/10x5b4w
@r_devops
How to set up a Datadog alert monitor over a period of time?

I'm trying to set up a Datadog monitor to check whether a pod has been in a certain status for over ten minutes. How can I do this? The Datadog subreddit is locked and I could not find the relevant information in the Datadog docs; maybe I’m not looking in the right place. Help would be appreciated!

https://redd.it/10x4rv3
@r_devops
A better way to manage secrets in Kubernetes

I wrote an article on how to better manage secrets in Kubernetes by using a custom operator I made. The operator fetches secrets, puts them in a Kubernetes secret, and can automatically reload deployments that depend on the fetched secrets. You can think of it as a wrapper around native Kubernetes secrets.

Article: https://maidul.medium.com/kubernetes-secrets-management-on-autopilot-36e0c6373024

https://redd.it/10xjxtf
@r_devops
Looking for DevOps learning partner

Hello everyone, I’ve recently started learning DevOps and am looking for someone who is eager to learn and share knowledge together.
I intend to study AWS, Azure DevOps, Docker, Kubernetes, Terraform, and other related technologies.
Hit me up if you’re interested.

Discord username: Illusive man#1442

https://redd.it/10xko9j
@r_devops
work sucks

The best vote to end this wins, submit your ideas! maybe your idea will make the news!!! how exciting!

https://redd.it/10xmibq
@r_devops
Do these sentences make sense?

I'm a tech writer interviewing DevOps engineers who speak English as a second language. I just wanted to ask if these sentences make sense and are properly ordered. I know some of the tools, but not all.

Just want to make sure it's not something redundant like "I know Adobe Creative Cloud, Photoshop, Illustrator, Creative Suite, Figma..."

What tools for DevOps have you worked with?

"I’ve worked with many tools with CI/CD like Jenkins, ArgoCD, Github Actions, Ansible, Terraform, CloudFormation, Docker and Chef."

What's your ideal tech stack?

"Docker, Python, Kubernetes, AWS EKS, ArgoCD, Github Actions, Terraform Cloud. I'm already working with most of them."

https://redd.it/10xmwsr
@r_devops
Don’t have a CS degree, but want to learn CS fundamentals and practices (not necessarily a specific language). Where can I do this?

I graduated with a B.S. in IT (not CS specifically) in 2011, so I’ve been working for over a decade. Help desk, sysadmin, systems engineer, etc…

I’m currently a “Software Engineer”, but I do DevOps. Working with AWS, Terraform, Jenkins, Kubernetes, etc… It’s more Ops than Dev.

I self-learned Bash scripting and Python, but struggle to keep up with a coworker who has a legitimate CS degree. When he starts talking about “strongly-typed object-oriented programming languages”, I get lost. I just write bad Python scripts to make API calls, process YAML files, etc…

Where can I learn programming fundamentals? I don’t want to only learn a specific language, I want to learn programming jargon, best practices, architecture, etc…

I found this course on edX which I’m considering signing up for. Thoughts?

https://www.edx.org/course/software-engineering-basics-for-everyone

https://redd.it/10xmrgv
@r_devops
How can I implement Terraform CD in Bitbucket Server?

Currently we use Bitbucket Server to manage our code, and we also have Jenkins. I want to implement Terraform CD similar to what GitHub Actions allows:

comment "/tf plan" in PR -> run terraform plan, output the result to a PR comment and to Slack

comment "/tf apply" in PR -> run terraform apply; if the apply succeeds, automatically merge the PR

I have used GitHub Actions, where the above procedure is easy to implement with an operator server.

I am wondering whether bitbucket server can easily achieve this or not.

Could you give me some ideas?
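One possible starting point, independent of the CI system, is the small piece of logic that turns a PR comment into a Terraform invocation. The sketch below is only an illustration: the function names `parse_tf_comment` and `should_merge` are hypothetical, and how the comment reaches this code (for example a Bitbucket Server webhook that triggers a Jenkins job) is an assumption, not something confirmed in the post.

```python
import re
from typing import List, Optional

# Hypothetical mapping from the "/tf <action>" comment to Terraform CLI arguments.
COMMANDS = {
    "plan": ["terraform", "plan", "-input=false", "-no-color"],
    "apply": ["terraform", "apply", "-input=false", "-no-color", "-auto-approve"],
}

# Only accept a comment that consists of exactly "/tf plan" or "/tf apply".
COMMENT_PATTERN = re.compile(r"^\s*/tf\s+(plan|apply)\s*$")


def parse_tf_comment(comment: str) -> Optional[List[str]]:
    """Return the Terraform command for a '/tf plan' or '/tf apply' comment, else None."""
    match = COMMENT_PATTERN.match(comment)
    if match is None:
        return None
    return list(COMMANDS[match.group(1)])


def should_merge(action: str, exit_code: int) -> bool:
    """Merge the PR only after a successful apply, as described in the post."""
    return action == "apply" and exit_code == 0
```

The returned list could be passed to `subprocess.run` inside the CI job, with the captured output posted back to the PR and Slack by whatever integration is available.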

https://redd.it/10xp0q9
@r_devops
Github Actions vs CircleCI for 'advanced workflows'

Hello!


I'm currently considering moving our sizable CircleCI setup (multiple pipelines, about 50 active developers) over to Github Actions, with pricing being one of the main arguments for doing so.

I have at least basic knowledge and some experience with both tools.

One thing I keep reading while researching this is that "CircleCI has better support for advanced workflows", without any explanation of what exactly is meant by that.

Could anyone point me to specific features/workflows that are supported by CircleCI that we'd be missing in Github Actions? And are there any arguments I should know of for sticking with CircleCI?

https://redd.it/10x39cx
@r_devops
Start a bat file remotely which never returns anything (jmeter-server.bat)

So we are doing distributed testing of our web app using JMeter. For that, you need to have the jmeter-server.bat file running in the background, as it acts as a sort of listener. The problem arises when one of the 4 slave machines restarts due to the load: the test is effectively stuck right there, as the master machine expects some output from the 4th machine. Currently the automation is done via Ansible playbooks which are called in Jenkins. There are roughly 15 tests that are downstream of one another, so if even one test is stuck, time is wasted until someone checks on the machines.

Things I've tried so far:

1. I've tried using the Windows Task Scheduler and set jmeter-server.bat to run without any user logged in, but it starts the bat file in the background, which in turn spawns all the child processes in the background as well, i.e. it starts Selenium Chrome in headless mode.

2. I've tried adding jmeter-server.bat to startup and configuring the system to AutoLogon without any password to trigger a session which will call the startup file. Unfortunately, the idea was scrapped by IT for being insecure.

3. I've tried using the win_command module in the Ansible playbook, but it again gets stuck as the batch file never returns anything.

4. I've created a service for the bat file as well, but again the child processes started in the background.

https://redd.it/10xrhzr
@r_devops
Using renovatebot to generate one PR per file, regardless of how many changes

Hi folks,

Recently I wrangled my renovate config to ensure that I'd get a single PR generated per-file, even if that file included multiple changes from multiple "managers". In my case, I needed to combine helmrelease updates, as well as helm values (for image updates).

I wrote up the process here: https://geek-cookbook.funkypenguin.co.nz/blog/2023/02/07/consolidating-multiple-manager-changes-in-renovate-prs/

I'd be grateful for feedback, or suggestions for improvement!

D

https://redd.it/10xqrsy
@r_devops
Is ELK overkill for this?

I have a use case where I want to parse application logs in real time (real time means that the app logs are updating continuously, not that there is a time deadline to process the logs in), search for error logs, check whether there have been more than 10 errors in the last 10 minutes, and then notify an external REST API endpoint.

Currently we are using Filebeat to push to Elastic and then setting up an alert in Grafana that notifies the REST endpoint.
This adds a lot of "middlemen" to the alerting and notification system, which I see as points of failure.
Is there a simpler way to bypass all of this and just write a Linux bash service that continuously keeps track of the errors in a logfile and alerts based on a threshold? Is this possible in bash?
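The counting rule described above (more than 10 errors in the last 10 minutes) is small enough to sketch directly. The example below uses Python rather than bash, purely as an illustration; the class name `ErrorWindow` and the `"ERROR"` substring check are assumptions about the log format, not details from the post.

```python
import time
from collections import deque
from typing import Deque, Optional


class ErrorWindow:
    """Tracks error timestamps and reports when a threshold is exceeded in a time window."""

    def __init__(self, threshold: int = 10, window_seconds: float = 600.0) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()

    def record(self, line: str, now: Optional[float] = None) -> bool:
        """Record one log line; return True if the error count now exceeds the threshold."""
        now = time.time() if now is None else now
        if "ERROR" in line:  # assumed marker; adjust to the real log format
            self._timestamps.append(now)
        # Drop errors that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold
```

A small loop that follows the logfile could call `record` for each new line and send the notification to the REST endpoint whenever it returns `True`. Whether this is more robust than the existing pipeline depends on how the script itself is supervised and restarted.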

https://redd.it/10xtfk4
@r_devops
DevOps Tutorials Twitch Channel

Just wanted to throw out a shameless plug for some new work I'm doing on Twitch. I'll be converting some of these streams to YouTube videos, but on stream I'm showing how to set up things like:

Kubernetes

Chatbots

CI/CD for chatbots using GitHub Actions, etc

Python coding

And just overall topics related to new tech, etc. I play some video games here and there, but if learning more about some of these topics interests you, would love for ya to come follow. I'll be setting up a more regular schedule after I get another job.

I want to use it as a platform for community-sourced learning, to talk about topics and try things. I don't know everything, and I think we are always learning in this field, so it would be cool to have discussions.

Last night I was messing around with a Rasa chatbot (open source, Python) and talking to the NASA API to pull the image of the day from them. You can interact with the bot in my Twitch chat 24x7.

Would love any feedback or topics you might want to see.

https://www.twitch.tv/devopswithbrian

Thanks!

https://redd.it/10xt9p2
@r_devops
DevOps course for small companies and individuals

Hello everyone,

I've made a DevOps course covering a lot of different technologies and applications, aimed at startups, small companies and individuals who want to self-host their infrastructure.
To get this out of the way - this course doesn't cover Kubernetes or similar - I'm of the opinion that for startups, small companies, and especially individuals, you probably don't need Kubernetes. Unless you have a whole DevOps team, it usually brings more problems than benefits, and unnecessary infrastructure bills buried a lot of startups before they got anywhere.

As for prerequisites, you can't be a complete beginner in the world of computers. If you've never even heard of Docker, if you don't know at least something about DNS, or if you don't have any experience with Linux, this course is probably not for you. That being said, I do explain the basics too, but probably not in enough detail for a complete beginner.

Here's a 100% OFF coupon if you want to check it out:

https://www.udemy.com/course/real-world-devops-project-from-start-to-finish/?couponCode=FREEDEVOPS2302FIAPO

Be sure to BUY the course for $0, and not sign up for Udemy's subscription plan. The Subscription plan is selected by default, but you want the BUY checkbox. If you see a price other than $0, chances are that all coupons have been used already.

I encourage you to watch the "free preview" videos to get a sense of what will be covered, but here's the gist:

The goal of the course is to create an easily deployable and reproducible server which will have "everything" a startup or a small company will need - VPN, mail, Git, CI/CD, messaging, hosting websites and services, sharing files, calendar, etc. It can also be useful to individuals who want to self-host all of those - I ditched Google 99.9% and other than that being a good feeling, I'm not worried that some AI bug will lock my account with no one to talk to about resolving the issue.

Considering that it covers a wide variety of topics, it doesn't go in depth in any of those. Think of it as going down a highway towards the end destination, but on the way there I show you all the junctions where I think it's useful to do more research on the subject.

We'll deploy services inside Docker and LXC (Linux Containers). Those will include a mail server (iRedMail), Zulip (Slack and Microsoft Teams alternative), GitLab (with GitLab Runner and CI/CD), Nextcloud (file sharing, calendar, contacts, etc.), checkmk (monitoring solution), Pi-hole (ad blocking on DNS level), Traefik with Docker and file providers (a single HTTP/S entry point with automatic routing and TLS certificates).

We'll set up WireGuard, a modern and fast VPN solution for secure access to the VPS's internal network, and I'll also show you how to get a wildcard TLS certificate with certbot and a DNS provider.

To wrap it all up, we'll write a simple Python application that will compare a list of the desired backups with the list of finished backups, and send a result to a Zulip stream. We'll write the application, do a 'git push' to GitLab which will trigger a CI/CD pipeline that will build a Docker image, push it to a private registry, and then, with the help of the GitLab runner, run it on the VPS and post a result to a Zulip stream with a webhook.
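The comparison step of such an application could look roughly like the sketch below. This is not the course's code: the function names `compare_backups` and `format_report` and the message wording are assumptions, and posting the resulting text to a Zulip stream via a webhook is left out.

```python
from typing import Iterable, List, Tuple


def compare_backups(desired: Iterable[str], finished: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) backup names, each sorted alphabetically."""
    desired_set, finished_set = set(desired), set(finished)
    missing = sorted(desired_set - finished_set)      # wanted but not done
    unexpected = sorted(finished_set - desired_set)   # done but not on the list
    return missing, unexpected


def format_report(missing: List[str], unexpected: List[str]) -> str:
    """Build a short text message suitable for posting to a chat stream."""
    if not missing and not unexpected:
        return "All desired backups finished."
    lines = []
    if missing:
        lines.append("Missing backups: " + ", ".join(missing))
    if unexpected:
        lines.append("Unexpected backups: " + ", ".join(unexpected))
    return "\n".join(lines)
```

Keeping the comparison separate from the delivery makes it easy to test the logic in the CI/CD pipeline before the image is built.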

When done, you'll be equipped to add additional services suited for your needs.

If this doesn't appeal to you, please leave the coupon for the next guy :)

I hope that you'll find it useful!


Happy learning,
Predrag

https://redd.it/10xttch
@r_devops
How proficient should Solution Architects be at writing code?

I am a lead architect for a retail enterprise org that has a small global footprint.

I have an engineer on the engineering team who comes from a Cloud Consultant background (Think Cloudreach/Logicworks) who can write some bash, but is terrible with writing any other language.

We have a tool that generates CloudFormation templates using Python, which is our IAC tool that leverages an open-source Python library. We live and die by this tool for our IAC, which I have implemented and is rock solid for our org, but this individual is on the struggle bus to use the tool at anything but a beginner level due to his lack of coding skill, and has little to no desire to level-up.
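For readers unfamiliar with the approach, generating CloudFormation from Python can be as simple as building the template as a dictionary and serializing it. The sketch below uses only the standard library and is not the tool or the open-source library mentioned above; the function name `bucket_template` and the logical ID are made up for illustration.

```python
import json
from typing import Any, Dict


def bucket_template(logical_id: str, versioned: bool = False) -> Dict[str, Any]:
    """Build a minimal CloudFormation template containing a single S3 bucket."""
    properties: Dict[str, Any] = {}
    if versioned:
        properties["VersioningConfiguration"] = {"Status": "Enabled"}
    resource: Dict[str, Any] = {"Type": "AWS::S3::Bucket"}
    if properties:
        resource["Properties"] = properties
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {logical_id: resource},
    }


if __name__ == "__main__":
    # Print the template as JSON, ready to be passed to CloudFormation.
    print(json.dumps(bucket_template("ArtifactBucket", versioned=True), indent=2))
```

Even a tool of this shape requires comfort with functions, conditionals and data structures, which is the skill gap the post is describing.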

He is very strong with how AWS services work and how they should be implemented in an org (Account management, config enforcement, etc), and understands when/how apps should be designed from the "pillar perspective".

I got talking with my leader, who seems to think that "Most Solutions Architects don't need to have strong development chops", which I strongly disagree with.

I understand that in big cloud consulting agencies, a team of people is presented to clients, each with the specialties needed to fulfill the SOW, and from that perspective it might be fine. But in Enterprise, I was wondering what your experience is.

In our org, we need to empower our developers by helping them implement code in pipelines across multiple languages, which almost always involves reading their code and suggesting implementation changes, in addition to maintaining our own IAC tools. On top of that comes the normal architecture guidance on reliability, fault tolerance, etc.

IMO, having an idea of how it should be designed is only part of the picture. Architects should also know how to hand off the implementation and be familiar with the tools used to implement the design, which in our org consist of Groovy- and Python-based tools.

So, what does reddit think? How proficient should an Architect be at writing code?

https://redd.it/10xvhd1
@r_devops
Comparison among techniques to share GPUs in Kubernetes

I recently released an [open-source library to dynamically leverage GPUs with NVIDIA MIG and MPS](https://github.com/nebuly-ai/nos), and the most appreciated component was the comparison among sharing technologies, so I wanted to share it here.

There are three approaches for sharing GPUs in Kubernetes:

1. Multi-Instance GPU ([MIG](https://github.com/NVIDIA/mig-parted))
2. Multi-Process Service ([MPS](https://docs.nvidia.com/deploy/mps/index.html))
3. Time Slicing ([TS](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/gpu-sharing.html))

# Multi-Instance GPU (MIG)

**Workload isolation**: best

**Pros**

* Processes are executed in parallel
* Full isolation (dedicated memory and compute resources)

**Cons**

* Supported by fewer GPU architectures (only Ampere or more recent architectures)
* Coarse-grained control over memory and compute resources

**References**: [Tutorial on how to use Dynamic MIG Partitioning](https://towardsdatascience.com/dynamic-mig-partitioning-in-kubernetes-89db6cdde7a3)

# Multi-Process Service (MPS)

**Workload isolation**: medium

**Pros**

* Supported by almost every GPU architecture
* Processes are executed in parallel
* Fine-grained control over memory and compute resources allocation
* It lets you set up memory limits

**Cons**

* No memory protection and error isolation

**References**: [Comparison of sharing techniques and tutorial on how to use MPS](https://towardsdatascience.com/how-to-increase-gpu-utilization-in-kubernetes-with-nvidia-mps-e680d20c3181)

# Time Slicing

**Workload isolation**: none

**Pros**

* Supported by almost every GPU architecture
* Processes are executed concurrently

**Cons**

* No resource limits
* No memory isolation
* Lower performance due to context-switching overhead

**References**: [Time-Slicing GPUs in Kubernetes](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/gpu-sharing.html)

**Resources**

* [Dynamic GPU Partitioning documentation](https://docs.nebuly.com/nos/dynamic-gpu-partitioning/overview/)
* [NVIDIA GPU Operator documentation](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/getting-started.html)
* [NVIDIA MIG User guide](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/)
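The comparison above can also be written down as data, which makes the trade-offs easy to filter. The sketch below is only an illustration of the ratings given in this post; the class `SharingTechnique`, the function `candidates`, and the two filter criteria are hypothetical and are not part of the nos library or of any NVIDIA tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SharingTechnique:
    name: str
    isolation: str               # "best", "medium" or "none", as rated in the post
    memory_limits: bool          # can memory be bounded per workload?
    ampere_or_newer_only: bool   # restricted to Ampere or more recent GPUs?


# Ordered from strongest to weakest workload isolation.
TECHNIQUES = [
    SharingTechnique("MIG", "best", True, True),
    SharingTechnique("MPS", "medium", True, False),
    SharingTechnique("Time Slicing", "none", False, False),
]


def candidates(needs_memory_limits: bool, has_ampere_or_newer: bool) -> List[str]:
    """Return the names of techniques compatible with the given constraints."""
    result = []
    for technique in TECHNIQUES:
        if technique.ampere_or_newer_only and not has_ampere_or_newer:
            continue
        if needs_memory_limits and not technique.memory_limits:
            continue
        result.append(technique.name)
    return result
```

For example, `candidates(True, False)` leaves only MPS, which matches the reading that MPS is the option for older GPUs when memory limits are required.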

https://redd.it/10xty21
@r_devops
Need a catchy name for a data migration tool

My company is developing a new data migration tool and we're trying to come up with a catchy name. Open to any and all suggestions!

https://redd.it/10xy5j9
@r_devops
Terraform vs. CloudFormation for an all-AWS Environment in 2023?

My current company uses CloudFormation for everything. I work in an AWS-only environment (except for a few data workloads on GCP, which use Terraform, but they're an exception and not worth considering in this question).

I'm wondering: in 2023, is there a tangible benefit to ripping up all our CloudFormation and rewriting it all in Terraform, assuming we have no plans to migrate off of AWS anytime soon?

What are the pros and cons of Terraform vs. CloudFormation in 2023 for an AWS-only company?

https://redd.it/10y3p20
@r_devops