Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Udacity's nanodegree reviews

I bought Udacity's SRE nano-course to upskill and get better at interviews. It claimed that at 10 hours a week the course would take four months to complete.


I literally completed it in a day; their estimates are a joke. Worse, they try to get you to sign up for four months for a single class. They do not have a subscription model where you can take different courses or nanos or whatever they call them. Luckily they offer a 7-day refund period.


In terms of content, it mainly consists of very short videos, some reading modules, and some simple quizzes. There was only one hands-on assignment, and the final project closely resembles that assignment, which is too easy even for a junior engineer. The instructions/solutions for one lab were wrong and outdated; I had to go to the forums to find out that they don't update their content. The final project was literally setting up Prometheus/Grafana on a K8s cluster and monitoring four metrics.

One of their other Cloud Engineer courses was reputable, and they have other programs too https://www.udacity.com/course/cloud-native-application-architecture-nanodegree--nd064 but it seems sus after my experience. Maybe their other nanos like ML/robotics are good, but this course was falsely advertised.

https://redd.it/13jrxqw
@r_devops
Best Metrics for Scaling Django REST API with KEDA?

Hey! Exploring KEDA for scaling Django API services. Is the request count from Prometheus NGINX Ingress:

round(sum(irate(nginx_ingress_controller_requests{ingress="api",exported_service="timelog"}[1m])) by (exported_service), 0.001)

a reliable metric for KEDA?
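For context on what the query value ends up doing: a rough sketch of how a request-rate metric maps to a replica count, assuming the Prometheus trigger uses an average-value target (so the total is divided by the threshold). The function name and the example numbers are made up for illustration, and this ignores the autoscaler's tolerance and stabilization behaviour.

```python
import math


def desired_replicas(metric_value: float, threshold: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Approximate replica count for an average-value style target.

    metric_value: result of the Prometheus query (e.g. requests per second).
    threshold:    requests per second one replica is expected to handle
                  (hypothetical value, must be tuned per service).
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    raw = math.ceil(metric_value / threshold)
    # Clamp to the configured bounds, as an autoscaler would.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Illustrative only: 250 req/s with a threshold of 100 req/s per replica.
    print(desired_replicas(250, 100))
```

Because the replica count follows the query value directly, a spiky expression makes the replica count spiky too, which is worth keeping in mind when choosing the window.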

https://redd.it/13jrghc
@r_devops
Modeling EC2 on-demand vs reserved instance pricing

Managing cloud costs can be quite complex. You've got to predict the future while understanding how pricing structures impact all possible load patterns. In this post I discuss how reframing on-demand instances as a "premium" instead of reserved instances as a "discount" can shift your thinking. There's a break-even point at around 15 hours per day where on-demand and reserved instance pricing meet (across almost all instance types). I walk through how to calculate and model costs in a formal way. This can help you save the most money when auto-scaling.
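A minimal sketch of the break-even idea, not the model from the linked post. It assumes a reserved instance is billed for all 24 hours regardless of use, while on-demand is billed only for hours used. The hourly prices are hypothetical and were picked so the break-even lands at 15 hours.

```python
def break_even_hours_per_day(on_demand_hourly: float, reserved_hourly: float) -> float:
    """Daily usage (hours) at which on-demand cost equals reserved cost.

    on-demand cost per day = hours_used * on_demand_hourly
    reserved cost per day  = 24 * reserved_hourly (paid whether used or not)
    """
    return 24.0 * reserved_hourly / on_demand_hourly


def cheaper_option(hours_used: float, on_demand_hourly: float, reserved_hourly: float) -> str:
    """Return which pricing model is cheaper for a given daily usage."""
    on_demand_cost = hours_used * on_demand_hourly
    reserved_cost = 24.0 * reserved_hourly
    return "on-demand" if on_demand_cost < reserved_cost else "reserved"


if __name__ == "__main__":
    # Hypothetical prices in USD per hour, not taken from any price list.
    on_demand, reserved = 0.096, 0.06
    print(round(break_even_hours_per_day(on_demand, reserved), 2))
    for hours in (8, 15, 20):
        print(hours, cheaper_option(hours, on_demand, reserved))
```

Under these assumptions, an instance that runs fewer hours per day than the break-even is cheaper on-demand, and anything above it is cheaper reserved.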

https://redd.it/13iuxop
@r_devops
Introducing Terrateam Self-Hosted

https://github.com/terrateamio/terrateam

https://terrateam.io/blog/terrateam-self-hosted

I'm excited to announce the self-hosted version of Terrateam, designed for Terraform users who need to meet strict security and compliance requirements or just prefer to host their own deployment.

Really looking forward to folks kicking the tires, providing feedback, and submitting feature requests.

https://redd.it/13jvaxr
@r_devops
Venting - CI/CD requirements

This is just for general venting, I don't necessarily need any advice. I work for a company that just acquires other companies, it's their whole gimmick. That being said, when an acquisition happens it's always a mad house getting them integrated. The product I work on was acquired a few years ago and that integration just never happened. I like to joke that it's the forgotten child. What happened was the entire devops/infrastructure team quit immediately after the acquisition and the week before I started.

So I stepped into this absolute shit show of an environment and spent nearly 2 years patching holes and migrating to new infrastructure. Recently our build/deploy pipelines in Jenkins shit the bed more or less permanently. It grew out of control largely because of neglect on my part, which was due to me being bogged down with other requirements.

I proposed and more or less was given hand-wave approval to move to gitlab as the fix and future solution, since our code is already there, it's a huge cost saving, and it empowers the dev team to build their own pipelines and all that jazz. So, I start working on this and now I'm about halfway done. Our build and deploys are in gitlab and it's time to start setting up our automated tests. As I'm doing this it's now time to get financial approval to move to this.

All in all we'll be saving about 30k a year by ditching jenkins for gitlab. Builds, deploys, and testing (I've already set up a couple) are significantly improved with respect to completion time and reliability (jenkins builds would lose connection to the host node intermittently, among other things). It seems like an easy win.

Email gets sent off for approval from my boss to upper management and then a shit show ensues. My company largely uses BitBucket and a different Jenkins setup. Another of our devops teams now wants to own the builds and also wants to move from gitlab to bitbucket. I tell them this is a terrible idea as we have many client commitments through the end of the year and this will pose various issues and blockers while they get this product integrated. But the other team insists that this is the way it has to be.

So fine, I did my part with discussing and arguing over a few weeks and ultimately lost the argument. The other team used the bitbucket importer as a test run on one of our nastier repos and it goes horribly. We lost previous PRs and comments and whatnot. We have various public repos for reasons and the internal system/security standard is more or less a hell no with public content. That should have been a show stopper but I lost that argument as well. So they decide "fine, you can stay in gitlab for now as your SCM but builds have to be set up in the other jenkins". I stressed that the load we'll put on their small-ish minimal set up will cause problems, but again I lost the argument. So they set up the web hooks and whatnot then go "you gotta convert these libraries of scripts into declarative jenkinsfiles." Again, stupid busy idea that just prolongs the inevitable and wastes time as we're blocked on a few repos.

So I start rewriting these jenkinsfiles and whatnot, and sure enough the nasty repo starts beating the shit out of their jenkins. Now other products are blocked due to jenkins crashing. So we submit our tickets that it's broken and the other team starts working on fixing it. They increase the server resources and all that but it still isn't enough and jenkins keeps crashing. It's now a madhouse on their team because they were busy enough already. I can't do much because I don't have access to that infrastructure, so I'm twiddling my thumbs and working on a couple of other projects.

I'm like 90% sure they'll give up in a week or two but by that time we'll be 3 weeks behind schedule on commitments and sprints that we were booked through the year on. I'm a bit cynical and I love morbid humor. So, I've done nothing but have a laugh at the whole dumpster fire.

I've obviously glossed over a bit of detail but I think the gist is there.

TL;DR: I need to migrate our pipelines to something else, gitlab being the target. Halfway through, another team won an argument on not doing that in favor of other options. Despite evidence showing it was going to end terribly, they won the argument. And it's gone terribly, so I'm having a morbid laugh at the situation.

https://redd.it/13jvxzq
@r_devops
Repo for small scripts - What's the best practice

We use Azure DevOps as a source control repository. The primary language we use is C#. Occasionally, some devops engineers write small automation scripts in Python or bash, e.g. generating a list of stale branches, or deleting a large number of files from an S3 bucket.


What is the best practice to store such scripts? I am thinking of creating one dedicated repo just to store such scripts. This will provide all source control benefits.


A couple of downsides I can think of are the following, but I don't think these are major issues.
1) This repo will grow over time and engineers will need to pull all of it before contributing their own scripts
2) Engineers will need to be more careful about not pushing any secrets in those scripts
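As an illustration of the kind of small script such a repo would hold, here is a rough sketch of the "list stale branches" example. The function names, the 90-day cutoff, and the split between parsing and the git call are my own assumptions; it relies on `git for-each-ref` with the `%(committerdate:unix)` and `%(refname:short)` format fields.

```python
import subprocess
from datetime import datetime, timedelta, timezone


def parse_refs(output: str):
    """Parse lines of '<unix-timestamp> <branch-name>' into (name, datetime) pairs."""
    branches = []
    for line in output.splitlines():
        if not line.strip():
            continue
        timestamp, name = line.split(" ", 1)
        last_commit = datetime.fromtimestamp(int(timestamp), tz=timezone.utc)
        branches.append((name, last_commit))
    return branches


def stale_branches(branches, now, max_age_days=90):
    """Return names of branches whose last commit is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(name for name, last_commit in branches if last_commit < cutoff)


def list_remote_branches(repo_path="."):
    """Ask git for every remote-tracking branch and its last commit time."""
    result = subprocess.run(
        ["git", "-C", repo_path, "for-each-ref",
         "--format=%(committerdate:unix) %(refname:short)", "refs/remotes"],
        check=True, capture_output=True, text=True,
    )
    return parse_refs(result.stdout)


if __name__ == "__main__":
    for name in stale_branches(list_remote_branches(), datetime.now(timezone.utc)):
        print(name)
```

Keeping the parsing and filtering separate from the git call means the logic can be unit-tested without a real repository, which helps once the scripts repo grows.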

https://redd.it/13jyvue
@r_devops
Is Backstage a good solution for the needs of our project?

I'm not sure if this is the right subreddit for this question, but here it goes:

Our clients usually demand a minimum viable product or proof of concept before committing to a project.

Right now we build this MVP from scratch. That means that we set up dev, test, and prod environments, set up the right CI/CD workflows, create the necessary documentation and so on for each proposal to the client.

A lot of the time the programming languages, frameworks and tools are different, so it has been difficult to reuse basically anything.
On top of that, these MVPs are created by different teams inside the organization, making each MVP an isolated project that no one has access to besides the development team.
And seeing that the developers' experience ranges from seniors to fresh graduates, the whole thing becomes a mess (or in other words, it doesn't have the consistency or quality that we are aiming for).

Our department has been given the task to create a solution that automates this process.

The requirements are:

1- With "minimum work" we need to be able to deliver a blank project (for example a webpage that displays hello world after calling a backend API) + tests.

2- This blank project must have all the documentation needed for the developers to start working efficiently and it has to be easily accessible.

3- Needs to have the whole CI/CD workflow working from the start. It would be a plus if it were customizable (able to use different tools depending on the project's needs or the client's request).

4- Optional: have terraform code ready to be used.

Another member of the team suggested that we build our own solution and I'm mostly against it (mainly because of time and money constraints as well as the lack of seniors in our team).
I have been researching Backstage and it looks like it's a good solution for our needs, but if I'm being completely honest, I'm having a hard time understanding it.

I want to ask if someone who has used or knows Backstage can tell me if it's the right tool.
Also, I'm open to any other suggestions since I'm a little bit lost with all the tools there are.

Sorry for the long post and thank you in advance!

https://redd.it/13k0h4n
@r_devops
Didn’t get hired because interview was too good

Been studying my ass off and I used GPT to generate interview questions and answers I might be asked during the interview to practice. Unfortunately, I practiced a bit too much and they gave the offer to their second choice because my interview was perfect. Any advice on what I should do to avoid this outcome again?

https://redd.it/13k6pvf
@r_devops
Infrastructure As Code - Trying to setup an automation around a very messy tech stack

As the title states, our tech stack is unique and rough around the edges. I want to see how I can make the best of it. We currently have:

1. Setting up requests in Service-Now (For Hardware - Kubernetes Clusters)
2. Trigger Pipelines (via Jenkins) for creating namespaces, deploying ISTIO & Nginx
3. Requesting Certificates (Internal & third party vendor cert requests) & uploading them.
4. Deploying OpenTelemetry Agents (elk, splunk... etc etc etc)
5. Configure ISTIO Secrets, Confit-Gateways

I know I can't leverage a single IaC tool (like Terraform or Ansible) to set these up. I want to get different perspectives here in the group to get more ideas on the topic.

https://redd.it/13k9wy7
@r_devops
Open-source IAM Access Visualizer

Hey folks!

Recently created an IAM access visualizer that displays access relationships between AWS identities and resources.
It’s part of an open source cloud security platform that we maintain.

Some potential use cases we wanted to address:

Which IAM roles can become effective admin?
Which IAM roles can read data on your sensitive S3 bucket?
What's the blast radius of an EC2 instance compromise?
What IAM privilege escalations exist in your environment?


Would love your feedback on whether something like this is helpful for your cloud IAM workflows!


Click around the Sandbox Environment
Check out our Loom Demo
Check out the Github Repo

https://redd.it/13k8qao
@r_devops
Create Service Now requests via Ansible - Possibility

I am currently working on updating our configuration management system and want to explore the possibility of creating Service-Now requests via Ansible.

Are there APIs available from Service-Now for us to automate request creation?

Cheers!!!

https://redd.it/13kd8xk
@r_devops
Vagrant alternatives?

I really like Vagrant, but it has a severe flaw: it's painfully slow on Windows, which makes it basically unusable for me. Is there a good alternative or a way to make it faster? I know there's Docker, but since it isn't free anymore I'd rather not use it.

https://redd.it/13kckev
@r_devops
Terraform question. Do I need to worry about state management for a small Lab?

I am currently deploying a single VM, created by Terraform code, through GitHub Actions.

I don't fully understand the problem of state management, at least not for my own small lab environment.

- Should I use Terraform Cloud for state management?

- Can I just store states in my GitHub repo (not ideal I know, but for a small lab)?

- What if I just don't do state management? (they get lost on each run if I don't save them somewhere)

https://redd.it/13jymsk
@r_devops
How did you handle burnout?

I'd like to read about experiences with burnout. I had two weeks where I couldn't focus, and I feel that my performance is lower than it was one or two months ago. I think that this is temporary, so I'm not worrying too much about it. However, like most developers before experiencing burnout, I was working more hours than usual due to anxiety about growth. Now, I'm trying to track my work hours to be more efficient. I prefer to work for 5 or 6 hours without social media or anything that can distract me. So, my questions are:
- How did you feel with burnout?
- How did you manage this situation?
- What was your strategy for getting back to performing well?

https://redd.it/13kiqcm
@r_devops
Have you ever reused your company's code outside the company?

DevOps work produces code daily that is not part of the company's product, for example a script to install Kubernetes or some automation on AWS. Have you ever used such code in a personal project or at another company?

https://redd.it/13k1ne8
@r_devops
Introducing Digger v4.0 - An Open Source GitOps tool for Terraform that runs within your existing CI/CD tool. (+ A brief history of our journey so far)

We have been building [Digger](https://github.com/diggerhq/digger) for over 2 years with multiple iterations in between. Today we are launching Digger v4.0 - An Open Source GitOps tool for Terraform.

A brief history of our journey:

🚜 [Digger Classic](https://app.digger.dev) (v1.0)

Initial focus was to build a “heroku experience in your AWS”.

We wanted to handle everything from infrastructure, CI, monitoring, logs, domains support etc. There were several design issues in this version:

The split from services to environments confused users a lot

Several types of deployments (infrastructure, software) confused customers, they didn’t know when infrastructure is needed versus a software deployment

The concept of “environment target” for the whole infrastructure had its limitations especially for customisation of existing infrastructure.

This led to the birth of Axe,

🪓 [AXE](https://dashboard.digger.dev) (v2.0)

With the AXE project we wanted to improve some UX points by focusing more on “apps”, which are the individual pieces that a developer would want to deploy.

The ability to capture a whole environment was missing in this model; it was something that was appreciated in Classic (albeit confusing).

While infrastructure generation was more flexible in this model, there were still pieces which didn’t fit such as creation of VPC and other common cross-app resources. This could have been solved with more thought and notion of app connectivity.

The biggest problem was reliability. Since we were taking on responsibility for creating infrastructure and for building and deploying successfully, our success rate for users was not high. This affected our ability to attract more users and grow the product.

This subsequently led to the birth of v3.0, Trowel,

🧑‍🌾 [Trowel](https://dashboard.digger.dev/create) (v3.0)

In this version we limited our scope further to generating and provisioning infrastructure-as-code. The idea was to introduce a “build step” for Terraform - the user describes the infrastructure they want in a high-level config file, that is then compiled into Terraform. Or perhaps a “framework” to abstract away the implementation details, similar to Ruby on Rails.

We no longer touched application deployment, meaning that we could focus on the core proposition: infrastructure generation and customizability. This, however, did not seem to interest the end users we were speaking to. The challenging part was not so much writing the Terraform code but rather making sure it's provisioned correctly. The framework idea still looks promising and we haven't fully explored it yet; but even with a perfect framework in place that produces Terraform, you'd still need something to take the output and make sure the changes are reflected in the target cloud account. This was the one missing piece in the toolchain we decided to further “zoom into”.

🧑‍🌾 [Digger](https://digger.dev) (v4.0)

Digger is an open-source alternative to Terraform Cloud. It makes it easy to run terraform plan and apply in the CI / CD platform you already have, such as Github Actions.

A class of CI/CD products for Terraform exists (Spacelift, Terraform Cloud, Atlantis) but they are more like separate full-stack CI systems. We think that having 2 CI systems for that doesn't make sense. The infrastructure of asynchronous jobs, logs etc can and should be reused. Stretching the "assembly language" parallel, this is a bit like the CPU for a yet-to-be-created "cloud PC".

So it boils down to making it possible to run Terraform in existing CI systems. This is what Digger does.

Some of the features include:

* Any cloud: AWS, GCP, Azure
* Any CI: GitHub Actions, Gitlab, Azure DevOps
* PR-level Locks
* Plan / apply preview in comments
* Plan Persistence
* Workspaces support
* Terragrunt support
* PRO (Beta): Open Policy Agent & Conftest
* PRO (Beta): Drift detection (via Driftctl)
* PRO (Beta): Cost Estimates (via