Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Dynamically creating cloud resources and managing them. Is Terraform the answer? CDKTF?

I'm trying to build a system where I dynamically create and manage cloud resources.

It takes a certain set of dynamic variables where the "base" is standard.
So say a user requests a new VM with some things already configured, but some things are up to the user.

My first thought was just using Terraform files to handle the cloud resources, but I'm unsure whether that works well with dynamic variables. What would the workflow look like?

Maybe something like CDKTF is the way to go for a problem like this?


Right now I'm imagining a workflow that looks like the following:

1. Take input variables from an external source
2. Create a new set of Terraform files (it just seems better for these to be saved as an object, no?)
3. Somehow plug the variables into the Terraform variables file
4. Run terraform apply
5. New resources are created
6. Take the output from resource creation and save it in a DB for further monitoring and handling


All of this would probably be hosted in a GitHub repo where the terraform apply commands are run by a managed identity/IAM role via GitHub Actions, so as to minimize the amount of manual labour.
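The "plug variables into the Terraform variables file" step can be sketched in Python. This is a hypothetical sketch, not a finished design: the base defaults, the field names, and the whitelist of user-overridable fields are all assumptions. Terraform automatically picks up a `terraform.tfvars.json` file on plan/apply, so generating that file is often simpler than templating `.tf` files:

```python
import json

# Assumed standard "base" configuration; real values would come from your org.
BASE_DEFAULTS = {
    "vm_size": "Standard_B2s",
    "location": "westeurope",
    "admin_username": "ops",
}

# Fields the requesting user is allowed to override (an assumption).
ALLOWED_OVERRIDES = {"vm_size", "location"}

def build_tfvars(user_input: dict) -> dict:
    """Merge whitelisted user overrides onto the base defaults."""
    overrides = {k: v for k, v in user_input.items() if k in ALLOWED_OVERRIDES}
    return {**BASE_DEFAULTS, **overrides}

def write_tfvars(user_input: dict, path: str = "terraform.tfvars.json") -> None:
    """Write the merged variables where terraform plan/apply will find them."""
    with open(path, "w") as f:
        json.dump(build_tfvars(user_input), f, indent=2)
```

A CI job would then run `terraform apply` in the same directory and forward the outputs (`terraform output -json`) to the DB.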


I feel like I'm missing something here though. Or is this a good way to handle a workflow like this?

Should I look into using CDKTF instead? It seems like it's better suited for cases like this where you dynamically create and manage cloud resources?


How have you guys solved similar problems and situations?

https://redd.it/12rmi7x
@r_devops
Is Using Pulumi or Terraform overkill for starting out with ephemeral environments?

We have the classic DEV -> STAGING/QA/TEST -> PROD setup. At the moment we are hitting bottlenecks where small code changes wait for bigger features to pass testing before all of it can go to prod.

I would like to start moving towards ephemeral environments where we could test different features/fixes/changes in isolation. E.g. pushing a specific tag/branch would kick off a process that ends with an email saying: "You can start testing at this unique URL: https://somerandomID.ephemeral.testing.com"

We already have dedicated infrastructure for staging (CloudSQL instance, Redis instance, Kubernetes node, etc.) and the problem I am facing is that I DON'T WANT TO CREATE MORE INFRASTRUCTURE. I want to use the existing instances. E.g. all ephemeral databases sit on the same DB instance with a specific naming convention (e.g. somerandomID_ephemeral).

Once we are done with the environment we will simply clean up after ourselves by deleting the DBs, K8S namespace etc.
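The naming-convention part of this is easy to centralize before deciding on tooling. A small sketch (the suffix and sanitization rules are assumptions) that turns a branch or tag into a safe name for a database on the shared instance or for a Kubernetes namespace:

```python
import re

def ephemeral_name(branch: str, suffix: str = "ephemeral", sep: str = "_") -> str:
    """Turn a branch/tag name into a safe resource name.

    Use sep="_" for database names on the shared instance and
    sep="-" for Kubernetes namespaces (which forbid underscores).
    """
    # Lowercase, collapse every run of disallowed characters into the separator.
    slug = re.sub(r"[^a-z0-9]+", sep, branch.lower()).strip(sep)
    return f"{slug}{sep}{suffix}"
```

Both the create and the cleanup scripts can then derive every resource name from the branch alone, which makes the "delete the DBs, K8s namespace etc." step deterministic.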

Question: Is using a tool like Terraform or Pulumi overkill? We want to migrate to spinning up infrastructure in the future, but for now it's not viable. Has anyone faced the same situation? Should I just stick to some bash/JS scripts to do this for me?

Thanks!

https://redd.it/12rntw5
@r_devops
Thoughts on windmill.dev?

I've been looking for a "Jenkins replacement" for a very long time now, something that also supports building a UI for specific internal processes. I came across windmill.dev and I have to say I'm surprisingly impressed. Not only is the self-hosted version completely open source, it feels like a combination of Jenkins and Retool with a modern twist. I also find it super interesting that the UI is the source of truth and GitOps is just the backup (pulling as opposed to pushing; you can do it the other way, but that's hard because you need to edit the JSON files the UI generates). It's a new working methodology I'm not used to, though.

I have to say the UX is a bit messy but it's still an improvement from Jenkins.

Has anybody used it? Are there any caveats I should know about before converting my Jenkins pipelines to Windmill flows and apps?

https://redd.it/12rp0th
@r_devops
Is there a way to track who's messing with Pods/Namespaces

Our setup is still old-school Kubernetes where everyone can access most of the Kubernetes resources (namespaces, pods). So people keep deleting or modifying pods straight from the CLI rather than through the pipeline (we are fixing this, but it's a long way off for us).

Currently we ship the logs to ELK via syslog, and I can't see in Kibana who's messing with these pods manually.

I did some research and found out that Kubernetes doesn't keep track of this level of info by default. So how can I solve this issue and track who's messing with my envs, and kick their a** :-)
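One thing worth checking before giving up: the Kubernetes API server can write an audit log when it is started with an audit policy file (many self-managed clusters simply never enable it, and managed clusters expose it through their own logging service). A minimal policy sketch that records who deletes or modifies pods and namespaces, at `Metadata` level to keep volume down:

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log the user, verb, and object for destructive operations only.
  - level: Metadata
    verbs: ["delete", "patch", "update"]
    resources:
      - group: ""
        resources: ["pods", "namespaces"]
```

The resulting audit entries can be shipped into the existing ELK stack alongside the other logs.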

https://redd.it/12rqzds
@r_devops
Junior DevOps Engineer here

Hello fellow DevOps engineers :)

About 6 months ago my employer asked me to join our company's DevOps team, which was in a way newly set up entirely for me and my colleague (Head of Infrastructure and M365 admin for the whole company). Before that I had never touched Ansible, Docker or anything else, only knew a little bit about Linux, and had been working in customer support for a Windows Server environment.

6 months later I am 80% familiar with our whole infrastructure, and I feel somewhat confident working with monitoring through Zabbix, Docker, docker-compose, Terraform and Ansible. Next up is k8s and/or Nomad.

Now to my concern: I (still) feel extremely slow compared to my colleague, who has been in IT for about 10 years and has set up k8s clusters for fun.


I am somewhat anxious and insecure about my skills, but I really don't know if this is justified.

(Working in Germany btw)

https://redd.it/12rv2wj
@r_devops
IAMbic, A multi-account identity-centric IaC

Hi there, I'm one of the founding engineers at Noq and am responsible for a lot of IAMbic's architecture and implementation.

We created IAMbic to make it easy to unify all cloud identities, going beyond access to manage complex cloud permissions, tracking access all the way from users to cloud resources, and presenting everything in a human-readable, open-source, as-code format.

IAMbic supports bidirectional syncing and round-trip capabilities in a GitOps workflow, and includes the following key features:

* **Universal Cloud Identity**: Integrate identities from AWS IAM and Identity Center, Okta, Azure AD, and Google Workspace with more to come.
* **Dynamic AWS Permissions**: Multi-account roles with different permissions and access rules on different accounts.
* **Temporary Access**: Declaratively define and automate expiration dates for cloud access, fine-grained permissions, and identities.
* **Drift prevention**: Prevent out-of-band changes to IAM resources you want to be exclusively managed via IAMbic, like cookie-cutter roles or sensitive identity provider groups.
* **Change History**: Keeps a full audit trail of IAM changes in Git, regardless of whether those changes happened through IAMbic.

We’re just getting started on our journey to change the way cloud IAM is managed. We’re huge fans of open source and eager to grow together through your feedback and contributions. Try out IAMbic by following the [Getting Started guide](https://docs.iambic.org/getting_started/). We’d love to chat and hear about your experiences in our [Slack community](https://communityinviter.com/apps/noqcommunity/noq).

https://redd.it/12ryryt
@r_devops
I created a simple project that helps you find some good resources around DevOps tooling

Hey everyone,

I created a simple application that helps you find some good resources around DevOps tooling. Please have a look and give me some feedback:
https://devopsupgrade.com/

https://redd.it/12s399u
@r_devops
KubeCon EU 2023 Job Board?

Hi, I remember that during earlier editions of KubeCon you could find a "job board" where companies posted open positions; I remember it from the Barcelona 2019 edition.

Has this carried over to the virtual editions too? Is there something like a virtual job board now?

Thanks!

https://redd.it/12s7dcv
@r_devops
Whether 'tis nobler in the mind to upgrade Argo or to take arms against a sea of inconsistencies and by opposing migrate.

A previous employee installed argo and basically abandoned it. As far as I can tell, the last installation I can see is either 1.7 or 2.1. I'm not sure which because in the default argo installation yamls it has the argo-server image set to :latest and pull policy set to Always. As pods are rescheduled either because of cluster/node upgrades or failures, we ended up in a situation where we are running the latest version of the server but with outdated or entirely missing CRDs. There is no ApplicationSet resource despite running argo-server v2.6 for example. ApplicationSets were bundled with the main installation in v2.3.
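Whichever way I go, step one is probably pinning the tag so rescheduled pods stop silently upgrading. A sketch of the relevant bit of the server Deployment; this assumes Argo CD naming, and the tag is only an example that would need to match the installed CRDs:

```yaml
containers:
  - name: argocd-server
    image: quay.io/argoproj/argocd:v2.6.7   # example tag, not a recommendation
    imagePullPolicy: IfNotPresent           # instead of :latest + Always
```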

I feel like my two options are to either upgrade in place (I don't know if this is safe) or roll an entirely new install, copy the apps/projects, flip dns, and destroy the old one.

I know the second option would be more work but it would also give the opportunity to clean up a lot of inconsistencies with the app configs. Not sure if it's really worth it though.

Anyone else run into this situation?

https://redd.it/12s863x
@r_devops
Can we be kinder to people doing ops work?

Whether they are called DevOps Engineer or not, if you are a SWE and throwing support of your code over the wall to be someone else’s problem, give them some god damn respect. Do you talk shit to people who prepare your food at a restaurant too?

As someone who busts my ass, can code effectively in several languages all across the stack, and whose current title happens to be "DevOps Engineer", I am so sick of seeing comments saying that DevOps people are "not engineers" or "YAML engineers", or that they aren't doing anything. There are lazy people and poor performers in these positions just like in every type of SWE position; I can't tell you how many "frat party" SWE teams I have seen.

I don’t know why other engineers seem to get a free pass from everyone to look down on the people who ironically support the lazy code they write all the time.

Stop being assholes. You’re not that special.

https://redd.it/12s8n5o
@r_devops
Workflow-Watcher: Pause a GitHub Actions workflow and wait for another workflow to complete before continuing

“One of the fundamental principles of Continuous Delivery is Build Binaries Only Once. Subsequent deployments, testing and releases should never attempt to build the binary artifacts again, instead reusing the already built binary. In many cases, the binary is built at each stage using the same source code, and is considered to be ‘the same’. But it is not necessarily the same because of different environmental configuration or other factors.” - Kei Omizo

If you are looking to apply the same principle with GitHub workflows, I have created the following GitHub Action. It helps you control workflow execution for a specific commit SHA that your development team might have pushed to multiple environments at once, so you can use the same artifact everywhere instead of building the same commit multiple times.
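For reference, the build-once half of the principle can be had with the stock artifact actions alone; workflow-watcher adds coordination across separate workflows on top of this. A sketch, where the build and deploy commands are placeholders, of one job building an artifact keyed by commit SHA and a later job reusing it instead of rebuilding:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: make build                     # placeholder build command
      - uses: actions/upload-artifact@v3
        with:
          name: app-${{ github.sha }}       # key the artifact by commit
          path: dist/
  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v3
        with:
          name: app-${{ github.sha }}       # reuse, never rebuild
      - run: ./deploy.sh                    # placeholder deploy step
```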

https://github.com/mostafahussein/workflow-watcher

https://redd.it/12s7dcl
@r_devops
Checkout automation feature

Hi readers, thank you for your patience!
I have ZERO coding experience, but resolve is what matters here, so let me explain real quick.
I need help with understanding how to integrate a checkout feature.
I have 2 options:
- Option 1: let the client pay by card and integrate a system that converts the amount into crypto, then splits it by percentage across different wallets;
- Option 2: require the client to pay with crypto and integrate Mycelium Gear's custom widget to get the funds into a wallet, but then I also need to automate splitting those funds across the other wallets, and I still have no idea how to do that.

The crucial part is being able to split all incoming transactions between different addresses.
If someone can explain to me how I can achieve something like that, I would be very grateful. Thank you in advance!
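The splitting itself is just arithmetic, separate from the payment rails. A minimal sketch in Python, where the wallet names and percentages are assumptions and the actual on-chain transfers are out of scope; the one subtlety is making the rounded parts add back up to the incoming total:

```python
from decimal import Decimal, ROUND_DOWN

def split_payment(amount, percentages):
    """Split an incoming amount across wallets by percentage.

    Percentages must sum to 100. Each share is rounded down to cents,
    and any rounding dust is given to the first recipient so the
    parts always add up exactly to the original amount.
    """
    amount = Decimal(str(amount))
    assert sum(percentages.values()) == 100
    shares = {
        wallet: (amount * Decimal(pct) / 100).quantize(
            Decimal("0.01"), rounding=ROUND_DOWN
        )
        for wallet, pct in percentages.items()
    }
    dust = amount - sum(shares.values())
    first = next(iter(shares))
    shares[first] += dust
    return shares
```

For example, `split_payment("100.00", {"a": 50, "b": 30, "c": 20})` yields 50.00, 30.00 and 20.00; the payment provider or wallet API would then execute one transfer per entry.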

https://redd.it/12shagg
@r_devops
Best Practices for GitHub Actions (and maybe CI/CD in general)?

I'm a Go developer filling in my gaps around, well, everything outside of the server and database (Docker, Kubernetes, IaC, CI/CD). I've got a GHA workflow working for a sample project. I'd welcome any feedback or critiques but I had a general question around file size and splitting up logic.

What is the rationale for splitting up the YAML files in the .github/workflows directory? If there is one, does it apply to other things in the DevOps world too?

https://redd.it/12sgxtr
@r_devops
Beginner DevOps Intern / Need Help with Jenkins!

Hello! Been on this subreddit for a week or two now and it's been very insightful for me as a co-op student working my first DevOps job (been here for three and a half months now). Now my boss has asked me to fix our Jenkins server and I have no idea where to start. Currently our front-end team can't build code because the frontend build on our server is failing with an exit code 3 error. Mind you, I did not write the pipeline script. The npm install stage:

stage('npm install') {
    steps {
        script {
            sh 'sudo rm -rf ./node_modules'
            sh 'sudo npm install --f'
        }
    }
}

I ran this locally over PuTTY and it says our Node.js is at v18.0.19.0, while the error in Jenkins says differently. Not sure what to do next, as my boss has no idea either and there's no one else on the DevOps team.
(Wish I could show images in the post, perhaps if you need more clarification shoot me a DM!)

https://redd.it/12sk2ur
@r_devops
No DNS address record found for repo.maven.apache.org

If your builds are failing, it appears that there's an issue with the Central Repository. However, your builds shouldn't be failing because you should be using a local artifact repository manager to proxy and cache this stuff. ;)
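A minimal sketch of what "use a local artifact repository manager" means in practice: a mirror entry in `~/.m2/settings.xml` routing Central through your proxy, where the URL is a placeholder for your own Nexus/Artifactory instance:

```xml
<settings>
  <mirrors>
    <mirror>
      <id>internal-proxy</id>
      <!-- intercept all requests that would go to Central -->
      <mirrorOf>central</mirrorOf>
      <url>https://repo.example.com/maven-central/</url>
    </mirror>
  </mirrors>
</settings>
```

With that in place, the proxy caches artifacts locally and a Central outage no longer breaks builds for anything already cached.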

https://www.nslookup.io/domains/repo.maven.apache.org/dns-records/

https://redd.it/12sjj61
@r_devops
Which course to choose?

Hi all,

I'm a beginner in dev and I recently got a week of free access to this program. Can you recommend any course in particular that is suitable to cover in a week and that I can learn something from? Thanks in advance.

https://redd.it/12sngfl
@r_devops
Automated Release Notes in Azure Devops

So, this has been a personal work goal for a couple of years, but I can never find the time to devote to finding a solution that doesn't involve purchasing some add-on (not that an add-on is out of the question).

What I want to do is generate a markdown file of release notes that I can post in the project wiki in Azure DevOps and in a Teams wiki for the business to view.

I'm sure this is a problem that has been solved, so I don't want to reinvent the wheel; I just need some help.
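The formatting half is trivial once you have the data. A sketch in Python, where the input shape is an assumption; in practice you'd pull the items from the Azure DevOps REST API or from `git log`, then post the resulting text to the wiki:

```python
def release_notes(version, items):
    """Format (id, title) work items into a markdown release-notes page.

    `items` is an assumed shape: an iterable of (work_item_id, title)
    pairs, however you chose to collect them for the release.
    """
    lines = [f"# Release {version}", ""]
    lines += [f"- #{item_id}: {title}" for item_id, title in items]
    return "\n".join(lines)
```

A pipeline step could run this after each release and push the output to the wiki repo, so the notes stay automated rather than hand-written.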

https://redd.it/12sjxm6
@r_devops
I tried writing a serverless app at home before doing it at work and this is what happened

So I banged out a Terraform config to deploy the resources and wrote three Lambdas: one in C#, one in Go, one in Rust. Don't get excited: the Go Lambda returns "hello", the Rust one returns "world", and the C# one concatenates them.

I want to find out gotchas and complexities.

So I mentioned it at work.

My boss said.

You'll either have to delete it all or hand it over to us.

WHAT...

He then tried to tell me any code I wrote while employed belonged to the Corp, including personal stuff.

My own domain, my own aws account, my own thinkpad all bought before I took the job

It's a bit like telling a carpenter he can't make a flower bed for his home because his knowledge of hammering wood together with nails belongs to his boss.

Madness

https://redd.it/12sqwjf
@r_devops
CI/CD pipeline architecture in repository containing multiple services

It's sort of an older project. It started as a backend and a frontend, with one pipeline that tested, built and deployed everything.

As time went by there was some heavy work on the backend, and as a result a few services were added, but in the same repo as the backend and frontend, not in separate repos.

Currently there is one pipeline that tests, builds and deploys everything, which isn't effective because I have to wait a long time for a run to finish even if I only changed one service.

I think I have two options to fix this:

1. Make a separate pipeline for each service, save them in repo, reuse the code from parts of main pipeline.
2. Edit the existing pipeline to add triggers and checks for each folder containing a service, so it runs only based on what changed.

For reference, this is Azure DevOps. I am leaning more towards option 1 because it seems logical (single responsibility principle). What do you guys think?
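Either option boils down to path filters. For option 2, and equally for each per-service pipeline in option 1, an Azure Pipelines trigger block looks roughly like this, with the folder names being assumptions about the repo layout:

```yaml
trigger:
  branches:
    include: [main]
  paths:
    include:
      - services/billing/**   # run only when this service's files change
```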

https://redd.it/12sskxz
@r_devops
Load balancer - round robin with 100% uptime

I have 2 RPC servers that need to serve requests with 100% uptime.


I have 2 questions:

1) To achieve 100% uptime I need to deploy a load balancer, let's say with round robin, that health-checks every RPC server every N milliseconds. An RPC request takes around 10 ms, so there may be a situation where the health check hasn't run yet but a server is down (update or outage), and the request will return an error. What should I do in that case? Make every request with retries and a small timeout, so it eventually gets routed to a healthy node before the next health check invalidates the bad one?
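The retry-with-timeout idea in question 1 can be sketched independently of the load balancer; a minimal client-side wrapper, where the attempt count and pause are assumptions to be tuned against the ~10 ms request time:

```python
import time

def call_with_retry(fn, attempts=3, pause=0.05):
    """Retry an RPC-style call a few times with a short pause.

    If a request lands on a node that died after the last health
    check, the retry gives round robin a chance to route the next
    attempt to the healthy node instead of surfacing an error.
    """
    last_err = None
    for _ in range(attempts):
        try:
            return fn()
        except ConnectionError as err:
            last_err = err
            time.sleep(pause)  # next attempt may hit the other server
    raise last_err
```

The exception type to catch depends on your RPC client; `ConnectionError` here stands in for whatever your library raises on a dead backend.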



2) How do I make the load balancer itself resilient? I don't want to go with Kubernetes, as it's really hard for me to master. What can I use?

https://redd.it/12ss262
@r_devops