Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Consolidation into DataDog -- questions and experiences

Hi,

We're considering consolidating CloudWatch, SumoLogic and Sentry into DataDog. We're currently using DataDog for APM, Tracing and so on, just not logs or error management.


I was curious whether folks here have done it before and what your experience was like, any lessons learned and any questions you'd recommend we ask in the process.

https://redd.it/1i2sbvl
@r_devops
Docker vs CapRover vs Bare Metal

Hi,

I'm a software engineer launching a web application.

I'm running the app on a basic server with 4 GB RAM and a ~45 GB hard drive.

I've been running everything bare metal, which means
- I've had to set up pgBackRest for backups
- systemd scripts for my app
- systemd script for nginx
- nginx configuration, with nginx as a reverse proxy to provide an additional layer of security (at one point I also had a demo app running beside the main one)
- letsencrypt ssl with a cron job to renew it

I'm now dealing with DB migrations, so I started writing a script to copy the postgres backups to my local machine, restore them into the DB, and run the app with auto migrations off so that it can generate the migrations needed. Once that is done, the plan is to review the changes manually and, if accepted, have another script connect to the server to run the migrations, and then run the script that deploys the latest version of the app to the server.

It's my first deployment on my own rather than as part of a big company, so even though I did have to do a bit of devops in the past, some of these things are a first for me.

I'm wondering now whether I should've just used an open source PaaS tool to take care of a lot of these things so that I can just focus on developing the software.

Does anyone have experience running a PaaS tool on a low-powered server? What's the real overhead? What's the minimum amount of RAM so that the overhead is not too much? Will a PaaS tool really help, or will I just get bogged down configuring the PaaS instead of writing simple bash scripts, so that it's not actually worth it unless I'm running multiple servers?

Do you have recommendations between (or anything else):
- https://github.com/dokku/dokku
- https://github.com/CapRover/CapRover
- https://github.com/Dokploy/dokploy
- https://github.com/coollabsio/coolify

I'm not running your standard django or ruby application; this is in a more obscure language, so I probably need some deployment script anyway. Right now my script git clones my repo, installs dependencies, and then restarts the systemd process. Just wanted to point that out in case these tools are too tightly coupled with the popular frameworks.
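For readers comparing this against a PaaS, the deploy script described above can be sketched roughly as follows. This is only an illustration of the three steps named in the post (clone, install dependencies, restart the systemd unit); the repo URL, checkout directory, install command, and unit name are all hypothetical placeholders, not details from the post.

```python
import shlex
import subprocess


def build_steps(repo_url, checkout_dir, install_cmd, service):
    """Return (working_dir, argv) pairs in the order the post describes.

    All four arguments are placeholders chosen for illustration.
    """
    return [
        # 1. Fetch the code (shallow clone to keep it quick).
        (None, ["git", "clone", "--depth", "1", repo_url, checkout_dir]),
        # 2. Install dependencies from inside the checkout.
        (checkout_dir, shlex.split(install_cmd)),
        # 3. Restart the systemd unit that runs the app.
        (None, ["systemctl", "restart", service]),
    ]


def deploy(steps, run=subprocess.run):
    """Run each step in order; check=True aborts on the first failure."""
    for cwd, argv in steps:
        run(argv, cwd=cwd, check=True)
```

Passing a different `run` callable makes it possible to dry-run the sequence and inspect the commands before touching the server.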

Thanks in advance!

https://redd.it/1i2tt7q
@r_devops
EKS Auto Mode for existing clusters with blue-green node groups.

Are EKS version upgrades with Auto Mode possible with blue/green node groups? If so, how?



https://redd.it/1i2uvna
@r_devops
Is there a current "state of the art" consensus? What's still going to be good 5 years from now?

I'm with a group whose infra and applications are all nearing end of life, and I have been tasked with designing (then presumably implementing) the infrastructure and processes around rebuilding the apps over the next 5 years.

What I think of as our best options dates from when I first learned this stuff, so I'm wondering what the current consensus is around the best infra (we're an Azure shop and I don't see that changing), security, testing, monitoring/alerting, CI/CD, etc.

So is there any consensus about what the medium-term future holds for those areas, or even a good resource for updating my understanding of what's current and what's coming?

https://redd.it/1i2sf02
@r_devops
What does your devops support ticket lifecycle look like?

We'd love to learn how your team handles devops support queries. Where do the requests live (Jira?), and what are the different stages of solving them? How many tickets per day does your team get? What are the most repeated queries?

We'd love to learn! We're working on an AI devops agent to automate the repetitive bits that teams handle on a day-to-day basis. It's super early; here's a demo.

We'd love any and all insight.

https://redd.it/1i2zvhk
@r_devops
Courses recommendations for someone already working in the field?

Hello, I recently got a job where I worked on a project doing things that I would consider DevOps:

Kubernetes deployment and management
Ansible automation
CI/CD pipeline automation
Deployment of apps and services and their integration (LDAP, SSO, etc)

While I already had varying levels of familiarity with most of the concepts, I had practically no hands-on experience with them. I've been able to learn on the go and deliver on my tasks, but I feel like I have huge knowledge gaps that make my job harder than it should be.

I was wondering if you know of any recommended courses that take you through a project so you learn hands-on?

https://redd.it/1i31ib4
@r_devops
How do you manage large file transfers?

As the subject says, I'm curious to know how y'all manage to transfer files from, let's say, an SFTP server such as Citrix or files.com to a GCS bucket or S3. Do y'all use any scripts or any services/tools?

https://redd.it/1i3202x
@r_devops
Any "must read" suggestions? 12factor app, etc..

I have a decade (...fuck, time flew) worth of professional tech/devops experience, and I only just now read and understood the value of "the 12 factor app." Does the community have any other suggested readings (or good-faith rebuttal reads) for dev, devops, ops, network, linux, cloud, or -whatever- engineers? I feel like there are paradigms I should know by now that I'm simply unaware of.

https://redd.it/1i34812
@r_devops
What are the must-learn open source devops tools in CI/CD?

What are the must-learn open source devops tools in CI/CD? I've been looking through the tools that are required to build a proper CI/CD pipeline, and there are just so many that I can't learn everything. Which tools should one understand thoroughly? In other words, what are the most used tools in a CI/CD pipeline from end to end? Sorry if this is not the right question to ask here. If that is the case, can someone point me to the right place?



https://redd.it/1i37t3n
@r_devops
Drift Detection Tools

Anyone in here using and happy with IaC drift detection tools? Here are a few I've found searching:

* [https://controlmonkey.io/](https://controlmonkey.io/)
* [https://spacelift.io/](https://spacelift.io/)
* [https://www.firefly.ai/](https://www.firefly.ai/)

I'd love to hear if anyone has experience with these or others they could recommend. Thanks!

https://redd.it/1i3a66u
@r_devops
Open Source Projects where I can contribute

Hi there!

For quite some time, I’ve been feeling a bit stuck at work. It feels like I’m not growing or developing my skills anymore. Unfortunately, the current job market in the EU isn’t looking great, so I’d rather keep my current job for now.

That said, I’d love to contribute to some open source or voluntary projects in my free time, especially ones that could use DevOps expertise. I have experience with automating processes, and while my current job doesn’t involve cloud technologies, I’d be excited to work with them as well (though it’s not a must).

Could you recommend any platforms, communities, or specific projects where I could find opportunities like this? I’d really appreciate any advice

https://redd.it/1i3d3yd
@r_devops
Multiple projects in one repository

This is my first time using Git. I have a backend dotnet API project and a frontend Flutter project, and I want to store them in a single repository in Azure DevOps by creating a parent folder in the repository with backend and frontend folders inside it. Is this possible? And is it a feasible approach?

https://redd.it/1i3e7jl
@r_devops
Anyone using NEL reporting in production?

Hey :)

Is anyone here using Chrome's Network Event Logging in production? What is your experience with this, has it helped you / your org?

https://redd.it/1i3eef4
@r_devops
Just Created a Beginner-Friendly Docker Tutorial and Hands-on samples

Hi, everyone!

I'm a cloud, container, devops enthusiast and recently wrote a Docker tutorial on DEV. I wanted to make Docker easier to understand for beginners and share some practical examples.

If you’re new to Docker or want to refresh your skills, I’d love for you to check it out!

Link in the comment.

Feedback, questions, or suggestions are more than welcome. Let me know if there are topics you'd like me to cover in future posts. 😊

https://redd.it/1i3h6s6
@r_devops
People pleasing behavior

Maybe not totally DevOps related, but since I work in DevOps and crises happen a lot in this area, I think some of you may be able to help. I tend to be afraid of being honest when something requires more time due to some issue, or when I have made a mistake, especially with my higher-ups/customers. I'm afraid of them being disappointed or angry, or sending disappointed messages my way.

For a concrete example, due to lack of experience, I did not configure clustering for a MongoDB instance I deployed in k8s, so it ran as only 1 replica, and if it went down for some reason, no other service could access the data in it. It did go down one day, and I tried my best to debug and hot-fix it on the fly without telling anyone, since I was afraid they would be mad at me for not implementing it correctly the first time. There are many other examples like this.

Sometimes I can solve things but have to sacrifice my sleep, weekends, etc. so that it is fixed before anyone notices and I have to make excuses/apologize. And if I can't, and they express their disappointment/anger, it demotivates me and stresses me out, causing me to hate my job even more.

Does anyone else suffer from this? How can I change this behavior? This people pleasing and fear of other people's disappointment/opinion causes me a lot of stress at work. Perhaps I had a childhood trauma and/or I'm afraid of getting my contract/employment terminated. Perhaps there are resources/books I can use to get better?

https://redd.it/1i3jobr
@r_devops
Our image retention policy

For compliance reasons, our internal customers have to keep essentially everything that has ever been deployed to production. But code used for dev or test or the like can be culled after a while. We're trying to set a container image retention policy to suit. Our thinking at the moment is we could allow customers to tag production images with SemVer or CalVer style tags, and then set the container registry to keep those forever.

Setting a regex to match SemVer seems easy enough. I believe a sufficient pattern would be:

'^(.)?[0-9]+\.[0-9]+\.[0-9]+(.)?'

But CalVer? Oh boy. Looking at the site for it, it looks to me like pretty much anything can be called "CalVer" as long as somebody says it matches a calendar cycle. I have one customer using "Q" (for quarter) in their tags. I feel like if we allow "CalVer" we're going to be configuring a custom retention policy for each and every customer who says they're using it.
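To make the matching behaviour concrete, here is a small sketch that runs a handful of tags through the pattern quoted above and through a stricter variant. The stricter pattern is an assumption added for comparison, not something from the post, and this uses Python's `re` module; a container registry's retention rules may use a different regex dialect, so treat the results as indicative only.

```python
import re

# Pattern quoted in the post: "." matches any character and there is no
# end anchor, so anything may follow the third number.
POST_PATTERN = re.compile(r'^(.)?[0-9]+\.[0-9]+\.[0-9]+(.)?')

# Hypothetical stricter variant (an assumption, not from the post):
# optional "v" prefix, no leading zeros, optional pre-release and build
# metadata, anchored at both ends.
STRICT_PATTERN = re.compile(
    r'^v?(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)'
    r'(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$'
)


def classify(tags):
    """Map each tag to (matches_post_pattern, matches_strict_pattern)."""
    return {
        tag: (bool(POST_PATTERN.search(tag)), bool(STRICT_PATTERN.search(tag)))
        for tag in tags
    }


if __name__ == "__main__":
    sample = ["1.2.3", "v1.2.3", "1.2.3-rc.1", "1.2.3.4",
              "2024.01.15", "2024Q1", "latest"]
    for tag, (post, strict) in classify(sample).items():
        print(f"{tag:12} post={post!s:5} strict={strict}")
```

With these sample tags, the quoted pattern also accepts `1.2.3.4` and a date-style tag such as `2024.01.15`, while a quarter-style tag such as `2024Q1` matches neither pattern. So some CalVer schemes would be retained by the SemVer rule by accident and others would need their own rule.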

https://redd.it/1i3ld4u
@r_devops
EC2 k8s with NLB for control plane HA endpoint issue

Hi everyone,



Currently I have no choice but to run a simple k8s setup on EC2 instances in AWS. It is a dev environment, and I have 3 control plane nodes.

They have joined the cluster as control plane nodes, and I have a Network Load Balancer set up with a target group for both. Everything is running smoothly. I am using TCP 6443 for the NLB health check.

This is how I ran the init at the beginning:



kubeadm init \
--control-plane-endpoint 10.50.230.100 \
--apiserver-cert-extra-sans elbendpointurl \
--upload-certs \
--pod-network-cidr 172.16.0.0/16 \
--service-cidr 192.168.0.0/20



All 3 control plane nodes have joined the cluster. I then updated /etc/kubernetes/kubelet.conf on each node to use the NLB endpoint.

But what I noticed is that when I shut down master1 for HA testing, I could no longer run kubectl get node from my laptop, master2, or master3. I am not sure what I did wrong. On prem, I could easily use kube-vip to achieve HA. However, we are running out of IPs in the dev environment's subnet, so I decided to use the NLB as the control plane HA endpoint.

Have you encountered this issue before? Do you have any suggestions/recommendations for me? Thanks!
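One thing that may be worth checking: the kubeadm init command above passes a bare IP address to --control-plane-endpoint, while the load balancer name only appears in --apiserver-cert-extra-sans. If that IP belongs to master1 (an assumption; the post does not say), the kubeconfig files kubeadm generated would still point at it. A small probe like the sketch below can show which endpoints still accept TCP connections on 6443 while master1 is down. The endpoint names and hosts you pass in are placeholders to fill in yourself.

```python
import socket


def tcp_port_open(host, port=6443, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_endpoints(endpoints, port=6443):
    """endpoints maps a label (e.g. "nlb", "master2") to a host or IP."""
    return {name: tcp_port_open(host, port) for name, host in endpoints.items()}
```

For example, `check_endpoints({"configured-endpoint": "10.50.230.100", "nlb": "elbendpointurl"})` would compare the address used at init time with the load balancer. A successful TCP connect only shows the port is reachable, which is the same signal the NLB health check uses; it does not prove the API server is healthy.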



https://redd.it/1i3n523
@r_devops
Is there any alternative to Jenkins as code for platform engineering solutions?

In my company we use OKD (kubernetes), ArgoCD, Vault, Artifactory, Gitlab, Windows AD and Jenkins (with JCasC, obviously) for our platform engineering solutions. Basically, with a few clicks (running a few pipelines in the "MASTER jenkins") we can set up a new project with a new namespace in OKD, a new namespace in Vault, a new project in argocd, a new jenkins with all the necessary plugins and settings (basic pipelines for building the project's images), a separate space in artifactory, AD entries, etc. Gitlab is only used as a git repository; maybe we could replace it with something simpler (Gitea?), but it would be a lot of work and I'm not sure it would work well with AD.

All this works quite well, until there is an update. Usually the jenkins plugins are the ones that cause most of the issues. So are there any alternatives to it? Or maybe to the whole platform, without spending hundreds of thousands of dollars on subscriptions?

https://redd.it/1i3nysl
@r_devops
Why do you need GitOps tools like ArgoCD and Flux if already deploying with CI/CD pipelines?

Hello,

In our DevOps team, everything is deployed to the development and production envs via CI/CD pipelines on GitHub Actions that basically run helm commands in each job to lint/test/deploy each new application into our K8s cluster.

We do the same with Terraform: all our infrastructure is stored as TF code on GitHub and deployed/updated/destroyed via pipelines/GH Actions. So a large part, if not most, of our infrastructure is stored in git as the source of truth.

In this case, does it make sense to add a GitOps solution like FluxCD or Argo, since most of the state is already handled by git via CI/CD? What could these tools bring to the table, in your experience?
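One way to frame the difference: a push pipeline applies changes when a job runs, whereas pull-based GitOps controllers are usually described as repeatedly comparing what git declares with what is live and correcting any difference. The toy sketch below models only that comparison step. It is a simplified illustration, not how Argo CD or Flux are implemented, and the release names and versions are made up.

```python
def diff_state(desired, live):
    """Return the actions needed to make `live` match `desired`.

    Both arguments map a release name to a version string. The result is
    a list of (action, name, version) tuples in a stable order.
    """
    actions = []
    for name, version in sorted(desired.items()):
        if name not in live:
            actions.append(("install", name, version))
        elif live[name] != version:
            actions.append(("upgrade", name, version))
    # Anything running that git no longer declares is drift to remove.
    for name in sorted(live.keys() - desired.keys()):
        actions.append(("uninstall", name, live[name]))
    return actions


if __name__ == "__main__":
    desired = {"api": "1.2.0", "web": "2.0.0"}     # what git declares
    live = {"api": "1.1.0", "worker": "0.9.0"}     # what the cluster runs
    for action in diff_state(desired, live):
        print(action)
```

A CI job would typically run a step like this once per commit; a controller would run it on a timer, which is what lets it notice manual changes made to the cluster between deployments.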

https://redd.it/1i3rzr5
@r_devops
Can someone who uses GitHub Actions chime in for a multi-region cloud deployment?

How are you running GitHub Actions for your cloud multi-region app deployments? Are you doing your builds in one region, pushing the image to another region, and then deploying to your compute across multiple regions? I want to understand how everyone is deploying to different regions and how I should structure my GitHub workflows. I want to know what works best and what are some things I need to be on the lookout for. Thanks for the help!

https://redd.it/1i3seze
@r_devops