Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Why Interviews have become so one-sided nowadays

I have been interviewing lately and have run into many cases where the interviewers don't even try to interact with the interviewee. They jump straight into grilling you as if they were facing an enemy, and only at the end, with very little interest, ask whether you have any questions.

I have done a lot of interviews in the past, but this time it feels completely different. They expect everything to be perfect in a one-hour call and decide on that basis whether you're a fit or not.

Folks please add your thoughts.

https://redd.it/1iuwewc
@r_devops
On-Premise Minio Distributed Mode Deployment and Server Selection

Hi,

First of all, for our use case we are not allowed to use any public cloud, so AWS S3 and similar services are not an option.

Let me give a brief overview of our use case. Users will upload files of about 5 GB each. Processing then takes 5-10 hours. After that we do not actually need the files, but we offer download functionality, so we cannot just delete them. For this reason we are considering a hybrid object store deployment: one hot object store on the compute cluster's storage and one cold object store off-site. After processing is done, we will move the files to the off-site object store.

On the compute cluster, we use Longhorn and deploy MinIO with the MinIO Operator in distributed mode with erasure coding. This covers the hot object store.

However, we have not yet decided what our cold object store should look like. The questions we have:
1. Should we again use Kubernetes, as on the compute cluster, and deploy the cold object store on top of it, or should we just run the object store directly on the OS?
2. What hardware should we buy? Let's say we are OK with 100TB storage for now. There are storage server options that can have 100TB. Should we just go with a single physical server? In that case deploying Kubernetes feels off.
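
For the hardware sizing question, a rough back-of-the-envelope sketch may help. It assumes the usual erasure coding model in which an erasure set of N drives with M parity shards leaves (N - M) / N of the raw capacity usable. The set size of 16 and parity of 4 below are illustrative assumptions, not values from the post.

```python
def usable_fraction(drives_per_set: int, parity: int) -> float:
    """Fraction of raw capacity left for data with `parity` shards per set."""
    # Assumption: parity is at least 1 and at most half of the erasure set.
    if not 0 < parity <= drives_per_set // 2:
        raise ValueError("parity must be between 1 and drives_per_set // 2")
    return (drives_per_set - parity) / drives_per_set


def raw_needed_tb(usable_tb: float, drives_per_set: int, parity: int) -> float:
    """Raw capacity (TB) to buy in order to end up with `usable_tb` usable."""
    return usable_tb / usable_fraction(drives_per_set, parity)


if __name__ == "__main__":
    # Hypothetical layout: 16 drives per erasure set, 4 of them parity.
    print(f"usable fraction: {usable_fraction(16, 4):.2f}")
    print(f"raw TB for 100 TB usable: {raw_needed_tb(100, 16, 4):.1f}")
```

Under these assumptions, 100TB of usable space needs roughly a third more raw disk, which is worth keeping in mind when comparing a single 100TB server against a multi-node layout.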

Thanks in advance for any suggestion and feedback. I would be glad to answer any additional questions you might have.

https://redd.it/1iuy4xk
@r_devops
How does everyone handle versioning/releases with monorepos?

We are using Trunk Based Development & a monorepo setup for around 50 services.

Ideally, I would like to have each service individually versioned as having a version for all doesn't scale well, mainly around the fact it would trigger a release pipeline for every service, even if it has no changes.

How does everyone approach this around releases?

It is also not scalable to have the developers or owners cut a release branch for every single service, for example release/service1/1.0.0 or release/service2/1.0.1. It would take a while and would just be a tedious job.

How does everyone approach this situation?

I was thinking of some sort of pre-release pipeline that runs git diff to determine which release branches should be cut. The only issue with this is figuring out how to get the pipeline to determine which part of the version should be bumped. We are using semver.
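
A minimal sketch of that pre-release idea, under two assumptions that are not in the post: every service lives under services/<name>/, and commit subjects follow the Conventional Commits style (fix:, feat:, feat!:). The changed paths would come from something like git diff --name-only between the last release tag and HEAD; here they are passed in as plain lists so the logic can be tried without a repository.

```python
from pathlib import PurePosixPath


def changed_services(paths, services_root="services"):
    """Return the names of services touched by the given changed file paths."""
    touched = set()
    for path in paths:
        parts = PurePosixPath(path).parts
        # Only count files that sit inside services/<name>/...
        if len(parts) >= 3 and parts[0] == services_root:
            touched.add(parts[1])
    return touched


def bump_kind(commit_subjects):
    """Choose major/minor/patch from Conventional Commits style subjects."""
    kind = "patch"
    for subject in commit_subjects:
        head = subject.split(":", 1)[0]
        if head.endswith("!") or "BREAKING CHANGE" in subject:
            return "major"
        if head.startswith("feat"):
            kind = "minor"
    return kind


def bump(version, kind):
    """Apply a semver bump to a plain MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


if __name__ == "__main__":
    paths = ["services/service1/main.go", "docs/README.md"]
    for name in sorted(changed_services(paths)):
        print(name, bump("1.0.0", bump_kind(["feat: add endpoint"])))
```

In a real pipeline the commit subjects would be filtered per service, so that a breaking change in service1 does not bump service2.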

https://redd.it/1iuvs6y
@r_devops
Kubernetes Ingress Controller Guide

If you are interested in learning how to expose services in Kubernetes, read through my new blog article! It's a step-by-step guide on how to set up an NGINX Ingress Controller via Helm charts.

Medium Blog Article Link

https://redd.it/1iv23xw
@r_devops
Private tf module registry still a thing?

Long story short, we have tons of terraform module re-use and copy/paste across repos and services, so we are looking to create a central module registry/monorepo.

Is this still what most folks are doing? Is this still an adequate way of providing some degree of self-service to product engineers, without them having to worry about how their infrastructure is being provisioned?

I know there's a lot of new tooling and platforms in this space, so I'm curious what others are doing. Things move so fast that it always feels like we are doing things incorrectly.

Thanks

https://redd.it/1iv6tfl
@r_devops
Windows vs Linux on enterprise level

In which case scenarios is Windows Server better than Linux?

https://redd.it/1iv9gfh
@r_devops
My first web server

I am configuring a web server for the first time. I literally have a physical server in my hands, and I am deploying web apps and REST APIs.

This is my first experience using any server OS, so I chose Windows Server. I know it is probably not the safest or most efficient choice for a web server, but I thought it was the fastest way to start and learn server concepts in a practical way. This machine has 3 disks (1TB each). I used one for the OS and configured RAID 1 for the other two.

On the software side, I am just using a simple Express web server to deploy every single web application, and all the APIs that are deployed are also developed in Express, so yeah, Express everywhere. I am using PM2 to handle Node processes. When there are any code changes, I pull the code from GitHub, perform any tasks needed (building, installing dependencies, etc.), and reload the process. As the applications are used on the same local network, I create rules in Windows Defender Firewall to open the ports on which the web services or web applications are listening.

What should I do next to improve and learn at a good rhythm? What would be the next step? My main priority is to learn all the fundamental concepts of a server in a practical way.

https://redd.it/1iv9ezf
@r_devops
Gitlab pipeline timeout when uploading security scan to defect dojo

Hi Everyone,

I am facing an issue trying to integrate DefectDojo with my GitLab CI/CD.

Here is the breakdown:

I am using GitLab's built-in security scanning templates for dependency scanning and container scanning.

These templates generate JSON reports after scanning.

I am using a Python script to upload these JSON reports to DefectDojo.

From my local machine we access mydomain.defectdojo.com via VPN.

I can curl it with the VPN enabled and upload results.

But in the GitLab pipeline, the requests call I use for the upload throws a connection timeout to mycompany.defectdojo.com.

I also tried running curl directly in the pipeline, but it reported that it couldn't connect to the server.

Is this because the VPN is not available in the pipeline?

How can I fix this issue?
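
One way to narrow this down is a plain TCP probe run as a step in the pipeline job. This is only a diagnostic sketch: the probe function is a hypothetical helper, and the host and port in the usage block are placeholders taken from the post. It separates a refused connection (host reachable, nothing listening) from a timeout (usually no route, or a firewall silently dropping packets, which is what you would expect from a runner outside the VPN).

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connection and report 'open', 'refused', 'timeout' or an error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepts connections on this port.
        return "refused"
    except socket.timeout:
        # No answer at all: typical for a missing route or a dropping firewall.
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks and similar problems end up here.
        return f"error: {exc}"


if __name__ == "__main__":
    # Placeholder target; replace with the DefectDojo host the runner should reach.
    print(probe("mycompany.defectdojo.com", 443))
```

If the same script prints open from a laptop on the VPN and timeout from the runner, the network path is the problem, not the upload script.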



https://redd.it/1ivbcp2
@r_devops
Secure way to share flutter mobile app without sharing code

Hi, in my company we have to give our onboarding Flutter app to the vendor whose trading app we're using, so they can integrate our app with theirs. Is there a way to share our APK so that they can integrate it but not get access to the code?

https://redd.it/1ivclgt
@r_devops
Azure RM API Deprecations in Q1 2025 – What It Means for Terraform Users

If you’re managing infrastructure with Terraform on Azure, Q1 2025 will bring preview API deprecations for Azure Resource Manager (Azure RM), including APIs for Azure Kubernetes Service (AKS) and other resources. Now is the time to check your provider versions and ensure compatibility.

# What’s Changing?

Azure RM provides a structured way to manage and deploy Azure resources. Microsoft frequently introduces preview APIs, but these can change, get deprecated, or be removed entirely. Terraform’s azurerm provider depends on these APIs, which means unexpected changes can break your infrastructure.

# What You Should Do

1. Identify the Azure services in your Terraform-managed infrastructure. Whether it's AKS, Storage, App Services, or Databases, knowing what you rely on is the first step.
2. Check the API versions your provider is using. Terraform's azurerm provider often includes preview APIs, making it important to track which ones are in use. Example: Containerservice APIs in version 3.105.0 (link).
3. Monitor upcoming API deprecations. Azure phases out older APIs regularly, and failing to update could lead to outages.
4. Review your Terraform provider versions. New releases may introduce breaking changes, so read the release notes before upgrading.
5. Test changes in a lower environment before deploying. Validate any updates in a controlled environment to avoid unexpected failures.

Keeping up with API deprecations is key to maintaining reliable Terraform deployments. If you haven’t reviewed your setup yet, now is the time.
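
As a starting point for the provider version review, here is a small sketch that reads the pinned provider versions out of the text of a .terraform.lock.hcl file and compares them against a minimum. The minimum version in the usage block is a made-up placeholder, not an official cutoff, and the comparison assumes plain MAJOR.MINOR.PATCH versions without pre-release suffixes.

```python
import re

# Matches one provider block of a .terraform.lock.hcl file.
LOCK_BLOCK = re.compile(
    r'provider\s+"(?P<source>[^"]+)"\s*\{(?P<body>.*?)\n\}', re.DOTALL
)
VERSION = re.compile(r'^\s*version\s*=\s*"(?P<v>[^"]+)"', re.MULTILINE)

SAMPLE = '''
provider "registry.terraform.io/hashicorp/azurerm" {
  version     = "3.105.0"
  constraints = "~> 3.105"
  hashes = [
    "h1:placeholder",
  ]
}
'''


def locked_versions(lock_text):
    """Map each provider source address to the version pinned in the lock file."""
    found = {}
    for block in LOCK_BLOCK.finditer(lock_text):
        version = VERSION.search(block.group("body"))
        if version:
            found[block.group("source")] = version.group("v")
    return found


def is_older(version, minimum):
    """True if `version` sorts before `minimum` (numeric comparison per part)."""
    def as_tuple(text):
        return tuple(int(part) for part in text.split("."))
    return as_tuple(version) < as_tuple(minimum)


if __name__ == "__main__":
    hypothetical_minimum = "3.110.0"  # placeholder, choose from the release notes
    for source, version in locked_versions(SAMPLE).items():
        status = "needs review" if is_older(version, hypothetical_minimum) else "ok"
        print(source, version, status)
```

Run across all repositories, a check like this gives a quick inventory of which stacks still pin an older azurerm release.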

https://redd.it/1ivf6ae
@r_devops
Bootstrapping CD for Terraform + Docker

TLDR: What's the best practice for managing infra with custom Docker based images using Terraform?

We primarily use GCP and for a lot of simple services we use Cloud Run with GAR (Google Artifact Registry) to store the Docker images.

To manage the infra, we generally use Terraform and we use GitHub Actions to do CI & CD.



Deployments to new environments consist of the following steps:

1) [Terraform] Create a new GAR repository that Docker can push to

2) [Docker] Build and push the Docker image to the newly created GAR repository and then

3) [Terraform] Deploy the Cloud Run service which uses the GAR image, alongside any other infrastructure we might need.

This 3-step process is usually how our CD (GitHub Actions) is structured and how our "local" dev (i.e. personal dev projects) works, both usually using just as the command runner.

Terraform needs to have a "bootstrap" environment which gets deployed in the first step, separate from the "main" one used in the third. Although, instead of using a separate bootstrap environment, you can also use -target to apply just the GAR but that has its own downsides imo (not a fan of partial apply, especially if bootstrap involves additional steps such as service account creation and IAM role assignment).
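
The three steps can be sketched as a single command plan. Everything named here is an assumption for illustration: the infra/bootstrap and infra/main directories, the Terraform variable called image, and the project and repository names in the usage block. The script only builds and prints the commands (a dry run), and a real runner would pass each list to subprocess.run with check=True.

```python
def image_ref(location, project, repository, image, tag):
    """Build an Artifact Registry image reference for Docker."""
    return f"{location}-docker.pkg.dev/{project}/{repository}/{image}:{tag}"


def deploy_commands(image, bootstrap_dir="infra/bootstrap", main_dir="infra/main"):
    """Return the bootstrap, build/push and main apply commands in order."""
    return [
        # 1) Terraform: create the GAR repository (bootstrap environment).
        ["terraform", f"-chdir={bootstrap_dir}", "apply", "-auto-approve"],
        # 2) Docker: build the image and push it to the new repository.
        ["docker", "build", "-t", image, "."],
        ["docker", "push", image],
        # 3) Terraform: deploy Cloud Run and the rest, pointing at the image.
        ["terraform", f"-chdir={main_dir}", "apply", "-auto-approve",
         "-var", f"image={image}"],
    ]


if __name__ == "__main__":
    # Hypothetical names, used only to show the shape of the plan.
    ref = image_ref("europe-west1", "my-project", "apps", "api", "abc123")
    for command in deploy_commands(ref):
        print(" ".join(command))
```

Keeping the plan as data makes it easy to reuse the same sequence from GitHub Actions and from a local command runner.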


It's possible to avoid having two Terraform apply steps by doing one of the following:

- Deploy the Cloud Run services manually using the gcloud CLI - but then you cannot manage it well via Terraform which can be problematic for certain situations.

- Perform the bootstrap separately (perhaps manual operations?) so normal work doesn't require it - but this sounds like a recipe for non reproducible infra - might make disaster recovery painful

- Run the docker commands as part of some terraform operator (using either a null resource with local exec or perhaps an existing provider such as kreuzwerker/terraform-provider-docker), but this might be slow for repetitive work and might just not integrate that well with Terraform



Any suggestions on how we can do this better? For trivial services it's a lot of boilerplate that needs to be written, and it just drains the fun out of it tbh. With some work I suppose it's possible to reuse some of the code, but we might introduce unnecessary constraints, and getting the abstraction right might take some work.

In a totally different world from my day job, my hobby NextJS apps are trivial to develop and a lot more fun. I can focus on the app code instead of all this samey stuff which adds 0 business value.

https://redd.it/1ivepjr
@r_devops
Am I Ready for DevOps?

I started learning about DevOps soon after I got into self-hosting and running my own homelab, and a few years on it has become my addiction. I currently work with VoIP and use Linux a bit at work, but nothing with containers or DevOps tools, so I have been learning purely through my homelab.

Anyway, I'm sick of VoIP and my current role and would like to start applying for junior DevOps roles. I'm curious whether people who actually do this as a job think I am prepared enough, based on my homelab alone.

Personally, I think I need to get better with Ansible and Kubernetes, add more things to Terraform/OpenTofu, and learn programming languages. That is what I am working on currently.

All of the config can be found at https://git.mafyuh.dev/mafyuh/iac or on GitHub at https://github.com/Mafyuh/iac

Please critique it and let me know what you think. This is my first time posting in DevOps, so I don't really know what to expect, but I'd love to hear it all, good or bad. Thank you

https://redd.it/1ivhjrn
@r_devops
Spring Boot microservices issue: I deployed my Spring Boot microservices on a DigitalOcean droplet, but I cannot reach them through the droplet's IP address from Postman. Is there a reason for this, or am I missing something? For example, a request to https://111.11.11.111:8082/register/user fails with the error below.

Could not send request
Error: connect ECONNREFUSED 111.11.11.1111:8082

All my microservices are deployed and running on DigitalOcean from .jar files, but I still get this error. Why? Please help!

https://redd.it/1ivhhfz
@r_devops
How Are You Handling Professional Training – Formal Courses or DIY Learning?

I'm curious about how fellow software developers, architects, and system administrators approach professional development.

Are you taking self-paced or instructor-led courses? If so, have your companies been supportive in approving these training requests?

And if you feel formal training isn’t necessary, what alternatives do you rely on to keep your skills sharp?

https://redd.it/1ivldyd
@r_devops
Packing RPMs from source - what are you using at scale?

Hi there,

We're running a largish AWS deployment (about 5k EC2 instances), a mixture of Alma 8 + 9 on aarch64. We have a number of packages we run on these nodes that are significantly out of date on the public mirrors e.g. Strongswan (nobody is packaging Strongswan 6 for Alma on aarch64 yet). How can we deal with this? We attempted to use Fedora Copr to build from source and package as RPM - however we had to write our own SPEC files and these kept failing.

We were thinking of using something like GitHub Actions linked to an ARM EC2 runner to build from source. This still doesn't give us an RPM though.



https://redd.it/1ivljyg
@r_devops
NEED for MENTORSHIP and guidance

I am a pre-final-year CSE (cloud computing) student, and I have developed an immense liking for DevOps and cloud, with a basic understanding of cloud and cloud services. I am desperate to find an internship, but I don't know where to begin. I have roadmaps and all, but what I need is one mentor who can guide me through the chaos in my mind and help me become proficient in DevOps and cloud. As of now, I can't say I am well versed in any skill set, and yes, I know that is embarrassing. But now I want to learn with full focus and the right resources, because I can't let my parents' money go into paid courses where I don't have proper guidance or a mentor who can be with me on my journey.

I need your help and support.

https://redd.it/1ivoohe
@r_devops
Is This a Scam Placement Company?

I received a message on LinkedIn from someone claiming to be with a placement company called HireEaze. They said they would provide resume building, interview coaching, and send out my resume to several companies per week. They also guarantee placement within 45 days. The catch is that they want 15% of my first year's salary, and the initial document they sent over is full of spelling and grammatical errors. Everyone I've talked to on the phone has an Indian accent, but the phone numbers are American. Has anyone used this company or one like it? Or is this just a scam?

https://redd.it/1ivo4bb
@r_devops
Do you have a list of project topics for POC-ing?

I would say that there are two types of PoC projects - super small, where you just write "Hello World" to a console, and slightly bigger one where you want to have a real topic behind the code.

For example, if I need a web service of some sort, my go-to project would be a pizza selector. Developers can have a list of pizzas available, and users can randomly select what pizza they want to order next time. I have used that a couple of times already and it is getting old :)

Do you have a similar type of project that is funny, somewhat useful and can be easily implemented/explained?

https://redd.it/1ivtlak
@r_devops
Icosic AI: Perplexity For Your Company’s Server Logs

Hello!

I'm Zuri, founder of Icosic AI, a startup based in San Francisco - we are Perplexity for your server logs.

The problem:

- searching through and filtering your logs using keywords is tedious at best

- semantic search is a step up, but still has no real intelligence regarding your query or your server logs

- engineers spend around 10 hours per week sifting through logs to investigate issues and uncover insights

The solution:

- Icosic AI is an intelligent search engine for all of your company's server logs

- We use LLMs to intelligently understand your search query and intelligently understand all of your logs

- This gives you insights and answers that previously would take your engineers hours to uncover

- For example, a fintech company's engineer could ask "Why has there been a spike in transaction failures this morning?"

- Another example: "Tell me all instances where we got a high latency warning within 2 minutes of a transaction failure"

The time and cost savings:

- A typical example is a company with 100 engineers, where 20 of them each look through logs 10 hours a week to investigate issues and uncover insights and information

- If they're paid $70/hour, that's $70 * 10 hours * 4 weeks * 20 engineers = ~ $56,000 / month searching through logs. Our search engine does ALL of that for you.
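
The estimate above can be reproduced with a few lines, which also makes it easy to plug in your own numbers. The function name is made up for illustration, and the 4 weeks per month figure is the post's own simplification.

```python
def monthly_log_search_cost(hourly_rate, hours_per_week, engineers, weeks_per_month=4):
    """Monthly cost of engineers searching logs, using the post's formula."""
    return hourly_rate * hours_per_week * weeks_per_month * engineers


if __name__ == "__main__":
    # Figures from the post: $70/hour, 10 hours a week, 20 engineers.
    print(monthly_log_search_cost(70, 10, 20))
```

Using 52 / 12 weeks per month instead of 4 raises the same estimate to roughly $60,700.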

More:

- You can integrate with your existing observability platforms like Datadog and Splunk to use logs that you've indexed there

- You can also just use logs that you've got on a cloud server somewhere at a specified path, for example /var/log/example.log

- You can use unstructured or structured logs, or both!

If you’re interested in finding out more, feel free to schedule a call with us from our landing page:

https://icosic.com

Also, you can start playing around with the product using our demo logs right away, no sign in required:

https://app.icosic.com

Feedback would be much appreciated!

What other integrations would you like to see? Let me know in the comments!

Thanks,
Zuri Obozuwa

https://redd.it/1ivx0db
@r_devops
Pipeline for dev containers to ecs?

Hey all! Just kind of thinking out loud here.

So I have pipelines etc in place that handle deployments to ecs. But these are tightly integrated with other services and I handle the deployments.

Say I wanted to create a portal and pipeline where devs could enter their resource requirements and specify their repo and branch, so that a container image is built and then deployed to a sandbox ECS environment with endpoints for common services and flexible network constraints. Are there any good resources to reference for this?

I feel like I'm missing features and use cases I haven't thought of that would be really cool here to improve the dev experience and give devs some more autonomy in dev deployments. So if you have any ideas, or a similar setup and can share how you use it, I'd love to hear about it!

Cheers.

https://redd.it/1ivxly0
@r_devops