Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Is anyone feeding their server or application logs to AI?

Either using a paid service, burning API credits, or self-hosting an LLM? I'm about to start experimenting.

https://redd.it/1gdxo8y
@r_devops
Serverless vs Serverful

Hi all,

Novice full-stack dev here. I need your opinion regarding the tech stack + deployment of a greenfield, multi-tenant web app for which I have 2 interested customers (payment plan pending) whose pain points are resolved, with hope to have many in the future but not more than 10k users globally.

My initial impulse is to have zero deployment costs, with a dockerized monolith backend (hosted on an always-free Oracle Cloud VM), an Angular frontend hosted on Netlify / Cloudflare, and a database hosted on Supabase. The reasoning is that “if” demand increases, I’ll simply scale these services vertically, and maybe even go cloud-native in the future.

Competing with this thought are my AWS cloud skills from work, which push me toward going completely serverless and using managed services to speed up development and not think about infra scaling and security down the line. However, if I do it right, with API GW, WAF, etc., I’ll incur costs from the get-go (even with free tier) without having seen a single payment from the customer(s).

In your experience, which option would you recommend in such scenarios? Would you recommend I disregard the minimal costs from AWS and go cloud-first to prevent future headaches when I’m focusing on delivering features / adapting business logic, or should I experiment with all-free services to wait until I have enough customers that support putting in effort/costs to go cloud-native (given that all code needs to be refactored / changed anyway)?

The application needs a REST API to perform CRUD operations on multiple related tables in a PostgreSQL DB, and start many task queue operations per user.

https://redd.it/1ge11ux
@r_devops
GitOps Setup: Security Concerns with Automated Deployments

**Current Setup**: I have a straightforward but powerful GitOps workflow that consists of these steps:

1. Developer pushes code to source repository (GitLab.com)
2. CI/CD pipeline builds Docker image
3. Automated pipeline updates image tag in application repository
4. ArgoCD detects change and deploys new version



**Problem**: While this setup works well, there are security concerns. External developers with access to one repository can theoretically manipulate image tags of other repositories by triggering the pipeline with different values. There's a lack of granular access control for the deployment process.



**Planned Solution**: I'm planning to develop an open-source service to address this security gap. Here's the current design:

API Endpoint for New Image Tags

Route: /api/v1/new-image-tag

Parameters needed:

* image\_tag (e.g., 1234567890)
* branch (e.g., main, develop, or feature/my-cool-feature-1)
* secret\_key (from GitLab CI variables)
* repository (e.g., gitlab.com/my-application/frontend)

The service will:

1. Validate permissions using secret key and repository
2. Determine environment from branch name
3. Create feature-branch specific values files automatically
4. Update image tag in values-$STAGE.yaml
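
Steps 1, 2, and 4 of this list could be sketched roughly as follows. This is only an illustration under assumptions the post does not state: the per-repository secret table `REPO_SECRETS`, the branch-to-stage mapping, and the function names `authorize`, `stage_for_branch`, and `values_file` are all hypothetical.

```python
import hmac
import re

# Hypothetical per-repository secrets; a real service would load these from a secret store.
REPO_SECRETS = {
    "gitlab.com/my-application/frontend": "s3cr3t-frontend",
}


def authorize(repository: str, secret_key: str) -> bool:
    """Return True only if secret_key matches the secret registered for repository."""
    expected = REPO_SECRETS.get(repository)
    if expected is None:
        return False
    # Constant-time comparison, so the check does not leak the secret through timing.
    return hmac.compare_digest(expected.encode(), secret_key.encode())


def stage_for_branch(branch: str) -> str:
    """Map a branch name to a stage name (assumed mapping, not taken from the post)."""
    if branch == "main":
        return "production"
    if branch == "develop":
        return "staging"
    # Feature branches get their own stage; slugify so the name is safe in a file name.
    return re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")


def values_file(branch: str) -> str:
    """Name of the values file whose image tag would be updated."""
    return f"values-{stage_for_branch(branch)}.yaml"
```

Because each repository has its own secret in this sketch, a pipeline in one repository cannot pass the check for another repository's image tag.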

API Endpoint for Closed Merge Requests

Route: /api/v1/closed-mr

Parameters needed:

* mr\_id (e.g., 45)
* secret\_key (from GitLab CI variables)
* repository (e.g., gitlab.com/my-application/frontend)

The service will:

1. Get branch name from merge request
2. Check for open merge requests in other repos with same branch name
3. Clean up feature branch configuration if no other MRs exist
4. Allow ArgoCD to remove obsolete deployments



**Questions for the community**:

1. Are there existing tools that solve this security issue?
2. What are the best practices for securely handling image tag updates in a GitOps setup?
3. Would developing a new open source solution for this be valuable?

I'm particularly interested in solutions that maintain the simplicity of GitOps while adding proper security controls. Before starting development on a new tool, I want to make sure I'm not reinventing the wheel. Any insights or suggestions would be greatly appreciated.

https://redd.it/1ge0t5g
@r_devops
Should we migrate our IaC from Terraform to OpenTofu and deploy using Terragrunt with Terramate?

We manage all of our infrastructure with Terraform only, and because of this we have really big Terraform stacks: even though we use modules, we end up with 3000 lines in main.tf due to the number of services and resources.
1. One issue we faced: whenever we deploy the Terraform from a Mac, we get some drift in the plan, but not on Linux or Windows machines. We're not sure if file handling is different or if it's some other issue.
2. The second issue: sometimes when planning we see drift on DB resources, and for production it really scares us. Why is it showing changes in DB resources even though all I did was change the values for compute resources?

For the first problem we moved to GitOps and do all deployments through AWS CodePipeline only. For the second issue we decided to use Terragrunt, since it breaks up the stacks and, thanks to its structure, we can use a single repo to store multi-region and multi-environment deployments with less code and a better file structure. But in Terragrunt we don't see change detection; for that we need to use Terramate, and tbh there are very few resources available online for it. So I'm a little worried about whether we should move our production IaC to Terragrunt with Terramate and migrate from Terraform to OpenTofu.

If any of you have done something similar, can you please share your experience? These tools are still at a somewhat early stage, and I'm not sure how mature they have become.

Please suggest,
Thanks!

https://redd.it/1ge8mwq
@r_devops
Looking for advice

Looking for a bit of advice


Hi all, I am a junior DevOps engineer. I have been in DevOps for 3 years, and my skills mostly comprise Terraform, AWS and GitLab.


Bit of background: I have a degree in Maths so this is my first experience in IT.
I have learned pretty much everything on the job, and in order to learn I used a variety of resources and some certifications (yes, they are not everything, but for me certs are a good way of structured learning).

Currently I have:
- Terraform Associate
- AWS SA Associate
- AWS DV Associate
- AWS Sysops Associate

We don’t have many cloud requests now at my company, and I have been presented with an opportunity to join a different project that is focused on Linux automation, containers and Ansible, pretty much as a sysadmin.

I am torn because:

- I think I have come a long way, and my AWS and Terraform knowledge is very good.

- I also feel like I don’t have much Linux knowledge. I can google stuff and find solutions, but I always feel like I don’t have enough operational Linux knowledge to be a DevOps engineer, although I have barely used Linux at work.

- Changing environments is not easy, especially when I basically have to learn everything again.

- My ultimate goal is to be a cloud architect, and I don’t want to stray too far from this path (although I have been assured that I will still be involved with AWS projects).

Basically, I’m asking seniors or more experienced DevOps engineers: if you were starting again and faced the same situation, how would you go about it?

https://redd.it/1gebsbb
@r_devops
Is devops/IT all doom and gloom?

I've been researching getting into the field of IT/devops. I will have the 3 basic CompTIA courses done in 3 months' time, and I'm also starting a homelab with specific devops-related projects.

I've read so many comments and posts about the industry going down and there being no jobs. Is this genuinely the case and am I wasting my time starting all this, or is there still a future in the industry, given the right work and effort of course? I'm based in the UK.

https://redd.it/1ged01o
@r_devops
Live Coding for Interviews in DevOps roles

I've been asked to do live coding for a DevOps role. I've been told I can use whatever tools I like to present my skills, but I'm unsure what I can present in a 30-minute to 1-hour interview. I'm good with Python if they want code, but I think DevOps is more about templates, architecture planning, etc. Since the challenge is very open, it has been hard for me to think of a case where I can show my skills as a DevOps engineer, because the daily job involves a lot of troubleshooting and configuring. Do you guys have any tips for mastering this kind of interview? Thanks.

https://redd.it/1geiron
@r_devops
Can we talk about App Center's shutdown?

I'm looking for a distribution tool that does what App Center does, and I have found a few I think are worth investigating. My question: has anyone found a good alternative tool yet, or can anyone recommend one or share insights from their experience with these tools? The ones I've shortlisted are:

[Runway.team](https://www.runway.team/)
Previews
[Release Management](https://devcenter.bitrise.io/en/release-management/getting-started-with-release-management/connecting-another-ci-service-to-release-management.html)
Build Distribution.

Any advice is helpful! TY

https://redd.it/1geg9fm
@r_devops
What to use for mass deployment with default configs?

Hi everyone. I made Python scripts using Paramiko and Selenium (SSH is disabled by default on the switches) for mass deployment of networking gear. The configs are exactly the same for every single switch and router, which means management IPs are the same for the switches, etc.

My Python script updates firmware first, then adds the configuration so that I don't lose the connection to the hardware. I'm trying to make the script better by making it a CLI tool or using a different tool which is what I'm asking here for.


Ansible, Netmiko, or stick with my current scripts?

I want to add real concurrency (Go?) instead of using starmap from the multiprocessing library.
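
For work that mostly waits on the network (firmware uploads, SSH or browser sessions), a thread pool from the Python standard library is one option to consider before switching languages. A minimal sketch, assuming a hypothetical `deploy` function that stands in for the per-device work; the `hosts` identifiers and the firmware string are illustrative only:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def deploy(host: str, firmware: str) -> str:
    """Hypothetical placeholder for the per-device work (firmware update, then config push)."""
    return f"{host}: {firmware} ok"


def deploy_all(hosts, firmware, max_workers=8):
    """Run deploy() for every host concurrently and collect one result per host."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(deploy, host, firmware): host for host in hosts}
        for future in as_completed(futures):
            host = futures[future]
            try:
                results[host] = future.result()
            except Exception as exc:
                # One failed device should not abort the rest of the batch.
                results[host] = f"failed: {exc}"
    return results
```

Calling `deploy_all(["sw1", "sw2"], "v1.2")` returns a dict keyed by host, so failures can be retried individually.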

https://redd.it/1gelwdz
@r_devops
Are there IaC security scanning tools that are less noisy and allow you to select which rules to scan?

The default choices, like Checkov, KICS, and tfsec, are too noisy.

https://redd.it/1geniim
@r_devops
secret variables in GitHub actions - more than 100 env vars

I have been working on the containerization of our existing application, and the application uses a lot of env vars/keys to work: there are more than 100 vars for each environment. Also, we do not want to push our .env config file to GitHub. As per GitHub, you can store up to 1,000 organization variables, 500 variables per repository, and 100 variables per environment. The combined size limit for organization and repository variables is 256 KB per workflow run.

So what would be an alternative? And considering that the vars change based on the environment, what would be the best and most efficient way to tackle this?
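
One workaround sometimes used for this (not something the post itself proposes) is to keep the whole per-environment .env content in a single secret and expand it at runtime. A minimal sketch of the expansion step; the function name `parse_env_blob` is hypothetical, and the parser only handles simple `KEY=value` lines, comments, and surrounding quotes:

```python
def parse_env_blob(blob: str) -> dict:
    """Parse the text of a .env file into a dict of variable names to values."""
    env = {}
    for raw in blob.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=value line
        # Split on the first "=" only, and drop surrounding quotes from the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env
```

The resulting dict could then be merged into `os.environ` or written to a file that the container reads at start-up.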

https://redd.it/1geoh7g
@r_devops
I want to learn a scripting language

I have been using Go for scripting for 6 months, but I would like to learn a language more suitable for scripting, like Python or Bash. Which scripting language would you recommend I learn, and why? It would also be nice if you shared any resources for learning the language.

https://redd.it/1gepl6x
@r_devops
Why Cloud Engineering & DevOps Are Essential for Modern Business Growth

The technological landscape in today’s fast-paced world is changing. Businesses constantly seek ways to optimize their efficiency, scalability, and innovation. The rise of Cloud Engineering and DevOps has played a significant role in the changing dynamics of businesses. Many businesses’ successes involve having cloud engineers and DevOps departments in their company.

Learn More: **Why Cloud Engineering & DevOps Are Essential for Modern Business Growth**

https://redd.it/1gep5hq
@r_devops
Any Advantages to running nginx in a docker container?

Typically I install this with apt install nginx and then edit the config files. As the title suggests, are there any advantages to using 'docker pull nginx' and running nginx separately in a Docker container on my VM?

I haven't had any issues with it running globally, but assume if it crashes then the whole machine goes down, whereas with docker only the container would?

Thanks.

https://redd.it/1ger10o
@r_devops
Advice for creating a listener process in AWS ECS

I have a Laravel application on EC2 that serves as a queue listener, standing by to receive and process any messages available in SQS. It works fine with supervisorctl on an EC2 instance. Lately I tried to dockerize it and run it with ECS RunTask, defining the artisan queue command as the Docker command to keep the session alive. But I noticed that if I have a new version in ECR, I need a way to restart all the queue listener tasks running in ECS. How can I do that? We have roughly 21 queue listeners, so it's impractical to restart them manually one by one.

https://redd.it/1ges7id
@r_devops
Should I go for GCP ACE or AWS Associate Developer?

So I just got into a GCP cohort where they will provide a discount or a free cert for ACE if I qualify. I am going to start my DevOps internship in January, and the company is AWS and Azure centric. I already have some experience with AWS, so I don't think getting the Associate Developer cert will take long. Any idea what I should do? I am too confused 😕

https://redd.it/1gesxaa
@r_devops
Jenkins jobs logging solution needed

Hi All,
I have around 200 Jenkins jobs running for a bunch of projects. Only a few of them are deployment jobs, and this query is not about those. The other 170+ jobs were created to run certain functionality within a few applications. You could say they're like cron jobs (or batch jobs).

These batch jobs take file input from various SFTP servers and then process the files one by one.

The issue is that these jobs report success even if one of the files from any SFTP server is not fetched. Let's say each job fetches 10 files from different SFTP servers and misses 1 file but successfully executes the other 9: it's still a success. It's not feasible for me to go into the console log of each job and see which of them executed all 10 files; that would be very time consuming.

Is there any solution for cases like this, such as a dashboard that collects the logs from all specified jobs so I can check them all in as little time as possible? I was thinking of something like ELK.
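
Independently of a dashboard, the check itself can be automated by comparing the files a job was expected to fetch with what its console log reports. A minimal sketch, assuming a hypothetical log line format `Fetched file: <name>` (the real jobs' output is not shown in the post) and a hypothetical helper named `missing_files`:

```python
import re

# Hypothetical log line format; adjust the pattern to what the jobs actually print.
FETCHED = re.compile(r"^Fetched file: (\S+)$", re.MULTILINE)


def missing_files(console_log: str, expected: set) -> set:
    """Return the expected file names that never appear as fetched in the log."""
    fetched = set(FETCHED.findall(console_log))
    return expected - fetched
```

A job step could exit non-zero whenever the returned set is non-empty, so Jenkins marks the build as failed instead of successful.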

Thanks in advance.

https://redd.it/1geso87
@r_devops
Why do we need automated regression testing in CI/CD pipelines?

Smoke tests, integration, end to end. I am trying to grasp the whole picture. Why do we need regression testing? How should it be implemented? What are the pros and cons? Blog posts or books on this would be welcome.
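
As a small illustration of the idea, a regression test pins down behavior that broke once so the pipeline catches it if it breaks again. The function `parse_version` and the past bug described in the comment are hypothetical:

```python
def parse_version(tag: str) -> tuple:
    """Parse 'v1.2.3' or '1.2.3' into a tuple of integers such as (1, 2, 3)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def test_two_digit_components_compare_numerically():
    # Hypothetical past bug: versions were compared as strings, so "1.10.0" sorted before "1.9.0".
    assert parse_version("v1.10.0") > parse_version("v1.9.0")
```

A test runner such as pytest would collect the `test_` function on every pipeline run.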

https://redd.it/1gevfvz
@r_devops
Secure deployment on client's system

Hi,
I have an application which runs on multiple EC2 instances, with around 10 Docker containers running on them. It processes some sensitive data.
Now what is the best method to deploy this on a client's AWS account? I need to protect my logic, source code and some other data.
Since it's the client's account, they can log in to EC2 and see the contents. How can I prevent this? What are some industry best practices?

https://redd.it/1gewgyz
@r_devops