Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
I recently accepted a job offer after my internship, and it’s for DevOps. My main strength is coding, and my team mostly uses me for it. Would it be unwise to move to an SWE position in the future given that I have a DevOps title?

Anyone have experience transitioning into SWE after doing DevOps, and did it affect your career growth in any major way?

https://redd.it/1g2hohb
@r_devops
I’m looking to build a task management platform that keeps track of when a task is done processing. I’m looking to add a message broker: should I go with RabbitMQ or Kafka?

I have a Golang server that orchestrates VPS servers in the cloud and assigns each task its own VPS instance. Once the task is done, the Go server deletes the VPS instance. It works at small scale, but I foresee that it won’t be scalable.

Should I be using RabbitMQ or Kafka as the message broker to handle this? What’s the most cost-effective and scalable approach?
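Since each task is consumed exactly once and acknowledged when its VPS is torn down, this is a classic work queue, which is RabbitMQ's home turf (Kafka shines more for replayable event streams). As a rough, hedged sketch of the idea, here is a broker-agnostic message shape, with the hypothetical `pika` calls and queue name confined to comments:

```python
import json
import uuid
from datetime import datetime, timezone

def make_task_message(task_type: str, payload: dict) -> bytes:
    """Serialize a task so any worker can process it, broker-agnostic."""
    return json.dumps({
        "task_id": str(uuid.uuid4()),      # lets workers report/ack by id
        "type": task_type,
        "submitted_at": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    }).encode("utf-8")

def parse_task_message(body: bytes) -> dict:
    """Inverse of make_task_message, used by the consumer."""
    return json.loads(body.decode("utf-8"))

# With RabbitMQ (third-party `pika` client), publishing would look roughly like:
#   channel.queue_declare(queue="tasks", durable=True)
#   channel.basic_publish(exchange="", routing_key="tasks",
#                         body=make_task_message("provision_vps", {"size": "small"}))
# The worker acks (channel.basic_ack) only after the VPS is deleted, so a
# crashed worker's task is redelivered instead of lost.
```

The ack-after-delete flow is what makes the queue the source of truth for "is this task done", which is exactly the tracking the platform needs.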

https://redd.it/1g2h5kp
@r_devops
Are you using LLMs in your DevOps work?

Hello,

I would love to hear about your experiences: how are you using LLMs in your daily work?

For example, brainstorming architecture ideas, or writing scripts or IaC with them.

https://redd.it/1g2j06f
@r_devops
Migrating from On-Prem SMTP Relay to Azure Communications Services - Seeking Input

Hey all,

I’m working on a project to migrate from our current cloud SMTP relay to Azure Communications Services. Right now, we have an on-prem SMTP relay at 12 global locations that forwards email traffic from various on-prem devices (potentially hundreds) to the cloud relay. I’m trying to figure out the best approach for this migration and would appreciate some input.

Here are the options I’m considering:

1. Service Principal for Each Device: Setting up a service principal for each on-prem device to directly use Azure Communications Services. This could scale to hundreds of devices, which seems like it could be an admin headache.

2. Keep On-Prem Relays: Retain the on-prem relays and have them forward into Azure Communications Services. This might help with managing the scale but could complicate the architecture.

3. Hybrid Approach in Phases: Implement both options in two phases—keeping the relays initially, then gradually moving to direct integration.

I’m also thinking about automation for provisioning new devices/services to ensure it’s not a bottleneck. In a past project, I decommissioned on-prem Exchange servers and moved to AWS SES, automating user provisioning with CloudFormation. However, that setup involved far fewer services than what I’m facing now in Azure.

Has anyone here gone through something similar, or have ideas on how best to tackle this without creating an admin nightmare? Any tips or best practices for scaling, automation, or managing the transition would be appreciated.

Thanks in advance!

AI was used to help articulate my thoughts.

https://redd.it/1g2kgdq
@r_devops
Flyway integration

Hello, I am currently looking for schema migration tools to replace our current one. Has anybody here used Flyway before? Do you use it standalone, or paired with other tech? Right now I have it running in Jenkins: I write DDLs in VS Code, move them to a dedicated container, and then point Jenkins at that location. Any advice on other setups, or on making it more automatic? Especially the naming part... Thanks in advance

https://redd.it/1g2mkgw
@r_devops
How do you size your internal engineering teams?

One challenge I've always had running teams that build/maintain internal tooling or provide internal support is getting hiring budget. It's easy to justify staffing up product teams, but when your impact on the business is less direct, how do you determine the "right" number of heads?

https://redd.it/1g2rh1p
@r_devops
How should a Grafana deployment be set up in a production environment?

I have a NodeJS application running on an EKS cluster on AWS (deployed with Terraform). Each pod in the cluster exposes a /metrics route serving that pod's Prometheus metrics.

Next, I'd like to use Grafana for visualization and analytics of my cluster, but I'm not sure how Grafana should be deployed in my system.

I thought of deploying this Helm chart: https://artifacthub.io/packages/helm/grafana/grafana. But Grafana needs to persist its data (the dashboards), so I think that's a bad idea: with a plain deployment of the Grafana Helm chart, I'd lose data persistence.

So I thought of using the AWS Managed Grafana service (https://aws.amazon.com/grafana/), but now I'm not sure how to connect this Grafana to my EKS cluster to collect the data.

---



I will clarify my question. Currently I deploy Prometheus in my EKS cluster using this Helm chart: https://artifacthub.io/packages/helm/prometheus-community/prometheus. This is the point I'm trying to understand: Grafana is responsible for data visualization, but it needs data to visualize, and as far as I know, Prometheus is what holds that data. So my questions are:

- Where does Prometheus store my metrics data, and how can I make it persistent?
- How do I connect my EKS Prometheus deployment to AWS Managed Grafana?
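On the second question: any Grafana (self-hosted or AWS Managed Grafana with a Prometheus data source) just reads from Prometheus's HTTP API, so a useful first step is confirming Prometheus is reachable and actually holding data. A stdlib-only sketch; the localhost address assumes something like `kubectl port-forward svc/prometheus-server 9090:80` against the prometheus-community chart's default service name:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def build_query_url(prometheus_base: str, promql: str) -> str:
    """Build an instant-query URL for Prometheus's HTTP API (/api/v1/query)."""
    return f"{prometheus_base}/api/v1/query?" + urlencode({"query": promql})

if __name__ == "__main__":
    # "up" returns 1 per target Prometheus is successfully scraping;
    # if your pods show up here, any Grafana pointed at this API sees them too.
    url = build_query_url("http://localhost:9090", "up")
    with urlopen(url) as resp:
        print(json.load(resp)["status"])
```

Whatever endpoint answers this query is the URL you register in Grafana as the Prometheus data source.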



https://redd.it/1g2tutb
@r_devops
Is there any free tool for security checks on code?

Hi

I am running a micro project with a single developer, and I need to scan the developer's code for weaknesses.

I wonder if there is any tool that provides a free (even if feature-limited) scan of the code to ensure it is secure and free of mistakes such as hardcoded passwords, stored keys, etc.
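There are solid free options for exactly this: Semgrep and Bandit for static analysis, Gitleaks for secrets, and Trivy for dependency/filesystem scanning, all usable at no cost. To make concrete what a secrets rule does, here is a deliberately tiny stdlib sketch (two toy patterns; real scanners ship hundreds of tuned rules):

```python
import re
from pathlib import Path

# Toy patterns only; real scanners like gitleaks maintain far better rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "hardcoded_password": re.compile(r"""password\s*=\s*["'][^"']+["']""", re.IGNORECASE),
}

def scan_text(text: str, origin: str = "<memory>") -> list:
    """Return (origin, rule_name, line_number) for each suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((origin, name, lineno))
    return findings

def scan_tree(root: str) -> list:
    """Scan every .py file under root; same idea extends to any extension."""
    findings = []
    for path in Path(root).rglob("*.py"):
        findings.extend(scan_text(path.read_text(errors="ignore"), str(path)))
    return findings
```

For a one-developer project, wiring one of the real tools into CI (most have a one-line CLI invocation) gets you this plus dependency and misconfiguration checks for free.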

Thanks a lot

https://redd.it/1g2umak
@r_devops
What's your next career step? Seeking advice

I essentially feel like I woke up one morning and realized I am 44. I've been working in the infrastructure and DevOps field since the late 2000s, and I still fondly remember learning at my first DevOps conference in the early 2010s that there were a lot of people older than me who faced the same challenges and essentially wanted to do things in a better way.



I look young (people are always shocked to learn my age and tell me they thought I was about 30), partly because of hobbies that keep me very fit, and partly because of baby-face genes. I'm hesitant to change my behavior, but I also know that I don't have the mannerisms of someone in their mid-40s, for both good and bad.


I have a decent paycheck with decent benefits, but I also don't think it's a good idea to simply keep on trucking; it would make sense to have some sort of direction or intention. Other than having to work with scrum, and all that that means, life is pretty good, which is kind of surprising after having battled depression a lot throughout life.



Do you have plans for what happens in your mid-40s, career-wise?

https://redd.it/1g2wvxu
@r_devops
Understanding the OpenShift internal image registry

I hit a weird bug this week.
We started using the JFrog registry instead of ImageStreams in our namespace. While all my deployments had ImagePullPolicy: Always, GitLab CI defaults to the pull policy "if-not-present".
As soon as I understood what was happening, I could fix it quickly. However, until I solved the bug, the image used seemed to differ on every pipeline run.

Question: if the internal registry caches an old image, I would expect it to deterministically pull the same cached image.
However, it pulled a seemingly random different old image each time.
How does that happen? Does OpenShift have multiple internal registries? Does it depend on the node? I couldn't find any explanation.

Thanks in advance


https://redd.it/1g2ud4i
@r_devops
How much backend knowledge should I have to be good at DevOps?

So I recently worked a bit with a small startup remotely, alongside my full-time job. Between the job and the commute I couldn't give the startup much time, and my personal laptop stopped working around the time I joined, so it was a mess. I was disappointed in the end, as I couldn't contribute much; after a few weeks they told me that we couldn't keep working remotely like this and that I needed to work more on my skills to keep up (it wasn't paid or contractual).
During this time I was working on the frontend and backend of a feature within their app, nothing crazy, but I couldn't give it enough time. They told me I should know backend concepts, as they'd be helpful in DevOps and for troubleshooting systems. Even in the short interview they did, they mostly asked me Node-based questions (they weren't from a DevOps background; they were interns themselves).
I'm already not satisfied with what I get to work with at my company, and now this makes me question my skills even further. So how can I get a grasp of these concepts? I want to build some productivity tools so that I can do a bit of programming too. Please help with this 😕 I have been in this DevOps role for 4 months, and it's my first job as well. My senior kind of sucks, follows a lot of bad practices, and does so much manual work. So I'm worried that being stuck in such an organisation at the start of my journey will ruin my future opportunities.

https://redd.it/1g38akv
@r_devops
No calls in DevOps despite 1.5 YOE, feeling low



Hi everyone 🙌
I have 1.5 years of experience in AWS DevOps and have applied to so many companies, but still no calls. My current company doesn't have good clients or projects. Help me land a job: what can I do with my CV? Is my CV that bad for a DevOps role 🙂‍↕️
Here is my CV 👇


Technical Skills
Tools: Ansible, Docker, Kubernetes, Terraform (IaC), Maven, CI/CD (Jenkins), Argo CD, Git & GitHub, ELK (Elasticsearch, Logstash, Kibana), Prometheus & Grafana
Scripting Languages: Bash & Python
AWS Services: AWS Route 53, EKS, IAM, RDS, DynamoDB, ASG, CloudWatch, SNS, S3, AWS Lambda, EC2
Experience
Loomtex Exports July 2022 - May 2023
- Management Trainee

Fusion5, Hyderabad, India August 2023 – Present
- Junior DevOps Engineer


Projects
Multi-tier web application
• Deployed a 3-tier application (Front-end, Back-end, Database) on an EKS cluster using Terraform for IaC.
• Configured CI/CD pipelines using Jenkins, integrated with SonarQube for code quality and Nexus for artifact storage.
• Automated environment setup with Ansible playbooks for Jenkins, SonarQube, and Nexus.
• Managed source code with GitHub and utilized Maven and NodeJS for building and packaging application artifacts.
• Enhanced project security using SonarQube and Trivy to detect and mitigate vulnerabilities.
• Built Docker images, stored them on Docker Hub, and set up comprehensive monitoring for system and website metrics.
Microservice application
• Engineered the implementation of EKS via Terraform and configured Jenkins and SonarQube using Ansible, boosting deployment efficiency by 35%.
• Established 12 different Jenkins multibranch pipelines to streamline CI/CD processes.
• Implemented Webhooks to increase automation and minimize manual work.
• Created and integrated application components using build tools specified in the pipeline.
• Utilized Docker to create images, transferring them to the Docker registry, and employed Trivy for enhanced security.
• Launched the application on an EKS cluster, using Prometheus and Grafana for performance tracking, achieving a 30% increase in uptime and improving resource utilization by 20%.
AWS Cost Optimization
• Developed and deployed an automated solution using Boto3 and AWS Lambda to remove obsolete EBS snapshots exceeding 30 days, reducing storage expenses by 20%.
Certifications
AWS Certified Cloud Practitioner Dec 2023 - Dec 2026
Education
Institute of Chemical Technology, Mumbai, Maharashtra - Bachelor of Technology in Fibers and Textile Processing Technology, June 2022
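A side note on the AWS cost-optimization project above: that pattern is compact enough to show concretely, and interviewers often ask for it. A hedged sketch of what such a Lambda might look like, with the age filter kept as a pure, testable function (a real version would paginate describe_snapshots and guard against snapshots still referenced by AMIs):

```python
from datetime import datetime, timedelta, timezone

def snapshots_to_delete(snapshots, now=None, max_age_days=30):
    """Return SnapshotIds whose StartTime is older than max_age_days.

    `snapshots` is the "Snapshots" list from ec2.describe_snapshots().
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [s["SnapshotId"] for s in snapshots if s["StartTime"] < cutoff]

def lambda_handler(event, context):
    import boto3  # preinstalled in the AWS Lambda Python runtime
    ec2 = boto3.client("ec2")
    snaps = ec2.describe_snapshots(OwnerIds=["self"])["Snapshots"]
    stale = snapshots_to_delete(snaps)
    for snapshot_id in stale:
        ec2.delete_snapshot(SnapshotId=snapshot_id)
    return {"deleted": len(stale)}
```

Being able to walk through code like this, rather than just the bullet point, is what turns a CV line into an interview answer.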

https://redd.it/1g3aubb
@r_devops
Seeking Advice: Implementing a Container Image Proxy - What Do You Wish You Knew Before?

Hello there,

We're planning to implement a container image proxy in our environment, and I wanted to reach out to see what advice you all might have.

For those of you who have already set this up, I’m curious:

1. What are the biggest challenges you faced when implementing your container image proxy?
2. Were there any "gotchas" or pitfalls you wish you had known beforehand?
3. What tools or approaches did you find most helpful?

Any insights would be greatly appreciated! We’re currently assessing potential proxies (Harbor, Nexus, etc.) and planning how to integrate this with our existing CI/CD pipelines and Kubernetes clusters.

Thanks in advance for your help!

https://redd.it/1g3curl
@r_devops
Suggestions for a tool that can perform deployments from a monorepo

I worked at a large org some time ago. Their cloud was deployed by an Azure DevOps pipeline that ran a PowerShell script. The script would work out the changed files in the commit(s) and, based on the directory and file paths of those files, perform the relevant actions, e.g. terraform apply, run a PowerShell script, apply an Azure policy. It had been developed in-house organically (= a mess), and my question today is whether there are modern open-source tools that can do something similar: orchestrate actions based on some kind of rule set, but in a well-defined framework.
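The path-to-action rule set itself is tiny to express; what a framework buys you is everything around it (locking, plan/apply separation, logs). A stdlib sketch of the dispatch idea, with made-up globs and action names; tools like Atlantis (for the Terraform part) or plain CI path filters give you this plus the orchestration:

```python
from fnmatch import fnmatch

# Hypothetical rule set: first matching glob decides which action runs for a
# changed file. Real tools express this as YAML "path filters", but the core
# is just this mapping. Note fnmatch's "*" also matches "/" here.
RULES = [
    ("infra/*.tf", "terraform apply"),
    ("policies/*.json", "apply azure policy"),
    ("scripts/*.ps1", "run powershell script"),
]

def actions_for(changed_files):
    """Return the distinct actions triggered by a list of changed paths."""
    triggered = []
    for path in changed_files:
        for pattern, action in RULES:
            if fnmatch(path, pattern):
                if action not in triggered:
                    triggered.append(action)
                break  # first matching rule wins
    return triggered
```

Feeding it the output of `git diff --name-only` between two commits reproduces the in-house script's behavior in a few lines; the hard part the org's script presumably also lacked is safe ordering and locking between those actions.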

https://redd.it/1g3e9gx
@r_devops
What is the most reliable way to deploy a react application in production?

I'm trying to deploy a Docker container of a Create React App, but the environment variables are sometimes not set by the GitHub workflow.
Dockerfile and GitHub workflow:

# Use node 21.7.1 as the base image
FROM node:21.7.1

# Set the working directory in the Docker image
WORKDIR /app

# Accept REACT_APP_HOST_API_URL as a build argument
RUN echo "The environment variable REACT_APP_HOST_API_URL is https://20.0.0.120:8080"
RUN echo "The environment variable REACT_APP_ENV is development"
# Set the environment variable so it's available during the build and runtime
ENV REACT_APP_HOST_API_URL=https://20.0.0.120:8080
ENV REACT_APP_ENV=development
ENV NODE_ENV=production
COPY package*.json ./
RUN npm install
RUN npm install -g serve
COPY . .

RUN npm run build

CMD ["serve", "-s", "build", "-l", "3000"]

EXPOSE 3000




name: Build and Push Docker Image

on:
  push:
    branches:
      - main  # You can change this to the branch you want to trigger the workflow on

jobs:
  build:
    runs-on: ubuntu-latest  # Use the latest Ubuntu environment for the build

    steps:
      # Step 1: Checkout the code from the repository
      - name: Checkout code
        uses: actions/checkout@v3

      # Step 2: Set up cache for npm dependencies
      - name: Cache npm dependencies
        uses: actions/cache@v3
        with:
          path: ~/.npm  # Cache path for npm
          key: ${{ runner.os }}-node-${{ hashFiles('package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-node-

      # Step 3: Install dependencies
      - name: Install dependencies
        run: npm install

      # Step 4: Log in to Docker Hub
      - name: Log in to Docker Hub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}  # Your Docker Hub username
          password: ${{ secrets.DOCKERHUB_TOKEN }}     # Your Docker Hub token or password

      # Step 5: Build the Docker image using a custom Dockerfile (Dockerfile-dev.yml)
      - name: Build Docker image
        run: docker build -f Dockerfile-dev.yml -t my-user-name/react-app:latest .

      # Step 6: Push the Docker image to Docker Hub
      - name: Push Docker image to Docker Hub
        run: docker push my-user-name/react-app:latest


The env vars are not set when executing the Docker image:


    docker exec -it agent-react-dev-react-agent-app-1 /bin/sh
    # env
    NODE_VERSION=21.7.1
    HOSTNAME=db4d9df42f42
    YARN_VERSION=1.22.19
    HOME=/root
    TERM=xterm
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
    REACT_APP_HOST_API_URL=
    REACT_APP_ENV=
    PWD=/app
    # exit

https://redd.it/1g3d1p6
@r_devops
What methodology/best practices would you suggest for DevOps on an Angular project?

I am new to this and looking for some information on CI/CD for an Angular project. I use GitLab as my repo manager.

https://redd.it/1g3kqy9
@r_devops
I believe our DevOps team, under new leadership, spends 90% of its time on self-manufactured bureaucratic process.

We now have a Terraform architecture where every piece of infrastructure is a module and every account is a root config (read: one GitHub repo for each module and each root config). We are a relatively small actor in our enterprise, and this has already led to at least 100 git repos. Every PR to every repo requires a Jira ticket, a pull request labeled with that Jira ticket, and a PR review. All modules are semantically versioned. Realize that any patch to any module requires N PRs to all the modules which import it, which means N Jira tickets too and N runs of CI/CD, and then those modules need to be updated in M other repos, until eventually you get to the root configs, where you then have to run terraform apply on each of them. One bug in a downstream module can take a week to get fixed upstream with this approach. We have set up CI/CD to automate some of the Jira ticket and PR creation, but we wouldn't need that if we had a more scalable system. I advised the new manager of this problem when they joined, and they just said that's how they did it at their last company and it worked. If we keep going like this, I fear the repo count will quickly move into the thousands. I'm getting carpal tunnel just closing Jira tickets and merging PRs for bugs.

I used a monorepo following GitOps principles when I was in charge (the story is that I moved out of management). None of this would be a problem under that paradigm. I'm sure my process could have been improved, but this process is insane.

https://redd.it/1g3mxs2
@r_devops
Candidates Using AI Assistants in Interviews

This is a bit of a doozy — I am interviewing candidates for a senior DevOps role, and all of them have great experience on paper. However, literally 4/6 of them have obviously been using AI resources very blatantly in our interviews (clearly reading from their second monitor, creating very perfect solutions without an ability to adequately explain motivations behind specifics, having very deep understanding of certain concepts while not even being able to indent code properly, etc.)

I’m honestly torn on this issue. On one hand, I use AI tools daily to accelerate my workflow. I understand why someone would use these, and theoretically, their answers to my very basic questions are perfect. My fear is that if they’re using AI tools as a crutch for basic problems, what happens when they’re given advanced ones?

And do we constitute use of AI tools in an interview as cheating? I think the fact that these candidates are clearly trying to act as though they are giving these answers rather than an assistant (or are at least not forthright in telling me they are using an assistant) is enough to suggest they think it’s against the rules.

I am getting exhausted by it, honestly. It’s making my time feel wasted, and I’m not sure if I’m overreacting.

https://redd.it/1g3np7t
@r_devops
Will AI take DevOps roles?

One of the main reasons I decided to transition from software engineering to DevOps a couple of years ago is that I think the SWE field may become increasingly saturated. As tools like ChatGPT continue to improve, more people will be able to rely on AI to complete tasks and potentially secure engineering roles. I see a future where AI significantly reduces the need for traditional software engineers.

DevOps will be safer from this trend, or at least more difficult to fully automate, but do you think this will happen at some point too?

https://redd.it/1g3ttsd
@r_devops
Connect Cloud Build and Bitbucket Cloud

Hey guys, DevOps newbie here. I'm currently trying to find an alternative to Bitbucket Pipelines due to some limitations of its self-hosted runners (lack of build concurrency, a build step timeout of only 2 hours).

I am trying to see if Cloud Build is a viable alternative due to its private worker pool option (it’s managed as well).

My company’s repositories are hosted in a workspace on Bitbucket Cloud behind an IP Allowlist.

I’m having trouble connecting Cloud Build to a repository, since during the repository-linking process GCP uses an external IP address from a range it has allocated for itself. Google publishes the allocated ranges as JSON on a webpage and updates them frequently.

However, adding these ranges to our IP allowlist doesn't seem like a safe choice, since they are public IP ranges.

Before I move on to another CICD solution, is there something I’m missing to make Cloud Build work?

Please let me know if I need to provide more information.

https://redd.it/1g425zj
@r_devops
1 Month Until I Start My First Full-Time DevOps Role – Any Advice?

Hey everyone,

I’ve been working in IT/Cloud/Security for around 4 years, and recently started taking on some DevOps responsibilities at my current job. I’m really happy to share that I’ve just landed a full-time DevOps Engineer role at a great company. My official start date is exactly one month from today.

I’ve offered to visit the office before my start date to better get to know the team and familiarize myself with the projects they have going on. This way, I can gauge where I stand and identify any areas I might need to catch up on.

I’d really appreciate any advice or suggestions on how to best prepare for my first day. This is a big opportunity that I’ve worked incredibly hard to achieve, and I want to make sure I hit the ground running.

Small story time.... just a year ago I was feeling pretty lost. I was out of work, unsure of my next steps, and burned out from my previous role. I even questioned whether I wanted to keep pursuing the engineering path. I decided to take a break, regroup, and commit myself to turning things around. I hit the books, worked on projects, kept my public GitHub active, and sent out around 10 job applications every day. After countless rejections, I finally got the “yes” I had been waiting for.

Thanks for reading, and I’d love to hear any thoughts or advice.

https://redd.it/1g4440l
@r_devops