Reddit DevOps
270 subscribers
2 photos
31K links
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Is finishing Full Stack Open course (https://fullstackopen.com/) worth the time to get started?

Background: I currently work in tech support. I have been taking online courses on web dev for some time now (I have a few certs from basic courses on Coursera and Udemy), but I do not have any practical/professional experience with web development. My plan is to switch from tech support to web dev in the near future. I have just started Full Stack Open and I really like the course so far.


In my job, one of our tasks is to sometimes collaborate with our DevOps team. I have no experience with or knowledge of DevOps, but I have been intrigued by what they do, so I have been reading up on the field. I find it so interesting that I'm thinking of changing my plan and focusing on DevOps instead.

My questions are:

* Is it still worth the time to finish fullstackopen, or should I look for a more DevOps-focused course instead? The course covers a lot of the skills required to be a full stack web developer, so I think it is still worth the time, but I could be wrong.
* The next best course I found is this specialization from Coursera: [IBM DevOps and Software Engineering Professional Certificate](https://www.coursera.org/professional-certificates/devops-and-software-engineering). I am thinking of either switching to it now or taking it after completing fullstackopen. Is this a good course to get started with?
* Brutally honest: is my goal realistic? That is, is it a common scenario for someone with no experience to take online courses, build a portfolio, and switch to a DevOps role (either within their company or outside it)?


Thank you!

https://redd.it/1i7utwd
@r_devops
Terraform Github Provider | Question for users

Hi,

I'm one of the administrators of a large GitHub organization. We are in the process of creating self-service capabilities for our users since we are struggling with the volume of support tasks for minor operations, such as assigning organization secrets to repositories and adding repositories to GitHub apps.

Currently, we are using self-service repositories (running custom scripts on GitHub Actions under the hood) where users can create pull requests to request changes.

I am considering migrating to Terraform since it is more robust, and we can manage the current state more effectively than with custom scripts. I would appreciate hearing from those who have experience with the Terraform GitHub provider. What are the pros and cons, and what potential hidden issues should we watch out for?

The key requirement is that users should still be able to create pull requests with suggested changes, so we need to keep the configuration files user-friendly.
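For what it's worth, the self-service shape you describe maps fairly naturally onto the provider. As a rough sketch (the org name, secret name, and repository list below are placeholders, not from your setup), assigning an organization secret to a user-maintained list of repositories could look like:

```hcl
terraform {
  required_providers {
    github = {
      source = "integrations/github"
    }
  }
}

provider "github" {
  owner = "<org_name>"
}

# The user-facing part: a plain list that people can edit in a pull request.
variable "repositories" {
  type    = list(string)
  default = ["service-a", "service-b"]
}

# Resolve repository names to IDs.
data "github_repository" "selected" {
  for_each = toset(var.repositories)
  name     = each.value
}

# Grant the chosen repositories access to an existing organization secret.
resource "github_actions_organization_secret_repositories" "deploy_token" {
  secret_name             = "DEPLOY_TOKEN"
  selected_repository_ids = [for r in data.github_repository.selected : r.repo_id]
}
```

Keeping the editable surface down to a simple list (or a tfvars/YAML file rendered into Terraform) preserves the "users open a PR" workflow while Terraform owns the state.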

Looking forward to your insights!

https://redd.it/1i7yqf6
@r_devops
From Marketing Problem Solver to Developer: Seeking Guidance to Build My Tech Portfolio!

I'm considering a career transition into software development and would appreciate your insights and recommendations.

I have a background in problem-solving for clients in the marketing field, where I've spent the last 15 years. Throughout this time, I've frequently engaged in building MVPs and solutions to address issues arising from various platforms' inability to communicate effectively. My experience includes extensive data-driven analysis using tools like SQL and BigQuery.

Fundamentally, I was trained in the old days of VB6, ASP, and even some C, along with various front-end web development technologies. Additionally, I have a working understanding of machine learning models and have utilized large language models (LLMs) in a few projects.

While I have accumulated a lot of practical knowledge over the years, I sometimes feel like I have "too much knowledge for my own good" without a clear direction on how to formalize it. I'm eager to create a tangible portfolio that I can showcase on platforms like GitHub. My goal is to prepare myself for more formal projects or job opportunities in the software development field within the next year or two.

As a newbie looking to break into this field, I'm seeking advice on how to effectively leverage my existing skills, resources for building a portfolio, or steps to take for transitioning into development. Any guidance would be greatly appreciated!

https://redd.it/1i7yxk3
@r_devops
Opengrep - a truly Open Source fork of the Code Security tool Semgrep - Announced

In December, the code security scanner Semgrep made a number of changes to its licensing model and scanning engine, making it harder to use and share rules between various tools or to use the free version at scale. Opengrep has been launched by a consortium of vendors as a truly open source alternative: https://www.opengrep.dev/

https://redd.it/1i83yde
@r_devops
Cluster API to production: authentication with service accounts and RBAC using External Secrets and Kyverno

Hi everyone!
I've just published the third part of my Cluster API to production series, focusing on providing tenant clusters with service accounts for the management cluster.
This is an important step in managing clusters, as it gives clusters credentials they can use to access a secret manager, a container registry, object storage, and more.

The series follows every step needed from where the Cluster API documentation ends to deploying production clusters managed with GitOps.
With this part we're finally done with boilerplate for tenant clusters.
The next couple of parts will explore setting up a telemetry exporter with OpenTelemetry Collector, and setting up automated DNS and certificate renewal.
Slowly making our way towards the final goal: managing clusters with GitOps.

I'm still at the beginning of my technical writing journey and would appreciate any feedback.

https://redd.it/1i846pe
@r_devops
Building Reliable AI: A Step-by-Step Guide

Artificial intelligence is revolutionizing industries, but with great power comes great responsibility. Ensuring AI systems are reliable, transparent, and ethically sound is no longer optional—it’s essential.

Our new guide, "Building Reliable AI", is designed for developers, researchers, and decision-makers looking to enhance their AI systems.

Here’s what you’ll find:
✔️ Why reliability is critical in modern AI applications.
✔️ The limitations of traditional AI development approaches.
✔️ How AI observability ensures transparency and accountability.
✔️ A step-by-step roadmap to implement a reliable AI program.

💡 Case Study: A pharmaceutical company used observability tools to achieve 98.8% reliability in LLMs, addressing issues like bias, hallucinations, and data fragmentation.

📘 **Download the guide now** and learn how to build smarter, safer AI systems.

Let’s discuss: What steps are most critical for AI reliability? Are you already incorporating observability into your systems?

https://redd.it/1i89o6u
@r_devops
Thoughts on Theo’s viral Stripe repo? Where our research led us

Yo all, Y Combinator alum here and a bit of an outsider to this group. My cofounder and I still struggle with payments code, even at $2M ARR for one of our businesses, despite Stripe's reputation for being easy. It took us weeks, sometimes months, to fix all the bugs. Then Theo shared his struggles with Stripe, and to our surprise, many other devs reported the same experience. [https://github.com/t3dotgg/stripe-recommendations](https://github.com/t3dotgg/stripe-recommendations)

We’re in the early days of building something new: 

* Delete your Stripe webhook
* Drop your “Payments” table
* Drop your “stripeCustomerId” and “stripeSubscriptionId” columns

It’s 2025. We don’t need to live like this lol.

We're a Stripe alternative that requires no webhooks: a drop-in payments and billing layer for React devs, of sorts.

It works like the rest of your React stack: updates “propagate” down from our server to you, so you don’t have to manage any billing / payments state on your side. 

You just call our “billing()” function on your backend, or our “useBilling()” hook on your frontend when you need your customers’ billing state. 

Since data flows down from us, you can run pricing experiments without ever opening a pull request or redeploying. This is all still built on top of Stripe, but designed around React's "updates flow down" paradigm.

Does this resonate with you? Why or why not? What payment challenges do you face, and if you had a magic wand, what would you want to fix?

Bonus points and tons of gratitude for [hopping on a call with us](https://cal.com/harrisontelyan/flowglad-chat) to treat us like your therapist to tell us about your payment problems. In exchange - down to provide a design critique on any projects you’re working on (RISD/founding designer of Imgur).

https://redd.it/1i8aox1
@r_devops
Need suggestion on: How to manage DB Migration across environment

# TLDR;

We have a PostgreSQL cluster with 4 DBs, one for each environment.
We develop in the Development env, editing table structures through pgAdmin, and everything works fine there.
Recently we had to port all the modifications to 2 other envs and couldn't, due to conflicts.
Any suggestions on how to work around and fix this issue?

# Structure explained

So we are a team that has been destroyed by a bad project manager and we had to start over.
New platform in development, new life for the devs.

The managers wanted a P.O.C. of an idea we had. We built it in a couple of months, they presented it to all the clients, the clients liked it, and the manager committed to a date without asking anyone.

We didn't have time to think and research much about how to build the structure, but we had experience with what didn't work before, so we built everything on AWS, with 4 envs: Development, Test, Demo, Production.
Every environment has its own front end, its own aliases on the Lambda functions, and its own DB inside the cluster.

The DB is an Aurora instance compatible with PostgreSQL

The FE is hosted through S3 behind CloudFront

# What does work?

The Lambda setup works well. We have a console that manages more things every day, from enabling the various envs to enabling logs, publishing new versions, and binding aliases to those new versions.

The FE deployment kinda works.
We don't have aliases and versions there, but through tags and branches in git we can deploy old and new versions as wanted in every env.

# What doesn't work?

The management of the DB.

At the moment 2-3 people are touching the structure of the DBs, one of whom is me.
We do everything from pgAdmin through the UI.

It works for what we need, but some days ago we were asked to apply all the new development done over the months to the Test and Demo envs, and the DB migration didn't go as planned.

We used the schema-diff functionality offered by pgAdmin, but the script was huge and the ALTERs were all over the place.

Fortunately we have yet to release anything to the public, so for now we were able to drop the old DB and recreate it; once we deploy Production we obviously won't be able to do that.
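Since nothing is public yet, this is a good moment to move from schema diffs to versioned migrations (Flyway, Liquibase, sqitch, or even a small homegrown runner). The core idea fits in a few lines; here is a minimal sketch in Python, using sqlite3 only so the example is self-contained (with your Aurora PostgreSQL you'd use psycopg2 or a proper tool, and the migrations would live as numbered .sql files committed to git; the table names here are made up for illustration):

```python
import sqlite3  # stand-in for PostgreSQL; the same pattern works with psycopg2

# Hypothetical migrations: in a real repo these would be ordered .sql files
# (V001__create_users.sql, ...) so every environment replays identical changes.
MIGRATIONS = [
    ("001_create_users", "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"),
    ("002_add_email", "ALTER TABLE users ADD COLUMN email TEXT"),
]

def migrate(conn):
    """Apply, in order, every migration not yet recorded in schema_migrations."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_migrations (version TEXT PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    for version, sql in MIGRATIONS:
        if version not in applied:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
    conn.commit()

conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)  # idempotent: re-running applies nothing new
print([r[0] for r in conn.execute("SELECT version FROM schema_migrations ORDER BY version")])
# → ['001_create_users', '002_add_email']
```

Because every change is an ordered, committed file, Development, Test, Demo, and Production all replay exactly the same ALTERs, and the schema_migrations table records where each environment currently stands.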

We don't have any CI/CD. This week I had the opportunity to do some research, and I landed on Jenkins, SonarQube and Gitea (our GitHub is a self-hosted Enterprise Server instance which doesn't have Actions, so we have to try something else), but we are more interested in CI at the moment.

I know we are not well organized, but we try really hard, and we are a small team that produces a bunch of code every day.
The pace can't be slowed down due to "business needs", and we are tired of having problems caused by the little time dedicated to R&D.

BTW the team is composed of 4 junior devs (I'm one of them) and a single senior dev, who now has to manage the whole dev department.

I'm open to any suggestion.
Thanks to anyone who will help. <3

https://redd.it/1i8bvb7
@r_devops
Share artifacts between two jobs that run at different times

So the entire context is something like this,

I've two jobs, let's say JobA and JobB. JobA performs some kind of scanning and then uploads the SAST scan report to an AWS S3 bucket. Once the scan and upload are complete, it saves the S3 path of the uploaded file in an environment variable and later pushes this file path as an artifact for JobB.

JobB executes only when JobA has completed successfully and pushed its artifacts. JobB then pulls the artifacts from JobA and checks whether the file path exists on S3: if yes, it performs the cleanup command, otherwise it doesn't. Some more context on JobB: it depends on JobA, meaning that if JobA fails, JobB shouldn't execute. Additionally, JobB requires the artifact from JobA to perform this check before the cleanup process, and this artifact is necessary for that crucial cleanup operation.

Here's my Gitlab CI Template:

    stages:
      - scan

    image: <ecr_image>

    .send_event:
      script: |
        function send_event_to_eventbridge() {
          event_body='[{"Source":"gitlab.pipeline", "DetailType":"cleanup_process_testing", "Detail":"{\"exec_test\":\"true\", \"gitlab_project\":\"${CI_PROJECT_TITLE}\", \"gitlab_project_branch\":\"${CI_COMMIT_BRANCH}\"}", "EventBusName":"<event_bus_arn>"}]'
          echo "$event_body" > event_body.json
          aws events put-events --entries file://event_body.json --region 'ap-south-1'
        }

    clone_repository:
      stage: scan
      variables:
        REPO_NAME: "<repo_name>"
      tags:
        - $DEV_RUNNER
      script:
        - echo $EVENING_EXEC
        - printf "executing secret scans"
        - git clone --bare https://gitlab-ci-token:${CI_JOB_TOKEN}@<gitlab_host>/testing/${REPO_NAME}.git
        - mkdir ${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}_secret_result
        - export SCAN_START_TIME="$(date '+%Y-%m-%d:%H:%M:%S')"
        - ghidorah scan --datastore ${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}_secret_result/datastore --blob-metadata all --color auto --progress auto $REPO_NAME.git
        - zip -r ${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}_secret_result/datastore.zip ${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}_secret_result/datastore
        - ghidorah report --datastore ${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}_secret_result/datastore --format jsonl --output ${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}_secret_result/${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}-${SCAN_START_TIME}_report.jsonl
        - mv ${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}_secret_result/datastore /tmp
        - aws s3 cp ./${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}_secret_result s3://sast-scans-bucket/ghidorah-scans/${REPO_NAME}/${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}/${SCAN_START_TIME} --recursive --region ap-south-1 --acl bucket-owner-full-control
        - echo "ghidorah-scans/${REPO_NAME}/${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}/${SCAN_START_TIME}/${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}-${SCAN_START_TIME}_report.jsonl" > file_path # required to use this in another job
      artifacts:
        when: on_success
        expire_in: 20 hours
        paths:
          - "${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}_secret_result/${CI_PROJECT_TITLE}-${CI_COMMIT_BRANCH}-*_report.jsonl"
          - "file_path"
        #when: manual
        #allow_failure: false
      rules:
        - if: $EVENING_EXEC == "false"
          when: always

    perform_tests:
      stage: scan
      needs: ["clone_repository"]
      #dependencies: ["clone_repository"]
      tags:
        - $DEV_RUNNER
      before_script:
        - !reference [.send_event, script]
      script:
        - echo $EVENING_EXEC
        - echo "$CI_JOB_STATUS"
        - echo "Performing numerous tests on the previous job"
        - echo "Check if the previous job has successfully uploaded the file to AWS S3"
        - aws s3api head-object --bucket sast-scans-bucket --key "$(cat file_path)" || FILE_NOT_EXISTS=true
        - |
          if [[ "$FILE_NOT_EXISTS" == "true" ]]; then
            echo "File doesn't exist in the bucket"
            exit 1
          else
            echo -e "File exists in the bucket\nSending an event to EventBridge"
            send_event_to_eventbridge
          fi
      rules:
        - if: $EVENING_EXEC == "true"
          when: always
      #rules:
      #  - if: $CI_COMMIT_BRANCH == "test_pipeline_branch"
      #    when: delayed
      #    start_in: 5 minutes
      #rules:
      #  - if: $CI_PIPELINE_SOURCE == "schedule"
      #  - if: $EVE_TEST_SCAN == "true"

Now the issue I'm facing with the above GitLab CI template: I've created two scheduled pipelines for the branch where this template resides, with an 8-hour gap between them. The conditions above work fine for JobA (the first pipeline executes only JobA, not JobB), and the second pipeline executes JobB and not JobA, but JobB is not able to fetch the artifacts from JobA.

Previously I tried `rules:delayed` with `start_in`, which puts JobB in a pending state and later fetches the artifact successfully. However, in my case the runner kills any sleeping or pending job once it exceeds the 1-hour timeout policy, which is not sufficient: JobB requires a gap of at least 12-14 hours before starting the cleanup process.
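One thing that may explain the behaviour: GitLab job artifacts are scoped to a single pipeline, so `needs`/`dependencies` can't pull them across two separately scheduled pipelines. A workaround (a sketch, assuming the earlier pipeline ran `clone_repository` successfully on the same branch) is to have JobB download the latest successful artifact through the jobs API instead:

```yaml
perform_tests:
  script:
    # Download the artifacts of the most recent successful clone_repository
    # run on this branch; CI_JOB_TOKEN is enough for same-project access.
    - |
      curl --fail --location --header "JOB-TOKEN: ${CI_JOB_TOKEN}" \
        --output artifacts.zip \
        "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/jobs/artifacts/${CI_COMMIT_BRANCH}/download?job=clone_repository"
    - unzip -o artifacts.zip
    - cat file_path
```

Alternatively, since the report already lands in S3, JobA could write `file_path` to a fixed S3 key and JobB could read it back, skipping artifacts entirely.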

https://redd.it/1i8driq
@r_devops
Apple DevOps Interview

Hi, I have a 60-minute DevOps Engineer interview with the hiring manager coming up, for an AI/ML team. Wondering how best to prepare? Please share any advice. Thank you in advance.

https://redd.it/1i8fi86
@r_devops
LGTM Stack with TF for AWS Infrastructure with Application Integration Running on AWS ECS Fargate

I'm looking for someone who has done something similar: integrating an existing AWS ECS Fargate application infrastructure for metrics, logs & traces using TF only, with smooth integration plus dashboard creation as well, along the lines of what I shared in my recent post.

- Application Running on AWS ECS Fargate
- Grafana Stack : Grafana Alloy running as Sidecar with ECS Tasks + Loki, Mimir & Tempo running on ECS/EKS and AWS Managed Grafana for smooth SSO Integration with AWS for easy Login
- Grafana Dashboard for Metrics, Logs & Traces using TF as well

Separate consolidated dashboards for all the APIs, where metrics, logs and traces for each of them are coupled in a single dashboard.

Deployment using terraform apply only, no ClickOps approach.


Please let me know if you've done something similar.

Thanks.

https://redd.it/1i8cph4
@r_devops
GIT CI/CD Suggestions Html Templates inside databases

Hello 👋,

I have 3 databases (system integration testing, staging and production). Each has a table holding HTML templates for different contract types + specifications.

At the moment there is no versioning on the databases themselves, so my suggestion was to version the templates in git, with 4 branches: build, sit, stg and prd. I'm a bit green on CI/CD (I work as a systems engineer but am trying to gain DevOps knowledge), but my idea was to push to build and then merge to the other branches, eventually triggering a pipeline to test and deploy to the databases.

I need suggestions on how to organise the repo itself. Ideally the templates should be identical in all 3 branches; at the moment the app is still in development, so they are not identical. Considering this, should I just push the HTML templates into the repo's root directory, or segregate them into different folders: sit/, stg/ and prd/?

https://redd.it/1i8qrxz
@r_devops
Hey folks, anybody interested in a Tech Talk call? We've got Michael Hausenblas: AWS Observability principal, CNCF Ambassador, ex-Red Hat Developer Advocate

Hey Folks,

Michael Hausenblas (https://www.linkedin.com/in/mhausenblas/) will do a call where we will talk about:

- Observability (open source solutions, SaaS observability, AWS Observability, etc.)
- Career advice and hiring practices: what is expected from a modern-day DevOps engineer
- Q&A on various other topics

It's a free event. No payments, no ads.

If you are interested, write something in the comments and I'll DM you the details (alternatively, the details are in my profile post).

29 Jan, 16:00 UTC (or 11:00 EST)

https://redd.it/1i8s9im
@r_devops
Need help to resolve this

Hey guys, I'm an Ops engineer at a big MNC, so I'll give some background. My manager has asked the team to save a given target amount of money spent on cloud (basically cost optimization) and to bring some ideas.
Now, I have 1 year of experience under my belt, but all my ideas are already in place, so I need ideas from your vast experience to reduce cost and optimize the workflow.

Some of the implemented solutions:
1. Starting/stopping servers around office working hours
2. Auto-deletion of AMIs / machine images
3. Intelligent tiering

We use all three big clouds, so you can suggest ideas for any of them.
Any help will be appreciated.

Please give some ideas for cost optimization, and also for automating tasks like deleting AMIs after a certain amount of time has passed.
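For the AMI-expiry automation, the core is just "find images older than N days, then deregister them". A minimal sketch of the age-filtering logic in Python (the image list, IDs, and retention period below are made up for illustration; actually deregistering would go through boto3's `deregister_image`, not shown here):

```python
from datetime import datetime, timedelta, timezone

def amis_to_delete(images, retention_days=30, now=None):
    """Return IDs of AMIs whose CreationDate is older than retention_days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    expired = []
    for image in images:
        # CreationDate is an ISO-8601 string like "2024-01-05T10:00:00.000Z"
        created = datetime.fromisoformat(image["CreationDate"].replace("Z", "+00:00"))
        if created < cutoff:
            expired.append(image["ImageId"])
    return expired

# Example with hypothetical describe-images-style data:
images = [
    {"ImageId": "ami-old", "CreationDate": "2024-01-01T00:00:00.000Z"},
    {"ImageId": "ami-new", "CreationDate": "2025-01-20T00:00:00.000Z"},
]
print(amis_to_delete(images, retention_days=30,
                     now=datetime(2025, 1, 24, tzinfo=timezone.utc)))
# → ['ami-old']
```

Scheduled from a Lambda or cron, a filter like this (plus a tag-based allowlist so golden images survive) covers the "delete AMIs after a certain time" ask; the same pattern extends to snapshots and unattached volumes.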


Thanks

https://redd.it/1i8s7vc
@r_devops
Should I take on the Associate Devops Engineer role as a fresher?

I'm a 2024 computer science graduate who spent the last 7 months learning DevOps and cloud technologies on my own (Linux, Jenkins, Docker, Kubernetes, Terraform, Ansible, AWS, Grafana, etc.). DevOps is the field I've wanted to work in, and I managed to crack an interview at a company hiring freshers for an Associate DevOps Engineer role (they were specifically looking for freshers, but only select the ones with a good grasp of how everything works; the interview was kinda hard). I've received the offer letter.

I keep reading in developer subreddits that you need experience in developer or sysadmin roles to be a good DevOps engineer. I have moderate knowledge of the Spring Boot framework and of web development in React (but no industry-level experience in either development or DevOps, not even internships). So I'm having second thoughts about whether I should take the DevOps offer. They'll provide 3 months of training, but I'm afraid it'll be difficult to switch to a developer role later (if that's something I want in the future) due to the lack of coding experience.

Was any of you in a similar boat? Let me know your experience and how it went after you started your career as a DevOps engineer without a prior developer role. Is it a bad idea to start in this role as a fresher, or am I just overthinking?

https://redd.it/1i8u7lm
@r_devops
Azure Engineer - Where to go from here?

Where do you transition to after becoming a systems administrator in Azure? Curious what paths people have taken, as I feel my skillset is too broad and not niche enough.

Sysadmin roles have been around forever, but what about DevOps, cyber security, etc.?

I was a sysadmin before; now I'm a "Cloud Engineer", though I've only been working with Azure for about 5 years.

https://redd.it/1i8rzof
@r_devops
Feeling Stuck on What to Study!

Hey everyone,

I'm a junior DevOps engineer, and I've been feeling a bit stuck lately about what to focus on learning next. I love studying and picking up new skills, but my work tasks aren't particularly challenging or new, and I'm bound to a specific tech stack, so I don't really get to experiment with other tools at work.

I've already studied the core DevOps tools and concepts.

Here’s what’s on my mind:
1. Should I learn new tools? But I don't get to use them at work, so it feels pointless and a waste of time.
2. Should I go deeper into concepts like container and Kubernetes security, reliability engineering, or advanced troubleshooting?
3. Should I explore entirely different areas like AI/ML, distributed systems, or backend fundamentals to expand my knowledge beyond DevOps?

I’m not sure how to prioritize or if I’m overthinking it. What’s worked for you in similar situations? How do you decide what to study to stay sharp and keep growing as a professional?

Would love to hear your thoughts, what you've been focusing on lately, and any experience you can share.

Thanks in advance!

https://redd.it/1i8vdon
@r_devops
Interview question: a pod is not able to schedule. How do you troubleshoot it ?

This was the question asked in the interview. From there, many other questions followed, like how to troubleshoot a CrashLoopBackOff. I covered every approach I could think of for both questions: checking events, logs, resource constraints, taints and tolerations, liveness and readiness probes, node resources, everything. But the interviewer was looking for something different. How would you answer these questions?
How do you troubleshoot when a pod is not scheduling?
How do you troubleshoot a CrashLoopBackOff?
How do you troubleshoot a pod that remains Pending for a long time?
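For what it's worth, the concrete first-pass commands usually expected for these cases (pod, namespace, and node names below are placeholders):

```shell
# Pod not scheduling / stuck Pending: the scheduler writes its reason into events
kubectl describe pod <pod-name> -n <namespace>        # check Events at the bottom
kubectl get events -n <namespace> --sort-by=.lastTimestamp
kubectl describe node <node-name>                     # taints, Allocatable vs requests

# CrashLoopBackOff: the pod scheduled fine, so look at the container itself
kubectl logs <pod-name> -n <namespace> --previous     # output of the last crashed run
kubectl get pod <pod-name> -n <namespace> \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated}'  # exit code / reason
```

The interviewer may also have been fishing for specific causes: insufficient CPU/memory requests, unbound PVCs, node selectors/affinity, a failing liveness probe, or a bad image entrypoint. Tying each symptom to the exact command that reveals it tends to land better than listing checks in the abstract.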

https://redd.it/1i8x7qe
@r_devops
Need help with the interview process | Mastercard-Bizops Engineer 1

Hi guys,

I have a bar-raiser interview coming up. Can someone share what kind of questions to expect and what the subsequent interview process looks like?

Thanks

https://redd.it/1i8xkd1
@r_devops