Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Problem to upload files to an Apache server with rsync

Hello. I am new to CI/CD. I wanted to automatically create an Apache server on EC2 in AWS using Terraform. I also want to deploy the code after the server has been created.


Everything works nearly perfectly. The problem is that immediately after I run the command to start the Apache server, I run the rsync command, but I get an error. I think it's because the /var/www/html folders haven't been created yet.


What would be the best DevOps approach? Add a sleep of approx. 10 seconds to give my server time to launch, or what? Thanks for your help.


Terraform infrastructure:

name: "terraform-setup"


on:
  push:
    branches:
      - main

  workflow_dispatch:


jobs:
  infra:
    runs-on: ubuntu-latest
    steps:
      - name: Get the repo
        uses: actions/checkout@v4
      - name: "files"
        run: ls

      - name: Set up terraform
        uses: hashicorp/setup-terraform@v3
     
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.KEY_ID }}
          aws-secret-access-key: ${{ secrets.ACCESSKEY }}
          aws-region: us-east-1

      - name: Initialize Terraform
        run: |
          cd infrastructure
          terraform init

      - name: Terraform plan
        run: |
          cd infrastructure
          terraform plan

      - name: Terraform apply
        run: |
          cd infrastructure
          terraform apply -auto-approve

      - name: Save public_dns
        run: |
          cd infrastructure
          terraform output -raw public_dns_instance
          terraform output public_dns_instance
          public_dns=$(terraform output -raw public_dns_instance)
          echo $public_dns
          cd ..
          mkdir -p tfvars
          echo $public_dns > tfvars/publicdns.txt
          cat tfvars/publicdns.txt

      - name: Read file
        run: cat tfvars/publicdns.txt

      - uses: actions/upload-artifact@v4
        with:
          name: tfvars
          path: tfvars



Deployment:



name: deploy code

on:
  workflow_run:
    workflows: ["terraform-setup"]
    types:
      - completed


permissions:
  actions: read
  contents: read


jobs:
  deployment:
    runs-on: ubuntu-latest
     
    steps:
      - uses: actions/checkout@v3

      - uses: actions/download-artifact@v4
        with:
          name: tfvars
          github-token: ${{ github.token }}
          repository: ${{ github.repository }}
          run-id: ${{ github.event.workflow_run.id }}


      - name: View files
        run: ls


      - name: rsync deployments
        uses: burnett01/rsync-deployments@7.0.1
        with:
          switches: -avzr --delete --rsync-path="sudo rsync"
          path: app/
          remote_path: /var/www/html/
          remote_host: $(cat publicdns.txt)
          remote_user: ubuntu
          remote_key: ${{ secrets.PRIVATEKEYPAIR }}
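A fixed sleep is fragile: the instance may take longer than expected, or the sleep wastes time when the server comes up faster. A more robust pattern is to poll until the server is actually ready before running rsync. A minimal sketch, assuming SSH access as `ubuntu` and a `PUBLIC_DNS` variable (both illustrative, not part of the workflow above):

```shell
# Retry helper: run a command until it succeeds, with bounded attempts.
retry() {
  # usage: retry <command-string> <max_tries> <delay_seconds>
  cmd="$1"; tries="$2"; delay="$3"; i=0
  while [ "$i" -lt "$tries" ]; do
    if eval "$cmd"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up after $tries attempts: $cmd" >&2
  return 1
}

# Before the rsync step (illustrative host variable): wait until Apache's
# docroot exists on the instance, then deploy.
# retry "ssh -o ConnectTimeout=5 ubuntu@\$PUBLIC_DNS 'test -d /var/www/html'" 30 5
```

Alternatively, install Apache and create `/var/www/html` in the instance's `user_data` and wait over SSH with `cloud-init status --wait`. Separately, note that `$(cat publicdns.txt)` is not expanded inside a `with:` value; a previous step can write `echo "REMOTE_HOST=$(cat publicdns.txt)" >> "$GITHUB_ENV"` and the rsync step can then reference `${{ env.REMOTE_HOST }}`.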


https://redd.it/1m0nued
@r_devops
terraform tutorial 101 - modules

hi there!

im back with another series from my terraform tutorial 101 series.

Its about modules in terraform! If you want to know more, or if you have questions or suggestion for more topics regarding terraform let me know.

Thank you!

https://salad1n.dev/2025-07-15/terraform-modules-101

https://redd.it/1m0onme
@r_devops
Is it an exaggeration saying a product without unit-tests is not a product?

I joined this product; the guy said it was 80-90% complete. Plenty of problems, then I found out it didn't have unit tests.

To me that product is doomed to break so much in production that he won't have a functional product, and people will, at best, cancel the subscription. At worst, they'll ask for their money back within a week.

My opinion is, he doesn't have a product there. It's "working" when you as the dev use it, but it can be broken easily (I've seen it), and I broke it myself when adding or fixing features (it broke other things and I had no tests to know).

Is it an exaggeration to tell him "hey, you don't have a product, this has no tests and thus you can't find out if things are working, this'll break in production and nobody will wanna use it"?

EDIT: some info that might be crucial

The problem is, he (when he asked me to join) said he had a very high chance of launching what he already had, and then re-doing the whole thing again because it was so broken, and he wanted me to do it (from scratch, but with the ideas figured out).

https://redd.it/1m0xd6z
@r_devops
Feeling Lost in my Tech Internship - what do I do

Hey everyone,

I’m a rising college freshman interning at a small tech/DS startup. I am supposed to be working on infrastructure and DevOps-type tasks. The general guidance I’ve been given is to help “document the infrastructure” and “make it better,” but I’m struggling to figure out what to even do. I sat down today and tried documenting the S3 structure, only to find there’s already documentation on it. I don't know what to do.

I know next to nothing. I know basic Python and have learned a little AWS and Linux, but I have no idea what half the technologies even do. Honestly, I don't really know what documentation is.

Also, it seems to me there’s already documentation in place. I don’t want to just rewrite things for the sake of it, but at the same time, I want to contribute meaningfully and not just sit around waiting for someone to tell me exactly what to do. I’ve got admin access to a lot of systems (AWS, EC2, S3, IAM, internal deployment stuff, etc.), and I’m trying to be proactive but I’m hitting a wall.

There’s no one else really in my role.

If anyone’s been in a similar spot — especially if you’ve interned somewhere without a super structured program — I’d love to hear what worked for you.



https://redd.it/1m11tvu
@r_devops
Jfrog help

I'm a front-end engineer and, for context, I have no idea how DevOps or JFrog works.

Recently we upgraded our entire React application from React 16 to React 18, and the application from Node 12 to Node 18. While publishing the build and generating a new version, I constantly get errors that some version in the JFrog npm private registry is not available. And when checking in Artifactory, it's indeed not available. Some versions are also outdated. How do I fix this issue?

https://redd.it/1m11rt0
@r_devops
How do you manage secrets?

As per title, what are your approaches for secrets management so they are nice and secure like running ephemeral tokens for your workers?

https://redd.it/1m15dwv
@r_devops
EKS (Kubernetes) - Implementing principle of least privilege with Pod Identities

Amazon EKS (Elastic Kubernetes Service) Pod Identities offer a robust mechanism to bolster security by implementing the principle of least privilege within Kubernetes environments. This principle ensures that each component, whether a user or a pod, has only the permissions necessary to perform its tasks, minimizing potential security risks.

EKS Pod Identities integrate with AWS IAM (Identity and Access Management) to assign unique, fine-grained permissions to individual pods. This granular access control is crucial in reducing the attack surface, as it limits the scope of actions that can be performed by compromised pods. By leveraging IAM roles, each pod can securely access AWS resources without sharing credentials, enhancing overall security posture.

Moreover, EKS Pod Identities simplify compliance and auditing processes. With distinct identities for each pod, administrators can easily track and manage permissions, ensuring adherence to security policies. This clear separation of roles and responsibilities aids in quickly identifying and mitigating security vulnerabilities.
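As a concrete sketch, the association between a Kubernetes service account and an IAM role can be created with the AWS CLI; all names below (cluster, namespace, service account, role ARN) are placeholders:

```shell
# Placeholder names throughout. First ensure the Pod Identity agent add-on
# is installed on the cluster, then bind one IAM role to one service account.
aws eks create-addon \
  --cluster-name my-cluster \
  --addon-name eks-pod-identity-agent

aws eks create-pod-identity-association \
  --cluster-name my-cluster \
  --namespace my-app \
  --service-account my-app-sa \
  --role-arn arn:aws:iam::111122223333:role/my-app-role
```

Pods in namespace `my-app` that run under `my-app-sa` then receive only that role's permissions, with no shared long-lived credentials.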
https://youtu.be/Be85Xo15czk

https://redd.it/1m16lf8
@r_devops
Hashicorp Waypoint fork?

So there are OpenBao vs. Vault and OpenTofu vs. Terraform, what is the Waypoint fork?

https://redd.it/1m17ey5
@r_devops
DevOps job market

I constantly see pessimistic posts saying that getting a job in IT is almost impossible, yet on a weekly basis I get DMs from recruiters with offers to apply for DevOps positions.

Do you experience the same, or is the job market in Eastern Europe just better?

https://redd.it/1m18kyj
@r_devops
What training or course should I do for my career growth? I'm a DevOps person

I have been in DevOps for almost 8 years now, and I feel I should be looking a bit towards the security side of things because I see a lot of potential there. However, I'm debating whether my next step for growth should be security or AI training. I would appreciate it if anyone could guide me!

https://redd.it/1m19vnj
@r_devops
Deploying A Service

Hi guys, I have developed a web application that I want to deploy. This is a side project, so I don't have a budget for costly deployments. My service includes:

1. Backend: Fastapi, Celery
2. Frontend: ReactJS
3. DBs: Redis, SQLite

Can anybody suggest where I can deploy? I tried Render's free tier, but Redis is not included there.

https://redd.it/1m194mj
@r_devops
A lightweight alternative to Knative for scale-to-zero in Kubernetes — Make any HTTP service serverless on Kubernetes (no rewrites, no lock-in, no traffic drop)

Hey Engineers,

I wanted to share something we built that solved a pain point we kept hitting in real-world clusters — and might help others here too.

# 🚨 The Problem:

We had long-running HTTP services deployed with standard Kubernetes `Deployments`. When traffic went quiet, the pods would:

* Keep consuming CPU/RAM
* **Last replicas couldn’t be scaled down**, leading to unnecessary cost
* Cost us in licensing, memory overhead, and wasted infra

Knative and OpenFaaS were too heavy or function-oriented for our needs. We wanted **scale-to-zero** — but without rewriting.


# 🔧 Meet [KubeElasti](https://github.com/truefoundry/KubeElasti)

It’s a lightweight operator + proxy(resolver) that adds **scale-to-zero** capability to *your existing HTTP services* on Kubernetes.

No need to adopt a new service framework. No magic deployment wrapper. Just drop in an `ElastiService` CR and you’re good to go.


# 💡Why we didn’t use Knative or OpenFaaS

They’re great for what they do — but **too heavy** or **too opinionated** for our use case.

Here’s a side-by-side:

|Feature|**KubeElasti**|**Knative**|**OpenFaaS**|**KEDA HTTP-add-on**|
|:-|:-|:-|:-|:-|
|Scale to Zero|✅|✅|✅|✅|
|Works with existing svc|✅|❌|❌|✅|
|Resource footprint|🟢 Low|🔺 High|🔹 Medium|🟢 Low|
|Request queueing|✅ (Takes itself out of the path)|✅ (always in path)|❌|✅ (always in path)|
|Setup complexity|🟢 Low|🔺 High|🔹 Medium|🔹 Medium|



# 🧠 How KubeElasti works

When traffic hits a scaled-down service:

1. A tiny KubeElasti proxy catches the request
2. It **queues** and **triggers a scale-up**
3. Then **forwards** the request when the pod is ready

When the pod is already running? The proxy gets out of the way completely. That means:

* **Zero overhead in hot path**
* **No cold start penalty**
* **No rewrites or FaaS abstractions**



# ⚖️ Trade-offs

We intentionally kept KubeElasti focused:

* Supports **Deployments** and **Argo Rollouts**
* Works with **Prometheus metrics**
* Supports **HPA/KEDA for scale-up**
* 🟡 Only supports **HTTP** right now (gRPC/TCP coming)
* 🟡 Prometheus is required for autoscaling triggers



# 🧪 When to Choose KubeElasti

You should try KubeElasti if you:

1. Run **standard HTTP apps** in Kubernetes and want to avoid idle cost
2. Want **zero request loss** during scale-up
3. Need something **lighter than Knative** or the KEDA HTTP add-on
4. Don’t want to **rewrite** your services into functions



We’re actively developing this and keeping it **open source**. If you’re in the Kubernetes space and have ever felt your infra was 10% utilized 90% of the time — I’d love your feedback.

We're also exploring gRPC, TCP, and support for more ScaledObjects.

Let me know what you think — we’re building this in the open and would love to jam.

Cheers,

Raman from the KubeElasti team ☕️

# Links

Code: [https://github.com/truefoundry/KubeElasti](https://github.com/truefoundry/KubeElasti)

Docs: [https://www.kubeelasti.dev/](https://www.kubeelasti.dev/)

https://redd.it/1m1cat7
@r_devops
Help: How to migrate Azure Data Factory, Blob Storage, and Azure SQL from one tenant to another?


Hi everyone,
I work at a data consulting company, and we currently manage all of our client’s data infrastructure in our own Azure tenant. This includes Azure Data Factory (ADF) pipelines, Azure Blob Storage, and Azure SQL Databases — all hosted under our domain.

Now, the client wants everything migrated to their own Azure tenant (their subscription and domain).

I’m wondering:

Is it possible to migrate these resources (ADF, Blob Storage, Azure SQL) between tenants, or do I need to recreate everything manually?

Are there any YouTube videos, blog posts, or documentation that walk through this process?


I’ve heard about using ARM templates for ADF, .bacpac for SQL, and AzCopy for storage, but I’m looking for step-by-step guidance or lessons learned from others who’ve done this.
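For the Blob Storage part, the AzCopy route mentioned above is a server-side copy between the two tenants' storage accounts; a rough sketch (account names, container, and SAS tokens are placeholders):

```shell
# Placeholder accounts and SAS tokens: copy a container from the storage
# account in our tenant to the equivalent account in the client's tenant.
azcopy copy \
  "https://oldaccount.blob.core.windows.net/data?<source-sas>" \
  "https://newaccount.blob.core.windows.net/data?<dest-sas>" \
  --recursive
```

Because each SAS token authorizes its own side independently, this works across tenants without any trust relationship between them.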

Any resources, tips, or gotchas to watch out for would be hugely appreciated!

Thanks in advance 🙏

https://redd.it/1m1d1t7
@r_devops
How often are you seeing bugs in production and how do you handle them?

We have unit tests, some integration tests, CI that runs tests on each push, CD to automate deployments, and manual testing, yet we're still seeing a decent amount of bugs in production. At least a couple almost every week.

I'm curious how often are others seeing bugs and how do you handle passing them back to the dev team to fix? Aside from opening a ticket, how are you handling the politics behind passing work onto a team that you're not on and don't run?

https://redd.it/1m1ehue
@r_devops
Cloud for SMEs

Hi, I am currently researching the cloud market in Europe.
Want to understand what kind of businesses buy cloud services, why, and through what channels.

Please DM if you can help me with the same - won't take more than 10 mins of your time.

Thanks!!!

https://redd.it/1m1ehhr
@r_devops
Help me evaluate my options

Hi, I am the sole developer/devops in an application. The application runs through Wine on Linux because it needs to call a C++ DLL that has Windows dependencies. The DLL works by maintaining state. And it has I/O limitations and whatnot so it needs to run one instance of DLL for every user.

The application runs this way.
Frontend->API->Check if docker container running for that user-> If not create it and call the endpoint exposed from the container.


The container runs an image that has Wine plus some more APIs that call the DLL.

The previous devs created a container on demand for each user and hosted it in-house, running Docker containers on bare metal. (Yes, the application is governmental.) Now they want to use AWS, and I am evaluating my options: Fargate and EKS.

Fargate would make my life easier, but I am worried about vendor lock-in. What if they decide to use different servers or move back in-house later down the line (for whatever reason)? I or someone else would need to set everything up again.

EKS would mean less vendor lock-in, but its complexity, plus the fact that I am going to be the single guy on the project jumping between writing C++ and maintaining Kubernetes, is obviously going to be a pain.

I could use some opinions from the experts. Thanks



https://redd.it/1m1hq11
@r_devops
Discussing some features of a tool for DevOps Engineers that manipulates .env files.

I am implementing this tool https://github.com/pc-magas/mkdotenv intended to be run inside CI/CD pipelines in order to populate `.env` files with secrets.


In a future release (0.4.0) the tool will support these arguments:


MkDotenv VERSION:  0.4.0
Replace or add a variable into a .env file.

Usage:
./bin/mkdotenv-linux-amd64 \
[ --help|-help|--h|-h ] [ --version|-version|--v|-v ] \
--variable-name|-variable-name <variable_name> --variable-value|-variable-value <variable_value> \
[ --env-file|-env-file|--input-file|-input-file <env_file> ] [ --output-file|-output-file <output_file> ] \
[ --remove-doubles|-remove-doubles ] \

Options:

--help, -help, --h, -h OPTIONAL Display help message and exit
--version, -version, --v, -v OPTIONAL Display version and exit
--variable-name, -variable-name REQUIRED Name of the variable to be set
--variable-value, -variable-value REQUIRED Value to assign to the variable
--env-file, -env-file, --input-file, -input-file OPTIONAL Input .env file path (default .env)
--output-file, -output-file OPTIONAL File to write output to (`-` for stdout)
--remove-doubles, -remove-doubles OPTIONAL Remove duplicate variable entries, keeping the first


And I wonder: would --remove-doubles be a usable feature? My goal is that if the .env file contains multiple occurrences of a variable, for example:

S3_SECRET="1234"
S3_SECRET="456"
S3_SECRET="999"


By passing --remove-doubles, for example in this execution of the command:

mkdotenv --variable-name=S3_SECRET --variable-value="4444" --remove-doubles


It would result in:

S3_SECRET="4444"


But is this feature really wanted?
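For what it's worth, the keep-the-first-occurrence semantics can be sketched independently of mkdotenv with a one-line awk filter over a .env-style stream:

```shell
# Illustration only (not mkdotenv itself): print each variable's first
# occurrence and drop later duplicates, splitting lines on '='.
printf 'S3_SECRET="1234"\nS3_SECRET="456"\nS3_SECRET="999"\n' |
  awk -F= '!seen[$1]++'
# prints: S3_SECRET="1234"
```

Combined with setting the new value first, that yields exactly the single S3_SECRET line your example shows; the main design question is whether first-wins or last-wins is less surprising to users.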

Furthermore, it can also be used with pipes like this:

mkdotenv --variable-name=S3_SECRET --variable-value="4444" --remove-doubles --output-file="-" | mkdotenv --variable-name=S3_KEY --variable-value="XXXX" --remove-doubles


But is this also a usable feature for you?


https://redd.it/1m1k6my
@r_devops
devops jobs for Jr level

I'm from India, a BTech CSE student, and I'm starting to learn DevOps; previously I was in cybersecurity.

Can anyone give guidance? And how is the DevOps job market for junior level or interns?

https://redd.it/1m1lbe8
@r_devops
I built an AI tool that turns terminal sessions into runbooks - would love feedback from SREs/DevOps engineers

Hey everyone!

I've been working on Oh Shell! - an AI-powered tool that automatically converts your incident response terminal sessions into comprehensive, searchable runbooks.

**The Problem:**
Every time we have an incident, we lose valuable institutional knowledge. Critical debugging steps, command sequences, and decision-making processes get scattered across terminal histories, chat logs, and individual memories. When similar incidents happen again, we end up repeating the same troubleshooting from scratch.

**The Solution:**
Oh Shell! records your terminal sessions during incident response and uses AI to generate structured runbooks with:

* Step-by-step troubleshooting procedures

* Command explanations and context

* Expected outputs and error handling

* Integration with tools like Notion, Google Docs, Slack, and incident management platforms

Key Features:

* 🎥 One-command recording: Just run ohsh to start recording

* 🤖 AI-powered analysis: Understands your commands and generates comprehensive docs

* 🔗 Tool integrations: Push to Notion, Google Docs, Slack, Firehydrant, [incident.io](https://incident.io)

* 👥 Team collaboration: Share runbooks and build collective knowledge

* 🔒 Security: End-to-end encryption, on-premises options

What I'd love feedback on:

1. Does this solve a real pain point for your team?

1. What integrations would be most valuable to you?

1. How do you currently handle runbook creation and maintenance?

1. What would make this tool indispensable for your incident response process?

1. Any concerns about security or data privacy?

Current Status:

* CLI tool is functional and ready for testing

* Web dashboard for managing generated runbooks

* Integrations with major platforms

* **Free** for trying it out

I'm particularly interested in feedback from SREs, DevOps engineers, and anyone who deals with incident response regularly. What am I missing? What would make this tool better for your workflow? Check it out: [https://ohsh.dev](https://ohsh.dev)

Thanks for your time and feedback! 

https://redd.it/1m1uu2v
@r_devops
Oracle - Race to Certification 2025

Oracle is allowing free certification till 31st October via their Race to Certification program. If you are interested, sign up for it.

https://education.oracle.com/race-to-certification-2025

https://redd.it/1m1vd2e
@r_devops
Best way to prep for CKA?

Hey everyone,
I’m planning to take the **Certified Kubernetes Administrator (CKA)** exam and was wondering:

* What are the best resources/courses you used to prep?
* Any mock labs or hands-on practice you’d recommend?
* Also, any **student discounts** or promo codes available for the exam or courses?

Trying to keep it budget-friendly and efficient. Appreciate any help or advice!

Thanks in advance!

https://redd.it/1m1xj7q
@r_devops