Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Need Suggestions for Reducing Downtime During EKS Deployments

Hello everyone,

I could use some help or suggestions with a deployment issue we're facing.

Currently, we're deploying to EKS, using Atlas MongoDB, and storing some documents in S3. The challenge is that every time we deploy to production, we need to take the system offline, back up S3 (which takes about an hour due to a large number of files, even though the size is small), back up the database, then deploy and run the migration.

Does anyone have ideas on how we can reduce or eliminate this downtime?

https://redd.it/1erjuji
@r_devops
Resources to learn DevOps Project

Hi all,

Hoping you wonderful people can help.

I'm a project manager who moved into product management.

At present, I am the product owner for Dynamics 365. One of our core issues has been our single-branch strategy. I'm currently moving us fully onto Azure DevOps so we can automate testing and fix the branching strategy, which should let us be more agile.

One area I need help with is understanding how to use Azure Boards, or the Delivery Plans section of Azure DevOps.

Does anyone know any good, free content for me and my BAs to learn this?



https://redd.it/1erixho
@r_devops
What do you monitor on your servers?


We've been developing the BlueWave Uptime Manager for the past 5 months with a team of 7 developers and 3 contributors. As we move towards expanding from basic uptime tracking to a comprehensive monitoring solution, we're interested in getting insights from the community.

For those of you managing server infrastructure,

What are the key assets you monitor beyond the basics like CPU, RAM, and disk usage?
Do you also keep tabs on network performance, processes, services, or other metrics?

Additionally, we're debating whether to build a custom monitoring agent or leverage existing solutions like OpenTelemetry or Fluentd.

What’s your take—would you trust a simple, bespoke agent, or would you feel more secure with a well-established solution?
Lastly, what’s your preference for data collection—do you prefer an agent that pulls data or one that pushes it to the monitoring system?

https://redd.it/1erkhef
@r_devops
Exploring the 12-Factor App Methodology: A Blueprint for Building Scalable and Resilient Cloud-Native Applications

Hey everyone,

I wanted to share a comprehensive blog post I just published about the **12-Factor App methodology**—a set of best practices designed to help developers build scalable, maintainable, and resilient cloud-native applications.

If you're working with **DevOps**, **microservices**, or building applications that need to thrive in **cloud environments**, understanding and applying these 12 factors can be a game-changer. In the post, I dive deep into each principle, explaining how they contribute to building modern, robust applications. I've also included book recommendations for each factor to help you explore these concepts further.

**What you’ll find in the blog:**

* An overview of all 12 factors, from codebase management to treating logs as event streams
* Practical insights on how to implement these principles in your projects
* Book recommendations to deepen your understanding of each factor

If you're interested in improving your application development practices, I think you'll find this post valuable.

🔗 [https://medium.com/@srivatssan/the-12-factor-app-methodology-a-blueprint-for-modern-cloud-native-applications-c1aea2984bde?sk=e2e214a30f30be4dfe7495b5fc27c80a](https://medium.com/@srivatssan/the-12-factor-app-methodology-a-blueprint-for-modern-cloud-native-applications-c1aea2984bde?sk=e2e214a30f30be4dfe7495b5fc27c80a)



I'd love to hear your thoughts and any experiences you've had implementing the 12-Factor App principles in your work!

https://redd.it/1erthxd
@r_devops
What is the best way to monitor the health of a lot of PCs?

My workplace has a lot of lab systems that occasionally lose their Wi-Fi connection and go offline. What is the best way to monitor multiple PCs? I would like to monitor network connectivity and available hard disk space.
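
The two checks named above (network connectivity and free disk space) can be sketched with the Python standard library alone. This is a minimal sketch, not a monitoring product: the function names, the 10% threshold, and the host and port you probe are all assumptions chosen for illustration.

```python
import shutil
import socket


def disk_free_percent(path="."):
    """Return free space on the volume holding `path` as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def health_report(check_host, check_port, path=".", min_free_percent=10.0):
    """Collect both checks into one dict (threshold is an assumed default)."""
    free = disk_free_percent(path)
    return {
        "hostname": socket.gethostname(),
        "network_ok": can_connect(check_host, check_port),
        "disk_free_percent": round(free, 1),
        "disk_ok": free >= min_free_percent,
    }
```

A PC that has lost Wi-Fi cannot send its own report, so a setup like this would probably need a central collector that treats a missing report as "offline" rather than relying on the PC to announce it.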



https://redd.it/1eru418
@r_devops
Where and how do you store your environment vars / secrets?

Right now we are storing the env vars/secrets in Bitbucket (secrets are pulled and mounted).

Looking for better options.

I found a few options such as HCP Vault or AWS SSM Parameter Store, but as a beginner I'm stuck on how it is actually done.

https://redd.it/1erw27o
@r_devops
Aurora (MySQL) global database with global write forwarding.

We are using Aurora MySQL Global DB (east primary & west secondary). We have logic in the gateway to route "read" traffic by geography and "write" traffic by weight, i.e. to east.

Question: Do you recommend using global write forwarding instead? Our application is read-heavy, if that matters, and we do need performance (plus consistency; I know you can't have it all, so maybe performance over consistency, with lag of ~milliseconds).

Some blogs I've read say not to use global write forwarding. Is the gateway-based routing we have good enough, even though it's not truly Active/Active for our application? Or should we do code-based routing instead, i.e. send read queries geo-routed and write queries to weighted routes (Spring/JPA)?

Any suggestions or how you have implemented it would be helpful, thanks!

https://redd.it/1erwraz
@r_devops
CI/CD observability

Is your CI/CD pipeline slowing you down? Dive into the key steps and best practices to enhance your pipeline's visibility and performance using OpenTelemetry. Check out this blog: https://www.cloudraft.io/blog/opentelemetry-for-cicd-observability

https://redd.it/1ery0u3
@r_devops
Loggly alternative for centralized logs

I'm looking for an alternative to Loggly. I have various .NET applications deployed across multiple locations, and I need them to send their logs back to a central server.

I've been experimenting with Loggly and I’m already at the limit of their free plan, even in the testing phase. I was thinking about Splunk since they offer the most similar feature set to Loggly, but it comes with significant limitations on data ingestion, especially in the Splunk Light version.

Does anyone have any recommendations? :)

https://redd.it/1ery93u
@r_devops
I started challenging our junior devs to provide feedback or ask at least one question while reviewing a PR. Thoughts?

Our junior devs are allowed to approve PRs (not my choice), and it's usually just a rubber stamp because they're nervous about calling out a more senior member.

I requested they try to add something to the PR in terms of feedback just to help them get their feet wet and more comfortable.



https://redd.it/1es2ykc
@r_devops
We're reviewing a few CI/CD tools for our company and I'm curious about your experience with a couple.

Namely, it looks like management is whittling it down to Travis CI or GitHub Actions. I've heard that GitHub Actions requires a lot more coding than Travis (this is a lot more important to me than to the bean counters lol). If that's the case, it sounds like there's a big efficiency argument there that may not be so easy to quantify for the various decision makers. Anyone?

https://redd.it/1es4h0a
@r_devops
Standard vs Express Step Functions workflows

I don’t quite understand what they mean by the exactly-once and at-least-once models, respectively. If we can use a for loop and retry in a Standard workflow, how is that exactly-once then?!
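
One way to picture the distinction: under an at-least-once model the same step may be delivered more than once without you asking for it, so the work behind it should be idempotent. A retry you write yourself is a different thing, since you chose to run the step again. The sketch below is plain Python, not Step Functions code; `IdempotentHandler` and `request_id` are hypothetical names, and the in-memory dict stands in for whatever durable store a real system would use.

```python
class IdempotentHandler:
    """Wrap a side-effecting function so that repeated deliveries of the
    same request id run it only once (in-memory store, illustrative only)."""

    def __init__(self, func):
        self._func = func
        self._results = {}  # request_id -> cached result

    def handle(self, request_id, payload):
        # A duplicate delivery finds the cached result and skips the work.
        if request_id not in self._results:
            self._results[request_id] = self._func(payload)
        return self._results[request_id]


calls = []


def charge(amount):
    """Stand-in for a side effect that must not happen twice."""
    calls.append(amount)
    return "charged " + str(amount)


handler = IdempotentHandler(charge)
first = handler.handle("order-1", 25)
second = handler.handle("order-1", 25)  # simulated duplicate delivery
```

After both calls, `charge` has run once even though the request was delivered twice, which is the property an at-least-once consumer needs.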

https://redd.it/1es5brj
@r_devops
In your resume, do you put a lot of keywords to pass CV screening or avoid it?

Hello!

In your resume, in order to pass the CV screening phase, which is often done by HR or even an automatic tool, do you put in a lot of technology keywords? (Like listing all the tech you have worked on, even if it was only for a short time.)

Or do you avoid it in order to pass the hiring manager's CV screening?

What is a good balance?

https://redd.it/1es72ff
@r_devops
How to prevent data exfiltration

Hi everyone,

I’d like to get your opinions on implementing a data exfiltration prevention system.

For context, we have a partner who provides us with data and requires controls to prevent data exfiltration through personal email accounts, saving to local drives, copying, remote printing, etc.

We currently have SIEM, antivirus, and threat detection in place on both workstations and servers. Server access is restricted to authorized personnel only, requires VPN and approval for each connection, with sessions limited to 8 hours and fully logged. We also have DLP enabled in Microsoft Office. We are SOC2 Type 2 certified.

Support handles level 1 and 2 issues with limited access to client data via browsers. Level 3 support is managed by developers, and unfortunately, there are too many of them, but that's something we can’t address.

The partner wants us to extend these measures to our clients, which is impossible since we are a B2C company. However, they criticize us for only having detection methods and no prevention measures.

This is where we’re stuck—how can we implement a system that actively blocks data exfiltration? I see the potential of using a proxy to filter all web traffic, but that would significantly slow down development, which is challenging for a tech firm like ours.

What solutions do you use?

https://redd.it/1es70z4
@r_devops
Exploring the Recent Microsoft AI Health Bot Vulnerability: What DevOps Teams Should Know

Recently, a vulnerability was discovered in the Microsoft AI Health Bot, raising important questions for DevOps professionals working in the healthcare sector. As we navigate an era where AI integration in health care is becoming increasingly prevalent, understanding the implications of such vulnerabilities is critical. How should DevOps teams approach security when deploying AI solutions? What best practices can be implemented to safeguard sensitive information? Let’s discuss the lessons learned from this incident and share strategies to enhance our security posture in artificial intelligence applications. Have you faced similar challenges in your projects? What measures did you take?
https://7med.co.uk/microsoft-ai-health-bot-vulnerability-patched/

https://redd.it/1esb8jo
@r_devops
Need to create a main branch under a repo using the API (Power Automate)

Hi guys, we are automating a process to create a project, repos, a main branch and sub-branches using the Power Automate HTTP connector. I am able to create the project and repo in a hierarchy, but I am struggling to create a main branch using the REST API. Please help me with this. Thanks 🙏

https://redd.it/1esd55o
@r_devops
Env vars

Hey all

Curious how you handle your env vars within a deployment?
We have GitHub build our Docker containers for Kubernetes, and every build adds the env vars to a .env file.
I find this approach terrible because if a dev forgets to add a new variable to the pipeline file, the build fails.
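
One possible variation, sketched under assumptions: keep a single list of required variable names next to the application code and check it when the container starts, so the values can be injected at runtime (for example from a Kubernetes ConfigMap or Secret) instead of being written into a .env file at build time. The function names and the variable names in the usage note are hypothetical.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty, in order."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


def require_env_vars(required, environ=None):
    """Raise one error that lists every missing variable at once."""
    missing = missing_env_vars(required, environ)
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
```

Calling something like `require_env_vars(["DATABASE_URL", "API_KEY"])` at startup would report all missing names in one message, rather than failing on the first one.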

Wondered if you guys are doing it a cleaner way?


https://redd.it/1esddux
@r_devops
What's your strategy for learning tech at your organization?

When it seems like it's endless?

Do you tend to be a master of a few trades or a jack of all trades? How deep does your knowledge typically spread?

https://redd.it/1esfcah
@r_devops
Launch.json config for listening to a containerized app not working

I have a straightforward setup and I see Debugger listening on ws://127.0.0.1:5555, but when I try to listen to the debugger port, VS Code hangs and doesn't output any error message. Why would it work on Linux, but not on Windows, and what are the fixes?

    {
      "version": "0.2.0",
      "configurations": [
        {
          "type": "node",
          "request": "attach",
          "name": "Attach to Docker Container",
          "address": "localhost",
          "port": 5555,
          "localRoot": "${workspaceFolder}",
          "remoteRoot": "/usr/src/app"
        }
      ]
    }

It looks something like this, and I was wondering what I can do to debug and find out why it's not working. I am using ts-node-dev.

https://redd.it/1escqf3
@r_devops