Need Suggestions for Reducing Downtime During EKS Deployments
Hello everyone,
I could use some help or suggestions with a deployment issue we're facing.
Currently we deploy to EKS, use MongoDB Atlas, and store some documents in S3. The challenge is that every time we deploy to production, we need to take the system offline, back up S3 (which takes about an hour due to the sheer number of files, even though the total size is small), back up the database, then deploy and run the migration.
Does anyone have ideas on how we can reduce or eliminate this downtime?
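One hedged idea for eliminating the migration window: make migrations backward compatible (the expand/contract pattern), so old and new pods can serve traffic side by side during a rolling deploy and the data migration runs online instead of in a downtime window. A minimal sketch with documents modeled as dicts; all field names are hypothetical:

```python
# Expand/contract ("parallel change") migration sketch: both the old and the
# new application version can read every document, so the migration can run
# while traffic is still flowing and no offline window is needed.

def read_full_name(doc):
    # New schema stores "full_name"; old schema stored "first"/"last".
    if "full_name" in doc:
        return doc["full_name"]
    return f"{doc['first']} {doc['last']}"

def migrate(doc):
    # Idempotent per-document migration: safe to re-run, safe to run lazily
    # or in a background job after the new version is already deployed.
    if "full_name" not in doc:
        doc["full_name"] = f"{doc['first']} {doc['last']}"
    return doc
```

With reads tolerant of both shapes, the backup stops being a rollback prerequisite for every deploy (S3 versioning and Atlas continuous backups can cover the safety net instead).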
https://redd.it/1erjuji
@r_devops
Resources to learn DevOps Project
Hi all,
Hoping you wonderful people can help.
I'm a project manager who moved into product management.
At present, I am product owner for Dynamics 365. One of the core issues we have faced has been a single-branch strategy. I'm currently in the process of moving us fully onto Azure DevOps so we can automate testing and fix the branching strategy, allowing us to be more agile.
One area I need help with is understanding how to use Azure Boards, or the Delivery Plans section, in Azure DevOps.
Does anyone know any good, free content for me and my BAs to learn this?
https://redd.it/1erixho
@r_devops
What do you monitor on your servers?
We've been developing the BlueWave Uptime Manager for the past 5 months with a team of 7 developers and 3 contributors. As we move towards expanding from basic uptime tracking to a comprehensive monitoring solution, we're interested in getting insights from the community.
For those of you managing server infrastructure:
What are the key assets you monitor beyond the basics like CPU, RAM, and disk usage?
Do you also keep tabs on network performance, processes, services, or other metrics?
Additionally, we're debating whether to build a custom monitoring agent or leverage existing solutions like OpenTelemetry or Fluentd.
What’s your take—would you trust a simple, bespoke agent, or would you feel more secure with a well-established solution?
Lastly, what’s your preference for data collection—do you prefer an agent that pulls data or one that pushes it to the monitoring system?
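On the last question, the two collection models can be sketched in a few lines (all names here are hypothetical): with pull, the monitoring system scrapes the agent on its own schedule; with push, the agent decides when to send.

```python
# Minimal illustration of pull vs push collection.

class PullAgent:
    def __init__(self):
        self._gauges = {}

    def register(self, name, fn):
        self._gauges[name] = fn          # callback evaluated at scrape time

    def scrape(self):
        # The server calls this on its schedule; values are as fresh as
        # the scrape interval, and a dead agent is noticed by failed scrapes.
        return {name: fn() for name, fn in self._gauges.items()}

class PushAgent:
    def __init__(self, sink):
        self._sink = sink                # stand-in for an HTTP endpoint

    def emit(self, name, value):
        # The agent controls timing and batching; works through NAT and
        # firewalls, but a silent agent looks the same as a healthy idle one.
        self._sink.append((name, value))
```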
https://redd.it/1erkhef
@r_devops
GitHub
GitHub - bluewave-labs/Checkmate: Checkmate is an open-source, self-hosted tool designed to track and monitor server hardware,…
Checkmate is an open-source, self-hosted tool designed to track and monitor server hardware, uptime, response times, and incidents in real-time with beautiful visualizations. Don't be shy, ...
Exploring the 12-Factor App Methodology: A Blueprint for Building Scalable and Resilient Cloud-Native Applications
Hey everyone,
I wanted to share a comprehensive blog post I just published about the **12-Factor App methodology**—a set of best practices designed to help developers build scalable, maintainable, and resilient cloud-native applications.
If you're working with **DevOps**, **microservices**, or building applications that need to thrive in **cloud environments**, understanding and applying these 12 factors can be a game-changer. In the post, I dive deep into each principle, explaining how they contribute to building modern, robust applications. I've also included book recommendations for each factor to help you explore these concepts further.
**What you’ll find in the blog:**
* An overview of all 12 factors, from codebase management to treating logs as event streams
* Practical insights on how to implement these principles in your projects
* Book recommendations to deepen your understanding of each factor
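For a flavor of the material, Factor III ("Config") keeps configuration in environment variables so a single build runs in every environment; a minimal sketch with hypothetical variable names:

```python
import os

# Factor III: config lives in the environment, not in code, so the same
# artifact is promoted unchanged from staging to production.

def load_config(env=os.environ):
    return {
        "db_url": env["DATABASE_URL"],                         # required
        "debug": env.get("DEBUG", "false").lower() == "true",  # optional
    }
```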
If you're interested in improving your application development practices, I think you'll find this post valuable.
🔗 [https://medium.com/@srivatssan/the-12-factor-app-methodology-a-blueprint-for-modern-cloud-native-applications-c1aea2984bde?sk=e2e214a30f30be4dfe7495b5fc27c80a](https://medium.com/@srivatssan/the-12-factor-app-methodology-a-blueprint-for-modern-cloud-native-applications-c1aea2984bde?sk=e2e214a30f30be4dfe7495b5fc27c80a)
I'd love to hear your thoughts and any experiences you've had implementing the 12-Factor App principles in your work!
https://redd.it/1erthxd
@r_devops
Medium
The 12-Factor App Methodology: A Blueprint for Modern Cloud-Native Applications
When developing software applications we focus on many aspects like scalability, maintainability, resiliency etc., Thanks partly to cloud…
What is the best way to monitor the health of many PCs?
My workplace has a lot of lab systems that occasionally lose Wi-Fi and go offline. What is the best way to monitor multiple PCs? I would like to monitor network connectivity and hard-disk space availability.
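For the two checks mentioned, the collection side is small in stdlib Python; the real work is scheduling it on each machine and reporting to a central server. Host, port, and thresholds below are placeholders:

```python
import shutil
import socket

def disk_free_fraction(path="."):
    # Fraction of the disk that is still free, 0.0..1.0.
    usage = shutil.disk_usage(path)
    return usage.free / usage.total

def can_reach(host, port, timeout=2.0):
    # True if a TCP connection to host:port succeeds within the timeout;
    # a cheap stand-in for "is this box on the network".
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```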
https://redd.it/1eru418
@r_devops
Where and how do you store your environment vars / secrets?
Right now we store the env vars/secrets in Bitbucket (secrets are pulled and mounted).
Looking for better options.
I found a few options such as HCP Vault or AWS SSM Parameter Store, but as a beginner I'm stuck on how it's actually done.
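Whichever backend you choose (Vault, SSM Parameter Store, Kubernetes Secrets), the app-side pattern is usually the same: the app reads an env var or a mounted file, and the platform decides where the value actually comes from. A hedged sketch; names are hypothetical:

```python
import os

def get_secret(name, env=os.environ):
    # Direct injection, e.g. set by the CI/CD system or the orchestrator.
    if name in env:
        return env[name]
    # File indirection, e.g. a Kubernetes Secret mounted as a volume and
    # pointed at via MYSECRET_FILE=/run/secrets/mysecret.
    path = env.get(f"{name}_FILE")
    if path and os.path.exists(path):
        with open(path) as f:
            return f.read().strip()
    raise KeyError(f"secret {name!r} not provided")
```

Keeping the app ignorant of the backend makes it easy to start with SSM or Vault later without code changes.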
https://redd.it/1erw27o
@r_devops
Aurora (MySQL) global database with global write forwarding.
We are using Aurora MySQL Global Database (east primary, west secondary). We have logic in the gateway to route read traffic geo-based and write traffic weighted, i.e. to east.
Question: do you recommend using global write forwarding instead? Our application is read-heavy, if that matters, and we do need performance (plus consistency; I know you can't have it all, so maybe performance over consistency with a lag of ~milliseconds).
Some blogs I've read say not to use global write forwarding. Is the gateway-based routing we have good enough? It's not truly active/active for our application in that case either. Should we do code-based routing instead, i.e. send read queries geo-routed and write queries to weighted routes (Spring/JPA)?
Any suggestions or how you have implemented it would be helpful, thanks!
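Code-based routing of the kind described could look like this, sketched in Python rather than Spring/JPA for brevity; the endpoint names and the crude read/write classifier are placeholders:

```python
# Writes always go to the primary region's cluster endpoint (single writer,
# no forwarding); reads go to the reader endpoint nearest the client.

ENDPOINTS = {
    "primary": "aurora-east.cluster.example.com",
    "us-east": "aurora-east.cluster-ro.example.com",
    "us-west": "aurora-west.cluster-ro.example.com",
}

def pick_endpoint(statement, client_region):
    # Naive classifier: anything that isn't a SELECT is treated as a write.
    is_write = statement.lstrip().split(None, 1)[0].upper() not in ("SELECT",)
    if is_write:
        return ENDPOINTS["primary"]
    # Unknown regions fall back to the primary for a consistent read.
    return ENDPOINTS.get(client_region, ENDPOINTS["primary"])
```

The same split is what a Spring `AbstractRoutingDataSource` (or read-only transaction routing) would do; the trade-off versus write forwarding is that cross-region writes pay a full round trip instead of being forwarded with session-level consistency options.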
https://redd.it/1erwraz
@r_devops
Phil's Blog
AWS Aurora Global Clusters Explained: What you wish they told you before you built it
AWS Aurora Global is, on the face of it, a decent product. Aurora is a MySQL fork with a tonne of purported performance benefits over vanilla MySQL. I was building a system, in AWS, which relied on a MySQL database so thought I'd give Aurora Global Clusters…
CI/CD observability
Is your CI/CD pipeline slowing you down? Dive into the key steps and best practices to enhance your pipeline's visibility and performance using OpenTelemetry. Check out this blog: https://www.cloudraft.io/blog/opentelemetry-for-cicd-observability
https://redd.it/1ery0u3
@r_devops
CloudRaft
OpenTelemetry for CI/CD Observability
Explore how OpenTelemetry enhances CI/CD observability, boosting performance, troubleshooting, and scalability in DevOps.
Loggly alternative for centralized logs
I'm looking for an alternative to Loggly. I have various .NET applications deployed across multiple locations, and I need them to send their logs back to a central server.
I've been experimenting with Loggly and I'm already at the limit of their free plan, even in the testing phase. I was thinking about Splunk, since they offer the feature set most similar to Loggly's, but it comes with significant limitations on data ingestion, especially in the Splunk Light version.
Does anyone have any recommendations? :)
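Whatever backend gets picked, the shipping side tends to look the same: batch records and hand them to a transport. A sketch of that pattern in Python (for the .NET apps the equivalent role is played by a Serilog or NLog sink); the transport is injected and everything else is a placeholder:

```python
import logging

class ShippingHandler(logging.Handler):
    # Buffers formatted records and flushes them in batches to a transport
    # (HTTP POST, TCP socket, etc. in real life).
    def __init__(self, send, batch_size=10):
        super().__init__()
        self.send = send                 # callable: list[str] -> None
        self.batch_size = batch_size
        self.buffer = []

    def emit(self, record):
        self.buffer.append(self.format(record))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send(self.buffer)
            self.buffer = []
```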
https://redd.it/1ery93u
@r_devops
I started challenging our junior devs to provide feedback or ask at least one question while reviewing a PR. Thoughts?
Our junior devs are allowed to approve PRs (not my choice), and it's usually just a rubber stamp because they're nervous about calling out a more senior member.
I requested they try to add something to the PR in terms of feedback just to help them get their feet wet and more comfortable.
https://redd.it/1es2ykc
@r_devops
We're reviewing a few CI/CD tools for our company and I'm curious about your experience with a couple.
It looks like management is whittling it down to Travis CI or GitHub Actions. I've heard that GitHub Actions requires a lot more coding than Travis (this is a lot more important to me than to the bean counters, lol). If that's the case, there's a big argument there in terms of efficiency that may not be so easily quantified for the various decision makers. Anyone?
https://redd.it/1es4h0a
@r_devops
Standard vs Express Step Functions
I don't quite understand what they mean by the exactly-once and at-least-once models, respectively. If we can use a for loop and a retry in a Standard workflow, how is that exactly once?!
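As I understand the docs, "exactly once" in Standard workflows means the service itself never starts a state twice for the same transition; your own Retry blocks and loops are re-runs you explicitly asked for, so they don't contradict it. Express's at-least-once model means a state may be executed more than once even without you asking, so handlers have to be idempotent. A consumer-side dedup sketch (event ids and the in-memory store are hypothetical):

```python
class IdempotentHandler:
    # Makes an at-least-once stream behave like exactly-once processing by
    # recording which event ids have already been handled.
    def __init__(self, process):
        self.process = process
        self.done = set()                # in real life: a durable store

    def handle(self, event_id, payload):
        if event_id in self.done:        # duplicate delivery: skip
            return "skipped"
        self.process(payload)
        self.done.add(event_id)
        return "processed"
```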
https://redd.it/1es5brj
@r_devops
In your resume, do you put a lot of keywords to pass CV screening, or avoid it?
Hello!
In your resume, in order to pass the CV screening phase (often done by HR or even an automatic tool), do you put a lot of technology keywords? (Like listing all the tech you've worked on, even if only briefly.)
Or do you avoid that in order to pass the hiring manager's CV screening?
What is the good balance?
https://redd.it/1es72ff
@r_devops
How to prevent data exfiltration
Hi everyone,
I’d like to get your opinions on implementing a data exfiltration prevention system.
For context, we have a partner who provides us with data and requires controls to prevent data exfiltration through personal email accounts, saving to local drives, copying, remote printing, etc.
We currently have SIEM, antivirus, and threat detection in place on both workstations and servers. Server access is restricted to authorized personnel only, requires VPN and approval for each connection, with sessions limited to 8 hours and fully logged. We also have DLP enabled in Microsoft Office. We are SOC2 Type 2 certified.
Support handles level 1 and 2 issues with limited access to client data via browsers. Level 3 support is managed by developers, and unfortunately, there are too many of them, but that's something we can’t address.
The partner wants us to extend these measures to our clients, which is impossible since we are a B2C company. However, they criticize us for only having detection methods and no prevention measures.
This is where we’re stuck—how can we implement a system that actively blocks data exfiltration? I see the potential of using a proxy to filter all web traffic, but that would significantly slow down development, which is challenging for a tech firm like ours.
What solutions do you use?
https://redd.it/1es70z4
@r_devops
Notebook-Native CI/CD: Dagger for Runme
Have you tried Dagger yet? If the answer is yes, have you built pipelines in a notebook yet? Crazy idea?
Learn about it here: https://runme.dev/blog/dagger-for-runme
https://redd.it/1es7taw
@r_devops
runme.dev
Notebook-Native CI/CD: Dagger for Runme
Learn how to build Dagger functions and pipelines in interactive notebooks. Runme v3.7 integrates Dagger directly into the notebook user interface, making learning and building with it a breeze.
Exploring the Recent Microsoft AI Health Bot Vulnerability: What DevOps Teams Should Know
Recently, a vulnerability was discovered in the Microsoft AI Health Bot, raising important questions for DevOps professionals working in the healthcare sector. As we navigate an era where AI integration in health care is becoming increasingly prevalent, understanding the implications of such vulnerabilities is critical. How should DevOps teams approach security when deploying AI solutions? What best practices can be implemented to safeguard sensitive information? Let’s discuss the lessons learned from this incident and share strategies to enhance our security posture in artificial intelligence applications. Have you faced similar challenges in your projects? What measures did you take?
https://7med.co.uk/microsoft-ai-health-bot-vulnerability-patched/
https://redd.it/1esb8jo
@r_devops
7Med Integration
Microsoft AI Health Bot Patched to Address Critical Vulnerability | 7Med Integration
Learn how Microsoft addressed a critical privilege escalation vulnerability in its AI Health Bot, ensuring enhanced security for healthcare applications.
Need to create a main branch under a repo using the API (Power Automate)
Hi guys, we are automating a process to create a project, repos, a main branch, and sub-branches using the Power Automate HTTP connector. I am able to create the project and repo in a hierarchy, but I am struggling to create the main branch using the REST API. Please help me with this. Thanks 🙏
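Assuming this is Azure DevOps Repos (a common target for the Power Automate HTTP connector): an empty repository has no refs, so `main` is typically created by the first push (the Pushes API) rather than by a refs update. A sketch of the request body the HTTP action would send; the file path, comment, and API version are placeholders and should be checked against the current docs:

```python
import json

# Body for POST https://dev.azure.com/{org}/{project}/_apis/git/
#   repositories/{repoId}/pushes?api-version=7.0
# The all-zeros oldObjectId means "this ref does not exist yet", which is
# what creates refs/heads/main in an empty repository.

def initial_push_body(branch="main"):
    return {
        "refUpdates": [{
            "name": f"refs/heads/{branch}",
            "oldObjectId": "0" * 40,     # branch is being created
        }],
        "commits": [{
            "comment": "Initial commit",
            "changes": [{
                "changeType": "add",
                "item": {"path": "/README.md"},
                "newContent": {"content": "# New repo",
                               "contentType": "rawtext"},
            }],
        }],
    }
```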
https://redd.it/1esd55o
@r_devops
Env vars
Hey all
Curious how you handle your env vars within a deployment?
We have GitHub build our Docker containers for Kubernetes, and every build adds the env vars to a .env file.
I find this approach terrible: if a dev forgets to add a new variable in the pipeline file, the build fails.
Wondered if you guys are doing it a cleaner way?
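One common cleanup, whatever the source of the variables ends up being: declare the required set in one place and fail fast with a complete list of what's missing, so a forgotten variable produces one clear error instead of a mysterious build failure. A sketch with hypothetical names:

```python
import os

# Single source of truth for what the app needs; run at startup or as a
# pipeline pre-flight step.
REQUIRED = ["DATABASE_URL", "REDIS_URL", "API_KEY"]

def check_env(env=os.environ):
    missing = [name for name in REQUIRED if name not in env]
    if missing:
        raise SystemExit(f"missing env vars: {', '.join(missing)}")
```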
https://redd.it/1esddux
@r_devops
What's your strategy for learning tech at your organization?
When it seems like it's endless?
Do you tend to be a master of a few trades or a jack of all trades? How deep does your knowledge typically spread?
https://redd.it/1esfcah
@r_devops
launch.json config for attaching to a containerized app not working
I have a straightforward setup and I see Debugger listening on ws://127.0.0.1:5555, but when I try to listen to the debugger port, VS Code hangs and doesn't output any error message. Why would it work on Linux, but not on Windows, and what are the fixes?
The config looks something like this, and I was wondering what I can do to debug and find out why it's not working. I am using ts-node-dev.

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "node",
      "request": "attach",
      "name": "Attach to Docker Container",
      "address": "localhost",
      "port": 5555,
      "localRoot": "${workspaceFolder}",
      "remoteRoot": "/usr/src/app"
    }
  ]
}
```

https://redd.it/1escqf3
@r_devops
Would Sherlock use traces or metrics to debug your application?
https://jaywhy13.hashnode.dev/would-sherlock-use-traces-or-metrics-to-debug-your-application
Looking for thoughts and opposing views on the superiority of traces for debugging applications.
https://redd.it/1esi84f
@r_devops
Perspective Unspoken
3 reasons traces are better than metrics for debugging
Discover why traces are essential for effective debugging and system investigation in modern micro-service architectures.