Reddit DevOps

I am building a new CI tool. What should I keep in mind?

If I were to build a new CI tool, what should I do to gain a competitive edge over the existing ones?

https://redd.it/1er9cwm
@r_devops

From the devops community on Reddit

Explore this post and more from the devops community

9 views15:28

Reddit DevOps

I need to run 4 web applications, each requiring only 0.25 CPU and 500 MB RAM. What's the most economical way on AWS?

I'm looking into options for running 4 web applications, each requiring only 0.25 CPU and 500 MB of RAM (or even less). Traffic is fairly low, under 1k active users a month. Each application is just an SPA bundled with a Node backend. The applications also update very frequently (once or twice a day), so each one needs to go from code to a running application automatically, swapping out the old version without downtime and without supervision.

Sure, I could set up an EKS cluster running solely on spot nodes, with multiple replicas of each app so that a spot interruption doesn't cause downtime. But even that would cost me roughly $200 a month (guesstimate). Slap in Argo CD, Image Updater and a build pipeline, and everything is handled for me without supervision.

Or I could spin up an EC2 instance and run them all on it, but since these applications update once or twice a day, I need a way to deploy them automatically as soon as code is checked in to the repository. I don't feel like fiddling with webhooks, SNS and Lambda just to get that working.

Then I saw AWS Amplify. It can track code, build as soon as code is checked in, and deploy automatically. But damn, it is buggy: I could not get those applications to work 100% on Amplify, for reasons behind the scenes that I could not figure out.

Then I saw ECS with Fargate. It seems promising, but I'm still unsure how to automate builds and deploys from code to a running container. I'm also not sure whether it has a cost advantage compared to running a full EKS cluster on spot instances only.

I looked at other providers, like DigitalOcean and Vultr. They offer a managed Kubernetes control plane that costs $0, but damn, their container registry costs a lot more than AWS ECR and has no lifecycle policy to automatically remove old images, which brings the total cost close to doing the same thing on AWS.

Any ideas on how you would deploy these applications?

https://redd.it/1erbi8r
@r_devops

12 views16:28

Reddit DevOps

Traefik global redirect from www to non-www domain

I want to redirect all my containers' websites from https://www.mywebsite.com to https://mywebsite.com. I already have the HTTP-to-HTTPS redirect. I have set up a CNAME DNS record to point www.mywebsite.com to my server's IP.

I had a discussion with ChatGPT, but what it gave me doesn't work: it just loads https://www.mywebsite.com without an SSL certificate.

Here is my Traefik dynamic.yml configuration. What is missing to make it work? I want to apply this redirect globally in the static or dynamic configuration, without editing labels for each container.

This does redirect, but the www domain has no HTTPS certificate.

# dynamic configuration

http:
  middlewares:
    redirect-to-non-www:
      redirectRegex:
        regex: "^https?://www\\.(.*)"
        replacement: "https://$1"
        permanent: true

    secureHeaders:
      headers:
        sslRedirect: true
        forceSTSHeader: true
        stsIncludeSubdomains: true
        stsPreload: true
        stsSeconds: 31536000

    user-auth:
      basicAuth:
        users:
          - '{{ env "TRAEFIK_AUTH" }}'

  routers:
    default-router:
      entryPoints:
        - web
        - websecure
      rule: "HostRegexp(`{host:.+}`)"
      middlewares:
        - redirect-to-non-www
        - secureHeaders
        - user-auth
      service: noop-service
      priority: 1

  services:
    noop-service:
      loadBalancer:
        servers:
          - url: "https://0.0.0.0"

tls:
  options:
    default:
      cipherSuites:
        - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
        - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
        - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
        - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
        - TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
        - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
      minVersion: VersionTLS12
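
One likely gap in the config above is that `default-router` has no `tls` section, so nothing asks Traefik to obtain a certificate for the www host. Below is a minimal sketch of a dedicated router for the www name, reusing the post's `redirect-to-non-www` middleware and `noop-service`. It assumes Traefik v2 and a certificate resolver named `letsencrypt` in the static configuration. The resolver name is hypothetical, since the post does not show the static config.

```yaml
# dynamic configuration (sketch, assumes the Traefik v2 file provider)
http:
  routers:
    www-redirect:
      entryPoints:
        - websecure
      # With a Host rule, Traefik can derive the certificate domain from the rule.
      rule: "Host(`www.mywebsite.com`)"
      middlewares:
        - redirect-to-non-www
      service: noop-service
      tls:
        # "letsencrypt" is an assumed name; use whichever resolver is
        # declared under certificatesResolvers in the static configuration.
        certResolver: letsencrypt
```

A catch-all `HostRegexp` rule gives Traefik no concrete name to request a certificate for, so a global variant would probably need the www hostnames listed explicitly under `tls.domains`.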

https://redd.it/1ercmvj
@r_devops

13 views17:28

Reddit DevOps

Should I leave?

Hey all, I'm struggling with what to do about my current role.

My main issue is that, around a year ago, a lot of the work I would have been interested in was abstracted away to managed vendors, from the management of our environments to the management of developer machines.

Anything network-related is handled either by an internal network team or, again, by our managed vendor.

As such, there's not much I have direct responsibility for in any meaningful capacity.

I can feel my skills atrophying, and it feels like we're just secretaries telling these other teams that something is wrong. It really feels like a glorified support role they slapped the name "DevOps engineer" on.

We are barely involved in the development process for any new applications and don't get many opportunities to practice anything.

I've been trying to learn in my own time, but it's hard when you can't use the skills in the workplace.

This is my first job out of uni, and I've been in the role for 3 years. In my situation, what would you do?

https://redd.it/1erf1hm
@r_devops

11 views18:28

Reddit DevOps

I built a POC for a real-time log monitoring solution, orchestrated as a distributed system

A proof-of-concept log monitoring solution built with a microservices architecture and containerization, designed to capture logs from a live application acting as the log simulator. This solution delivers actionable insights through dashboards, counters, and detailed metrics based on the generated logs. Think of it as a very lightweight internal tool for monitoring logs in real time. All the core infrastructure (e.g., ECS, ECR, S3, Lambda, CloudWatch, subnets, VPCs) is deployed on AWS via Terraform.

Feel free to take a look and give some feedback: https://github.com/akkik04/Trace

https://redd.it/1ergpf0
@r_devops

GitHub

GitHub - akkik04/Trace: POC for a real-time log monitoring solution, orchestrated as a distributed system

POC for a real-time log monitoring solution, orchestrated as a distributed system - akkik04/Trace

11 views19:28

Reddit DevOps

API Observability Guide: Enhancing Reliability & Performance

One of these guest blogs did a pretty good job covering API observability: what it is, its pillars, its components, and how to implement it. It also covers a few advanced techniques, so I thought it might be good to share here as an educational resource.

If there are additional techniques we may have missed, they are welcome, but no pressure.
https://www.getambassador.io/blog/api-observability-enhancing-reliability-performance

https://redd.it/1erghrs
@r_devops

www.getambassador.io

API Observability: Key to Boosting Reliability & Performance

Explore API observability to boost reliability and performance in your digital systems. Master essential tools for improved infrastructure management

11 views20:28

Reddit DevOps

Why is this happening?

I suddenly started hitting this problem when pressing Run Java on my Spring Boot app. If any of you beautiful souls have faced it before, how did you work around it? I have a deadline and I've gotta fix this quick, I'm sorry.

The problem:
Failed to refresh live data from process

service:jmx:rmi:///jndi/rmi://127.0.0.1:45556/jmxrmi

after retries: 10

Source: Spring Boot Tools

https://redd.it/1erj3j1
@r_devops

11 views21:28

Reddit DevOps

DevOps lessons from building a global monitoring platform

Ever start a side project that spirals out of control? That's the story of my last year building UptimeCard, and I thought I'd share some DevOps war stories with you all.

It began innocently enough - just a simple uptime monitor. Fast forward, and I'm juggling a platform that's analyzing tech stacks for thousands of websites globally.

The first reality check hit when my cute little DigitalOcean setup choked at around 1000 monitored sites. Suddenly, I'm deep-diving into AWS documentation, trying to figure out how to scale this thing without breaking the bank. EC2, Lambda, DynamoDB - my new best friends and worst nightmares.

But here's the kicker - monitoring globally means dealing with, well, the globe. I naively thought I could run everything from a single region. You can't.

Then came the data deluge. Turns out, collecting and processing data from thousands of sites every minute is like drinking from a fire hose. I cobbled together a pipeline with Kinesis, and it's holding... for now.

Oh, and the irony of needing rock-solid monitoring for a monitoring service? Not lost on me. I've got CloudWatch alerts that would wake the dead. Because nothing says "professional" like your uptime monitor going down.

Infrastructure management became my nemesis. Started with manual setups (I know, I know), and quickly drowned in config hell. Terraform saved my sanity, but the migration was... let's call it character-building.

Security? A constant paranoia. When you're handling data from thousands of websites, every shadow looks like a potential breach. I'm now on a first-name basis with AWS's IAM documentation.

And let's not forget the cloud bill. I'm now a reluctant expert in auto-scaling groups and spot instances.

UptimeCard's at v1.0 now (https://uptimecard.com if you're curious), but it feels like I've aged a decade getting here. I'm sure there's still a ton to optimize.

So, what hard-learned lessons have you picked up from similar projects? Any tips for a battle-worn developer still figuring out this DevOps game?

I'm also toying with the idea of open-sourcing some of our DevOps scripts. Feels like it's time to give back to the community that's saved my bacon more times than I can count.

https://redd.it/1erjp83
@r_devops

UptimeCard

UptimeCard | Uptime For Innovators

Join UptimeCard to discuss and review the best web hosting providers. Get insights, tips, and find your perfect host.

13 views22:28

Reddit DevOps

Need Suggestions for Reducing Downtime During EKS Deployments

Hello everyone,

I could use some help or suggestions with a deployment issue we're facing.

Currently, we're deploying to EKS, using Atlas MongoDB, and storing some documents in S3. The challenge is that every time we deploy to production, we need to take the system offline, back up S3 (which takes about an hour due to a large number of files, even though the size is small), back up the database, then deploy and run the migration.

Does anyone have ideas on how we can reduce or eliminate this downtime?

https://redd.it/1erjuji
@r_devops

13 views23:28

Reddit DevOps

Resources to learn DevOps Project

Hi all,

Hoping you wonderful people can help.

I'm a project manager that moved into product management.

At present, I am the product owner for Dynamics 365. One of the core issues we have faced is our single-branch strategy. I'm currently moving us fully onto Azure DevOps so that we can automate testing and fix the branching strategy, which should let us be more agile.

One area I need help with is understanding how to use Azure Boards, or the Delivery Plans section in Azure DevOps.

Does anyone know any good, free content for me and my BAs to learn this?

https://redd.it/1erixho
@r_devops

12 views00:28

Reddit DevOps

What do you monitor on your servers?

We've been developing the BlueWave Uptime Manager for the past 5 months with a team of 7 developers and 3 contributors. As we move towards expanding from basic uptime tracking to a comprehensive monitoring solution, we're interested in getting insights from the community.

For those of you managing server infrastructure,

What are the key assets you monitor beyond the basics like CPU, RAM, and disk usage?
Do you also keep tabs on network performance, processes, services, or other metrics?

Additionally, we're debating whether to build a custom monitoring agent or leverage existing solutions like OpenTelemetry or Fluentd.

What’s your take—would you trust a simple, bespoke agent, or would you feel more secure with a well-established solution?
Lastly, what’s your preference for data collection—do you prefer an agent that pulls data or one that pushes it to the monitoring system?

https://redd.it/1erkhef
@r_devops

GitHub

GitHub - bluewave-labs/Checkmate: Checkmate is an open-source, self-hosted tool designed to track and monitor server hardware,…

Checkmate is an open-source, self-hosted tool designed to track and monitor server hardware, uptime, response times, and incidents in real-time with beautiful visualizations. Don't be shy, ...

11 views01:28

Reddit DevOps

Exploring the 12-Factor App Methodology: A Blueprint for Building Scalable and Resilient Cloud-Native Applications

Hey everyone,

I wanted to share a comprehensive blog post I just published about the **12-Factor App methodology**—a set of best practices designed to help developers build scalable, maintainable, and resilient cloud-native applications.

If you're working with **DevOps**, **microservices**, or building applications that need to thrive in **cloud environments**, understanding and applying these 12 factors can be a game-changer. In the post, I dive deep into each principle, explaining how they contribute to building modern, robust applications. I've also included book recommendations for each factor to help you explore these concepts further.

**What you’ll find in the blog:**

* An overview of all 12 factors, from codebase management to treating logs as event streams
* Practical insights on how to implement these principles in your projects
* Book recommendations to deepen your understanding of each factor

If you're interested in improving your application development practices, I think you'll find this post valuable.

🔗 [https://medium.com/@srivatssan/the-12-factor-app-methodology-a-blueprint-for-modern-cloud-native-applications-c1aea2984bde?sk=e2e214a30f30be4dfe7495b5fc27c80a](https://medium.com/@srivatssan/the-12-factor-app-methodology-a-blueprint-for-modern-cloud-native-applications-c1aea2984bde?sk=e2e214a30f30be4dfe7495b5fc27c80a)

I'd love to hear your thoughts and any experiences you've had implementing the 12-Factor App principles in your work!

https://redd.it/1erthxd
@r_devops

Medium

The 12-Factor App Methodology: A Blueprint for Modern Cloud-Native Applications

When developing software applications we focus on many aspects like scalability, maintainability, resiliency etc., Thanks partly to cloud…

11 views05:28

Reddit DevOps

What is the best way to monitor the health of a lot of PCs?

My workplace has a lot of lab systems that occasionally lose their Wi-Fi connection and go offline. What is the best way to monitor multiple PCs? I would like to monitor network connectivity and available hard disk space.

https://redd.it/1eru418
@r_devops

9 views06:28

Reddit DevOps

Where and how do you store your environment vars / secrets?

Right now we are storing the env vars / secrets in Bitbucket (secrets are pulled and mounted).

Looking for better options.

I found a few options, such as HCP Vault or AWS SSM Parameter Store. But as a beginner, I'm still stumped on how it is actually done.

https://redd.it/1erw27o
@r_devops

9 views08:28

Reddit DevOps

Aurora (MySQL) global database with global write forwarding.

We are using an Aurora MySQL Global Database (east primary, west secondary). Our gateway has logic to route "read" traffic by geography and "write" traffic by weight, i.e. to east.

Question: would you recommend using global write forwarding instead? Our application is read-heavy, if that matters, and we do need performance (plus consistency; I know you can't have it all, so maybe performance over consistency, with lag of ~ milliseconds).

Some blogs I've read say not to use global write forwarding. Is the gateway-based routing we have good enough, given that it's not truly active/active for our application either? Or should we do code-based routing instead, i.e. send read queries over geo routes and write queries over weighted routes (Spring/JPA)?

Any suggestions or how you have implemented it would be helpful, thanks!

https://redd.it/1erwraz
@r_devops

Phil's Blog

AWS Aurora Global Clusters Explained: What you wish they told you before you built it

AWS Aurora Global is, on the face of it, a decent product. Aurora is a MySQL fork with a tonne of purported performance benefits over vanilla MySQL. I was building a system, in AWS, which relied on a MySQL database so thought I'd give Aurora Global Clusters…

9 views09:28

Reddit DevOps

CI/CD observability

Is your CI/CD pipeline slowing you down? Dive into the key steps and best practices to enhance your pipeline's visibility and performance using OpenTelemetry. Check out this blog: https://www.cloudraft.io/blog/opentelemetry-for-cicd-observability

https://redd.it/1ery0u3
@r_devops

CloudRaft

OpenTelemetry for CI/CD Observability

Explore how OpenTelemetry enhances CI/CD observability, boosting performance, troubleshooting, and scalability in DevOps.

8 views10:28

Reddit DevOps

Loggly alternative for centralized logs

I'm looking for an alternative to Loggly. I have various .NET applications deployed across multiple locations, and I need them to send their logs back to a central server.

I've been experimenting with Loggly and I'm already at the limit of their free plan, even in the testing phase. I was thinking about Splunk, since it offers the feature set most similar to Loggly's, but it comes with significant limits on data ingestion, especially in the Splunk Light version.

Does anyone have any recommendations? :)

https://redd.it/1ery93u
@r_devops

11 views11:28

Reddit DevOps

I started challenging our junior devs to provide feedback or ask at least one question while reviewing a PR. Thoughts?

Our junior devs are allowed to approve PRs (not my choice), and it's usually just a rubber stamp because they're nervous about calling out a more senior member.

I asked them to try adding some feedback to each PR, just to help them get their feet wet and become more comfortable.

https://redd.it/1es2ykc
@r_devops

11 views14:28

Reddit DevOps

We're reviewing a few CI/CD tools for our company and I'm curious about your experience with a couple.

Namely, it looks like management is whittling it down to Travis CI or GitHub Actions. I've heard that GitHub Actions requires a lot more coding than Travis (this matters a lot more to me than to the bean counters lol). If that's the case, it sounds like there's a big efficiency argument that may not be easy to quantify for the various decision makers. Anyone?
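
For a rough sense of how much configuration GitHub Actions asks for, here is a minimal workflow sketch. It assumes a Node.js project with an `npm test` script, saved at `.github/workflows/ci.yml`. None of that comes from the post, so treat the stack, job name, and versions as placeholders.

```yaml
# .github/workflows/ci.yml (illustrative; project stack is assumed)
name: ci
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository contents
      - uses: actions/checkout@v4
      # Install the assumed Node.js version
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # Install dependencies from the lockfile, then run the tests
      - run: npm ci
      - run: npm test
```

Both tools are configured through a YAML file in the repository, so the difference in effort may come down more to the pipeline's complexity than to the tool itself.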

https://redd.it/1es4h0a
@r_devops

13 views15:28

Reddit DevOps

Standard vs Express Step Functions

I don't quite understand what they mean by the exactly-once and at-least-once models, respectively. If we can use a for loop and retries in a Standard workflow, how is that exactly-once then?!

https://redd.it/1es5brj
@r_devops

11 views16:28

Reddit DevOps

In your resume, do you put in a lot of keywords to pass CV screening, or do you avoid it?

Hello!

To pass the CV screening phase, which is often done by HR or even an automated tool, do you put a lot of technology keywords in your resume? (Like listing all the tech you have worked on, even if it was only for a short time.)

Or do you avoid that, so that you pass the hiring manager's CV screening?

What is a good balance?

https://redd.it/1es72ff
@r_devops

15 views17:28
