Reddit DevOps

What is a good way to document CI/CD pipelines?

I’m building pipelines for various apps, covering both CI and CD. I want to start by showing the team the different tools and steps within these pipelines. Are there any free tools for generating clear, illustrative, DevOps-oriented docs?

https://redd.it/yx4jv6
@r_devops

Reddit DevOps

Feeling not so great about being a DevOps/Cloud Engineer

Hello fellow DevOps friends,

Lately I have been feeling very down and depressed about how I'm functioning in DevOps, cloud, infrastructure, etc.

I got into DevOps about 6(?) years ago when I moved to it from QE, so I never had a strong programming background. I do love learning about different tools and technologies and finding effective ways to implement them, but I really feel like I'm lagging behind.

I've been working in Azure consistently for 7 months and still barely understand it even at a fundamental level. Aside from this, I've been experiencing major brain fog, not being able to focus at work, etc. Not sure if that's the stress of learning so many new tools, or how I'm feeling (or maybe it's all the Excel sheets... 🤢), but it's impacting how I'm performing.

I just wanna know if someone else in the DevOps world has experienced this, and if so, how did you overcome it? I'm feeling so scrambled 😫

https://redd.it/yx2pkx
@r_devops

Reddit DevOps

DevOps infrastructure from scratch

I'm a long time Network/SysAdmin who wants to move into DevOps and SRE type roles.

I want to set up an environment from scratch implementing best practices, and I need a little guidance with the foundational building blocks and where to start. I want to do this on the cheap using FOSS and low-cost services (but only when necessary). That being said, I don't want to close the door on paid services, especially Azure, as our current application stack is Windows based (and could migrate to Azure in the future, but hopefully not).

* I have a somewhat beefy server (dual Xeon, 192GB RAM, redundant storage) on premises that is a blank slate (but I'd like to use Proxmox as it allows hosting any OS). We have gigabit internet.
* I also have a free tier Oracle Cloud (Ampere aka ARM) account that seems pretty decent.
* Finally, I could add a cheap VPS (think LowEndBox) if there is any benefit. I also have more hardware on premises I can use.

I'd like to start my build with something like Terraform, but Ansible, Puppet, etc. are options. This kind of feels like pulling oneself up by one's own bootstraps. I'm trying to avoid installing directly on my workstation. I'm unclear on where to start.

Eventually I'd progress into Docker (or Podman?), Kubernetes, host my own code repo, monitoring, etc. I guess my confusion is also about the order of operations so that I'm not having to undo/redo things.

Any help or advice is appreciated. Many thanks.

https://redd.it/yx2vci
@r_devops

Reddit DevOps

Harness CI launches the fastest CI leveraging Drone

Interesting announcement from Harness on being the fastest CI on the planet: https://harness.io/blog/fastest-ci-tool There is a repo to test the results. Has anyone tried it?

https://redd.it/yx79jk
@r_devops

Reddit DevOps

Argo Rollouts and Service Mesh: Automate and Control Canary Releases

A canary release, sometimes referred to as a phased or incremental rollout, is a technique for lowering the risk of releasing a new software version into production: the change is introduced gradually to a small subset of users before it is rolled out to the entire infrastructure and made available to everyone. Similar to a blue-green deployment, you start by deploying the new version of your software to a subset of your infrastructure to which no users are routed. When you are satisfied with the new version, you can begin directing a few select users to it. There are various strategies for selecting which users will see the new version: a simple strategy is to use a random sample; some companies release the new version to their internal users and employees before releasing it to the rest of the world; and a more sophisticated approach is to select users based on their profile and other demographics. As you gain more confidence in the new version, you can release it to more servers in your infrastructure and route more users to it.
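
To make the "random sample" strategy concrete, here is a minimal Python sketch of percentage-based user selection. It is not taken from the linked article: the function name `in_canary`, the `salt` value, and the user IDs are all hypothetical. It hashes each user ID into a stable bucket, so the same user keeps seeing the same version while the percentage is raised.

```python
import hashlib


def in_canary(user_id: str, percent: int, salt: str = "release-1") -> bool:
    """Return True if this user falls inside the canary cohort.

    `salt` is a hypothetical per-release string; changing it reshuffles
    which users land in the cohort.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return bucket < percent


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(1000)]  # made-up user IDs
    canary_users = [u for u in users if in_canary(u, 10)]
    print(f"{len(canary_users)} of {len(users)} users routed to the canary")
```

Because the bucket depends only on the salt and the user ID, anyone selected at 10% is still selected at 50%, which matches the idea of gradually routing more users to the new version.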

This article will describe how to use Argo Rollouts and Service Mesh osm-edge for automated, controlled canary releases of applications.

https://blog.flomesh.io/argo-rollouts-and-service-mesh-automate-and-control-canary-releases-c71e5403eb2

https://redd.it/yxdvod
@r_devops

Reddit DevOps

This experiment shows that Harness CI is faster than GitHub Actions

A new article by Drone founder Brad Rydzewski presents data showing that Harness Continuous Integration builds up to four times faster than other leading CI tools. According to the article, this validates the impact of three important new feature enhancements: Cache Intelligence, Test Intelligence (a technology exclusive to Harness CI), and Hosted Builds.

Here is the article: https://www.harness.io/blog/fastest-ci-tool

https://redd.it/yxgg0m
@r_devops

Reddit DevOps

How to help train a newbie?

Hey all,

I just started my first senior role. The team is great and I love working with everyone. I suspect that one of my areas of focus will be training the Junior DevOps Engineer on our team! He's a nice kid who's basically straight out of college and has that "deer caught in the headlights" vibe lol. I don't think he knows much about DevOps and just accepted the job because why not (I was the same at that stage in my career, so I get it).

Any advice for training him up? I was given the sink or swim treatment but that doesn't always work for everyone. He's a bit shy and reserved so I think if I go that route he would just drown without asking for help....

Mostly I've been doing pair programming with him as I learn the system and shadow him but any more advice is always appreciated, I'm sure he'd thank you for any advice you give me as well!

https://redd.it/yxgwj1
@r_devops

Reddit DevOps

Agile workflow with Jira and Git

Does anyone have a good resource to learn the best practices for a small development team (5-10 devs) working in an agile workflow? The process we use at my work is not very efficient but there are some challenges I can't figure out. I would love to hear about the workflows that are successful for you. I googled this but surprisingly I didn't find anything super detailed, just high level principles.

For context we work on web front-ends in React and back-ends (REST or GraphQL APIs) in Node. Database is on Planetscale. We use Jira and deploy on AWS but I would be open to any other tools/platforms.

Here's the process as I understand it, as well as my questions:

1. Developer creates a new branch to work on a feature/bugfix/etc (an item in the sprint)
2. When done, dev creates a pull request to merge their feature branch into the Staging branch
3. Someone (product manager in our case) tests the app against the acceptance criteria and if everything is OK they approve + merge the pull request into the Staging branch
    1. For web front-ends (static sites like React, Vue, etc), all of these feature branches are automatically deployed using branch previews (like in Netlify, AWS Amplify, etc), so the PM can do their testing vs acceptance criteria in these temporary automatic deployments, no issue there. Database branching is done in Planetscale and that part works well too.
    2. But how can/should this be done for web back-ends, or other systems that run on a server? For example a REST API running on Kubernetes or AWS ECS or AWS EC2 etc? Right now these need to be deployed manually, which is a huge pain. Even using terraform/cloudformation there are still manual steps required. Ideally it would be great if every branch was automatically deployed to some unique URL that we can use temporarily for testing.
    3. How do we ensure the front-end is communicating with the correct instance of the back-end (.env management)? Same thing for pointing the back-end to the correct database branch URL. Right now this is done manually with environment variables.
    4. Since the front-end and back-end are in separate Git repos, how do we ensure both of these branches and PRs for the same feature are in sync? How do we avoid having to approve/merge 2 separate PRs in 2 separate repos for the same feature / item in Jira? Is a monorepo the best approach? What if there are separate teams for front-end and back-end?
4. Once the PR is merged, the feature branch is deleted and the dev moves on to the next task in Jira and repeats the process
5. Once all the items we want to include in the next release are done and merged into staging, some more testing is done in staging and finally a release PR is created and merged into main and automatically deployed (CI/CD pipeline automatically runs and deploys the staging and main branches)
6. What do we do if we don't want to deploy every single change that was merged to staging? For example maybe the business decides to delay the release of a certain feature but it was already merged to staging. How can we avoid merging that into main? Do we have to implement feature flags for everything?
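
One possible way to approach the per-branch URL and .env questions above is to derive every environment value from the branch name, so that a front-end and a back-end repo using the same branch name end up pointing at each other. The Python sketch below only illustrates that naming idea under stated assumptions: `preview.example.com`, the `api-` prefix, and the variable names `FRONTEND_URL`, `API_URL`, and `DATABASE_BRANCH` are hypothetical, and the actual deployment step is out of scope.

```python
import re


def branch_slug(branch: str, max_len: int = 40) -> str:
    """Turn a Git branch name into a DNS-safe label."""
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    return slug[:max_len].rstrip("-") or "branch"


def preview_env(branch: str, base_domain: str = "preview.example.com") -> dict:
    """Build the environment variables for one preview deployment.

    All names and the domain are placeholders for illustration.
    """
    slug = branch_slug(branch)
    return {
        "FRONTEND_URL": f"https://{slug}.{base_domain}",
        "API_URL": f"https://api-{slug}.{base_domain}",
        "DATABASE_BRANCH": slug,
    }


if __name__ == "__main__":
    for key, value in preview_env("feature/JIRA-123_Add-Login").items():
        print(f"{key}={value}")
```

A CI job in each repo could call something like this with its own branch name and write the result to a .env file, which removes the manual step as long as both repos agree on the branch naming convention.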

Please forgive me if this isn't the right subreddit for this question, and I would greatly appreciate it if you could point me to a better place for this question.

https://redd.it/yxinb0
@r_devops

Reddit DevOps

Best tips for preparing for technical DevOps interviews. Is grinding Leetcode needed/worth it at all?

Context: I am already a DevOps Engineer and currently looking for a new position. I had some previous experience as a Software Engineer doing Java development but took a break from development for a year as I didn't enjoy programming all day every day. I got a different, more business-focused position at a new company, but opportunities and my skill set brought me through Release Engineering and then into DevOps. Coming to my current role, I knew enough about programming to get the position, but due to this year-long break my programming skills are a little rusty.

In my role I have been doing some light Groovy scripting. Maintaining our pipelines, adding new steps and functionality to a handful of them, but I don't feel like any of the work that I have been doing has been really exercising any HARD programming skills/concepts.

Since I feel it is the most useful/practical language in a DevOps role, and given my knowledge and background in OOP, I've been trying to learn Python from scratch (Bash comes next).

What types of problems/concepts should I be practicing when I am trying to study for the coding portion of technical DevOps interviews? Is grinding LeetCode problems and going through algorithm and data structure problems (stuff I would normally grind if I were going for a software engineer position) worth it, or might it be overkill for the questions I would get asked?

Any input helps! Thank you.

https://redd.it/yxntyg
@r_devops

Reddit DevOps

Server starts dropping http connections after a certain amount of requests

Hello, I'm not sure if this is the right place to ask such a question but I'm trying to get help somewhere as I'm unable to get this resolved in any other places (tried stack overflow, plesk forums, numerous other forums).

I have two domains setup on our server - let's say usersite.com and api.usersite.com. usersite.com is powered by nuxt.js - a front-end framework which runs on node.js. It makes API calls to api.usersite.com, which is a Laravel application. Both of these projects are running inside docker containers. Usersite is using a reverse nginx proxy to the api site.

Now to the problem: when there is slightly higher traffic to usersite (200 users per minute), the API site starts to drop connections, immediately resulting in a 504. Perhaps someone could guide me in the right direction as to why this might be happening? I've noticed that the API website logs show all requests coming from the same IP (the server itself), which means that proxied requests carry the proxy server's IP instead of the client's IP. So perhaps a self-DDoS is happening, where nginx thinks one IP is flooding it with requests and starts dropping connections? What could be a possible solution for this?

What's weird is that it's not an uncommon practice to have back-end separate from front-end and for them to communicate through API with reverse proxy but I can't find any results regarding such issue that I have on Google...
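
On the "all requests come from the same IP" observation: a common pattern is for the reverse proxy to pass the original address along in an `X-Forwarded-For` header (in nginx, `proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;`), and for the backend to trust that header only when the request really came from the proxy. The Python sketch below illustrates that trust rule in general terms. It is not Laravel's implementation, the addresses in `TRUSTED_PROXIES` are made up, and whether this is related to the 504s is an open question.

```python
import ipaddress
from typing import Optional

# Hypothetical addresses of the reverse proxy as seen by the API container.
TRUSTED_PROXIES = {"127.0.0.1", "172.17.0.1"}


def client_ip(remote_addr: str, forwarded_for: Optional[str]) -> str:
    """Return the best guess at the real client address.

    The header is honoured only when the direct peer is a trusted proxy;
    otherwise anyone could spoof it.
    """
    if remote_addr not in TRUSTED_PROXIES or not forwarded_for:
        return remote_addr
    # Walk the chain right to left, skipping hops that are our own proxies.
    for hop in reversed([h.strip() for h in forwarded_for.split(",")]):
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            return remote_addr  # malformed header: fall back to the peer
        if hop not in TRUSTED_PROXIES:
            return hop
    return remote_addr


if __name__ == "__main__":
    print(client_ip("127.0.0.1", "203.0.113.7, 172.17.0.1"))
```

With the real client address available, per-IP logging or rate limiting on the API side would see individual visitors rather than a single proxy address.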

https://redd.it/yxmmxr
@r_devops

Reddit DevOps

Developer self-service portal for Kubernetes/Helm

We are working on a tool that allows **developers** to deploy their own services from a catalog, via a simple UI portal. DevOps engineers can create a catalog of deployable apps via templates. Each template can define custom user inputs and one or more services (Helm charts).

[https://github.com/JovianX/Service-Hub](https://github.com/JovianX/Service-Hub) (Please star ⭐ on GitHub if you think it's cool).

This is an alternative to what currently happens in many organizations where DevOps create hackware solutions for developers to deploy on-demand services with Jenkins Jobs, Scaffold Git repos with custom actions, and so on.

The tool offers a very simple way to create a Self-Service app deployment on Kubernetes with Helm. The tool creates a self-service UI, with custom user-inputs. The user-inputs can be used as Helm values to allow users to configure some parts of the application.

You can define [templates](https://github.com/JovianX/Service-Hub/blob/main/documentation/templates.md), which construct the catalog you expose to developers. An application template can compose multiple helm charts (for example, an app layer that needs a database, somewhat similar to Helmfile).

Here's a simple **Template** example for creating Redis-as-a-Service:

    # Template reference and documentation at
    # https://github.com/JovianX/Service-Hub/blob/main/documentation/templates.md

    name: my-new-service
    components:
      - name: redis
        type: helm_chart
        chart: bitnami/redis
        version: 17.0.7
        values:
          - db:
              username: {{ inputs.username }}

    inputs:
      - name: username
        type: text
        label: 'User Name'
        default: 'John Connor'
        description: 'Choose a username'

The template creates this Self-Service experience [https://user-images.githubusercontent.com/2787296/198906162-5aaa83df-7a7b-4ec5-b1e0-3a6f455a010e.png](https://user-images.githubusercontent.com/2787296/198906162-5aaa83df-7a7b-4ec5-b1e0-3a6f455a010e.png)

We are gathering **feature requests**, and **user** **feedback**.

I would love to read thoughts and get extremely excited by GitHub **STARS**! ⭐

https://redd.it/yxrhxw
@r_devops

Reddit DevOps

Aliasing of EKS endpoint domain

Hello peeps,

Would aliasing `https://<HASH>.gr7.<region>.eks.amazonaws.com` to a custom CNAME, such as `<myClusterName>.<region>.domain`, be bad practice? The goal is a predictable endpoint that can then be hardcoded in some places. Any advice against or in favor of this?

Thank you for your input.

https://redd.it/yxpeo0
@r_devops

Reddit DevOps

Best options for SLA/SLO tracking outside of Datadog

We have very basic needs:

- Monitor uptime of a MongoDB Atlas cluster
- A few EC2 instances
- Need to ping a front-end React app
- Need to ping a GraphQL API endpoint for uptime

That’s about it.

I’ve set this up with Datadog, but I'm worried about the cost, not today, but in two years.

Are any other APMs going to be that much cheaper while still doing it all with one account?

https://redd.it/yxo7t0
@r_devops

Reddit DevOps

How do you track/help onboarding to on-call?

When it comes to something like interviewing, ramping someone to run interviews often involves a process of shadowing for a number of times and some level of feedback before you become officially 'ramped'.

When I've led teams before, I've tracked which incidents people have been involved in and which services they've touched. But I never had a proper structure for the onboarding, probably because:

- Incident training often requires participating in real incidents, which can’t be scheduled in advance.

- When one does occur, responders want to focus fully on the incident: they don’t want to be searching for an onboarding spreadsheet, making coordinating onboarding a low priority.

- Incidents are varied, as is the way people participate in them, making it difficult to understand what qualifies as ‘training’.

I wondered if people have had more structure than me on this, and if so what and how are they tracking it?

The context is we're considering building this into our product (incident.io) as a concept of onboarding programmes, where you can say:

> You're ramped to handle SRE incidents once you've shadowed the lead for >3 incidents involving either Postgres, ElasticSearch, etc, and led at least one yourself

And want to know how/if people are doing this already.
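
To make the quoted rule concrete, here is a minimal Python sketch of how such a check could be evaluated from a log of incident participation. It is not how incident.io implements anything: the `Participation` record, the role names, and the service labels are hypothetical. It reads ">3 incidents" as at least four shadowed incidents, and it assumes the led incident must also involve one of the listed services.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participation:
    """One responder's involvement in one incident (hypothetical schema)."""
    responder: str
    role: str            # "shadow" or "lead"
    services: frozenset  # services touched by the incident


REQUIRED_SERVICES = frozenset({"postgres", "elasticsearch"})


def is_ramped(responder, history, min_shadowed=4, min_led=1):
    """Apply the rule: shadowed more than 3 relevant incidents and led one."""
    relevant = [
        p for p in history
        if p.responder == responder and p.services & REQUIRED_SERVICES
    ]
    shadowed = sum(p.role == "shadow" for p in relevant)
    led = sum(p.role == "lead" for p in relevant)
    return shadowed >= min_shadowed and led >= min_led


if __name__ == "__main__":
    history = [
        Participation("sam", "shadow", frozenset({"postgres"})),
        Participation("sam", "shadow", frozenset({"elasticsearch"})),
        Participation("sam", "shadow", frozenset({"postgres", "redis"})),
        Participation("sam", "shadow", frozenset({"redis"})),  # not relevant
        Participation("sam", "lead", frozenset({"postgres"})),
    ]
    print("sam ramped:", is_ramped("sam", history))
```

Recording participation as a side effect of the incident itself, rather than in a separate spreadsheet, would sidestep the coordination problem described in the bullets above.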

https://redd.it/yxnd4o
@r_devops

Reddit DevOps

My mandate is being moved from “DevOps” to “Developer Experience.” Has anyone else made this switch?

Context: Been overseeing the devops for an ecomm company for a little over three years. We brought in a new CTO from a rival startup earlier this year who seems to be way more plugged in to trends in the broader developer community than most of us.

After mentioning “Developer Experience” without much explanation, he’s formally asked me to make it my priority for 2023.

The problem I’m having is that there doesn’t even seem to be a crystallized consensus on what “Developer Experience” means.

From my early research it’s everything from building new CI/CD frameworks to “making sure the developers have the muffins they like.”

Hoping to get any insights you might have on best practices as well as what falls under this responsibility so I can start making a plan.

https://redd.it/yxxeen
@r_devops

Reddit DevOps

Branching and deployment strategy for continuous integration

What branching/merging/deployment strategy would you use for a development team of 5 developing a webapp with 10,000 users (not small, not large)?

Currently we have three environments: development, staging, production. Features are developed on feature branches and merged to master, causing an auto-deployment to staging. After smoke testing on staging the developer click-ops to production.

If an issue is discovered on staging, the developer creates a new branch (hotfix) which is merged again to master. There is no way to reverse the feature branch merge to master after the fact.

An added complication: if production ever goes down while the master branch is compromised, the system will auto-deploy the compromised master branch to production.

Also, the development environment is a free-for-all.

There has to be a better approach...

https://redd.it/yxzi8d
@r_devops

Reddit DevOps

NGINX / NGINX Ingress / Envoy WAF Comparison

https://www.openappsec.io/post/comparing-nginx-waf-solutions-nginx-app-protect-waf-vs-open-appsec-open-source-ml-based-waf

The article compares the NGINX App Protect signature-based WAF solution with a new open-source initiative called “open-appsec,” which builds on machine learning and can be deployed as an add-on to both the open-source and premium (Plus) versions of NGINX and NGINX Ingress.

Documentation here: https://docs.openappsec.io/getting-started/start-with-kubernetes

https://redd.it/yy1l00
@r_devops

Reddit DevOps

What is the point of having both a develop and a main branch aiming to be in sync?

I often notice teams have a develop branch from which they cut feature branches, only to merge them back into develop and then merge develop into main.

What's the point? Seems like double bookkeeping to me.

https://redd.it/yy2wz7
@r_devops

Reddit DevOps

Uptime for MongoDB Atlas? No luck with asking Atlas and nothing in the Datadog integration

I feel like I’m just getting poor support, and I’m a lazy docs reader, but I can’t seem to find any way to easily get the uptime of a MongoDB Atlas cluster.

There is a mongo serverStatus command you can run, but you need to run it on each node, AND it just tells you how long the mongod process has been running. I’m guessing that isn’t the same as “uptime for the cluster”, because a node being spun up or down doesn’t necessarily mean we had downtime (from the perspective of a MongoDB Atlas cluster consumer/user).

Are people just not measuring SLAs for the DBs lol? How does atlas measure their own SLA lol
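
One way to get a consumer-side number, sketched below in Python, is to treat uptime as the fraction of periodic probes that succeeded, where each probe is a trivial operation run through the same connection string the application uses. This is an assumption about how one might measure it, not how Atlas computes its own SLA. The probe itself (for example a driver-level ping) is left out, and the function names are hypothetical.

```python
from datetime import timedelta


def availability(probe_results):
    """Fraction of probes that succeeded, as seen by a cluster consumer."""
    if not probe_results:
        raise ValueError("need at least one probe result")
    return sum(1 for ok in probe_results if ok) / len(probe_results)


def estimated_downtime(probe_results, interval):
    """Rough downtime estimate: each failed probe counts as one interval."""
    failed = sum(1 for ok in probe_results if not ok)
    return failed * interval


if __name__ == "__main__":
    # Made-up data: one probe per minute for a day, three of them failing.
    results = [True] * 1437 + [False] * 3
    print(f"availability: {availability(results):.4%}")
    print(f"estimated downtime: {estimated_downtime(results, timedelta(minutes=1))}")
```

Since the probe goes through the cluster endpoint rather than a single node, a node being replaced only counts as downtime if the probe actually fails during the changeover.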

https://redd.it/yy55ra
@r_devops

Reddit DevOps

NPM version in container environments

I’ve recently begun a new job and found something interesting.

I’ve noticed this pattern where SWEs will make commits simply to bump their package.json version. This of course triggers a new build on their default branch. The result is that the thing they are applying a git tag to isn’t the image that was tested in a lower environment. (We do at least promote properly, so there’s no rebuild on tags.)

So I’m curious: how do you guys handle apps that are npm apps but are really REST APIs? In the past I’ve always set the package.json version to 0.0.0 and disregarded it, as I prefer git tags/image tags as the source of truth. For actual npm packages, of course, the typical process is used.

https://redd.it/yy85hl
@r_devops

Reddit DevOps

How do you yaml

A?:

accessModes:
- ReadWriteOnce

or

B?:

accessModes:
- ReadWriteOnce

Personally, I can't even with B. I don't know if it's some sort of chemical imbalance in my brain but I get ultra confused if I see yamls structured this way.

I want to know if I'm the only one or not. No explanation necessary. You do you.

View Poll

https://redd.it/yya8p7
@r_devops
