Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Best Practices for Managing a Large Number of Subscriptions?

I manage around 14 Azure subscriptions and it's expected to keep growing. Most of them were created by developers before I joined the team so were built via Click-Ops. I'm trying to push the move to IaC.

Originally I had the idea to create a repo for each subscription but it's proved to be quite tedious to configure and most aren't being utilised anyway. I now have a new idea of a factory: A single pipeline with branches for each of our common templates. With the factory, a developer could run the factory pipeline, select the "App Service Plan" branch, enter in the parameters required (subscription name, name of the project, etc) and it will just spit out an App Service Plan to the chosen subscription.

I think this would be a great experience for the developers as it would then be all GUI based but it then means the infrastructures aren't actually recorded in code but are just a handful of templates that are frequently used to push things out.
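To make the auditability concern concrete, here is a minimal sketch of how a factory run could still leave a record in code. It assumes the pipeline writes each request's parameters to a JSON file that then gets committed to a repo. The names `FactoryRequest`, `record_request`, `KNOWN_TEMPLATES`, and the `deployments/` folder layout are all hypothetical, not something Azure or any pipeline tool provides.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path

# Hypothetical template identifiers; the post only names "App Service Plan".
KNOWN_TEMPLATES = {"app-service-plan"}


@dataclass(frozen=True)
class FactoryRequest:
    """Parameters a developer would enter when running the factory pipeline."""
    template: str
    subscription: str
    project: str
    requested_by: str


def record_request(request: FactoryRequest, repo_root: Path) -> Path:
    """Write the request to a JSON file so the run can be committed and audited."""
    if request.template not in KNOWN_TEMPLATES:
        raise ValueError(f"unknown template: {request.template}")
    # Assumed layout: deployments/<subscription>/<project>-<template>.json
    target = (
        repo_root
        / "deployments"
        / request.subscription
        / f"{request.project}-{request.template}.json"
    )
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(asdict(request), indent=2, sort_keys=True) + "\n")
    return target
```

With something like this, the GUI experience stays the same for developers, while every deployment ends up as a reviewable file per subscription rather than only as pipeline run history.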

I was wondering what more experienced people think of this idea - Would it be considered bad practice from an auditability perspective? I am really struggling to find anything about IaC best practices in general so anything you can share would be great.

Thank you!

https://redd.it/1dykmdb
@r_devops
Seeking Advice: Backend Services for (Flutter) Mobile Apps


# Hello Fellow Developers,

I'm reaching out to gather some insights on backend services for Flutter mobile apps. There's an overwhelming amount of information available, and I would greatly appreciate some clarity on a few points.

Specifically, I'm interested in the differences between using Firebase and a self-hosted solution (such as using AWS).

Firebase: It offers a lot of out-of-the-box solutions, which can be very appealing. However, I've heard that it can get quite costly, especially when it comes to downloading files. Would this be a significant concern for my clients?
Self-Hosted Solutions: On the other hand, these can potentially offer more control and scalability. But I'm curious about the additional effort required to set up and maintain such a solution. Is it worth it to offer this to my clients?

For context, I'm looking to develop apps for businesses, I am trying to provide them the most value possible, and I am wondering if it's worth it to offer self-hosted solutions or stick to Firebase. I am not concerned about anything but value for them.

I'd love to hear your experiences and recommendations regarding these backend options. Does it really matter which one I choose, given the specifics of my situation? Any feedback or advice would be greatly appreciated!

Thank you in advance for your help!

https://redd.it/1dytmdj
@r_devops
I want to treat my IIS logging, like I do my container logging, with a log collector?

hi,

I'm thinking that instead of my web app logging directly to Elasticsearch (ES), it ought to write to stdout, with a log collector (Logstash/Fluentd?) for each of my sites (100+) shipping those logs to ES. I've got something like this working by writing to the event log and shipping those events with Winlogbeat, but it doesn't feel right, not least because my Windows event logs/discs are spinning trying to keep up with the events per second I want to ship. Is this the right approach, or should I write to stdout/stderr and have a different collector do the shipping for me without my discs spinning so much? Thanks,
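To illustrate the "write to stdout and let a collector ship it" idea, here is a minimal sketch in Python. The web app in question is presumably not Python, so treat this as language-agnostic pseudocode that happens to run. The names `JsonLineFormatter` and `build_stdout_logger` are made up for the example; the point is only that each event becomes one JSON line on stdout that a collector can pick up.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonLineFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def build_stdout_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to stdout (or a given stream)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called twice
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    logger.propagate = False  # keep output on this handler only
    return logger


if __name__ == "__main__":
    log = build_stdout_logger("site-001")  # hypothetical site name
    log.info("request handled in %d ms", 12)
```

One JSON object per line keeps the app unaware of where logs end up, which is the same contract container logging relies on.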

https://redd.it/1dyyj4u
@r_devops
Azure Container App deployed from GHCR with Github Actions does not make a new revision

I am trying to build a container and then deploy it to ACA via GitHub Actions. I use a SHA tag, not latest, and the run says "Your container app hyde has been created and deployed! Congrats!" with all of the correct image names and tags, but in Azure I see no evidence of a new revision being made.

Please see my run here: https://github.com/r-Techsupport/hyde/actions/runs/9850292558/job/27195481063

https://redd.it/1dyznki
@r_devops
Flyway for database migration

So I am currently learning how to adopt Flyway within our current technology stack. As of now we use a locally developed tool for database migration, but since it was developed 10 years ago, it is showing signs of its limitations and no one wants to touch the original source code, so we opted to look for a tool that does the same job. So far Flyway is the first option we have. Just to clarify: are we able to save versions of schemas and access those older versions for testing? As of now, all I am seeing is that you have to do it manually, since Flyway is a versioned migration tool and accessing your older version will be harder than it is when done manually... Is that correct?

https://redd.it/1dyygv3
@r_devops
Should I keep my CCNP current or let it go

I'm wanting to spend my final years in tech (like my last 20 or so years, lol) working with cloud and DevOps. I like the environment better than the network teams I've been on through the years, I like working with code more, honestly, I like the cloud more. That being said, it's come time to renew my CCNP and I'm running out of time to do it with CEs. I'm honestly thinking of just letting it go. I'm starting to really hate Cisco and the money-grabbing thing they've become anyway. Is it important to keep if I want to make this transition? I'm thinking of focusing more on AWS certs if I'm wanting to show potential employers certifications.

https://redd.it/1dz1iuk
@r_devops
Going from 30 to 30 Million SLOs Observability Meetup

Hey everyone!

If you're in London (UK) this week and are interested in Observability, make sure you drop by the Observability Engineering Meetup on Thursday, July 11th.

Alex, Senior Site Reliability Engineer at Google, will present the evolution of Service Level Objectives (SLOs) for the GCE Compute API over the past eight years. He'll start with the initial 30 SLOs, move through a phase with around a thousand, and end with millions of per-customer SLOs. He'll share anecdotes, techniques for handling low-QPS (continuous over discrete metrics), and strategies for aggregating data to enhance leadership visibility. He'll also give practical tips for running and improving this system in production.

You can RSVP here: https://www.meetup.com/observability_engineering/events/301637095/

See you there :D

https://redd.it/1dz17p7
@r_devops
Got charged for DB2 after free trial credits ended - no invoice was sent before.

In April this year, I started a DB2 instance purely for exploration as I was looking for jobs, and DB2 was one of the required elements for that job. Soon after, I forgot about it, and the instance kept running.



Fast forward to this month, and I see a charge of $140 on my Amex from IBM Canada. I instantly realized something was left running, so I logged into IBM Cloud, hopped on chat support, and got the instance deleted. I inquired about a refund as it was a genuine mistake, and the agent asked me to create a ticket. While logged in, I noticed that there were no invoices issued for April and May. They billed me for June and issued me an invoice in July after I got charged.



The ticket I created was declined for a refund as the analyst said she couldn't do anything about it since I entered my card details and upgraded to a pay-as-you-go account. I argued with her about why I was not issued an invoice for April and May. I have $0.00 invoices from GCP. Aren't they legally obliged to issue an invoice for services being used? I checked my audit logs and showed her that I hadn't logged in since the day after I created my account, except now to create the ticket. I insisted on talking to a senior agent, but it doesn't look like she is going to comply, and I have another $40 charge coming up next month for usage of 8 days in July. The support staff seems to be outsourced to India, and from the conversation, it doesn't appear like they are going to issue a refund or credit.



Is there an escalation system at IBM support, or am I left for dead? I am considering disputing the charge with Amex. What would be a strong reason to win this dispute? I don't care if my IBM account gets banned; I just want to limit my losses. It doesn't look like a lot, but for a person searching for a job, it hurts.

https://redd.it/1dz5587
@r_devops
Load balancing Airbyte workloads across multiple Kubernetes clusters

How do you load balance long-running Kubernetes workloads across multiple clusters?

At Airbyte, as part of supporting multiple geographic regions for data replication workloads, we adopted a control-plane/data-plane architecture. A control-plane orchestrates data movement workloads across multiple data-planes. Technically speaking, each plane is a Kubernetes cluster.

Our solution to load-balancing workloads across multiple data-planes is to push down the responsibility to the data-plane itself. We enqueue workloads in a single job queue and let the data-planes compete for jobs to process if they have capacity to do so. This has the benefit of treating capacity as a problem that is local to a cluster, removes the complexity of planning ahead for available resources, and keeps operations simple when facing cluster downtime.
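As a rough illustration of the pull-based idea described above (not Airbyte's actual implementation), here is a minimal in-memory sketch. The `DataPlane` class, `run_polling_round` function, and the use of a `deque` as the shared job queue are all assumptions made for the example; a real system would use a durable queue and concurrent workers.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DataPlane:
    """A cluster that claims jobs only while it has spare capacity."""
    name: str
    capacity: int
    running: list = field(default_factory=list)

    def has_capacity(self) -> bool:
        return len(self.running) < self.capacity

    def poll(self, job_queue: deque):
        """Claim at most one job from the shared queue; return it or None."""
        if not self.has_capacity() or not job_queue:
            return None
        job = job_queue.popleft()
        self.running.append(job)
        return job

    def finish(self, job) -> None:
        """Mark a job as done, freeing capacity for the next poll."""
        self.running.remove(job)


def run_polling_round(job_queue: deque, planes: list) -> dict:
    """Let every data plane poll once; return which plane claimed which job."""
    claimed = {}
    for plane in planes:
        job = plane.poll(job_queue)
        if job is not None:
            claimed[plane.name] = job
    return claimed
```

Nothing in the sketch plans placement centrally: a full or offline data plane simply does not claim work, and the jobs stay in the queue for whichever plane polls next.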



https://redd.it/1dz4tih
@r_devops
Gitlab - Syft/grype: Are there any GOOD resources to learn how to set up?

I'm new to DevOps. I, along with a coworker, am tasked with getting container vulnerability scanning and SBOM generation set up. I've been looking for a decent video or webpage that goes over the implementation of syft and grype but have failed to find one. Even the one posted in the documentation section refers to a video that I don't think helped me much. Could be that I just don't understand what exactly I am supposed to take from our AWS EKS images/containers to input into the .gitlab-ci.yml file. Does anyone have any tips and/or sites they can refer me to so I can get a better understanding of the steps involved? And before you ask, no, we don't have the option of using an alternative. This is what THEY want and paid for (GitLab Ultimate).

https://redd.it/1dz2vwc
@r_devops
Cloud Deploy Skaffold overwriting Terraform

Hello, does anyone have experience using Cloud Deploy / Skaffold in conjunction with Terraform?

I'm setting up a Cloud Deploy pipeline for the first time (previously had a simple Cloud Build setup for deployments). However, I noticed that my server configuration defined in Terraform (e.g. scaling, service account, etc.) is being overwritten by new Cloud Deploy releases.


Question: Is there a way for Cloud Deploy / Skaffold to only update the container's image while leaving all other parts of the configuration alone, to be managed by Terraform?


skaffold.yaml:

apiVersion: skaffold/v3alpha1
kind: Config
deploy:
  cloudrun: {}
profiles:
- name: development
  manifests:
    rawYaml:
    - run-development.yaml

run-development.yaml

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    spec:
      containers:
      - image: image


I can migrate all the config to skaffold, but I'd prefer to keep it in Terraform.

https://redd.it/1dzafii
@r_devops
Miss the ctrl-alt-delete life.

Just had nostalgia now when I thought back to the days when computers became popular. Good old days!

https://redd.it/1dzcktl
@r_devops
Dynatrace vs. Datadog with AWS monitoring

Hello all! I am wondering if anybody has experience with both Dynatrace and Datadog in terms of monitoring AWS specifically. I am not able to identify the pros and cons of each since I don’t have access to both tools. Any shared experience would be greatly helpful, thank you!


https://redd.it/1dza6nv
@r_devops
cloud for backend services

I'd like to develop and manage a few backend services for a future commercial side project. I'm proficient in Java/SQL, but I've seen some reports that even JS/Node.js seems to perform better in the cloud, so I'm open to switching stacks.

On the other hand, I work with Quarkus/Kubernetes on a daily basis. I love Kubernetes, and Quarkus addresses some issues with Java-based containers, so this is my preferred stack. I also like to have full control over how requests are processed, so I'm a bit resistant to something like functions.

I assume the traffic would be pretty low, but the availability must be 24/7 at a low/reasonable price.

I did some research but still cannot decide. Switching in the future would be hard, especially if I use functions offered by a provider instead of Docker containers with Quarkus/Spring Boot.

I think I should decide between:

* digital ocean: managed kubernetes / app platform + managed postgres

Looks like a dead simple solution from devops point of view, I know the max cost so I can sleep well at night

* gcp: bigtable + cloudrun looks like an almost ideal solution

The only thing is I have no trust in google and their support, and I prefer working in aws terminal than gcp

* aws: with Lambda and DynamoDB it probably could cost me nothing. RDS, I guess, could be too costly at the beginning

But it requires spending much more time now and in the future. Even choosing the right services among so many requires some research, and I would also need to learn how to work with DynamoDB, design a proper schema, and get familiar with the billing of other services, CloudWatch, etc.

Do you have any strong opinion on which way to go (or not go) from a DevOps point of view?

https://redd.it/1dzdxyk
@r_devops
splitting larger commits into smaller commits

Hi all, I wanted some advice on how to split a larger commit into smaller commits. The intent is to take the smaller commits and create smaller PRs. In a nutshell, I created a giant PR that is really hard to review, and I don't want to stress out other devs.

These are the steps I was taking to split a larger commit into smaller commits. It has a code smell, so please let me know what I'm doing wrong.

git checkout <largeCommitHash>
git rebase -i HEAD~1
The interactive rebase editor will open.
Change the pick command to edit for the commit you want to split
git reset HEAD^
git add changes I want for first part of the commit
git commit -m 'smallerCommit1'
git add other changes
git commit -m 'smallerCommit2'
git rebase --continue
git checkout -b smallerCommits
git checkout original_branch_with_large_commits
git rebase smallerCommits
git push origin original_branch_with_large_commits --force

Thank you. Please let me know if anything is unclear.

https://redd.it/1dzjveb
@r_devops
Is Serverless becoming a hated company now?

They keep optimizing to make money by not releasing features in v3

https://redd.it/1dzlbed
@r_devops
Need help to create a CI/CD pipeline

I am new to DevOps and the only member of the team. I have to create a CI/CD pipeline to deploy service now code. I have to decide on tools for source code, artifacts, builds, testing, and quality. I think GitHub for source code and Jenkins as the build tool would be ideal. Please help me define all the aspects of creating a CI/CD pipeline.

https://redd.it/1dzm795
@r_devops
On-premise infrastructure vs. hosting with the hyperscalers

Compare the Total Cost of Ownership (TCO) of running on-premise cloud infrastructure and hosting with the hyperscalers. Use the ShapeBlue calculator to evaluate the costs of using hyperscalers like AWS, Azure, or GCP versus managing your own infrastructure. What savings can you achieve with Cloud Repatriation?

This Cloud Cost Calculator allows you to compare the total costs associated with running workloads on different hyperscalers against on-premises workloads using Apache CloudStack. With the calculator, you can see the TCO for running your own infrastructure for 36 months and compare it against using instances/virtual machines from AWS, Azure or GCP. Operating an on-premises Apache CloudStack infrastructure involves expenses for datacenter facilities, software, hardware, licensing, and support.  The calculator is built in Microsoft Excel and is customisable for your needs.

https://www.shapeblue.com/cloud-cost-calculator-and-cloud-pricing-report/

https://redd.it/1dzq0e7
@r_devops
Mastering GitOps: ArgoCD vs. FluxCD - Complete Guide with Demo

I wrote a blog for beginners, comparing ArgoCD and FluxCD for mastering GitOps in Kubernetes. It covers core principles, key features, installation steps, and best practices: https://www.cloudraft.io/blog/argocd-vs-fluxcd

https://redd.it/1dzrep6
@r_devops
Istio Service Mesh inter-service communication.

So I am pretty much a beginner in DevOps. I have been asked to design an architecture with 4 microservices (let's call them A, B, C, D). A and B are public facing, so each of their Deployments in K8s is exposed by a LB Service behind a single Ingress.

However, C and D are not public facing. Only B can communicate with the C and D microservice deployments.

All deployments are autoscaling and the org is using Istio Service Mesh. My question is:
1. Is there any specific library to connect to the proxy sidecar from the service Pod (apps A and B running in their respective Pods are written in Go and Java, while C and D are written in Node)?

2. When B communicates with C through the Istio proxy, will Istio load balance across C's Deployment Pods automatically?

3. A emits an event which D needs to listen to. The org has proposed Kafka. How would an app running in a K8s Deployment emit and receive events (I know it is a noob question)?

Any help is much appreciated.

https://redd.it/1dzs6fy
@r_devops