Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Am I a bad platform engineer or am I just a bad platform engineer at this company?

I am struggling to understand what to do. I switched jobs last year because while I was kicking ass at a mid-sized local company on a small devops team, I wasn’t getting paid quite what I thought I should be. So, I switched to a senior platform engineering role at a coastal company, fully remote and got a significant pay bump. However, I have been floundering according to my superiors. I recently got feedback that they’re questioning why I’m a senior.

This company is very large and the infrastructure is both complex and broad. The architects and people who’ve been at the company for years are super smart, super capable and do things much faster than me. It’s been depressing. I feel like I was a big fish in a small pond in my last job, where I almost singlehandedly migrated the company’s cloud apps to a Kubernetes cluster with a service mesh. But now I feel like a small fish in a big pond. And it’s not that I don’t understand IaC, K8s, CI/CD or any of the cloud native technology. It’s more that I haven’t been able to complete any big projects or show off any high-visibility effort because everything I have been doing has been internal to our team. I do struggle to code well (this position uses more JavaScript than I’ve ever had to touch before) so I have had to learn a lot. But I also struggle with the processes of creating design documents and goal gates. I just wanted to build cool stuff and instead I’m more or less vacuuming the floors. I don’t know if working for a multinational company with high-level engineers is for me, but I don’t know where else to go that will pay me this much. I’ve been doing this for 8 years. I have taught devops principles at tech conferences, and somehow I still can’t keep up.

EDIT: I’ve been at this new company for almost a year. I got a bad annual review last year but I had only been there a few months and I chalked it up to a steep learning curve. Then, my boss and three other people on my team left. I was actually doing pretty well over the past few months but the feedback that I’m struggling is coming back.

https://redd.it/13dbccy
@r_devops
Running the Datadog agent as a sidecar container for APM in ECS Fargate

Hi,

We are currently on ECS EC2 and run one Datadog agent per host. Now we want to move to Fargate and are debating how to integrate the Datadog agent.

There are two (?) ways:

- Run the Datadog agent as a sidecar, one per ECS task.

- Send APM metrics from the application to an external Datadog agent, which forwards them to Datadog.

The first option is clearly the easiest.

However, there is concern that running a sidecar per task (we would usually run around 1000 ECS tasks) would add considerable CPU/memory overhead to each task.

But considering the overhead of maintaining separate Datadog agents and updating the application to send metrics to them, I think running sidecars is the more efficient solution.

How can I convince the team that running a sidecar does not drain too many resources?

Do you guys know any numbers on this? Or any advice on this?

Thank you.
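
For a rough sense of scale, the overhead question above comes down to simple arithmetic. The sketch below is only an illustration: the sidecar and task sizes are placeholder values, not Datadog measurements or recommendations, and the class and field names are made up for this example. The one fixed fact it relies on is that ECS expresses CPU in units where 1024 units equal 1 vCPU.

```python
from dataclasses import dataclass


@dataclass
class SidecarEstimate:
    """Back-of-the-envelope cost of one agent sidecar per ECS task."""

    tasks: int               # number of running ECS tasks
    sidecar_cpu_units: int   # CPU reserved for the agent (1024 units = 1 vCPU)
    sidecar_memory_mib: int  # memory reserved for the agent
    task_cpu_units: int      # total CPU size of one task
    task_memory_mib: int     # total memory size of one task

    def total_vcpu(self) -> float:
        """vCPUs consumed by all sidecars together."""
        return self.tasks * self.sidecar_cpu_units / 1024

    def total_memory_gib(self) -> float:
        """Memory consumed by all sidecars together, in GiB."""
        return self.tasks * self.sidecar_memory_mib / 1024

    def cpu_share(self) -> float:
        """Fraction of each task's CPU taken by the sidecar."""
        return self.sidecar_cpu_units / self.task_cpu_units

    def memory_share(self) -> float:
        """Fraction of each task's memory taken by the sidecar."""
        return self.sidecar_memory_mib / self.task_memory_mib


# Placeholder numbers only: replace them with values measured on a real task.
estimate = SidecarEstimate(
    tasks=1000,
    sidecar_cpu_units=128,
    sidecar_memory_mib=256,
    task_cpu_units=1024,
    task_memory_mib=2048,
)

print(f"all sidecars: {estimate.total_vcpu():.1f} vCPU, "
      f"{estimate.total_memory_gib():.1f} GiB")
print(f"per task: {estimate.cpu_share():.1%} CPU, "
      f"{estimate.memory_share():.1%} memory")
```

Running one task with the sidecar for a day and plugging the observed agent usage into a calculation like this would give the team real numbers to weigh against the cost of operating a separate agent fleet.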

https://redd.it/13dbpj4
@r_devops
What’s the deal with Azure DevOps?

I’ve been looking for a new job and for some reason most of the recruiters here (Costa Rica) need Azure DevOps specifically. Most of my experience is with GCP and AWS. I’ve worked with Azure, just not in a DevOps-type role. What’s the deal? Are there features/cost savings that make it popular?

https://redd.it/13d5u1c
@r_devops
Local Development Technologies?

Currently working for a startup that does local development via docker-compose, mounting the source code into the containers as volume mounts. It works, but as the app gets bigger it becomes more of a burden on people's machines to run it all, and compiling/pulling dev images is cumbersome when libraries need to change.

What technologies is everyone using to do local development? What have you found that works and what should you stay away from?

Currently looking at switching to Telepresence or something like DevSpaces with our migration to K8s, but not sure yet.

https://redd.it/13d1i36
@r_devops
I can't believe it's not DNS - Or how I spent a weekend fighting networking in Kubernetes

This is a short writeup on the debugging I did trying to figure out why DNS was completely broken in my fresh Kubernetes install. It seems like it's something that could be hitting a lot of deployments so I hope some of you find it useful.

I should also add that networking is not my strongest domain, so please let me know if you see any inaccuracies or have something to add!

https://redd.it/13dj0bh
@r_devops
Is it okay if IaC is not part of CI/CD pipeline but a separate process?

Let's consider there are multiple projects (development teams) inside one organization. Each one works on its own product with DevOps goals in mind so it can deliver software quickly and reliably to its users with the help of a CI/CD process. One thing that is not automated *by product teams* is the process of infrastructure provisioning. It is assumed they are provided with infrastructure by the infra team to deploy their project to (for example, a slice of the on-prem Kubernetes cluster).

In that same organization, there is an infra team that manages the company's on-premises infrastructure. They are responsible for managing bare-metal servers, VMs, k8s clusters etc. They do this in an IaC manner so (apart from bare metal maybe) they can provision a set of VMs, k8s clusters or new tenants in an existing k8s cluster with the click of a button.

As you can see, there is a clear separation between infrastructure provisioning (good thing it is IaC though) and delivery of the software on that infrastructure (the CI/CD pipeline).

If teams were deploying their software to the cloud, they could provision the infrastructure they need during the CI/CD process because:

- the cloud has "infinite" resources

- there is an API, so it is self-service

- there is a natural force that keeps you from requesting too much: cloud costs

Although the self-service part could be solved on-premises by deploying some sort of private cloud, resources are not "infinite" in this case and no real money is involved. Currently, when a new product team requests a slice of infrastructure, there is a manual sanity check of whether the request seems reasonable for the project's scale, and it might require approval from a manager. After approval, the infra is provisioned with a single click of a button and the product team can run their project on it.

What would you say about DevOps practices at that company? Is that separation between infrastructure provisioning and the CI/CD process a common thing, or is it a priority problem to be addressed?

https://redd.it/13djxu0
@r_devops
Database VM access restriction from Kubernetes pods

Hi

We have a MariaDB database running on a VM, and applications deployed on Kubernetes that connect to it.

The problem is that, in order to give pod A access with username X, we have to allow all Kubernetes IP ranges because the pod doesn't have a static IP. As a result, pod B can also try to connect to the database using username X.

Is there any solution to this issue?

Update 1: If pod B somehow gets A's credentials, it can connect to the database as well. If A and B were hosted on separate VMs instead, we could tell MariaDB to only allow connections from A's VM IP.

Update 2: Our K8s cluster and VMs are on-premises.
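
To make the problem concrete, here is a minimal sketch of a source-address check. It only models the idea of "allow this network". It is not MariaDB's actual host-matching code, and every address and range in it is a made-up example.

```python
import ipaddress


def is_allowed(client_ip: str, allowed_network: str) -> bool:
    """Return True if client_ip falls inside the allowed source network."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(allowed_network)


# Hypothetical addresses, for illustration only.
POD_CIDR = "10.244.0.0/16"   # the whole cluster pod range
POD_A_IP = "10.244.1.17"     # pod A, which should reach the database
POD_B_IP = "10.244.3.42"     # pod B, which should not
VM_A_ONLY = "192.168.10.5/32"  # a rule pinned to a single VM address

# Allowing the pod range admits both pods, so the source address
# no longer distinguishes A from B.
print(is_allowed(POD_A_IP, POD_CIDR))   # True
print(is_allowed(POD_B_IP, POD_CIDR))   # True

# A single-host rule, as in the VM case from Update 1, rejects pod B.
print(is_allowed(POD_B_IP, VM_A_ONLY))  # False
```

With a range-wide rule the network check passes for every pod, so the username and password become the only thing separating pod A from pod B.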

https://redd.it/13dkxro
@r_devops
Chicken-and-egg issue with repos, CIs & k8s - sanity check

I need a sanity check/help with a bit of a chicken-and-egg issue we have:

Setup:

NX-based monorepo
GHA for CI
CDKTF for cloud config, using Terraform Cloud for state
AWS for cloud

We're attempting to have one monorepo for all of our services. This also includes (for better or worse, but it shouldn't really matter here) keeping our cloud config in the same monorepo.

We are using CDKTF because the majority of our stack is TS and it's familiar to our devs. We have one app in the monorepo called cloud-environment which is responsible for all of the terraform config.

The issue we're having is that, when we want to deploy a new service, there are three steps we must go through:

1. Create a repo in ECR for the service images
2. Build the image via CI and upload to the repo
3. Trigger a deployment/redeployment in k8s

Steps 1 and 3 are defined in the cloud-environment app. Step 2 is its own app in the monorepo and has its own build/package/upload targets. The problem is that step 1 must always have run before steps 2 and 3, but steps 1 and 3 run at the same time because they're part of the same config. The second problem is that we don't want to make step 2 dependent on cloud-environment, because there's no good way, with CDKTF running remotely, to parse the output of a speculative plan and then control the apply using a PR.
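
The ordering problem above can be sketched as a small dependency graph. This is an illustration only: the step names are invented labels for the three steps, and the graph is a simplified model of the setup, not CDKTF or Nx configuration. It uses `graphlib` from the Python standard library (3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def run_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return a valid execution order; each key maps to the steps it depends on."""
    return list(TopologicalSorter(dependencies).static_order())


def is_orderable(dependencies: dict[str, set[str]]) -> bool:
    """Return False when the dependencies contain a cycle."""
    try:
        run_order(dependencies)
    except CycleError:
        return False
    return True


# Steps 1 and 3 modelled as separate units: a clean chain.
split_steps = {
    "create_ecr_repo": set(),
    "build_and_push_image": {"create_ecr_repo"},
    "deploy_to_k8s": {"build_and_push_image"},
}

# Steps 1 and 3 bundled into one unit (hypothetical label "cloud_environment"):
# the image build needs the repo from that unit, while the deployment inside
# that unit needs the image, so the two units depend on each other.
bundled_steps = {
    "cloud_environment": {"build_and_push_image"},
    "build_and_push_image": {"cloud_environment"},
}

print(run_order(split_steps))
print(is_orderable(bundled_steps))
```

Seen this way, the chicken-and-egg issue is a cycle created by bundling steps 1 and 3, which is why the question about a separate "global" config for base resources such as the ECR repo is a natural one.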

How are people normally solving this? Do you have a completely separated "global" config which controls all of your base stuff, or do you use some other process for controlling kube deployments?

https://redd.it/13dlj8d
@r_devops
Why use Lerna over NPM / Yarn Workspaces?

As the title says. Is Lerna still the go-to tool for monorepos, or have Yarn and NPM caught up and made it unnecessary?

https://redd.it/13dn235
@r_devops
Policy-as-code is recommended for managing cloud and SaaS services

Policy-as-code (PAC) is a software development model that uses code to automate the enforcement of enterprise policies and standards. In this model, policies and standards are written as program code, so they can be executed automatically when deploying and managing software systems. PAC can address many security and compliance concerns in areas such as access control, data privacy, and security configuration.
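
As a minimal sketch of the idea, policies can be plain functions that inspect a resource description and return a violation message. This is a toy model: the `Resource` type, the property names, and both rules are invented for illustration and do not correspond to the API of any tool discussed in this post.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Resource:
    """A simplified description of one cloud resource (hypothetical shape)."""

    name: str
    kind: str
    properties: dict = field(default_factory=dict)


# A policy takes a resource and returns a violation message, or None if it passes.
Policy = Callable[[Resource], Optional[str]]


def no_public_buckets(resource: Resource) -> Optional[str]:
    if resource.kind == "bucket" and resource.properties.get("public", False):
        return f"{resource.name}: bucket must not be public"
    return None


def require_encryption(resource: Resource) -> Optional[str]:
    needs_encryption = resource.kind in {"bucket", "database"}
    if needs_encryption and not resource.properties.get("encrypted", False):
        return f"{resource.name}: encryption must be enabled"
    return None


def evaluate(resources: list[Resource], policies: list[Policy]) -> list[str]:
    """Run every policy against every resource and collect the violations."""
    violations = []
    for resource in resources:
        for policy in policies:
            message = policy(resource)
            if message is not None:
                violations.append(message)
    return violations


resources = [
    Resource("logs", "bucket", {"public": True, "encrypted": True}),
    Resource("orders-db", "database", {"encrypted": False}),
    Resource("web-1", "vm"),
]

for violation in evaluate(resources, [no_public_buckets, require_encryption]):
    print(violation)
```

Because the rules are ordinary code, they can be versioned, reviewed, and run automatically on every deployment.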

The advantages of PAC include:

1. Automated execution: PAC writes policies and standards as program code, making them automatically executed, reducing the reliance on manual processes.
2. Consistency: PAC ensures that enterprise policies and standards are consistently executed in all systems, reducing the potential errors that may occur in manual processes.
3. Scalability: PAC allows enterprises to easily add, delete, or modify policies and standards as needed.
4. Reproducibility: PAC ensures that each execution produces the same result, reducing the variation that may occur in manual processes.

The disadvantages of PAC include:

1. Learning curve: PAC requires team members to have programming skills, so there may be a learning curve.
2. Maintenance costs: PAC needs to be constantly updated and maintained to ensure that policies and standards are consistent with changing enterprise needs.
3. May introduce new risks: If policies and standards are poorly written, PAC may introduce new risks.

Now let's take a look at some popular PAC products:

Pulumi Policy as Code: Pulumi Policy as Code is an open-source PAC tool that can integrate with multi-cloud environments such as AWS, Azure, and GCP to help users automatically enforce security and compliance policies.

Terraform: Terraform is an open-source infrastructure as code tool that can automate management of various cloud platforms and services such as AWS, GCP, and Azure. By writing code, Terraform can automate the implementation of security and compliance policies.

HashiCorp Sentinel: Sentinel is a PAC framework developed by HashiCorp that can be used with tools such as Terraform, Vault, and Nomad. Policies are written in Sentinel's own policy language (not HCL) to automate the enforcement of security and compliance policies.

Selefra: Selefra is an open-source Policy as Code tool that can use natural language to write rules for security compliance checks, cost configuration checks, and architecture rationality checks on current cloud services.

AWS Config Rules: AWS Config Rules is a service provided by Amazon Web Services (AWS) that allows users to write custom rules to check the compliance of AWS resources. By integrating with AWS Config, users can automatically enforce security and compliance policies without needing to write their own policy engine.

Overall, PAC is an effective method for automating the implementation of security and compliance policies, helping enterprises reduce their reliance on manual processes and improve consistency and scalability. However, using PAC also requires consideration of some disadvantages, such as the learning curve, maintenance costs, and potential introduction of new risks. Enterprises should choose the appropriate product based on their needs and actual situation when selecting PAC tools.

https://redd.it/13dkdc0
@r_devops
Can someone point me to a Jenkins CI/CD pipeline project for absolute beginners?

Could someone point me to a Jenkins project where I would be able to create a CI/CD pipeline for a simple app? I'm a beginner and am trying to get some real-world experience for my first Cloud role. I just need a project that is easy to follow, with instructions clear enough that I understand why I'm doing what I'm doing. I like to write blog posts about what I'm doing and the steps I took so I can show employers. Thank you!

https://redd.it/13dpevg
@r_devops
Hey Reddit fam, do any of y'all know any freelance devops who charge for value rather than time? Tired of feeling like my wallet's on a countdown every time I need some work done. Let me know your thoughts!

Hey guys! So I work for a software dev agency and we not only develop solutions but also set up and manage client servers. As we've grown, our clients have been requesting more complex systems which got me thinking - should we be charging by value instead of by time? I feel like devops is better suited for this approach. For instance, setting up a server only takes me a few minutes with the scripts I have.

Do any of you charge fixed fees for these types of services? I'm based in the US and would love to hear some pricing ideas! Thanks in advance!

https://redd.it/13drujj
@r_devops
Documentation/Help for your platform

I've been looking for a good documentation platform for my SaaS product. I came across the Docy theme for WordPress, BookStack, and ReType. Are there other tried and tested options you all work with today? Please share, thanks so much!

https://redd.it/13dvfzb
@r_devops
OneUptime: Open Source StatusPage.io alternative that you can self-host.

I'm Simon, an OSS contributor to OneUptime (https://github.com/oneuptime/oneuptime). It's an open-source alternative to StatusPage.io. We're working on adding APM functionality to bring it closer to an open-source alternative to Datadog. It's 100% free and you can self-host it on your VM / server.
Let me know what you think! Happy to hear early feedback and make the tool better.

https://redd.it/13dzh5n
@r_devops
GitHub actions top alternatives

Due to the current state of GitHub I’m considering moving away from Actions.

These are my top alternatives; please share yours:

1. CircleCI
2. GitLab
3. Travis CI

https://redd.it/13dvu0n
@r_devops
Mixing infrastructure provisioning and configuration

I am trying to build a SaaS service where for every tenant I need to spin up some infrastructure and configure it, and would appreciate some help on how to chain those things. I would like the trigger for a new tenant spin up to be as simple as a new line in a file in a git repo if possible.

To keep things simple, let’s say that my tenant infra is an RDS instance and an ASG+EC2 running a Java app. I want to configure my app to use a non-master DB user for least privilege and create the schema.

My first thought was to do this all in Terraform (I am already using Spacelift for some other infra things), but creating the application user and tables in the DB doesn’t feel right in Terraform. So I am now having trouble designing the best way to chain these things (RDS + DB setup + EC2) so that ideally the “new tenant” trigger is still super simple.
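
One way to picture the chain is a small planner that turns the tenant file into ordered steps. This is a sketch under assumptions: the file format (one tenant name per line, `#` for comments) and the step names are hypothetical, and the steps are only labels. In practice each one would call whatever tool does that job.

```python
# Ordered steps for one tenant; names are illustrative labels only.
STEPS = ("provision_rds", "create_app_user_and_schema", "provision_asg")


def parse_tenants(text: str) -> list[str]:
    """Read one tenant name per line, ignoring blanks, comments and duplicates."""
    tenants: list[str] = []
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name and name not in tenants:
            tenants.append(name)
    return tenants


def plan(desired: list[str], existing: set[str]) -> list[tuple[str, str]]:
    """Return (tenant, step) pairs, in order, for tenants not yet provisioned."""
    return [
        (tenant, step)
        for tenant in desired
        if tenant not in existing
        for step in STEPS
    ]


# Example contents of the tenants file in the git repo (made-up names).
tenants_file = """\
acme
# tenants added this week
globex  # new customer

acme
"""

for tenant, step in plan(parse_tenants(tenants_file), existing={"acme"}):
    print(f"{tenant}: {step}")
```

The trigger stays a one-line change in the repo, while the ordering between the database, its user and schema, and the compute layer lives in one place.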

Splitting EC2 into an infra step and a “deployment” step I guess could help, but with an ASG there doesn’t seem to be room for a deployment phase “later” since the launch template is defined when the ASG is set up.

I am quite new to this space, feel free to completely destroy my assumptions:)

https://redd.it/13e3nm1
@r_devops
GitHub vs GitLab

My team moved away from GitHub back before GitHub Actions was a thing, after seeing GitLab CI in action.

I am pretty happy with GitLab, and after Microsoft bought GitHub, I haven't really kept up with it to see if it has gained/kept feature parity with GitLab.

With all the outages going on at GitHub, I began to wonder: why are DevOps people still using it? Is there some killer feature I am out of the loop on, or is it mainly organizational inertia driving the decision to stay on it?

https://redd.it/13e4eoj
@r_devops
What are the best tracing tools for microservices?

This article covers some of the best options - curious to know if anything is missing from this list.

https://redd.it/13e404b
@r_devops