Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
How to use a Lambda function to deploy single-tenant applications to an EKS cluster?

I am working on a pet project which will have an EKS cluster where I want to have a tool do automated deployments of single-tenant applications. Got a good part of it working but not the Lambda part.

I have a single-tenant application that needs to be deployed to an EKS cluster, and I am planning to have an SQS queue that receives messages describing what to deploy. The idea is that a user will 'sign up', so to speak, and that will create an SNS notification that fans out to two or three SQS queues: one to set up the database, one to deploy the application to the EKS cluster, and possibly a third for a status system that I am still debating.
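
A minimal sketch of how a Lambda function subscribed to one of those SQS queues might unpack the fanned-out message. It assumes the default SNS-to-SQS delivery (raw message delivery disabled), where each SQS record body is a JSON SNS envelope whose `Message` field carries the published payload. The `tenant_id` and `action` fields are hypothetical and not from the post.

```python
import json


def extract_requests(event):
    """Return the original SNS payloads carried by an SQS-triggered Lambda event."""
    requests = []
    for record in event.get("Records", []):
        envelope = json.loads(record["body"])      # SNS envelope stored as the SQS body
        payload = json.loads(envelope["Message"])  # the publisher's own JSON message
        requests.append(payload)
    return requests


def handler(event, context):
    """Lambda entry point; one invocation may carry a batch of queue messages."""
    for request in extract_requests(event):
        # Hypothetical fields: replace with whatever the sign-up flow publishes.
        print(f"would {request['action']} tenant {request['tenant_id']}")
    return {"processed": len(event.get("Records", []))}
```

Keeping the parsing in its own function makes it easy to reuse across the database, deployment, and status consumers.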

My question/design issue is: how can I have a Lambda function process this? I have built a rough tool that does most of what I want as a single script run locally, but I am trying to figure out how to get Lambda to run it and how to handle the EKS security part. I.e., it currently runs against EKS via my local kubeconfig, which doesn't work in Lambda, and I really don't want to hard-code the kubeconfig into the Lambda function, as that is both a security issue and an operations issue.

The way I have Lambda handle this must work across multiple Lambda functions deploying at once as well as work in Lambda functions that do stuff like delete a deployment (more than just a K8s deployment).
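
One commonly described approach, sketched here under assumptions rather than as a tested solution, is to have each invocation mint its own short-lived EKS bearer token from the Lambda execution role instead of shipping a kubeconfig. The token is a presigned STS `GetCallerIdentity` URL, base64url-encoded and prefixed with `k8s-aws-v1.`, which is the format `aws eks get-token` produces. The sketch assumes `boto3` is available in the Lambda runtime and that the execution role has been granted access to the cluster (for example through the `aws-auth` ConfigMap); the function names are illustrative.

```python
import base64


def encode_eks_token(presigned_url):
    """Wrap a presigned STS URL in the bearer-token format EKS expects."""
    encoded = base64.urlsafe_b64encode(presigned_url.encode("utf-8")).decode("utf-8")
    return "k8s-aws-v1." + encoded.rstrip("=")  # base64 padding is stripped


def presign_sts_url(cluster_name, region):
    """Presign sts:GetCallerIdentity with the cluster name as a signed header."""
    import boto3  # imported lazily so the encoder above has no dependencies
    from botocore.signers import RequestSigner

    session = boto3.session.Session()
    sts = session.client("sts", region_name=region)
    signer = RequestSigner(
        sts.meta.service_model.service_id, region, "sts", "v4",
        session.get_credentials(), session.events,
    )
    request = {
        "method": "GET",
        "url": f"https://sts.{region}.amazonaws.com/"
               "?Action=GetCallerIdentity&Version=2011-06-15",
        "body": {},
        "headers": {"x-k8s-aws-id": cluster_name},
        "context": {},
    }
    return signer.generate_presigned_url(
        request, region_name=region, expires_in=60, operation_name=""
    )


def get_eks_token(cluster_name, region):
    """Token for an 'Authorization: Bearer <token>' header to the cluster API."""
    return encode_eks_token(presign_sts_url(cluster_name, region))
```

Because every invocation derives its own token from the role credentials, nothing is shared between concurrent functions, and separate deploy and delete functions can each use their own role.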

https://redd.it/lru0wb
@r_devops
Pods Disk Utilization

Hi, community,

How can I track the storage used by the pods running in the cluster? I have metrics-server and kube-state-metrics deployed, but I cannot access the web UI because it's just a test environment running behind a VPN. Is there any way to fetch this data from the command line? Are there any lightweight CLI tools, or any way to query the Kubernetes API server directly, for this?

Thanks a lot
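
One possible CLI-only route, sketched under assumptions: the kubelet summary endpoint, reachable with `kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary"`, reports per-pod `ephemeral-storage` usage as JSON. The helper below only parses that JSON; the function names are illustrative and `<node-name>` is a placeholder for a real node.

```python
import json


def pod_ephemeral_usage(summary):
    """Map 'namespace/pod' to ephemeral-storage bytes used, from a kubelet summary."""
    usage = {}
    for pod in summary.get("pods", []):
        ref = pod["podRef"]
        used = pod.get("ephemeral-storage", {}).get("usedBytes", 0)
        usage[f"{ref['namespace']}/{ref['name']}"] = used
    return usage


def format_report(usage):
    """Render usage as text lines, largest consumer first."""
    ranked = sorted(usage.items(), key=lambda item: item[1], reverse=True)
    return [f"{name}\t{used / 2**20:.1f} MiB" for name, used in ranked]


def report_from_stream(stream):
    """Read summary JSON from a file-like object and return report lines."""
    return format_report(pod_ephemeral_usage(json.load(stream)))
```

To use it from a shell, pipe the `kubectl get --raw` output into a script that calls `print("\n".join(report_from_stream(sys.stdin)))`.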

https://redd.it/lrz24c
@r_devops
Did you use Loki for logs aggregation? (loki vs elk)

Hi.

I discovered https://grafana.com/oss/loki/

Has anybody used it in production or on a side project?

If yes, what is your opinion, and how does it compare to ELK?

https://redd.it/lrokdj
@r_devops
Running Kubecost as a Prometheus metric exporter

Hi all, I'm one of the original authors of the kubecost project. We've just open-sourced a Prometheus exporter for tracking Kubernetes cost metrics. You can read public data from cloud provider pricing APIs, map those to pod resource request statistics, and view them in Prometheus. Let us know what you think:

https://github.com/kubecost/cost-model/blob/develop/kubecost-exporter.md

https://redd.it/lrmq6x
@r_devops
Free intro to Linux commandline/server course starts this Monday

This course has been running successfully now every month since February 2020 - more detail at: https://LinuxUpskillChallenge.org - daily lessons appear in the sub-reddit r/linuxupskillchallenge - which is also used for support/discussion.

Suitable whatever your background, and aims to provide that "base layer" of traditional Linux skills in a fun interactive way.

https://redd.it/ls4xtb
@r_devops
[aws-quota-checker] A tool to check your AWS account for quota utilization to prevent hitting a limit

Hey r/devops, I recently hit an AWS quota limit without prior notice (ALBs created by aws-load-balancer-controller) and was looking for a way to prevent this from happening again.

Wasn't really satisfied with the solutions I found so I built one myself. It's called [aws-quota-checker](https://github.com/brennerm/aws-quota-checker) and does exactly that. It retrieves the quota limits of your AWS account and compares them to the number of current resources (e.g. number of EC2 instances).

Here's a little demo:

$ pip install aws-quota-checker
$ aws-quota-checker check vpc_count
VPCs per region [default/eu-central-1]: 1/5 ✓

Think this may be helpful for some of you. Feel free to give it a shot and let me know if you encounter any issues or would like to see some additional feature.

https://redd.it/lrqtf2
@r_devops
What is your DevOps/SRE dream job/dream employer?

I was thinking today about how I like my current job and company a lot, and how I think the culture is pretty solid and growing in a good direction. It’s a small startup, but I am really enjoying it and the DevOps culture is great. I used to work for a huge bank and it was the polar opposite. Although my company doesn’t have the prestige that Google, etc. has, I like it and I hope we continue to grow.

What is your DevOps dream job? Who would you want to work for in an ideal situation? Is it because of prestige? Culture? Benefits? Pay?

https://redd.it/lrm7nw
@r_devops
Volume of non-engineering time as a lead or principal?

We've all seen the ten thousand posts about how to get a start and starting out in DevOps. What about at the other end? What sort of balance are Leads and Principals seeing regarding engineering vs non-engineering time?

I am nominally a lead engineer, although serving more as a principal. I am explicitly NOT a manager - I have no formal line management responsibility. I have previous experience as a manager, can do it, but am keen not to revisit that. Without giving too much info, let's say the sector is 'established financial services', although I am exclusively working in an arms-length division doing modern cloud-native stuff. I seem to spend over 90% of my week doing management and admin. Seriously, in a week, I get less than an hour to actually spike or code anything. Most of this non-engineering time is spent in meetings or dealing with bureaucracy, not mentoring, reviewing code, working with engineers etc. Whilst the volume of meetings is an acknowledged problem in the org, here it seems to have reached ridiculous levels. I am literally on Slack/Teams/email etc all day every day. I am concerned that my own professional development and enthusiasm is grinding to a halt.

I am well aware that it is easy to lose your balance in such situations - all you can see becomes all you can see. What sort of split engineering:non-engineering time are others seeing at this level?

https://redd.it/lrnhtz
@r_devops
Advice needed on a mini PC build for learning purposes.

Hi guys,

I’m looking for some advice. I’m working towards getting certified in various areas of computing as I want to move into a DevOps role in the next few years. I’m starting with CompTIA Network+ and then moving on to Linux Foundation certifications (Admin and Engineer) before going on to Kubernetes, Docker, Cloud, etc.

I’m looking for a small computer just to practise some of these concepts, mainly networking and Linux. I currently have an M1 Mac Mini, which is great for the design work I do, but virtualisation on it isn’t quite reliable yet. I’ve been looking at an 8th Gen Intel NUC with an i5 but I’m not sure it will be enough. Someone at work mentioned today that they use laptop CPUs and that I should get a mini ATX build instead, but I’m not keen on it. It’s something I’m likely to sell on in the next year or two once the Mac M chips become more mature. It will also only be a machine for learning purposes, not for gaming or anything like that.

Ideally, I don’t want to pay any more than around £400 but would be willing to stretch that if necessary. Any help would be hugely appreciated.

https://redd.it/lrsdv8
@r_devops
CI/CD... if you were to start over, what tools would you use?

So at work we're mainly a Jenkins shop, with some homegrown tools.

A friend (at a fast growing startup) asked for ci/cd advice, and all I know is what we're currently doing at work isn't the way to go.

What would you recommend if you could start fresh?

Environment: EKS on AWS. Mainly Java-based microservices (going to be dozens), a few lambdas, a couple node+angular web apps (behind nginx). The startup expects to grow very fast with multiple feature teams soon.

What I dislike about what we do at work is that CI & CD are disjointed. I.e., CI using Jenkins (building images to our docker registry) then a completely separate process for CD. Also deployment to stage, pre-prod, prod, etc., are completely disconnected from each other.

Ideally I'd like a tool that we can streamline an entire pipeline from dev (or at least stage) to production & DR, manage rollbacks, maybe with some approval steps along the way?

I was going to look into Concourse, but at a glance it seems Pivotal-centric, and Pivotal is no more...

Any thoughts / suggestions on what's currently the "state of the art"?

Thanks!!

https://redd.it/ls7lzk
@r_devops
Watch Kubernetes Experts Attempt to Fix Broken Kubernetes Clusters (Episode II)

Damn ... this week's was really fucking tough.

Both Jason DeTiberus (@detiber) and Walid Shaari (@walidshaari) decided to "mess" with `etcd` and debugging the problem was rather challenging.

I hope you find this entertaining and helpful

I need a really strong drink after this one.

https://www.youtube.com/watch?v=JzGv36Pcq3g

https://redd.it/ls9ctm
@r_devops
calling terratest operators

does anyone have a solution for generating a report from terratest, something easy on the eyes?

https://redd.it/lscyrh
@r_devops
Team meeting on future priorities next week

Hi all, I’m currently a systems engineer in a team of 8, working under a larger IT services team of around 100 and serving around 10k staff total. Next week we have a meeting to discuss our ongoing priorities and spitball what new ways of working we could implement/ where to put our focus for the future. I’m desperate to put the case forward for DevOps processes, Agile working and focus on pushing the company towards the Cloud/ Azure. I’m wondering what areas I should suggest we focus on to try and move us from the old school and into a much more DevOps-focussed mindset.
Right off the bat I think we should initially concentrate on areas like making sure we have Azure policy right (enforcing resource group tags, only allowing certain SKUs of VMs to be spun up, correct data residency etc) and getting DevTest labs rolled out, thus creating a platform for our Devs to start being able to provision environments through self-service and start working in Azure for Dev/Test and Production.
I also think we should look at applying some elements of Site Reliability Engineering in our team. We don’t operate a shift pattern, but I think we could benefit from having a designated ‘on call’ engineer who initially responds to our major outages/P1 tickets, oversees the resolution and documents it afterwards.
Does that make sense, or do we instead need to start the conversation fresh between the Dev team and us to decide how we’re going to achieve ‘DevOps’? I think if we put in the initial groundwork to start removing the barriers between Ops and Dev as above, the collaboration between the two teams will start to run a bit smoother and we can build from there.

https://redd.it/lscmgf
@r_devops
Microsoft hosted agent - No folders to save?

Hey,

I'm pretty new to this, but I understand that the Microsoft-hosted agent has some limitations compared with self-hosted ones. I'm trying to do the following in a release:

I'm trying to download something and save it somewhere, but it seems like I can't save it anywhere; I can't find any valid target. As a save location I have already tried things like $(Build.ArtifactStagingDirectory) and $(Build.SourcesDirectory), but nothing seems to be available.
I have now created a cmd task that prints those variables, using this code:

echo "Structure of work folder of this pipeline:"
tree $(Agent.WorkFolder)\1 /f

echo "Build.ArtifactStagingDirectory:"

echo "$(Build.ArtifactStagingDirectory)"

echo "Build.BinariesDirectory:"

echo "$(Build.BinariesDirectory)"

echo "Build.SourcesDirectory:"

echo "$(Build.SourcesDirectory)"

The output is:

Folder PATH listing for volume Temporary Storage
Volume serial number is 0000028E 0CE3:CCEA
D:\A\1
Invalid path - \A\1
No subfolders exist
"Build.ArtifactStagingDirectory:"
"$(Build.ArtifactStagingDirectory)"

"Build.BinariesDirectory:"
"$(Build.BinariesDirectory)"

"Build.SourcesDirectory:"
"$(Build.SourcesDirectory)"

So, where can I even download and put stuff?
I don't have an artifact connected, as this doesn't depend on code; the release just has some tasks to download stuff.
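
A hedged sketch of one way to pick a save location from a script task. It assumes, as the unexpanded `$(Build.…)` values in the output above suggest, that the `Build.*` directory variables are not populated in this release, and that `Agent.TempDirectory` or `System.DefaultWorkingDirectory` is. It relies on pipeline variables being exposed to scripts as upper-cased environment variables with dots replaced by underscores. The function name is illustrative.

```python
import os
import tempfile


def pick_download_dir(env=None):
    """Choose a writable directory from agent variables that may exist in a release."""
    env = os.environ if env is None else env
    # Agent.TempDirectory -> AGENT_TEMPDIRECTORY, and so on.
    for name in ("AGENT_TEMPDIRECTORY", "SYSTEM_DEFAULTWORKINGDIRECTORY"):
        path = env.get(name)
        if path and os.path.isdir(path):
            return path
    return tempfile.gettempdir()  # fallback when run outside an agent
```

Printing the result at the start of the job is a quick way to confirm which variable is actually set on the hosted agent.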

https://redd.it/lscgo8
@r_devops
Automated Remediation / Automated Runbooks

A while back there was a thread on automated remediation for incidents that's relevant to some work that I'm doing. The idea of machines telling us where our systems are broken and then solving those problems for us sounds like a really nice idea, but I have yet to run into someone responsible for operating systems who wants to remove the human from the loop.

Which sort of brings us to the space of the automated runbooks where an SRE or DevOps team can define catalogues of system solutions that are served to support teams to click "execute" on.

One of the things I'm thinking through is even when we have automated runbooks, we're still requiring human interaction, which means we've gotten a customer complaint and so something somewhere is broken, but we want a human being to check in on the problem and make sure a fix isn't making something worse.

However, I've seen some systems that were large / complex enough, and where ticket volume from redundant issues was so high, that it made sense to systematically automate away problems -- as in, create scripts that resolve redundant issues in order to eliminate toil -- to get operational teams' heads above water. The justification is that it takes long enough to 1) get to RCA and 2) get an eng to prioritize a fix and get it deployed that operational teams ought to have a path to solve problems end-to-end.

Doing this at scale has its own set of risks or things that would understandably freak people out (don't make the situation worse!), but I'm curious if anyone has some examples of this type of work happening on operational teams.
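
To make the trade-off concrete, here is a minimal, hypothetical sketch of a runbook catalogue in which each entry declares whether a human must approve it before it runs. The alert types, actions, and names are invented for illustration and do not describe any particular product.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Runbook:
    description: str
    action: Callable[[dict], str]
    requires_approval: bool = True  # default keeps a human in the loop


def restart_worker(alert):
    # Hypothetical remediation; a real one would call the platform's API.
    return f"restarted {alert['service']}"


def purge_old_files(alert):
    # Hypothetical remediation that is risky enough to need sign-off.
    return f"purged {alert['service']}"


CATALOGUE: Dict[str, Runbook] = {
    "worker_stuck": Runbook("Restart a stuck worker", restart_worker,
                            requires_approval=False),
    "disk_full": Runbook("Purge old files", purge_old_files),
}


def remediate(alert, approved=False):
    """Run the matching runbook, or hold it until a human approves."""
    runbook = CATALOGUE.get(alert["type"])
    if runbook is None:
        return "escalate: no runbook"
    if runbook.requires_approval and not approved:
        return f"awaiting approval: {runbook.description}"
    return runbook.action(alert)
```

Well-understood, repetitive fixes can be flipped to `requires_approval=False` one at a time, while everything else still waits for someone to click "execute".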

https://redd.it/lsgxi7
@r_devops
Devops for data engineering role

Hi folks,

We have a position open in our company:

https://www.linkedin.com/jobs/view/2427007934/

We are trying to find a DevOps lead with some Spark experience. We want to productize an ML analytics application that is built on Kafka+Spark on GCP, and we'd like to automate the deployment of the architecture based on customer-supplied parameters.

If you are comfortable with Kafka deployment (on-prem/cloud), know how to set up authentication/authorization for a multi-tenant Kafka broker, know all about load balancing, have basic knowledge of Spark and database deployment on the cloud, and can automate this end-to-end pipeline using a tool of your choice, please do apply.

https://redd.it/lsg56p
@r_devops
How We Securely Deploy our Website with AWS

Hi all,

We've just put up our first blog post which is a detailed look at how we rely on various AWS services such as CodePipeline, Secrets Manager & CloudFormation for development & secure deployment of our company website.

We use Hugo to generate a static HTML site, Ansible for configuration management, and GitHub for our private repos.

There are some improvements we'd like to make, such as using an S3 bucket to host the static site, and we'd also like to add AWS CodeGuru to the pipeline for ML-powered code reviews.

We'd love to hear any feedback about this post, and we'd be happy to discuss this solution further in the comments if anyone wants to.

https://www.acutestack.com/blog/how-we-securely-deploy-our-website-with-aws/

https://redd.it/ls0vik
@r_devops
Introducing: Dashbird's serverless Well-Architected Insights feature

I'm super excited to tell you all about Dashbird's new feature, which I believe will change the way serverless developers build and operate their environments. TL;DR: Based on the best practices of the AWS Well-Architected Framework (WAF), Dashbird now runs 80+ continuous checks against your infrastructure and gives you actionable advice on how to optimize it.

Why did we decide to add this feature and how does it actually work? Learn more here: https://dashbird.io/blog/introducing-well-architected-insights/

https://redd.it/lsc5wc
@r_devops