Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Dependency tracker for (really big) builds / deploys

I was asked an interesting question the other day about dependency tracking for build components, so I thought I'd ask here to find out what other folks are doing to keep on top of this...

For most of the projects I've worked on, you just have a variables file with the expected versions of the components, so the build or deploy grabs the versions it has been told to use - versions that have been tested and approved to work with each other...

But what if the number of these components doubles? Or is massive to begin with? At some point a flat dependencies file doesn't scale... Do you aggregate things into bundles? Keep the granularity and use a database for this? How would that work?
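For illustration, the flat versions file described above, and the bundling alternative, might look something like this (component and bundle names are hypothetical):

```yaml
# versions.yaml - pinned, tested-together component versions (names hypothetical)
components:
  api-gateway: 2.14.1
  auth-service: 1.9.0
  billing-service: 3.2.5
# ...multiplied across hundreds of components, this gets hard to review

# the bundling alternative: one tested-together bundle pins many components at once,
# so consumers reference a single bundle version instead of N component versions
bundles:
  platform-2023.03:
    api-gateway: 2.14.1
    auth-service: 1.9.0
    billing-service: 3.2.5
```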

What are your thoughts? Is this something you've tackled before?

https://redd.it/11xcqvk
@r_devops
Helm upgrade causing the pods to go into the Pending state

After running helm upgrade, pods go into the Pending state. The upgrade applies the rollout strategy and tries to create replicas of the new pods, but the node has limited resources, so all the new pods remain Pending.

Precisely, I get this error: Warning FailedScheduling 6m2s default-scheduler 0/1 nodes are available: 1 Too many pods. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming po.

I don't have the option to increase the node size or count. Is there any solution to this problem?
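One common workaround (a sketch, assuming the workload is a Deployment and brief unavailability of one replica is acceptable) is to tune the rolling-update strategy so the upgrade replaces pods in place rather than scheduling extra ones:

```yaml
# Deployment spec fragment: old pods are removed before new ones are scheduled,
# so the rollout never needs headroom for surge pods on the node.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0        # never create pods above the desired replica count
      maxUnavailable: 1  # allow one old pod to be taken down to make room
```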

https://redd.it/11xdesv
@r_devops
DSC Tutorials / DSC Not working?

Hello.


I'm trying to get better at DevOps. In this instance, I'm trying to learn how to place artifacts on a virtual machine and how to execute DSC scripts, so I can configure a Domain Controller with users and everything.

My issue so far is that I don't know how DSC should be working. Inserting a three-line PowerShell script and using it as a DSC configuration works, but that surely isn't best practice. I'd like to create a DSC configuration that gets placed on my virtual machine from my DevOps repository. The VM isn't connected by a site-to-site VPN, nor by a dedicated host (for cost reasons), which means WinRM isn't a possibility.

This is my current DSC configuration. The attempt is to run a custom script that performs an invoke on my VM to retrieve three files from my Storage Account; this, however, isn't working as intended.

Does anyone have any clue what I'm doing wrong in this scenario, or has anyone tried something like this and wouldn't mind sharing it with me?

Configuration DC1
{
    Import-DscResource -ModuleName PSDesiredStateConfiguration

    # Resolved at compile time from Azure Automation assets
    $domainCred  = Get-AutomationPSCredential -Name "DomainAdmin"
    $DomainName  = Get-AutomationVariable -Name "DomainName"
    $DomainDN    = Get-AutomationVariable -Name "DomainDN"
    $SACred      = Get-AutomationPSCredential -Name "ServiceAccountCreds"
    $destination = 'C:\Windows\DSC\'
    $url         = 'SasUrl'

    Node "localhost"
    {
        Script ScriptExample
        {
            GetScript  = { return @{ Result = 'result' } }
            TestScript = { return $false }   # always run SetScript
            # $using: is required to carry configuration-scope variables into the
            # script blocks; note -OutFile expects a file path, not a directory
            SetScript  = { Invoke-WebRequest -Uri $using:url -OutFile $using:destination }
        }
    }
}

The SasUrl value is a SAS key, which I've removed for security reasons.

https://redd.it/11xf3kq
@r_devops
Is there a way to retrieve an output value from one workflow to another in Github Actions?

At the end of my "build.yml" workflow I am trying to export the GitHub run number, so it can be used in the "deploy.yml" workflow.

build.yml workflow:

- name: Upload artifact # uploads artifact as zip to temp runner
  uses: actions/upload-artifact@v2
  with:
    name: RC-${{ github.run_number }}
    path: ${{ env.RUNNER_TEMP }}\WebAppContent*
outputs:
  run_number: ${{ github.run_number }}

I'm just wondering about the best way to reference this output, so when I deploy using the "deploy.yml" workflow I can use the run_number to notify our Slack channel of which run has been built and is being deployed. Does anyone have any good ideas about how to pass GitHub variables/values between different workflows (both are within the same repository)?
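One approach (a sketch, assuming deploy.yml is triggered by the completion of build.yml) is the workflow_run event, whose payload carries the triggering run's number:

```yaml
# deploy.yml (sketch): runs after the build workflow completes in the same repo
on:
  workflow_run:
    workflows: ["build"]   # assumes build.yml's `name:` is "build"
    types: [completed]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Show triggering run number
        run: echo "Deploying build run ${{ github.event.workflow_run.run_number }}"
```

Alternatively, the uploaded artifact itself (named RC-&lt;run number&gt;) can be downloaded in the deploy workflow, which carries the same information.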

Thanks!

https://redd.it/11xfbqe
@r_devops
Terraform Security Best Practices

Terraform is the de facto tool if you work with infrastructure as code (IaC): whatever your resource providers, it lets your organization work with all of them simultaneously. One unquestionable concern is Terraform security. We want to explain the benefits of using Terraform and provide guidance for using it securely, with reference to some security best practices:


Auditing Terraform configurations.
Managing access credentials.
Security best practices in the use of Terraform modules.
DIY Terraform modules.
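On the credentials point, a minimal sketch (bucket name and region are hypothetical) is to keep state in an encrypted remote backend and source provider credentials from the environment rather than from the configuration:

```hcl
terraform {
  backend "s3" {
    bucket  = "example-terraform-state"  # hypothetical bucket
    key     = "prod/terraform.tfstate"
    region  = "eu-west-1"
    encrypt = true                       # encrypt state at rest
  }
}

# Credentials come from the environment (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
# or an assumed role) - never hardcoded in .tf files.
provider "aws" {
  region = "eu-west-1"
}
```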

I hope it will be useful and all feedback is welcome.

https://redd.it/11xj71a
@r_devops
Artifactory vulnerability Scan

Hello everyone,

I have some build artifacts and Docker images in Artifactory repositories, and I want to scan all of them for vulnerabilities.

Is JFrog Xray the best tool, or do you guys know some alternatives?

I wanted to use Trivy but I can't find information on compatibility.

https://redd.it/11xknqm
@r_devops
ZeusCloud - an open-source cloud security platform

Sharing something we're in the early innings of developing: https://github.com/Zeus-Labs/ZeusCloud

Have heard from many DevOps friends that they often get charged with managing security. Hope to get your feedback on whether this would be helpful!

ZeusCloud is an open-source cloud security platform that thinks like an attacker! ZeusCloud works by:

1. Identifying risks across your cloud environments (e.g. misconfigurations, identity weakness, vulnerabilities, etc.)
2. Prioritizing those risks based on toxic risk combinations an attacker may exploit.
3. Remediating, by giving step-by-step instructions on how to fix the risk findings.
4. Monitoring compliance - track your PCI DSS, SOC 2, GDPR, CIS goals.

So far, we’ve added misconfiguration checks and common identity-based attack paths for AWS. Up next on our roadmap are network/access graph visualizations of your entire cloud environment, vulnerability scanning, and secret scanning!

Check out our GitHub (Licensed Apache 2.0): https://github.com/Zeus-Labs/ZeusCloud

Play around with our Sandbox environment: https://demo.zeuscloud.io

Get Started (free/self-hosted): https://docs.zeuscloud.io/introduction/get-started

https://redd.it/11xmp1f
@r_devops
My Green/blue AWS db deployment strategy for avoiding data loss due to table locks

Gonna write out my deployment strategy, because for some reason I can't find any detailed breakdown of how to actually do a blue/green update that accounts for obvious obstacles, like how to deal with replication and table locks. Let me know if this makes sense or is missing an obviously better way.

1. Create a green/blue environment on AWS RDS
2. Disable replication and remove the read-only lock applied to the green (new) database.*
3. Drop tables, add/rearrange/delete columns, etc.
4. Test the green database to your satisfaction.
5. Grab the max IDs from any tables that have been updating live in the blue (old) database. Prepare to copy any new data added during the hours since you disabled replication.*
6. Freeze the blue database somehow (make it read-only? not sure of the best way). Export all new data from blue to green.*
7. Perform the blue/green swap through AWS

* Certain changes can be done with replication on, such as adding a table or adding a column to the end of a table, allowing steps 2, 5, and 6 to be skipped.

The idea here is that I've got massive tables that I need to lock, or even can't risk accidentally locking. Instead of having 5 hours of downtime, I only have to keep things down for the time it takes to copy over 5 hours' worth of new data, i.e. minutes. (Note that my DB has a small number of very large tables that change frequently, so it's logistically simple to manually copy over new data; if you had dozens of interdependent tables that might change in an hour, this might be too complicated.)
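Steps 5 and 6 amount to an ID-watermark delta copy. A minimal sketch of the idea (using sqlite3 in place of RDS; the table and column names are hypothetical):

```python
import sqlite3

def copy_delta(blue, green, table, max_id):
    """Copy rows added to blue after max_id was recorded into green."""
    rows = blue.execute(
        f"SELECT id, payload FROM {table} WHERE id > ?", (max_id,)
    ).fetchall()
    green.executemany(f"INSERT INTO {table} (id, payload) VALUES (?, ?)", rows)
    green.commit()
    return len(rows)

# Simulate the blue (live) and green (migrated) databases.
blue = sqlite3.connect(":memory:")
green = sqlite3.connect(":memory:")
for db in (blue, green):
    db.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

# Step 5: record the watermark at the moment replication is disabled.
blue.executemany("INSERT INTO events VALUES (?, ?)", [(1, "a"), (2, "b")])
green.executemany("INSERT INTO events VALUES (?, ?)", [(1, "a"), (2, "b")])
max_id = blue.execute("SELECT MAX(id) FROM events").fetchone()[0]

# Live writes keep landing in blue while green is being migrated and tested.
blue.executemany("INSERT INTO events VALUES (?, ?)", [(3, "c"), (4, "d")])

# Step 6: freeze blue, then copy only the delta into green before the swap.
copied = copy_delta(blue, green, "events", max_id)
print(copied)  # 2 rows copied
```

Freezing blue before the final copy is what guarantees no writes slip in between the delta copy and the swap.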

Thoughts? Does this seem like a good strategy, at least in some use cases? Are there better ways to deal with table locks?

https://redd.it/11xm5v7
@r_devops
Selefra - An open-source policy-as-code software that provides analytics for multi-cloud and SaaS.

Hey everyone!

We're excited to introduce Selefra, an open-source policy-as-code software that provides analysis for multi-cloud and SaaS environments. Selefra supports over 30 services including AWS, GCP, Azure, Kubernetes, Github, Cloudflare, and Slack.

With Selefra, you can select * from infrastructure and gain insights into your entire environment. Our solution helps you to ensure that your cloud resources are configured correctly, compliant with industry standards, and optimized for cost and performance.

Check out our GitHub repository at **https://github.com/selefra/selefra** and our website at **https://www.selefra.io** to learn more about the project and how it can help your organization. We welcome your contributions and feedback, so please feel free to get involved!

Thanks for your support, and happy coding!

https://redd.it/11xr7vo
@r_devops
How to automate security patching for OSS docker image?

I work for a large organization, and we are working on deploying open source infrastructure in production. The infrastructure relies on 2 docker images for a UI and a metadata service endpoint. Obviously, our organization is very strict on security so we have a security hardening process that we have to abide by.

Currently, we are trying to minimize operational maintenance, and part of that is having to manually construct a hardened image based on the original OSS image. That includes changing the base images to custom hardened ones that are provided internally, using the latest source code, using a different nginx conf file, using an internal npm registry, etc. There are lots of little fragmented changes that I make manually to adapt the original Dockerfiles, which makes automation not so straightforward. Curious what patterns and technologies others are using to automate patching for open-source images.
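One common pattern (a sketch only; all image names and paths are hypothetical) is a thin overlay Dockerfile that rebases the upstream OSS image onto the internally hardened base, so an upstream bump reduces to changing one tag:

```dockerfile
# Stage 1: the upstream OSS image, pinned by tag (hypothetical name)
FROM ghcr.io/example/oss-ui:1.2.3 AS upstream

# Stage 2: the internally provided hardened base (hypothetical registry)
FROM registry.internal.example.com/hardened/nginx:1.24
# Carry over only the application content from upstream
COPY --from=upstream /usr/share/nginx/html /usr/share/nginx/html
# Swap in the organization's own nginx configuration
COPY nginx.conf /etc/nginx/nginx.conf
```

A scheduled CI job can then rebuild against the latest hardened base and newest upstream tag, with a vulnerability scanner gating the result.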

https://redd.it/11xnpr3
@r_devops
"Off the record" hangout on Friday: How Cruise reduced CI time on a giant monorepo

This month the Aviator.co team will be joined by a group of senior engineers from Cruise's UCI (unified CI) team. They'll explain how they managed to reduce CI time on their giant monorepo. They've done some interesting work with intelligently managing runs, auto-quarantining bad tests, etc. Come hang out. No recordings or sales follow-ups. Just a hangout, as usual.

https://getcruise.com/

Sign up here:
https://dx.community

https://redd.it/11xhx2e
@r_devops
Should I leave my junior DevOps role (small company) for a junior SRE role at a Fortune 100? My current role is full-time at-will and the other one is a 1-year contract-to-hire.

The salary difference is $10k more than my current role. The other one is hybrid, but only once a month, with a 20-30 minute commute. Please advise!

More context:

I recently obtained my AWS SAA, which led to the F100 offer. The team is production-facing with alternating on-call, primarily using AWS along with some other cloud services. My current place is very legacy, with intentions of adopting a more modern stack but nothing executed yet.

My concern is: do the risks outweigh the rewards? I have about 1.5 YOE, counting my internship. Please share your experiences.


Edited to add context.

https://redd.it/11xrd7l
@r_devops
Continuous cloud run deployment problem

Hey everyone, I'm trying to set up continuous deployment from Bitbucket with Google's Cloud Run, and I have a weird issue where my settings don't look like those I find online, or match Google's docs. Was the feature removed, or do I lack privileges? Help would be appreciated.

https://redd.it/11xxymp
@r_devops
Long term Prometheus metric storage

Curious what everyone is using for long-term storage of their Prometheus metrics. We currently store metrics on local disk and have also tried Longhorn, which has proven to be more trouble than it's worth.

Looking to store 30-90 days of metrics, and curious what people have used in the past for long-term metric storage.
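For context, most long-term backends (Thanos, Mimir, VictoriaMetrics, etc.) plug in via Prometheus's remote_write; a minimal config fragment (the endpoint URL is hypothetical) looks like:

```yaml
# prometheus.yml fragment: ship samples to a long-term store,
# so local TSDB retention can stay short.
remote_write:
  - url: "http://metrics-store.example.internal/api/v1/write"  # hypothetical endpoint
    queue_config:
      max_shards: 10  # cap parallelism toward the remote store
```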

https://redd.it/11xrcj0
@r_devops
Tips needed: Adopting DevOps for a support team

Hello. We are in the midst of a transformation in our organization, and we are considering the adoption of Agile and DevOps. In line with this, I've been considering training our team either via EXIN's DevOps Fundamentals or the DevOps Institute's DevOps Foundations. Which is better for getting a team acquainted with DevOps? Your advice would be very helpful. Thanks!

https://redd.it/11xn0hh
@r_devops
Salary accurate?

I have been working in DevOps for about 1.5 years now, since I graduated from school. I graduated with a BS in CS and landed this job through connections. I am currently making between $60k and $70k.

Is this reasonable for someone who had no internship experience? Or am I being low balled?

https://redd.it/11y3ctr
@r_devops
How to enable sonarqube code coverage checking using .net framework

Hello Team,

Has anyone here tried adding code coverage checking to their Sonar scan for .NET? Could you please give me an example of how to do it and what the requirements are?


Note: I'm using .NET Framework as a build tool

Thank you team! Have a good day!

https://redd.it/11y46ri
@r_devops
📣 Understand Probes In Kubernetes - Liveness Probe, Readiness Probe, Startup Probe 📣

This is my 4th video in the Kubernetes series. In today's video, I share how to use the different probes to check container health and take action accordingly.

I talk about three probes - the liveness probe, the readiness probe, and the startup probe. Finally, I provide one example that combines these probes to get the most benefit out of them.
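For reference, the three probes attach to a container spec like this (a sketch; the endpoints, ports, and timings are hypothetical):

```yaml
# Pod spec fragment: startupProbe gates the other two until the app has booted,
# livenessProbe restarts a wedged container, readinessProbe gates traffic.
containers:
  - name: web
    image: example/web:1.0  # hypothetical image
    startupProbe:
      httpGet: { path: /healthz, port: 8080 }
      failureThreshold: 30  # allow up to 30 * 10s for a slow start
      periodSeconds: 10
    livenessProbe:
      httpGet: { path: /healthz, port: 8080 }
      periodSeconds: 10
    readinessProbe:
      httpGet: { path: /ready, port: 8080 }
      periodSeconds: 5
```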

📌 Video: https://youtu.be/gahdtHYHbjI

https://redd.it/11xityb
@r_devops
EC2 Instance families interview question

I was asked about the differences between all the EC2 instance families. Are we expected to memorize that? What would you have answered?

https://redd.it/11y7y5v
@r_devops
Parseable - an open source log observability platform

Hello DevOps community, we've been working on https://github.com/parseablehq/parseable for a while now. Would love to get any feedback, questions etc.


The major driver for us to build Parseable is the acute absence of a developer-friendly, simple product that can just ingest logs and integrate with the current tools in the ecosystem. Parseable is:

1. Written in Rust for memory efficiency and performance.
2. Built on Apache Arrow and Parquet for data management.
3. Based on an index-free design for fast ingestion (up to 100K events/sec/node).
4. Backed by object storage (like S3) as the primary storage system, for cost-effective storage.


Log dashboard in Grafana (powered by Parseable data source): https://demo.parseable.io:3000/d/ojonXSp4z/parseable-demo-data?orgId=1&refresh=1m

Get Started: https://www.parseable.io/docs/

https://redd.it/11y9hpp
@r_devops