Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Raises: Associate —> DevOps Engineer

Hey all, associate devops engineer here and I’m working my tail off to get promoted at the end of the year. Just curious - what % did your salary get bumped when you got promoted out of the associate role?

https://redd.it/11x3qnz
@r_devops
Python projects for devops

Hi, I'm learning Python scripting. I've taken a few courses and written a few simple projects from them. None of my projects at work use Python scripts, so I'm wondering how DevOps engineers actually use Python; my guess is things like AWS Lambda functions that shut down some instances. I also need some project ideas: what should I write, and how do I prepare to work as a DevOps engineer with Python?
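To make the "shut down some instances" idea concrete, here is a minimal sketch of the selection logic such a Lambda might contain. The tag name `AutoStop` and the sample data are assumptions made up for illustration; the input dict follows the shape of an EC2 DescribeInstances response.

```python
def running_instance_ids(response, tag_key="AutoStop", tag_value="true"):
    """Return IDs of running instances that carry the given tag.

    `response` is expected to look like an EC2 DescribeInstances result.
    The tag key/value defaults are hypothetical, not an AWS convention.
    """
    ids = []
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            # Skip anything that is not currently running.
            if instance.get("State", {}).get("Name") != "running":
                continue
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            if tags.get(tag_key) == tag_value:
                ids.append(instance["InstanceId"])
    return ids


# Hand-written sample data standing in for a real API response.
sample = {
    "Reservations": [
        {
            "Instances": [
                {
                    "InstanceId": "i-aaa",
                    "State": {"Name": "running"},
                    "Tags": [{"Key": "AutoStop", "Value": "true"}],
                },
                {
                    "InstanceId": "i-bbb",
                    "State": {"Name": "stopped"},
                    "Tags": [{"Key": "AutoStop", "Value": "true"}],
                },
                {"InstanceId": "i-ccc", "State": {"Name": "running"}},
            ]
        }
    ]
}

selected = running_instance_ids(sample)
print(selected)
```

In a real function the response would come from boto3's `describe_instances()` and the resulting IDs would be passed to `stop_instances(InstanceIds=...)`; keeping the filter as a plain function makes it easy to unit test without AWS access.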

https://redd.it/11wf08z
@r_devops
Recommend tooling for Docker image and .NET SBOM generation.

I'm looking for quality tools to generate SBOMs as part of a GitHub Actions build pipeline for Docker images and a .NET project. Please recommend some.

Thanks!

https://redd.it/11wlbng
@r_devops
What is the best tooling to generate OCI images and .NET project SBOMs?

Looked into cdxgen and docker sbom; not really satisfied with the output of either, and especially not with cdxgen's reliability. I wonder if there is better tooling available.

Thanks!

https://redd.it/11wkjy1
@r_devops
How Logistics And Transportation Apps Streamline Business Operations And Maximize Efficiency

**Logistics and transportation apps** have transformed the way businesses manage their supply chain operations. As we have discussed in this blog, Solution Analysts is a leading DevOps development company with extensive experience in creating high-performing and feature-rich native mobile and web applications for diverse industries.

https://redd.it/11x8tw1
@r_devops
How do you make the pod use all CPU request?

I have an app that runs computational tasks. I have set CPU requests for the pod, but the app runs slowly and uses less CPU than requested. Is there a way to force the app to use all the available CPU so that it executes faster?
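For context, a CPU request is a scheduling reservation rather than a speed setting, so the app only uses more CPU if it runs enough parallel work. A minimal sketch, assuming the request value is handed to the app as a string (for example through an environment variable, which is an assumption here), of sizing a worker pool from it:

```python
import math


def cpu_quantity_to_cores(quantity):
    """Convert a Kubernetes CPU quantity such as "1500m" or "2" to cores."""
    if quantity.endswith("m"):
        # Millicores: 1000m equals one core.
        return int(quantity[:-1]) / 1000
    return float(quantity)


def worker_count(cpu_request):
    """Pick a number of parallel workers able to keep the requested CPU busy."""
    return max(1, math.ceil(cpu_quantity_to_cores(cpu_request)))


for request in ("250m", "1500m", "4"):
    print(request, worker_count(request))
```

The returned count could then be used as the size of a process pool (for example `multiprocessing.Pool(worker_count(...))`). Whether this helps depends on the workload being parallelizable at all; a single-threaded task cannot use more than one core no matter what the pod requests.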

https://redd.it/11wibep
@r_devops
Trying to switch to devops as a complete newbie

Hey guys, 22M here. I worked as a test analyst and contributed to both automation and manual testing. I tried to switch internally, but management never allowed it, so I resigned from the company. I'm currently learning Linux, Jenkins, AWS, Terraform, and Ansible (beginner level). I need suggestions on preparing for interviews and on which topics I have to be strong in. My overall IT experience is around 1.4 years.

https://redd.it/11xb6lj
@r_devops
Dependency tracker for (really big) builds / deploys

I was asked an interesting question the other day about dependency tracking for build components, so thought I'd ask here to find out what other folk are doing to keep on top of this...

For most of the projects I've worked on, you just have a variables file with the expected version of the components so the build or deploy grabs the versions it has been told to - which have been tested and approved to work with each other...

But what if the number of these components doubles? Or is massive to begin with? A dependencies file doesn't scale... Do you aggregate things into bundles? Keep the granularity and use a database for this? How would that work?
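To make the "aggregate into bundles" option concrete, here is a minimal sketch. The bundle names, component names, and versions are all invented for illustration; the idea is that each bundle pins a tested set of components and a release is the flattened union of the selected bundles, with conflicts rejected.

```python
def resolve(bundles, selected):
    """Flatten the selected bundles into one component -> version map.

    Raises ValueError when two bundles pin the same component differently.
    """
    pinned = {}
    for name in selected:
        for component, version in bundles[name].items():
            previous = pinned.setdefault(component, version)
            if previous != version:
                raise ValueError(
                    f"{component}: {previous} conflicts with {version} "
                    f"(bundle {name})"
                )
    return pinned


# Hypothetical bundles; each one was tested and approved as a unit.
BUNDLES = {
    "platform-2023.03": {"auth-service": "1.4.2", "gateway": "2.0.1"},
    "billing-2023.03": {"billing-api": "3.1.0", "auth-service": "1.4.2"},
    "billing-2023.04": {"billing-api": "3.2.0", "auth-service": "1.5.0"},
}

release = resolve(BUNDLES, ["platform-2023.03", "billing-2023.03"])
print(release)
```

The same structure maps onto a database table of (bundle, component, version) rows if a file becomes unwieldy; the conflict check stays the same either way.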

What are your thoughts? Is this something you've tackled before?

https://redd.it/11xcqvk
@r_devops
Helm Upgrade causing the pods to go in the pending state

After a helm upgrade, the pods go into the Pending state. The upgrade applies the rollout strategy and tries to create replicas of the new pods, but the node has limited resources, so all the new pods remain Pending.

The exact error I get is: Warning FailedScheduling 6m2s default-scheduler 0/1 nodes are available: 1 Too many pods. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming po.

I don't have the option to increase the node size or count. Is there any solution to this problem?
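As a rough illustration of why a rolling update needs headroom, the sketch below computes the peak and minimum pod counts for a Deployment-style rollout, assuming percentages round up for maxSurge and down for maxUnavailable and that both default to 25%. The function names are made up for this example, and the both-zero check is a simplification of what the API server does.

```python
import math


def resolve_bound(value, replicas, round_up):
    """Turn an absolute number or a percentage string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        raw = replicas * int(value[:-1]) / 100
        return math.ceil(raw) if round_up else math.floor(raw)
    return int(value)


def rollout_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Return (peak_pods, min_available_pods) during a rolling update."""
    surge = resolve_bound(max_surge, replicas, round_up=True)
    unavailable = resolve_bound(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        # This sketch simply rejects the combination: the rollout could
        # neither add a new pod nor remove an old one.
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    return replicas + surge, replicas - unavailable


print(rollout_bounds(4))                                   # default strategy
print(rollout_bounds(4, max_surge=0, max_unavailable=1))   # no extra pods
```

With the defaults, four replicas briefly need room for five pods; with maxSurge set to 0 and maxUnavailable set to 1, the rollout removes an old pod before creating each new one, so the pod count never exceeds four, at the cost of reduced capacity during the upgrade.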

https://redd.it/11xdesv
@r_devops
DSC Tutorials / DSC Not working?

Hello.

I'm trying to get better at DevOps. In this instance I'm trying to learn how to place artifacts on a virtual machine and how to execute DSC scripts, so I can configure a domain controller with users and everything.

My issue so far is that I don't know how DSC is supposed to work. Inserting a three-line PowerShell script and using it as a DSC configuration works, but that surely isn't best practice. I'd like to create a DSC configuration that is placed from my DevOps repository onto my virtual machine. The VM isn't connected via site-to-site VPN or a dedicated host (for cost reasons), which means WinRM isn't a possibility.

This is my current DSC configuration. It attempts to run a custom script that performs an invoke on my VM to retrieve 3 files from my Storage Account, but it isn't working as intended.

Does anyone have a clue what I'm doing wrong here, or has anyone tried something like this and wouldn't mind sharing it with me?

Configuration DC1
{
    param

    $domainCred = Get-AutomationPSCredential -Name "DomainAdmin"
    $DomainName = Get-AutomationVariable -Name "DomainName"
    $DomainDN = Get-AutomationVariable -Name "DomainDN"
    $SACred = Get-AutomationPSCredential -Name "ServiceAccountCreds"
    $destination = 'C:\Windows\DSC\'
    $url = 'SasUrl'

    Get-DscResource

    Node "Localhost"
    {
        Script ScriptExample
        {
            GetScript  = { return @{result = 'result'} }
            TestScript = { return $false }
            SetScript  = { Invoke-WebRequest -Uri $url -Outfile $destination }
        }
    }
}

The SasUrl parameter is a SAS key, which I've removed for security reasons.

https://redd.it/11xf3kq
@r_devops
Is there a way to retrieve an output value from one workflow to another in Github Actions?

At the end of my "build.yml" workflow I am trying to export the github run number, so it can be used in the "deploy.yml" workflow.

build.yml workflow:

- name: Upload artifact # uploads artifact as zip to temp runner
  uses: actions/upload-artifact@v2
  with:
    name: RC-${{ github.run_number }}
    path: ${{ env.RUNNER_TEMP }}\WebAppContent*
  outputs:
    run_number: ${{ github.run_number }}

I'm just wondering the best way I can reference this output, so when I deploy using the "deploy.yml" workflow, I can use the run_number as a value to notify our slack channel on which run has been built and is being deployed. Does anyone have any good ideas about how to export github variables/values between different workflows (both are within the same repository).
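One possible approach, sketched under assumptions: if "deploy.yml" were triggered by the `workflow_run` event when "build.yml" completes, the event payload file that GitHub exposes through `GITHUB_EVENT_PATH` would describe the triggering run. The `run_number` field name below mirrors the workflow run object and should be verified against a real payload; the sample event and the message format are invented for illustration.

```python
import json
import os


def load_event(path=None):
    """Load the event payload JSON that GitHub Actions writes for a run."""
    path = path or os.environ["GITHUB_EVENT_PATH"]
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def triggering_run_number(event):
    """Return the run number of the run that triggered this one, or None."""
    return event.get("workflow_run", {}).get("run_number")


# Hand-written stand-in for a workflow_run payload (assumed shape).
sample_event = {"workflow_run": {"name": "build", "run_number": 42}}

message = f"Deploying build RC-{triggering_run_number(sample_event)}"
print(message)
```

In a deploy step the script would call `load_event()` instead of using the sample dict, and `message` would be what gets posted to Slack.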

Thanks!

https://redd.it/11xfbqe
@r_devops
Terraform Security Best Practices

Terraform is the de facto tool if you work with infrastructure as code (IaC). Regardless of the resource provider, it allows your organization to work with all of them simultaneously. One aspect that cannot be overlooked is Terraform security. We want to explain the benefits of using Terraform and provide guidance on using it securely, with reference to some security best practices.


Auditing Terraform configurations.
Managing access credentials.
Security best practices in the use of Terraform modules.
DIY Terraform modules.

I hope it will be useful and all feedback is welcome.

https://redd.it/11xj71a
@r_devops
Artifactory vulnerability Scan

Hello everyone,

I have some build artifacts and Docker images in Artifactory repositories, and I want to scan all of them for vulnerabilities.

Is JFrog Xray the best tool, or do you guys know some alternatives?

I wanted to use Trivy, but I can't find information on its compatibility.

https://redd.it/11xknqm
@r_devops
ZeusCloud - an open-source cloud security platform

Sharing something we're in the early innings of developing: https://github.com/Zeus-Labs/ZeusCloud

Have heard from many devops friends that they often get charged w/ managing security. Hope to get your feedback on if this would be helpful!

ZeusCloud is an open-source cloud security platform that thinks like an attacker! ZeusCloud works by:

1. Identifying risks across your cloud environments (e.g. misconfigurations, identity weakness, vulnerabilities, etc.)
2. Prioritizing those risks based on toxic risk combinations an attacker may exploit.
3. Remediating by giving step by step instructions on how to fix the risk findings.
4. Monitoring compliance - track your PCI DSS, SOC 2, GDPR, CIS goals.

So far, we’ve added misconfiguration checks and common identity-based attack paths for AWS. Up next on our roadmap are network/access graph visualizations of your entire cloud environment, vulnerability scanning, and secret scanning!

Check out our GitHub (Licensed Apache 2.0): https://github.com/Zeus-Labs/ZeusCloud

Play around with our Sandbox environment: https://demo.zeuscloud.io

Get Started (free/self-hosted): https://docs.zeuscloud.io/introduction/get-started

https://redd.it/11xmp1f
@r_devops
My Green/blue AWS db deployment strategy for avoiding data loss due to table locks

Gonna write out my deployment strategy, because for some reason I can't find any detailed breakdown of how to actually do a green/blue update that accounts for obvious obstacles, like how to deal with replication and table locks. Let me know if this makes sense or is missing an obvious better way.

1. Create a green/blue environment on AWS RDS
2. Disable replication and remove the readonly lock applied to the green (new) table.*
3. Drop tables, add/rearrange/delete columns, etc.
4. Test the green database to your satisfaction
5. Grab the max IDs from any tables that have been updating live in the blue (old) database. Prepare to copy any new data added during the hours after you disabled replication.*
6. Freeze the blue database somehow (make it read-only? not sure best way). Export all new data from the blue to the green.*
7. Perform the blue/green swap through AWS

* Certain changes can be done with replication on, such as adding a table or adding a column to the end of a table, allowing steps 2, 5, and 6 to be skipped.

The idea here is that I've got massive tables that I need to lock, or that I can't even risk accidentally locking. Instead of having 5 hours of downtime, I only have to keep it down for the time it takes to copy over 5 hours' worth of new data, i.e. minutes. (Note that my db has a small number of very large tables that change frequently, so it's logistically simple to manually copy over new data. If you had dozens of tables that might change in an hour, with dependencies on one another, this might be too complicated.)
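Steps 5 and 6 amount to copying every row above a recorded max ID. Here is a minimal sketch of that idea using two in-memory SQLite databases as stand-ins for blue and green; the `events` table, its columns, and the function name are made up for illustration, and a real RDS setup would use its own driver and schema.

```python
import sqlite3


def copy_new_rows(blue, green, table, watermark):
    """Copy rows with id > watermark from blue to green; return the count."""
    rows = blue.execute(
        f"SELECT id, payload FROM {table} WHERE id > ? ORDER BY id",
        (watermark,),
    ).fetchall()
    green.executemany(
        f"INSERT INTO {table} (id, payload) VALUES (?, ?)", rows
    )
    green.commit()
    return len(rows)


# Both databases start out identical, as they would right after the split.
blue = sqlite3.connect(":memory:")
green = sqlite3.connect(":memory:")
for db in (blue, green):
    db.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    db.executemany("INSERT INTO events VALUES (?, ?)", [(1, "a"), (2, "b")])
    db.commit()

# Step 5: record the max ID at the moment replication is disabled.
watermark = blue.execute("SELECT MAX(id) FROM events").fetchone()[0]

# Blue keeps receiving writes while green is being changed and tested.
blue.executemany("INSERT INTO events VALUES (?, ?)", [(3, "c"), (4, "d")])
blue.commit()

# Step 6: with blue frozen, copy everything above the watermark.
copied = copy_new_rows(blue, green, "events", watermark)
green_ids = [row[0] for row in green.execute("SELECT id FROM events ORDER BY id")]
print(copied, green_ids)
```

This only catches inserts. Rows that were updated or deleted in blue after the watermark would be missed, and if step 3 changed the column layout in green, the copy would need an explicit column mapping.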

Thoughts? Does this seem like a good strategy, at least in some use cases? Are there better ways to deal with table locks?

https://redd.it/11xm5v7
@r_devops
Selefra - An open-source policy-as-code software that provides analytics for multi-cloud and SaaS.

Hey everyone!

We're excited to introduce Selefra, an open-source policy-as-code software that provides analysis for multi-cloud and SaaS environments. Selefra supports over 30 services including AWS, GCP, Azure, Kubernetes, Github, Cloudflare, and Slack.

With Selefra, you can select * from infrastructure and gain insights into your entire environment. Our solution helps you to ensure that your cloud resources are configured correctly, compliant with industry standards, and optimized for cost and performance.

Check out our GitHub repository at **https://github.com/selefra/selefra** and our website at **https://www.selefra.io** to learn more about the project and how it can help your organization. We welcome your contributions and feedback, so please feel free to get involved!

Thanks for your support, and happy coding!

https://redd.it/11xr7vo
@r_devops
How to automate security patching for OSS docker image?

I work for a large organization, and we are working on deploying open source infrastructure in production. The infrastructure relies on 2 docker images for a UI and a metadata service endpoint. Obviously, our organization is very strict on security so we have a security hardening process that we have to abide by.

Currently, we are trying to minimize operational maintenance, and part of that burden is having to manually construct a hardened image based on the original OSS image. That includes changing the base images to custom hardened ones that are provided internally, using the latest source code, using a different nginx conf file, using an internal npm registry, etc. There are lots of little fragmented changes that I make to manually adapt the original Dockerfiles, which makes automation less than straightforward. Curious about what patterns and technologies others are using to automate patching for open source images.
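One of those fragmented changes, swapping base images, can be scripted so it is reapplied whenever the upstream Dockerfile changes. A minimal sketch follows; the registry host `registry.internal.example` and the image names are placeholders, not real endpoints.

```python
import re

# Matches a FROM instruction, keeping optional flags such as --platform=...
FROM_LINE = re.compile(r"^(\s*FROM\s+(?:--\S+\s+)*)(\S+)(.*)$", re.IGNORECASE)


def rebase(dockerfile_text, base_map):
    """Replace base images in FROM lines according to base_map."""
    out = []
    for line in dockerfile_text.splitlines():
        match = FROM_LINE.match(line)
        if match and match.group(2) in base_map:
            prefix, image, rest = match.groups()
            line = f"{prefix}{base_map[image]}{rest}"
        out.append(line)
    return "\n".join(out) + "\n"


# Stand-in for the upstream OSS Dockerfile.
upstream = (
    "FROM node:18 AS build\n"
    "RUN npm ci\n"
    "FROM nginx:1.23\n"
    "COPY --from=build /app/dist /usr/share/nginx/html\n"
)

# Hypothetical mapping from upstream images to internal hardened ones.
base_map = {
    "node:18": "registry.internal.example/hardened/node:18",
    "nginx:1.23": "registry.internal.example/hardened/nginx:1.23",
}

patched = rebase(upstream, base_map)
print(patched)
```

Keeping each adaptation as a small, repeatable transformation like this (or as a patch file applied in CI) means a new upstream release only needs attention when a transformation stops matching.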

https://redd.it/11xnpr3
@r_devops
"Off the record" hangout on Friday: How Cruise reduced CI time on a giant monorepo

This month the Aviator.co team will be joined by a group of senior engineers on Cruise's UCI team (unified CI team). They'll explain how they managed to reduce CI time on their giant monorepo. They've done some interesting work with intelligently managing runs, auto-quarantining bad tests, etc. Come hang out. No recordings or sales follow-ups. Just a hangout, as usual.

https://getcruise.com/

Sign up here:
https://dx.community

https://redd.it/11xhx2e
@r_devops
Should I leave my junior DevOps role (small company) for a junior SRE role at a Fortune 100? My current role is full-time at-will and the other one is a 1-year contract-to-hire.

The new role pays 10k more than my current one. It is hybrid, but only once a month, with a 20-30 min commute. Please advise!

More context:

I recently obtained my AWS SAA which led me to the f100 offer. And the team is production facing with alternating on-calls that primarily use AWS but some other cloud services. My current place is very legacy with intentions of using more modern stack but not yet executed.

My concern is whether the risks outweigh the rewards. I have about 1.5 YOE, counting my internship. Please share your experiences.


Edited to add context.

https://redd.it/11xrd7l
@r_devops
Continuous cloud run deployment problem

Hey everyone, I'm trying to set up continuous deployment from Bitbucket to Google's Cloud Run, and I have a weird issue where my settings don't look like the ones I find online or match Google's docs. Was the feature removed, or do I lack privileges? Help would be appreciated.

https://redd.it/11xxymp
@r_devops