Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
To be a DevOps engineer, do we need to be a programmer?

Do I need to know how to program in order to work in DevOps?

https://redd.it/n817i9
@r_devops
Failed dream job interview due to 2 questions, would appreciate any answers about these

Hey guys, coming down from a big failure today. So I got selected for an interview at a dream job: not a big name, but great in terms of $$$, work-life balance, culture, etc. I cleared the first round, which was standard (complete some assignment), all good. Then I had a talk with the DevOps lead. He asked me 2 questions that I had no answers for, and that was it. I mean, 2 questions.

So here is how it went,

He: So we want continuous deployment, very frequently. What is the best way to achieve this?

Me: CICD pipeline using tools like Jenkins, Gitlab, etc.

He: Okay, great. Now say for some reason there is a faulty release, how will you delete and roll it back automatically?

Me: Not sure, maybe we can have Prometheus send an alert which would trigger a Python k8s script for rollback?

He: Not the best way
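
For what it's worth, one tool-level answer here is progressive delivery with something like Argo Rollouts or Flagger, which aborts and rolls back a release automatically when metric analysis fails. A minimal sketch of an Argo Rollouts canary (the app name, image, and the referenced AnalysisTemplate are all illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app                          # illustrative name
spec:
  replicas: 5
  selector:
    matchLabels:
      app: my-app
  strategy:
    canary:
      # The rollout pauses between traffic-weight steps; if the referenced
      # analysis (e.g. a Prometheus success-rate query) fails, the rollout
      # is aborted and traffic reverts to the stable version automatically.
      analysis:
        templates:
          - templateName: success-rate  # illustrative AnalysisTemplate
      steps:
        - setWeight: 20
        - pause: {duration: 2m}
        - setWeight: 50
        - pause: {duration: 2m}
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.2.3   # illustrative
```

Flagger offers the same idea on top of plain Deployments, so either is a plausible answer to "roll it back automatically."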


Next one (he had asked me something related to Helm before):

He: So, let me give you a scenario. Our users are growing and we are making the application ever better by introducing new tools and more applications, like Redis, a Python app, etc. What would you do from a DevOps perspective so that adding and maintaining those new applications is as easy as possible?

Me: Create a Docker image for the new component, plus manifests and a Helm chart, and then maybe have a Git repository to store all the Helm charts?

He: Are you sure?

Me: Not really


Well, that's it to be honest, some questions here and there. I guess either I'm kinda dumb, or maybe these were easy questions. Anyway, I'd appreciate the answers, guys. Thank you
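
For what it's worth on the second question, one common pattern is an umbrella chart: a parent Helm chart that declares each component (Redis, the Python app, and so on) as a dependency, so adding or upgrading a component is one more entry in Chart.yaml. A sketch, with names and versions purely illustrative:

```yaml
# Chart.yaml of a hypothetical umbrella chart
apiVersion: v2
name: platform
version: 0.1.0
dependencies:
  - name: redis
    version: "14.x.x"
    repository: "https://charts.bitnami.com/bitnami"
  - name: python-app                 # in-house chart, name illustrative
    version: "0.1.0"
    repository: "file://../python-app"
```

helm dependency update then pulls everything in, and a single helm upgrade deploys the whole stack, which may be what the interviewer was fishing for.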

https://redd.it/n7qiwt
@r_devops
Podman vs Docker

How many of you are using Podman at work in place of Docker? How good is it?

https://redd.it/n7tidm
@r_devops
Host apps on different subdomains on the same ec2 server

Hello everyone,

Let me start by saying I'm a noob at DevOps. Recently I encountered a situation where I have to host 2 Flask apps on two different subdomains; for example, I have to host flask_app_1 on sub1.example.com and flask_app_2 on sub2.example.com. I have an EC2 instance, I own the example.com domain, and I'm using Nginx. I tried making a .conf file in the sites-available folder for each of them and linking them, but that didn't work. I'm not sure if I'm doing something wrong in this method or whether I should be doing something else entirely.

I'm pretty sure this is a common thing that many websites do, but I just can't get it. Can somebody help me fix this problem?

Thank you!
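
For reference, the usual shape of this is one server block per subdomain, each proxying to its own app. A sketch, assuming the Flask apps are served locally on ports 5000 and 5001 (those ports, and serving via something like gunicorn, are assumptions):

```nginx
# /etc/nginx/sites-available/sub1.example.com
server {
    listen 80;
    server_name sub1.example.com;
    location / {
        proxy_pass http://127.0.0.1:5000;   # flask_app_1
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

# /etc/nginx/sites-available/sub2.example.com
server {
    listen 80;
    server_name sub2.example.com;
    location / {
        proxy_pass http://127.0.0.1:5001;   # flask_app_2
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

Both subdomains also need DNS records pointing at the instance's IP; after symlinking into sites-enabled, nginx -t followed by a reload picks the configs up.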

https://redd.it/n90ewe
@r_devops
Developing on Apple M1 Silicon with Virtual Environments

I teach a graduate course on DevOps and Agile Methodologies which is 50% lecture and 50% hands-on. In the hands-on labs, I use Vagrant and VirtualBox to provide consistent development environments for my students. This eliminates the problems with some students having Macs while others have Windows... everyone develops on Linux! 😁

That worked really well until Apple released their new 2020 Macs with Apple M1 Silicon chips based on the ARM architecture, which VirtualBox won't run on. Just my luck, 8 students showed up for the spring semester with Apple M1 Silicon Macs, and none of my VirtualBox-based labs would work. So I purchased an Apple M1 Mac mini and began looking for a solution.

I just published Developing on Apple M1 Silicon with Virtual Environments to document how I solved this problem and provided a consistent development environment for all students using Vagrant with Docker as a provider.

You can read it here (feedback/questions are welcome): https://johnrofrano.medium.com/developing-on-apple-m1-silicon-with-virtual-environments-4f5f0765fd2f

If you want to try this out on an Apple M1 Mac, you can clone one of my lab repos on GitHub and bring it up for yourself: https://github.com/nyu-devops/lab-flask-rest.git

(Next article will be how I got this working with Visual Studio Code Remote Containers as well)
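
For the curious, the general shape of a Vagrantfile that uses Docker as the provider looks roughly like this (the image name is an assumption, not necessarily what the article or repo uses):

```ruby
# Vagrantfile sketch: Docker as the provider, so the Linux environment
# comes from a container image instead of a VirtualBox VM
Vagrant.configure("2") do |config|
  config.vm.provider :docker do |docker, override|
    override.vm.box = nil
    docker.image = "ubuntu-with-sshd:latest"  # illustrative image with sshd baked in
    docker.remains_running = true
    docker.has_ssh = true
  end
  config.vm.network "forwarded_port", guest: 5000, host: 5000
end
```

The key requirement is that the image runs an SSH daemon so Vagrant can provision into it the same way it would a VM.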

https://redd.it/n973sc
@r_devops
Web application support resources

Any resources you can provide so I can get better at supporting web apps? Mainly the server-side troubleshooting part. I know stuff from working over the years, but I would like a proper go-to resource where I can learn about all the possible reasons a site may go down and the best way to troubleshoot each of them.

https://redd.it/n97j8q
@r_devops
Issues with TFS agent version on Android Builds?

Hey guys, so the dev team at our company started reporting Android build failures for our Android apps. It looks like it began sometime in the last week (the last successful Android build was 5/3; however, we only have 2 Android apps and we don't touch them daily).

Wondering if anyone has seen this/gotten past this without just upgrading agent versions. I'm just wondering what happened that would force something like this and I hope this info can help others.


Required version as of some time in the last week: 2.182.1

Our current agent versions: 2.173

https://redd.it/n96yra
@r_devops
DoorDash Custom Canary Kubernetes Controller

Hey folks, I thought you might be interested in learning how we at DoorDash built a custom Kubernetes Canary controller on top of Argo Rollouts in the linked blog post. Let us know your comments and feedback!
https://doordash.engineering/2021/04/14/gradual-code-releases-using-an-in-house-kubernetes-canary-controller/

https://redd.it/n8zo60
@r_devops
Making the case for a new quality dimension for K8s apps

One of my mentors once said, "Don't optimize before it works." I think we can make the case that cloud native applications do already work, so the next logical step is to think about running our apps with the optimal resource configuration. My team and I make the case that resource efficiency should be a key dimension for cloud native applications, and I'd like to invite you to join our discussion next Thursday about the best ways to integrate efficiency into our daily work. Find more details here: https://www.stormforge.io/event/crossing-kubernetes-performance-chasm/?utm_medium=social&utm_source=Reddit&utm_campaign=crossingthechasm

https://redd.it/n7r4fb
@r_devops
Ad hoc jobs question

Hello everyone,

I was hoping to ask the experienced DevOps Engineers here for some help with the concept of ad-hoc jobs. I have read that "AWS CodePipeline and GitHub Actions do not cater for ad hoc jobs. AWS CodePipeline needs a trigger, and then runs a static pipeline. GitHub Actions is listening to git events. "

Our team is leaning towards using GitHub Actions, and I am trying to determine whether GitHub Actions not catering for ad-hoc jobs is something that we should seriously consider. However, I am not sure I clearly understand the concept of ad-hoc jobs. Could someone clarify what these ad-hoc jobs are, and what they actually do? Is it a big downside that GitHub Actions does not cater for them? Any help will be much appreciated.
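
For what it's worth, GitHub Actions does support manually triggered, parameterized runs via the workflow_dispatch event, which covers many ad-hoc cases (a job you start by hand, outside any git event); the quoted claim may predate that feature. A sketch:

```yaml
# .github/workflows/adhoc.yml: a manually triggered job with an input
name: ad-hoc-task
on:
  workflow_dispatch:
    inputs:
      environment:
        description: "Target environment"
        required: true
        default: "staging"
jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Running against ${{ github.event.inputs.environment }}"
```

The run is started from the Actions tab (or the API), and the UI prompts for the declared inputs.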

https://redd.it/n7q44g
@r_devops
Slack ChatOps for releases

Looking for solutions,

problem:

Inherited project; the release process is semi-manual and cannot be completed without heavy engineer involvement.


proposition:

In order to win back time for engineers to refactor the release process into something a little more up to date, we should create a Slack chatbot that can be used to send API requests to our build and release automation services.


I have done some research, and the most accessible solution appears to be errbot.io, which supports a sort of ACL that would be perfect for the gated release process our stakeholders require.

The solution needs to be fairly lightweight and written in an accessible language that won't require much upskilling from our engineering team should they need to support it.

I am about to enter the rabbit hole on this topic, I have no idea what the infrastructure will look like yet. Hopefully it can all be run from a lambda on AWS.
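
The gated-command idea can be sketched in plain Python (toy code, NOT errbot's actual API; errbot has its own ACL configuration): a decorator that only lets an allow-list of users trigger the release command.

```python
# Toy sketch of a gated ChatOps command. This is NOT errbot's real API;
# it just illustrates the ACL idea a chatbot framework would provide.
ALLOWED_RELEASERS = {"alice", "bob"}  # hypothetical stakeholder allow-list

def gated(allowed):
    """Only let users in `allowed` run the wrapped command."""
    def wrap(fn):
        def inner(user, *args, **kwargs):
            if user not in allowed:
                return f"sorry {user}, release is gated"
            return fn(user, *args, **kwargs)
        return inner
    return wrap

@gated(ALLOWED_RELEASERS)
def release(user, version):
    # In a real bot this would POST to the build/release automation API.
    return f"release {version} triggered by {user}"
```

The same gate works regardless of where the bot runs (Lambda or otherwise), since the check happens before any API call is made.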


Any tips, ideas, warnings or references would be greatly appreciated at this point. Keen to hear what you've got for me DevOps!!!

https://redd.it/n7lv7w
@r_devops
New to cloud CI infrastructure (Bitbucket Pipelines in my case). What is the proper way to make a release?

I do traditional desktop software development with slow point releases, e.g. myapp_1.2.1.tar.gz. So no CD, I just build, compress, and upload to the Downloads section of my repo, from where people can manually obtain my release packages. I recently started using Bitbucket.

I already set up Bitbucket Pipelines to launch a build whenever there's a commit. I'd like to start using it to create the actual release package: so manually click something to initiate my intention to make a 1.2.1 release from a commit, compile the app, run tests, update a header file with the string "1.2.1", create the archive myapp_1.2.1.tar.gz, and upload that.

I read the doc and learned how to do these steps but I can't tell how I'm supposed to send the desired version name to this pipeline.

Based on what I read (their doc isn't the best btw), I saw 2 ways:

* pipelines can be triggered based on a pushed git tag. So instead of making a release from the web UI, I could open a git terminal, create and push the git tag "release-1.2.1". This will trigger a pipeline configured to trigger on "release-*" tags. Anything in the pipeline can then extract "1.2.1" from the value of the $BITBUCKET_TAG envvar. This seems very unnatural to me, it inverts the flow I'm used to.

* I could abuse repo variables: keep a CURRENT_RELEASE_VERSION=1.2.1 variable which I update before making a release. I don't like this because I could forget to update the variable, or click the Run Pipeline button by mistake when I'm not trying to make a release. Because of that, silly stuff can happen, like overwriting past versions or overwriting git tags. Also, this doesn't communicate the explicit intent of making a release.

It feels like there's a clean 3rd method I'm missing.
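
One candidate for that third method (hedged; check Bitbucket's Pipelines docs for your plan) is a custom pipeline with declared variables: it only runs when triggered by hand from the web UI, and the Run Pipeline dialog prompts for the version, which makes the release intent explicit. A sketch, with the step contents illustrative:

```yaml
# bitbucket-pipelines.yml (sketch)
pipelines:
  custom:
    release:
      - variables:
          - name: RELEASE_VERSION    # the UI prompts for this on manual runs
      - step:
          name: Build and package
          script:
            - ./build.sh             # illustrative build step
            - tar czf "myapp_${RELEASE_VERSION}.tar.gz" dist/
```

Unlike repo variables, the value is scoped to the single run, so there is no stale state to forget about.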

Here's a skeleton of the pipeline I had in mind before I started: https://paste.ubuntu.com/p/zmn8ffkkJ6/

https://redd.it/n9k1u8
@r_devops
Production Ready DevOps Books

Hi there,

I'm writing this post to ask for any books that discuss production-ready DevOps techniques/designs/prototypes. I'm moving our applications to Docker and I think it's ready to go to production. I also built a Jenkins pipeline that does the build and deployment to my Swarm cluster, and it's running smoothly. But I'm interested in reading more about what others are building and using for their environments: best practices, security to-dos, and more. I know it's a big topic, but there must be a book that discusses this.
My stack is: Java Spring Boot, Jenkins, Docker Swarm, HAProxy.

Thank you guys for the help.

https://redd.it/n7hfvt
@r_devops
data redundancy. Where to back stuff up to?

Hi all.

I have an app in Google Cloud platform. I have the following data:

* Bucketed misc files in storage (~1 GB)
* Bucketed secondary files (~1 TB and growing). If we lost these, it's not the end of the world, but not ideal.
* Database (~1 GB)

What is the best way of keeping all that safe? I have the regular 7 day backups on the database.

I am most concerned about a scenario where we are held ransom, or we lose access to our account.

I would ideally like to store this data offsite, or perhaps in another cloud provider? What do people recommend?

Edit: I came across rsync.net. Seems like something that could be useful as a simple solution?
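
One common pattern (remote and bucket names below are placeholders) is a scheduled one-way rclone sync to a second provider or to a host like rsync.net, with the destination credentials kept out of the primary account so a compromise of one side can't destroy both copies:

```
# crontab sketch: nightly sync of both buckets to an offsite rclone remote
0 2 * * *  rclone sync gcs:misc-bucket      offsite:backup/misc      --fast-list
30 2 * * * rclone sync gcs:secondary-bucket offsite:backup/secondary --fast-list
```

Pairing this with versioning or snapshots on the destination also covers the ransom scenario, since a sync of encrypted files wouldn't overwrite older copies.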

https://redd.it/n7gcku
@r_devops
How are you measuring DevOps performance?

Hi r/devops,


Many people here are familiar with the four key metrics identified by DORA for measuring the performance of software development teams: lead time, deployment frequency, change failure rate, and mean time to recover.

I'm curious to know some of the different ways in which you are measuring these metrics. Are there any well known tools/approaches that make this easy, or are you building internal applications to measure this stuff? e.g. Incrementing a counter in a data store after every successful deployment and pulling this data into a nice dashboard

I apologize if this is a simple question, just curious to see how others are measuring the impact of a devops culture
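
As a trivial illustration of the roll-your-own route, three of the four metrics reduce to simple arithmetic over recorded deployment events (the event shape here is invented for the sketch):

```python
from datetime import datetime

# Each deployment event: (timestamp, caused_failure, recovery_minutes)
deployments = [
    (datetime(2021, 5, 3, 10), False, 0),
    (datetime(2021, 5, 4, 15), True, 45),   # a bad deploy, recovered in 45 min
    (datetime(2021, 5, 6, 9), False, 0),
    (datetime(2021, 5, 7, 11), False, 0),
]

def deployment_frequency(events, days):
    """Deployments per day over the observation window."""
    return len(events) / days

def change_failure_rate(events):
    """Fraction of deployments that caused a failure."""
    failures = sum(1 for _, failed, _ in events if failed)
    return failures / len(events)

def mean_time_to_recover(events):
    """Average recovery time (minutes) across failed deployments."""
    recoveries = [m for _, failed, m in events if failed]
    return sum(recoveries) / len(recoveries) if recoveries else 0.0

print(deployment_frequency(deployments, days=7))  # about 0.57 per day
print(change_failure_rate(deployments))           # 0.25
print(mean_time_to_recover(deployments))          # 45.0
```

The hard part in practice is emitting the events reliably (e.g. a pipeline step that writes a row per deployment), not the arithmetic.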

https://redd.it/n9o76m
@r_devops
How to persist volumes/filesystems in a Packer EBS AMI for use in newly created EC2 instances?



I'm trying to build an AWS AMI that has all my filesystems set up as I'd expect, i.e. /var, /var/log, /tmp, etc. I am attempting to achieve this using Packer in conjunction with the Ansible provisioner.

Here is my HCL2 build file

source "amazon-ebs" "example" {
  ami_name        = "test_ami ${local.timestamp}"
  ami_description = "test ami with predefined filesystems ${local.timestamp}"
  instance_type   = "t2.micro"
  region          = "eu-west-2"
  source_ami_filter {
    filters = {
      name                = "amzn2-ami-hvm-2.0.*-gp2"
      root-device-type    = "ebs"
      virtualization-type = "hvm"
      architecture        = "x86_64"
    }
    most_recent = true
    owners      = ["amazon"]
  }
  # EBS for root volume
  launch_block_device_mappings {
    device_name           = "/dev/xvda"
    volume_size           = 10
    volume_type           = "gp2"
    delete_on_termination = true
  }
  # EBS for data volume
  launch_block_device_mappings {
    device_name           = "/dev/sdb"
    volume_size           = 5
    volume_type           = "gp2"
    delete_on_termination = true
  }
  ssh_username = "ec2-user"
}

I then have Ansible provisioners to set up my physical volumes, volume groups and logical volumes, along with some XFS filesystems. This all works fine during the Packer AMI build. I can verify using PACKER_LOG=1 packer build . that the plays in my Ansible playbook are successful.

Once the AMI is created, I have built an EC2 instance off of it, but all the work the Ansible playbook has done in setting up the aforementioned volumes and file systems has disappeared. For example, /dev/sdb1 doesn't exist when I run blkid or fdisk -l. My /etc/fstab file has also disappeared.

I was under the impression that, although I've selected delete_on_termination under launch_block_device_mappings, the snapshot created from the AMI build would be applied to any EC2 instances built from the AMI, and therefore my physical volumes and filesystems would be intact.

Am I misunderstanding this? If so, can anybody clarify where I'm going wrong?

https://redd.it/n9rtll
@r_devops
Delete CloudFormation Stack Including S3 Objects

I needed to create and tear down development environments. Deleting a CloudFormation stack has an issue with S3 objects: an S3 bucket cannot be deleted if it still contains objects (to the best of my understanding). So I wrote a script which:

1. Removes deletion protection from DB instances belonging to the stack
2. Deletes S3 objects including versions (10 in parallel) in buckets belonging to the stack
3. Issues delete stack command after the above is finished

The script is at https://github.com/ngs-lang/nsd/blob/master/aws/cloudformation/delete-stack.ngs

It is written in Next Generation Shell.
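
The fiddly part is that versioned buckets require deleting every object *version*, and the S3 delete_objects API caps each request at 1,000 keys. A sketch of the batching logic in Python (the boto3 calls are left as comments so the runnable part is the pure chunker; the bucket name is a placeholder):

```python
def chunk(items, size=1000):
    """Split a list of {Key, VersionId} dicts into delete_objects-sized
    batches (the S3 API accepts at most 1,000 keys per request)."""
    return [items[i:i + size] for i in range(0, len(items), size)]

versions = [{"Key": f"obj{i}", "VersionId": str(i)} for i in range(2500)]
batches = chunk(versions)
print(len(batches))   # 3 batches: 1000 + 1000 + 500

# With boto3 (sketch only; bucket name is a placeholder):
#   import boto3
#   s3 = boto3.client("s3")
#   paginator = s3.get_paginator("list_object_versions")
#   for page in paginator.paginate(Bucket="my-stack-bucket"):
#       objs = [{"Key": v["Key"], "VersionId": v["VersionId"]}
#               for v in page.get("Versions", []) + page.get("DeleteMarkers", [])]
#       for batch in chunk(objs):
#           s3.delete_objects(Bucket="my-stack-bucket",
#                             Delete={"Objects": batch})
```

Note that delete markers also count as versions and must be removed before the bucket is truly empty.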

Hope that helps!

https://redd.it/n9sf9o
@r_devops
Spacelift Feature Reveal: Local Preview

We have been asked to implement local preview multiple times, here on Reddit and elsewhere. Creating small commits all the time just to see whether what you're writing will execute properly is tedious! So is setting up all the necessary access and environment variables locally.

We’re glad to let you know this is now available!

From now on, by turning on `Enable local preview` on a Stack, you can preview changes based on the changes in your local directory: just run `spacectl stack --id <stack-name> local-preview` and you'll get the output streamed right into your terminal!

Here’s a demo of it:

Spacelift Local Preview - asciinema

To find out more about Spacelift, check out: https://spacelift.io

https://redd.it/n9zl4x
@r_devops
Apache Atlas configuration: Cassandra backend connection [help]

Hi,

For a future PoC I need to deploy an Apache Atlas 2.1 stack,

but I can't find the parameter for the Cassandra backend connection.

If anyone has a link, or has already made an implementation with password authentication, please share,

or point me to another subreddit where someone might have an answer.

This is my current config file, if it helps.


atlas.graph.storage.backend=cql
atlas.graph.storage.hostname=cassandra
atlas.graph.storage.cassandra.keyspace=JanusGraph

atlas.graph.storage.clustername=cassandra
atlas.graph.storage.port=9042

atlas.EntityAuditRepository.impl=org.apache.atlas.repository.audit.CassandraBasedAuditRepository
atlas.EntityAuditRepository.keyspace=atlas_audit
atlas.EntityAuditRepository.replicationFactor=1

atlas.graph.index.search.backend=solr
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=zookeeper:2181
atlas.graph.index.search.solr.zookeeper-connect-timeout=60000
atlas.graph.index.search.solr.zookeeper-session-timeout=60000
atlas.graph.index.search.solr.wait-searcher=true

atlas.graph.index.search.max-result-set-size=150

atlas.notification.embedded=false
atlas.data=${sys:atlas.home}/data/kafka

atlas.notification.create.topics=true
atlas.notification.replicas=1
atlas.notification.topics=ATLAS_HOOK,ATLAS_ENTITIES
atlas.notification.log.failed.messages=true
atlas.notification.consumer.retry.interval=500
atlas.notification.hook.retry.interval=1000

atlas.enableTLS=false

atlas.authentication.method.kerberos=false
atlas.authentication.method.file=true

atlas.authentication.method.ldap.type=none

atlas.authentication.method.file.filename=${sys:atlas.home}/conf/users-credentials.properties


atlas.rest.address=https://localhost:21000

atlas.audit.hbase.tablename=apache_atlas_entity_audit
atlas.audit.zookeeper.session.timeout.ms=1000
atlas.audit.hbase.zookeeper.quorum=atlas-zookeeper:2181

atlas.server.ha.enabled=false
atlas.authorizer.impl=simple
atlas.authorizer.simple.authz.policy.file=atlas-simple-authz-policy.json
atlas.rest-csrf.enabled=true
atlas.rest-csrf.browser-useragents-regex=^Mozilla.*,^Opera.*,^Chrome.*
atlas.rest-csrf.methods-to-ignore=GET,OPTIONS,HEAD,TRACE
atlas.rest-csrf.custom-header=X-XSRF-HEADER

atlas.metric.query.cache.ttlInSecs=900

######### Gremlin Search Configuration #########

#Set to false to disable gremlin search.
atlas.search.gremlin.enable=false
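
On the authentication question: Atlas passes atlas.graph.* properties through to JanusGraph with the prefix stripped, and JanusGraph's CQL backend accepts storage.username / storage.password, so something like the following may work (hedged; verify against the JanusGraph docs for the version your Atlas build bundles, and the values below are placeholders):

```
atlas.graph.storage.username=cassandra_user
atlas.graph.storage.password=cassandra_password
```
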

Thanks for any help!

https://redd.it/n9zi16
@r_devops