Reddit DevOps
267 subscribers
30.9K links
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Download Telegram
Career in devops

Hello guys, I have been working in the IT field for the last 3 years, mostly in help desk positions. Right now, due to some personal issues, I cannot work for 5-6 months, so I thought I would study and complete some certifications to get a better job in cloud. I have basic knowledge of programming languages such as Java, C#, HTML and CSS. My question is: what would be the path to start my career in cloud and move to DevOps? Do I need to learn programming, and if yes, which languages? Should I take the AWS Solutions Architect exam? Please guide me.


Thanks in advance

https://redd.it/falka0
@r_devops
Gitlab CI: feel like I'm going crazy, how do I build a docker Image?

I'd like to just build a Docker image for a node.js app. The Dockerfile builds on my computer. I have gone through the Gitlab CI Docker documentation and it feels like it's taking me to the stars and back but I'm having difficulty understanding whether I need to register a runner or where to get started. (https://docs.gitlab.com/ee/ci/docker/using_docker_build.html)

My assumption was that it would be something like the following. The build-node stage is not exactly what I'm using but it is working correctly -- there are also testing and lint stages that are working, too. The docker image build is where I'm tripping up.

# .gitlab-ci.yml

build-node:
  stage: build
  before_script:
    - yarn install --ignore-engines --frozen-lockfile
  script:
    - yarn build

build-docker:
  stage: dockerize
  image: some-standard-gitlab-docker-build-image
  script:
    - docker build --rm -f ./docker/Dockerfile -t my-app:latest .
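For context on what's usually missing: on gitlab.com shared runners there is no runner to register yourself - the usual missing piece is the `docker:dind` service, which gives the job a Docker daemon to talk to. A minimal sketch (image tags and the `dockerize` stage name are assumptions carried over from above):

```yaml
# Hypothetical dockerize job using Docker-in-Docker; no separate runner
# registration is needed on gitlab.com shared runners.
build-docker:
  stage: dockerize
  image: docker:19.03
  services:
    - docker:19.03-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"   # enables TLS between the job and the dind service
  script:
    - docker build --rm -f ./docker/Dockerfile -t my-app:latest .
```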

https://redd.it/falcop
@r_devops
'grep' to find how many users are in a group number?

I am having trouble trying to use the grep command to find how many users are in a GID...

Example: how many users are in group 101?
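For what it's worth, a sketch of one common approach - note that `/etc/group` only lists *secondary* members in its 4th field, so users whose *primary* group is 101 have to be counted from `/etc/passwd` separately:

```shell
# Count secondary members of GID 101 (4th field of /etc/group):
grep ':101:' /etc/group | cut -d: -f4 | tr ',' '\n' | grep -c .

# Count users whose primary group is 101 (4th field of /etc/passwd):
awk -F: '$4 == 101' /etc/passwd | wc -l
```

`getent group 101` is a friendlier alternative to grepping the file directly, since it also covers LDAP/NIS-backed groups.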

https://redd.it/fagb0h
@r_devops
How to get better at searching within JIRA

We painstakingly document fixes for issues with full screenshots, comments, etc. But when it comes to search, I can clearly see that JIRA fails at even the most basic searches. We are using cloud JIRA BTW, which is painfully slow to load on top of everything else.

Is there a better way to search for keywords or groups of keywords? Is there a way to use Google to search within our JIRA cloud? Confluence is slightly better at search, but JIRA is the worst.

https://redd.it/faj9zf
@r_devops
Gateway with throttling/rate limiting - help us decide!

We're building a SaaS solution and have recently moved past the MVP phase. Our customer base is growing, and we have noticed that some customers are abusing/overloading the system, which of course leads to "noisy neighbor" type problems.

To overcome the issues, I have been assigned to scout out the throttling possibilities. However, our stuff runs in AWS, and so far we have been using an AWS ALB as the "gateway" (it has various rules so that `/api/service1/` is handled by the `service_1` application running on ECS, `/api/service2` goes to `service_2`, and so on. Internal communication between services is handled by AWS App Mesh - Envoy in disguise). Since we did not need a real gateway when building the MVP, the load balancer worked just fine.

This has to change, however, due to the throttling/rate limiting requirement. The natural move would be replacing the AWS ALB with Amazon API Gateway. That would come with a hefty price, so I thought of adding our own gateway between the load balancer and the microservices instead.

The question is: **What are your best practices/go-to technologies when it comes to rate limiting/throttling requests?**

I've looked into solutions, and Nginx, HAProxy and Zuul (our apps are Kotlin-based, so a JVM technology like Zuul could fit in just fine) popped up. Which one would you recommend or avoid? Zuul seems too big for now, while Nginx or HAProxy might not be developer-friendly (our team has no classic SysOps - all the infrastructural work is done by Software Developers/SREs utilising Terraform/AWS CDK), but maybe that's just a wrong feeling. We are not afraid of any technology. :P

The throttle/rate limiting mechanism should work based on a header with a `tenant-key`. I'd also love to have different limits for different endpoints.
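Since Nginx is on the shortlist: its `limit_req` machinery can key the token bucket off an arbitrary header, which covers the per-tenant requirement, and per-endpoint limits fall out of defining separate zones per `location`. A rough sketch - zone names, rates, and upstream names here are made up, not from the post:

```nginx
# Per-tenant buckets keyed on the tenant-key header ($http_tenant_key);
# separate zones give separate limits per endpoint group.
limit_req_zone $http_tenant_key zone=svc1:10m rate=10r/s;
limit_req_zone $http_tenant_key zone=svc2:10m rate=2r/s;

server {
    listen 80;

    location /api/service1/ {
        limit_req zone=svc1 burst=20 nodelay;
        proxy_pass http://service_1;
    }

    location /api/service2/ {
        limit_req zone=svc2 burst=5;
        proxy_pass http://service_2;
    }
}
```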

Thank you for your time!

https://redd.it/fastey
@r_devops
metrics for docker performance on different linux flavors

Curious if anyone has experience gathering metrics or gauging how Docker performs on certain Linux distros. I want to install a few different Linux flavors on the same machine, probably Ubuntu Server, Debian, and CentOS, spin up some containers, and I was wondering if anyone knows of a way to compare the distros. As in, what metrics would you gather to say that Docker runs better on Debian than Ubuntu, or better on Ubuntu than CentOS? How would you gather these metrics?
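One low-tech starting point (a sketch - the image, workload, and run count are placeholders, and `/usr/bin/time -f` is GNU time): run an identical, repeatable workload in the same image on each distro and compare averaged wall-clock times, alongside `docker stats` for memory/CPU overhead:

```shell
# Time the same containerized workload N times and average the results.
# Run this script unchanged on each distro and compare the averages.
N=5
for i in $(seq "$N"); do
  /usr/bin/time -f '%e' \
    docker run --rm alpine sh -c 'dd if=/dev/zero of=/dev/null bs=1M count=2048' \
    2>&1 | tail -n 1
done | awk '{s+=$1} END {printf "avg %.2fs over %d runs\n", s/NR, NR}'
```

Container *start-up* latency and image pull times are also worth timing separately, since storage-driver differences between distros tend to show up there more than in steady-state workloads.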

https://redd.it/faiwep
@r_devops
Quantifying business impact of my role

I got a disappointing performance review, despite my boss, teammates, and practically every dev team recognizing my work. However, the company's calculus seems to value putting out fires over preventing them. For goal-setting, I want to know what kind of Key Performance Indicators (KPIs) I can measure from my work to show what I'm actually doing.

Right now 2/3 of my team has been conscripted to project work, and I am basically one of two people dealing with all the support requests and automation improvements. I deal with the CICD pipeline and managing a lot of cloud-based resources.

What should I be measuring?

https://redd.it/fahw07
@r_devops
Including Mainframe in CI/CD with Zowe CLI

What do you think about using the Zowe open source framework to incorporate mainframe into enterprise CI/CD toolchains? The Zowe CLI is a lot like CLIs for AWS, Azure, K8s, etc. Here's a [simple CI example using Jenkins](https://medium.com/zowe/continuous-integration-for-a-mainframe-app-800657e84e96).

The mainframe's release cadence is not the same as cloud, mobile, etc., but I'm curious to hear your thoughts.

https://redd.it/fafr5h
@r_devops
The woes of Wix, and how a client lost all their subdomains!

The client decided it was time to spruce up the old image by getting a new website to replace the old WordPress site. Now this is nothing unusual - but what is unusual is having one of the board-members do it. A board-member with somewhat limited technical insight…

How hard can it be, amirite?! He went forth with audacity and bravado. And indeed, the end result wasn’t bad as such. Video, animations, nicely organized. The advantage of tools like Wix - it makes cookie-cutter webdesign quite simple and easy. Anyone can do it, and indeed anyone did.

The problems started when going live. This fresh web-dev found that with Wix, it is not as easy as just pointing your DNS at whatever VM is hosting the generated website. Oh no. Wix wants to be the DNS SOA (Start of Authority) and NS (Name Server) provider for the already existing domain. A domain, I might add, that was in production, with a host of services attached as subdomains.

#### The disconnect

DNS SOA changed over, the happy green web-dev cum board-member proudly showed his work - it went live on a Sunday evening… when exactly zero people are at work just in case, you know, something should happen. And it did. It soon transpired that the link to the subdomain that hosts the web app suddenly stopped working. As did the logging server, the metrics server, the GitLab server. The CI/CD pipeline, prod and test environments both. In short: everything. The whole kit and caboodle. It all ground to a screeching halt. Except the website - that worked very well. Turns out, if you change DNS SOA and NS, any subdomains registered with the previous DNS SOA and NS don't carry over. They are wiped.

#### Tried turning it off and on again?

The technical staff, having had no involvement thus far in this comedy of errors, soon got distracted from whatever they were doing at nearly nine o'clock on a Sunday evening. The problem was soon triaged down to one of three possible causes - based merely on behaviour, as at this point it was not known what changes had been made. Indeed, it was not known that any change had taken place to begin with. Actual access to the relevant consoles and dashboards of the various services was not possible outside the office, so the staff were essentially working blind. Anyhow, the shortlist was: DNS error, server/app error, or expired TLS/SSL certificates. The latter two were eliminated in short order, leaving the first - which was top of the agenda in the wee hours of the morning after.

What was done at once, though, was to quickly set up a temporary DNS record for the client-facing web app, so that at least the customers could get to it. Hence there was little customer-facing downtime - perhaps an hour, at a time of day with very little traffic on the service provided.

Fixing the issue itself was not a problem as such. Once access was had early the next morning, the DNS SOA and NS were set back to the original provider. This takes a while to propagate around the globe, so the time after was spent hunting for the original website. No one still working there was involved in making or deploying the original WordPress website - so no one had a clue where it was hosted. So for the time being, the domain was set to point to a silly little 404 page. That is, until Tuesday rolled by - and a quick and dirty new site was created in Hugo by yours truly, pushed onto Netlify, and is now running live. This bare-bones site will serve as the foundation to build more onto. In a proper way, and not using managed tools like Wix.

#### The end and lessons learned

First off, don't leave the keys to the kingdom in the hands of anyone who does not know how to operate the realm. Luckily no permanent damage was done - it just created a bit of excitement and activity, and postponed already scheduled tasks by a couple of days.

Moving forward, I think the CEO and board member both have a new-found appreciation of my policy of refusing to deploy to production on Fridays, weekends, and the day before any national holiday. If something goes sideways, one needs time to fix it within normal working hours and normal working days. And of course, I do not think they'll meddle in technical stuff again without first consulting the people possessing the proper expertise.

Oh, and a little bit of knowledge is a very dangerous thing, but we already knew that… :)

https://redd.it/fa9274
@r_devops
Has anyone tested Azure DevOps + WordPress integration?

Hello guys,

I'm currently learning & working on making our WordPress site a DevOps-friendly environment. I've been trying to integrate Azure DevOps with our WordPress site.

Here's what I'm thinking of doing:

I will have to init a Git repository on my Kinsta hosting via SSH.

Then connect the repository to Azure DevOps.

Do you guys have any experience integrating WordPress with Azure DevOps? If yes, what are your suggestions?

I have experience with GitLab. Looking for suggestions from the wiser & more experienced. Thank you.

https://redd.it/fadew4
@r_devops
Unsupported value: “Always”: supported values: “OnFailure”, “Never”

Hi, I am trying to run the following cron job:


apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: my-cjob
  labels:
    job-name: my-cjob
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        metadata:
          name: my-cjob
          labels:
            job-name: my-cjob
        spec:
          containers:
            - name: my-cjob
              image: my-image-name
              restartPolicy: OnFailure


But get the error:

2020-02-27T14:01:18.7412341Z * spec.jobTemplate.spec.template.spec.containers: Required value
2020-02-27T14:01:18.7412503Z * spec.jobTemplate.spec.template.spec.restartPolicy: Unsupported value: "Always": supported values: "OnFailure", "Never"

2020-02-27T14:01:18.7511779Z ##[error]/usr/share/openshift/oc failed with return code: 1
2020-02-27T14:01:18.7528214Z ##[error]/usr/share/openshift/oc failed with error: /usr/share/openshift/oc failed with return code: 1


Any idea what I am doing wrong?


I got my inspiration from OpenShift: [https://access.redhat.com/documentation/en-us/openshift_container_platform/3.11/html/developer_guide/dev-guide-cron-jobs](https://access.redhat.com/documentation/en-us/openshift_container_platform/3.11/html/developer_guide/dev-guide-cron-jobs)
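In case it helps future readers: in `batch/v1beta1` CronJobs, `restartPolicy` belongs to the *pod* spec - a sibling of `containers`, not a field of the container - and a Job's pods may only use `OnFailure` or `Never`; the pod default of `Always` is exactly what triggers that error. A sketch of the relevant nesting:

```yaml
# restartPolicy sits at the pod-spec level, alongside containers:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: my-cjob
              image: my-image-name
```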

https://redd.it/facsm3
@r_devops
Need some help

Hi, my name is Siddhart, I am 15 years old and I am from the Netherlands.
I started programming when I was 13. I started with visual scripting but I didn't like it, so I jumped straight to Unity and C#. I have been working with Unity and C# for 3 years now. This year I started with Python and learning to program an Arduino. I wanted to ask what the best things are for me to learn, like which languages and which books I should read.
Ty

https://redd.it/facevp
@r_devops
Advice on a new career in DevOps

I'm new to DevOps, but currently a lead software engineer. I can handle AWS services, have basic knowledge of Ansible and Terraform, and a good understanding of bash scripting too.

What steps should I take in order to get myself into DevOps smoothly?

Thanks for the answers and advice, appreciate that.

https://redd.it/fac1ft
@r_devops
Recommendations for setting up a web app production environment on AWS with a CI/CD pipeline using Jenkins

I'm new to the DevOps world and I've joined a project where our engineering team is building a small web app. It's currently in development and a dev/QA environment has been set up on EC2 using EBS volumes.

The code for the app will be turned over to the client upon completion of the project, and I've been tasked with providing instructions to the client which would allow them to set up a production environment and CI/CD pipeline.

Relevant information/requirements:

* The repo for the web app is in GitHub
* The web app is made up of 3 components:
  * React/Next.js front-end
  * Neo4j database
  * GraphQL API
* AWS EBS volumes are used for storage
* AWS CloudFront should be used as the CDN
* AWS CloudWatch should be used for monitoring the web app
* Jenkins should be used for the CI/CD pipeline for deployments to the production environment

So far I've been thinking:

* Use Terraform to provision:
  * a VPC
  * the 3 EC2 instances needed for the web app (FE, DB, API)
  * the EBS volumes
  * the CloudFront distribution
  * the CloudWatch logs/metrics/alarms/dashboard
  * a 4th EC2 instance for the Jenkins server
* Configure Jenkins to build and deploy the app upon merges to the master branch
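That plan is a reasonable shape for an app this size. As a starting-point sketch of the Terraform layout - resource names, the AMI, and sizes below are placeholders, not project specifics - one instance plus its EBS volume looks like this, and the FE/DB/API/Jenkins instances all follow the same pattern:

```hcl
# Minimal sketch: a VPC, one of the four EC2 instances, and an attached EBS volume.
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "public" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.1.0/24"
}

resource "aws_instance" "frontend" {
  ami           = "ami-00000000" # placeholder
  instance_type = "t3.small"
  subnet_id     = aws_subnet.public.id
}

resource "aws_ebs_volume" "frontend_data" {
  availability_zone = aws_instance.frontend.availability_zone
  size              = 20
}

resource "aws_volume_attachment" "frontend_data" {
  device_name = "/dev/sdf"
  volume_id   = aws_ebs_volume.frontend_data.id
  instance_id = aws_instance.frontend.id
}
```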

I just want to confirm whether this a good general strategy before I get started.

Are there any major things I'm overlooking? Anything you would do differently?

Also, what would your estimate be for how long it would take an experienced DevOps professional to complete this task?

https://redd.it/fayirv
@r_devops
Netdata release v1.20!

Hey all,

Our first major release of 2020 comes with an alpha version of our new **eBPF collector**. eBPF ([extended Berkeley Packet Filter](https://lwn.net/Articles/740157/)) is a virtual bytecode machine, built directly into the Linux kernel, that you can use for advanced monitoring and tracing. Check out the [full release notes](https://github.com/netdata/netdata/releases/tag/v1.20.0) and our [blog post](https://blog.netdata.cloud/posts/release-1.20/) for full details.

With this release, the eBPF collector monitors system calls inside your kernel to help you understand and visualize the behavior of your file descriptors, virtual file system (VFS) actions, and process/thread interactions. You can already use it for debugging applications and better understanding how the Linux kernel handles I/O and process management.

The eBPF collector is in a technical preview and doesn't come enabled out of the box. If you'd like to learn more about *why* eBPF metrics are such an important addition to Netdata, see our blog post: [*Linux eBPF monitoring with Netdata*](https://blog.netdata.cloud/posts/linux-ebpf-monitoring-netdata/). When you're ready to get started, enable the eBPF collector by following the steps in our [documentation](https://docs.netdata.cloud/collectors/ebpf_process.plugin/).

This release also introduces **host labels**, a powerful new way of organizing your Netdata-monitored systems. Netdata automatically creates a handful of labels for essential information, but you can supplement the defaults by segmenting your systems based on their location, purpose, operating system, or even when they went live.

You can use host labels to create alarms that apply only to systems with specific labels, or apply labels to metrics you archive to other databases with our exporting engine. Because labels are streamed from slave to master systems, you can now find critical information about your entire infrastructure directly from the master system.

Our [host labels tutorial](https://docs.netdata.cloud/docs/tutorials/using-host-labels/) will walk you through creating your first host labels and putting them to use in Netdata's other features.

Finally, we introduced a new **CockroachDB collector**. Because we use CockroachDB internally, we wanted a better way of keeping tabs on the health and performance of our databases. Given how popular CockroachDB is right now, we know we're not alone, and are excited to share this collector with our community. See our [tutorial on monitoring CockroachDB metrics](https://docs.netdata.cloud/docs/tutorials/monitor-cockroachdb/) for set-up details.

We also added a new [**squid access log collector**](https://docs.netdata.cloud/collectors/go.d.plugin/modules/squidlog/#squid-logs-monitoring-with-netdata) that parses and visualizes requests, bandwidth, responses, and much more. Our [**apps.plugin collector**](https://docs.netdata.cloud/collectors/apps.plugin/) has a new and improved way of processing groups together, and our [**cgroups collector**](https://docs.netdata.cloud/collectors/cgroups.plugin/) is better at LXC (Linux container) monitoring.

Speaking of collectors, we **revamped our** [**collectors documentation**](https://docs.netdata.cloud/collectors/) to simplify how users learn about metrics collection. You can now view a [collectors quickstart](https://docs.netdata.cloud/collectors/quickstart/) to learn the process of enabling collectors and monitoring more applications and services with Netdata, and see everything Netdata collects in our [supported collectors list](https://docs.netdata.cloud/collectors/collectors/).

## Breaking Changes

* Removed the deprecated Bash collectors: apache, cpu_apps, cpufreq, exim, hddtemp, load_average, mem_apps, mysql, nginx, phpfpm, postfix, squid, tomcat. If you were still using one of these collectors with custom configurations, you can find the new collector that replaces it in the [supported collectors list](https://docs.netdata.cloud/collectors/collectors/).
* Modified the Netdata updater to prevent unnecessary updates right after installation and to avoid updates via local tarballs [#7939](https://github.com/netdata/netdata/pull/7939). These changes introduced a critical bug to the updater, which was fixed via [#8057](https://github.com/netdata/netdata/pull/8057), [#8076](https://github.com/netdata/netdata/pull/8076) and [#8028](https://github.com/netdata/netdata/pull/8028). **See** [**issue 8056**](https://github.com/netdata/netdata/issues/8056) **if your Netdata is stuck on v1.19.0-432**.

## Improvements

### Host Labels

* Added support for host labels
* Improved the monitored system information detection. Added CPU freq & cores, RAM and disk space
* Started distinguishing the monitored (host) system's OS, kernel, etc. from those of the Docker container
* Started creating host labels from collected system info
* Started passing labels and container environment variables via the streaming protocol
* Started sending host labels via exporting connectors
* Added label support to alarm definitions and started recording them in alarm logs
* Added support for host labels to the API responses
* Added configurable host labels to netdata.conf
* Added Kubernetes labels
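To try the configurable labels mentioned above, the host labels tutorial describes a dedicated section in netdata.conf; the label names and values here are just examples, not defaults:

```ini
# Custom host labels defined in netdata.conf
[host labels]
    type = webserver
    location = us-east-1
```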

### New Collectors

* eBPF kernel collector
* CockroachDB
* squidlog: squid access log parser

Check out the [full release notes](https://github.com/netdata/netdata/releases/tag/v1.20.0) and our [blog post](https://blog.netdata.cloud/posts/release-1.20/) for full details!

https://redd.it/faz2kc
@r_devops
How can I ask a company nicely to hurry up with the hiring process?

Company: "As I mentioned, I will review the results and then get back to you, hopefully sometime next week."

That last phrase sounds like such a long wait. How should I phrase this?

>Do you know how long the hiring process takes?
>
>I'm expecting a job offer pretty soon from a company and I'd love to get to know more about you and your company, as the job specs of your company match my skills better.

https://redd.it/fauqgj
@r_devops
Bro, do I even devops?

I'm a veteran programmer, working as an embedded "devops" guy in the games industry (indie studio level). I write tools and services that are consumed only by other developers - source code control, build servers, artifact storage, data storage/analytics/visualization, and misc quality-of-life stuff. As a programmer I worked my own way into this field, and I don't know anyone else who carries the title "devops", and I'd actually like to know - is what I do even called devops?

Recently I started looking around for another job and felt really out of my depth. Most of the openings seemed to involve customer-facing cloud services at massive scale, all of them using well-established tools. And here's crazy little me, writing my own servers and services and hand-deploying a mesh of docker containers, all of these things being just easier for me to customize for the bizarre needs that game developers have.

What am I even?

https://redd.it/fb2we8
@r_devops
Configuring nginx with docker-compose

I have a simple app of 3 containers which all run on the same AWS EC2 server. I want to configure Nginx to act as a reverse proxy serving the same domain; however, I'm pretty new to Nginx and don't know how to set up the conf file correctly.

Here is my docker-compose file:

version: "3"
services:

  nginx:
    container_name: nginx
    image: nginx:latest
    ports:
      - "80:80"
    volumes:
      - ./conf/nginx.conf:/etc/nginx/nginx.conf

  frontend:
    container_name: frontend
    image: myfrontend:image
    ports:
      - "3000:3000"

  backend:
    container_name: backend
    depends_on:
      - db
    environment:
      DB_HOST: db
    image: mybackend:image
    ports:
      - "8400:8400"

  db:
    container_name: mongodb
    environment:
      MONGO_INITDB_DATABASE: myDB
    image: mongo:latest
    ports:
      - "27017:27017"
    volumes:
      - ./initialization/db:/docker-entrypoint-initdb.d
      - db-volume:/data/db

volumes:
  db-volume:

The backend fetches data from the database and sends it to be presented by the frontend.

Here is what I tried to do with my nginx.conf file (which is obviously wrong):

events {
    worker_connections 4096;
}

http {
    server {
        listen 80;
        listen [::]:80;

        server_name myDomainName.com;

        location / {
            proxy_pass https://frontend:3000/;
            proxy_set_header Host $host;
        }

        location / {
            proxy_pass https://backend:8400/;
            proxy_pass_request_headers on;
        }
    }
}

Any help would be greatly appreciated. Note: I want all containers to run behind the same domain name.
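For comparison, one common fix: Nginx can't have two `location /` blocks in the same server, so the usual pattern is to route the frontend at `/` and the backend under a prefix such as `/api/` (the prefix is an assumption - the frontend would need to call the backend via that path). Also, the containers speak plain HTTP on those ports, so `proxy_pass` should use `http://`, not `https://`:

```nginx
events {
    worker_connections 4096;
}

http {
    server {
        listen 80;
        server_name myDomainName.com;

        # Everything else goes to the frontend container.
        location / {
            proxy_pass http://frontend:3000/;
            proxy_set_header Host $host;
        }

        # API calls go to the backend; the trailing slash on proxy_pass
        # strips the /api/ prefix before forwarding.
        location /api/ {
            proxy_pass http://backend:8400/;
            proxy_set_header Host $host;
        }
    }
}
```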

https://redd.it/faxz3q
@r_devops