Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
How do I find contracts for designing new pipelines?

I'm a freelancer, and the contracts I find through my external sales organization generally land me maintenance positions on old legacy systems. What I want, to test my skills and feed my creative mind, is to design new systems, selecting state-of-the-art tools and putting it all together. That's what I enjoy, and I hate myself every time I have to mess around with some old crap like Jenkins, or Bitbucket, or, god forbid, Perforce.

I'm not really sure how to look for these types of contracts. I don't know if I maybe should select for a particular industry, if I should market myself as an expert on a particular tool that I like, or any other trick to attract the right recruiters on LinkedIn.

https://redd.it/fb9drd
@r_devops
Kubernetes and Spring Boot MVC

My company is deploying its app as one huge monolithic MVC web app on Kubernetes. Does it make sense to do this? No microservices are involved.

https://redd.it/fbn63o
@r_devops
I have a python script on AWS EC2 that connects to a website via websockets that is hosted on Digital Ocean. Every method I've tried to start it up on boot up has resulted in failure.

I'll try to be thorough in my description of the problem, what I've tried so far, and what I think the problem is. The flow of information looks like this: `transformer.py` starts up, loads autobahn for the websocket connection, tries to find the other half of the websocket waiting on the other side of a URL, and attempts to connect to it. If the handshake goes through, it connects; if it doesn't, it either raises an error or just hangs. The websocket connection used to connect to AWS Lambda and it worked just fine; it was only once I had it connect to my own website that the autostart stopped working. When I start the script manually it connects just fine, but when I try to start the script automatically on boot I get some weird connection issues. This is a stripped-down version of `transformer.py`:

from autobahn.asyncio.websocket import WebSocketClientProtocol
from autobahn.asyncio.websocket import WebSocketClientFactory
import asyncio
import json

uri = "wss://<domain_name>.com/ws/ai/"
domain = "<domain_name>.com"
port = 443

class ClientProtocol(WebSocketClientProtocol):
    def onOpen(self):
        # Called once the websocket handshake has completed.
        message = input("Enter Prompt: ")
        message = {"action": "handlePrompt", "prompt": message}
        payload = json.dumps(message, ensure_ascii=False).encode("utf8")
        self.sendMessage(payload, isBinary=False)
        print("Sent: " + str(json.loads(payload)))

    def onMessage(self, payload, isBinary):
        # Called for every message received from the server.
        print("Text Message Received: " + str(json.loads(payload)))


if __name__ == "__main__":

    factory = WebSocketClientFactory(uri)
    factory.protocol = ClientProtocol

    carousel = asyncio.get_event_loop()
    socket = carousel.create_connection(factory, domain, port, ssl=True)
    carousel.run_until_complete(socket) # This line is where the error below appears
    carousel.run_forever()
    carousel.close()

This is the error that I get with autostartup:

Traceback (most recent call last):
  File "transformer.py", line 358, in <module>
    carousel.run_until_complete(socket)
  File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
    return future.result()
  File "/usr/lib/python3.6/asyncio/base_events.py", line 820, in create_connection
    sock, protocol_factory, ssl, server_hostname)
  File "/usr/lib/python3.6/asyncio/base_events.py", line 846, in _create_connection_transport
    yield from waiter
ConnectionResetError

The file structure looks like this:

/home/ubuntu/project/transformer.py
/home/ubuntu/project/start_script.sh
/home/ubuntu/project/venv/

`start_script.sh` is executable. The contents are:

#!/bin/sh
cd ~
cd home/ubuntu/project/
venv/bin/python3 transformer.py

While the autostartup worked with AWS Lambda, I was using a cron job to run a shell script on reboot:

@reboot /home/ubuntu/project/start.sh | bash &

I later found out that it's kind of pointless to pipe the script to bash, but never mind that, it worked. Once I switched the websocket info over to my own website the above error appeared. I switched the cron job to:

@reboot /home/ubuntu/project/start.sh > /home/ubuntu/startup.log 2>&1

This results in the error above. I then tried to use the package `supervisor`. We use this on our webserver and are very familiar with how it works. The thought process behind this was that maybe there were some user permission errors and `transformer.py` needed to be run as ubuntu. This resulted in the same error. So at this point I was thinking maybe it has something to do with the loading order of modules for Linux. Like, maybe the networking part of Linux isn't fully loaded by the time the connection is tried? From here I went on to try to start it as a service: I put the script inside the `/etc/init.d/` folder, changed paths in the shell script where necessary, and made a symlink in `/etc/rc5.d/`. This resulted in the same error. I may have done something wrong in this last part, seeing as I'm not super familiar with creating services in Linux. I followed [this](https://github.com/OpenLabTools/OpenLabTools/wiki/Launching-bash-scripts-at-startup) tutorial.

tl;dr: When I run `transformer.py` manually after logging in as ubuntu everything works exactly as it should. It's only when I try to start it up automatically that I get what appear to be network errors. The fact that this only happens on startup leads me to believe that the problem lies outside of my program and in the boot process of Linux.

My working theory is that there is some AWS module inside of my linux image that was put there which boots before my program, which is why I was able to connect to AWS Lambda. Once I switched away from their stack though, the boot order changed to my script loading first and then any necessary network connections. Does anybody know what is going on? How can I get my script to work correctly? Any help would be greatly appreciated.
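
One way to test the "the connection is attempted too early" theory is to retry the connection a few times before giving up. The sketch below is only an illustration under that assumption: `connect_with_retry` and its parameters are hypothetical names, not part of autobahn or of the original script. If the script connects on a later attempt, timing during boot is the likely cause; if every attempt is reset, the cause is probably elsewhere.

```python
import asyncio


async def connect_with_retry(connect, attempts=5, delay=1.0, backoff=2.0):
    """Await connect() until it succeeds, sleeping between failed attempts.

    `connect` is a zero-argument callable that returns a fresh awaitable
    each time it is called (hypothetical helper, for illustration only).
    """
    for attempt in range(1, attempts + 1):
        try:
            return await connect()
        except OSError as exc:  # ConnectionResetError is a subclass of OSError
            if attempt == attempts:
                raise  # out of attempts: surface the last error unchanged
            print(f"attempt {attempt} failed ({exc!r}); retrying in {delay}s")
            await asyncio.sleep(delay)
            delay *= backoff  # wait longer before each further attempt
```

In the script above this would wrap the existing call, for example `carousel.run_until_complete(connect_with_retry(lambda: carousel.create_connection(factory, domain, port, ssl=True)))`.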

https://redd.it/fb98iy
@r_devops
What differentiates Monolith from Microservices?

I get the difference between Monolith and Microservices architecture.

Monolith is where all functionalities of the app are in one huge component so that whenever we make a change in one service, we have to re-deploy the whole thing.

Microservices is where we make all services independent, so that even if one service goes down it won't kill the app, and each service can have its own tech stack and be independently tested and deployed.

However, I find it hard to tell what is a monolith and what is a microservice in practice.

For example, I have a project whose frontend and backend are deployed separately to different servers. They can be deployed and tested separately, each with its own CI/CD pipeline. Is this clearly considered microservices?

I also have a project where both the frontend and backend components live in a single repo with a single CI/CD pipeline where I build, test, and deploy both services. Is this a monolith because they are in one single repo?

What is the factor that makes a project monolith or microservices?

I understand the pros of microservices, but I am not even sure whether what I am building follows the microservice approach. Thanks in advance.

https://redd.it/fb3ufg
@r_devops
Using Terraform with public CI/CD outputs

A few CI/CD tools offer unlimited free execution minutes for public projects (eg: GitLab CI/CD and [Travis-CI.org](https://Travis-CI.org)).

I have a project which deploys to AWS using Terraform and my CI/CD pipeline consists of pushing a Docker image and running the `plan` and `apply` stages to deploy to ECS.

My question is: Assuming I use masked/secure variables in my Git project, is it safe to use Terraform on a project where the logs are visible to the public?
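
For context, masking in CI logs generally works by matching values the CI system already knows, so a value that Terraform generates or reads from state could still appear in `plan` output unless it is marked sensitive. The sketch below illustrates that matching idea only; `redact` is a hypothetical helper and not how any particular CI product implements masking.

```python
def redact(text, secrets, mask="***"):
    """Replace every occurrence of each known secret value in text with mask."""
    # Longest first, so a secret that contains another one is masked whole.
    for secret in sorted(secrets, key=len, reverse=True):
        if secret:  # skip empty strings, which would match everywhere
            text = text.replace(secret, mask)
    return text


# Only values passed in `secrets` are hidden; anything else passes through.
print(redact("db_password = hunter2, generated = s3cr3t", ["hunter2"]))
```

The second value in the example is printed unchanged, which is the risk with public logs: the masker cannot hide what it was never told about.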

https://redd.it/fb24ma
@r_devops
Would you accept lower salary than your current job for a company with better fundamentals?

I've been interviewing lately because of some concerns I have about the future of my current company. I really like my job, and my salary is competitive, I think. I am a fully remote Sr. SRE with a base of 145k and some bonus, good benefits, and good work/life balance. I'm self taught and have a lot of job experience but no degree.

Today I received my first offer after several interviews. I've done more interviews this time around than ever before and it's been fairly exhausting and stressful.

The offer is for 125k starting. This is ~14% less base salary than I'm currently making. I'm trying to get some more info so I can calculate the total comp, but at first glance it looks like a lesser package across the board, i.e. my current company covers 50% of my wife's and daughter's insurance etc., which this company does not.
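
The base-salary gap can be checked with a couple of lines. This only covers the two base figures given in the post; bonus, insurance, and options are left out because their values are not stated.

```python
current_base = 145_000  # current base salary from the post
offered_base = 125_000  # offered base salary from the post

cut = current_base - offered_base
pct_cut = 100 * cut / current_base
print(f"{cut} less per year, a {pct_cut:.1f}% cut in base")
```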

I'm definitely considering countering the offer before outright declining it, but they sort of told me in the email that this was what they could offer and that it was partially based on the cost of living in my area, which is fairly LCOL. However, I currently work remotely, and would work remotely for them as well (most of the time), and I think of myself as a citizen of the internet rather than of my city when it comes to compensation.

This seems like a no-brainer, but there are two big factors I'm considering.

1. My current company is not profitable after ~8 years. There has been significant employee churn since I started, and key players in engineering have left. It is VC-funded, completely owned by investors, and, I assume, extremely diluted stock-wise. I don't really see a path upwards for myself, and the company could be insolvent in 12 months for all I know.
2. The new company was bootstrapped by its founder/CEO into immediate profitability and has remained so for ~6 years. They are experiencing high growth and have told me that they want me to grow into the SRE lead/manager role, at which point my comp could be raised. They've also supposedly increased the options grant 2.5x to try to bridge the gap. But these are just numbers of options, which obviously gives me very little information on actual value.

If I had to guess, I would say it's going to be more work and less money though.

Thanks for your time and thoughts.

https://redd.it/fbpp8d
@r_devops
Grafana, K8s install troubles

Hi community,
Does anyone know why I would be able to see clusters but can't see nodes or pods for my Grafana deployment?

https://redd.it/fbpktr
@r_devops
Have you tried traffic shadowing kind of testing?

Any experience with testing deployment code using traffic that is replicated from production? How did it work for you, any implementation tips? How did you handle the State problem - for many use cases the env under test needs the same state as production
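
As a point of reference, the core of traffic shadowing can be sketched in a few lines: serve the caller from production, replay the same request against the version under test, and record any difference without ever affecting the caller. This is a simplified, synchronous illustration; `handle_with_shadow` and its arguments are hypothetical names, and real setups usually mirror traffic asynchronously at the proxy or load-balancer layer. It does not address the state problem raised above.

```python
def handle_with_shadow(request, primary, shadow, on_mismatch):
    """Serve request from primary; replay it to shadow and report differences."""
    response = primary(request)
    try:
        shadow_response = shadow(request)
    except Exception as exc:  # a shadow failure must never affect the caller
        on_mismatch(request, response, exc)
    else:
        if shadow_response != response:
            on_mismatch(request, response, shadow_response)
    return response  # the caller only ever sees the primary's answer


# Usage: collect mismatches in a list for later inspection.
mismatches = []
result = handle_with_shadow(
    "ping",
    primary=lambda r: r.upper(),
    shadow=lambda r: r.lower(),
    on_mismatch=lambda req, a, b: mismatches.append((req, a, b)),
)
```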

https://redd.it/fbt3aq
@r_devops
Job posting - is this depressing?

I look at craigslist job posting occasionally and saw [this](https://portland.craigslist.org/mlt/sad/d/vancouver-jr-entry-level-python-linux/7082424324.html). Pay is $16 per hour 1099.

Getting paid via 1099 - no benefits and typically higher taxes. Is this typical for entry level?

https://redd.it/fbuzsg
@r_devops
A realistic lambda application

I'm looking to develop an application predominantly using AWS Lambda, potentially with some containers on Fargate (depending), using one or more hosted databases, potentially SNS, etc., all deployed using Terraform. There are a lot of very simple guides out there for different aspects like deploying a single Lambda or container, listening to a single event, etc. But I'm struggling to find a more holistic guide, especially when it comes to the networking aspects and how to deploy to multiple environments (dev / staging / prod). I am reading through Terraform Up and Running, which is a great book, but it doesn't delve into the VPC / networking aspects. While I could of course read through all the AWS documentation, it's a bit overwhelming in complexity and I suspect I need a small subset of what's there - I'm not a Fortune 500 company. Could anyone suggest any pointers to books / tutorials? It would be greatly appreciated.

https://redd.it/fbudwc
@r_devops
How to stress test Prometheus host with Avalanche

Hi everyone..


I am trying to understand and do some basic stress testing for my Prometheus server but I am having a hard time going over the results and understanding them really.


I found this - [https://blog.freshtracks.io/load-testing-prometheus-metric-ingestion-5b878711711c](https://blog.freshtracks.io/load-testing-prometheus-metric-ingestion-5b878711711c)
[https://github.com/open-fresh/avalanche](https://github.com/open-fresh/avalanche)
It seemed like a good solution.
But even though I saw the Prometheus scrape response time spike with a few Avalanche pods running,
I am not really sure how to evaluate the results any further.


I tried looking here and in other subreddits but had no luck finding a similar thread.
Does anyone have some experience to share?
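
One possible starting point for evaluating a run is to estimate the sample rate the test should generate and compare it with what the server reports, for example via `rate(prometheus_tsdb_head_samples_appended_total[5m])`. The sketch below is only that back-of-the-envelope estimate; the pod count, series count, and scrape interval are made-up example values, not recommendations.

```python
def samples_per_second(targets, series_per_target, scrape_interval_s):
    """Approximate ingestion rate: every series yields one sample per scrape."""
    return targets * series_per_target / scrape_interval_s


# Hypothetical run: 5 load-generator pods, 10,000 series each, 15 s interval.
rate = samples_per_second(targets=5, series_per_target=10_000, scrape_interval_s=15)
print(f"expected ingestion: about {rate:.0f} samples/s")
```

If the measured rate falls well below the estimate while scrape durations climb, the server is probably struggling to keep up at that load.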

https://redd.it/fbxeus
@r_devops
I'm burnt out Ex-Amazon engineer. What are my options other than DevOps?


Hello,

A little bit of history about myself. Last year, I burnt out due to stress and had a psychotic episode. I was diagnosed with Bipolar I and had to quit my job at Amazon to focus on myself. Afterwards I also had a major depression, which lasted for 3 months. Now I'm in relatively better shape, taking my medication and seeing a therapist.

My problem is that I don't want to do DevOps related work anymore. It was my dream job but now it doesn't interest me. I'm fed up with oncalls and dealing with meaningless configuration files and systems. Something has changed in me.

Due to my condition, I need to work remotely. I've taken a look at all the available remote job websites, but all I see is DevOps and programming posts. I tried to hunt for some technical writing work (I love writing documentation!) but nothing has turned up. Same for customer support jobs. I think I could do technical support engineering without on-call, as I love debugging systems, but those jobs are hard to find.

I'd appreciate it if you could point me in some direction. What can I do?

Thanks.

https://redd.it/fbr8kv
@r_devops
Looking for a List of tasks for DevOps learning

I recall a roadmap and list of tasks for a DevOps engineer or Linux admin to work through in order to be one of you cool peeps.

It was something like "Install WordPress. Delete it. Write an Ansible playbook to do it. Delete it. Rewrite it to deploy to Kubernetes."

https://redd.it/fbzcwx
@r_devops
Jenkins slave/master on top of Kubernetes

I am trying to create this slave, master architecture using [Jenkins/Kubernetes plugin](https://github.com/jenkinsci/kubernetes-plugin).

These are my deployment/service files.

jenkins-deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jenkins
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
        - name: jenkins
          image: jenkins/jenkins:lts
          env:
            - name: JAVA_OPTS
              value: -Djenkins.install.runSetupWizard=false
          ports:
            - name: http-port
              containerPort: 8080
            - name: jnlp-port
              containerPort: 50000
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-home
          emptyDir: {}

jenkins-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: jenkins
spec:
  type: NodePort
  ports:
    - port: 8080
      targetPort: 8080
  selector:
    app: jenkins


These are screenshots of the Jenkins configuration. For the IP addresses, I have included the commands used to get them, because I wanted to show which IP address I'm using.

https://paste.pics/72196977028a7838aaa25eef4a314e79
https://paste.pics/40179d0d2469833194d8501600c0a42b

As you can see in these screenshots, I can open Jenkins and I can create Jenkins jobs, but these jobs run only on the master node. I have been following the tutorial from the GitHub plugin link above.
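
One thing that stands out in the files above is that the Service exposes only port 8080, while the container also listens on 50000 for agent (JNLP) connections. If the plugin is configured to reach Jenkins through the Service, agents may be unable to connect back, which could explain jobs staying on the master. This is a guess, since the screenshots are not reproduced here. The sketch below prints a Service manifest with both ports as JSON, which `kubectl apply -f` accepts; the port names `http` and `jnlp` are illustrative.

```python
import json

# Same Service as above, with the agent port added alongside the web port.
# Multi-port Services require each port to be named.
jenkins_service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "jenkins"},
    "spec": {
        "type": "NodePort",
        "selector": {"app": "jenkins"},
        "ports": [
            {"name": "http", "port": 8080, "targetPort": 8080},
            {"name": "jnlp", "port": 50000, "targetPort": 50000},
        ],
    },
}

print(json.dumps(jenkins_service, indent=2))
```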

https://redd.it/fbysm6
@r_devops
Database in the Cloud

Hi guys,

I built an app with Firebase and currently am using Firestore to fulfill my data needs. Firestore lacks decent querying capabilities though, so I am looking for another way to store my data. It needs to be in the cloud since I am using serverless functions to run my backend code and obviously a database cannot be installed on such a server. I’d like to have a NoSQL database, preferably mongo.

MongoDB Atlas gives me a shared node for free in the Belgian region, which is quite nice. My only concern is that the free tier would be too weak at peak load, but upgrading to the next package up costs €60 a month, which is far too much for me right now.

I could run my own VPS on for example DigitalOcean, but then security is my own responsibility which is due to my limited knowledge of Linux/database security a substantial risk. Also I have the impression that running a server dedicated to running and exposing a mongodb database is security wise and performance wise bad practice. On the other hand, those VPSes are cheaper than anything else, like €5 a month.

In short I feel that there is a giant gap between a DIY database server and a cloud managed database server and I’m not sure which side of the gap I should go for. Am I overlooking something? Would the free tier of mongodb atlas be fine for a small startup (50k reads / 50k writes an hour on peak)? What do you guys say?
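
To put the peak figures in perspective, it can help to convert them to operations per second. This is plain arithmetic on the numbers in the post and says nothing about what any particular tier can sustain.

```python
reads_per_hour = 50_000   # peak reads from the post
writes_per_hour = 50_000  # peak writes from the post

ops_per_second = (reads_per_hour + writes_per_hour) / 3600
print(f"{ops_per_second:.1f} operations per second at peak")
```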

https://redd.it/fbxdh2
@r_devops
A few questions about Prometheus - job definition, Alertmanager, and self-signed certs

Hello.

I have been using Prometheus for a while, but now I am going to move it out of Docker to make it more reliable. Because of this, I have some spare time to look at my configuration files again.

Here are my three questions:

- What is the definition of a job? If I have a node exporter and cAdvisor running on 2 different ports on 127.0.0.1, is that one job or two separate jobs? It's confusing that you can set multiple targets per job.

- Should I make Alertmanager listen on 0.0.0.0 instead of 127.0.0.1? Generally speaking, are there any third-party integrations that would benefit from making it accessible from the internet? Maybe Grafana needs that?

- I have both Prometheus and node exporter (hidden from the public network) on the same host. Should I encrypt the connection to the node exporter running on 127.0.0.1 with self-signed certs, or would this be over-engineering?
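
On the first question, the Prometheus documentation describes a job as a collection of instances with the same purpose, so two different exporters would normally be two jobs, each with its own targets. The sketch below models that layout as plain Python data and lists the `job` and `instance` labels each target would carry. The ports are assumptions (9100 and 8080 are the usual defaults for node exporter and cAdvisor), and `target_labels` is a hypothetical helper.

```python
# Two exporters with different purposes, modelled as two jobs.
scrape_configs = [
    {"job_name": "node", "static_configs": [{"targets": ["127.0.0.1:9100"]}]},
    {"job_name": "cadvisor", "static_configs": [{"targets": ["127.0.0.1:8080"]}]},
]


def target_labels(configs):
    """List the (job, instance) label pair attached to each scraped target."""
    return [
        (cfg["job_name"], target)
        for cfg in configs
        for static in cfg["static_configs"]
        for target in static["targets"]
    ]


for job, instance in target_labels(scrape_configs):
    print(f'job="{job}" instance="{instance}"')
```

Multiple targets under one job make sense when the same exporter runs on several hosts, for example node exporter on every machine.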

https://redd.it/fc38gh
@r_devops
The versatility of Kubernetes' initContainer

There are a lot of different ways to configure containers running on Kubernetes:

* Environment variables
* Config maps
* Volumes shared across multiple pods
* Arguments passed to scheduled pods
* etc.

Those alternatives fit a specific context, with specific requirements.

Read on https://blog.frankel.ch/versatility-kubernetes-initcontainer/

https://redd.it/fbx0qm
@r_devops
Monthly 'Getting into DevOps' thread - 2020/03

**What is DevOps?**

* [AWS has a great article](https://aws.amazon.com/devops/what-is-devops/) that outlines DevOps as a work environment where development and operations teams are no longer "siloed", but instead work together across the entire application lifecycle -- from development and test to deployment to operations -- and automate processes that historically have been manual and slow.

**Books to Read**

* [The Phoenix Project](https://www.amazon.com/Phoenix-Project-DevOps-Helping-Business/dp/1942788290) - one of the original books to delve into DevOps culture, explained through the story of a fictional company on the brink of failure.
* [The DevOps Handbook](https://www.amazon.com/dp/1942788002) - a practical "sequel" to The Phoenix Project.
* [Google's Site Reliability Engineering](https://landing.google.com/sre/books/) - Google engineers explain how they build, deploy, monitor, and maintain their systems.
* [The Site Reliability Workbook](https://landing.google.com/sre/workbook/toc/) - the practical companion to Google's Site Reliability Engineering book.
* [The Unicorn Project](https://www.amazon.com/Unicorn-Project-Developers-Disruption-Thriving-ebook/dp/B07QT9QR41) - the "sequel" to The Phoenix Project.
* [DevOps for Dummies](https://www.amazon.com/DevOps-Dummies-Computer-Tech-ebook/dp/B07VXMLK3J/) - don't let the name fool you.

**What Should I Learn?**

* [Emily Wood's essay](https://crate.io/a/infrastructure-as-code-part-one/) - why infrastructure as code is so important in today's world.
* [2019 DevOps Roadmap](https://github.com/kamranahmedse/developer-roadmap#devops-roadmap) - one developer's ideas for which skills are needed in the DevOps world. This roadmap is controversial, as it may be too use-case specific, but serves as a good starting point for what tools are currently in use by companies.
* [This comment by /u/mdaffin](https://www.reddit.com/r/devops/comments/abcyl2/sorry_having_a_midlife_tech_crisis/eczhsu1/) - just remember, DevOps is a mindset for solving problems. It's less about the specific tools you know or the certificates you have than about the way you approach problem solving.
* [This comment by /u/jpswade](https://gist.github.com/jpswade/4135841363e72ece8086146bd7bb5d91) - what is DevOps and associated terminology.
* [Roadmap.sh](https://roadmap.sh/devops) - Step by step guide for DevOps or any other Operations Role

Remember: DevOps as a term and as a practice is still in flux, and is more about culture change than it is specific tooling. As such, specific skills and tool-sets are not universal, and recommendations for them should be taken only as suggestions.

**Previous Threads**
https://www.reddit.com/r/devops/comments/exfyhk/monthly_getting_into_devops_thread_2020012/

https://www.reddit.com/r/devops/comments/ei8x06/monthly_getting_into_devops_thread_202001/

https://www.reddit.com/r/devops/comments/e4pt90/monthly_getting_into_devops_thread_201912/

https://www.reddit.com/r/devops/comments/dq6nrc/monthly_getting_into_devops_thread_201911/

https://www.reddit.com/r/devops/comments/dbusbr/monthly_getting_into_devops_thread_201910/

https://www.reddit.com/r/devops/comments/cydrpv/monthly_getting_into_devops_thread_201909/

https://www.reddit.com/r/devops/comments/ckqdpv/monthly_getting_into_devops_thread_201908/

https://www.reddit.com/r/devops/comments/c7ti5p/monthly_getting_into_devops_thread_201907/

https://www.reddit.com/r/devops/comments/bvqyrw/monthly_getting_into_devops_thread_201906/

https://www.reddit.com/r/devops/comments/blu4oh/monthly_getting_into_devops_thread_201905/

https://www.reddit.com/r/devops/comments/b7yj4m/monthly_getting_into_devops_thread_201904/

https://www.reddit.com/r/devops/comments/axcebk/monthly_getting_into_devops_thread/

**Please keep this on topic (as a reference for those new to devops).**

https://redd.it/fc6ezw
@r_devops
How do I do this Jira POST request in Postman?

[https://developer.atlassian.com/server/jira/platform/jira-rest-api-example-add-comment-8946422/](https://developer.atlassian.com/server/jira/platform/jira-rest-api-example-add-comment-8946422/)

I am authenticating with basic auth in the Authorization tab, but I'm not sure how to supply the comment body.
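
For comparison, here is roughly what the request looks like outside Postman. The add-comment endpoint takes a JSON object with a `body` field; in Postman that would go in the Body tab as raw JSON, with `Content-Type: application/json`. The sketch only builds the request and does not send it; the host, issue key, and credentials are placeholders, and `build_comment_request` is a hypothetical helper.

```python
import base64
import json
import urllib.request


def build_comment_request(base_url, issue_key, user, password, comment):
    """Build (without sending) the POST that adds a comment to an issue."""
    url = f"{base_url}/rest/api/2/issue/{issue_key}/comment"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps({"body": comment}).encode("utf-8"),  # the comment body
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; urllib.request.urlopen(req) would actually send it.
req = build_comment_request(
    "https://jira.example.com", "TEST-1", "fred", "secret", "Comment from a script"
)
print(req.get_method(), req.full_url)
```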

https://redd.it/fc3b2h
@r_devops