Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Tools to provision and manage Public and Private Cloud.

We are a private cloud solutions company moving from private cloud only to hybrid + edge cloud. We use OpenStack as our private cloud solution. The problem is that I can't find the right tool to provision and manage both OpenStack and public clouds like AWS, GCP, and Azure. Terraform comes closest, but it's mostly a CLI tool, and I'm looking for something API-based or service-based. We don't mind using multiple tools if that gets the work done.
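For what it's worth, Terraform itself can also be driven as a service: Terraform Cloud/Enterprise exposes an HTTP API for queueing runs against a workspace. A minimal stdlib-only sketch of that approach; the workspace id and token below are placeholders, not real values:

```python
"""Sketch: drive Terraform runs over HTTP instead of the CLI.

Terraform Cloud/Enterprise exposes a REST (JSON:API) interface; the
workspace id and token used here are placeholders.
"""
import json
import urllib.request

API = "https://app.terraform.io/api/v2"

def run_payload(workspace_id: str, message: str) -> dict:
    """Build the JSON:API body that queues a plan+apply run."""
    return {
        "data": {
            "type": "runs",
            "attributes": {"message": message},
            "relationships": {
                "workspace": {"data": {"type": "workspaces", "id": workspace_id}}
            },
        }
    }

def queue_run(token: str, workspace_id: str, message: str) -> dict:
    """POST the run to the API. Performs a network call when invoked."""
    req = urllib.request.Request(
        f"{API}/runs",
        data=json.dumps(run_payload(workspace_id, message)).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/vnd.api+json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # queue_run("<token>", "ws-XXXXXXXX", "...") would actually submit it.
    print(run_payload("ws-XXXXXXXX", "triggered from provisioning service"))
```

The same pattern applies to any service wrapping Terraform (Atlantis, env0, Spacelift, etc.) if Terraform Cloud itself doesn't fit.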

https://redd.it/nr7vhm
@r_devops
What's the most convenient order in which to install Consul, Nomad, and Vault?

I'm trying to set up a simple Vault, Consul, and Nomad DC of three or more machines:

- Machine 1: Vault-server, Consul-server, Nomad-server
- Machine 2: Consul-client, Nomad-client
- Machine 3: Consul-client, Nomad-client

What is the most convenient order to set up these services?

Consul first, then Nomad, then Vault;
or Vault, Consul, Nomad; or Consul, Vault, Nomad?

I could have Vault running in a container, managed by Nomad, or I could use Vault to provide the certificates needed to set up mTLS with Consul.
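A commonly cited order is Consul first (the others can use it for discovery and storage), then Vault (which can then issue the mTLS certificates), then Nomad (which integrates with both). A small readiness sketch that waits on each service's health endpoint in that order, assuming default ports on localhost; adjust hosts and ports for a real cluster:

```python
"""Sketch: wait for each HashiCorp service in Consul -> Vault -> Nomad order.

Assumes default API ports on localhost; hosts/ports are placeholders.
"""
import time
import urllib.request

# Default health endpoints, in one commonly used bootstrap order:
# Consul first (discovery/backend), Vault next (secrets/mTLS certs),
# Nomad last (can consume both).
BOOTSTRAP_ORDER = [
    ("consul", "http://127.0.0.1:8500/v1/status/leader"),
    ("vault", "http://127.0.0.1:8200/v1/sys/health"),
    ("nomad", "http://127.0.0.1:4646/v1/status/leader"),
]

def wait_for(url: str, timeout: float = 60.0) -> bool:
    """Poll a health URL until it answers or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2):
                return True
        except OSError:
            time.sleep(1)
    return False

if __name__ == "__main__":
    for name, url in BOOTSTRAP_ORDER:
        print(name, url)  # wait_for(url) would block until the service answers
```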

If you have any tips or tricks, feel free to share.

https://redd.it/nr7oka
@r_devops
Need advice on designing a better basic LAMP workflow across multiple machines

Hey there!

I'm struggling to make this understandable.

I'm currently a one-person show, developing with LAMP. I'm having trouble designing an efficient workflow now that I want to develop on a desktop alongside my MacBook (Air, 2014; pretty old, but works well). I'd love other people's input on their workflows for similar goals. I feel like I'm making this harder than it should be, but I'm at the level where I'm not sure what to google next.

I currently have an Apache web server installed locally on my MacBook via Homebrew, which I also used to install PHP, etc. I write my files in a separate project folder, then push them to my local Apache web server for testing.

Now say I want to work on this project on my desktop. Right now I just have Dropbox watching my source files, so they're readily available on both the desktop and the MacBook.

On my desktop: I use Vagrant to spin up a vanilla Ubuntu (16.04+) VM, install a LAMP stack on it, and push my source files onto that server, as in my MacBook workflow.

Problems:

When I develop on my desktop Vagrant VM, the Apache config works a little differently in each environment. I just don't feel confident given that installing a web server is done differently on each platform, with differing dependencies and other things I probably don't even know about.

I can't just run Vagrant on my Mac because of resource usage, battery life, etc.

Between the latency of connecting to a central remote development server and the fact that I sometimes can't afford to pay for a VPS, using DigitalOcean et al. as a development environment is ruled out.

Having to push all my code to a local web server on every iteration for testing seems annoying. Is this just part of it? Should I set up some bash scripts to automate this file upload? AHHH
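The "bash script to automate this file upload" idea can just as well be a small Python script that copies only changed files from the project folder into the web root. A sketch; SRC and DEST are placeholder paths:

```python
"""Sketch: push changed project files into a local web root.

SRC and DEST are placeholders; point them at your project folder and
your Apache document root.
"""
import filecmp
import shutil
from pathlib import Path

SRC = Path("~/projects/mysite").expanduser()
DEST = Path("/usr/local/var/www/mysite")

def sync_tree(src: Path, dest: Path) -> list:
    """Copy files that are new or changed; return the paths copied."""
    copied = []
    for f in src.rglob("*"):
        if not f.is_file():
            continue
        target = dest / f.relative_to(src)
        if target.exists() and filecmp.cmp(f, target, shallow=False):
            continue  # unchanged, skip
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(f, target)
        copied.append(target)
    return copied

if __name__ == "__main__":
    if SRC.is_dir():
        for p in sync_tree(SRC, DEST):
            print("updated", p)
```

Run it by hand after each change, or wire it into an editor save hook.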

I'm not at the level where I absolutely need consistency between both platforms, but it's bothering me and I'm wondering how others approach it.

I would like a workflow that offers a consistent development environment across all platforms. It's easier if it's just front-end.

Thanks loads if you got through that!

https://redd.it/nr73jw
@r_devops
Question: GitLab CI/CD environments - One-click rollback with multiple jobs

Hi all, I'm trying to utilize the rollback function and Environments in GitLab, and I'm currently trying to figure out how to use them properly.

I want to achieve a one-click rollback in the web-ui.

My current mock-up deployment pipeline is set up as below (the real CI file is quite long):


stages:
  - build
  - pre-deploy
  - deploy

docker_build:
  stage: build
  script:
    - docker build -t $TAG_COMMIT -t $TAG_LATEST .
    - docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN $CI_REGISTRY
    - docker push $TAG_COMMIT
  environment:
    name: dev
  tags:
    - builder_01

pre_deploy_server_1:
  stage: pre-deploy
  script:
    - rsync -a src gitlab@fake-host-1:/opt/deployments/$CI_COMMIT_SHORT_SHA
    - ssh gitlab@fake-host-1 "sudo ln -sfn /opt/deployments/$CI_COMMIT_SHORT_SHA /opt/appname"
  environment:
    name: dev
  tags:
    - builder_01

pre_deploy_server_2:
  stage: pre-deploy
  script:
    - rsync -a src gitlab@fake-host-2:/opt/deployments/$CI_COMMIT_SHORT_SHA
    - ssh gitlab@fake-host-2 "sudo ln -sfn /opt/deployments/$CI_COMMIT_SHORT_SHA /opt/appname"
  environment:
    name: dev
  tags:
    - builder_01

deploy:
  stage: deploy
  script:
    - docker login -u $CI_DEPLOY_USER -p $CI_DEPLOY_PASSWORD $CI_REGISTRY
    - docker stack deploy --compose-file docker-compose.yaml stack-name
  environment:
    name: dev
  tags:
    - docker_swarm_manager

The deploy stage and the build/pre-deploy stages need to be executed on separate runners, and I'm using tags for this.
In my real CI file, the rsync task is executed against 10 servers, with a lot of additional commands not listed here.

I split the rsync work into separate jobs to get enough granularity in the UI to see exactly which node a deployment failed on.


In a rollback scenario with the current setup, I need to:

* Go to the Operations -> Environments section in GitLab
* Enter the "dev" environment
* Click the rollback button for **each** of the defined jobs as per the above ci file

I'm trying to achieve a one-click rollback solution, and I'm having a hard time understanding how I **should** structure the config to achieve this. Am I trying to implement something that is not possible?
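One possible workaround while restructuring the jobs (not necessarily the intended Environments flow): GitLab's REST API can retry every job of a known-good pipeline, which a small script can turn into a one-shot rollback. A sketch; the host, project id, pipeline id, and token are placeholders:

```python
"""Sketch: one-shot rollback by re-running all jobs of a known-good pipeline
through the GitLab REST API. Host, project id, and token are placeholders.
"""
import json
import urllib.request

BASE = "https://gitlab.example.com/api/v4"

def job_urls(project_id: int, pipeline_id: int) -> tuple:
    """Return (list-jobs URL, retry-URL template) for a pipeline."""
    return (
        f"{BASE}/projects/{project_id}/pipelines/{pipeline_id}/jobs",
        f"{BASE}/projects/{project_id}/jobs/{{job_id}}/retry",
    )

def rollback(token: str, project_id: int, pipeline_id: int) -> None:
    """List the pipeline's jobs and POST a retry for each one."""
    list_url, retry_tpl = job_urls(project_id, pipeline_id)
    headers = {"PRIVATE-TOKEN": token}
    req = urllib.request.Request(list_url, headers=headers)
    with urllib.request.urlopen(req) as r:
        jobs = json.load(r)
    for job in jobs:  # re-run every deploy job of the good pipeline
        retry = urllib.request.Request(
            retry_tpl.format(job_id=job["id"]), headers=headers, method="POST"
        )
        urllib.request.urlopen(retry)

if __name__ == "__main__":
    # rollback("glpat-...", project_id=42, pipeline_id=1234) would perform it.
    print(job_urls(42, 1234))
```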

Any advice or pointers are appreciated!

https://redd.it/nr5zci
@r_devops
Tips for dealing with people who don't want to understand technology

Hi

I'm having a hard time dealing with people who don't understand the technology and don't even bother to listen to why things aren't as simple as they think.

I'm the CTO of a rather large company with multiple physical sites, and my peers and the CEO are among those who hassle me the most.

Things like "I just want to connect the damn thing to the internet" when we're talking about connecting a solar panel that requires WAN access in a scenario of chained routers, VLANs, firewalls, and VPNs.

I don't feel listened to or respected when it comes to decisions and planning around technology and governance. I get reactions like "you're overcomplicating" and "don't put problems where they don't exist". And later I get to show them that putting the cart before the horse screws things up.

It's becoming a recurring pattern, with all sorts of examples, and I lack the soft skills to manage it.

And my patience too.


Any tips?

https://redd.it/ns0oje
@r_devops
DevOps Workflow Framework Repo

Hi!
I've been working on a Python-based parallel workflow framework that is great for custom DevOps. It's still pre-alpha, but it uses an innovative paradigm for writing simple parallel task graphs that can orchestrate a variety of DevOps tasks, local or remote, across cloud, containers, repos, etc.

Have a look, and I'd appreciate any comments or contributions!
https://github.com/radiantone/entangle

Example task declarations:
@process
@aws(keys=[])
@ec2(ami='ami-12345')
def myfunc():
    return

@process
@aws(keys=[])
@fargate(ram='2GB', cpu='Xeon')
def myfunc():
    return

@process
@docker(image="tensorflow/tensorflow:latest-gpu")
def reduce_sum():
    import tensorflow as tf
    return tf.reduce_sum(tf.random.normal([1000, 1000]))


Write your own decorators and mix and match to get powerful workflows with simple Python!

I do need to expand the README for DevOps use cases, and that is coming soon.

https://redd.it/ns6fhe
@r_devops
Deploy a RoR application on an Ubuntu VM using Capistrano and GitLab CI/CD

I am getting the error below when deploying the Ruby application to an Ubuntu VM using GitLab CI:

Net::SSH::AuthenticationFailed: Authentication failed for user **[email protected]**

Here is my GitLab CI config:

deploy:
  stage: deploy
  script:
    - which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )
    - eval $(ssh-agent -s)
    - echo "$SSH_PRIVATE_KEY" | ssh-add -
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - bundle install --jobs $(nproc) "${FLAGS[@]}"
    - gem install capistrano
    - gem install net-ssh --pre
    - cap production deploy

I can access the deployment server from the GitLab runner, and I have also put the deployment server's private key in a GitLab CI variable.

Please let me know where I am going wrong or whether I am missing a step. I followed the link below, but it's not working as expected:
https://medium.com/2glab/gitlab-continuous-delivery-with-capistrano-169055a6da51

https://redd.it/ns4ais
@r_devops
Where to begin with setting up TeamCity for an existing AWS Serverless project?

Hello,

I'm new to CI/CD, and I'm trying to gain experience by incorporating it into a side project. I would like to set up TeamCity, as it's what my company uses almost universally, and I'd like to expand my skill set. Currently, I'm using the Serverless framework for my monorepo, alongside a script that builds and deploys all relevant AWS services and the React front end, but I have no idea where to begin with TeamCity. I followed a Udemy tutorial to a T, and the TeamCity setup refused to connect to an RDS DB I had created for the purposes of the tutorial.


I've tried searching for relevant tutorials/guides on how to start using TeamCity with my project, but one downside of naming a framework after a computing model is that Google only seems to spit out irrelevant results...


Any pointers or information that could point me in the right direction would be greatly appreciated.

If there's anything I've missed please do not hesitate to ask, I'm still somewhat new to AWS in general and do not have any pre-existing experience with CI/CD.

https://redd.it/ns3gn7
@r_devops
Exposing Custom Resource Statuses outside of a cluster

This might be an odd question, so apologies if it's off the wall.

In our world of GitOps, we have a variety of checks and balances that go into every Merge/Pull Request.

Once a requested change to the end system is merged to master/main, we rely on operators to pull Git templates and customer input and push them into etcd, where another operator makes the API calls to the end system from the custom resources.

Does anyone have any thoughts on how to expose the custom resource statuses to an external dashboard? We're using GKE on Google Cloud, and the native dashboards don't expose this information very well. The operators also don't expose the state of the custom resources via metrics very well either.

Just curious whether there is a pattern we should use to expose this data outside of the cluster to (perhaps) a Prometheus/Grafana stack?
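One pattern worth considering: a small in-cluster exporter that lists the custom resources and republishes their .status fields as Prometheus metrics, which Grafana can then chart. A sketch; the group/version/plural ("example.com"/"v1"/"widgets") and the metric name are hypothetical, and the real `kubernetes` / `prometheus_client` calls are left as comments so the testable part stays stdlib-only:

```python
"""Sketch: export custom-resource status as Prometheus metrics.

The CR group/version/plural and metric names are hypothetical; the
commented lines show where the real `kubernetes` and `prometheus_client`
packages would plug in.
"""

def cr_to_sample(cr: dict) -> dict:
    """Flatten one custom resource into metric labels plus a ready gauge."""
    status = cr.get("status", {})
    conditions = {c["type"]: c["status"] for c in status.get("conditions", [])}
    return {
        "name": cr["metadata"]["name"],
        "namespace": cr["metadata"].get("namespace", ""),
        "phase": status.get("phase", "Unknown"),
        "ready": 1.0 if conditions.get("Ready") == "True" else 0.0,
    }

def main():
    # from kubernetes import client, config
    # from prometheus_client import Gauge, start_http_server
    # config.load_incluster_config()
    # api = client.CustomObjectsApi()
    # gauge = Gauge("widget_ready", "CR readiness", ["name", "namespace", "phase"])
    # start_http_server(9100)  # Prometheus scrapes this port
    # crs = api.list_cluster_custom_object("example.com", "v1", "widgets")["items"]
    # for cr in crs:
    #     s = cr_to_sample(cr)
    #     gauge.labels(s["name"], s["namespace"], s["phase"]).set(s["ready"])
    pass

if __name__ == "__main__":
    main()
```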

https://redd.it/nq5bas
@r_devops
AWS NLB stuck on pending on new KOPS cluster

I have a new KOPS cluster I created today, and I am trying to get the cluster to provision an NLB so my ingress will work. I am using the YAML provided here: https://kubernetes.github.io/ingress-nginx/deploy/#aws - I have taken the file, split it up into its own sections, and everything deploys fine, except that the service for the load balancer is stuck in the pending state, and describing the service reveals nothing useful other than how long it has been in that state.

Bottom of the describe output:

Normal EnsuringLoadBalancer 103s (x47 over 3h27m) service-controller Ensuring load balancer

My ingress.yaml file

apiVersion: networking.k8s.io/v1beta1
# apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    # add an annotation indicating the issuer to use.
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-stage"
    # needed to allow the front end to talk to the back end
    nginx.ingress.kubernetes.io/cors-allow-origin: "https://api.dev.mydomain.ca"
    nginx.ingress.kubernetes.io/cors-allow-credentials: "true"
    nginx.ingress.kubernetes.io/enable-cors: "true"
    nginx.ingress.kubernetes.io/cors-allow-methods: "GET, PUT, POST, DELETE, PATCH, OPTIONS"
    # needed for monitoring - maybe
    prometheus.io/scrape: "true"
    prometheus.io/port: "10254"
    # for nginx ingress controller
    ad.datadoghq.com/nginx-ingress-controller.checknames: '["nginx","nginxingresscontroller"]'
    ad.datadoghq.com/nginx-ingress-controller.initconfigs: '{},{}'
    ad.datadoghq.com/nginx-ingress-controller.instances: '{"nginx_status_url": "https://%%host%%:18080/nginx_status"},{"prometheus_url": "https://%%host%%:10254/metrics"}'
    ad.datadoghq.com/nginx-ingress-controller.logs: '{"service": "controller", "source":"nginx-ingress-controller"}'
  name: nginx-ingress
  namespace: custom-namespace
spec:
  rules:
    - host: api.dev.mydomain.ca
      http:
        paths:
          - backend:
              serviceName: express-api
              servicePort: 8090
            path: /
    - host: socket.dev.mydomain.ca
      http:
        paths:
          - backend:
              serviceName: socketio
              servicePort: 9000
            path: /
  tls:
    - hosts:
        - api.dev.mydomain.ca
      secretName: express-ingress-cert
    - hosts:
        - socket.dev.mydomain.ca
      secretName: socket-ingress-cert

I am wondering how I can get an NLB to provision and allow me to point DNS at it and have the above ingress resource direct traffic where it needs to go.

https://redd.it/nq7yie
@r_devops
Manychat

When using ManyChat (a Facebook Messenger chatbot builder), how do you publish the bot and make it work? PS: I published the flows, but it still didn't work when someone texted the page.

https://redd.it/nq6orn
@r_devops
How to use strace on processes managed by supervisor? I.e., I want to `supervisorctl restart someService` and strace someService.

This would be great for debugging and for understanding the system I am working on better. A core mechanism of our system is a collection of services started and restarted using supervisor, and I would like to use strace to see the system calls these processes make.

I think it is possible to look up the PID of an already running process and then attach strace to it, but this will miss the process's first system calls, and I would like to get them all.

Any suggestions?
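One way to catch the very first syscalls: point supervisor's `command` at a small wrapper that execs the service under strace, so tracing starts at process launch rather than via `strace -p`. A sketch; the service path and log file are placeholders:

```python
#!/usr/bin/env python3
"""Sketch: supervisor wrapper that runs a service under strace from its very
first syscall. Point supervisor's `command` at this script with the real
service command as arguments; the paths below are placeholders.
"""
import os
import sys

def strace_argv(cmd: list, logfile: str) -> list:
    """Build the argv that replaces this process with strace-wrapped cmd."""
    # -f follows forks, -tt timestamps each call, -o writes to a file
    return ["strace", "-f", "-tt", "-o", logfile, "--"] + cmd

if __name__ == "__main__" and len(sys.argv) > 1:
    # exec keeps supervisor's process tracking intact: strace becomes the
    # direct child, and the service runs under it from startup.
    os.execvp("strace", strace_argv(sys.argv[1:], "/tmp/someService.strace"))
```

In supervisord.conf this would look something like `command=/usr/local/bin/strace_wrap.py /usr/local/bin/someService --flags` (names hypothetical), after which `supervisorctl restart someService` restarts it already traced.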

https://redd.it/nshax6
@r_devops
What are some good tutorials that would let you add some CI/CD scripts or a pipeline to any project?

What are some good tutorials that would let you add some CI/CD scripts or a pipeline to any project? Looking for some recipes that can be used in almost any project, be it backend or frontend.

https://redd.it/nsiylk
@r_devops
A question about managing Windows VM cloud node images

Hello, I am a DevOps engineer. My company uses Jenkins for CI, and we use OpenStack cloud nodes as our build nodes.

As a result of this my department handles building our Linux and Windows VM images to be used on these cloud nodes.

The problem I am running into is that the sheer number of installations on the Windows nodes is crazy. We use an SCCM task list that IT manages, and that task list has over 350 steps for our image alone.

How do you manage to keep images a reasonable size and deliver them to your developers in a timely manner? For reference, the company I work at is heavy on the embedded device side.

Any advice would help, although I would prefer more in-depth advice than "SCCM is old, use Chocolatey".

Thank you for your time.

https://redd.it/nsigfj
@r_devops
Looking for work experience

Hi all

I am between a rock and a hard place and can't seem to get myself out of this situation.

I am transitioning into tech from the oil and gas industry, where I worked as a metallurgist.

I have successfully completed the AWS Solutions Architect exam. I have also undertaken training in Docker, Python, and CloudFormation, but I can't seem to land my first role, even as a junior.

I wanted to ask the community if anyone is willing to give me some work experience. I have already lost 1.5 years because of COVID and am desperately trying to secure some work. I do not need paid work (although that would be nice), but a 3-month project would be really beneficial.

https://redd.it/nsdkz9
@r_devops
Dev site being attacked, can't access certain parts of CMS

I have 2 sites hosted on Azure, set up as App Services behind a Front Door. One site is the dev site, the other is the prod site. They are getting hit within milliseconds, anonymous contacts are being created in the database, and the DB is growing huge. It's slowing everything down, and the CMS is difficult to access.

What should I do to protect the dev and prod sites from these attacks that won't break the bank?

The people that work on the sites work remotely and the IPs they have can change depending on where they are.

Is there a way to get the IPs of the bots and block them via Azure?

https://redd.it/ns920z
@r_devops
Python use cases for devops

Hi guys, I have been learning Python lately. Is there a practical use case for Python in your work? I'm not asking about development; I want to know about use cases on the ops side of things. Whenever I automate something, I reach for shell, since I've been shell scripting for several years and it comes naturally. Has anyone tried replacing shell with Python? Any good examples of that sort, please?
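One everyday example of the shell-to-Python switch: a disk-usage alert that would otherwise be a df + awk pipeline, done with nothing but the standard library. The mount points and threshold are arbitrary:

```python
"""Example of an ops task in pure Python instead of df + awk:
warn about filesystems above a usage threshold."""
import shutil

def usage_pct(path: str) -> float:
    """Percentage of the filesystem at `path` that is in use."""
    total, used, _free = shutil.disk_usage(path)
    return used / total * 100

def over_threshold(paths: list, limit: float = 90.0) -> list:
    """Return the paths whose filesystem usage exceeds `limit` percent."""
    return [p for p in paths if usage_pct(p) > limit]

if __name__ == "__main__":
    for p in over_threshold(["/", "/var", "/tmp"], limit=90.0):
        print(f"WARNING: {p} is above 90% ({usage_pct(p):.1f}%)")
```

The win over shell shows up once you need structured data, error handling, or reuse: the same function drops straight into a cron job, a Prometheus exporter, or a test.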

https://redd.it/ns8yp3
@r_devops
Puppet and OpenShift?

Good afternoon /r/devops,

I have a question about how Puppet is ideally configured in an OpenShift environment. From my general understanding, isn't Ansible a better fit with OpenShift? I ask because a job description I've been applying to lately states typical Linux sysadmin requirements but additionally lists "preferable knowledge of Puppet and OpenShift".

So I am assuming, going on a whim, that their environments are most likely in the cloud, but I don't understand why you would use Puppet with OpenShift instead of Ansible (I thought Ansible was more commonly used with OpenShift). Additionally, architecture-wise, in a clustered cloud environment, is it common practice to keep your masters separate from your workers?

For example, would masters 1, 2, and 3 talk to workers a, b, and c using the Puppet files, and would the Kubernetes configs live within those same files?

Sorry if this is a noob question, but I'm trying to grasp the higher level of this so it clicks better and makes sense why you would use X over Y. Analogies and differences are welcome. Thanks a lot!

https://redd.it/nt5dub
@r_devops
Automating database migration with CI/CD

Hi there. I like automating deployment steps in our repos. Our current pipeline supports fully automated deployment of the software to our k8s cluster, but there's a catch: we have to migrate the database manually before merging the codebase to the main tree. I'm currently using GitHub Actions for executing pipeline jobs. I previously used GitLab CI and also tried Azure DevOps, but GitHub feels a bit friendlier to me (and has lots of community-provided actions on its marketplace).

So, I wonder if there's a way (I'm sure there is) to automate the database migration steps in CI? How is it usually done? Any tips or links would be appreciated.


A bit more details:

We are using "code first migration" and currently migrate the database manually, so it should be easy to execute the actual migration part. I just want to learn more about security and best practices before actually applying this. I'll probably create a workflow with a workflow_dispatch trigger to start the migration steps manually. But I haven't figured out how I should securely connect to the database to do the migration, or whether I should run it from our k8s cluster by creating a one-time job. I'm currently just exploring the possibilities...
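Since the post mentions code-first migrations, one common shape is a manually triggered CI job that runs the migration command with the connection string injected from a CI secret, so no credentials live in the repo. A hedged sketch; the command and the DB_CONNECTION_STRING variable name are assumptions (for EF Core the command would typically be `dotnet ef database update`):

```python
"""Sketch: run a code-first migration from a CI job, with the connection
string taken from a CI secret. The command and variable names are
placeholders; adapt them to your migration tool.
"""
import os
import subprocess

def migration_cmd() -> list:
    """The migration command to run; EF Core shown as an example."""
    return ["dotnet", "ef", "database", "update", "--no-build"]

def migrate(env: dict) -> int:
    """Fail fast if the secret is missing, otherwise run the migration."""
    if "DB_CONNECTION_STRING" not in env:
        raise RuntimeError("DB_CONNECTION_STRING not set (configure it as a CI secret)")
    return subprocess.run(migration_cmd(), env=env, check=False).returncode

if __name__ == "__main__" and "DB_CONNECTION_STRING" in os.environ:
    raise SystemExit(migrate(dict(os.environ)))
```

The alternative the post mentions, a one-time Kubernetes Job inside the cluster, has the advantage that the database never needs to be reachable from the CI runner at all, only from the cluster network.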

Thanks

https://redd.it/nt0n6j
@r_devops
A troubleshooting question

**How would you go about troubleshooting a server that is down?**


* Check the server logs to see if you can find any event that might have shut the server down.

* Reboot the server and see if that fixes the issue.

* Check network connectivity by pinging another server on the network.
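The checks above can be sketched as a small script; the hostname and log path are placeholders, and a TCP connect stands in for ping (ICMP needs raw sockets):

```python
"""Sketch of the triage steps above: reachability first, then logs.
Hostname and log path are placeholders."""
import socket
from pathlib import Path

def can_connect(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Step: check network connectivity (TCP connect instead of ICMP ping)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def recent_log_errors(logfile: str, lines: int = 200) -> list:
    """Step: scan the tail of a server log for shutdown/error events."""
    path = Path(logfile)
    if not path.exists():
        return []
    tail = path.read_text(errors="replace").splitlines()[-lines:]
    return [l for l in tail if "error" in l.lower() or "shutdown" in l.lower()]

if __name__ == "__main__":
    host = "server.example.com"  # placeholder
    print("reachable:", can_connect(host))
    print("suspect log lines:", recent_log_errors("/var/log/syslog"))
```

The reboot step is deliberately left manual: it destroys the evidence the other two steps collect, so it belongs last.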

https://redd.it/nt8a7s
@r_devops
Automate Setting Up WordPress Server

Hey folks, my friend, whose WordPress site I'm helping set up, has had me do these steps three times already, and I want to automate them:

1. Spin up an EC2 instance with Bitnami WordPress as its AMI
2. Download its key pair
3. Create an Elastic IP and associate it with the EC2 instance
4. Point a DNS record in Route 53 at the Elastic IP
5. Get the WordPress password from the EC2 system log
6. Use the bncert tool on the EC2 CLI to enable HTTPS on the WordPress site

Could someone please help me determine which DevOps tool I should use for each step?
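For reference, steps 1 and 3-5 map onto calls in boto3 (the AWS SDK for Python). This is a hedged sketch, not full automation: the AMI id, key name, hosted-zone id, and domain are placeholders, step 6 (bncert) still needs an SSH session, and the boto3 lines are commented out so the sketch runs without AWS credentials:

```python
"""Sketch of steps 1 and 3-5 with boto3. All ids and names are placeholders;
step 2 (key pair) is a one-time manual download and step 6 (bncert) is an
SSH session, so neither appears here.
"""

def dns_change_batch(domain: str, ip: str) -> dict:
    """Step 4: the Route 53 UPSERT body pointing the domain at the Elastic IP."""
    return {
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": domain,
                "Type": "A",
                "TTL": 300,
                "ResourceRecords": [{"Value": ip}],
            },
        }]
    }

def provision():
    # import boto3
    # ec2 = boto3.client("ec2")
    # r53 = boto3.client("route53")
    # # 1. launch the Bitnami WordPress AMI (id is a placeholder)
    # inst = ec2.run_instances(ImageId="ami-00000000", InstanceType="t3.small",
    #                          KeyName="wp-key", MinCount=1, MaxCount=1)
    # iid = inst["Instances"][0]["InstanceId"]
    # # 3. allocate an Elastic IP and attach it
    # eip = ec2.allocate_address(Domain="vpc")
    # ec2.associate_address(InstanceId=iid, AllocationId=eip["AllocationId"])
    # # 4. point DNS at it
    # r53.change_resource_record_sets(
    #     HostedZoneId="Z000000000000",
    #     ChangeBatch=dns_change_batch("blog.example.com.", eip["PublicIp"]))
    # # 5. the generated WordPress password appears in the console output
    # print(ec2.get_console_output(InstanceId=iid)["Output"])
    pass

if __name__ == "__main__":
    print(dns_change_batch("blog.example.com.", "203.0.113.10"))
```

If you'd rather stay declarative, the same steps also fit CloudFormation or Terraform; a boto3 script is just the lightest way to stop doing it by hand.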

https://redd.it/nsy9e6
@r_devops