Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Deploying TIG stack for global network

Hello, we're interested in getting rid of our current network monitoring system and replacing it with:
Telegraf
Influx/Kapacitor
Grafana

We stood up some test instances and really like the customization we're able to do, and we're currently playing with Kapacitor for alerting. This is entirely for monitoring our network devices via SNMP/telemetry.

The next step is to properly set up the entire stack for production, but we are a little stuck on the direction to take. We have 10+ data centers globally (500+ network devices) and would like to distribute the instances by region (AMER/APAC). What approach have you seen or would you recommend for standing up TIG in a larger environment? Thanks.

I was thinking of deploying a pair of Telegraf/InfluxDB instances at each DC or general region (Midwest/East Coast) and feeding a single Grafana instance. With that in mind, are there tools to help manage multiple Telegraf/InfluxDB hosts? Is there any way to aggregate them into a single data source, or will they have to stay separate?

https://redd.it/z28uls
@r_devops
Who defines secret management / certificate management in your company

Hi All,

Wanted to check who generally defines some of the DevOps-adjacent processes like secret management and certificate management in your organisations. In my experience, enterprise or dev architects mostly define this, not the DevOps team or a DevOps architect. For secret management, you also need to adapt a coding framework/coding methods to be able to read and use those secrets in code.

What do you think?

Thanks

https://redd.it/z2d7zy
@r_devops
Alternative to InSpec: what do you use to "assert things have been correctly configured"?

I've used InSpec in the past, running it once in a while to assert that some things are correctly configured and report back if not.

Typically:

Checking the content of some files or the status of some services after building a new AMI with Packer
Checking that security groups are correctly configured according to our "compliance du jour"

I really wanted to love InSpec for that, but it's still a PITA to use:

The docs are really not great, especially when bootstrapping a new project. Once you have resources configured and want to add more or tweak some, it's a bit better, but still.
It's Ruby, with all its dependency issues :)
Plugin support is still... interesting. I'm still not sure what to use/install to assert a few basic Kubernetes resources. Trying to install `train-kubernetes` gives me a dependency conflict error (\o/), the last `inspec-k8s` release is super old, and installing that gem installs... `inspec-k8s` 0.0.0.
I just tried a few more things and used a `k8s_deployment` resource to assert a deployment. It fails with a terrible traceback because it doesn't know the resource.


Are any of you using something else that you'd recommend?

At the moment, I'm mostly interested in testing GCP and Kubernetes resources, but that may change in the future.

https://redd.it/z2gb9d
@r_devops
robocopy time estimate

How do I calculate the time it will take to copy from source to destination using robocopy?

I performed a "dry run" test with the robocopy command and the /L parameter. It listed all of the files it would copy as well as a summary.

The log file's summary displays a Time column. Is this an estimated time to copy? If not, how can I determine the estimated copy time?
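As far as I can tell, the Times row in a /L summary reflects the elapsed time of the listing pass itself, not a prediction. A rough estimate is simply the total bytes from the summary divided by a transfer rate you measure yourself (e.g. by copying one large representative file first). A sketch, with illustrative numbers:

```python
# Rough copy-time estimate: total bytes (from the robocopy /L summary)
# divided by a measured transfer rate. Both numbers below are illustrative.

def estimate_copy_seconds(total_bytes: int, bytes_per_second: float) -> float:
    """Return the estimated wall-clock seconds to copy total_bytes."""
    return total_bytes / bytes_per_second

total = 120 * 1024**3   # e.g. 120 GiB reported by the /L dry run
rate = 60 * 1024**2     # e.g. 60 MiB/s measured with one large test file

seconds = estimate_copy_seconds(total, rate)
print(f"~{seconds / 60:.1f} minutes")   # ~34.1 minutes
```

Note that a real (non-/L) robocopy run prints a Speed line in bytes/sec you can reuse as the rate, and that trees full of small files will run well below raw disk/network throughput because of per-file overhead.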

https://redd.it/z2gs9n
@r_devops
ECS deployed on EC2 not accessible via HTTP

I have deployed an ECS cluster on EC2. The tasks are running fine, and I've even checked inside the EC2 instance that the container is running with the desired port mapping. But when I try to access it on that port, the connection is refused. curl fails, and ping to the EC2 IP gets no reply. I have configured the security group rules accordingly, but still no luck. The same Docker image runs successfully with the Fargate launch type; I only have this issue with the EC2 type and can't figure it out.
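One thing that helps triage this (a sketch, not a diagnosis of this specific cluster): "connection refused" means the instance answered with a RST, which usually points at the wrong host port (e.g. dynamic port mapping assigning a random host port) or the app binding to 127.0.0.1 inside the container, whereas a silent timeout usually points at a security group or NACL. Ping failing is a separate matter, since ICMP needs its own security-group rule. A small stdlib probe to tell the cases apart:

```python
import socket

def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify TCP reachability as 'open', 'refused', or 'filtered'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # A RST came back: the host is reachable but nothing listens on
        # that port (wrong host port mapping, or app bound to 127.0.0.1).
        return "refused"
    except OSError:
        # No answer at all: typically a security group / NACL dropping traffic.
        return "filtered"
```

Usage would be e.g. `probe("<ec2-public-ip>", 8080)` with the *host* port from `docker ps` on the instance, not the container port.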

https://redd.it/z2j8cn
@r_devops
Do you guys know where I could find stuff like this?

[https://imgur.com/a/Ay7Vdqw](https://imgur.com/a/Ay7Vdqw)

Basically, a website that showcases how to achieve the following for a service:

* Design a Resilient Architecture
* Design High-Performing Architecture
* Design Cost Optimized Architecture

https://redd.it/z2kkk2
@r_devops
BitBucket - non-consecutive manual triggers

Howdy all,

Bitbucket has manual triggers that let me choose whether or not to run a step; a deployment step is a good example. But I haven't found a way to run manual steps non-consecutively, because sometimes my pipeline won't need to execute a certain step.


Trivial code snippet:

```yaml
- step:
    name: List files
    trigger: manual
    script:
      - ls -lah
```


Thanks.

https://redd.it/z2ljpi
@r_devops
DevOps and Localization: Improving an overlooked area

My experience working as a developer is that localization in many companies (teams) sticks out as the odd child that isn't properly integrated into the development flow. What I have noticed:


- the localization process often stalls development because copy/translations are required to proceed
- developers have to copy-paste translations around that they don't understand
- localization assets are spread across Google Sheets, Jira tickets, emails, and the code base (not to mention spontaneous changes)
- a cumbersome process for adding or updating copy and translations means it often gets neglected, resulting in a suboptimal end product (unprofessional, out-of-date, unclear, or buggy content)


This adds up to a lot of wasted time and effort. I am trying to find out whether others share some of my observations, and what the best solutions are. I have also been trying out a few different solutions in a small side project.

https://redd.it/z2p0ka
@r_devops
Ensure that an ansible secrets.yml is never committed unencrypted

I use GitLab for version control and have a lot of secret variables I need to keep under version control. However, I don't want them committed in plain YAML without being encrypted first. How do people typically manage this problem?

I'm wondering if there is some kind of pre-commit hook within GitLab that I could link to a script that checks/validates the contents before accepting the commit.


edit: just found this https://aaron.cc/prevent-unencrypted-ansible-vaults-from-being-pushed-to-git/ so it seems GitLab hooks are the correct way to enforce this server-side.
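Since ansible-vault-encrypted files always begin with a `$ANSIBLE_VAULT;` header on the first line, the check itself is tiny. A sketch of the script such a hook could call (the `SECRET_PATTERNS` filename convention is an assumption; adjust it to your repo):

```python
# Check that files matching a secrets naming convention are vault-encrypted.
# Intended to be called from a pre-commit or pre-receive hook with the list
# of staged/pushed file paths.
import sys
from pathlib import Path

VAULT_HEADER = "$ANSIBLE_VAULT;"
SECRET_PATTERNS = ("secrets.yml", "vault.yml")  # hypothetical naming convention

def is_vault_encrypted(path: Path) -> bool:
    """True if the file begins with the ansible-vault header."""
    lines = path.read_text(errors="replace").splitlines()
    return bool(lines) and lines[0].startswith(VAULT_HEADER)

def main(filenames: list) -> int:
    """Return 1 (reject) if any secrets file is unencrypted, else 0."""
    bad = [f for f in filenames
           if Path(f).name in SECRET_PATTERNS and not is_vault_encrypted(Path(f))]
    for f in bad:
        print(f"ERROR: {f} is not vault-encrypted", file=sys.stderr)
    return 1 if bad else 0

# In a hook: sys.exit(main(list_of_changed_files))
```

A client-side pre-commit hook is convenient but skippable (`--no-verify`), which is why the server-side pre-receive variant from the linked post is the actual enforcement point.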

https://redd.it/z2tucy
@r_devops
Kubernetes on DigitalOcean pricing for low usage and limitations

Hello,
I'm hosting a Rails application for a client of mine and the total cost is about $30/month: 3 droplets, one running Postgres, one running the "background workers", and one running the "web" app ($6 + $6 + $18).

I've been "constantly" dealing with the annoyance of having to upgrade the servers, which deserve far more maintenance than I give them. Lately, after making a bunch of changes to the application, I realized I'd like to avoid this altogether: it makes deployment stressful, I feel uncomfortable about the security patches I need to keep up with at the server level (on top of the app!), and in general even two identical servers diverge somewhat over time.
I thought of just running Docker images inside simple droplets, but running Docker containers under systemd seems to come with some limitations (the process for killing the container is not straightforward).

Out of curiosity, I'm exploring the idea of using Kubernetes and Terraform as an alternative. I like learning, so the studying comes as a plus in some way.

Notice that this app has been up for over 10 years, but it still receives new features, so I'm expecting more requests and changes over time.

Is it possible to come close in pricing on DO using Kubernetes? I'd like to know before I commit even to studying the stuff. The main questions are:

* Basic node pricing seems to be $12, but the docs also say the cost is on a per-droplet basis. Which is it? If I use a $6 droplet for one of the nodes, will it still cost $12?
* Is a load balancer required? What's the downside of not having one in this case?
* I need a safe place to store file uploads. The app doesn't support S3 yet, so it needs to be something mounted as a filesystem. I was reading that I can't mount NFS on multiple instances, which would be a serious limitation. Is this correct? Is there anything I could share between nodes that mounts as a filesystem?
* Are my app logs kept anywhere if I need to debug?


There are quite a few things I can compromise on:

* managed Postgres: I already discussed this additional expense, so the "postgres" node will be gone
* I can pay for Spaces ($5/month); it's a waste because space usage is around 10GB, definitely not 100
* one of the machines currently has 2GB of RAM and 2 vCPUs (needed for the big Excel files generated occasionally). I can split this in two, but from my understanding I would then need a load balancer, increasing the price.

Is there any way to scale the "background worker" node to 0 without spending money on a node to orchestrate that (KEDA or Knative)? That node is used highly infrequently.

Based on my calculation, I would end up needing:

* 2 nodes: one at $12 and one at $18
* 1 managed database
* Spaces

Total cost is $50/month. I would love the worker node to be small ($6/month), but I can't figure out if DO allows that.

On top of this, I won't be able to increase availability unless I pay $12 for the load balancer, sadly.

App requirements:

* filesystem for file uploads (at some point I'll move this to something S3-like)
* Postgres
* 1GB of RAM
* worker and web processes need to be split so the workers can't starve the web process of resources
* static file serving and proxying to the main application; I usually use nginx (does it need to be another node?)
* ideally the file-upload filesystem can be shared in some way between worker and web nodes; it needs to be accessible by nginx for serving the files
* backups for the DB and Spaces
* HTTPS, currently managed by Let's Encrypt with the nginx plugin

Based on this huge wall of text (sorry), I'm not confident it's straightforward to keep the price in the same range. $50/month would be acceptable, but if I had to add a load balancer and a separate node for filesystem sharing, that would put me at $74/month, which is almost 3x the current price.
On top of that, I'm uncertain about nginx: unless it's provided, I'll need to put it on the same node as the main application, or on a separate node and pay another $12. But then how does it access the file-upload filesystem?
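The totals above can be sanity-checked with simple addition. A sketch using the post's own figures (the $15 managed-Postgres tier is my assumption to make the stated $50 work out; DO list prices change, so treat everything as illustrative):

```python
# Sanity check of the monthly totals discussed above, using the post's
# own figures. Prices are illustrative, not current DO list prices.
current = {"postgres droplet": 6, "worker droplet": 6, "web droplet": 18}

k8s_base = {
    "node (worker)": 12,
    "node (web)": 18,
    "managed postgres": 15,   # assumed cheapest managed-DB tier
    "Spaces": 5,
}

with_lb = {**k8s_base, "load balancer": 12, "extra node for shared fs": 12}

for name, items in [("current", current), ("k8s", k8s_base), ("k8s + LB", with_lb)]:
    print(f"{name}: ${sum(items.values())}/month")
```

This reproduces the $30, $50, and $74 figures, and makes the trade-off explicit: the jump from $50 to $74 is entirely the load balancer plus the extra node for filesystem sharing.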

Please forgive any term misusage, I'm realizing that "node" might be the wrong term, as I said, I know very little about Kubernetes at this time.

EDIT: Based on the reading I'm doing, it seems like I was missing the concept of a Pod.
While I don't believe the background worker necessarily belongs in the same Pod as the web app, I could put them in the same Pod and limit the resources of the background worker process. I would use a $12 droplet, and I could set autoscaling to a max of 2 or even 3 and include a load balancer. This would still bring me close to a $24 + $20 "base price", but the app would be able to tolerate bursts.
The filesystem seems to be sharable within the Pod, so this could solve that problem.

https://redd.it/z2q4cp
@r_devops
K8S Operators - How do you reserve resources on every node for system daemonsets?

Let's say my workloads are running on instances with 8 CPUs and 64GB of memory.

I need to make sure that every node, current or future, has around 2 CPUs and 16GB reserved for the daemonsets I'll have to deploy as an admin, now or later.

How do I make sure that the scheduler sees only 6 CPUs available for customer workloads, while for administration workloads it sees the remaining 2 CPUs?

I want to split lifecycle and resource management of my administration pods and customer pods.

Currently, customer workloads sometimes make it impossible to deploy administration pods due to resource saturation. (I may add more or remove some administration pods later.)

Edit: my only finding so far is to add a high priorityClass to the administration pods, but this can cause downtime for any other deployment with lower priority. I would prefer to avoid that problem from the start.
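The standard mechanism for hiding capacity from the scheduler is kubelet's Node Allocatable: `systemReserved` and `kubeReserved` in the KubeletConfiguration (or the `--system-reserved`/`--kube-reserved` flags) subtract from what the node advertises. The arithmetic the scheduler then sees, sketched with illustrative splits for the 8-CPU/64GB nodes above:

```python
# Node Allocatable arithmetic: what the scheduler sees is capacity minus
# kube-reserved, system-reserved, and any eviction threshold. The split
# below is illustrative for an 8-CPU / 64GB node.
def allocatable(capacity: dict, *reservations: dict) -> dict:
    """Subtract every reservation from the node's raw capacity."""
    return {res: capacity[res] - sum(r.get(res, 0) for r in reservations)
            for res in capacity}

capacity        = {"cpu": 8.0, "memory_gb": 64.0}
system_reserved = {"cpu": 1.0, "memory_gb": 8.0}   # OS daemons
kube_reserved   = {"cpu": 1.0, "memory_gb": 8.0}   # kubelet, container runtime

print(allocatable(capacity, system_reserved, kube_reserved))
# {'cpu': 6.0, 'memory_gb': 48.0}
```

One caveat: reserved capacity covers non-pod processes and is not schedulable at all, while daemonset *pods* still consume allocatable like any other pod. So for admin daemonsets specifically, the usual pairing is modest reservations plus exactly the high priorityClass (with preemption) mentioned in the edit; there is no built-in way to earmark allocatable capacity for one class of pods without priority or separate node pools/taints.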

https://redd.it/z2m7ep
@r_devops
How do you configure an nginx server block without an "index.html"? Is that possible?

I have this code running on DigitalOcean VM right now as a backend.

https://github.com/u/netflix-clone-back

I'm trying to configure an Nginx server block for it, but I don't have an "index.html" file to use.

I used the pm2 package to keep it running perpetually as a process, with the command "pm2 start index.js".

And since it's just a Node and ExpressJS app, I don't have an "npm build" script in the "package.json" to produce a build folder with an "index.html" to use in the server block.

My question is, how do I run the equivalent to "npm build" here?

And if I do get a build folder, will I still be able to run "pm2 start index.js" to keep it running perpetually as a process on the VM? Will I even need pm2? Or will Nginx keep it running perpetually itself?

https://redd.it/z30xl3
@r_devops
set up and tear down eks nodes to run integration tests

I'm looking for suggestions on optimizing resources when running tests in an EKS environment. We have different accounts/environments for dev, staging, test, and prod, deployed with Terraform on AWS. They all have the same architecture/resources, but we want to reduce costs, and I'm wondering if anyone has experience provisioning EKS nodes on demand. The idea is to scale the EKS worker nodes to zero and, when someone wants to run tests, provision a worker node, deploy the pods/resources, and tear the compute node down again when the tests are over. We currently use CircleCI as our CI/CD, so I wanted to ask: do you see any drawbacks/disadvantages to this plan? Is anyone doing something similar? Thanks in advance.

https://redd.it/z35acn
@r_devops
How to manage your dotfiles with git

https://fwuensche.medium.com/how-to-manage-your-dotfiles-with-git-f7aeed8adf8b

It's been a while since I wrote this post, but I just got a new computer and it was still surprisingly relevant. I'm sharing it here in the hope that it's also useful to others 🤗

https://redd.it/z363nm
@r_devops
What's a good way to design this?

I once wrote a script which would check an AWS S3 bucket for the existence of an object, and if it existed it would run a script locally on my laptop. Like a sort of automated task which I could trigger remotely by placing an object in an S3 bucket. However, it turned out that sending an API call every 30 seconds cost quite a bit, considering it's literally doing nothing most of the time.

Now that we have things like AWS EventBridge (which didn't exist when I wrote this years ago), is there a nicer way of accomplishing this? I'm curious how other people would go about it.
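For comparison, the polling side is easy to put numbers on. A back-of-envelope sketch (the $0.005-per-1,000 LIST price is the classic us-east-1 figure; treat it as illustrative, since prices vary by region and request type):

```python
# Back-of-envelope cost of polling an S3 bucket: one LIST request every
# 30 seconds over a 30-day month. The per-request price is illustrative.
POLL_INTERVAL_S = 30
LIST_PRICE_PER_1000 = 0.005   # assumed $/1,000 LIST requests

calls_per_month = (30 * 24 * 3600) // POLL_INTERVAL_S
cost = calls_per_month / 1000 * LIST_PRICE_PER_1000
print(calls_per_month, f"${cost:.2f}/month")   # 86400 $0.43/month
```

The push alternative today would be S3 Event Notifications feeding EventBridge (or SNS/SQS/Lambda directly), so you pay per object event instead of per poll, drop the 30-second latency, and no longer need anything running a polling loop at all.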

https://redd.it/z2mowl
@r_devops
DevOps/Cloud engineer reporting to a Developer?

I recently joined a company that implements a different kind of team setup. For example, one team has backend developers, QAs, a product manager, frontend/mobile devs, and one dedicated cloud/DevOps engineer, about 10-12 people per team. The lead of this team is a developer, so I'm reporting to a developer who has no idea what I do as a Cloud Engineer; they basically just throw requests at me to build and deploy (quite frankly, the same damn thing each time, just with different service names).

My question to the community is: is this a normal setup nowadays?

I'm having a hard time adjusting to it because it feels like a relegation from what I did before (leaning more towards infrastructure design and build, CI/CD, cloud solutions architecture, etc.) to what I do now (quite frankly, developer support).

Adding fuel to the fire, the feedback I get from the team is usually just patronizing stuff. No concrete feedback on the technical side (is my output any good, how can I improve this or that skill, things like that).

https://redd.it/z38epb
@r_devops
Imagine you could have a secret manager with any functionality at all. What would you want in it?

Hi everyone!

We're building a new open-source secret manager (https://github.com/Infisical/infisical) that's modern and easy to use. We're still super early but wanted to get your thoughts on what the ideal secret manager experience should look like if you could rebuild one from scratch — Imagine anything is possible.

https://redd.it/z39ij4
@r_devops