Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Best way to learn Linux?

I've been looking at improving my core skills like networking and Linux. I was thinking about using LA playgrounds, installing Linux as dual boot on my laptop, renting a VPS etc...

Has anyone got any good recommendations?

https://redd.it/lob2ck
@r_devops
Watch Kubernetes Experts Fix Broken Kubernetes Clusters Live

I’ve launched a new series of episodes called Klustered. These episodes feature myself and a guest from the Kubernetes community attempting to fix some Kubernetes clusters. These clusters are also broken by community members 😀

We know nothing upfront. The first episode was very fun. Episodes will be live on YouTube every Thursday. Next week we have clusters broken by Jason DeTiberus and Justin Garrison.

I hope you enjoy

https://youtu.be/teB22ZuV_z8

https://redd.it/lo7a8v
@r_devops
Monitoring 5,000 nodes

Hello.

I’m curious what solutions a community like this employs for the following scenario:

We’re looking to put about 5,000 Linux boxes across America inside of stores. They serve an important purpose and will be more or less 5,000 of the same image. This is a big increase in scale for us as our existing Linux server footprint is roughly 1,500.

We currently use Zabbix but I find it lacks in scalability and supportability.

The support will require cross collaboration between Linux OS support, database support, and application developers, so I am looking for a solution where these disparate teams can write their own monitoring and alerting solutions for their use-cases relatively easily (definitely a challenge to do with Zabbix).

I’ve been thinking about Sensu but I am interested in hearing other options/experiences here.

https://redd.it/lo9l76
@r_devops
How do you trace root cause analysis on your microservices?



Hey guys, I'm trying to get some inspiration for making this process less horrible in my own life.

It seems to me that everyone uses the same method for root cause analysis (on dev/staging/prod envs): plugging everything into some ELK stack and using Kiali or another tool to tail the logs of a specific microservice (MS).

The process is usually something like: get a first-order cause such as a failing request -> find where it started -> go to the log tailing tool (Kiali etc.) and find the exception -> get the trace ID -> search Kibana with the trace ID -> move through a massive number of lines -> find the next stack trace on another MS -> repeat until the root cause is found.
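The manual loop described above (take a trace ID, collect every line that carries it, walk from service to service until the first failure) can be sketched as a small script. This is only an illustration: the record fields (`ts`, `service`, `level`, `trace_id`, `msg`) and the sample data are assumptions, not the schema of Kibana, Kiali, or any other tool mentioned here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class LogRecord:
    # Hypothetical log schema; real ELK documents will use different field names.
    ts: float
    service: str
    level: str
    trace_id: str
    msg: str


def trace_timeline(records: Iterable[LogRecord], trace_id: str) -> List[LogRecord]:
    """Return every record that carries trace_id, oldest first."""
    return sorted((r for r in records if r.trace_id == trace_id), key=lambda r: r.ts)


def first_error(records: Iterable[LogRecord], trace_id: str) -> Optional[LogRecord]:
    """Return the earliest ERROR record in the trace, as a root cause candidate."""
    for record in trace_timeline(records, trace_id):
        if record.level == "ERROR":
            return record
    return None


# Made-up sample data: three services touched by trace "abc", plus one unrelated trace.
SAMPLE = [
    LogRecord(3.0, "gateway", "ERROR", "abc", "502 from orders"),
    LogRecord(1.0, "gateway", "INFO", "abc", "POST /checkout"),
    LogRecord(2.5, "orders", "ERROR", "abc", "payments call failed"),
    LogRecord(2.0, "payments", "ERROR", "abc", "card token expired"),
    LogRecord(2.1, "payments", "ERROR", "zzz", "unrelated trace"),
]

if __name__ == "__main__":
    culprit = first_error(SAMPLE, "abc")
    if culprit is not None:
        print(culprit.service, "-", culprit.msg)
```

"Earliest error in the trace" is only a heuristic: it depends on reasonably synchronized clocks across services, and it will not surface failures that never log an error under the trace ID.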

This is, of course, when you even have a stack trace that gives you more info. What if it is an authorization issue between services, or a problem in some other DevOps tool in the stack (Istio etc.)?

Tools like Datadog/Splunk show the request trace and status, but in most cases this doesn't shorten the long root cause analysis.

Hope you guys have something better in practice =)


Thanks in advance

https://redd.it/loawxb
@r_devops
Best practices for domain configuration

I'm setting up my own CI/CD pipeline on Docker with GitLab-CE and NGINX as a reverse proxy.

I'm trying to set it up so that it is fairly portable and I can bring it up quickly on different VPSes with just Docker Compose.

Right now I'm on my laptop, and in my hosts file I just point a fake local domain, local.lab, at localhost.

What's the proper, secure way of doing this, and how is it done in companies?

When preparing a setup like that, should I even rely on localhost, or should I use an actual domain with SSL certificates? If you use a real domain name, how do you restrict access and make it secure?

https://redd.it/loaq1r
@r_devops
Question regarding database for responsive analytics

On our current project we have a web app with an analytics module. Users select some filters, and a table or graph is shown based on those filters. We want the module to be responsive, so that when users select a filter it can fetch the data in a matter of seconds.

The user filters query a large table (~1,000,000,000 rows and 20 columns). All columns except two are filterable. Currently we are using Redshift, but it's way too slow. Also, the daily import into the table takes around 15 hours (which is also too slow).

We are deciding between ClickHouse, Vertica and BigQuery to replace Redshift.

Has anyone had a similar use case, and which database solution would you recommend?

https://redd.it/loal9f
@r_devops
Nginx / uWsgi crashing about once an hour, please help

I’m running uWSGI and Nginx with Python. About once an hour my application goes down. When it goes down, I am unable to make API calls from the frontend (or hit any URL, for that matter).


I AM still able to SSH in. I run htop, and the CPU and memory are just fine. Even our long-running scripts are running and logging correctly. The /var/log/nginx/error.log file has these main errors:

connect() to unix:///tmp/price.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: 127.0.0.1, server

and

upstream timed out (110: Connection timed out) while reading response header from upstream,

and

upstream prematurely closed connection while reading response header from upstream

I have tried increasing the max socket connections:
https://stackoverflow.com/questions/44581719/resource-temporarily-unavailable-using-uwsgi-nginx

I have tried increasing worker_rlimit_nofile and worker_connections

https://gist.github.com/denji/8359866

I have tried spinning up a heavier EC2 server (although like I said, memory and CPU are not issues)

I have tried increasing the listen setting in my uwsgi.ini file: https://stackoverflow.com/questions/12340047/your-server-socket-listen-backlog-is-limited-to-100-connections
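The "11: Resource temporarily unavailable" error on a Unix socket commonly points at a full listen queue, and a larger uWSGI `listen` value only helps if the kernel's `net.core.somaxconn` limit is at least as large. A minimal sketch for checking that relationship, assuming a Linux host; the requested value of 1024 is purely illustrative and not taken from the post:

```python
from pathlib import Path

# Linux-only location of the kernel's cap on socket listen backlogs.
SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def read_somaxconn(path: Path = SOMAXCONN_PATH) -> int:
    """Read the current net.core.somaxconn value from procfs."""
    return int(path.read_text().strip())


def backlog_fits(requested: int, somaxconn: int) -> bool:
    """True if the requested listen backlog does not exceed the kernel cap."""
    return requested <= somaxconn


if __name__ == "__main__":
    requested = 1024  # illustrative value for the `listen` option in uwsgi.ini
    limit = read_somaxconn()
    if backlog_fits(requested, limit):
        print(f"listen={requested} fits within net.core.somaxconn={limit}")
    else:
        print(f"listen={requested} exceeds net.core.somaxconn={limit}; raise the sysctl first")
```

This only rules out one possible cause. A queue that fills up once an hour can also mean the workers themselves are blocked, which the "upstream timed out" lines would be consistent with.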

If you have any idea what could be causing this please help, I’m running out of ideas.

https://redd.it/lon7cn
@r_devops
Is there a good certificate manager for managing all VMs, CF and K8s workload certificates?

Running in private cloud, so please no cloud vendor solutions.

https://redd.it/lol0zv
@r_devops
Thoughts on a new CI

Just saw these guys on Hackernews: https://deltaci.com

I’m tempted as the build time at my company is about 40 minutes and I’ve spent days shaving off minutes.

Is this really true? What am I missing here?

https://redd.it/lokltw
@r_devops
Publicly share IAC orchestration template for AWS/GCP/Azure etc...?

Is there a free SaaS IAC orchestrator? Basically, I'm looking for something like AWS CloudFormation that I can export and give to other people, but that works for AWS/GCP/Azure etc.

Scenario: Build an IAC template that deploys a project (vm or container) that I can share with a community. The project is a node.js game server which uses a webserver

Goal: Share the 'IAC template' & wiki documentation via github to the community. Community would be able to import the template, input their parameters, deploy to their AWS/GCP/Azure account.

Reason: Bored ops + programming tinkerer who would like a project to play with (to learn more AWS/GCP/Azure) and to support my community.

Someone else has already built this in AWS CloudFormation. I could rebuild it in Azure Resource Manager and the like, but then there would be multiple independent templates.

I am about to start researching the Terraform Cloud free tier and Pulumi's free tier, but I'm wondering what other free hosted services are out there to look into.

https://redd.it/loerka
@r_devops
Industry Standards Now for CI/CD

Which technologies should I learn for setting up CI/CD, pipelines, etc.? I work in an Azure environment, if that matters, and will be using containers and an orchestrator like AKS.

https://redd.it/lo5w2r
@r_devops
Question about learning AWS

Hey guys, I am currently a software student and I am interested in getting a job in DevOps. I am trying to improve my Python, Git and Linux knowledge, and I saw that it is also important to learn a cloud provider such as AWS from the start, but I'm not sure what that means. Should I just learn the material from certification resources even if I'm not going to take a cert exam, or should I focus on certain services? If so, any recommendations on what to focus on that is most relevant to DevOps?

https://redd.it/lo9vjc
@r_devops
Is there any good tutorial on how to make a dev and a production environment for a WordPress application?

I am trying to learn the basics so that I can build a dev and a production environment with Docker for a very simple WordPress application.

https://redd.it/lo8eb9
@r_devops
Question about setting rolling updates and pipelines

I'm trying to get a better understanding of DevOps concepts and haven't had much luck reading through the AWS documentation for rolling updates.

I'm aware of how rolling updates are supposed to work; my question is more about the specifics of how they would be configured.

Is there a specific AWS tool that would work best to set up automated rolling updates?

My example scenario would be a working pipeline that deploys to a test instance. The rolling update would then be set up and applied from the test instance to a live production environment using CloudFormation (or is there a better service for this?).

https://redd.it/lo92fu
@r_devops
How to run elixir commands with gitlab-ci

To automate an Elixir-based application, I created a `systemd` service:

```
[Service]
Type=simple
User=gitlab-runner
Group=gitlab-runner
Environment=LANG=en_US.UTF-8

WorkingDirectory=/path/to/elixirModule

ExecStart=/path/to/elixirModule/bin/elixir_module start
ExecStop=/path/to/elixirModule/bin/elixir_module stop
```

In `.gitlab-ci.yml`
```
build-project:
  stage: build_elixir_module
  tags:
    - elixer-shell
  script:
    - mix ecto.drop
    - MIX_ENV=prod mix release
    - ls -lh _build/prod
    - cp _build/prod/elixir_module-0.1.0.tar.gz /path/to/elixirModule/
    - tar -xvf /path/to/elixirModule/elixir_module-0.1.0.tar.gz
    - sudo systemctl stop elixirModule
    - cd /path/to/elixirModule
    - bin/elixir_module start_iex
    - EctoMnesia.Storage.storage_up(ElixirModuleRepo.config)
    - ElixirModuleRepo.ReleaseTasks.migrate
    - sudo systemctl stop elixirModule
```

These two commands should be executed within `iex` terminal.
```
EctoMnesia.Storage.storage_up(ElixirModuleRepo.config)
ElixirModuleRepo.ReleaseTasks.migrate
```

With the current configuration, I get the following error:

> iex(elixir_module@node1)1> $ EctoMnesia.Storage.storage_up(ElixirModuleRepo.config)
1468 bash: eval: line 125: syntax error near unexpected token `ElixirModuleRepo.config'

How can I run these two commands inside `iex`? Also, if my approach is buggy or the config is wrong in any way, please let me know.

https://redd.it/lnxjmf
@r_devops
Node Express microservice on AWS Fargate with Terraform

Hello community,

I have created a reference project to deploy a Node Express microservice onto Amazon ECS on AWS Fargate with Terraform. I hope you find this useful!

* Node Express app containerised with Docker
* CI/CD with AWS CodePipeline
* Deploys app on AWS Fargate
* Creates, and retrieves data from MongoDB
* AWS resources managed in Terraform

If you find this useful, please give this project a star!

Github project URL: [https://github.com/MatthewCYLau/node-aws-fargate-terraform](https://github.com/MatthewCYLau/node-aws-fargate-terraform)

https://redd.it/lnsa9w
@r_devops
Open API Enabler (?)

This might be a dumb question. I was thinking about open APIs. Customers always want one from a platform so they can tie a tool into the rest of their ecosystem. For some reason, vendors mostly take their time creating them. Then a bunch of documentation needs to be written to help customers tie it in, or the customer needs some folks who can do it on their own. A lot of the time I notice customers don’t even end up using it much.

Do you think there is a way to make some sort of open API enabler tool? Something that speeds up the process and makes it easier for vendors to get it set up and for customers to tie it in faster and more effectively?

That is vague and, like I said, may be a dumb question. I'm hoping people more technically savvy than I am may have some answers.

https://redd.it/lnuv5z
@r_devops
I'm Looking for any recommendations on where to find log management tips and best practices

I'm looking to increase my knowledge of log management best practices for security and infrastructure, and I wanted to ask the experts for suggestions on good training or YouTube videos. I prefer free training, of course, but I will take low-cost options as well. Here is what I have found so far. Does anyone have anything valuable to add?



Log management best practices for SIEM (Youtube)

https://www.youtube.com/watch?v=t5NOhVmhbGE

Advanced Log Management Course (6 sessions) (Live)

https://www.humio.com/advanced-log-management-course-strategies-techniques-and-tactics

Advanced Techniques for AWS Monitoring, Metrics and Logging Course (Pre-recorded)

https://cloudacademy.com/course/advanced-techniques-for-aws-monitoring-metrics-and-logging/introduction-27/

https://redd.it/lnrsqs
@r_devops
what are your experiences

.      Describe your experience with DevOps platforms, source code management, CI/CD pipelines, et cetera, including experience developing and maintaining pipelines. Mention security tools used in pipelines, if any.


3.      What AWS native services have you used/deployed and what method did you use to deploy them?


4.      Describe your VMware Cloud on AWS experience if any.


5.      Describe your experience with VMware SRM or equivalent third-party solution. Please include DR related experience and how many protected VMs, etc.


6.      List the scripting and/or programming languages you have experience with and give an example or two about a script you wrote.


7.      Describe your experience with deploying new Windows or Linux Server builds, can include both VMware templates and Cloud native.


8.      Describe your experience with networking and network equipment, such as routing and firewalls.


9.      In one sentence, what is your favorite product or solution to work with?

https://redd.it/lnyibs
@r_devops
Blog Last week --> Kernel 5.11; Schedule IstioCon 2021; Disaster Recovery for Consul; AWS EKS 1.19; +35 other news and press releases

Keep informed: one place, many sources! This is my weekly post, where I collect news/* from the last week and make this batch news/* post.

Feedbacks/suggestions/* are always welcome :)

See on Medium: https://lozanomatheus.medium.com/7387db26d017?source=friends_link&sk=04f1bb2e9ecc56253db5b267152b24c4

See on my Website: https://www.lozanomatheus.com/post/week07-news-updates-reminders-aws-hashicorp-istio-kubernetes-linux

https://redd.it/lp5n29
@r_devops