Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
Anyone know how I can POST a binary archive of JSON files (zip, tar) to API Gateway so Lambda can unzip and process those JSON files?

I'd like to grab a group of JSON files which can be mutated in Lambda and pushed into DynamoDB. Thanks :)
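
One possible shape for the Lambda side, as a minimal sketch. It assumes a Lambda proxy integration and that the API is configured to treat the upload's content type as binary, so the body arrives base64-encoded with `isBase64Encoded` set. The function name `extract_json_documents` is hypothetical, only zip archives are handled, and the DynamoDB write is left as a comment.

```python
import base64
import io
import json
import zipfile


def extract_json_documents(event):
    """Decode a base64 zip from an API Gateway proxy event and parse its .json members."""
    if not event.get("isBase64Encoded"):
        # Assumption: binary uploads always arrive base64-encoded.
        raise ValueError("expected a base64-encoded binary body")
    raw = base64.b64decode(event["body"])
    documents = {}
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        for name in archive.namelist():
            if name.endswith(".json"):
                documents[name] = json.loads(archive.read(name))
    return documents


def lambda_handler(event, context):
    documents = extract_json_documents(event)
    # Mutating the documents and writing them to DynamoDB would go here.
    return {"statusCode": 200, "body": json.dumps({"files": sorted(documents)})}
```

The archive is unpacked entirely in memory, so this only suits uploads that fit within the request size limits of API Gateway and Lambda.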

https://redd.it/lnuga9
@r_devops
Can I forward Nginx logs to an API - How can I do this?

I want to process nginx logs from multiple machines. I thought it would be nice to forward the data to an API endpoint where I can parse the logs and save them to a database. How would you tackle this?
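
For the parsing step, a minimal sketch, assuming the machines write nginx's default `combined` access log format. The function names `parse_line` and `to_payload` are hypothetical, and shipping the payload to the API (an HTTP POST from each machine, or a log shipper) is left out.

```python
import json
import re
from datetime import datetime

# Matches nginx's predefined "combined" access log format.
LOG_PATTERN = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict for one access log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["body_bytes_sent"] = int(record["body_bytes_sent"])
    # Convert nginx's $time_local into an ISO 8601 string for storage.
    parsed = datetime.strptime(record["time_local"], "%d/%b/%Y:%H:%M:%S %z")
    record["time_local"] = parsed.isoformat()
    return record


def to_payload(lines):
    """Serialize every parseable line into one JSON array for the API."""
    records = [r for r in (parse_line(line) for line in lines) if r is not None]
    return json.dumps(records)
```

Lines that do not match the pattern are dropped silently here; a real pipeline would probably want to count or log them.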

https://redd.it/lnkc4l
@r_devops
I'm a junior DevOps Engineer at my company. How do I lose the Junior?

Basically the title.
Last position I had was a sysadmin where I did mostly everything there was to do. I have studied CS, but I don't like programming very much, unless we are talking about scripting. I use bash, PS and python for my scripts.

I am working in a team of two, myself included. My colleague claims he has 9+ years of DevOps experience, but I feel I can learn nothing from him (long story and I also don't like to focus on him now). We have an external consultant and that guy is awesome. I'd love to be like that one day.

Currently we are working on Azure and it is planned that this year we're getting the AZ-102 (developer) and AZ-400 (devops) certificates. This will help ofc, but I also need to gather some more exp. with Docker & Kubernetes. We don't yet use VM automation, so that's not a priority.

I have bought some courses for CKA and CKAD. I have started with CKA, but I feel that one is not the right one for me, since what we mostly do has to do with Helm, TLS and k8s. I'd like to learn this stuff too. Some good sources would be appreciated, maybe in connection with Azure, so that I can use the newly gained knowledge asap.

Thanks :D

tldr: need to be better at helm, azure and k8s and also lose my junior in the title. What do I do?

https://redd.it/lnazlq
@r_devops
SRE vs. Platform Engineering

Over the past decade, engineering and technology organizations have converged on a common set of best practices for building and deploying cloud-native applications. These best practices include continuous delivery, containerization, and building observable systems.

At the same time, cloud-native organizations have radically changed how they’re organized, moving from large departments (development, QA, operations, release) to smaller, independent development teams. These application development teams are supported by two new functions: site reliability engineering and platform engineering. SRE and platform engineering are the spiritual successors of traditional operations teams, and bring the discipline of software engineering to different aspects of operations.

https://blog.getambassador.io/the-rise-of-cloud-native-engineering-organizations-1a244581bda5

https://redd.it/lnhwkb
@r_devops
Should DevOps Toolchain contain Azure KeyVault

Basically what the title says. In your opinion, should a tool like Azure KeyVault be in a DevOps Toolchain?

https://redd.it/lnh5er
@r_devops
Infrastructure for hosting a web scraper that scrapes huge quantities of data? (Interview Q)

Hey guys, I’ve been given an interview question to complete over the next few days which I’m a little stuck with. Basically - I’ve been asked to design the infrastructure for hosting an internal web scraper (the code for the scraper has already been written). Have to create a diagram and name the technologies (Docker/AWS/HAProxy etc) and explain my decisions. It’s a little out of my depth and I’m wondering if anyone has any resources or tips where I could learn more about infrastructure design? I know I’ll need lots of databases and probably a load balancer to divide up work between worker nodes/servers and maybe a load balancer between those and the databases? I just want to learn a bit more about the specifics so that I can design something that makes sense! I know it’s a very open ended question and there is an infinite amount to learn - but any examples or central ideologies would be great! Thanks in advance :)

https://redd.it/lnb3eh
@r_devops
training question (employer paid vs PTO)

I'm a senior developer for a major national consulting company, billing 100% for a project (years & years). My employer offers FT employees seasonal certification bootcamps - DevSecOps, AWS, Azure, etc, which are usually several days and then the exam. Senior mentors from my company do the training in-house. The company pays for the test but they don't cover all the time for attending the bootcamp -- they cover half and you have to take vacation time for the other half. (Since these certs are not required, it's legal to require PTO for training.)

I'm just curious how common this is. In my past jobs w/ non-consultant IT departments, the company covered all time & costs of training -- as a perk for the employee, and I imagine also because it benefits them to have better-qualified staff. This seems kinda cheap to me, considering the training certs are relevant to the work & tools we use on the project.

What are your experiences? Ever have to use PTO for your skills training?

https://redd.it/ln2om4
@r_devops
What are the disadvantages of going cloud-native?

So, I think my previous post about the benefits of going cloud-native (https://www.reddit.com/r/devops/comments/lkbx9e/what_cloud_native_is_really_good_for) was entertaining and certainly useful. My main take-away is that with cloud-native you design your software to make the best use of a public cloud infrastructure - with all the benefits that [public cloud infra] entails, such as scaling up and down, deploying when and where you need it, etc. All the other benefits mentioned (e.g. speed) can be realised without "cloud-native" in my view.

But surely cloud-native has its drawbacks too. Off the top of my head, I'd say performance overhead and dependence on a rather limited number of public cloud service providers.

Other views?

https://redd.it/lky489
@r_devops
Best way to learn Linux?

I've been looking at improving my core skills like networking and Linux. I was thinking about using LA playgrounds, installing Linux as dual boot on my laptop, renting a VPS etc...

Has anyone got any good recommendations?

https://redd.it/lob2ck
@r_devops
Watch Kubernetes Experts Fix Broken Kubernetes Clusters Live

I’ve launched a new series of episodes called Klustered. These episodes feature myself and a guest from the Kubernetes community attempting to fix some Kubernetes clusters. These clusters are also broken by community members 😀

We know nothing upfront. The first episode was very fun. Episodes will be live on YouTube every Thursday. Next week we have clusters broken by Jason DeTiberus and Justin Garrison.

I hope you enjoy

https://youtu.be/teB22ZuV_z8

https://redd.it/lo7a8v
@r_devops
Monitoring 5,000 nodes

Hello.

I’m curious what solutions a community like this employs for the following scenario:

We’re looking to put about 5,000 Linux boxes across America inside of stores. They serve an important purpose and will be more or less 5,000 of the same image. This is a big increase in scale for us as our existing Linux server footprint is roughly 1,500.

We currently use Zabbix but I find it lacks in scalability and supportability.

The support will require cross collaboration between Linux OS support, database support, and application developers, so I am looking for a solution where these disparate teams can write their own monitoring and alerting solutions for their use-cases relatively easily (definitely a challenge to do with Zabbix).

I’ve been thinking about Sensu but I am interested in hearing other options/experiences here.
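
On the "teams write their own checks" point: Sensu checks follow the Nagios-style convention where the exit status carries the result (0 OK, 1 warning, 2 critical) and stdout carries the message. A minimal sketch of such a check, with hypothetical thresholds and function names:

```python
import shutil

OK, WARNING, CRITICAL = 0, 1, 2


def classify(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to a check status (thresholds are illustrative)."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


def check_disk(path="/"):
    """Print a one-line result and return the status code for the given mount."""
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total
    status = classify(used_percent)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[status]
    print(f"{label}: {path} is {used_percent:.1f}% full")
    return status
```

Run as a check script, the return value of `check_disk()` would be passed to `sys.exit()` so the monitoring agent sees it as the exit status.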

https://redd.it/lo9l76
@r_devops
How do you trace root cause analysis on your microservices



Hey guys, I'm trying to gain some inspiration to rethink how I can make this process less horrible in my own life.

It seems to me that everyone uses the same method for root cause analysis (on dev/staging/prod envs): plugging everything into some ELK stack and using Kiali or another tool for log trailing of a specific MS.

The process is usually something like getting some first order cause like a request failing -> finding where it started -> going to the Log trailing tool(Kiali etc.) finding the exception -> getting the trace id -> search in Kibana with trace id -> move through massive number of lines -> find next stacktrace on another MS -> repeat until finding root cause.
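
Part of that loop can be scripted once the logs are exported. A minimal sketch, assuming every service emits JSON log lines with `timestamp`, `service`, `level` and `trace_id` fields (these field names are hypothetical) and ISO 8601 timestamps in a single timezone, so that string order matches time order:

```python
import json
from collections import defaultdict


def earliest_error_per_trace(lines):
    """Group JSON log lines by trace_id and return the first error of each trace."""
    by_trace = defaultdict(list)
    for line in lines:
        entry = json.loads(line)
        by_trace[entry["trace_id"]].append(entry)

    first_errors = {}
    for trace_id, entries in by_trace.items():
        # Same-format ISO 8601 timestamps sort chronologically as strings.
        entries.sort(key=lambda e: e["timestamp"])
        errors = [e for e in entries if e["level"] == "ERROR"]
        if errors:
            first_errors[trace_id] = errors[0]
    return first_errors
```

The earliest error in a trace is only a candidate for the root cause, but it usually points at the service to look at first.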

This is, of course, when you even have a stack trace that gives you more info. What if it is some authorization issue between services, or an issue in some other DevOps tool in the stack (istio etc.)?

Tools like datadog/splunk show the request trace and status but this doesn't solve the long root cause analysis in most of the cases

Hope you guys have something better in practice =)


Thanks in advance

https://redd.it/loawxb
@r_devops
Best practices for domain configuration

I'm setting up my own ci/cd pipeline on Docker with GitLab-CE and NGINX as reverse proxy.

I'm trying to set it up in a way where it will be fairly portable, so I can set it up quickly on different VPSes with just Docker Compose.

Right now I'm on my laptop, and in my hosts file I just point a fake local domain, local.lab, at localhost.

What's the proper, secure way of doing this, and how is it done in companies?

When I'm preparing a setup like that, should I even rely on localhost, or should I use an actual domain with SSL certificates? If you use a real domain name, how do you restrict access to it and make it secure?

https://redd.it/loaq1r
@r_devops
Question regarding database for responsive analytics

On our current project we have a web app with an analytics module. Users select some filters, and based on those filters a table or graph is shown. We want the module to be responsive, so that when users select a filter the data comes back in a matter of seconds.

User filters query a large table (~1,000,000,000 rows and 20 columns). All columns except two are filterable. Currently we are using Redshift, but it's way too slow. Also, the daily import into the table takes around 15 hours, which is also too slow.

We are deciding between ClickHouse, Vertica and BigQuery to replace Redshift.

Has anyone had a similar use case, and which database solution would you recommend?

https://redd.it/loal9f
@r_devops
Nginx / uWsgi crashing about once an hour, please help

I’m running uWSGI and Nginx with Python. About once an hour my application goes down. When it goes down, I am unable to make API calls from the frontend (or hit any URL, for that matter).


I AM still able to SSH in. When I run htop, the CPU and memory are just fine. Even our long-running scripts are running and logging correctly. The /var/log/nginx/error.log file has these main errors:

connect() to unix:///tmp/price.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: 127.0.0.1, server

and

upstream timed out (110: Connection timed out) while reading response header from upstream,

and

upstream prematurely closed connection while reading response header from upstream

I have tried increasing the max socket connections:
https://stackoverflow.com/questions/44581719/resource-temporarily-unavailable-using-uwsgi-nginx

I have tried increasing worker_rlimit_nofile and worker_connections

https://gist.github.com/denji/8359866

I have tried spinning up a heavier EC2 server (although like I said, memory and CPU are not issues)

I have tried increasing the listen setting in my uwsgi.ini file:
https://stackoverflow.com/questions/12340047/your-server-socket-listen-backlog-is-limited-to-100-connections
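
One thing worth ruling out with the listen setting: on Linux the kernel caps a socket's listen backlog at `net.core.somaxconn`, so a larger `listen` value in uwsgi.ini only helps if that sysctl is raised as well. A minimal sketch for comparing the two, assuming a Linux host and a `[uwsgi]` section in the ini file; the function names are hypothetical, and the fallback of 100 is the backlog limit named in the linked question.

```python
import configparser


def read_listen(ini_text, default=100):
    """Return the listen value from the [uwsgi] section, or the default if unset."""
    parser = configparser.ConfigParser(
        strict=False, interpolation=None, allow_no_value=True
    )
    parser.read_string(ini_text)
    return parser.getint("uwsgi", "listen", fallback=default)


def read_somaxconn(path="/proc/sys/net/core/somaxconn"):
    """Read the kernel's backlog cap (Linux only)."""
    with open(path) as handle:
        return int(handle.read().strip())


def backlog_fits(listen, somaxconn):
    """True if the requested backlog is within the kernel cap."""
    return listen <= somaxconn
```

If `backlog_fits(read_listen(text), read_somaxconn())` is false, the configured backlog is larger than the kernel will allow, which would be consistent with the "Resource temporarily unavailable" errors on /tmp/price.sock under load.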

If you have any idea what could be causing this please help, I’m running out of ideas.

https://redd.it/lon7cn
@r_devops
Is there a good certificate manager for managing all VMs, CF and K8s workload certificates?

Running in private cloud, so please no cloud vendor solutions.

https://redd.it/lol0zv
@r_devops
Thoughts on a new CI

Just saw these guys on Hackernews: https://deltaci.com

I’m tempted as the build time at my company is about 40 minutes and I’ve spent days shaving off minutes.

Is this really true? What am I missing here?

https://redd.it/lokltw
@r_devops
Publicly share IAC orchestration template for AWS/GCP/Azure etc...?

Is there a free SaaS IAC orchestrator? Basically I'm looking for something like AWS CloudFormation that I can export and give to other people, but that works for AWS/GCP/Azure etc...

Scenario: Build an IAC template that deploys a project (vm or container) that I can share with a community. The project is a node.js game server which uses a webserver

Goal: Share the 'IAC template' & wiki documentation via github to the community. Community would be able to import the template, input their parameters, deploy to their AWS/GCP/Azure account.

Reason: Bored ops + programming tinkerer person that would like a project to play with (to learn more AWS/GCP/Azure) and to support my community

Someone else has already built this in AWS CloudFormation. I could go rebuild this in Azure Resource Manager and the like.... but then there are multiple independent templates

I am about to start researching the Terraform Cloud free tier and the Pulumi free tier, but I'm wondering what other free hosted services are out there to look into.

https://redd.it/loerka
@r_devops
Industry Standards Now for CI/CD

Which technologies should I learn for setting up CI/CD, pipelines, etc? I work in an Azure environment, if that matters, and will be using containers and an orchestrator like AKS.

https://redd.it/lo5w2r
@r_devops