Log aggregation and parsing
Hello redditors working on log aggregation and log parsing: what tools do you use for log aggregation and parsing? Are you using any vendor tools or open-source solutions like NiFi, Storm, etc., and why?
I am working on a project and would like to learn from the experience of someone who has implemented these solutions. I am trying to understand the pros and cons of the different options.
https://redd.it/l012fy
@r_devops
Best way to manage multiple game server containers?
Hi all. Long time lurker, first time poster. I am looking to create several game servers (personal servers hosted for the public) for older games such as Unreal Tournament and Call of Duty 2, and would like to avoid locking myself into specific services. That is, I would like to remain as vendor neutral as possible (aside from Docker, I suppose) so that, if for some reason I feel the need to change vendors, I can do so with relative ease. My thought was to build Docker containers with a base image for each game (probably on Alpine, if possible) and then modify each container as needed by attaching the necessary config file, third-party maps, etc. I will have over 10 servers, each in its own Docker container and probably each on its own individual instance (a t3.nano on AWS or equivalent). I realize managing all of these containers/servers might get unruly, so I was trying to understand the best way to manage them. I have very little exposure to Kubernetes, but would this be one method of managing these, or would that be a separate use case? Any thoughts much appreciated!
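As a sketch of the per-game layout described above (image names, ports, and volume paths are hypothetical placeholders, not real published images), one container per server in Compose might look like:

```yaml
version: '3'
services:
  ut99:
    image: myrepo/ut99-server:latest   # hypothetical image built on Alpine
    restart: unless-stopped
    ports:
      - "7777:7777/udp"
    volumes:
      - ./ut99/config:/server/config   # per-server config attached at run time
      - ./ut99/maps:/server/maps       # third-party maps
  cod2:
    image: myrepo/cod2-server:latest   # hypothetical
    restart: unless-stopped
    ports:
      - "28960:28960/udp"
    volumes:
      - ./cod2/config:/server/config
```

Kubernetes would express the same idea as one Deployment per game, but for ~10 mostly static servers a plain Compose file per host is often enough.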
https://redd.it/l04xiw
@r_devops
DNS Load balancers?
Hello,
I'm currently working on a DDoS protection service for locations that other providers don't cover. The host I'm using offers cheap servers but low bandwidth (750 GB). That's plenty for a few customers home-hosting an SMP for their friends, but not nearly enough for a few hundred players on the network 100% of the time.
So far, I have found no TCP load balancers that can provide load balancing without also proxying the connection. I started looking into the DNS side of things; so far, what I have found that could possibly work is a DNS round-robin solution. This would forward a player to any proxy in the proxy list.
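For reference, the round-robin idea is just multiple A records for the same name; in a BIND-style zone file (hostnames and IPs below are made up):

```
; clients resolving play.example.com get these records in rotating order
play.example.com.  60  IN  A  203.0.113.10   ; proxy 1
play.example.com.  60  IN  A  203.0.113.11   ; proxy 2
play.example.com.  60  IN  A  203.0.113.12   ; proxy 3
```

A short TTL lets you pull a saturated or attacked proxy out of rotation quickly, though resolvers that ignore TTLs blunt this somewhat.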
If anyone has a better solution to this problem, please contact me.
Mitch, AusGuard
https://redd.it/l0e50o
@r_devops
Laptop?
I’ve always used a MBP for all of my dev and admin stuff at my previous job without an issue.
But I’m now taking a new Sr Role and they asked what I wanted.
Have any MBP users had any DevOps Mac regrets?
I just need my stuff to work.
https://redd.it/l023r8
@r_devops
docker-compose failing
Hey,
I'm just a beginner at this.
I built this docker-compose file; in the same folder I have a folder for the app and a Dockerfile.
Why do I keep receiving these errors after running docker-compose up?
Is there anything wrong with the .yml?
```yaml
version: ‘3’
services:
  consul:
    image: consul
    restart: unless-stopped
    command: agent -server -ui -node=server-1 -bootstrap-expect=1 -clinet=0.0.0.0
    ports:
      - 0.0.0.0:8500:8500
      - 0.0.0.0:8600:8600/udp
    networks:
      - consul
  proudctionapp:
    build: .
    ports:
      - 0.0.0.0:8000:8000
    networks:
      - consul
networks:
  consul:
```
**The errors I receive:**
```bash
Traceback (most recent call last):
  File "bin/docker-compose", line 6, in <module>
  File "compose/cli/main.py", line 71, in main
  File "compose/cli/main.py", line 124, in perform_command
  File "compose/cli/command.py", line 41, in project_from_options
  File "compose/cli/command.py", line 113, in get_project
  File "compose/config/config.py", line 385, in load
  File "compose/config/config.py", line 385, in <listcomp>
  File "compose/config/config.py", line 518, in process_config_file
  File "compose/config/config.py", line 226, in get_service_dicts
  File "distutils/version.py", line 46, in __eq__
  File "distutils/version.py", line 337, in _cmp
TypeError: '<' not supported between instances of 'str' and 'int'
[3972] Failed to execute script docker-compose
```
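For what it's worth, a sketch of a likely fix (an educated guess from the traceback, not verified against this exact setup): the `version: ‘3’` line uses curly "smart" quotes, so YAML stores the literal string ‘3’ and Compose's version comparison fails exactly as the distutils frames suggest. With plain quotes, consistent two-space indentation, and the `-clinet`/`proudctionapp` typos fixed, the file would look like:

```yaml
version: '3'
services:
  consul:
    image: consul
    restart: unless-stopped
    command: agent -server -ui -node=server-1 -bootstrap-expect=1 -client=0.0.0.0
    ports:
      - "0.0.0.0:8500:8500"
      - "0.0.0.0:8600:8600/udp"
    networks:
      - consul
  productionapp:
    build: .
    ports:
      - "0.0.0.0:8000:8000"
    networks:
      - consul
networks:
  consul:
```

Quoting the port mappings also sidesteps YAML's habit of misreading unquoted `a:b` values.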
Thanks...
https://redd.it/l04g76
@r_devops
Creating a SaaS App
Hello all,
I apologize for not being able to be more specific, but I'm going to do my best with what I know.
Currently, we have an application that was built in-house on .NET and is written in C#. We feel this application is a really strong tool that could be templated and resold as a SaaS application to other service companies in this industry. I have no idea where to start or what to look for in this scenario. My background is basically jack-of-all-trades sysadmin. I do have a developer on staff who helped create this application.
What should I be looking for to better understand how to make this happen? Who should I be talking to about hosting? How do I create an environment where we can rebuild/develop the app from scratch? Basically looking for a getting started guide for building your own SaaS solution based on an existing application on-prem.
TIA
https://redd.it/l03162
@r_devops
Crucial differences in MLOps for deep learning in comparison to other ML approaches
I am new to the field of MLOps and about to set up a pipeline for a deep learning project based on TensorFlow.
I am looking for differences when comparing deep learning to other machine learning approaches in the context of MLOps. So far I have only found resources that introduce the general MLOps principles. Does anybody know what pipeline components may differ specifically? Which aspects may be more challenging?
What tools & resources would you suggest to start with? Any recommendations?
Thanks!
https://redd.it/l02vva
@r_devops
Samba - only allow users access their folder (and custom dirs)
Currently, users in our Samba directory can access any folders, including each other's. I've found that by including this line in the config file: `valid users = %S`, users can only access their home directories.
Now the only problem is that other folders are no longer accessible to everyone, and I can't seem to get that working.
How do I make home/USER accessible to USER while keeping other dirs open for everyone?
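A common smb.conf layout for this (share names and paths below are illustrative) keeps `valid users = %S` in the [homes] section only, and defines the open directories as separate shares:

```
[homes]
   browseable = no
   read only = no
   valid users = %S        ; each user only reaches their own home

[shared]
   path = /srv/samba/shared
   browseable = yes
   read only = no
   guest ok = yes          ; open to everyone; tighten as needed
```

That way the per-user restriction applies to home directories without touching the open shares.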
Thanks ahead.
https://redd.it/l0jlxv
@r_devops
Don't pay to learn Flask with python! Just watch this free course of Flask with python!!
https://www.youtube.com/watch?v=6ea9KxusS0M&t=36s
This is the third tutorial to my ultimate flask series :)
https://redd.it/l0k3y8
@r_devops
What does GitLab offer that GitHub doesn't?
As far as I can tell the two platforms are essentially the same. I currently use GitHub, but I've had a couple of people suggest I look into GitLab. I've done some basic research, and I can't see anything that would make me go "Huh, yeah I should switch to GitLab" or at least consider using both platforms. Does GL offer something that GH doesn't? On the personal or business level?
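One concrete difference people often point to is GitLab's built-in CI/CD (GitHub Actions has since narrowed the gap): a pipeline lives in a `.gitlab-ci.yml` at the repo root and runs without any third-party service. A minimal sketch (image names and commands are illustrative):

```yaml
# minimal .gitlab-ci.yml: two stages run by GitLab's built-in CI
stages:
  - test
  - build

test:
  stage: test
  image: python:3.9
  script:
    - pip install -r requirements.txt
    - pytest

build:
  stage: build
  image: docker:latest
  script:
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
```

`CI_REGISTRY_IMAGE` and `CI_COMMIT_SHORT_SHA` are variables GitLab provides automatically, along with a built-in container registry per project.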
https://redd.it/l0nkiy
@r_devops
Has anyone moved from DevOps to backend engineering?
Has anyone done such a career change?
I know many peers who were previously backend devs and then moved to DevOps, but I haven't heard of anyone doing the opposite.
If you did so, what was the reason?
https://redd.it/l0xyim
@r_devops
As a beginner, how can I use a Raspberry Pi to learn?
Hi everyone--
I was gifted a Raspberry Pi 4 recently. I have been really interested in learning about DevOps and cloud computing.
Are there any projects or things you recommend to do with a Raspberry Pi (that a beginner could figure out)? I want to use the Pi as a "learn by doing" tool to get hands-on experience while I begin learning.
Has anyone used a Raspberry Pi to learn and recommend any projects?
https://redd.it/l0yw9p
@r_devops
What do you prefer for managing AWS EKS cluster? eksctl or Terraform or something else?
HashiCorp Terraform can handle (almost) any infrastructure management need, but if all we need is an AWS EKS cluster, maybe eksctl is a better choice, especially since it acts as both a CLI and an Infrastructure-as-Code (IaC) tool. WDYT?
>>> https://youtu.be/pNECqaxyewQ
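For context, eksctl's IaC side is a declarative config file (cluster name, region, and node group values below are placeholders):

```yaml
# cluster.yaml — applied with: eksctl create cluster -f cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster
  region: eu-west-1
nodeGroups:
  - name: ng-1
    instanceType: t3.medium
    desiredCapacity: 2
```

The file can live in version control like any Terraform module, which is what makes the CLI-vs-IaC comparison interesting.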
https://redd.it/l0mvwl
@r_devops
Tool to monitor latest releases for different applications?
Hi! We're building a platform using EKS, helm, Terraform, the usual.
Since many of the tools we're using are fast evolving, we're looking for a tool that can be used to check for available updates for different technologies. Imagine a dashboard that shows you "EKS on cluster A has version v1.17, cluster B has v1.16, newest release is v1.18; helm chart foo is deployed with release 3.2.0, newest release is 3.4.0".
I can't seem to find something like that. One approach would be to somehow create metrics to push to Prometheus, then build a Grafana dashboard for that.
Any recommendations? Or search terms? Thanks in advance!
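As a sketch of that Prometheus-style approach (component names and versions below are invented; in practice the "latest" values would come from the upstream release feeds or registries), the core is just a version comparison whose results you could export as gauges:

```python
def parse_version(v):
    """Turn 'v1.17' or '3.2.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in v.lstrip("v").split("."))

def outdated(deployed, latest):
    """Return {name: (running, newest)} for every component behind latest."""
    return {
        name: (running, latest[name])
        for name, running in deployed.items()
        if name in latest and parse_version(running) < parse_version(latest[name])
    }

# invented example data mirroring the dashboard described above
deployed = {"eks/cluster-a": "v1.17", "eks/cluster-b": "v1.16", "helm/foo": "3.2.0"}
latest = {"eks/cluster-a": "v1.18", "eks/cluster-b": "v1.18", "helm/foo": "3.4.0"}

for name, (running, newest) in sorted(outdated(deployed, latest).items()):
    print(f"{name}: running {running}, newest {newest}")
```

Exposing each pair as a prometheus_client Gauge with name/version labels would then give Grafana something to plot.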
https://redd.it/l14z1h
@r_devops
Plugin for Intellij IDE to manage your Gitlab CI builds
I've created a plugin to manage your GitLab CI builds from IntelliJ-based IDEs (WebStorm, PyCharm, Android Studio, etc.). It lets you:
- List pipelines from your projects
- Check status
- Trigger, abort, and retry pipelines
- Work with gitlab.com and self-hosted GitLab editions
The plugin can be downloaded from the IDE or from JetBrains: https://plugins.jetbrains.com/plugin/15457-gitlab-ci
Feedback appreciated
https://redd.it/l15m9w
@r_devops
What are my options for speeding up pulls of frequently-used docker images in CI?
My team's using CircleCI right now, and I'm frustrated at the number of cache misses we get for docker images -- even Circle's own standard docker images sometimes take nearly a minute to load. I have ambitions for a big rewrite of our CI pipeline, and I'd love to make a custom docker image for this purpose, but as it stands I'm worried it's going to make the startup time for each job unacceptably long.
I'm curious what the range of options is for caching images in the cloud. Specifically, if I want to:
* Cache a very small number of images
* Which change very infrequently
* With most container starts under 5s
What's the ideal approach? I know I can use a cache at the registry level, but I'm not sure why that would be more performant than the registries themselves, given that (presumably?) all this stuff is happening in the same data centers anyway. Does it really help to stand up a server just for my images, for such little traffic? Does it make more sense to have long-lived CI service instances that we host ourselves, with attached SSD storage? Are there any other clever tricks I'm not thinking of? I'm open to self-hosting something, if that gives me better options.
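On the registry-level cache: Docker's open-source registry (distribution) can run as a pull-through cache with essentially a one-line config, so hosting it on (or near) the CI runners keeps repeat pulls on the local network instead of crossing to Docker Hub each time. A sketch (paths are illustrative):

```yaml
# config.yml for a registry:2 container acting as a pull-through cache of Docker Hub
version: 0.1
storage:
  filesystem:
    rootdirectory: /var/lib/registry
proxy:
  remoteurl: https://registry-1.docker.io
http:
  addr: :5000
```

The win over pulling from the upstream registry directly is locality and a warm layer cache on your own disk; whether that beats the hub's CDN depends on where the runners live.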
https://redd.it/l0zbvn
@r_devops
Learning and Applying the Basics (at home or at work!)?
Hi all,
First time poster, medium time lurker...
I've recently come to the realisation that I'd like to make a move to the devops space with a focus on SRE (in my current role I'm the guy who always asks "Where is this documented?" and "What can we do to automate or improve this process or task?", and when we get a high-severity incident I'm usually marshalling the teams, updating the client, ensuring that the problem is worked, and writing up the post-incident report).
So on the back of that I wanted to ask if the following would be a good roadmap for me:
https://medium.com/@devfire/how-to-become-a-devops-engineer-in-six-months-or-less-366097df7737
In terms of background, prior to my current role I worked in what my employer called a "Techops" role - we were the team who monitored application and environment health, investigated and resolved any issues that occurred using bash scripts and used puppet for deployments - however, we didn't do any configuring of these tools - merely used them from a runbook, fixed basic issues or escalated to the devops team. Other tools that we used were Splunk and Nagios, plus a small amount of Riak and Redis and the environments were all Ubuntu.
My current role is more of a senior helpdesk role who's the primary contact for a key client in a Windows shop who are starting to use Azure for hosting Ubuntu servers and Splunk to monitor. They are getting into the Devops mindset but my role won't intersect with this despite conversations with my manager about my previous skillset (mainly the Splunk stuff).
Prior to these roles I bounced around various IT roles, some of which were workstation, others were data centre tech and others were networking roles so I have a plethora of experience but very few certs to back it up.
The way I see it is that I either need to prove to my manager that I have the ability to do the work via training or look for a new role which means that I'll need to build out some sort of homelab to get the experience.
I appreciate that I won't be a fully skilled-up DevOps engineer in 6 months, but I wanted to ask if the route is a good one?
I'm currently using LinkedIn Learning to upskill on Linux, Python and then will be working on Cloud computing basics before I get into the meat and potatoes of the Devops methodology and tools.
Thanks for the help!
https://redd.it/l0zzae
@r_devops
Redhat Container repo on Azure Container Instances
Howdy,
I am wondering if we are able to utilise the Red Hat Container Catalog to create new Azure Container Instances. It seems like you need to authenticate with Red Hat to pull the image.
Specifically, I am trying to get the zabbix-appliance image.
https://redd.it/l0tky6
@r_devops
CRD for CRDs to design multi-tenant platform services from Helm charts
Kubernetes platform engineering teams prepare Kubernetes clusters for sharing between multiple users and workloads. This involves building Helm charts for a variety of operational workflows. We have developed a framework to turn such Helm charts into Kubernetes-style APIs. Here is a link to the blog post about it:
https://medium.com/@cloudark/crd-for-crds-to-design-platform-services-from-helm-charts-e83816974e47
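For readers unfamiliar with the building block involved: a Kubernetes-style API is declared with a CustomResourceDefinition, e.g. a hypothetical tenant-facing resource that a controller would reconcile by installing the corresponding Helm chart (the group, kind, and fields below are invented for illustration):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: loggingstacks.platform.example.com   # hypothetical group/kind
spec:
  group: platform.example.com
  scope: Namespaced
  names:
    kind: LoggingStack
    plural: loggingstacks
    singular: loggingstack
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                retentionDays:
                  type: integer
```

Tenants then `kubectl apply` a LoggingStack object instead of running helm install themselves.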
https://redd.it/l18kp1
@r_devops
How to create an EFK logging stack on Amazon?
Until now we were hosting our clusters ourselves on bare metal but now we decided to migrate to AWS. For logging we use EFK (Elasticsearch, Fluentbit and Kibana) which are based on Helm charts and all were running on the Kubernetes clusters (we have multiple clusters and launching the EFK was always easy).
On AWS we are using EKS, and we decided to use the AWS Elasticsearch service. After creating the ES domain, putting it in the same VPC as the EKS cluster, and testing reachability (from within EKS I can successfully curl the ES), I am now facing many problems connecting Kibana and Fluentbit to ES.
Kibana problems:
I tried to use the Kibana Helm chart that we use in our clusters, but I always get the error below:
```bash
{"type":"log","@timestamp":"2021-01-19T16:04:02Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"green","message":"Status changed from red to green - Ready","prevState":"red","prevMsg":"Authorization Exception"}
{"type":"log","@timestamp":"2021-01-19T16:04:02Z","tags":["fatal","root"],"pid":1,"message":"Error: Index .kibana_1 belongs to a version of Kibana that cannot be automatically migrated. Reset it or use the X-Pack upgrade assistant.\n at assertIsSupportedIndex (/usr/share/kibana/src/server/saved_objects/migrations/core/elastic_index.js:246:15)\n at Object.fetchInfo (/usr/share/kibana/src/server/saved_objects/migrations/core/elastic_index.js:52:12)"}
FATAL Error: Index .kibana_1 belongs to a version of Kibana that cannot be automatically migrated. Reset it or use the X-Pack upgrade assistant.
```
I tried changing the version of ES or Kibana, using the OSS image, etc., but I couldn't solve the problem. Since I couldn't find any "reset" command/button in the AWS CLI or dashboard, I deleted the ES cluster and recreated it, but I got the same error.
I also tried to connect to the Kibana endpoint of the AWS ES service, but although the ES domain is in a public subnet I couldn't access it. In the end I ran an Nginx in EKS and exposed the Kibana plugin endpoint through it, and now I can access the Kibana dashboard.
Fluent Bit problems:
First I tried simply passing the AWS ES endpoint to Fluent Bit as an environment variable ("FLUENT_ELASTICSEARCH_HOST"). This didn't work: although there were no errors in the Fluent Bit logs, no logs arrived in ES.
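For reference, if the env-var route is preferred: in the upstream Fluent Bit Kubernetes manifests (assumption: the fluent/fluent-bit-kubernetes-logging DaemonSet, which is what defines these variables), the Elasticsearch port defaults to 9200, so an AWS ES endpoint listening on 443 would also need the port set. A hedged sketch of the relevant container env:

```yaml
# Sketch of the DaemonSet container env; variable names are the ones
# used by the upstream fluent-bit Kubernetes manifests (assumption).
env:
  - name: FLUENT_ELASTICSEARCH_HOST
    # bare hostname only -- no https:// scheme
    value: "vpc-elasticsearch-cluster-xxxxxxxxxxx.eu-west-1.es.amazonaws.com"
  - name: FLUENT_ELASTICSEARCH_PORT
    value: "443"
```

Even then, TLS for port 443 would still need to be enabled in the image's `es` output configuration; the Helm-chart route handles that via `tls On`.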
Then I found that newer versions of Fluent Bit support AWS ES as an "output", so I tried the newer Fluent Bit Helm chart and configured the output as below:
```yaml
## https://docs.fluentbit.io/manual/pipeline/outputs
outputs: |
  [OUTPUT]
      Name            es
      Match           *
      Host            https://vpc-elasticsearch-cluster-xxxxxxxxxxx.eu-west-1.es.amazonaws.com
      Port            443
      AWS_Auth        Off
      AWS_Region      eu-west-1
      tls             On
      Logstash_Format On
      Retry_Limit     False
```
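One likely culprit, assuming the standard Fluent Bit `es` output plugin: `Host` is passed to DNS resolution as-is and is expected to be a bare hostname, not a URL (the scheme is already implied by `tls On`), which is consistent with the getaddrinfo failures in the log further down. A hedged sketch of the same output block with the scheme stripped:

```yaml
## Sketch only: identical to the config above except that Host is a
## bare hostname; the endpoint placeholder is kept as in the post.
outputs: |
  [OUTPUT]
      Name            es
      Match           *
      Host            vpc-elasticsearch-cluster-xxxxxxxxxxx.eu-west-1.es.amazonaws.com
      Port            443
      AWS_Auth        Off
      AWS_Region      eu-west-1
      tls             On
      Logstash_Format On
      Retry_Limit     False
```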
And now Fluent Bit is giving me the error below:
```bash
[2021/01/20 15:13:29] [error] [io_tls] flb_io_tls.c:359 NET - Connection was reset by peer
[2021/01/20 15:13:29] [error] [io_tls] flb_io_tls.c:359 NET - Connection was reset by peer
[2021/01/20 15:13:29] [error] [io_tls] flb_io_tls.c:359 NET - Connection was reset by peer
[2021/01/20 15:13:29] [ warn] [engine] failed to flush chunk '1-1611149110.838858668.flb', retry in 1341 seconds: task_id=910, input=systemd.1 > output=es.0
[2021/01/20 15:13:29] [ warn] [engine] failed to flush chunk '1-1611147719.849830914.flb', retry in 1817 seconds: task_id=363, input=tail.0 > output=es.0
[2021/01/20 15:13:29] [ warn] [engine] failed to flush chunk '1-1611153638.73469427.flb', retry in 968 seconds: task_id=1812, input=systemd.1 > output=es.0
[2021/01/20 15:13:30] [ warn] net_tcp_fd_connect: getaddrinfo(host='https://vpc-elasticsearch-cluster-xxxxxxxxxxx.eu-west-1.es.amazonaws.com'): Name or service not known
[2021/01/20 15:13:30] [ warn] net_tcp_fd_connect: getaddrinfo(host='https://vpc-elasticsearch-cluster-xxxxxxxxxxx.eu-west-1.es.amazonaws.com'): Name or service not known
[2021/01/20 15:13:30] [ warn] net_tcp_fd_connect: getaddrinfo(host='https://vpc-elasticsearch-cluster-xxxxxxxxxxx.eu-west-1.es.amazonaws.com'): Name or service not known
[2021/01/20 15:13:30] [error] [io_tls] flb_io_tls.c:359 NET - Connection was reset by peer
[2021/01/20 15:13:30] [error] [io_tls] flb_io_tls.c:359 NET - Connection was reset by peer
[2021/01/20 15:13:30] [error] [io_tls] flb_io_tls.c:359 NET - Connection was reset by peer
```
Can someone please tell me how I can solve this issue?
BTW, why is creating a logging stack on AWS so hard? The first time I created an EFK stack, I just needed to configure the environment variables and install the Helm chart; it was always one of the easiest parts of setting up a new cluster for me. Here it has taken me multiple days. The whole point of using a managed service is to make things easier, not harder.
https://redd.it/l1b8rz
@r_devops