Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
What are the prospects for finding a job or interlocutor?

Hi Everyone

I have about two years of experience as a Linux engineer at a financial organization, working with all the related tools and technologies (on-premise processing): vSphere, KVM, Debian-based OSes, Ansible, Git and GitLab CI/CD, RDBMS, HA services, and writing and executing many DR plans. I also hold the Security+, ITIL, AWS SAA, and CKA certifications.

BUT

I recently failed an interview because my spoken English is poor.

To improve my situation, I need to either find a better financial offer than my current one

OR improve my spoken English.

So I am appealing to this subreddit: perhaps someone could take me on as an assistant or junior DevOps employee despite my poor English (yes, it is culture).

OR

Would anyone be willing to talk with me remotely by voice on various topics, so that the neural connections for speaking English better form in my brain? And yes, I'm not in the US or Europe.

https://redd.it/11f1ifx
@r_devops
We ran a Game Day to stress test our incident response

And have shared a public write-up here:

https://incident.io/blog/game-day

We had a serious outage in November which prompted us to schedule training for major incidents, especially for new joiners to the team.

This post is the write-up of our recent Game Day – where you get someone to manufacture incidents in a test/staging environment – that we ran to provide that training.

We triggered some example incidents like:

1. Scaling our compute to zero
2. GCP IAM misconfigurations
3. Hypothetical "shit, we've leaked a secret"

Always been a fan of this type of incident training, especially as it's so much more engaging and memorable than talks/reading material. We had great fun and the feedback from responders (you can see some survey results here) was really positive.

If you're wondering how to get more people onto your on-call rota or generally improve your incident response, running this training will cost you about 1 day of planning (for the villain + coordinators) and an afternoon to do the actual response.

I'd encourage people to give it a shot!

https://redd.it/11f105m
@r_devops
What other (DevOps) communities are you active in?

Sometimes I need like-minded people to communicate with.
Where do you get your fix?

https://redd.it/11f2612
@r_devops
What are the top challenges DevOps teams face when it comes to containerization of apps?

Just trying to understand the most common challenges and roadblocks people run into

https://redd.it/11f4s63
@r_devops
Are TF modules just for re-use or splitting things up or both?

I have a large monolithic project with everything in it. I know this smells wrong and I need to make it more manageable as it is taking ages to run through currently. Every time I read around this - the answer is modules. But the examples given are all like a VM module that you can re-use to reduce code.

But most VMs are not similar, so the amount of code I write with a VM module, passing it all the parameters, isn't really different from just defining the VM.

For now I really just want to split the project up into more manageable chunks: perhaps the hub-and-spoke networking in one, the shared services in another, and so on. If these areas are in sub-folders, are they modules too? Just not ones I would re-use or pass parameters into?

https://redd.it/11f50uy
@r_devops
Longhorn/Harvester not seeing HDDs

I am experimenting with Harvester. Harvester/Longhorn only reports the available capacity of the single SSD that it was installed onto. It doesn't seem to see the HDDs.

Do I need to format the HDDs?

They previously were a ZFS pool from the TrueNAS I was experimenting with before I switched this server over to try out Harvester.

https://redd.it/11f6wt9
@r_devops
2nd proxy host in NPM

Hi. In the NPM web page I added my first proxy host and an SSL certificate for domain.com -> mylocal:8081.

I want to add a second proxy host with the same domain name but a different port, domain.com -> mylocal:9443. When I do this, I get an error for using the same domain name. Any ideas, please?

https://redd.it/11f9izv
@r_devops
Is anyone here using akeyless?

Does anyone have any experience with akeyless?

I have been asked to compare it against HashiCorp Vault, and while I understand a good bit of Vault, I have never looked into Akeyless before.

I am going through the documentation but is there anyone around that has used it before and has anything good or bad to say?

https://redd.it/11fam1a
@r_devops
Monthly 'Shameless Self Promotion' thread - 2023/03

Feel free to post your personal projects here. Just keep it to one project per comment thread.

https://redd.it/11fbfqf
@r_devops
Azure DevOps classic release to App Service: "Could not complete the request to remote agent URL" for a specific App Service deployment

Hi, I am trying to deploy an application to my App Service. Normally this works fine, but at the moment I am getting "Could not complete the request to remote agent URL" for a specific App Service deployment:

[command]"C:\Agent\_work\_tasks\AzureRmWebAppDeploymentxxxxx\4.217.2\node_modules\azure-pipelines-tasks-webdeployment-common\MSDeploy3.6\MSDeploy3.6\msdeploy.exe" -verb:sync -source:package='C:\Agent\_work\r1\a\temp_web_package_xxxx.zip' -dest:auto,ComputerName=' -setParam:name='IIS Web Application Name',value='xxxx' -enableRule:AppOffline -retryAttempts:6 -retryInterval:10000 -enableRule:DoNotDeleteRule
Info: Using ID 'xxxxx' for connections to the remote server.
[error]Error: Error: Could not complete the request to remote agent URL 'https://xxxxx.scm.xxxxx.appserviceenvironment.net/msdeploy.axd?site=xxxxx'.

https://redd.it/11fgpke
@r_devops
Ultrawide vs dual monitor

Hi community, I have a question. I want to upgrade my setup and I am thinking about an ultrawide monitor. What do you think about a 49-inch ultrawide monitor for DevOps work?

https://redd.it/11fhza8
@r_devops
Trouble with host-based routing in ECS Fargate

I set up a load balancer with 3 containers behind it. I want to forward requests for api.devops.com to the api container on port 5000, so I added a listener rule. I want to do the same for a form container on port 8282, from form.devops.com. But when I add more than one rule, the application is not accessible; only the actions configured in the listener's default rule are performed.
What am I doing wrong here?
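
For anyone reasoning about the rule order, here is a minimal sketch of how an Application Load Balancer picks a rule: rules are checked in priority order (lowest number first), the first match wins, and the default rule applies only when nothing matches. It is a simplified model, not AWS code. The `Rule` class, the `route` function, and the target labels are hypothetical names made up for illustration, and only host-header conditions are modelled.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Optional


@dataclass(frozen=True)
class Rule:
    priority: int                 # lower number is evaluated first
    host_pattern: Optional[str]   # e.g. "api.devops.com"; None matches any host
    target: str                   # hypothetical target group label


def route(rules: list[Rule], default_target: str, host: str) -> str:
    """Return the target of the first rule, in priority order, whose host pattern matches."""
    hostname = host.lower()  # host matching is treated as case-insensitive here
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.host_pattern is None or fnmatch(hostname, rule.host_pattern.lower()):
            return rule.target
    # No rule matched, so the listener's default action applies.
    return default_target


rules = [
    Rule(priority=10, host_pattern="api.devops.com", target="api:5000"),
    Rule(priority=20, host_pattern="form.devops.com", target="form:8282"),
]

for host in ("api.devops.com", "form.devops.com", "www.devops.com"):
    print(host, "->", route(rules, "default", host))
```

In this model a request only falls through to the default rule when its Host header matches none of the patterns, so it may be worth checking the exact Host value the client sends, and whether a broader rule with a lower priority number shadows the specific ones.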

https://redd.it/11fccs2
@r_devops
Exporting/Saving ElasticSearch Kibana (7.10) logs?

Is there an automated way to stream/export/save Kibana logs to, for example, S3? We have to manually delete old indices every 3 months otherwise it crashes our search, and I'd like to lose nothing.
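
One commonly used route is Elasticsearch's snapshot feature with an S3 repository, followed by deleting the indices that were snapshotted. The sketch below covers only the selection step: which indices are old enough to archive. It assumes index names end in a `-YYYY.MM.dd` date suffix, which the post does not state, and `indices_older_than` is a hypothetical helper, not part of any Elasticsearch client.

```python
import re
from datetime import date, datetime, timedelta

# Assumed naming convention: "<prefix>-YYYY.MM.dd", e.g. "logs-2023.01.15".
_DATE_SUFFIX = re.compile(r"-(\d{4}\.\d{2}\.\d{2})$")


def indices_older_than(names: list[str], days: int, today: date) -> list[str]:
    """Return the index names whose date suffix is more than `days` before `today`."""
    cutoff = today - timedelta(days=days)
    expired = []
    for name in names:
        match = _DATE_SUFFIX.search(name)
        if match is None:
            continue  # not a dated index (e.g. ".kibana_1"); leave it alone
        try:
            index_day = datetime.strptime(match.group(1), "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix has the right shape but is not a real date
        if index_day < cutoff:
            expired.append(name)
    return sorted(expired)


names = ["logs-2022.11.01", "logs-2023.02.20", ".kibana_1", "logs-2022.12.15"]
print(indices_older_than(names, days=90, today=date(2023, 3, 1)))
```

The returned list could then be fed to whatever snapshots and deletes the indices, so that nothing is removed before it has been archived.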

https://redd.it/11f8an9
@r_devops
How do you deal with developers asking for production DB access?

In most of my positions I have been asked by developers to have access to production databases (read-only).

As that could have unwanted consequences like DoS, data leaks, etc I normally have to build some custom tools to give them some sort of access to either anonymized data or to run some EXPLAIN on the data.

What is your experience and what tools have you used?

Thanks!

https://redd.it/11fmo4l
@r_devops
From EKS to ECS + Mongodb

I am currently using EKS, where I deployed my backend app and also MongoDB, which stores its data on an EBS volume. My question is: if I move to ECS, how should I deploy MongoDB, and how do I mount the EBS volume?

Thanks

https://redd.it/11f24xn
@r_devops
Elastic stack, ELK: Logs drop issue

We have an on-prem ELK stack, and its primary use is collecting pod logs. The K8s cluster runs 100+ microservices.

Filebeat > 1 Logstash > ES cluster

Issue: logs are dropped inconsistently (hard to validate)


1. Can we have one input source and multiple pipelines? We recently made some changes to the pipeline, and after that we started seeing this issue. I am thinking of making a new pipeline for that requirement.
2. How can I find the root cause? What would your approach be?


Below is a sample pipeline that has around 15 to 20 output conditions

INPUT
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => ["\[AUDIT\] %{GREEDYDATA:message}"] }
    overwrite => ["message"]
    add_tag => ["audit"]
  }
  grok {
    match => { "message" => ["\[ERROR\]"] }
    add_tag => ["error"]
  }
}

OUTPUT

# The field reference in the "else if" below is assumed; it was garbled in the original post.
output {
  if "audit" in [tags] {
    elasticsearch {
      hosts => ["comma-separated ES nodes"]
      index => "audit-%{+YYYY.MM.dd}"
    }
  }
  else if [kubernetes][container][name] == "containername" {
    elasticsearch {
      hosts => ["comma-separated ES nodes"]
      index => "containername-%{+YYYY.MM.dd}"
    }
  }
.
.
.

https://redd.it/11f80y8
@r_devops
Trunk based development deployment strategies

Trunk-based development is the standard for branching today.

It is confusing to me, and I would like to learn how you perform deployments to different environments when all code is merged to the trunk.

https://redd.it/11f76zz
@r_devops
Gitlab CI Service Not Reachable

Can anyone help me with my CI troubles? I have a project that sets up a Flask webserver via Docker. In CI, I'm trying to use that image as a service to run integration tests on it, such as checking how it handles bad or messy inputs. However, when I use the Docker image to start a service with an alias, pytest can't access it from the CI job.

The Dockerfile for the test image is [here](https://gitlab.com/kitchen-server/kitchen-server/-/blob/ci-cd/test/assets/Dockerfile#L24-26). The highlighted lines (everything below `ENTRYPOINT`) are the only differences between the base and test image.

Example job that's failing [here](https://gitlab.com/kitchen-server/kitchen-server/-/jobs/3848991470). (The Dockerfile exposes port 8080 and that's what the underlying program binds to, which is why I'm using that port in CI - is that incorrect?)

ERROR test/integration_tests.py - requests.exceptions.ConnectionError: HTTPConnectionPool(host='test-service', port=8080): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f74768f9f50>: Failed to establish a new connection: [Errno -2] Name or service not known'))

The CI file is [here](https://gitlab.com/kitchen-server/kitchen-server/-/blob/ci-cd/ci/branch.gitlab-ci.yml) (ignore that the build job is commented out; I just didn't want to burn CI minutes running it while troubleshooting my testing stage). I define a test image tag, a service alias, and a service URL based on the alias on L24-26. Then in the failing job, I define a service using the test image tag as the name, and the alias from the alias variable. Then the script runs a couple of pip installs before calling pytest on a specific file.

variables:
  LATEST_TAG: $CI_REGISTRY_IMAGE:latest
  IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG
  TEST_TAG: $IMAGE_TAG-test
  TEST_SERVICE_ALIAS: test-service
  TEST_URL: https://$TEST_SERVICE_ALIAS:8080
[...]
integration-tests:
  stage: test
  image: python:3.11.2-slim
  services:
    - name: $TEST_TAG
      alias: $TEST_SERVICE_ALIAS
  script:
    - pip install -r pie_chart/requirements.txt
    - pip install -r test/assets/requirements.txt
    - echo "$TEST_SERVICE_ALIAS"
    - echo "$TEST_URL"
    - pytest test/integration_tests.py

The error comes from [this part](https://gitlab.com/kitchen-server/kitchen-server/-/blob/ci-cd/test/integration_tests.py#L29-31) of the program, specifically the `requests.get()` call. It seems to read and print to stdout the URL correctly as I defined it in the CI file, as shown in the CI job linked above.

test_url: str = os.environ.get('TEST_URL')
print(f"\n == Test URL: {test_url} ==\n")
response: requests.Response = requests.get(test_url)

I've tried

* A ton of different URLs
* All combinations of ["" | "http://" | "https://"] + ["test-service" | "localhost" | "127.0.0.1"] + ["" | ":80" | ":8080"]
* Defining the service in the job
* Defining the service locally
* `FF_NETWORK_PER_BUILD: "true"`
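
The error text (`[Errno -2] Name or service not known`) is a name-resolution failure, which happens before any port or scheme comes into play. A small stdlib-only check like the sketch below can separate "the alias does not resolve" from "the alias resolves but nothing is listening". `diagnose` is a hypothetical helper, not part of the project or of GitLab.

```python
import socket
from urllib.parse import urlsplit


def diagnose(url: str, timeout: float = 3.0) -> str:
    """Classify a URL as invalid, unresolvable, unconnectable, or reachable over TCP."""
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        return "invalid-url"
    # Fall back to the scheme's conventional port when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        addresses = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failure"  # same class of error as "Name or service not known"
    family, socktype, proto, _, sockaddr = addresses[0]
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
    except OSError:
        return "connect-failure"  # the name resolved, but the TCP connection failed
    return "reachable"
```

Calling `print(diagnose(os.environ["TEST_URL"]))` at the top of the test file (with `import os`) would show which stage fails. A `dns-failure` result would suggest looking at whether the service container started and got its alias, rather than at the port or the `https://` prefix.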

https://redd.it/11f3yqa
@r_devops
HTTPS listener rule weird behavior

Hello, everyone. I have a problem with a listener rule. As seen in the picture, rule 1 (https://okynepal.com/login) is overridden by the default (last) rule. When I enter https://okynepal.com/login, the request should be forwarded to cms-tg, but it is forwarded to adminer-8080-tg instead. When I remove the default (last) rule, it works properly. What may be the problem?

https://redd.it/11fu154
@r_devops
What do you suggest as a distro to learn DevOps in VirtualBox?

Hi DevOps enthusiasts.
I'm a beginner. I have 1 year of experience in backend development, and I want to self-learn DevOps tools and technologies and get a job. I have previously used Ubuntu for around 5-6 years as my personal OS.

What are your suggestions for a somewhat lightweight, preferably somewhat graphical OS for me to install on VirtualBox? My learning path also includes LPIC-1, LPIC-2 and networking; the rest are mainly DevOps tools.

https://redd.it/11f8ycy
@r_devops
Can someone recommend resources for learning VMware Tanzu?

I usually look up docs, courses, KodeKloud (website), or any of the other popular sources (e.g. popular YouTube channels) to get me started.
I also tried searching on LinkedIn Learning and O'Reilly.
No luck though. Maybe I'm going about it the wrong way (I'm searching for Tanzu courses; maybe I should search for another topic that uses Tanzu),
or maybe Tanzu is less popular than I thought (or more enterprise-y).

I'm hoping for a course, but I'm okay with docs or even a book (hopefully not, but beggars can't be choosers).

https://redd.it/11f0pne
@r_devops