Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
How to take docker-compose to production?

I have a node.js and postgresql app in docker-compose that I want to take to production and expose to the Internet. What options do I have?
I would prefer to keep it simple and not have to configure or learn different services of cloud providers. IMO it should be transparent. But maybe you guys can tell me better.

Do I need to go for K8s, a service mesh, or glue multiple cloud services together?
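One common low-ceremony option is to keep docker-compose on a single VM and put a TLS-terminating reverse proxy in front of the app. A minimal sketch using Caddy; the service names, ports, domain, and credentials are placeholders, not from the original post:

```yaml
# compose.prod.yaml - hypothetical production overlay
services:
  app:
    build: .
    restart: unless-stopped
    environment:
      DATABASE_URL: postgresql://app:app@db:5432/app
    depends_on:
      - db

  db:
    image: postgres:13
    restart: unless-stopped
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
    volumes:
      - pgdata:/var/lib/postgresql/data

  caddy:
    image: caddy:2
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    # Caddy obtains and renews Let's Encrypt certificates automatically
    command: caddy reverse-proxy --from example.com --to app:3000

volumes:
  pgdata:
```

Only the proxy publishes ports, so the app and database stay off the public Internet; no cloud-specific services are involved.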

https://redd.it/p3k2q2
@r_devops
Using secrets in kube prom stack helm chart

Hey guys. Coincidentally, I was trying to configure Alertmanager using the kube-prometheus-stack Helm chart and saw another post along similar lines.

Does anyone have ideas on how I could reference a secret in the values.yaml? I have a K8s secret created which contains the Slack webhook URL:

...
config:
  global:
    resolve_timeout: 5m
  route:
    ...
  receivers:
    - name: 'slack-test'
      slack_configs:
        - api_url: <<slackApiUrl>>

It works if I have the url pasted directly, but would be good to retrieve the value from the K8s secret that's deployed. Any pointers appreciated!
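Helm can't read a K8s secret at render time, but Alertmanager can read the webhook from a file at runtime. A sketch of one approach, assuming kube-prometheus-stack's `alertmanager.alertmanagerSpec.secrets` (which mounts each listed secret under `/etc/alertmanager/secrets/<secret-name>/`) and the `api_url_file` option of a reasonably recent Alertmanager; the secret name and key here are placeholders:

```yaml
alertmanager:
  alertmanagerSpec:
    # mounts the 'slack-webhook' secret at /etc/alertmanager/secrets/slack-webhook/
    secrets:
      - slack-webhook
  config:
    global:
      resolve_timeout: 5m
    receivers:
      - name: 'slack-test'
        slack_configs:
          - api_url_file: /etc/alertmanager/secrets/slack-webhook/url
```

This keeps the webhook out of values.yaml entirely; rotating it is then just updating the secret.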

https://redd.it/p2wpmw
@r_devops
Update on CircleCI Config

Yesterday I submitted this post asking for advice on my CI config: https://www.reddit.com/r/devops/comments/p2y21u/advice_on_circleci_config/

I am pretty happy with it now, but would like to hear if you have any other suggestions. Here are the changes I made based on suggestions from that first post:

I created a `compose.test.yaml` file with:

services:

  flask:
    build:
      context: ./flask
      dockerfile: Dockerfile
    image: myapp/flask
    volumes:
      - ./test_results:/test_results
    environment:
      FLASK_APP: "manage.py"
      FLASK_ENV: "test"
      FLASK_CONFIG: "test"
      TEST_DATABASE_URL: "postgresql://runner:runner@db:5432/circle_test"
    command: pytest "app/tests" --cov="app" -p no:warnings --junitxml=/test_results/junit.xml
    depends_on:
      - db

  db:
    image: circleci/postgres:13-postgis
    environment:
      - POSTGRES_USER=runner
      - POSTGRES_PASSWORD=runner

New `config.yml`:

build:
  machine:
    image: ubuntu-2004:202107-02
  steps:
    - checkout

    - run:
        name: Create Results Directory
        command: mkdir test_results && chmod 777 test_results

    - run:
        name: Building Containers
        command: make test_build

    - run:
        name: Running Tests
        command: make test

    - store_test_results:
        path: test_results

    - store_artifacts:
        path: test_results

I created a `Makefile` with:

test_build:
	@echo "Running Test - Build"
	docker-compose -p mytest -f compose.test.yaml build

test:
	@echo "Running pytest"
	docker-compose -p mytest -f compose.test.yaml up --exit-code-from flask

https://redd.it/p3mk1s
@r_devops
DevOps Bulletin - Digest 15 is here 🔥

Hey folks!
This week's digest covers the following topics:

* NSA Kubernetes Hardening guidance
* Building a CDN from scratch in 5 hours
* Docker container security cheat sheet

The full digest is available here: [https://issues.devopsbulletin.com/issues/kubernetes-hardening-guidance-by-nsa.html](https://issues.devopsbulletin.com/issues/kubernetes-hardening-guidance-by-nsa.html)

https://redd.it/p3o7hi
@r_devops
Permission Denied on EFS mounted to SFTP server?

# TL;DR - questions

Customer file uploads are failing - which files fail, and on which days, varies; some come through fine, others fail with a partial upload. This is specific to one customer, so I suspect their custom SFTP software or their network QoS might be causing the issue, but I don't have a good way to prove where the problem is based on the error messages.

Context: no other customers are having this issue - it is unique to a specific vendor, and happens for any of their users and nobody else - so I'm 9000% sure it's on their end.

Logs are posted below, and I've got some questions:


1. Is the "Permission denied" error here just a shitty error message for something else that's happening?

2. What's a good way to pinpoint where/why the "process_write: write failed" error is happening, since it appears to precede their disconnection from the server?

3. What the hell else can I look at to identify why the process_write failure might be happening, or where in the stack the error is coming from?

# More details:

Customer is trying to upload data to our SFTP server. They keep getting partway through a large file and then our SFTP server shows a permission denied log message but still has written the partial upload.

We don't do IP whitelisting on this server - it's an EC2 instance with a public IP, and I've verified the ports are open to all on the security group, so there's nothing between the customer and the server - it's a direct connection.



Jul 29 12:12:24 use-prod-transfer1 sshd[32031]: Accepted password for customer-user from 192.168.29.123 port 9172 ssh2
Jul 29 12:12:24 use-prod-transfer1 systemd-logind[1094]: New session 18865 of user customer-user.
Jul 29 12:12:24 use-prod-transfer1 internal-sftp[32088]: session opened for local user customer-user from [192.168.29.123]
Jul 29 12:12:24 use-prod-transfer1 internal-sftp[32088]: received client version 3
Jul 29 12:12:24 use-prod-transfer1 internal-sftp[32088]: realpath "."
Jul 29 12:12:24 use-prod-transfer1 internal-sftp[32088]: open "/writeable/customer-file" flags WRITE,CREATE,TRUNCATE mode 0666
--
Jul 29 13:00:40 use-prod-transfer1 internal-sftp[32088]: error: process_write: write failed
Jul 29 13:00:40 use-prod-transfer1 internal-sftp[32088]: sent status Permission denied
Jul 29 13:00:40 use-prod-transfer1 internal-sftp[32088]: close "/writeable/customer-file" bytes read 0 written 1305450000
Jul 29 13:00:40 use-prod-transfer1 internal-sftp[32088]: session closed for local user customer-user from [192.168.29.123]
Jul 29 13:00:40 use-prod-transfer1 systemd-logind[1094]: Removed session 18865.

I verified that their user owns the directory that we're having them drop into via chroot:


root@ip-10-1-2-3:/sftp-home/customer-user# ls -lah
total 56K
drwxr-xr-x 5 root etl 6.0K May 1 2020 .
drwxr-xr-x 874 root root 38K Aug 11 13:17 ..
drw-r----- 18 root root 6.0K Aug 3 13:00 archive
drwxr-sr-x 2 root etl 6.0K Jun 24 10:23 dev
-rw-r--r-- 1 root etl 767 Feb 4 2020 README.txt
drwxrwxr-x 3 customer-user sftp-only 6.0K Aug 3 14:00 writeable
root@ip-10-1-2-3:/sftp-home/customer-user#

And our chroot config in sshd:


Match LocalPort 22
ForceCommand internal-sftp -l VERBOSE -f LOCAL6
AllowGroups sftp-only
PasswordAuthentication yes
RSAAuthentication no
X11Forwarding no
AllowTcpForwarding no
ChrootDirectory /sftp-home/%u


The directory /sftp-home/%u is a folder on an AWS EFS filesystem with bursting allowed and no restrictions (which we probably should have, but at least that's not part of this problem).
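Given the bursting throughput mode, one thing worth ruling out is EFS burst-credit exhaustion or hitting the filesystem's I/O limit around the failure window. A command sketch using the AWS CLI; the filesystem ID and time window are placeholders:

```shell
# Minimum burst credit balance around the failure window;
# a balance near zero means EFS was throttling throughput.
aws cloudwatch get-metric-statistics \
  --namespace AWS/EFS \
  --metric-name BurstCreditBalance \
  --dimensions Name=FileSystemId,Value=fs-12345678 \
  --start-time 2021-07-29T11:00:00Z \
  --end-time 2021-07-29T14:00:00Z \
  --period 300 \
  --statistics Minimum
```

The `PercentIOLimit` metric in the same namespace is worth checking the same way; sustained values near 100 point at the filesystem rather than the client.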


TCP dumps from our end yielded no insight - it was literally just packets being sent, and then no more packets at the time of the error message and client disconnect, with nothing but successful transfers for almost an hour of capture. No retransmits, so the network connection is clean - no failed keepalives - everything looks like a perfectly happy connection until it goes bye-bye.

As far as I can tell, there is nothing we have configured that's unique to this user - and I'm at a loss for how to prove it's not us when the log messages give no indication of what is actually failing.

Help?

https://redd.it/p3pk29
@r_devops
Two Bitbucket cloud migration questions

I already asked about this in the Bitbucket community forums, but on the chance that somebody here has some concrete information, I'll ask.

1. Currently, the Bitbucket migration path from local server to cloud allows for migration of repository data, but not repository metadata. That means users can migrate code, tags, and branches, but not pull requests and comments. According to the Bitbucket webpages, that may happen, if enough users express interest -- but the webpages said that in November 2020, and no change since then. Does anybody have any updated information about that? The PRs and comments are important to us.
2. The Bitbucket webpages have also been promising a Bitbucket Cloud Migration Assistant since November 2020 as well. If you sign up for the early access program, they promise to send you updates on when the Migration Assistant will be available. So far, the only availability date I can find is Real Soon Now™. Does anybody have newer news about the Migration Assistant?

https://redd.it/p3pbf8
@r_devops
Intelligent synchronization between servers for debian

I am looking for a program for Debian that would track the use of files in a selected location on server A and, based on that, select the most frequently used data to be synchronized to server B - something like intelligent synchronization. Do you know of such a program?

https://redd.it/p3rh5m
@r_devops
Can you list the CI/CD tools to learn if you want to be a DevOps engineer?

Where should I start?

https://redd.it/p2uyv4
@r_devops
Is your devops just ops automation?

Been in software for a long time. I remember when DevOps came out... we talked a lot about it being a culture, not a team.

Seems like in its current form, DevOps is a team?

Has DevOps really just become an ops automation team?

Do your teams of "devs" not know how the prod systems work? Do they just bang out code with no notion of how it does what it does once they commit or open a PR?

https://redd.it/p3ulan
@r_devops
Should you use AWS Route 53 for both your domains and subdomains or use it only for one of them?

We have a domain on GoDaddy and plan to route traffic to it through Route 53, and later we'll create subdomains using Route 53 too. What are the pros, cons, and security concerns of each scenario: using GoDaddy for only the domain and Route 53 for subdomains, and vice versa?

https://redd.it/p3ud4n
@r_devops
Monitor GitHub Pull Requests with Prometheus

I developed a new exporter so that we can get more insight into Hacktoberfest contributions within my company this year. We're also thinking of giving prizes to the top three contributors.

I hope others will do the same and find this useful!

https://dev.to/circa10a/monitoring-github-pull-requests-with-prometheus-57p2

https://redd.it/p217ut
@r_devops
Is DevOps a service or a process?

I am very confused right now. I thought DevOps was the process of standing up a server and creating places to put code and data, i.e. a Linux server and a SQL database; getting code to the correct place and making sure everything works the way it should using CI/CD. Soooo can someone explain what Azure DevOps, GitLab, TeamCity, and AWS's DevOps offerings are? I guess I'm mostly confused by the CI/CD part. I don't need a DevOps service to do CI/CD.

https://redd.it/p3ymai
@r_devops
What to do when you feel stuck in an automation?

I'm new to this role; it's my first project, and I'm working with a Maven test for MuleSoft that I can run in the MuleSoft IDE (Anypoint Studio), but I'm stuck automating it. I asked my coworkers, but they're all busy (also, I'm trying to learn, so I don't call for help very often), and I also tried the developers and the community/help forum. I'm stuck because I cannot run the test without the enterprise runtime, so I got an error describing this; I then used the parameters from the documentation to fix the issue, but Maven returns the same error :(

I tried everything I knew: running the IDE with strace to see how it performs the Maven call and what parameters it uses, reconfiguring and changing the runtime, the machine, and the configurations and versions of Java and Maven. Now I'm out of ideas and very frustrated. What do you do when this happens?

https://redd.it/p3z8ea
@r_devops
OSS Package Update Hygiene

It's very tempting to think that once you pull in a React or an Angular, plus a bunch of less reputable add-ons that can bring tens or hundreds of other dependencies into your project, you can deploy and call it done. But packages update frequently, dependencies change, maintainers may drop support quickly after major releases, and that large dependency tree is likely to catch some vulnerabilities.

So, I'm curious how well you, or the developers you support, do at keeping OSS dependencies up to date in first-party applications. I find projects with hundreds of packages that are out of date by months or even years. Outdated on its own may not be the end of the world, but if you get years out of date and a vuln is found, the fix may or may not be backported to your major/minor release. Taking the latest point release that fixes the vuln is easy if you've kept up to date, but daunting when you're several major versions behind.

What have you seen? Do you enforce policies? How do you make sure development teams understand that free OSS isn't free, and that the trade-off for the features is a responsibility to stay reasonably up to date?
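On the enforcement side, one common baseline is automated update PRs. A minimal sketch of a Dependabot config for an npm project (the ecosystem and schedule here are illustrative, not prescriptive):

```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10
```

Paired with something like `npm audit` as a CI gate, this turns "stay up to date" from a policy document into a steady stream of small, reviewable PRs.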

https://redd.it/p4222u
@r_devops
Recommandations about an automated workflow architecture

I need to run tests on a product I created for my company, and I need the traffic to come from a lot of different public IPs (not private - that's very important). These tests must be run in the cloud, and their cost is significant.

I created a workflow (to test the product) which specifies that task x can run at the same time as task y, but task z must run after both x and y. Each task (x, y, and z) is independent and can be containerized. I want to be able to run the maximum number of tasks at the same time and prioritize them.

The workflow is a bit complicated, but if it is followed, everything will be OK.
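Before reaching for a cluster, the x/y-in-parallel-then-z ordering itself is simple to express. A minimal Python sketch of just the dependency logic (the task bodies are placeholders, not the real tests):

```python
from concurrent.futures import ThreadPoolExecutor

def task_x():
    return "x done"

def task_y():
    return "y done"

def task_z(x_result, y_result):
    # z only starts once both x and y have finished
    return f"z done after ({x_result}, {y_result})"

def run_workflow():
    with ThreadPoolExecutor(max_workers=2) as pool:
        fx = pool.submit(task_x)  # x and y run concurrently
        fy = pool.submit(task_y)
        # .result() blocks until each task completes
        return task_z(fx.result(), fy.result())

print(run_workflow())  # prints "z done after (x done, y done)"
```

On Kubernetes, the same DAG is what workflow engines such as Argo Workflows let you declare in YAML, so you get this ordering plus queuing and priorities without writing an orchestrator yourself.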

At first I was planning to write my own orchestrator in Python. But considering that the tasks I want to run can be containerized, I think it will be far easier to use a container orchestrator like Swarm or Kubernetes.

My problem is that the tasks (or the whole workflow) need to run for a very long time, or even permanently, so it will cost a lot. As I said before, I need the traffic to come from many different public IPs at the same time.

Considering the minimum hardware requirements of Swarm and Kubernetes (4 vCPUs / 8 GB RAM per node for Swarm and 2 vCPUs / 2 GB RAM per node for K8s), I think the best fit for my project would be Kubernetes and many $8 / 2 vCPU / 2 GB RAM VPSs.

Each K8s node/pod will then run x tasks at the same time, and with the K8s queue I will be able to execute exactly what is written in my workflow in parallel across all the pods/nodes (which are VPSs with 2 vCPUs / 2 GB RAM).

- Can you please give me your opinion on that?

Note: considering that the traffic needs to come from many public IPs at the same time (to simulate client connections), I also had the idea of the following architecture:
- 3 or 4 pods/nodes, each with 8 or 12 vCPUs and 8 or 16 GB RAM
- 1 load balancer
- a /64 public IPv6 subnet (is that possible? what's the price?)

The traffic from the containers running on the nodes would go out to the Internet through the load balancer and be NATed randomly to a specific IPv6 address. I took a look at OPNsense and pfSense virtual instances, but I'm not sure those products can do that...

- Can you also give me your opinion on that?

Thanks a lot and have a good day.

https://redd.it/p441rs
@r_devops
MacOS user and docker networking limits and testing

So how are you folks getting around the limits of macOS networking with Docker when testing your containers? Do you just have an ECS cluster or something that you run your build tests on? Or are you building a local Linux VM and using that as your container host to test before pushing to your repo?

https://redd.it/p47ria
@r_devops
New devlog for my shoot 'em up roguelike game Osore

Hey there, I just uploaded a new video on YouTube about the progress I made over the past two and a half weeks. Mainly it's about the new content I added to the game and new minor features.

Hope you enjoy watching the video, and don't forget to like, subscribe and all that good stuff ;).

https://www.youtube.com/watch?v=DEHSfjGeqSo&t=331s

https://redd.it/p4b69d
@r_devops
Anyone host Kubernetes on DigitalOcean? What are the limitations? Would anyone be interested in Kubespray support for Terraform?

I'm thinking about using DigitalOcean due to its relatively cheaper infrastructure costs, and I'm wondering if anyone here has noticed any limitations. Is there a reason Kubespray doesn't have Terraform contrib added for it yet that anyone can think of offhand? FYI, no affiliation with them; I just don't want to pay $200+ per month for a Kubernetes cluster in AWS/GCP (actually not sure if it's cheaper in GCP, just know it's expensive af in AWS for a personal site).

https://redd.it/p4ffpw
@r_devops