Reddit DevOps
Looking For Advice On Containerizing Complex Application

My company's solution uses multiple systemd services for different modules (API, front end, etc.), plus MongoDB, Kafka/ZooKeeper, etc. It has traditionally been distributed to our customers as an ISO built on a stripped-down CentOS base, for installation on bare-metal servers as an appliance. The ISO boots an in-memory OS that builds the permanent OS: it formats and partitions the disks through bash scripts, installs all of our modules and dependencies, and then chroots into the actual OS.

We're currently going through a transition trying to scale the product, and personally I feel that distributing our product this way isn't ideal from a CI/CD or customer perspective, since we have to build and distribute an ISO and walk customers on older versions through a complicated migration to upgrade. I've been looking into ways to containerize our product and am unsure if it's worth the effort, or even possible.

Some specific challenges:

Our OS base was CentOS in the past, but with its EOL we have switched to Oracle Linux 8.4 for compatibility with existing RPM packages and ease of migration. We strip out a lot of stuff and lock the kernel to prevent customers from messing with and complicating the OS environment, and we wrap most functionality in a limited custom shell.

We use Kafka/ZooKeeper for inter-module communication alongside our REST API, as well as for communicating between nodes (separate installs of our product on different servers) in a cluster.

We run MongoDB on a separate disk/partition from the OS disk. We shard the DB across multiple nodes in the cluster to keep them in sync.

We support managing and mounting different types of backend storage including NFS, SMB/CIFS, LTFS, S3.

Each module that comprises our solution is built with Java/Kotlin and is run as a systemd service. There are about 8-10 different services/modules.

Does something like this sound like it'd be worth trying to containerize or is it too complicated and would defeat the purpose of containers (isolation, security, etc.)?

I know that ideally we would need a container per service/module, plus one for the DB and one for Kafka/ZK. Having to support systemd (at least in the current iteration), as well as mounting the different disk types, requires elevated permissions and lowers the isolation from the host. The networking with other nodes also seems like it will be a nightmare, so I'm not sure it's even worth attempting. Thoughts?
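For what it's worth, the one-container-per-module layout described above is commonly sketched with Docker Compose. Everything here (image names, module names, dependency edges) is hypothetical, purely to illustrate replacing the systemd units with containers:

```yaml
# Hypothetical sketch only: one container per module instead of systemd units.
# Image names and module names are illustrative, not from the original post.
services:
  api:
    image: mycompany/api-module:latest
    depends_on: [mongo, kafka]
  frontend:
    image: mycompany/frontend-module:latest
    depends_on: [api]
  mongo:
    image: mongo:6
    volumes:
      - dbdata:/data/db   # DB on its own volume, mirroring the separate DB disk
  zookeeper:
    image: bitnami/zookeeper:3.8
  kafka:
    image: bitnami/kafka:3.3
    depends_on: [zookeeper]
volumes:
  dbdata:
```

In this shape each Java/Kotlin module gets its own container and the orchestrator (Compose, or Kubernetes at larger scale) replaces systemd as the process supervisor, so systemd itself doesn't need to run inside any container.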

https://redd.it/r91oa2
@r_devops
What silly thing am I missing here

Hey there,

I'm developing my first Azure Devops pipeline and hit a snag. I'm sure I've overlooked something simple here. Appreciate if you can point it out :-)

I'm building a pipeline which deploys a new Forest & child AD.

I can successfully create the DC VM (OS:2019-Datacenter, version:latest) & accommodating network interface / disks.

Google then told me I needed to implement a desired state configuration (DSC) to configure the domain.

So I call it within my ARM template with the following:

"resources": [
  {
    "name": "CreateADForest",
    "type": "extensions",
    "apiVersion": "2019-12-01",
    "location": "[parameters('location')]",
    "dependsOn": [
      "[resourceId('Microsoft.Compute/virtualMachines', parameters('dcCastleVirtualMachineName'))]"
    ],
    "properties": {
      "publisher": "Microsoft.Powershell",
      "type": "DSC",
      "typeHandlerVersion": "2.20",
      "autoUpgradeMinorVersion": true,
      "settings": {
        "ModulesUrl": "[variables('adPDCForestModulesURL')]",
        "ConfigurationFunction": "[variables('adPDCForestConfigurationFunction')]",
        "Properties": {
          "DomainName": "[parameters('domainNameCastle')]",
          "AdminCreds": {
            "UserName": "[parameters('dcCastleAdminUsername')]",
            "Password": "PrivateSettingsRef:AdminPassword"
          },
          "childDomainDNSIP": "10.0.0.1",
          "childDomain": "[parameters('domainNameTower')]"
        }
      },
      "protectedSettings": {
        "Items": {
          "AdminPassword": "[parameters('dcCastleAdminPassword')]"
        }
      }
    }
  }
]

And that successfully downloads/invokes my "Configuration block" CreateADPDCForest.

Then my configuration block looks like this:

Configuration CreateADPDCForest
{
    param ( <snip> )

    Import-DscResource -ModuleName xActiveDirectory, xStorage, xNetworking, PSDesiredStateConfiguration, xPendingReboot
    <snip>
}

But this fails with the following error:

Import-DscResource -ModuleName xActiveDirectory, xStorage, xNetwo ...
Could not find the module 'xActiveDirectory'

And I have no idea why.

Am I meant to install xActiveDirectory first? I tried that (I think) and it still failed. No blogs online seem to need to install it.

My image reference:

"imageReference": {
"publisher": "MicrosoftWindowsServer",
"offer": "WindowsServer",
"sku": "2019-Datacenter",
"version": "latest"
},

And the WMF version is: 5.1.17763.2268

Am I meant to be using 6.x+? If so, how do I "upgrade" my WMF to that version?

Cheers in advance,

https://redd.it/r93eb2
@r_devops
Any way to remove a mistaken commit from CodeCommit?

I made a really stupid mistake and committed to the wrong repo. I noticed a split second later and reverted it, but CodeCommit doesn't allow rebase, so I currently have a commit and its reversion showing on a FE repo, when I intended to update the pipeline repo that builds that FE repo.

Worse still, it was the master branch!
The previous commit on there was 2 years ago (and 1 year ago on the most recent branch), and I think they're planning to move to a new repo, but the repo might still get updated.

A more senior dev and I had been working off our pipeline repo's master branch to update our WIP pipelines (I know, bad practice; we should work off another branch to avoid this very issue). I also use -am so it takes less time to go from change to push, which admittedly increases the chance of this very mistake, though it's the first time I've made it.

Is there any way for me to fix this?
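One common recovery, assuming force pushes aren't blocked on the repo (branch and remote names below are illustrative), is to rewind the branch locally and force-push. A local demo of the history rewrite in a throwaway repo:

```shell
set -e
# Local demo: drop the accidental commit and its revert from a throwaway repo.
# On the real repo you'd run the same reset on master, then force-push.
tmp=$(mktemp -d); cd "$tmp"; git init -q demo; cd demo
git -c user.name=demo -c user.email=d@example.com commit -q --allow-empty -m "old work"
git -c user.name=demo -c user.email=d@example.com commit -q --allow-empty -m "accidental commit"
git -c user.name=demo -c user.email=d@example.com commit -q --allow-empty -m "Revert accidental commit"
git reset --hard HEAD~2   # rewind past the bad commit and its revert
git log --oneline         # only "old work" remains
# Then, on the real repo: git push --force origin master
# (CodeCommit accepts force pushes unless an approval/push rule forbids them.)
```

Since the bad commit was already reverted, history rewriting is cosmetic here; whether it's worth a force push on a shared master is a judgment call for your team.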

https://redd.it/r5ivnl
@r_devops
A tip to avoid having a false sense of security on GitHub

Good: Enable branch protection policies

Better: Configure CodeOwners

Best: Ensure PRs are ACTUALLY reviewed before approved

What do you think about this approach?

https://redd.it/r50tvl
@r_devops
Looking for advice on DevOps boot camp selection and is it even worth it?

I am a Marine Corps Veteran with 12+ years of project management experience. Over the past couple of years I have been looking to move into tech and away from my current field mainly due to lack of upward mobility. As a veteran there is a specific program called Vet Tec that covers the cost of certain accredited boot camps around the country.
Most of my technical experience revolves around basic IT related things (I’m the neighbor/friend/family member everyone calls to help with their computer problems). As a project manager my brain is wired to find inefficiencies and correct them. I have very basic Python experience and have dabbled with some physical computing via Raspberry Pi’s.

The program I’m looking into offers a DevOps course that gets you the following Certs:

-ISA 1002 CompTIA Security+ | 72 hours
-ISA 1005 Certified Ethical Hacker (CEH) | 72 hours
-DEV 1003 Splunk Core User | 72 hours

I would love some feedback from the community on what you think about this offer. Specifically, is this really worth my time. I’m capped currently at making about $85k a year and have a ridiculous commute that has me going from project to project driving 9hrs a day.

https://redd.it/r561fz
@r_devops
Wanting to start learning about DevOps but stuck between getting Azure Certification or AWS?

Hi everyone, I am a DevOps noob. I have 5 years of experience in IT, and it wasn't until recently that I decided I wanted to be in DevOps. A friend of mine encouraged me to study for and take the Azure Sys Admin exam, so I have been studying for it, but now I'm caught up in whether I should spend my time focusing on the Azure certification or change my focus completely to AWS instead. Should I just push forward and get Azure certified, or change focus to AWS? Please help :(

https://redd.it/r551gn
@r_devops
Kubernetes nodes autoscaling

I'm new to the Kubernetes ecosystem and I'd like to know if an open-source tool exists that will allow me to launch more Kubernetes nodes across clusters/providers.

I understand that Kubernetes comes with horizontal pod autoscaling, and that Rancher manages node connectivity to create a k8s cluster.

What I'd like is a tool where, based on Prometheus metrics (or others), I can launch nodes in a given provider, similar to an AWS autoscaling group but vendor-agnostic, so I could take advantage of a multi-cloud cluster.

Ironically, AWS launched Karpenter (https://karpenter.sh/), which seems to do exactly that, but I'm not sure I understand it correctly. (It supports only AWS atm.)

How do you manage node autoscaling in your k8s setup? How can the nodes register themselves to Rancher?

https://redd.it/r9c36e
@r_devops
ADO, YAML, and Terraform question for VM builds

Currently we are deploying VMs to Azure using YAML pipelines with ARM templates. We pass a couple of variables like VM name/size, RSG, region, etc. through the variables section provided by ADO when we want to build a new VM. Since it's ARM, it doesn't care that we are just rerunning the same pipeline each time with different variables for a new VM.

My question is, how can we do something similar with Terraform? From my understanding, if we were to rerun the same pipeline, Terraform would go "oh hey, I see you've got this new server, but you didn't mention the old one, so I'll make you a new server but at the same time delete your previous one."
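That deletion behavior comes from Terraform diffing the config against its recorded state: anything in state but missing from the config gets destroyed. Two common ways around it are keeping a separate state per VM (e.g. one workspace per server), or keeping every VM in one config via for_each over a map, so adding an entry only adds a resource. A hypothetical HCL sketch (resource and variable names are illustrative, not a drop-in template):

```hcl
variable "vms" {
  type = map(object({ size = string, location = string }))
}

resource "azurerm_windows_virtual_machine" "vm" {
  for_each = var.vms

  name     = each.key
  size     = each.value.size
  location = each.value.location
  # resource group, NIC, image, credentials, etc. omitted for brevity
}
```

With this shape, rerunning the pipeline after adding a new key to var.vms creates only the new VM; existing entries stay untouched because they are still present in the config.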

Any tips or links to articles with some details would be great!

https://redd.it/r9bg8j
@r_devops
As a sysadmin, can I make our VM provisioning process more similar to devops best practices?

Hello everyone. I'm a junior sysadmin who has been trying to learn the devops ways for a short while now.

I would like to discuss with you how we provision our VMs for our users, and get feedback on whether it can be improved. I know everything can be improved, and it's nice that I want to learn, but I'm not sure it's worth it when our current way of doing things has very few flaws.

Each employee in our organization gets a gateway provisioned for them (usually a CentOS m4/m5 EC2 instance).

Our way of provisioning VMs is a web UI that wraps a bunch of Ansible playbooks and bash scripts. When executed, the playbooks create the VM, configure automounts and VNC settings, join it to our domain, etc.
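For a taste of the IaC approach, the EC2 gateway itself could be declared in a tool like Terraform, with the existing Ansible playbooks still handling in-guest configuration. This is purely an illustrative sketch; the AMI ID and tag values are placeholders, not anything from the setup described above:

```hcl
resource "aws_instance" "gateway" {
  ami           = "ami-0123456789abcdef0"  # placeholder: a CentOS AMI id
  instance_type = "m5.large"

  tags = {
    Owner = "employee-name"   # placeholder: who this gateway belongs to
  }
}
```

The appeal over a script-driven web UI is that the desired machines live in version-controlled text, and the tool reconciles reality against it instead of re-running imperative steps.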

I was wondering if I can utilize other tools or best practices to perform the same tasks, maybe even make the process better somehow? My current struggle is usually maintenance, which isn't really a struggle so much as a minor inconvenience when debugging errors in this process.

I'm pretty much clueless when it comes to IaC tools, and even my Ansible isn't that good, but I'm willing to learn, and it would be great to work on a tool that would bring me real-world experience in this role, which might help me become a DevOps engineer one day.

Any suggestions are welcome.

Thanks

https://redd.it/r9dpd7
@r_devops
Use GoToAWS to simplify the AWS CLI tool.

GoToAWS is a tool that simplifies the AWS CLI for several operations.

I'm not sure how well-known it is, so I wanted to show it off. This video is short and digestible, so I hope you all enjoy it.

https://www.youtube.com/watch?v=uLtx1PUUZJQ

Let me know if you have any questions!

Cheers!

https://redd.it/r9ijro
@r_devops
"Error in decrypting data with cmk"

I am trying to granularize my ECS task role's permissions; it was Administrative Access earlier. For this server I gave it every possible access it might need, along with AWSKeyManagementServicePowerUser, but it still throws the above error. When I add Administrative Access back, it works.

Without Administrative Access, I am getting a 405 error on my server.

I couldn't find a higher permission for KMS than the one mentioned above.

Any idea which permission I should give?

Also, I tried searching through CloudTrail, but there are just so many calls (health checks) that it gets really hard to find mine.

https://redd.it/r9pb55
@r_devops
Best OpenSource password manager for enterprises??

So my company has been using some KeePass databases to manage passwords, but now we're growing in clients, projects, and employees, and I'm looking for a proper solution to manage this kind of stuff (passwords, keys, etc.). In other jobs I've used TeamPass, which covered every single need we had very well, but it's been a little slow with releases, so I'm looking for alternatives. I need something self-hosted that can store any kind of credential, share with teams and individuals, and manage permissions. What do you use for this kind of job?

https://redd.it/r9r5u7
@r_devops
Dedicated/Cloud/My Own GPU server with Tesla v100

Hello, I'm new to machine learning and want to start a project. I need to get hold of a server first. I've been searching, and the lowest price I could find was 999 euro per month. That's still a lot, like 8 times more than a normal dedicated server. Does anyone know a cheap cloud or dedicated server provider you have used? What about buying one? Any input?

https://redd.it/r9jkql
@r_devops
Linode vs GCP bucket storage service

I am having a hard time figuring out which storage bucket service is cheaper for less than 50 GB of data.

I was comparing Linode and GCP storage, and it seems like GCP is cheaper, but I honestly am not sure.

I have a server currently hosted on Linode, but I was thinking of slowly migrating it to GCP for future scalability. I'm not sure whether it's better to pick GCP's storage or Linode's storage service.

Can I get your opinion on this? It seems hard to compare them.

https://redd.it/r9wc3p
@r_devops
How Much Do You Really Care About K8s Jobs and CronJobs?

Hey all,

I work for a company in the DevOps tools space building a Kubernetes troubleshooting platform, and we're evaluating whether we can bring more value to our users by offering visibility & monitoring for K8s Jobs and CronJobs.

We've learned that for many organizations that use Jobs extensively, the existing tools don't provide sufficient visibility (i.e. status, latest runs, logs, context when Jobs fail, etc.).

I'm curious to know how critical Jobs/CronJobs are to your business. Suppose a CronJob failed in the middle of the night: would you or anyone else lose sleep over it? Would you like to have more visibility into K8s Jobs? If so, what are you missing most?

So what do you say, folks? Is this feature worth developing? Or in other words — do you really care about Jobs and CronJobs?


Here's a mockup of what this feature might look like.

https://redd.it/ra2az9
@r_devops
6 things to consider when defining your Apache Flink cluster size

One of the questions frequently asked by the Apache Flink community is how to plan and calculate a Flink cluster size (i.e. how to define the number of resources you will need to run a specific Flink job). Defining your cluster size will obviously depend on various factors, such as the use case, the scale of your application, and your specific service-level agreements (SLAs). Additional factors that will have an impact on your Flink cluster size include the type of checkpointing in your application (incremental versus full checkpoints), and whether your Flink job's processing is continuous or bursty.


The following 6 aspects are, among others, some initial elements to consider when defining your Apache Flink cluster size:

1. The number of records and the size per record

2. The number of distinct keys and the state size per key

3. The number of state updates and the access patterns of your state backend

4. The network capacity

5. The disk bandwidth 

6. The number of machines and their available CPU and memory
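As a back-of-envelope illustration of how items 1, 4, and 6 combine (the numbers below are made up for the example, not taken from the linked article):

```shell
# Rough per-machine network load: records/sec x bytes/record, spread over machines.
records_per_sec=1000000   # item 1: number of records per second
bytes_per_record=2048     # item 1: size per record (2 KiB)
machines=5                # item 6: number of machines
total_bps=$((records_per_sec * bytes_per_record))
per_machine_bps=$((total_bps / machines))
echo "$((per_machine_bps / 1024 / 1024)) MiB/s per machine"   # compare against item 4
```

With these made-up numbers this prints 390 MiB/s per machine of ingress alone, which would already saturate a good chunk of a 10 GbE link before shuffles and checkpoint traffic (items 2, 3, and 5) are counted.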


More details and info: https://www.ververica.com/blog/6-things-to-consider-when-defining-your-apache-flink-cluster-size

https://redd.it/ra545l
@r_devops
DevOps Bulletin Newsletter - Issue 28

DevOps Bulletin - Digest #28 is out, the following topics are covered:

* **How to build a centralized logging platform with ELK, Kafka and K8s**
* **75 exercises to improve your Python regex skills**
* **The lazier way to manage everything Docker**
* **How to write effective incident reports**
* **How to integrate your CI/CD pipeline with Kubernetes when using RBAC**


Complete issue: [https://issues.devopsbulletin.com/issues/writing-incident-reports.html](https://issues.devopsbulletin.com/issues/writing-incident-reports.html)

Feedback is welcome :)

https://redd.it/ra5ws8
@r_devops
Setting up a k8s cluster on a single VM

Hello there, I have a task where I need to orchestrate the ELK stack using K8s while having only a single VM. I was told to use Docker to create the k8s cluster, so I tried KIND (Kubernetes in Docker), but it is complicated to understand. So, is there another way to achieve the goal?
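For the record, kind can stand up a multi-node cluster on one VM from a small config file; this is a generic sketch (the node layout is just an example, not specific to ELK):

```yaml
# kind-config.yaml: one control-plane and two workers, all on the same VM
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
```

Created with `kind create cluster --config kind-config.yaml`. Alternatives in the same single-machine space are minikube and k3s/k3d, which some people find simpler than kind.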

https://redd.it/ra5tgm
@r_devops
Retrieve data from Azure App Configuration with PowerShell?

Hi y'all, I was surprised that MSFT hasn't made a PowerShell module to work with App Configuration data, so I made my own: link

It uses az appconfig in the background (so you need the Azure CLI installed), but it adds support for referencing other keys within a value.

## Install module

install-module PSAzureAppConfiguration -Repository PSGallery

## Usage

Log in to your Azure account using a service principal:

$clientId = 'client/app id'
$tenantId = 'tenant id'
$secret = 'client secret'
az login --service-principal --username $clientId --password $secret --tenant $tenantId

Get configuration:

$MyConfig = Get-AppConfigurationKeyValue -Store MyAppConfigStore -Label Production

https://redd.it/ra6mua
@r_devops