Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
need help with debugging yaml - game of pods

When I start the k8s terminal, it starts not in the "master" directory but in "controlplane". That said, the controlplane has the role of master.

steps:

ssh node01

mkdir /drupal-data

exit

cat > drupal-pv.yaml

---
apiVersion: v1
kind: PersistentVolume
metadata:
name: drupal-pv
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 5Gi
hostPath:
path: "/drupal-data"


kubectl apply -f drupal-pv.yaml

this gives an error:


"drupal-pv.yaml": error validating data: ValidationError(PersistentVolume.spec): unknown field "storage" in io.k8s.api.core.v1.PersistentVolumeSpec; if you choose to ignore these errors, turn validation off with --validate=false

i fixed it by adding, under spec:

volumeMode: FileSystem

now the error is:

ValidationError(PersistentVolume): unknown field "capacity" in io.k8s.api.core.v1.PersistentVolume;

in the API [docs](https://docs.openshift.com/container-platform/3.9/rest_api/api/v1.PersistentVolume.html) for v1, i see there is a capacity section and this game is also built on it, so what do I do?

----

trying to apply it by setting --validate=false:

controlplane $ k apply -f drupal-pv.yaml --validate=false
The PersistentVolume "drupal-pv" is invalid:
spec.capacity: Required value
spec.capacity: Unsupported value: core.ResourceList(nil): supported values: "storage"
spec: Required value: must specify a volume type


i'm confused
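
One reading of the errors above: the first says `storage` was found directly under `spec`, the second says `capacity` was found at the top level, and the last says no volume type was found under `spec`. All three point at nesting (YAML indentation) rather than at a missing field. The sketch below assumes Python 3 is available on the controlplane node. It builds the same manifest as nested data so the structure is explicit, then prints it as JSON, which `kubectl apply -f` also accepts. The function name `build_drupal_pv` is hypothetical. Note that the accepted `volumeMode` values are `Filesystem` and `Block`, so `FileSystem` with a capital S would be rejected.

```python
import json


def build_drupal_pv() -> dict:
    """Return the PersistentVolume manifest with every field at its proper level."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": "drupal-pv"},
        "spec": {
            # accessModes is a list directly under spec.
            "accessModes": ["ReadWriteOnce"],
            # storage belongs under spec.capacity, not directly under spec.
            "capacity": {"storage": "5Gi"},
            "volumeMode": "Filesystem",
            # hostPath is the volume type; path is nested under it.
            "hostPath": {"path": "/drupal-data"},
        },
    }


if __name__ == "__main__":
    # JSON is also valid input for kubectl apply -f.
    print(json.dumps(build_drupal_pv(), indent=2))
```

Saved under a hypothetical name such as drupal_pv.py, the output could be redirected to drupal-pv.json and applied with `kubectl apply -f drupal-pv.json`. In YAML the same nesting is expressed purely by indentation, which is easy to lose when pasting into `cat >`.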

https://redd.it/o65umj
@r_devops
Which course was most helpful on your way to DevOps?

Hello, everyone, I hope you are well.

Recently I transitioned to DevOps from a traditional sysadmin background.

My line manager recently told me that there would be some budget available for training/certifications.

Can any of you recommend a good course that is fairly up to date and covers AWS Solutions Architect Associate? I'm only interested in courses I can pay for once (no monthly/yearly subscription stuff). I'm also interested in k8s, so feel free to recommend something for that as well.

It might take some time before I actually find out whether they will cover the whole fee or just co-sponsor me, but in any case the first step would be to pick a course and let them know :)


Thanks for your time and suggestions!

https://redd.it/o70eew
@r_devops
Billing shouldn't be a nightmare for your customers

Hey r/devops! How are you?

We wrote a blog post about how to protect customers from excessive billing using billing limits that I think you will find interesting.

https://mediamachine.io/blog/protect-your-customers-with-billing-limits/

How are you implementing this? Do you think our solution is a good one? Or is there something we can do to improve it?

https://redd.it/o6pu5i
@r_devops
How to stream GUI from k8s to a website?

Hello all,
So I have this idea: a user on a website spawns a computationally intensive piece of software, and on the backend a pod with good resources is created for that user.
The container runs Ubuntu. Now, I want to open only the GUI of that software and stream it to the website interactively, so that the user can control it, something like VNC but in the browser (the only thing shown should be the software's GUI, not the whole Ubuntu desktop).


Also, can you guide me on how to save the state of the pod for that user, in case they want to close the session and continue later?

How do I establish a one-to-one connection with that specific user?

https://redd.it/o722zx
@r_devops
Sysadmin needing advice

I don't really know where to begin here, but I need help.

I have been a professional Linux admin for the past 15 years. I come from a background of managing large fleets of various *nixes, and in my latest gig I am the principal architect for the on-prem Linux environment of the company that I work for. We deliver services to in-house customers in the form of applications that are mostly COTS, but we also do a bit of development internally. I have adopted Ansible, Git, and Docker (mainly Compose stacks) for most of those services, though some continue to run natively on VMs. I have implemented monitoring, alerting, logging, near zero-touch provisioning, etc. So I know my way around Linux, and I know my way around development fundamentals.

But I have not had the chance to play with K8s yet, though I have read up on the primitives (Pods, ReplicaSets, PVs and PVCs, Ingress, etc.) and understand containerization in general. I also have no formal cloud experience. I understand the abstraction layer that it provides for datacentre infrastructure, and I understand that most of the services that I have to babysit today are managed for me out of the box. I also explored Terraform a little bit, and understand the idea behind immutable infrastructure. I tried to implement some of the same principles on-prem, but datacentre automation is still years behind what you get in the cloud, despite the efforts of HP, Dell, Red Hat, VMware, and the like (we need to manage several bare metal hosts as well as VMs).

I feel like I am missing a big part of the "modern IT" experience. Things like CI/CD pipelines, serverless computing, cloud-native applications. I can grasp the concepts behind them, and given time I could certainly pick any of it up. But there is no hope of introducing any of that where I work, and I mostly learn when I have to use something in real life (I can follow tutorials but most are very artificial, and having a toddler at home makes it hard to dedicate time to train myself outside of work). There is also a lot of risk in jumping ship and likely taking a pay cut to start off as a junior DevOps Eng (I know, not a job title, but that's what's advertised) or SRE somewhere else. And the market for DevOps is so global that you could get anyone to work remotely for less than what would have to be the minimum salary I can accept without having to sell our house and cancel our mortgage (not something my family are in a position to do).

I am not even sure if [Dev]Ops is something I want either, nor do I think it has a lot of life left. I enjoy programming but I don't really want to retrain as a professional programmer. And I do believe declarative infrastructure is a bit limiting, so I wouldn't be surprised if something like Pulumi eventually takes over. Yes, there will still be plenty of jobs in Ops, and plenty of jobs for Linux SysAdmins. But they might be seen the same as COBOL developers are seen today: a relic of the past, still well paid because there are so few of them left and legacy never dies.

In a way, I feel trapped in a silo of legacy technology with no way out. I could easily wing it for the next several years without ever having to do anything but the minimum to keep up, and my company would still be happy. But I wouldn't. None of my colleagues seems to be bothered by the fact that they will cease to be relevant in the next 10-15 years. I have no other friends in this space. I literally have no one to talk to who understands how things are moving and where they're going.

Please help. What can I do?

https://redd.it/o5jbhp
@r_devops
Differences between Azure Site Recovery (ASR) & Zerto Virtual Replication (ZVR)

Cloud consumers often ask, “What are the differences between Azure Site Recovery (ASR) and Zerto Virtual Replication (ZVR)?” The main reason they ask is that they are familiar with one of the technologies but not the other, and on the surface the two sound similar.

Here’s our step-by-step guide to why ASR and ZVR are very different from each other and meet different levels of performance and usability.

Azure Site Recovery (ASR) and Zerto Virtual Replication (ZVR) are both great products that can help any IT organization to improve the success rate of their disaster recovery plan. The purpose of this document is not to be a kill sheet for one or another product, but rather to highlight the pros and cons of each product in relation to VMware virtual machine replication in the public cloud of Microsoft Azure.

Both products are very capable of fulfilling the above-mentioned task at a high level, but as this article will show, each product has advantages over the other depending on the workload that needs to be protected.

For example, while it takes a little more time to initially configure Zerto Virtual Replication, it includes more time-saving workflows, not only for disaster recovery but also for migration and testing. Failback is another area where Zerto shines above ASR, as the required networking is already configured and a single check mark can be used to enable reverse replication.

One of ASR’s major advantages is that it can protect physical workloads. Zerto has no support for physical workloads and is limited in terms of supported source workloads: today only VMware vSphere and Microsoft Hyper-V are supported as sources. In theory, if the operating system is supported in Azure, any physical or virtual machine can be replicated with ASR. This is because ASR covers both physical workloads and VMware-based workloads.

The installation requirements for each product are unique, and ASR is a clear winner in terms of ease and speed of installation for customers who only need to protect some VMs on a single Hyper-V host. But as the number of protected workloads increases, the necessary ASR infrastructure can become overwhelming and the ease with which ZVR scales becomes much more evident.

On a monthly subscription basis, ASR and ZVR are very similar in terms of cost. ASR’s list price is $25/month per protected VM and ZVR is approximately $21/month per protected VM. Both solutions also require you to pay for the storage that your replicated workload consumes, but Zerto has a few additional costs that ASR does not. One such cost is the compute instance running the Zerto Cloud Appliance, and the other is the cost of connecting through a VPN or ExpressRoute. So remember that ZVR may be richer in features, but it may also cost a little more than ASR.

Azure Site Recovery

Four major components are needed on-site to replicate to the Azure public cloud with Azure Site Recovery. These components are downloaded from the Azure Vault configuration wizard and installed on a Windows machine in your on-site datacenter.

A Config Server is the first server you install. A Config Server is a central administration server that communicates with the Azure Vault and performs on-site jobs. Two other components, a Process Server and a Master Target Server, will also be installed on your main Config Server by default.

The Process Server component is what processes and transports the protected data to the Azure Vault. Process servers have CPU, memory, and disk requirements that depend on the size of the workload to be protected. If a single Process Server cannot scale up far enough to protect your workload, it is possible to take a scale-out approach and deploy additional process servers.

Master Target Servers are required only during failback operations. Note that a Master Target Server can only fail back servers of the same operating system type, so a Windows Master Target Server cannot fail back Linux servers; a Linux Master Target Server also needs to be installed for Linux servers.

A Mobility Services Agent is the last component of the ASR topology. This agent is installed on each virtual machine that is protected and is responsible for sending changed data to the Process Server so it can be shipped to the Azure Vault. In summary, ASR installs multiple on-premises components that differ depending on what workloads you are planning to protect. The amount of planning and the number of components may be acceptable for small to medium-sized deployments. For larger deployments, a customer may have to invest heavily in planning, as more than one Process Server and more than one Master Target Server will be required. This can make deployment more complex and means you have to maintain multiple machines.

Zerto Virtual Replication

Zerto has two main components, Zerto Virtual Manager (ZVM) and Zerto Virtual Replication Appliances (VRA). Each physical host in your environment receives a VRA if it contains protected VMs or receives data for protected VMs. This usually means that all hosts in a cluster will receive VRAs.

For multi-site deployments, each site receives a ZVM, and each physical hypervisor host in a cluster with protected data receives one VRA.

Zerto leverages a combined ZVM/VRA architecture called a ZCA, or Zerto Cloud Appliance, for public cloud sites such as Azure and AWS. For Azure, the ZCA can be deployed from the Azure Marketplace, or you can build one from scratch with a Windows VM and the Zerto for Azure installation package from MyZerto. Since the ZCA uses an embedded VRA, the only way to scale replication capacity in Azure is to scale out: once a ZCA has reached its maximum (about 100 MB/s throughput), an additional ZCA has to be deployed.

In summary, the architecture of Zerto contains the same number of components regardless of how many on-site VMs you protect. Planning is also straightforward because it scales along with your environment, which means that if you add an ESXi host or a Hyper-V host, you also add a VRA.

Read More : https://www.taliun.com/differences-between-azure-site-recovery-azr-zerto-virtual-replication-zvr

https://redd.it/o5jby6
@r_devops
Preparation failed: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock



I am trying to build a Docker image from a project in GitLab. The GitLab instance is self-hosted. This is the error I get:

Running with gitlab-runner 13.12.0 (7a6612da)
  on test -KnwQXuT
Preparing the "docker" executor
ERROR: Failed to remove network for build
ERROR: Preparation failed: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get https://%2Fvar%2Frun%2Fdocker.sock/v1.25/info: dial unix /var/run/docker.sock: connect: permission denied (docker.go:858:0s)

I added a runner, registered it, and created a gitlab-ci.yml from the Docker template, which I left unmodified. As solutions, I tried chmod 666 /var/run/docker.sock before and after adding the runner, but it did not work. I also added sudo before running and registering the Docker runner, but still no success. I found another option, adding my docker user to the docker group, but all my users are already in the docker group. The Docker runner is run as a Docker image. I do not know what to do... Please help. I have been trying for about 10 days with no success... Thanks in advance.
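
Because the runner itself runs in a container, group membership on the host may not carry over: access to a bind-mounted socket is decided by the numeric uid/gid of the process inside the container. A minimal diagnostic sketch, assuming Python 3 is available wherever the check is run (ideally as the same user, in the same container, that talks to the socket). The function name `socket_access_report` is hypothetical, and this only reports what the process sees; it is not a fix.

```python
import grp
import os
import stat


def socket_access_report(path="/var/run/docker.sock"):
    """Report ownership, mode, and effective access for a socket path."""
    info = os.stat(path)
    try:
        group_name = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        # The gid has no name in this container's (or host's) group database.
        group_name = str(info.st_gid)
    return {
        "path": path,
        "mode": stat.filemode(info.st_mode),
        "group": group_name,
        "gid": info.st_gid,
        # Supplementary groups of the current process, as numeric gids.
        "process_groups": sorted(os.getgroups()),
        # True only if the current process may both read and write the path.
        "can_read_write": os.access(path, os.R_OK | os.W_OK),
    }
```

Calling `socket_access_report()` and printing the result shows whether the socket's gid appears in `process_groups`. If `can_read_write` is true inside the runner container but the job still fails, the problem probably lies elsewhere (for example, the socket path not being mounted where the executor expects it).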


https://stackoverflow.com/questions/68081163/preparation-failed-got-permission-denied-while-trying-to-connect-to-the-docker

https://redd.it/o5jlqu
@r_devops
Developer Research Project - Participate in a Virtual Interview

HubSpot's marketing team would like to interview professional web developers to gather information about their skillset, career path, and any educational resources they use. Participants will be compensated for their time.

What does participation look like?

If you are selected to participate, one of our researchers will schedule an hour long call. During this time you will be asked a number of questions about your background and experience as a developer. Your name and answers will only be used internally within HubSpot.

Who are we looking for?

- Full-time professional web developers

- Front-end, back-end, or full-stack experience

- Developers who work for a single employer, an agency, or are self-employed

- Must be comfortable answering questions in depth via a call

How to Participate:

Complete this form and we will reach out if you are a good fit for this project.

Thank you for your time!

https://redd.it/o5luhb
@r_devops
Need advice on building a container provisioning system (think HackTheBox)

Hi there,

Gonna keep the deets of this to a minimum for confidentiality but also just for the sake of reading time. I am working on building an infrastructure/platform where a user can provision "labs" (basically one "play" machine and one testing machine, or a few play machines within a lab, etc.). Now I'm mainly looking for high-level advice on how I can set such a thing up. I was researching container services, and also looking at ways to facilitate this. I know that automation is going to be a big thing, perhaps Ansible / Terraform / Kubernetes. I am well versed in Python, however I've not worked on something of this magnitude. The provisioning methods can probably be done as functions which can then be used by other teams working on the front-end website.

Maybe one of you devs could give some sort of idea on how you would go about doing something like this? Any advice/tips/insight would be deeply appreciated.

https://redd.it/o78sgy
@r_devops
Hi all, I am new to DevOps. I am currently working as a DevOps intern. I feel like creating yml-ci files and understanding things is a little difficult. Is Auto DevOps really helpful? Kindly give me some suggestions on where to start and the best way to learn it.

Rookiee..!!!

https://redd.it/o5i61g
@r_devops
I NEED HELP WITH GRAFANA DASHBOARD PROVISIONING!

>EDIT - THIS PROBLEM WAS SOLVED - TL:DR SOLUTION -
>
>Make sure your target names in Prometheus match with Variables in Grafana Dashboard JSON
>
>Unlike the popular $DS_AEROSPIKE_PROMETHEUS missing or not found error, such as these **tickets**. I did not encounter any of these after following the above step.
>
>The version of Data Source and Grafana mentioned in the JSON files do not matter, it's a common thought that might flash you, please don't waste your time on it like I did.
>
>And finally, I'd love to thank @**AnachronisticAdmin** for helping me out, I really appreciate the patience.


Problem is - Dashboard loads, but data does not,

I am nearing a deadline to set up a Grafana Dashboard that monitors Aerospike Cluster from Prometheus, my dashboards are being taken from here, and versions under use are as follows:

Prometheus - 2.26.0
Grafana - 8.0.3 and
[Dashboards from here](https://github.com/aerospike/aerospike-monitoring/tree/master/config/grafana/dashboards)
Step-by-Step documentation followed from Here

Now, steps followed are:

1. Installed Grafana using the .deb package and dpkg -i
2. Created /var/lib/grafana/dashboards/<dumped-a-few-jsons-here>
3. Provisioned the datasource as Prometheus in /etc/grafana/provisioning/datasources/all.yaml
4. Provisioned the dashboards in /etc/grafana/provisioning/dashboards/all.yaml
5. Set permissions and ownership at the /etc/grafana, /var/lib/grafana, and /var/log/grafana level
6. Started Grafana server and set the password
7. I can view the datasource I created, and my dashboards are loaded as well

CHECKS:

1. Data Source is Working
2. Prometheus is up and running displaying healthy targets on its UI
3. Individual Panels from the same Data Source are also working.


BUTTTTT

THERE IS NO DATA ON THEM :(

Fixes I have tried are,

The common issue of $DS_AEROSPIKE_PROMETHEUS or $DS_PROMETHEUS not being found: I attempted to fix it by replacing $DS_AEROSPIKE_PROMETHEUS with Prometheus in all the JSON files I have.
I tried to play around with the variables from the UI, but no luck. Please help me out here.
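
Following the TL;DR fix in the edit at the top of this post (names on the Prometheus side have to match the variables in the dashboard JSON), a small sketch for listing what a dashboard file actually references. It assumes the usual Grafana dashboard layout, where template variables live under `templating.list` and exported dashboards carry `${DS_...}` datasource placeholders. The function name `summarize_dashboard` is hypothetical.

```python
import json
import re

# Matches $DS_NAME and ${DS_NAME} datasource placeholders.
DS_PLACEHOLDER = re.compile(r"\$\{?(DS_[A-Z0-9_]+)\}?")


def summarize_dashboard(dashboard: dict) -> dict:
    """List datasource placeholders and template variables used by a dashboard."""
    text = json.dumps(dashboard)
    placeholders = sorted(set(DS_PLACEHOLDER.findall(text)))
    variables = {
        var.get("name"): var.get("query")
        for var in dashboard.get("templating", {}).get("list", [])
    }
    return {"datasource_placeholders": placeholders, "variables": variables}
```

Loading one of the provisioned files with `json.load` and passing the result to `summarize_dashboard` gives a list of names to compare by hand against the job and target names shown on the Prometheus targets page.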

https://redd.it/o5i9vc
@r_devops
What does DevOps do? (Alliteration intended)

Hey! I'm working on a couple of products at my new job. The products are all about website monitoring and status pages.

I'm in marketing (tech isn't really my strong suit) and I've been tasked with understanding what the DevOps role involves and how the products we build can help engineers like you guys. If you could give me your take on the role, what you're responsible for (without divulging any secret details), the kind of tools you use, and how tools like Jira come into the picture, it would definitely give me a leg up.

P.S.: More than happy to talk about the products on request. Right now I'm trying not to make this post too promotional.

P.P.S.: I really hope I don't sound like a terrible spy trying to gather dirt on the competition.

Cheers!

Edit: Removed the links to my products

https://redd.it/o5j4i6
@r_devops
News VSCode extension "Blockman" to Highlight nested code blocks with boxes

Check out my VSCode extension - Blockman, took me 6 months to build. It's free. Please help me promote/share/rate if you like it. You can customize block colors (backgrounds, borders), depth, turn on/off focus, curly/square/round brackets, tags, python indentation and more.....

https://marketplace.visualstudio.com/items?itemName=leodevbro.blockman

Supports: Python, Dart, R, Go, PHP, JavaScript, JSX, TypeScript, TSX, C, C#, C++, Java, HTML, CSS and more...

This post in react.js community:

https://www.reddit.com/r/reactjs/comments/nwjr0b/idea_highlight_nested_code_blocks_with_boxes/

https://redd.it/o7cefu
@r_devops
How to identify a count index in Terraform

I want to deploy multiple resources in Azure with Terraform. I'm using count to specify how many resources I want, but I would also like to attach some extra disks to one VM, and it gives me errors when I try to execute it.

My configuration looks like this



# Create the virtual machines
resource "azurerm_virtual_machine" "vm" {
  count = 2
  name = element(var.vm_names_grupo1, count.index)
  location = var.location
  resource_group_name = azurerm_resource_group.rg.name
  network_interface_ids = [element(azurerm_network_interface.nic.*.id, count.index)]
  vm_size = element(var.vm_size_grupo1, count.index)
  storage_os_disk {
    name = element(var.vm_osdisk_name_grupo1, count.index)
    caching = "ReadWrite"
    create_option = "FromImage"
    managed_disk_type = "Standard_LRS"
    disk_size_gb = element(var.vm_osdisk_size_grupo1, count.index)
  }
  storage_image_reference {
    publisher = element(var.image_reference_publisher, count.index)
    offer = element(var.vm_image_offer, count.index)
    sku = element(var.sku, count.index)
    version = "latest"
  }
  os_profile {
    computer_name = element(var.vm_names_grupo1, count.index)
    admin_username = var.admin_username
    admin_password = var.admin_password
  }
  os_profile_linux_config {
    disable_password_authentication = false
  }
}
resource "azurerm_managed_disk" "example" {
  count = length(var.extradisks)
  name = element(var.extradisks, count.index)
  location = var.location
  resource_group_name = azurerm_resource_group.rg.name
  storage_account_type = "Standard_LRS"
  create_option = "Empty"
  disk_size_gb = 10
}
resource "azurerm_virtual_machine_data_disk_attachment" "example" {
  count = 2
  managed_disk_id = element(azurerm_managed_disk.example.*.id, count.index)
  virtual_machine_id = element(azurerm_network_interface.nic.1.id, count.index)
  lun = "10"
  caching = "ReadWrite"
}

The problem is in this line

virtual_machine_id = element(azurerm_network_interface.nic.1.id, count.index)

I don't know how to specify a vm
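
Two things stand out in that line: it passes a network interface ID where a VM ID is expected, and it hands `element()` a single string rather than a list. In HCL, a counted resource can be indexed directly, so a reference of the form `azurerm_virtual_machine.vm[1].id` would pin every attachment to the second VM. Each disk attached to the same VM also needs its own LUN, so something like `10 + count.index` may be needed instead of a fixed `"10"`. The Python sketch below is only a model of that indexing, not Terraform code. `plan_attachments` and the sample IDs are hypothetical, and `element` mimics the wrap-around behaviour of Terraform's function of the same name.

```python
def element(items, index):
    """Model of Terraform's element(): the index wraps around the list length."""
    return items[index % len(items)]


def plan_attachments(disk_ids, vm_ids, target_vm_index, first_lun=10):
    """Attach every extra disk to one fixed VM, giving each disk a distinct LUN."""
    target_vm = vm_ids[target_vm_index]  # plays the role of vm[1].id in HCL
    return [
        {
            "managed_disk_id": element(disk_ids, i),  # i plays the role of count.index
            "virtual_machine_id": target_vm,
            "lun": first_lun + i,
        }
        for i in range(len(disk_ids))
    ]


if __name__ == "__main__":
    for attachment in plan_attachments(["disk-a", "disk-b"], ["vm-0", "vm-1"], 1):
        print(attachment)
```

The model also suggests setting the attachment's `count` to `length(var.extradisks)` rather than a hard-coded 2, so the number of attachments follows the number of disks.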

https://redd.it/o7cu62
@r_devops
Want to go into DevOps and AWS. Any project to begin with? Any material for a beginner to study the AWS certification?

I'm a web developer with knowledge of PHP, NodeJS, and SQL databases. Last year I started using Docker to create a development environment for my projects, and last week I finished a Docker + Kubernetes course with a project deployed to Google Cloud. I'm interested in getting into DevOps because I like managing services and servers and analyzing metrics (I've hosted many gaming servers on an old PC with Linux), so I'm looking for projects to run in Docker and deploy on GCP while I still have the 300 free credits.

Also, since a lot of companies are asking for specialists with knowledge of AWS, I'm looking for material to study for the certification. Since I've never used AWS and have zero experience with cloud computing, I'm asking whether I need a Coursera course or whether the official AWS docs are enough for the certification.

https://redd.it/o7dyem
@r_devops
does devops mean something else or what? how do you define devops?

I am not sure if I understand the term DevOps correctly even though it's in my job title. Somebody corrected me with another definition in a post a month ago and got more upvotes than me, so I am confused (of course, I could be wrong)

1. https://old.reddit.com/r/cscareerquestions/comments/ncbq99/juniorsweatbigtechthisisnotwhatiexpected/gy4465s/?context=3

Please read the original post as well to see where I was coming from.

And what do you think DevOps is?

https://redd.it/o7evuj
@r_devops
What is the best way to learn about devops?

Could you please tell me the best way to learn about devops?
The tools, software, etc...?

https://redd.it/o7eq2o
@r_devops
Which one of these certificates will help me for DevOps career

Hi, my school is offering one free course for one of the certifications below. Which one do you think will be helpful when applying for a job in DevOps or a similar role such as release engineer? They take at least 12 days to complete.

1. Microsoft Certified: Azure Developer Associate

2. Microsoft Certified Azure Administrator  

3. Microsoft Certified: Azure Data Fundamentals

4. Microsoft Certified: Azure Administrator

5. Microsoft Certified: Azure AI Engineer Associate




Also not related but from the same school:

1. APU Certified PHP & MYSQL Developer

2. APU Certified ASP.NET Developer

3. APU Certified Data Analyst

4. APU Certified Java Developer

5. APU Certified Cybersecurity Engineer

https://redd.it/o7hbif
@r_devops
Refactoring-obsessive dev

So be me, helping with Terraform, AWS, and various processes and operations stuff in a three-man team plus manager within a rapidly developing startup

Not everything I can do on time, some ideas get stale, firefighting questions burns down, backlog exists and compliance pressure is high, deadlines are tight.

Now backend dev starts to walk through pipelines, terraform code and beautify them, increase readability and parametrize everything. Touches drafts and concepts only, avoids prod code

He is not asked doing so
He has features backlog started sprints ago
There were talks on this behavior - does not help
Management squeezes biz value out of his code efforts
Started blaming on me blocking his coding testing etc with some absent integrations, but dies when asked shall the code work if he has integration or requirement x in 15 minutes

After his refactoring i have strong feeling i am that bad i don’t deserve work, profession, food
Got less sleep also

Looked at his code, ran sonar, created tickets on some findings regarding unification across projects, docs and so on just out of rage and to see his reaction to his own issues - got devaluated for “creating tickets is ineffective communications”

Frankly speaking, I don't want to leave the company; other people are fine.

I'm not new to operations, SRE, and DevOps, but this whole thing is a bit too much for my paper-thin skin and I feel like I need to go back to school

Manager is already looking for secondary developer (or says so)

Any survival advice appreciated

https://redd.it/o7icm4
@r_devops