Reddit DevOps
Reddit DevOps. #devops
Thanks @reddit2telegram and @r_channels
CI/CD pipeline architecture in repository containing multiple services

It's a somewhat older project that started as a backend and a frontend, with one pipeline that tested, built and deployed everything.

As time went by, there was some heavy work on the backend, and as a result a few services were added, but in the same project as the backend and frontend, not in separate repos.

Currently there is one pipeline that tests, builds and deploys everything, which isn't efficient because I have to wait a long time for a run to finish even if I only changed one service.

I think I have two options to fix this:

1. Make a separate pipeline for each service, save them in the repo, and reuse code from parts of the main pipeline.
2. Edit the existing pipeline to add triggers and checks for each folder containing a service, so that it runs only for the services that changed.

For reference, this is Azure DevOps. I'm leaning towards option 1 because it seems logical (single responsibility principle). What do you guys think?
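
Whichever option is chosen, the core of "run only what changed" is mapping changed file paths to services. A minimal sketch of that check, assuming a hypothetical folder layout (`backend`, `frontend`, `services/billing`) and a list of changed paths obtained elsewhere (for example from `git diff --name-only`):

```python
from pathlib import PurePosixPath

# Hypothetical mapping of repo folders to service names; adjust to the real layout.
SERVICE_DIRS = {
    "backend": "backend",
    "frontend": "frontend",
    "services/billing": "billing",
}


def affected_services(changed_files):
    """Return the set of services whose folder contains at least one changed file."""
    affected = set()
    for name in changed_files:
        parents = PurePosixPath(name).parents
        for folder, service in SERVICE_DIRS.items():
            # A file belongs to a service if the service folder is one of its ancestors.
            if PurePosixPath(folder) in parents:
                affected.add(service)
    return affected
```

With separate pipelines (option 1) the same effect is usually achieved declaratively through path filters on each pipeline's trigger, so a script like this would mainly be useful for option 2.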

https://redd.it/12sskxz
@r_devops
Loadbalancer - round robin with 100%

I have 2 RPC servers that need to be serving requests with 100% uptime.


I have 2 questions:

1) To achieve 100% uptime I need to deploy a load balancer, let's say with round robin, that health-checks every RPC server every N milliseconds. An RPC request takes around 10 ms, so there may be a situation where a server is already down (update or outage) but the next health check has not run yet, and the request returns an error. What should I do in that case? Make every request with retries and a small timeout, so that it is eventually routed to a healthy node before the next health check marks the bad node as down?



2) How do I make the load balancer itself resilient? I don't want to go with Kubernetes, as it's really hard for me to master. What can I use?
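
For the first question, the retry idea could look roughly like the sketch below on the client side. The `send` callable is hypothetical: it stands for whatever performs one RPC against one backend with a short timeout and raises on failure. Retrying like this is only safe when the requests are idempotent.

```python
import itertools


class AllBackendsFailed(Exception):
    """Raised when every attempt failed."""


def call_with_failover(backends, send, max_attempts=3):
    """Try backends in round-robin order until one answers or attempts run out.

    `send(backend)` is a hypothetical callable that performs the RPC with a
    short timeout and raises ConnectionError or TimeoutError on failure.
    """
    last_error = None
    for backend in itertools.islice(itertools.cycle(backends), max_attempts):
        try:
            return send(backend)
        except (ConnectionError, TimeoutError) as exc:
            # Remember the failure and move on to the next backend.
            last_error = exc
    raise AllBackendsFailed(f"{max_attempts} attempts failed") from last_error
```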

https://redd.it/12ss262
@r_devops
Jenkins: Control retries upon job failure.

Hello,

Lately I've run into an issue with Jenkins.

The build keeps triggering every 15 minutes and failing in a loop, non-stop. The reason says "no change"; the job is a Docker image build job.

I want to configure Jenkins so that a failing job stops after several attempts: retries should be counted and limited to a certain number.

In the console logs I noticed the details below:

1. "Started by an SCM change" --- no changes were made in the DEV branch, which is the branch the trigger is configured for.

2. "Merged in PRODUCTION (pull request #2901)" --- this is present in the logs even though the branch specifier is configured with the value DEV.

I tried to exclude PRODUCTION with !PRODUCTION in the branch specifier configuration, but the job still runs in a loop.

Looking on the internet, I found that a plugin can be used to control the number of retries, but before trying that I would like to know whether there is any configuration that limits retries when a job/build fails.

https://redd.it/12sqegx
@r_devops
Guns, Lots of Guns: There Is No Silver Bullet

The software industry has notoriously sought new models, concepts and frameworks that can miraculously fix whatever is broken. But what if DevOps is already gulped and agile is already swallowed – by software? Lars Kruse said there is no silver bullet. What will the Boomer/Gen X cohorts that ruled agile and DevOps for decades pass on to Gen Z coders? What does the future hold as the industry goes serverless?

https://redd.it/12swph4
@r_devops
How to pick up cloud effectively

Hi,


From browsing job listings I found that a company either uses cloud technologies heavily or doesn't touch them at all. I'd be most comfortable learning it on the job with hands-on examples and a bit of mentoring, because I believe that is the most relevant experience. Skimming through a Udemy course will surely get me closer to the subject, and a cert would do wonders, but I was thinking about the easiest and most practical approach, which is to learn while doing it.


I'm only 1.5 years in and have some experience in scripting / CI/CD / containerization / orchestration, but only with on-prem envs. I did close to no formal "learning"; I did most of it with the help of Google and colleagues, and I found this is da wae for me.


So in your typical job listing, knowing about cloud is either 50% of the job or isn't present at all. Do you agree? Should I just bite the bullet and force myself to get the entry certs?


Thanks

https://redd.it/12swdxr
@r_devops
CREATING BASE DOCKER IMAGES WITH ALL DEPENDENCIES INSTALLED

We've got a bunch of microservices that we build via Docker. Right now, every time we build an image we reinstall the external dependencies/packages.

I'm thinking of creating a base image which will already have all of our dependencies/packages installed and then on top of that image we just build our microservice.

Do you folks recommend this kind of approach?


Edit : Sorry if I look like I'm yelling in the title, I couldn't do lowercase for some reason

https://redd.it/12srdou
@r_devops
When applying for roles, how much does the purpose/vision of it affect your decision?

I’ve been selective in trying to land my first engineering role because the visions haven’t seemed super appealing to me.

To you, does the product/purpose make a difference? Or do you just pick roles based on skills you’ll build and the technology you’ll be using?

https://redd.it/12t2kah
@r_devops
Noob here looking for a method to install Docker Desktop edition to a different partition in Win 10.

Looking for a foolproof method to install Docker Desktop edition to a different partition in Win 10. I tried the following command line flag but it didn't work. Installation proceeded with the default settings, i.e. on the C drive itself, which by the way doesn't have much free space at the moment.


* --installation-dir=<path> : changes the default installation location (C:\Program Files\Docker\Docker)

Would be grateful for any pointers to accomplish this.

Thanks

https://redd.it/12t2gpk
@r_devops
Hi guys!!! I'm new to CI/CD pipelines. Can anyone recommend any instructors, YouTube channels or resources I can use to master GitHub Actions & Jenkins?

YouTube channels, book recommendations etc. will be highly appreciated.

https://redd.it/12t236w
@r_devops
Small environment on docker swarm

Hey devops!

I recently jumped into the world of devops with Docker Swarm. After a somewhat difficult learning curve (mostly because of the environment constraints for the swarm setup) I am thrilled and loving it.

The thing is, I took over the environment mid-project and quite frankly it is a bit of a clusterf***. But that is another story.

I need a quick, easy to set up and, if possible, web UI enabled configuration management system to at least fix the mess and create a platform basis for something better (e.g. Ansible, Salt and the like).

In essence I have to modify a bunch of parameters on more or less 200 Debian servers (apt sources, sudoers, SSH keys, etc.). I have SSH access to them all.

I initially thought of a dockerized Ansible, but there doesn't seem to be an up-to-date Ansible container, and to be honest I don't have time to create my own, as there are multiple fires to deal with. I need something quick, easy and functional.

I know I may be looking for the golden triangle (easy, beautiful, quick) but... is there such a thing?

https://redd.it/12tctuw
@r_devops
How can I learn puppet with only one node?

Hi, I'm trying to learn puppet but all the tutorials and quickstart guides assume you have access to multiple nodes. I only have one. Is there a way to run puppet in a single node mode?

https://redd.it/12tm3t6
@r_devops
Database / Platform to track releases done on each system - Does it exist?

At least in my environments, because of the large number of microservices / apps, we now need a central platform to track all the versions released on each system.

I imagine that if one were to build something DIY, it would be some sort of centralized database or, even simpler, an Excel sheet with every version and system listed.

I can't seem to find a known tool that will do this (OSS or paid). Does anyone know the proper term for such a platform, or which keywords / terms I should search for?

Supposedly Spotify Backstage can do this sort of release tracking, but that assumes the app was started / onboarded from Backstage initially.
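
As a rough illustration of the DIY idea, here is a minimal sketch of such a centralized database using SQLite. The table layout and the function names (`open_db`, `record_release`, `current_versions`) are made up for this example; a deployment pipeline would call `record_release` after each successful rollout.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the release-tracking database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS releases (
               system TEXT NOT NULL,
               app TEXT NOT NULL,
               version TEXT NOT NULL,
               deployed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def record_release(conn, system, app, version):
    """Append one deployment record; history is kept, nothing is overwritten."""
    conn.execute(
        "INSERT INTO releases (system, app, version) VALUES (?, ?, ?)",
        (system, app, version),
    )
    conn.commit()


def current_versions(conn, system):
    """Return {app: version} for the most recent record of each app on a system."""
    rows = conn.execute(
        """SELECT app, version FROM releases
           WHERE system = ? AND rowid IN (
               SELECT MAX(rowid) FROM releases WHERE system = ? GROUP BY app)
           ORDER BY app""",
        (system, system),
    )
    return dict(rows.fetchall())
```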

https://redd.it/12tmczj
@r_devops
IaC - best practices

Hello, my organization runs an application with Google Cloud serving as the underlying infrastructure. We manage this through Terraform infrastructure-as-code scripts.

We now need to host an additional application with a second app code base.

Question: should I have 2 different code bases for the underlying infrastructure or a single code base?

If I elect a single code base, how hard would it be to decouple the code if I choose to migrate one of the apps?

Note: I will also have different code bases for the various environments (dev, test, prod) but this is irrelevant.

https://redd.it/12tmzln
@r_devops
How do you decouple metrics generation from structured logging?

Where I work, I often see monitoring code that records the same fact twice: once for logging, and once for metrics, for example (Python-ish code):

import prometheus_client

# get_logger() comes from the logging setup (e.g. a structured logger that supports bind()).
base_logger = get_logger()
REQUESTS_RECEIVED = prometheus_client.Counter('requests_received', 'Number of requests received', ['method', 'status'])

def middleware(request, next_call):
    logger = base_logger.bind(method=request.method, endpoint=request.url)
    logger.debug("starting to handle request")
    response = next_call(request)
    logger.debug("successfully handled request", status=response.status)
    # The same fact is recorded a second time, now as a metric.
    REQUESTS_RECEIVED.labels(request.method, response.status).inc()
    return response

Having thought about it a little, it seems that it'd be better for the application to export only structured events. Some downstream process then receives these events, processes them, and then generates metrics based on defined rules. For example, one could define a rule for the above `REQUESTS_RECEIVED` metric, and another for a `REQUEST_TIME` metric based on how long it took for a request to be handled. In particular, those metrics could be derived from multiple events in aggregate.
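
To make the idea concrete, here is a minimal sketch of such a downstream process. It is not an existing tool: the event shape (dicts with `event`, `request_id`, `method`, `status` and `ts` fields) and the two hard-coded rules are assumptions made for illustration.

```python
from collections import Counter


def derive_metrics(events):
    """Derive metrics from a stream of structured events (dicts).

    The field names 'event', 'request_id', 'method', 'status' and 'ts'
    are assumptions about the event shape, not an established schema.
    """
    requests_received = Counter()  # rule 1: count finished requests by (method, status)
    request_time = {}              # rule 2: duration per request, from two events
    started = {}
    for e in events:
        if e["event"] == "request_started":
            started[e["request_id"]] = e["ts"]
        elif e["event"] == "request_finished":
            requests_received[(e["method"], e["status"])] += 1
            start = started.pop(e["request_id"], None)
            if start is not None:
                request_time[e["request_id"]] = e["ts"] - start
    return requests_received, request_time
```

The second rule shows the "multiple events in aggregate" case: the duration only exists once the start and finish events for the same request have both been seen.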

Does something exist already to do this?

https://redd.it/12tuwyk
@r_devops
How to create a multi developer bitbucket workflow

Hi guys,

I have started at an agency that has multiple developers all over the world.

They have all been cloning sites from the repo to their local machines, making changes and committing. But you know there are a lot of problems with this. People forget to rebase onto main. Developers are all using different versions of Apache and MySQL, not to mention PHP.

And then they commit to main and FTP to staging thinking “oh it worked in local”, something breaks and hopefully they notice before pushing to production.

So.

This is the process I am setting up:

1. Use Sourcetree locally to commit to Bitbucket.
2. A Bitbucket pipeline deploys automatically to staging.
3. The Bitbucket pipeline is configured for manual deployment to production.

But the missing pieces of the puzzle are still the different Apache, MySQL and PHP versions. So:

Is there software that a team of devs can use that will automatically set the versions of AMP, so that when they commit we get consistency from each developer?

Or is there a way to require the versions in the Git config file?

Thanks very much 😊

https://redd.it/12tvrcg
@r_devops
what is needed to be an SRE/Devops/platform in MAANG

Hey All,


I think this will be useful for many.

For anyone who works as an SRE/DevOps/platform engineer (or whatever they call it these days) at MAANG: what skills are necessary to get into these companies as an SRE/DevOps/platform engineer?


I have a total of 6.5 years of experience with various cloud and DevOps tools, including Terraform, Ansible, Packer, k8s, Docker and CI/CD with Jenkins/Ansible/TeamCity. I mostly work in a central team, so I get requirements to automate CI/CD end to end, and I use a mix of shell/Python/PowerShell and a mix of tools.


Now I want to move higher up in terms of companies. I am a decent coder but have never done web dev or anything of that sort; mine was mostly scripting. I am good at fixing systems: when something breaks (DNS, LB, etc.) I can dig deeper, and I get the first call to fix it since I'm good at Linux/networking and troubleshooting.


I heard MAANG-type companies just evaluate problem solving and never care about experience. Is that true? If so, how should I start and what do I need to revise? Please help.


And how should I prepare now for these roles at these companies?

https://redd.it/12twdrf
@r_devops
Bit of a "wow" this company is great to work for post, feel free to share stories yourselves

So I started at a big financial tech firm recently. I've had my share of shitty jobs in the past where blame culture is massive, but I was shocked a few weeks ago by this new company.


Someone in another team fat-fingered an update in a tool, which took down everything in production in such a way that the automated systems couldn't bring the apps back up successfully, so a bit of manual work had to be done. It was over a weekend, so luckily there wasn't a lot of traffic hitting the app, but it was still a big outage.


In other places I've worked, even though it was accidental with no malicious intent, that guy would have been fired immediately and the issues swept under the rug (fixed quietly).


In this company they didn't name names. In a company-wide email (sent by the CEO, CTO, COO, etc.) they said it was a fault in an automated system, and that if anyone was to blame it was them as managers for not foreseeing that this would be a problem.

Only a few people in cloud ops with the correct access to this tool could see who did it.

The guy later said he was called into a meeting with all the CEO/COO types, and went in thinking the worst.

They reassured him the blame was on them, thanked him for alerting them to the issue and, seeing how distraught he was, offered him the next few days off, paid, so he could return to work in a good frame of mind.

The guy is still at his job and is part of the team fixing the issue, still has all his admin accesses too.

https://redd.it/12txhbs
@r_devops
Looking for the cheapest deployment option

Hey guys. I need to deploy a service into AWS and I'm torn between ECS Fargate and Lambda functions. It's a simple service which receives e-commerce orders, parses the information and transfers it to a database. It will probably be called between 10 and 50 times a day.
I already have the container ready to be deployed, but I'm also thinking about Lambda functions due to their low price... What do you guys think? The service itself takes just a few 'ms' to run.
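
A back-of-the-envelope comparison can be sketched as below. Both functions are simplified models written for this example: the Lambda one charges per request plus per GB-second of run time, the Fargate one assumes a single task running around the clock. All prices are parameters on purpose; they have to be taken from the current AWS pricing pages, and free-tier allowances and minimum billing increments are ignored.

```python
def lambda_monthly_cost(invocations_per_day, duration_s, memory_gb,
                        price_per_request, price_per_gb_second, days=30):
    """Simplified per-invocation model: request fee plus compute time."""
    invocations = invocations_per_day * days
    compute = invocations * duration_s * memory_gb * price_per_gb_second
    return invocations * price_per_request + compute


def fargate_monthly_cost(vcpu, memory_gb, price_per_vcpu_hour,
                         price_per_gb_hour, days=30):
    """Simplified always-on model: one task running 24 hours a day."""
    hours = 24 * days
    return hours * (vcpu * price_per_vcpu_hour + memory_gb * price_per_gb_hour)
```

The structural difference is what matters at 10 to 50 calls a day: the first model scales with the number of invocations, while the second is paid for every hour regardless of traffic.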

https://redd.it/12u1xt6
@r_devops