Which cloud provider offers free credits for Nvidia GPU instances?
I want to test and train my AI model on a GPU VM, but Azure and Google Cloud don't allow free credits to be used on GPU instances (at least not on the ones I'm interested in). Is there any provider I could use, or will I need to pay for this kind of machine out of my own pocket?
https://redd.it/1chszc5
@r_devops
Reddit
From the devops community on Reddit
Explore this post and more from the devops community
APM for React
What APM do you use for React applications?
We already have Grafana Tempo, and we don't want to install Elasticsearch just for React metrics.
https://redd.it/1chsstw
@r_devops
How to practice CI/CD?
Hello, I am new to DevOps. I have been watching some videos on YouTube on how to get started (mostly videos on GitLab). So far I have watched the one-hour TechWorld with Nana video and the Automation Step by Step playlist on GitLab. My question is: how should I practice CI/CD? Any other resources, preferably free, would also be welcome.
https://redd.it/1chyvcg
@r_devops
Where do I draw the line between Mid- and Senior-level effort, so that I don't do more work than I'm being paid for?
I recently applied to a senior-level role at a company. They ended up making me an offer, but at a lower salary than I was expecting (it's a pay cut, but it's not exactly crap pay either). I also noticed that my title in the HR system does not include "Senior".
I'm the only full-time infra engineer.
I'm totally capable of going into the company and leading all efforts for their current and future needs, but apparently they're neither titling nor paying me to do so.
I've been trying to read through generalized engineering levels on the internet, but most are written about software developers.
So, I'm asking here, too:
How much effort and responsibility would you limit yourself to, and what boundaries would you hold, in order to make sure you weren't doing senior-level work for mid-level pay and title? Thanks!
https://redd.it/1chzn1v
@r_devops
Should I monitor my monitoring stack?
(🤯)
How do you ensure that your monitoring stack is working as expected and that you didn't mess up the config?
If you're using a SaaS (Grafana Cloud, Datadog, whatever), do you have another solution that will alert you in case of an outage?
Maybe it's just that there's no simple solution that's worth the effort. ¯\_(ツ)_/¯
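One common answer to this kind of question is a dead man's switch: the monitoring stack sends a regular heartbeat to a small, independent watcher, and the watcher raises the alarm when the heartbeat stops arriving. The sketch below shows only the staleness check at the core of that idea. The function name, the five-minute threshold, and how the heartbeat timestamp gets recorded are all assumptions made for illustration, not part of any particular product.

```python
from datetime import datetime, timedelta, timezone


def heartbeat_is_stale(last_seen: datetime, now: datetime, max_age: timedelta) -> bool:
    """Return True when the most recent heartbeat is older than max_age."""
    return now - last_seen > max_age


if __name__ == "__main__":
    # Hypothetical values: in practice last_seen would be read from wherever
    # the monitoring stack records its heartbeat (a file, a key-value store, ...).
    now = datetime.now(timezone.utc)
    last_seen = now - timedelta(minutes=12)
    if heartbeat_is_stale(last_seen, now, max_age=timedelta(minutes=5)):
        print("No heartbeat from the monitoring stack: alert through an independent channel")
```

The important design point is that the watcher runs somewhere that does not share a failure domain with the stack it is watching.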
https://redd.it/1chzkj8
@r_devops
Trying to move into DevOps - Need help reviewing resume
Hi everyone, I currently work remotely at a small business that does physical labor contracting, where I maintain their website and everything considered IT within the business. I have basically automated everything to the point where I have nothing to do all day other than study, so I want to try to get into the DevOps field and need help peer reviewing my resume.
https://imgur.com/a/ARYgTEn
Also, please let me know if any changes are necessary and why, so I can learn and improve on making/editing my resume. Thanks in advance for all your feedback!
https://redd.it/1ci1822
@r_devops
RF, XRAY and Bitbucket Integration
Good day.
Does anyone know of a step-by-step guide or link for integrating Robot Framework tests housed in Bitbucket with Xray + Jira? The Xray documentation doesn't really help much. My experience is mostly in test scripting, so this kind of setup is new to me. Thank you in advance.
https://redd.it/1ci7pdz
@r_devops
AI in Infrastructure
Has anyone implemented AI in infrastructure provisioning? If so, how beneficial has it been for your operations? #shareit
https://redd.it/1ci8c5h
@r_devops
Question on Infrastructure as Code - How do you promote from dev to prod?
How do you manage changes to infrastructure as code, particularly testing them before they go into production? Production infra might differ a lot from the lower environments. Sometimes the infra component we are changing may not even exist in a non-prod environment.
https://redd.it/1ci9kco
@r_devops
Review my resume: I am not getting any interview calls
Please roast my resume. Be detailed about which skills I am missing and, if you can, suggest learning sources that would help me jump my package to 14-20 LPA or more in India.
https://redd.it/1ciauhi
@r_devops
postimg.cc
Screenshot 20240502 141507 Drive — Postimages
Scaling Observability & OpenTelemetry @ Skyscanner
Hey everyone👋
If you're in London, UK next week and interested in observability and OpenTelemetry, I think you'll enjoy this edition of the Observability Engineering Meetup.
Who: Dan is the Observability lead at Skyscanner, a member of the OpenTelemetry Governance Committee, and the author of "Practical OpenTelemetry: Adopting Open Observability Standards Across Your Organization."
What: Dan will share some of his experiences leading an observability transformation at Skyscanner, from custom solutions to telemetry standards and from a root cause analysis based on intuition and past experience to one based on context and evidence.
If you can't make it, we'll record the talk and post it on this YouTube channel.
https://redd.it/1cibem9
@r_devops
Is system design needed for a DevOps engineer?
Hello, how much system design is needed for DevOps engineers?
Please share your experience.
I am currently reading Alex Xu's system design interview prep book. Is it good?
I want to switch jobs soon, so I need your input on this topic.
Dear highly paid DevOps engineers, please share your experience on how to become a highly paid DevOps engineer, and with which skill set.
https://redd.it/1cian4h
@r_devops
How to monitor the CO2 emissions of your AWS application
The US and European Union have set ambitious goals to reduce emissions by 40% and 55% by 2030. Yet, many companies lack solid strategies for making their tech stack more sustainable.
Discover innovative methods to create a greener future with Kubernetes clusters without sacrificing application performance. https://www.perfectscale.io/blog/what-is-the-carbon-impact-of-kubernetes
https://redd.it/1ciaqvy
@r_devops
www.perfectscale.io
What is the Carbon Impact of Kubernetes? | PerfectScale
Get a deep dive into how Kubernetes affects the environment—and why continuous optimization is the key to sustainable software.
LGTM Stack VS Google Cloud Operations Suite
Hello! I was wondering if anyone had any experience using either of these. Right now I have a project with a company to essentially improve the log management they use. It's a large enterprise-level company, but the team itself and the application they use are for internal staff, and the application creates around 80-100GB of logs per week. It's hosted on a Kubernetes cluster.
They're currently using Google Cloud Operations Suite with FluentBit as the log shipper, where logs are sent to Cloud Logging. Metrics are monitored with Prometheus and there's no tracing. Alerts are also dealt with through Google Alerts.
I essentially wanted to implement the LGTM stack considering this has very good integration with Kubernetes running in microservices mode - I can configure tracing through Tempo and OpenTelemetry and also set up metrics through Prometheus for an observability stack showing logs, metrics and traces in Grafana.
However, after a lot of research I still can't quite figure out whether this implementation would actually improve anything on their end. There's no real information on Loki/LGTM stack vs GC Operations Suite, and I don't know if there would be any big differences in cost, speed, resources, performance, etc. Is Loki better than Google Cloud Logging at what it does? Are Grafana Alerts better than Google Alerts? Are there alternatives I can use instead? It's a big company, so the actual costs of the additional resources really don't matter as long as the solution works.
Thank you for any advice you can give me on this!
https://redd.it/1cig0ad
@r_devops
Generating IaC with drag-and-drop interface
Hello everyone, a couple of months ago, I wrote here to ask for your opinion on the tool I've developed, which allows you to generate IaC from a drag-and-drop interface. I've implemented several suggestions I received, including extending the number of components (the tool now covers all AWS RDS offerings except Oracle), adding VPC endpoint support, and improving architecture validation.
It would be great if you could check it out and maybe suggest some more features it's missing: https://app.archformation.com/
https://redd.it/1cihhdm
@r_devops
Help with alertmanager & webex
Hello colleagues,
Does anyone have experience with migrating Alertmanager alerts to Webex Teams? We are currently in transition from Slack to Webex (don't ask me why) and are migrating all of the Slack alerts/notifications to Webex. This is the current Alertmanager configuration (the relevant part of it):
....
receivers:
  - name: default
  - name: alertswebex
    webex_configs:
      - api_url: 'https://webexapis.com/v1/messages'
        room_id: '..............'
        send_resolved: false
        http_config:
          proxy_url: ..............
          authorization:
            type: 'Bearer'
            credentials: '..............'
        message: |-
          {{ if .Alerts }}
          {{ range .Alerts }}
          "**[{{ .Status | upper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] Event Notification**\n\n**Severity:** {{ .Labels.severity }}\n**Alert:** {{ .Annotations.summary }}\n**Message:** {{ .Annotations.message }}\n**Graph:** [Graph URL]({{ .GeneratorURL }})\n**Dashboard:** [Dashboard URL]({{ .Annotations.dashboardurl }})\n**Details:**\n{{ range .Labels.SortedPairs }} • **{{ .Name }}:** {{ .Value }}\n{{ end }}"
          {{ end }}
          {{ end }}
....
But the bad part is that we receive a 400 error from Alertmanager:
msg="Notify for alerts failed" num_alerts=2 err="alertswebex/webex[0]: notify retry canceled due to unrecoverable error after 1 attempts: unexpected status code 400: {\"message\":\"One of the following must be non-empty: text, file, or meetingId\",\"errors\":[{\"description\":\"One of the following must be non-empty: text, file, or meetingId\"}],\"trackingId\":\"ROUTERGW......\"}"
The connection works, since simple messages are sent; however, these "real" messages are dropped. We also thought about using webhook_configs, but the payload can't be modified (without a proxy in the middle).
Anyone with experience with this issue? Thanks
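For the "proxy in the middle" route mentioned above, the transformation step could look roughly like the following. This is a minimal sketch, not a diagnosis of the 400 error: it assumes the standard Alertmanager webhook payload (an `alerts` list whose entries carry `status`, `labels`, `annotations`, and `generatorURL`) and builds a body for the Webex messages API with `roomId`, `text`, and `markdown`. The function name is hypothetical, and the HTTP server and the authenticated POST to Webex are left out.

```python
from typing import Any


def webex_body_from_alertmanager(payload: dict[str, Any], room_id: str) -> dict[str, str]:
    """Turn an Alertmanager webhook payload into a Webex message body."""
    blocks = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        status = str(alert.get("status", "unknown")).upper()
        lines = [
            f"**[{status}] Event Notification**",
            f"**Severity:** {labels.get('severity', 'n/a')}",
            f"**Alert:** {annotations.get('summary', 'n/a')}",
            f"**Message:** {annotations.get('message', 'n/a')}",
        ]
        if alert.get("generatorURL"):
            lines.append(f"**Graph:** [Graph URL]({alert['generatorURL']})")
        blocks.append("\n".join(lines))

    markdown = "\n\n".join(blocks)
    if not markdown.strip():
        # Fail locally instead of letting the API reject an empty message.
        raise ValueError("no alerts in payload, refusing to build an empty message")

    # Plain-text fallback alongside the markdown version.
    return {"roomId": room_id, "text": markdown.replace("**", ""), "markdown": markdown}
```

Failing early on an empty message keeps the failure visible in the relay's own logs, which may make it easier to tell a rendering problem from a delivery problem.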
https://redd.it/1cij91j
@r_devops
Automate All The Things DevOps project now with GitHub Actions
As suggested by someone from this community, I've moved the pipelines on Automate All The Things from Azure DevOps to GitHub Actions.
I didn’t know GitHub Actions was free when the repo is public. This makes it SOO much easier to get started with the project, so thank you stranger who suggested this!
https://github.com/tferrari92/automate-all-the-things
The Azure DevOps version is still there in its own branch.
Any other feedback or suggestions are always welcome! There's always room for improvement.
Also.. Nirvana Edition with Backstage.io is coming soon!
https://redd.it/1ciiqie
@r_devops
GitHub
GitHub - tferrari92/automate-all-the-things: First edition of the Automate All The Things Saga
First edition of the Automate All The Things Saga. Contribute to tferrari92/automate-all-the-things development by creating an account on GitHub.
Specializing within DevOps
There really is too much to know these days. What areas are there to specialize in?
My thoughts:
Kubernetes - I can see why some engineers love it. An awesome paradigm at the base layer and so much of interest built on top of it.
Observability - almost a science in itself, with plenty to get into (or related to), be it monitoring, alerting, analytics, or service management.
Platform management - building out a consumable platform, kinda like being a developer for developers.
Architect - the problem I have with this is that developers are going to have their own software architects doing system design that may already overlap with the infra side. Also, many expect engineers to have software architecture skills anyway. So where does that leave the cloud/DevOps architect? I feel there is not much mileage in this path.
Any others? With each passing year, I increasingly think it is not a good idea to stay in the middle as a generalist, and that it is time to pick a path.
https://redd.it/1cikm6n
@r_devops
Can I use Prometheus and Grafana to build a localized cluster monitoring system?
I manage computing clusters and want to monitor them locally. I have never tried setting up a monitoring system on them before.
My idea is to set up Prometheus on all servers so I can export the data to Grafana, running everything locally.
I’ve tried using Netdata and it worked beautifully, but I want the monitoring to be secure, and Netdata doesn’t cut it. Hence this solution.
Have you worked on anything like this in the past and what do you recommend?
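One common layout for this kind of setup, offered here as an assumption rather than the only option, is a single Prometheus server that scrapes a lightweight exporter such as node_exporter on each machine, with Grafana reading from that one Prometheus. The sketch below generates a target list in the JSON shape that Prometheus file-based service discovery (`file_sd_configs`) reads. The host names, cluster label, and output file name are hypothetical; 9100 is node_exporter's default port.

```python
import json


def node_exporter_targets(hosts: list[str], cluster: str, port: int = 9100) -> list[dict]:
    """Build one file_sd target group covering every host in a cluster."""
    return [
        {
            "targets": [f"{host}:{port}" for host in hosts],
            "labels": {"cluster": cluster},
        }
    ]


if __name__ == "__main__":
    # Hypothetical host names; replace with the real inventory.
    groups = node_exporter_targets(["node01", "node02", "node03"], cluster="hpc-a")
    with open("node_targets.json", "w", encoding="utf-8") as fh:
        json.dump(groups, fh, indent=2)
```

Keeping the target list in a generated file means new servers can be added without editing the main Prometheus configuration.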
https://redd.it/1ciod7f
@r_devops
Programmatically deploy apps to k8s cluster with ArgoCD+Argo Workflows
I'm using Argo Workflows + ArgoCD, and I want to automatically deploy many apps from git repos to my cluster. Am I supposed to just commit new deployment manifests to the git repo every time I want to deploy a new app?
Like,
1. Build image and push it to a registry with a workflow.
2. Generate a new manifest that deploys the built image.
3. Push the manifest to the git repo.
I've been googling for two days, but all I see are examples on how to update image tags for already deployed images. The closest I've found is this repository but they're directly applying the manifests.
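Step 2 of the list above could be sketched as a small generator that a workflow step runs before committing. This is only an illustration: it emits an Argo CD `Application` object as JSON (which is also valid YAML), and the repository URL, paths, app name, and the `argocd` namespace and `default` project are assumptions to be replaced with real values.

```python
import json


def argocd_application(name: str, repo_url: str, path: str, namespace: str,
                       revision: str = "HEAD") -> dict:
    """Build an Argo CD Application manifest for one app."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {"repoURL": repo_url, "path": path, "targetRevision": revision},
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


if __name__ == "__main__":
    # Hypothetical app and repository, for illustration only.
    app = argocd_application(
        name="demo-app",
        repo_url="https://example.com/org/deployments.git",
        path="apps/demo-app",
        namespace="demo",
    )
    print(json.dumps(app, indent=2))
```

The workflow would then commit the generated file to the git repo that Argo CD watches, so the deployment itself still happens through git rather than through a direct apply.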
https://redd.it/1ciptux
@r_devops
GitHub
argo-workflows-ci-example/bootstrap/workflow-templates/common/deploy-resources.yml at main · pipekit/argo-workflows-ci-example
An example CI leveraging Argo Workflows. Contribute to pipekit/argo-workflows-ci-example development by creating an account on GitHub.
Deployment scenario for whole solution
We are developing a whole solution: backend, frontend, and data layer.
Everything depends on everything else. We cannot deploy a new backend version if the frontend is not ready, and we cannot deploy a new database schema if the backend is not ready.
Each part of the solution has its own code repository. How should we approach deployment?
Should I create a separate repository with a CD pipeline?
In that repo I would keep the version of each part. When new versions are ready, I would update the configuration with the new versions and run the CD pipeline.
How do you approach this?
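The separate-repository idea described above could be sketched as a release manifest that pins one version per component, plus a check that refuses to plan a release unless every component is pinned. The component names, version strings, and the deployment order (data layer, then backend, then frontend) are assumptions for illustration; the right order depends on how the components are kept compatible with each other.

```python
# Assumed order: schema first, then the backend that uses it, then the frontend.
DEPLOY_ORDER = ["datalayer", "backend", "frontend"]


def plan_release(versions: dict[str, str]) -> list[tuple[str, str]]:
    """Return (component, version) pairs in deployment order.

    Raises ValueError if any component has no pinned version, so a partial
    release never starts.
    """
    missing = [name for name in DEPLOY_ORDER if name not in versions]
    if missing:
        raise ValueError(f"release manifest is missing versions for: {', '.join(missing)}")
    return [(name, versions[name]) for name in DEPLOY_ORDER]


if __name__ == "__main__":
    # Hypothetical manifest, as it might be stored in the release repository.
    manifest = {"frontend": "2.3.0", "backend": "1.8.1", "datalayer": "42"}
    for component, version in plan_release(manifest):
        print(f"deploy {component} at version {version}")
```

A CD pipeline in the release repository would run on every change to the manifest and deploy the components in the planned order.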
https://redd.it/1cim4bm
@r_devops