Need Help Integrating AWS ECS Cluster, Service & Task with LGTM Stack using Terraform
I've been working on integrating the LGTM stack into my current AWS infrastructure. Let me first explain the work I've done so far.
###### LGTM Infra:
- Grafana: AWS Managed Grafana with Loki, Mimir, and Tempo data sources, deployed using Terraform.
- Loki, Tempo, and Mimir servers are hosted on EC2 via Docker Compose, with AWS S3 as the backend storage for all three.
- To push my ECS task logs, metrics, and traces, I've added sidecars to the current apps' task definitions; they run alongside the app container and push the data to the Loki, Tempo, and Mimir servers. For logs I'm using the __awsfirelens__ log driver; for metrics and traces I'm using Grafana Alloy.
The LGTM server stack is running fine and all three kinds of data are reaching the backend servers. Now I'm facing an issue with labeling: the metrics and traces are pushed to the Mimir and Tempo backends, but how will I identify which cluster, service, and task these logs, metrics, and traces come from?
For logs it was straightforward, since I was using the AWS FireLens log driver; the configuration looks like this:
```hcl
log_configuration = {
  logDriver = "awsfirelens"
  options = {
    "Name"       = "grafana-loki"
    "Url"        = "${var.loki_endpoint}/loki/api/v1/push"
    "Labels"     = "{job=\"firelens\"}"
    "RemoveKeys" = "ecs_task_definition,source,ecs_task_arn"
    "LabelKeys"  = "container_id,container_name,ecs_cluster"
    "LineFormat" = "key_value"
  }
}
```
As you can see in the screenshot below, the ECS-related details are populated in Grafana:
: https://i.postimg.cc/HspwKRVW/loki.png
And with that I was also able to create a dashboard with some basic filtering and a search box:
: https://i.postimg.cc/tT36vNbV/loki-dashboard.png
Now comes the metrics (a.k.a. Mimir) part:
For this I used Grafana Alloy with the following config.alloy file:
```alloy
prometheus.exporter.unix "local_system" { }

prometheus.scrape "scrape_metrics" {
  targets         = prometheus.exporter.unix.local_system.targets
  forward_to      = [prometheus.relabel.add_ecs_labels.receiver]
  scrape_interval = "10s"
}

remote.http "ecs_metadata" {
  url = "ECS_METADATA_URI"
}

prometheus.relabel "add_ecs_labels" {
  rule {
    source_labels = ["__address__"]
    target_label  = "ecs_cluster_name"
    regex         = "(.*)"
    replacement   = "ECS_CLUSTER_NAME"
  }

  rule {
    source_labels = ["__address__"]
    target_label  = "ecs_service_name"
    regex         = "(.*)"
    replacement   = "ECS_SERVICE_NAME"
  }

  rule {
    source_labels = ["__address__"]
    target_label  = "ecs_container_name"
    regex         = "(.*)"
    replacement   = "ECS_CONTAINER_NAME"
  }

  forward_to = [prometheus.remote_write.metrics_service.receiver]
}

prometheus.remote_write "metrics_service" {
  endpoint {
    url = "${local.mimir_endpoint}/api/v1/push"
    headers = {
      "X-Scope-OrgID" = "staging",
    }
  }
}
```
I stored this config in AWS Parameter Store and added another sidecar to the app task: it loads the config file and runs a custom script that fetches the ECS cluster name from ECS_CONTAINER_METADATA_URI_V4, while the service name and container name are passed in as ECS task definition environment variables.
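As a concrete illustration, the environment-variable part of the sidecar's container definition looks roughly like this in the Terraform task definition (container name, image tag, and variable names here are placeholders, not my exact values):

```hcl
# Hypothetical fragment of the Alloy sidecar's container definition.
# The custom script reads these two variables, plus the cluster name
# from ECS_CONTAINER_METADATA_URI_V4 (injected automatically by ECS),
# and substitutes them into config.alloy before starting Alloy.
{
  name  = "alloy-sidecar"
  image = "grafana/alloy:latest"
  environment = [
    { name = "ECS_SERVICE_NAME",   value = var.service_name },
    { name = "ECS_CONTAINER_NAME", value = var.container_name },
  ]
}
```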
After all this, I was able to do the relabeling and populate the cluster, service, and task names in the Mimir data source:
: https://i.postimg.cc/Gh8LchBX/mimir.png
Now, when I tried to use the Node_Exporter_Full Grafana dashboard for the metrics, I was getting the metrics, but with Unix-level filtering only:
: https://i.postimg.cc/Jn0wPPZp/mimir-dashboard-1.png
: https://i.postimg.cc/mD5vqCSB/mimir-dashboard-filter.png
So I did some dashboard JSON editing and was able to get ECS Cluster Name, ECS Service Name, and ECS Container Name as filters on the same dashboard:
: https://i.postimg.cc/2yLsfyHv/mimir-dashboard-2.png
But now I'm not able to get the metrics on the dashboard.
It's been only two weeks since I started with observability (before that I didn't know much beyond the term itself), so I might be doing something wrong with the metrics for my custom Node Exporter dashboard.
Do I need to relabel the existing labels like __job__ and __host__, replacing them with my added labels like the ECS service or container names, so that metrics can be fetched on a per-ECS-container basis?
Since I'm doing this for the first time, I'm not sure about this. If anyone here has done something similar, can you please help me with this implementation?
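For reference, if relabeling is indeed the answer, I imagine it would be an extra rule in my existing prometheus.relabel block, something like the sketch below (untested; it reuses the ecs_service_name label from my config above and overwrites job with it):

```alloy
// Untested sketch: copy the ECS service name over the `job` label,
// so dashboards that template on `job` can filter per ECS service.
rule {
  source_labels = ["ecs_service_name"]
  target_label  = "job"
  regex         = "(.+)"
  replacement   = "$1"
}
```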
Once this is done, the next step will be aggregated metrics per ECS service (since there might be more than one task running for a service), and I believe I'll need the same kind of relabeling for Tempo traces as well.
Please help me out with this, guys.
Thank you!!!
https://redd.it/1iheu0p
@r_devops
Linux server that can run VirtualBox for a month, where to go? (EU)
A customer's client provided me with a dev environment based on Vagrant. I'm not looking for alternatives to that; it is the way it is. That Vagrant box runs k3s. I tried my old Intel MacBook Pro, but I'm short on memory. I need a server that can run VirtualBox, with a short contract, max 2 months. Where should I go?
I hope this post is OK with the mods, since I'm asking for vendors.
https://redd.it/1ihfug0
@r_devops
Cannot reach service by node IP and port from browser
I'm running Docker Desktop on a Windows 11 PC. I want to try the built-in Kubernetes, which is based on Kind. It works, although I cannot reach the service by node IP and port. I tested the connection inside the cluster, and it works fine. I also tried disabling firewalls. When I tried Minikube with the Hyper-V driver it worked fine; using the Docker driver gave me the same problems as Kind. How do I solve this?
https://redd.it/1ihhe59
@r_devops
I built an AI agent for website monitoring - looking for feedback
Hey everyone, I wanted to share [https://flowtest.ai/](https://flowtest.ai/), a product two friends and I are working on. We'd love to hear your feedback and opinions.
It all started when we discovered that LLMs can be really good at browsing websites simply by following a ChatGPT-like prompt. So we built an LLM agent and gave it tools like keyboard & mouse control. We parse the website, and the agent performs the actions you prompt it to do. This opens up lots of opportunities for website monitoring and testing. It's also a great alternative to Pingdom.
Instead of just pinging a website, you can now prompt an AI agent to visit and interact with the website as a real user. Even if the website is up, the agent can identify other issues and immediately alert you if certain elements aren't functioning correctly, e.g. a third-party app crashes or a feature fails to load.
Once you set a frequency for the agent to run its monitoring flow, it will actually visit your website each time. LLMs are now smart enough that, combined with our web parsing, the agent will adapt when web elements change, without needing your help.
**Here are a few more complex examples of how our first customers are using it:**
* Agent visits your site, enters a keyword in a search box, and verifies that relevant search results appear.
* Agent visits your login page, enters credentials, and confirms successful login into the correct account.
* Agent completes a purchasing flow by filling in all necessary fields and checks if the checkout process works correctly.
We initially launched it as a quality-assurance test automation agent, but noticed that our early customers use it more as a website uptime monitoring service.
We offer a 7-day free trial (no credit card required), but if you'd like to try it for a longer period, just DM me and I'll give you a month free of charge in exchange for your feedback.
We’d love to hear all your feedback and opinions.
https://redd.it/1ihhv45
@r_devops
Alternatives to Yor
It looks like Yor (https://github.com/bridgecrewio/yor) is not really active anymore. The last PR was over 7 months ago, and there have been no releases since August 2024. Their Slack is pretty dead as well.
Most PRs are closed without comment.
So, is anyone aware of an alternative?
https://redd.it/1ihjvyq
@r_devops
Hyperping vs. Better Stack vs. OneUptime for observability
Which one is better? Pricing is not the problem.
I am specifically interested in synthetic monitoring with Playwright.
https://redd.it/1ihkrew
@r_devops
Looking to get back into a DevOps role.
Looking for any tips on what I need to focus on when interviewing. I've worked in IT for 20+ years, I've been a team lead on Linux and virtualization teams, I have worked with most automation tools, and I have sold some of these products. It's been a while since I've sat in this role, so I'm looking for help on anything new in the market, what I should focus on now, and what to expect from the interviews.
https://redd.it/1ihm7vc
@r_devops
Best way to sync a private GitHub repo to a shared remote machine without shared credentials?
My team and I have a remote desktop machine connected to a PLC, conveyor belt, and sensors. We need to clone and pull updates from our private GitHub repository to this machine. However, we’re stuck on how to do this efficiently without creating a shared user account on the machine (which would require sharing credentials).
Here’s the issue:
\- We can’t create a GitHub account for the machine because it doesn’t have an official organization email.
\- Sharing a single user account on the machine isn’t ideal and goes against best practices.
\- We need to be able to:
\- Clone and pull the latest changes to the machine.
\- Push changes made on the remote machine back to the repo using our individual GitHub credentials.
**Options we’re considering:**
1. Use tools like TeamViewer or SSH tunnels to transfer files between our local machines (which are already set up) and the remote machine.
2. Set up Git on the remote machine, but deal with the inefficiency of constantly being asked for user credentials to push changes.
What's the best practice here? Are there tools or workflows (deploy keys, GitHub Actions?) designed for this kind of scenario? Any advice or recommendations would be greatly appreciated!
https://redd.it/1ihjvoj
@r_devops
Learning GCP and Terraform at the same time?
I'm confident at frontend development. I know the basics of Node and Postgres. But I'm weak on DevOps.
I've traditionally been a freelancer and used tools like Vercel and Supabase. However now I have a job with a startup and I need to learn GCP.
I've only spent half a day on it, but I find using Google Cloud's website and `gcloud` in the terminal quite awkward.
Does it make sense to use something like Terraform from the start? I like the idea of a code-first approach, and being able to switch providers in the future is also nice (we're on GCP because we got a bunch of free credits).
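For what it's worth, a first Terraform config for GCP can be tiny; something like this sketch (project ID, region, and bucket name are placeholders):

```hcl
# Minimal sketch: configure the Google provider and manage one bucket.
provider "google" {
  project = "my-gcp-project"  # placeholder project ID
  region  = "europe-west1"
}

# Bucket names must be globally unique.
resource "google_storage_bucket" "assets" {
  name     = "my-unique-assets-bucket"
  location = "EU"
}
```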
https://redd.it/1ihpl7g
@r_devops
How Much Do You Spend on Databases? (2-Min Survey)
Hey all,
We’re doing a quick research study on **database costs & infrastructure**—figuring out how developers & companies use PostgreSQL, InfluxDB, ClickHouse, and managed DBaaS.
**Common problems we hear:**
* 💸 AWS RDS costs way more than expected
* 😩 Managing **high availability & scaling** is painful
* 🔗 Vendor lock-in sucks
🔥 If you run databases, we’d **love your insights!**
👉 **Survey Link (2 mins, no email required):** [https://app.formbricks.com/s/cm6r296dm0007l203s8953ph4](https://app.formbricks.com/s/cm6r296dm0007l203s8953ph4)
(Results will be shared back with the community!)
https://redd.it/1ihvket
@r_devops
What do you think of the idea of your life partner being a DevOps engineer as well?
Any thoughts? Pros, cons?
https://redd.it/1ihx10i
@r_devops
Roast my Resume!
OK, so my contract at a startup ended a few weeks ago and I'm currently looking for my next adventure. The DevOps job market has been really tough lately; I've sent lots of CVs here and there, and so far I've made it to a couple of interviews, second interviews, and technical interviews, but then I'm left in radio silence or the position just closes.
I come from a system administration and infrastructure background; I'm not really much of a coder beyond Python automation scripts, Terraform files, and CI/CD configuration files.
I don't consider myself a senior engineer; I'm more intermediate-level.
Roast my resume and let me know what I should improve/learn next in my career.
https://imgur.com/a/Y0ukRkM
Edit: I'm not from the US, so that could make it a little bit harder to get hired.
https://redd.it/1ihvbxm
@r_devops
Helping folks understand what EKS Auto Mode is and how to set it up in Terraform
Hi all. Purely as an educational tool for knowledge sharing, I created an all-in-one repo that deploys a Kubernetes cluster with Auto Mode enabled, and shows how Auto Mode differs from not using Auto Mode.
https://github.com/setheliot/eks_auto_mode
Let me know what you think of it. PRs to improve it are always welcome.
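For anyone skimming before opening the repo: going by the AWS provider docs, the heart of Auto Mode is a few flags on the cluster resource, roughly like this sketch (IAM roles and VPC wiring omitted; not a drop-in config):

```hcl
# Rough sketch of the Auto Mode switches on aws_eks_cluster.
resource "aws_eks_cluster" "auto" {
  name     = "demo-auto-mode"
  role_arn = aws_iam_role.cluster.arn

  # Auto Mode: EKS provisions and manages nodes itself.
  compute_config {
    enabled       = true
    node_pools    = ["general-purpose", "system"]
    node_role_arn = aws_iam_role.node.arn
  }

  # Auto Mode also requires these two capabilities to be enabled.
  kubernetes_network_config {
    elastic_load_balancing {
      enabled = true
    }
  }

  storage_config {
    block_storage {
      enabled = true
    }
  }

  bootstrap_self_managed_addons = false

  vpc_config {
    subnet_ids = var.subnet_ids
  }
}
```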
https://redd.it/1ii2wx4
@r_devops
How do you handle downtime communication with your users?
Hey everyone,
I'm working on a simple tool that helps businesses display custom messages directly in their apps when things go wrong (e.g., API failures, server downtime), a.k.a. incidents.
The idea is to let you quickly toggle a message to your users without relying on a full status page or complex integrations. The main point is that this error-status message is hosted outside your app's infrastructure, and your client side can poll the API regularly to see whether the message needs to be displayed.
If you run a SaaS, fintech, or e-commerce platform, I’d love to hear your thoughts:
How do you currently notify users when your services are down?
Would something like this be useful for your business?
Looking forward to hearing your feedback!
https://redd.it/1iia4k9
@r_devops
Wednesday humor? Ops-abet Soup (Like alphabet soup)
During my last meeting someone mentioned in stand-up...
BreakOps
ClickOps
DevOps
EphemeralOps
HelpdeskOps
RebootOps
What else can we add to this list? lol
https://redd.it/1iichz0
@r_devops
Survey on multi factor authentication in cloud for academic
Hi everyone! 🙋♂🙋♀
I am working on a study about the usability and usage of multi-factor authentication systems in the cloud, and I would love your support. Your honest feedback will help us collect real data for meaningful analysis.
📝 The survey is quick and easy – it will only take 5-7 minutes!
👉 Click here to participate: https://forms.gle/qeszv1JER6TtEA5W9
Your participation means a lot to us. Thank you for your time and support!
https://redd.it/1iic05x
@r_devops
Brainstorming Ideas for University Final Year Project
Hello everyone,
I'm about to start my senior year at university and want to pursue a final-year project that blends DevOps and AI (most likely a dev tool). With 2 years of experience in DevOps, my skill set includes automation, observability, IaC, Kubernetes, containerization, AWS, and Azure, with a strong emphasis on security.
I'm looking for project ideas that align with these strengths and also resonate with Y Combinator's recent Requests for Startups. I'd love to hear thoughts from this amazing sub, which has been super helpful throughout my career.
Thank you!
https://redd.it/1iig1wy
@r_devops
Starting a Systems Specialist role Monday...any advice?
Hey all!
Hope everyone is well. So I am starting a new job this coming Monday. Title is 'Systems Specialist'. I'm currently a Junior SysAdmin at a health organization, dealing with a lot of Azure, O365, AD, Veeam, and Wasabi administration, along with other technologies.
My new job is at a pharmaceutical 'start-up', though it appears they're successful enough to be past the start-up stage. I've been told the job deals with Azure/AD for directory usage and a bunch of Linux/AWS/Docker work on other projects. Super excited for the opportunity, as I've been trying to get into more Cloud/DevOps stuff the past year or so. It's also a 60% salary increase compared to my current role.
My last day at my current employer is tomorrow, taking Friday off to clear the head and be fresh for Monday.
Pretty bittersweet, despite the obvious upgrades in pay and technologies, as I really enjoyed working with my IT team at my current place.
Anything I should be worried about? Any advice on adjusting to the new job? Or just any advice in general? Getting first-day anxiety at this point lol, just want to get started already.
Thanks all and have a blessed weekend!
https://redd.it/1iifho0
@r_devops
Integrating Fastlane Match with Azure Storage Container – Has Anyone Done This?
Hello everyone!
I've recently started automating iOS certificate and profile management at my new job using Fastlane Match, and it's been functioning quite well from an Azure DevOps Repo setup. I had done something similar at a previous job as well.
However, my superior wants to migrate this setup to Azure, possibly utilizing an Azure Storage Container. He specifically doesn't want it running out of a repository, even though it would be encrypted. From what I've researched, Fastlane Match's documentation covers Google Cloud Storage and Amazon S3 buckets as storage backends, but Azure is not mentioned.
I'm wondering whether anyone in this community has experience with or has tackled an Azure integration. Building custom scripts might be an option, but before I head down that route, I wanted to reach out and see if anyone else has successfully implemented this or explored the possibility.
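If the custom-script route ends up being the answer, one rough sketch (not a native Match backend, and every name here — account, container, paths — is a placeholder) would be to keep the encrypted signing files in a local directory and sync it with an Azure Storage container via the Azure CLI around your signing steps. This assumes `az` is installed and already authenticated against the storage account:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical names -- replace with your own storage account and container.
ACCOUNT="mystorageaccount"
CONTAINER="ios-signing"
CERT_DIR="./certs"

# Pull the current encrypted certs/profiles from the container.
az storage blob download-batch \
  --account-name "$ACCOUNT" \
  --source "$CONTAINER" \
  --destination "$CERT_DIR"

# ... run your signing setup here against $CERT_DIR ...

# Push any updated files back to the container.
az storage blob upload-batch \
  --account-name "$ACCOUNT" \
  --destination "$CONTAINER" \
  --source "$CERT_DIR" \
  --overwrite
```

You'd still have to handle encryption yourself (Match normally does that for you), so this is more of a starting point than a drop-in replacement.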
Any insights or recommendations would be highly appreciated. Looking forward to your valuable input on this!
https://redd.it/1iij4p8
@r_devops
Effective Usage of AI Code Reviewers on GitHub
The article discusses the effective use of AI code reviewers on GitHub, highlighting their role in enhancing the code review process within software development: How to Effectively Use AI Code Reviewers on GitHub
It outlines the traditional manual code review process, emphasizing its importance in maintaining coding standards, identifying vulnerabilities, and ensuring architectural integrity.
https://redd.it/1iih1vj
@r_devops