---
title: Tanzu Application PlatformのWorkloadをDatadogでモニタリングする
tags: ["Kubernetes", "Tanzu", "TAP", "Datadog"]
categories: ["Dev", "CaaS", "Kubernetes", "TAP"]
date: 2023-02-04T08:01:57Z
updated: 2023-02-06T06:30:08Z
---
Tanzu Application Platformでは以下のDatadog buildpackが利用可能です。
* https://github.com/paketo-buildpacks/datadog
* https://docs.vmware.com/en/VMware-Tanzu-Buildpacks/services/tanzu-buildpacks/GUID-partner-integrations-partner-integration-buildpacks.html#datadog
このbuildpackはJavaアプリにDatadog Java Agentを、Node.jsアプリには Datadog Node.js Trace Agentを追加しますが、
このbuildpackを使うだけでは実際のデータをDatadogに送ることはできません。Datadogにデータを送るには別途Data Agentを起動させる必要があります。
次の図のような構成になります。
本記事では次のドキュメントに従ってDatadog AgentをKubernetesにインストールし、TAPにデプロイしたWorkloadをモニタリングしてみます。
* https://docs.datadoghq.com/containers/kubernetes/installation/?tab=helm
[こちらの記事](https://ik.am/entries/733)で構築したTAP 1.4 on Kindの環境で試します。なお、DatadogはFree Trialの環境を使用しています。
**目次**
### Datadog Agentのインストール
今回はDatadog AgentのインストールにはHelm Chartを利用します。今後は[Datadog Operator](https://docs.datadoghq.com/containers/kubernetes/installation?tab=operator)が推奨になるかもしれません。
```
helm repo add datadog https://helm.datadoghq.com
helm repo update
```
Helmのvalues.yamlを次のように設定します。 `datadog.site` と `datadog.apiKey`、`datadog.clusterName` は自分の環境に合わせて変更してください。
また、Kubernetes Distributionごとに必要な設定は
https://docs.datadoghq.com/containers/kubernetes/distributions?tab=helm
を参照してください。
```yaml
datadog:
site: us3.datadoghq.com
apiKey: xxxxxx
clusterName: kind-sandbox
kubelet:
host:
valueFrom:
fieldRef:
fieldPath: spec.nodeName
tlsVerify: false
apm:
portEnabled: true
logs:
enabled: true
dogstatsd:
useHostPort: true
```
`helm template`コマンドを使ってmanifestを生成し、`kubectl apply`でデプロイします。Multi Cluster構成の場合は、Runクラスタに対してインストールしてください。
```
kubectl create ns datadog
helm template -n datadog datadog datadog/datadog -f values.yaml > datadog.yaml
kubectl apply -n datadog -f datadog.yaml
```
Datadog AgentがDaemonSetとしてインストールされます。またKubernetesクラスタの監視のための[Datadog Cluster Agent](https://docs.datadoghq.com/containers/cluster_agent/)がDeploymentとしてインストールされます。
```
$ kubectl get pod,deploy,ds -n datadog
NAME READY STATUS RESTARTS AGE
pod/datadog-cluster-agent-7b8c9fb77b-xpkld 1/1 Running 0 55m
pod/datadog-vk27t 3/3 Running 0 55m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/datadog-cluster-agent 1/1 1 1 150m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/datadog 1 1 1 1 1 kubernetes.io/os=linux 150m
```
Kubernetesクラスタの監視は不要な場合は`clusterAgent.enabled`を`false`にすればDatadog Cluster Agentはインストールされません。
合わせて `datadog.kubeStateMetricsCore.enabled`と`datadog.collectEvents`も`false`にすればKubernetesに関するデータの送信を減らせます。
> ⚠️ `clusterAgent.enabled`を`false`に設定すると[Datadog Admission controller](https://docs.datadoghq.com/containers/cluster_agent/admission_controller/?tab=helm)がなくなるため、後述の環境変数 `DD_AGENT_HOST` を手動で設定する必要があります。
> `admission.datadoghq.com/enabled` Labelも設定する意味がなくなります。
### Workloadのデプロイ
それではモニタリング対象のWorkloadをデプロイします。デプロイするアプリは https://github.com/making/rest-service です。
このアプリは[ecs-logging](https://github.com/elastic/ecs-logging)でログを構造化(JSON化)しています。Datadogなどにログを転送する場合は構造化ログが便利です。
次のコマンドでWorkloadを作成します。
```
tanzu apps workload apply rest-service \
--app rest-service \
--git-repo https://github.com/making/rest-service \
--git-branch main \
--type web \
--annotation autoscaling.knative.dev/minScale=1 \
--label admission.datadoghq.com/enabled=true \
--build-env BP_DATADOG_ENABLED=true \
--env DD_SERVICE=rest-service \
--env DD_ENV=sandbox \
--env DD_VERSION=1.0 \
-n demo \
-y
```
ポイントは
* `--label admission.datadoghq.com/enabled=true` を設定することで[Datadog Admission controller](https://docs.datadoghq.com/containers/cluster_agent/admission_controller/?tab=helm)がPodに自動で環境変数 `DD_AGENT_HOST` などを設定します。
* `--build-env BP_DATADOG_ENABLED=true` を設定することでbuildserviceによるコンテナイメージ作成時に[datadog buildpack](https://docs.vmware.com/en/VMware-Tanzu-Buildpacks/services/tanzu-buildpacks/GUID-partner-integrations-partner-integration-buildpacks.html#datadog)がDatadog Java Agentを自動でインストールします。
環境変数`DD_**`の設定は任意ですが、設定した方がDatadogからアプリの情報にアクセスしやすいです。少なくとも`DD_SERVICE`にはWorkload名を入れると良いでしょう。`DD_ENV`は環境名、`DD_VERSION`はアプリのバージョンです。
> ℹ️ Multi Cluster環境では`--env`で設定した値はどの(Run)環境でも同じ値が使われます。`DD_ENV`のような環境によって異なる値は、ConfigMapやSecretを経由して設定した方が良いです。設定方法は後述します。
`stern`コマンドでログを確認します。
```
stern -n demo rest-service
```
コンテナイメージビルド中に次のようなログで、Datadog Java Agentが追加されることを確認できます。
```
rest-service-build-1-build-pod build Paketo Buildpack for Datadog 3.0.0
rest-service-build-1-build-pod build https://github.com/paketo-buildpacks/datadog
rest-service-build-1-build-pod build Datadog Java Agent 1.1.4: Contributing to layer
rest-service-build-1-build-pod build Downloading from https://repo1.maven.org/maven2/com/datadoghq/dd-java-agent/1.1.4/dd-java-agent-1.1.4.jar
rest-service-build-1-build-pod build Verifying checksum
rest-service-build-1-build-pod build Copying to /layers/paketo-buildpacks_datadog/datadog-agent-java
rest-service-build-1-build-pod build Writing env.launch/JAVA_TOOL_OPTIONS.append
rest-service-build-1-build-pod build Writing env.launch/JAVA_TOOL_OPTIONS.delim
```
アプリケーション起動時のログに環境変数`JAVA_TOOL_OPTIONS`が出力され、`-javaagent:/layers/paketo-buildpacks_datadog/datadog-agent-java/dd-java-agent-1.1.4.jar`が含まれていることを確認できます。
また、`[dd.trace`から始まるログでDatadog Java Agentが起動していることもわかります。Datadog Java Agentが接続するDatadog AgentのURLが`"agent_url":"http://172.19.0.2:8126"`と出力されています。`172.19.0.2`はNodeのIPです。
```
rest-service-00001-deployment-78454d968b-hmb9z workload Setting Active Processor Count to 8
rest-service-00001-deployment-78454d968b-hmb9z workload Calculating JVM memory based on 11788884K available memory
rest-service-00001-deployment-78454d968b-hmb9z workload For more information on this calculation, see https://paketo.io/docs/reference/java-reference/#memory-calculator
rest-service-00001-deployment-78454d968b-hmb9z workload Calculated JVM Memory Configuration: -XX:MaxDirectMemorySize=10M -Xmx11168701K -XX:MaxMetaspaceSize=108182K -XX:ReservedCodeCacheSize=240M -Xss1M (Total Memory: 11788884K, Thread Count: 250, Loaded Class Count: 16686, Headroom: 0%)
rest-service-00001-deployment-78454d968b-hmb9z workload Enabling Java Native Memory Tracking
rest-service-00001-deployment-78454d968b-hmb9z workload Adding 124 container CA certificates to JVM truststore
rest-service-00001-deployment-78454d968b-hmb9z workload Spring Cloud Bindings Enabled
rest-service-00001-deployment-78454d968b-hmb9z workload Picked up JAVA_TOOL_OPTIONS: -Dmanagement.endpoint.health.probes.add-additional-paths="true" -Dmanagement.health.probes.enabled="true" -Dserver.port="8080" -Dserver.shutdown.grace-period="24s" -Djava.security.properties=/layers/paketo-buildpacks_bellsoft-liberica/java-security-properties/java-security.properties -XX:+ExitOnOutOfMemoryError -javaagent:/layers/paketo-buildpacks_datadog/datadog-agent-java/dd-java-agent-1.1.4.jar -XX:ActiveProcessorCount=8 -XX:MaxDirectMemorySize=10M -Xmx11168701K -XX:MaxMetaspaceSize=108182K -XX:ReservedCodeCacheSize=240M -Xss1M -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -XX:+PrintNMTStatistics -Dorg.springframework.cloud.bindings.boot.enable=true
rest-service-00001-deployment-78454d968b-hmb9z queue-proxy {"severity":"INFO","timestamp":"2023-02-03T17:21:50.2316515Z","logger":"queueproxy","caller":"sharedmain/main.go:270","message":"Starting queue-proxy","commit":"d7e9873","knative.dev/key":"demo/rest-service-00001","knative.dev/pod":"rest-service-00001-deployment-78454d968b-hmb9z"}
rest-service-00001-deployment-78454d968b-hmb9z workload [dd.trace 2023-02-03 17:21:51:856 +0000] [main] INFO com.datadog.appsec.AppSecSystem - AppSec is ENABLED_INACTIVE with powerwaf(libddwaf: 1.5.1) no rules loaded
rest-service-00001-deployment-78454d968b-hmb9z workload [dd.trace 2023-02-03 17:21:52:220 +0000] [dd-task-scheduler] INFO datadog.trace.agent.core.StatusLogger - DATADOG TRACER CONFIGURATION {"version":"1.1.4~022e2ba179","os_name":"Linux","os_version":"5.10.104-linuxkit","architecture":"amd64","lang":"jvm","lang_version":"11.0.17","jvm_vendor":"BellSoft","jvm_version":"11.0.17+7-LTS","java_class_version":"55.0","http_nonProxyHosts":"null","http_proxyHost":"null","enabled":true,"service":"rest-service","agent_url":"http://172.19.0.2:8126","agent_error":false,"debug":false,"analytics_enabled":false,"sampling_rules":[{},{}],"priority_sampling_enabled":true,"logs_correlation_enabled":true,"profiling_enabled":false,"remote_config_enabled":true,"debugger_enabled":false,"appsec_enabled":"ENABLED_INACTIVE","telemetry_enabled":true,"dd_version":"1.0","health_checks_enabled":true,"configuration_file":"no config file present","runtime_id":"d10c232d-88f9-441f-9efa-35ea763a7e04","logging_settings":{"levelInBrackets":false,"dateTimeFormat":"'[dd.trace 'yyyy-MM-dd HH:mm:ss:SSS Z']'","logFile":"System.err","configurationFile":"simplelogger.properties","showShortLogName":false,"showDateTime":true,"showLogName":true,"showThreadName":true,"defaultLogLevel":"INFO","warnLevelString":"WARN","embedException":false},"cws_enabled":false,"cws_tls_refresh":5000}
rest-service-00001-deployment-78454d968b-hmb9z workload [dd.trace 2023-02-03 17:21:52:869 +0000] [dd-task-scheduler] WARN datadog.telemetry.dependency.DependencyResolverQueue - unable to detect dependency for URI file:/workspace/
rest-service-00001-deployment-78454d968b-hmb9z workload
rest-service-00001-deployment-78454d968b-hmb9z workload . ____ _ __ _ _
rest-service-00001-deployment-78454d968b-hmb9z workload /\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \
rest-service-00001-deployment-78454d968b-hmb9z workload ( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
rest-service-00001-deployment-78454d968b-hmb9z workload \\/ ___)| |_)| | | | | || (_| | ) ) ) )
rest-service-00001-deployment-78454d968b-hmb9z workload ' |____| .__|_| |_|_| |_\__, | / / / /
rest-service-00001-deployment-78454d968b-hmb9z workload =========|_|==============|___/=/_/_/_/
rest-service-00001-deployment-78454d968b-hmb9z workload :: Spring Boot :: (v2.7.8)
rest-service-00001-deployment-78454d968b-hmb9z workload
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:53.382Z","log.level": "INFO","message":"Starting RestServiceApplication v0.0.1-SNAPSHOT using Java 11.0.17 on rest-service-00001-deployment-78454d968b-hmb9z with PID 1 (/workspace/BOOT-INF/classes started by cnb in /workspace)","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"com.example.restservice.RestServiceApplication","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"}
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:53.390Z","log.level": "INFO","message":"No active profile set, falling back to 1 default profile: \"default\"","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"com.example.restservice.RestServiceApplication","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"}
rest-service-00001-deployment-78454d968b-hmb9z workload [dd.trace 2023-02-03 17:21:53:868 +0000] [dd-task-scheduler] WARN datadog.telemetry.dependency.DependencyResolverQueue - unable to detect dependency for URI file:/workspace/BOOT-INF/classes/
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:55.267Z","log.level": "INFO","message":"Tomcat initialized with port(s): 8080 (http)","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.springframework.boot.web.embedded.tomcat.TomcatWebServer","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"}
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:55.331Z","log.level": "INFO","message":"Starting service [Tomcat]","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.apache.catalina.core.StandardService","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"}
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:55.331Z","log.level": "INFO","message":"Starting Servlet engine: [Apache Tomcat/9.0.71]","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.apache.catalina.core.StandardEngine","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"}
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:55.434Z","log.level": "INFO","message":"Initializing Spring embedded WebApplicationContext","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/]","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"}
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:55.434Z","log.level": "INFO","message":"Root WebApplicationContext: initialization completed in 1974 ms","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"}
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:56.917Z","log.level": "INFO","message":"Exposing 1 endpoint(s) beneath base path '/actuator'","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.springframework.boot.actuate.endpoint.web.EndpointLinksResolver","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"}
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:56.979Z","log.level": "INFO","message":"Tomcat started on port(s): 8080 (http) with context path ''","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.springframework.boot.web.embedded.tomcat.TomcatWebServer","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"}
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:56.999Z","log.level": "INFO","message":"Started RestServiceApplication in 4.712 seconds (JVM running for 6.843)","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"com.example.restservice.RestServiceApplication","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"}
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:57.269Z","log.level": "INFO","message":"Initializing Spring DispatcherServlet 'dispatcherServlet'","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-3","log.logger":"org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/]","dd.trace_id":"8339885192089890711","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"2116877690292855250"}
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:57.270Z","log.level": "INFO","message":"Initializing Servlet 'dispatcherServlet'","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-3","log.logger":"org.springframework.web.servlet.DispatcherServlet","dd.trace_id":"8339885192089890711","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"2116877690292855250"}
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:57.272Z","log.level": "INFO","message":"Completed initialization in 2 ms","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-3","log.logger":"org.springframework.web.servlet.DispatcherServlet","dd.trace_id":"8339885192089890711","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"2116877690292855250"}
```
`tanzu apps workload get ...`でWorkloadの情報を確認します。`Knative Services`が`READY`であればアプリにアクセス可能です。
```
$ tanzu apps workload get -n demo rest-service
📡 Overview
name: rest-service
type: web
namespace: demo
💾 Source
type: git
url: https://github.com/making/rest-service
branch: main
📦 Supply Chain
name: source-to-url
NAME READY HEALTHY UPDATED RESOURCE
source-provider True True 5m13s gitrepositories.source.toolkit.fluxcd.io/rest-service
image-provider True True 110s images.kpack.io/rest-service
config-provider True True 102s podintents.conventions.carto.run/rest-service
app-config True True 102s configmaps/rest-service
service-bindings True True 102s configmaps/rest-service-with-claims
api-descriptors True True 102s configmaps/rest-service-with-api-descriptors
config-writer True True 89s runnables.carto.run/rest-service-config-writer
🚚 Delivery
name: delivery-basic
NAME READY HEALTHY UPDATED RESOURCE
source-provider True True 67s imagerepositories.source.apps.tanzu.vmware.com/rest-service-delivery
deployer True True 64s apps.kappctrl.k14s.io/rest-service
💬 Messages
No messages found.
🛶 Pods
NAME READY STATUS RESTARTS AGE
rest-service-00001-deployment-78454d968b-hmb9z 2/2 Running 0 67s
rest-service-build-1-build-pod 0/1 Completed 0 5m12s
rest-service-config-writer-9hbdf-pod 0/1 Completed 0 100s
🚢 Knative Services
NAME READY URL
rest-service Ready https://rest-service.demo.127-0-0-1.sslip.io
To see logs: "tanzu apps workload tail rest-service --namespace demo --timestamp --since 1h"
```
次のコマンドでDatadog Admission controllerによってPodに環境変数が追加されているか確認します。
```
$ kubectl get pod -n demo rest-service-00001-deployment-78454d968b-hmb9z -oyaml | grep env: -A 12
- env:
- name: DD_ENTITY_ID
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.uid
- name: DD_AGENT_HOST
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.hostIP
- name: DD_SERVICE
value: rest-service
--
- env:
- name: DD_ENTITY_ID
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.uid
- name: DD_AGENT_HOST
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.hostIP
- name: SERVING_NAMESPACE
value: demo
```
`DD_ENTITY_ID`と`DD_AGENT_HOST`が自動で追加されていることがわかります。`DD_AGENT_HOST`には`fieldRef`を使用してhostIPが設定されています。そのため、ログに`"agent_url":"http://172.19.0.2:8126"`が出力されていました。
> ⚠️ Datadog Admission controllerを使わない場合は、`DD_AGENT_HOST`が設定されないため、Datadog Java Agentがのデータ送信に失敗します。
> この場合、環境変数 `DD_AGENT_HOST` を明示する必要があります。
> ただし、Knative Serviceは`fieldRef`を使えません。`fieldRef`を使ってNodeのHost IPを設定したい場合は`--type=server`を指定してDeploymentとしてアプリをデプロイする必要があります。
> または次のようなmanifestでにDatadog Agentに対するServiceを作成しても良いです。
>
> ```yaml
> apiVersion: v1
> kind: Service
> metadata:
> name: datadog-agent
> namespace: datadog
> spec:
> ports:
> - name: agent
> port: 8126
> protocol: TCP
> targetPort: 8126
> selector:
> app.kubernetes.io/component: agent
> app.kubernetes.io/instance: datadog
> type: ClusterIP
> ```
>
> この場合`DD_AGENT_HOST`に対する設定は `--env DD_AGENT_HOST=datadog-agent.datadog.svc.cluster.local` になります。
DatadogでAPM -> Servicesへ遷移するとService一覧に"reset-service"が表示されます。
"reset-service"をクリックするとrest-serviceのSummaryが表示されます。Readiness ProbeとLiveness Probeの`GET /readyz`と`GET /livez`に関するデータが毎秒送られてることがわかります。
"Service Summary"の下の"JVM Metrics"でJVMのメトリクスを見ることもできます。
アプリの`/greeting`エンドポイントへリクエストを送りましょう。
```
for i in `seq 1 100`;do
curl -sk https://rest-service.demo.127-0-0-1.sslip.io/greeting
echo
done
```
次のようなレスポンスが返ります。
```
{"id":1,"content":"Hello, World!"}
{"id":2,"content":"Hello, World!"}
{"id":3,"content":"Hello, World!"}
{"id":4,"content":"Hello, World!"}
{"id":5,"content":"Hello, World!"}
...
{"id":100,"content":"Hello, World!"}
```
また、次のようなアプリのログも出力されます。Datadog Java Agentは`dd.trace_id`と`dd.span_id`をMDCに自動で設定します。ecs-loggingを使用すれば、MDCに設定された値を自動でJSONに追加してくれるので、ログにも`dd.trace_id`と`dd.span_id`が出力されていることがわかります。これはのちにTraceとLogを結びつける際に役立ちます。
```
...
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:25:37.895Z","log.level": "INFO","message":"Greeting{id=98, content='Hello, World!'}","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-1","log.logger":"com.example.restservice.GreetingController","dd.trace_id":"5626764257006979930","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"2368498438859442974"}
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:25:37.945Z","log.level": "INFO","message":"Greeting{id=99, content='Hello, World!'}","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-5","log.logger":"com.example.restservice.GreetingController","dd.trace_id":"8014719942895616066","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"8884205664488887926"}
rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:25:37.999Z","log.level": "INFO","message":"Greeting{id=100, content='Hello, World!'}","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-3","log.logger":"com.example.restservice.GreetingController","dd.trace_id":"3762825320413857103","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"3963548649385222325"}
```
DatadogでAPM -> Tracesに遷移してください。
`/greeting`へのTrace情報が一覧に含まれていることがわかります。
一つ選んでクリックして、詳細を確認します。処理内容の内訳が表示されます。
"Logs"タブをクリックすると"No logs found"と表示されます。まだアプリからログを送る設定を行なっていませんでした。
### アプリのログをDatadogに送る
Datadogへのログ転送はPlatform単位で行う方法とWorkload単位で行う方法があります。今回はWorkload単位で設定します。
次のドキュメントのようにPodの`ad.datadoghq.com/.logs`アノテーションに`'[{"source": "","service": "","tags": ["", ""]}]'`という設定を行う必要があります。
https://docs.datadoghq.com/containers/kubernetes/log/?tab=kubernetes#configuration
`tanzu apps workload`の`--annotation`オプションはJSONを設定するとエラーになるので、WorkloadはYAMLで設定します。
現在のWorkloadの設定内容をYAMLに出力にするには次のコマンドが利用可能です。
```
tanzu apps workload get -n demo rest-service --export
```
もしくは、Workloadを作成する前にYAML化したい場合は`--dry-run`オプションをつければ良いです。
```
tanzu apps workload apply rest-service \
--app rest-service \
--git-repo https://github.com/making/rest-service \
--git-branch main \
--type web \
--annotation autoscaling.knative.dev/minScale=1 \
--label admission.datadoghq.com/enabled=true \
--build-env BP_DATADOG_ENABLED=true \
--env DD_SERVICE=rest-service \
--env DD_ENV=sandbox \
--env DD_VERSION=1.0 \
-n demo --dry-run | kubectl neat
```
いずれかのコマンドで次のYAMLが得られます。
```yaml
---
apiVersion: carto.run/v1alpha1
kind: Workload
metadata:
labels:
admission.datadoghq.com/enabled: "true"
app.kubernetes.io/part-of: rest-service
apps.tanzu.vmware.com/workload-type: web
name: rest-service
namespace: demo
spec:
build:
env:
- name: BP_DATADOG_ENABLED
value: "true"
env:
- name: DD_SERVICE
value: rest-service
- name: DD_ENV
value: sandbox
- name: DD_VERSION
value: "1.0"
params:
- name: annotations
value:
autoscaling.knative.dev/minScale: "1"
source:
git:
ref:
branch: main
url: https://github.com/making/rest-service
```
`annotations` paramに`ad.datadoghq.com/workload.logs`を追加します。次のようにYAMLを変更してください。
```yaml
---
apiVersion: carto.run/v1alpha1
kind: Workload
metadata:
labels:
admission.datadoghq.com/enabled: "true"
app.kubernetes.io/part-of: rest-service
apps.tanzu.vmware.com/workload-type: web
name: rest-service
namespace: demo
spec:
build:
env:
- name: BP_DATADOG_ENABLED
value: "true"
env:
- name: DD_SERVICE
value: rest-service
- name: DD_ENV
value: sandbox
- name: DD_VERSION
value: "1.0"
params:
- name: annotations
value:
ad.datadoghq.com/workload.logs: '[{"source":"workload","service":"rest-service"}]' #! <--- Added
autoscaling.knative.dev/minScale: "1"
source:
git:
ref:
branch: main
url: https://github.com/making/rest-service
```
このYAMLをapplyします。
```
$ tanzu apps workload apply -f workload.yaml
❗ WARNING: Configuration file update strategy is changing. By default, provided configuration files will replace rather than merge existing configuration. The change will take place in the January 2024 TAP release (use "--update-strategy" to control strategy explicitly).
🔎 Update workload:
...
22, 22 | value: "1.0"
23, 23 | params:
24, 24 | - name: annotations
25, 25 | value:
26 + | ad.datadoghq.com/workload.logs: '[{"source":"workload","service":"rest-service"}]'
26, 27 | autoscaling.knative.dev/minScale: "1"
27, 28 | source:
28, 29 | git:
29, 30 | ref:
...
❓ Really update the workload "rest-service"? [yN]: y
👍 Updated workload "rest-service"
To see logs: "tanzu apps workload tail rest-service --namespace demo --timestamp --since 1h"
To get status: "tanzu apps workload get rest-service --namespace demo"
```
> ℹ️ どうしてもYAMLを使いたくない場合は、`--anotation`オプションの代わりに`--param-yaml`オプションを使って次のコマンドでも設定可能です
>
> ```
> tanzu apps workload apply rest-service \
> --app rest-service \
> --git-repo https://github.com/making/rest-service \
> --git-branch main \
> --type web \
> --param-yaml annotation='{"ad.datadoghq.com/workload.logs":"[{\"source\":\"workload\",\"service\":\"rest-service\"}]","autoscaling.knative.dev/minScale":"1"}' \
> --label admission.datadoghq.com/enabled=true \
> --build-env BP_DATADOG_ENABLED=true \
> --env DD_SERVICE=rest-service \
> --env DD_ENV=sandbox \
> --env DD_VERSION=1.0 \
> ```
次のように、Knative ServiceのRevisionが`rest-service-00002`になればデプロイ完了です。
```
$ kubectl get ksvc -n demo rest-service
NAME URL LATESTCREATED LATESTREADY READY REASON
rest-service https://rest-service.demo.127-0-0-1.sslip.io rest-service-00002 rest-service-00002 True
```
もう一度、アプリにリクエストを送ります。
```
for i in `seq 1 100`;do
curl -sk https://rest-service.demo.127-0-0-1.sslip.io/greeting
echo
done
```
次のようなアプリログが出力されます。
```
...
rest-service-00002-deployment-6cfcb4b6f5-kng8m workload {"@timestamp":"2023-02-03T17:37:36.146Z","log.level": "INFO","message":"Greeting{id=98, content='Hello, World!'}","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-3","log.logger":"com.example.restservice.GreetingController","dd.trace_id":"7335734608230234482","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"6315277859940185552"}
rest-service-00002-deployment-6cfcb4b6f5-kng8m workload {"@timestamp":"2023-02-03T17:37:36.195Z","log.level": "INFO","message":"Greeting{id=99, content='Hello, World!'}","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-2","log.logger":"com.example.restservice.GreetingController","dd.trace_id":"8586535972007884877","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"7454967409341757930"}
rest-service-00002-deployment-6cfcb4b6f5-kng8m workload {"@timestamp":"2023-02-03T17:37:36.247Z","log.level": "INFO","message":"Greeting{id=100, content='Hello, World!'}","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-8","log.logger":"com.example.restservice.GreetingController","dd.trace_id":"8792935479711312615","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"1391216223413785587"}
```
DatadogからLogsを確認すると、同じログが転送されていることがわかります。[ecs-logging](https://github.com/elastic/ecs-logging)によって構造化ログが送られているので、"CONTENT"列にはログのうち`message`のフィールドだけが表示されます。
他の項目はログをクリックすると表示されます。
Traceタブをクリックするとこのログ出力のタイミングのTrace情報が表示されます。これは[ecs-logging](https://github.com/elastic/ecs-logging)によって、構造化ログに`dd.trace_id`と`dd.span_id`フィールドが追加されているためです。構造化ログでない場合は、[ドキュメント](https://docs.datadoghq.com/tracing/troubleshooting/correlated-logs-not-showing-up-in-the-trace-id-panel/?%3Ftab=jsonlogs&tab=custom#trace-id-option)にしたがって、マッピングの設定が必要です。
逆にTraceからLogsも見れるようになります。
### 環境変数DD_ENVを環境ごとに変える
iterate profileやfull profileのようなsingle clusterの場合は、`DD_ENV`をハードコードしても良いのですが、
Multi Cluster構成では、同じManifestが複数のRun Clusterに適用されてしまうため、Run Clusterによって`DD_ENV`を変えることができなくなります。
この場合は、`DD_ENV`はConfigMapまたはSecretから取得するようにすれば良いです。次のようなConfigMapまたはSecretをRun Clusterごとに作成します。
```
kubectl create cm -n demo rest-service-config --from-literal env=sandbox --dry-run -oyaml | kubectl apply -f-
```
`DD_ENV`には次のように`valueFrom`で上記で作成したConfigMapまたはSecretを参照するようにします。これで環境に応じて環境変数を変えることができます。
```yaml
---
apiVersion: carto.run/v1alpha1
kind: Workload
metadata:
labels:
admission.datadoghq.com/enabled: "true"
app.kubernetes.io/part-of: rest-service
apps.tanzu.vmware.com/workload-type: web
name: rest-service
namespace: demo
spec:
build:
env:
- name: BP_DATADOG_ENABLED
value: "true"
env:
- name: DD_SERVICE
value: rest-service
- name: DD_ENV
valueFrom:
configMapKeyRef: #! <----
name: rest-service-config
key: env
- name: DD_VERSION
value: "1.0"
params:
- name: annotations
value:
ad.datadoghq.com/workload.logs: '[{"source":"workload","service":"rest-service"}]'
autoscaling.knative.dev/minScale: "1"
source:
git:
ref:
branch: main
url: https://github.com/making/rest-service
```
この設定はTAP 1.4時点では`tanzu` CLIからは行えず、YAML経由で設定する必要があります。
### アプリのPrometheusエンドポイントをScrapeする
Spring Boot Actuatorを使用すると内部で[Micrometer](https://micrometer.io)を使ったメトリクスが収集されます。
Datadog Java Agentはこのメトリクスは送信しません。このメトリクスをDatadogに送るためにPrometheusエンドポイントを利用します。
Datadog AgentはPrometheusエンドポイントを検出して、取得したメトリクスをDatadogに送信することができます。アプリ側は[Micrometer Prometheus](https://micrometer.io/docs/registry/prometheus)をライブラリに加えると`/actuator/prometheus`エンドポイントでPrometheus形式で公開できます(rest-servicesは設定済みです)。
Datadog AgentにPrometheus向けのエンドポイントを有効にするために、Helm Chartの次の設定を行います。
```yaml
datadog:
# ...
prometheusScrape:
enabled: true
serviceEndpoints: true
```
次のコマンドで変更を適用します。
```
helm template -n datadog datadog datadog/datadog -f values.yaml > datadog.yaml
kubectl apply -n datadog -f datadog.yaml
```
Workloadに次のように`prometheus.io/...`アノテーションを追加します。
TAP 1.3までは`prometheus.io/port`を`8081`にしてください。
TAP 1.4の場合はデフォルトで`8080`ですが、Workloadに`apps.tanzu.vmware.com/auto-configure-actuators`ラベルを`true`で設定している場合、またはPlatformレベルで`springboot_conventions.autoConfigureActuators`を`true`に設定している場合は`8081`にしてください。
```yaml
apiVersion: carto.run/v1alpha1
kind: Workload
metadata:
labels:
admission.datadoghq.com/enabled: "true"
app.kubernetes.io/part-of: rest-service
apps.tanzu.vmware.com/workload-type: web
name: rest-service
namespace: demo
spec:
build:
env:
- name: BP_DATADOG_ENABLED
value: "true"
env:
- name: DD_SERVICE
value: rest-service
- name: DD_ENV
valueFrom:
configMapKeyRef:
name: rest-service-config
key: env
- name: DD_VERSION
value: "1.0"
params:
- name: annotations
value:
ad.datadoghq.com/workload.logs: '[{"source":"workload","service":"rest-service"}]'
autoscaling.knative.dev/minScale: "1"
prometheus.io/path: /actuator/prometheus #! <---
prometheus.io/port: "8080" #! <---
prometheus.io/scrape: "true" #! <---
source:
git:
ref:
branch: main
url: https://github.com/making/rest-service
```
このYAMLをapplyします。
```
$ tanzu apps workload apply -f workload.yaml
❗ WARNING: Configuration file update strategy is changing. By default, provided configuration files will replace rather than merge existing configuration. The change will take place in the January 2024 TAP release (use "--update-strategy" to control strategy explicitly).
🔎 Update workload:
...
29, 29 | - name: annotations
30, 30 | value:
31, 31 | ad.datadoghq.com/workload.logs: '[{"source":"workload","service":"rest-service"}]'
32, 32 | autoscaling.knative.dev/minScale: "1"
33 + | prometheus.io/path: /actuator/prometheus
34 + | prometheus.io/port: "8080"
35 + | prometheus.io/scrape: "true"
33, 36 | source:
34, 37 | git:
35, 38 | ref:
36, 39 | branch: main
...
❓ Really update the workload "rest-service"? [yN]: y
👍 Updated workload "rest-service"
To see logs: "tanzu apps workload tail rest-service --namespace demo --timestamp --since 1h"
To get status: "tanzu apps workload get rest-service --namespace demo"
```
APM -> Servicesを確認すると、`/actuator/prometheus`へのアクセスが増えていることが確認できます。
Metrics ExplorerでMicrometer由来のメトリクスが取れていることを実際に確認できます。
例えば、`http_server_requests_seconds_count`を`uri`、`method`、`status`タグ別に表示したグラフが次の図です。
`jvm_memory_used_bytes`を`area`、`id`タグ別に表示したグラフが次の図です。
Micrometerに関するダッシュボードは特に用意されていないので、自分で作成する必要があります。