--- title: Tanzu Application PlatformのWorkloadをDatadogでモニタリングする tags: ["Kubernetes", "Tanzu", "TAP", "Datadog"] categories: ["Dev", "CaaS", "Kubernetes", "TAP"] date: 2023-02-04T08:01:57Z updated: 2023-02-06T06:30:08Z --- Tanzu Application Platformでは以下のDatadog buildpackが利用可能です。 * https://github.com/paketo-buildpacks/datadog * https://docs.vmware.com/en/VMware-Tanzu-Buildpacks/services/tanzu-buildpacks/GUID-partner-integrations-partner-integration-buildpacks.html#datadog このbuildpackはJavaアプリにDatadog Java Agentを、Node.jsアプリには Datadog Node.js Trace Agentを追加しますが、 このbuildpackを使うだけでは実際のデータをDatadogに送ることはできません。Datadogにデータを送るには別途Data Agentを起動させる必要があります。 次の図のような構成になります。
image 本記事では次のドキュメントに従ってDatadog AgentをKubernetesにインストールし、TAPにデプロイしたWorkloadをモニタリングしてみます。 * https://docs.datadoghq.com/containers/kubernetes/installation/?tab=helm [こちらの記事](https://ik.am/entries/733)で構築したTAP 1.4 on Kindの環境で試します。なお、DatadogはFree Trialの環境を使用しています。 **目次** ### Datadog Agentのインストール 今回はDatadog AgentのインストールにはHelm Chartを利用します。今後は[Datadog Operator](https://docs.datadoghq.com/containers/kubernetes/installation?tab=operator)が推奨になるかもしれません。 ``` helm repo add datadog https://helm.datadoghq.com helm repo update ``` Helmのvalues.yamlを次のように設定します。 `datadog.site` と `datadog.apiKey`、`datadog.clusterName` は自分の環境に合わせて変更してください。
また、Kubernetes Distributionごとに必要な設定は
https://docs.datadoghq.com/containers/kubernetes/distributions?tab=helm
を参照してください。 ```yaml datadog: site: us3.datadoghq.com apiKey: xxxxxx clusterName: kind-sandbox kubelet: host: valueFrom: fieldRef: fieldPath: spec.nodeName tlsVerify: false apm: portEnabled: true logs: enabled: true dogstatsd: useHostPort: true ``` `helm template`コマンドを使ってmanifestを生成し、`kubectl apply`でデプロイします。Multi Cluster構成の場合は、Runクラスタに対してインストールしてください。 ``` kubectl create ns datadog helm template -n datadog datadog datadog/datadog -f values.yaml > datadog.yaml kubectl apply -n datadog -f datadog.yaml ``` Datadog AgentがDaemonSetとしてインストールされます。またKubernetesクラスタの監視のための[Datadog Cluster Agent](https://docs.datadoghq.com/containers/cluster_agent/)がDeploymentとしてインストールされます。 ``` $ kubectl get pod,deploy,ds -n datadog NAME READY STATUS RESTARTS AGE pod/datadog-cluster-agent-7b8c9fb77b-xpkld 1/1 Running 0 55m pod/datadog-vk27t 3/3 Running 0 55m NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/datadog-cluster-agent 1/1 1 1 150m NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE daemonset.apps/datadog 1 1 1 1 1 kubernetes.io/os=linux 150m ``` Kubernetesクラスタの監視は不要な場合は`clusterAgent.enabled`を`false`にすればDatadog Cluster Agentはインストールされません。 合わせて `datadog.kubeStateMetricsCore.enabled`と`datadog.collectEvents`も`false`にすればKubernetesに関するデータの送信を減らせます。 > ⚠️ `clusterAgent.enabled`を`false`に設定すると[Datadog Admission controller](https://docs.datadoghq.com/containers/cluster_agent/admission_controller/?tab=helm)がなくなるため、後述の環境変数 `DD_AGENT_HOST` を手動で設定する必要があります。
> `admission.datadoghq.com/enabled` Labelも設定する意味がなくなります。 ### Workloadのデプロイ それではモニタリング対象のWorkloadをデプロイします。デプロイするアプリは https://github.com/making/rest-service です。
このアプリは[ecs-logging](https://github.com/elastic/ecs-logging)でログを構造化(JSON化)しています。Datadogなどにログを転送する場合は構造化ログが便利です。
次のコマンドでWorkloadを作成します。 ``` tanzu apps workload apply rest-service \ --app rest-service \ --git-repo https://github.com/making/rest-service \ --git-branch main \ --type web \ --annotation autoscaling.knative.dev/minScale=1 \ --label admission.datadoghq.com/enabled=true \ --build-env BP_DATADOG_ENABLED=true \ --env DD_SERVICE=rest-service \ --env DD_ENV=sandbox \ --env DD_VERSION=1.0 \ -n demo \ -y ``` ポイントは * `--label admission.datadoghq.com/enabled=true` を設定することで[Datadog Admission controller](https://docs.datadoghq.com/containers/cluster_agent/admission_controller/?tab=helm)がPodに自動で環境変数 `DD_AGENT_HOST` などを設定します。 * `--build-env BP_DATADOG_ENABLED=true` を設定することでbuildserviceによるコンテナイメージ作成時に[datadog buildpack](https://docs.vmware.com/en/VMware-Tanzu-Buildpacks/services/tanzu-buildpacks/GUID-partner-integrations-partner-integration-buildpacks.html#datadog)がDatadog Java Agentを自動でインストールします。 環境変数`DD_**`の設定は任意ですが、設定した方がDatadogからアプリの情報にアクセスしやすいです。少なくとも`DD_SERVICE`にはWorkload名を入れると良いでしょう。`DD_ENV`は環境名、`DD_VERSION`はアプリのバージョンです。 > ℹ️ Multi Cluster環境では`--env`で設定した値はどの(Run)環境でも同じ値が使われます。`DD_ENV`のような環境によって異なる値は、ConfigMapやSecretを経由して設定した方が良いです。設定方法は後述します。 `stern`コマンドでログを確認します。 ``` stern -n demo rest-service ``` コンテナイメージビルド中に次のようなログで、Datadog Java Agentが追加されることを確認できます。 ``` rest-service-build-1-build-pod build Paketo Buildpack for Datadog 3.0.0 rest-service-build-1-build-pod build https://github.com/paketo-buildpacks/datadog rest-service-build-1-build-pod build Datadog Java Agent 1.1.4: Contributing to layer rest-service-build-1-build-pod build Downloading from https://repo1.maven.org/maven2/com/datadoghq/dd-java-agent/1.1.4/dd-java-agent-1.1.4.jar rest-service-build-1-build-pod build Verifying checksum rest-service-build-1-build-pod build Copying to /layers/paketo-buildpacks_datadog/datadog-agent-java rest-service-build-1-build-pod build Writing env.launch/JAVA_TOOL_OPTIONS.append rest-service-build-1-build-pod build Writing env.launch/JAVA_TOOL_OPTIONS.delim ``` アプリケーション起動時のログに環境変数`JAVA_TOOL_OPTIONS`が出力され、`-javaagent:/layers/paketo-buildpacks_datadog/datadog-agent-java/dd-java-agent-1.1.4.jar`が含まれていることを確認できます。 また、`[dd.trace`から始まるログでDatadog Java Agentが起動していることもわかります。Datadog Java Agentが接続するDatadog AgentのURLが`"agent_url":"http://172.19.0.2:8126"`と出力されています。`172.19.0.2`はNodeのIPです。 ``` rest-service-00001-deployment-78454d968b-hmb9z workload Setting Active Processor Count to 8 rest-service-00001-deployment-78454d968b-hmb9z workload Calculating JVM memory based on 11788884K available memory rest-service-00001-deployment-78454d968b-hmb9z workload For more information on this calculation, see https://paketo.io/docs/reference/java-reference/#memory-calculator rest-service-00001-deployment-78454d968b-hmb9z workload Calculated JVM Memory Configuration: -XX:MaxDirectMemorySize=10M -Xmx11168701K -XX:MaxMetaspaceSize=108182K -XX:ReservedCodeCacheSize=240M -Xss1M (Total Memory: 11788884K, Thread Count: 250, Loaded Class Count: 16686, Headroom: 0%) rest-service-00001-deployment-78454d968b-hmb9z workload Enabling Java Native Memory Tracking rest-service-00001-deployment-78454d968b-hmb9z workload Adding 124 container CA certificates to JVM truststore rest-service-00001-deployment-78454d968b-hmb9z workload Spring Cloud Bindings Enabled rest-service-00001-deployment-78454d968b-hmb9z workload Picked up JAVA_TOOL_OPTIONS: -Dmanagement.endpoint.health.probes.add-additional-paths="true" -Dmanagement.health.probes.enabled="true" -Dserver.port="8080" -Dserver.shutdown.grace-period="24s" -Djava.security.properties=/layers/paketo-buildpacks_bellsoft-liberica/java-security-properties/java-security.properties -XX:+ExitOnOutOfMemoryError -javaagent:/layers/paketo-buildpacks_datadog/datadog-agent-java/dd-java-agent-1.1.4.jar -XX:ActiveProcessorCount=8 -XX:MaxDirectMemorySize=10M -Xmx11168701K -XX:MaxMetaspaceSize=108182K -XX:ReservedCodeCacheSize=240M -Xss1M -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -XX:+PrintNMTStatistics -Dorg.springframework.cloud.bindings.boot.enable=true rest-service-00001-deployment-78454d968b-hmb9z queue-proxy {"severity":"INFO","timestamp":"2023-02-03T17:21:50.2316515Z","logger":"queueproxy","caller":"sharedmain/main.go:270","message":"Starting queue-proxy","commit":"d7e9873","knative.dev/key":"demo/rest-service-00001","knative.dev/pod":"rest-service-00001-deployment-78454d968b-hmb9z"} rest-service-00001-deployment-78454d968b-hmb9z workload [dd.trace 2023-02-03 17:21:51:856 +0000] [main] INFO com.datadog.appsec.AppSecSystem - AppSec is ENABLED_INACTIVE with powerwaf(libddwaf: 1.5.1) no rules loaded rest-service-00001-deployment-78454d968b-hmb9z workload [dd.trace 2023-02-03 17:21:52:220 +0000] [dd-task-scheduler] INFO datadog.trace.agent.core.StatusLogger - DATADOG TRACER CONFIGURATION {"version":"1.1.4~022e2ba179","os_name":"Linux","os_version":"5.10.104-linuxkit","architecture":"amd64","lang":"jvm","lang_version":"11.0.17","jvm_vendor":"BellSoft","jvm_version":"11.0.17+7-LTS","java_class_version":"55.0","http_nonProxyHosts":"null","http_proxyHost":"null","enabled":true,"service":"rest-service","agent_url":"http://172.19.0.2:8126","agent_error":false,"debug":false,"analytics_enabled":false,"sampling_rules":[{},{}],"priority_sampling_enabled":true,"logs_correlation_enabled":true,"profiling_enabled":false,"remote_config_enabled":true,"debugger_enabled":false,"appsec_enabled":"ENABLED_INACTIVE","telemetry_enabled":true,"dd_version":"1.0","health_checks_enabled":true,"configuration_file":"no config file present","runtime_id":"d10c232d-88f9-441f-9efa-35ea763a7e04","logging_settings":{"levelInBrackets":false,"dateTimeFormat":"'[dd.trace 'yyyy-MM-dd HH:mm:ss:SSS Z']'","logFile":"System.err","configurationFile":"simplelogger.properties","showShortLogName":false,"showDateTime":true,"showLogName":true,"showThreadName":true,"defaultLogLevel":"INFO","warnLevelString":"WARN","embedException":false},"cws_enabled":false,"cws_tls_refresh":5000} rest-service-00001-deployment-78454d968b-hmb9z workload [dd.trace 2023-02-03 17:21:52:869 +0000] [dd-task-scheduler] WARN datadog.telemetry.dependency.DependencyResolverQueue - unable to detect dependency for URI file:/workspace/ rest-service-00001-deployment-78454d968b-hmb9z workload rest-service-00001-deployment-78454d968b-hmb9z workload . ____ _ __ _ _ rest-service-00001-deployment-78454d968b-hmb9z workload /\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \ rest-service-00001-deployment-78454d968b-hmb9z workload ( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \ rest-service-00001-deployment-78454d968b-hmb9z workload \\/ ___)| |_)| | | | | || (_| | ) ) ) ) rest-service-00001-deployment-78454d968b-hmb9z workload ' |____| .__|_| |_|_| |_\__, | / / / / rest-service-00001-deployment-78454d968b-hmb9z workload =========|_|==============|___/=/_/_/_/ rest-service-00001-deployment-78454d968b-hmb9z workload :: Spring Boot :: (v2.7.8) rest-service-00001-deployment-78454d968b-hmb9z workload rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:53.382Z","log.level": "INFO","message":"Starting RestServiceApplication v0.0.1-SNAPSHOT using Java 11.0.17 on rest-service-00001-deployment-78454d968b-hmb9z with PID 1 (/workspace/BOOT-INF/classes started by cnb in /workspace)","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"com.example.restservice.RestServiceApplication","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"} rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:53.390Z","log.level": "INFO","message":"No active profile set, falling back to 1 default profile: \"default\"","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"com.example.restservice.RestServiceApplication","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"} rest-service-00001-deployment-78454d968b-hmb9z workload [dd.trace 2023-02-03 17:21:53:868 +0000] [dd-task-scheduler] WARN datadog.telemetry.dependency.DependencyResolverQueue - unable to detect dependency for URI file:/workspace/BOOT-INF/classes/ rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:55.267Z","log.level": "INFO","message":"Tomcat initialized with port(s): 8080 (http)","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.springframework.boot.web.embedded.tomcat.TomcatWebServer","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"} rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:55.331Z","log.level": "INFO","message":"Starting service [Tomcat]","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.apache.catalina.core.StandardService","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"} rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:55.331Z","log.level": "INFO","message":"Starting Servlet engine: [Apache Tomcat/9.0.71]","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.apache.catalina.core.StandardEngine","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"} rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:55.434Z","log.level": "INFO","message":"Initializing Spring embedded WebApplicationContext","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/]","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"} rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:55.434Z","log.level": "INFO","message":"Root WebApplicationContext: initialization completed in 1974 ms","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"} rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:56.917Z","log.level": "INFO","message":"Exposing 1 endpoint(s) beneath base path '/actuator'","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.springframework.boot.actuate.endpoint.web.EndpointLinksResolver","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"} rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:56.979Z","log.level": "INFO","message":"Tomcat started on port(s): 8080 (http) with context path ''","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"org.springframework.boot.web.embedded.tomcat.TomcatWebServer","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"} rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:56.999Z","log.level": "INFO","message":"Started RestServiceApplication in 4.712 seconds (JVM running for 6.843)","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"main","log.logger":"com.example.restservice.RestServiceApplication","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0"} rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:57.269Z","log.level": "INFO","message":"Initializing Spring DispatcherServlet 'dispatcherServlet'","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-3","log.logger":"org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/]","dd.trace_id":"8339885192089890711","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"2116877690292855250"} rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:57.270Z","log.level": "INFO","message":"Initializing Servlet 'dispatcherServlet'","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-3","log.logger":"org.springframework.web.servlet.DispatcherServlet","dd.trace_id":"8339885192089890711","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"2116877690292855250"} rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:21:57.272Z","log.level": "INFO","message":"Completed initialization in 2 ms","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-3","log.logger":"org.springframework.web.servlet.DispatcherServlet","dd.trace_id":"8339885192089890711","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"2116877690292855250"} ``` `tanzu apps workload get ...`でWorkloadの情報を確認します。`Knative Services`が`READY`であればアプリにアクセス可能です。 ``` $ tanzu apps workload get -n demo rest-service 📡 Overview name: rest-service type: web namespace: demo 💾 Source type: git url: https://github.com/making/rest-service branch: main 📦 Supply Chain name: source-to-url NAME READY HEALTHY UPDATED RESOURCE source-provider True True 5m13s gitrepositories.source.toolkit.fluxcd.io/rest-service image-provider True True 110s images.kpack.io/rest-service config-provider True True 102s podintents.conventions.carto.run/rest-service app-config True True 102s configmaps/rest-service service-bindings True True 102s configmaps/rest-service-with-claims api-descriptors True True 102s configmaps/rest-service-with-api-descriptors config-writer True True 89s runnables.carto.run/rest-service-config-writer 🚚 Delivery name: delivery-basic NAME READY HEALTHY UPDATED RESOURCE source-provider True True 67s imagerepositories.source.apps.tanzu.vmware.com/rest-service-delivery deployer True True 64s apps.kappctrl.k14s.io/rest-service 💬 Messages No messages found. 🛶 Pods NAME READY STATUS RESTARTS AGE rest-service-00001-deployment-78454d968b-hmb9z 2/2 Running 0 67s rest-service-build-1-build-pod 0/1 Completed 0 5m12s rest-service-config-writer-9hbdf-pod 0/1 Completed 0 100s 🚢 Knative Services NAME READY URL rest-service Ready https://rest-service.demo.127-0-0-1.sslip.io To see logs: "tanzu apps workload tail rest-service --namespace demo --timestamp --since 1h" ``` 次のコマンドでDatadog Admission controllerによってPodに環境変数が追加されているか確認します。 ``` $ kubectl get pod -n demo rest-service-00001-deployment-78454d968b-hmb9z -oyaml | grep env: -A 12 - env: - name: DD_ENTITY_ID valueFrom: fieldRef: apiVersion: v1 fieldPath: metadata.uid - name: DD_AGENT_HOST valueFrom: fieldRef: apiVersion: v1 fieldPath: status.hostIP - name: DD_SERVICE value: rest-service -- - env: - name: DD_ENTITY_ID valueFrom: fieldRef: apiVersion: v1 fieldPath: metadata.uid - name: DD_AGENT_HOST valueFrom: fieldRef: apiVersion: v1 fieldPath: status.hostIP - name: SERVING_NAMESPACE value: demo ``` `DD_ENTITY_ID`と`DD_AGENT_HOST`が自動で追加されていることがわかります。`DD_AGENT_HOST`には`fieldRef`を使用してhostIPが設定されています。そのため、ログに`"agent_url":"http://172.19.0.2:8126"`が出力されていました。 > ⚠️ Datadog Admission controllerを使わない場合は、`DD_AGENT_HOST`が設定されないため、Datadog Java Agentがのデータ送信に失敗します。
> この場合、環境変数 `DD_AGENT_HOST` を明示する必要があります。
> ただし、Knative Serviceは`fieldRef`を使えません。`fieldRef`を使ってNodeのHost IPを設定したい場合は`--type=server`を指定してDeploymentとしてアプリをデプロイする必要があります。
> または次のようなmanifestでにDatadog Agentに対するServiceを作成しても良いです。 > > ```yaml > apiVersion: v1 > kind: Service > metadata: > name: datadog-agent > namespace: datadog > spec: > ports: > - name: agent > port: 8126 > protocol: TCP > targetPort: 8126 > selector: > app.kubernetes.io/component: agent > app.kubernetes.io/instance: datadog > type: ClusterIP > ``` > > この場合`DD_AGENT_HOST`に対する設定は `--env DD_AGENT_HOST=datadog-agent.datadog.svc.cluster.local` になります。 DatadogでAPM -> Servicesへ遷移するとService一覧に"reset-service"が表示されます。 image "reset-service"をクリックするとrest-serviceのSummaryが表示されます。Readiness ProbeとLiveness Probeの`GET /readyz`と`GET /livez`に関するデータが毎秒送られてることがわかります。 image "Service Summary"の下の"JVM Metrics"でJVMのメトリクスを見ることもできます。 image アプリの`/greeting`エンドポイントへリクエストを送りましょう。 ``` for i in `seq 1 100`;do curl -sk https://rest-service.demo.127-0-0-1.sslip.io/greeting echo done ``` 次のようなレスポンスが返ります。 ``` {"id":1,"content":"Hello, World!"} {"id":2,"content":"Hello, World!"} {"id":3,"content":"Hello, World!"} {"id":4,"content":"Hello, World!"} {"id":5,"content":"Hello, World!"} ... {"id":100,"content":"Hello, World!"} ``` また、次のようなアプリのログも出力されます。Datadog Java Agentは`dd.trace_id`と`dd.span_id`をMDCに自動で設定します。ecs-loggingを使用すれば、MDCに設定された値を自動でJSONに追加してくれるので、ログにも`dd.trace_id`と`dd.span_id`が出力されていることがわかります。これはのちにTraceとLogを結びつける際に役立ちます。 ``` ... rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:25:37.895Z","log.level": "INFO","message":"Greeting{id=98, content='Hello, World!'}","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-1","log.logger":"com.example.restservice.GreetingController","dd.trace_id":"5626764257006979930","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"2368498438859442974"} rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:25:37.945Z","log.level": "INFO","message":"Greeting{id=99, content='Hello, World!'}","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-5","log.logger":"com.example.restservice.GreetingController","dd.trace_id":"8014719942895616066","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"8884205664488887926"} rest-service-00001-deployment-78454d968b-hmb9z workload {"@timestamp":"2023-02-03T17:25:37.999Z","log.level": "INFO","message":"Greeting{id=100, content='Hello, World!'}","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-3","log.logger":"com.example.restservice.GreetingController","dd.trace_id":"3762825320413857103","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"3963548649385222325"} ``` DatadogでAPM -> Tracesに遷移してください。 `/greeting`へのTrace情報が一覧に含まれていることがわかります。 image 一つ選んでクリックして、詳細を確認します。処理内容の内訳が表示されます。 image "Logs"タブをクリックすると"No logs found"と表示されます。まだアプリからログを送る設定を行なっていませんでした。 image ### アプリのログをDatadogに送る Datadogへのログ転送はPlatform単位で行う方法とWorkload単位で行う方法があります。今回はWorkload単位で設定します。 次のドキュメントのようにPodの`ad.datadoghq.com/.logs`アノテーションに`'[{"source": "","service": "","tags": ["", ""]}]'`という設定を行う必要があります。
https://docs.datadoghq.com/containers/kubernetes/log/?tab=kubernetes#configuration `tanzu apps workload`の`--annotation`オプションはJSONを設定するとエラーになるので、WorkloadはYAMLで設定します。 現在のWorkloadの設定内容をYAMLに出力にするには次のコマンドが利用可能です。 ``` tanzu apps workload get -n demo rest-service --export ``` もしくは、Workloadを作成する前にYAML化したい場合は`--dry-run`オプションをつければ良いです。 ``` tanzu apps workload apply rest-service \ --app rest-service \ --git-repo https://github.com/making/rest-service \ --git-branch main \ --type web \ --annotation autoscaling.knative.dev/minScale=1 \ --label admission.datadoghq.com/enabled=true \ --build-env BP_DATADOG_ENABLED=true \ --env DD_SERVICE=rest-service \ --env DD_ENV=sandbox \ --env DD_VERSION=1.0 \ -n demo --dry-run | kubectl neat ``` いずれかのコマンドで次のYAMLが得られます。 ```yaml --- apiVersion: carto.run/v1alpha1 kind: Workload metadata: labels: admission.datadoghq.com/enabled: "true" app.kubernetes.io/part-of: rest-service apps.tanzu.vmware.com/workload-type: web name: rest-service namespace: demo spec: build: env: - name: BP_DATADOG_ENABLED value: "true" env: - name: DD_SERVICE value: rest-service - name: DD_ENV value: sandbox - name: DD_VERSION value: "1.0" params: - name: annotations value: autoscaling.knative.dev/minScale: "1" source: git: ref: branch: main url: https://github.com/making/rest-service ``` `annotations` paramに`ad.datadoghq.com/workload.logs`を追加します。次のようにYAMLを変更してください。 ```yaml --- apiVersion: carto.run/v1alpha1 kind: Workload metadata: labels: admission.datadoghq.com/enabled: "true" app.kubernetes.io/part-of: rest-service apps.tanzu.vmware.com/workload-type: web name: rest-service namespace: demo spec: build: env: - name: BP_DATADOG_ENABLED value: "true" env: - name: DD_SERVICE value: rest-service - name: DD_ENV value: sandbox - name: DD_VERSION value: "1.0" params: - name: annotations value: ad.datadoghq.com/workload.logs: '[{"source":"workload","service":"rest-service"}]' #! <--- Added autoscaling.knative.dev/minScale: "1" source: git: ref: branch: main url: https://github.com/making/rest-service ``` このYAMLをapplyします。 ``` $ tanzu apps workload apply -f workload.yaml ❗ WARNING: Configuration file update strategy is changing. By default, provided configuration files will replace rather than merge existing configuration. The change will take place in the January 2024 TAP release (use "--update-strategy" to control strategy explicitly). 🔎 Update workload: ... 22, 22 | value: "1.0" 23, 23 | params: 24, 24 | - name: annotations 25, 25 | value: 26 + | ad.datadoghq.com/workload.logs: '[{"source":"workload","service":"rest-service"}]' 26, 27 | autoscaling.knative.dev/minScale: "1" 27, 28 | source: 28, 29 | git: 29, 30 | ref: ... ❓ Really update the workload "rest-service"? [yN]: y 👍 Updated workload "rest-service" To see logs: "tanzu apps workload tail rest-service --namespace demo --timestamp --since 1h" To get status: "tanzu apps workload get rest-service --namespace demo" ``` > ℹ️ どうしてもYAMLを使いたくない場合は、`--anotation`オプションの代わりに`--param-yaml`オプションを使って次のコマンドでも設定可能です > > ``` > tanzu apps workload apply rest-service \ > --app rest-service \ > --git-repo https://github.com/making/rest-service \ > --git-branch main \ > --type web \ > --param-yaml annotation='{"ad.datadoghq.com/workload.logs":"[{\"source\":\"workload\",\"service\":\"rest-service\"}]","autoscaling.knative.dev/minScale":"1"}' \ > --label admission.datadoghq.com/enabled=true \ > --build-env BP_DATADOG_ENABLED=true \ > --env DD_SERVICE=rest-service \ > --env DD_ENV=sandbox \ > --env DD_VERSION=1.0 \ > ``` 次のように、Knative ServiceのRevisionが`rest-service-00002`になればデプロイ完了です。 ``` $ kubectl get ksvc -n demo rest-service NAME URL LATESTCREATED LATESTREADY READY REASON rest-service https://rest-service.demo.127-0-0-1.sslip.io rest-service-00002 rest-service-00002 True ``` もう一度、アプリにリクエストを送ります。 ``` for i in `seq 1 100`;do curl -sk https://rest-service.demo.127-0-0-1.sslip.io/greeting echo done ``` 次のようなアプリログが出力されます。 ``` ... rest-service-00002-deployment-6cfcb4b6f5-kng8m workload {"@timestamp":"2023-02-03T17:37:36.146Z","log.level": "INFO","message":"Greeting{id=98, content='Hello, World!'}","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-3","log.logger":"com.example.restservice.GreetingController","dd.trace_id":"7335734608230234482","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"6315277859940185552"} rest-service-00002-deployment-6cfcb4b6f5-kng8m workload {"@timestamp":"2023-02-03T17:37:36.195Z","log.level": "INFO","message":"Greeting{id=99, content='Hello, World!'}","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-2","log.logger":"com.example.restservice.GreetingController","dd.trace_id":"8586535972007884877","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"7454967409341757930"} rest-service-00002-deployment-6cfcb4b6f5-kng8m workload {"@timestamp":"2023-02-03T17:37:36.247Z","log.level": "INFO","message":"Greeting{id=100, content='Hello, World!'}","ecs.version": "1.2.0","service.name":"reset-service","event.dataset":"reset-service","process.thread.name":"http-nio-8080-exec-8","log.logger":"com.example.restservice.GreetingController","dd.trace_id":"8792935479711312615","dd.service":"rest-service","dd.env":"sandbox","dd.version":"1.0","dd.span_id":"1391216223413785587"} ``` DatadogからLogsを確認すると、同じログが転送されていることがわかります。[ecs-logging](https://github.com/elastic/ecs-logging)によって構造化ログが送られているので、"CONTENT"列にはログのうち`message`のフィールドだけが表示されます。 image 他の項目はログをクリックすると表示されます。 image Traceタブをクリックするとこのログ出力のタイミングのTrace情報が表示されます。これは[ecs-logging](https://github.com/elastic/ecs-logging)によって、構造化ログに`dd.trace_id`と`dd.span_id`フィールドが追加されているためです。構造化ログでない場合は、[ドキュメント](https://docs.datadoghq.com/tracing/troubleshooting/correlated-logs-not-showing-up-in-the-trace-id-panel/?%3Ftab=jsonlogs&tab=custom#trace-id-option)にしたがって、マッピングの設定が必要です。 image 逆にTraceからLogsも見れるようになります。 image ### 環境変数DD_ENVを環境ごとに変える iterate profileやfull profileのようなsingle clusterの場合は、`DD_ENV`をハードコードしても良いのですが、 Multi Cluster構成では、同じManifestが複数のRun Clusterに適用されてしまうため、Run Clusterによって`DD_ENV`を変えることができなくなります。 この場合は、`DD_ENV`はConfigMapまたはSecretから取得するようにすれば良いです。次のようなConfigMapまたはSecretをRun Clusterごとに作成します。 ``` kubectl create cm -n demo rest-service-config --from-literal env=sandbox --dry-run -oyaml | kubectl apply -f- ``` `DD_ENV`には次のように`valueFrom`で上記で作成したConfigMapまたはSecretを参照するようにします。これで環境に応じて環境変数を変えることができます。 ```yaml --- apiVersion: carto.run/v1alpha1 kind: Workload metadata: labels: admission.datadoghq.com/enabled: "true" app.kubernetes.io/part-of: rest-service apps.tanzu.vmware.com/workload-type: web name: rest-service namespace: demo spec: build: env: - name: BP_DATADOG_ENABLED value: "true" env: - name: DD_SERVICE value: rest-service - name: DD_ENV valueFrom: configMapKeyRef: #! <---- name: rest-service-config key: env - name: DD_VERSION value: "1.0" params: - name: annotations value: ad.datadoghq.com/workload.logs: '[{"source":"workload","service":"rest-service"}]' autoscaling.knative.dev/minScale: "1" source: git: ref: branch: main url: https://github.com/making/rest-service ``` この設定はTAP 1.4時点では`tanzu` CLIからは行えず、YAML経由で設定する必要があります。 ### アプリのPrometheusエンドポイントをScrapeする Spring Boot Actuatorを使用すると内部で[Micrometer](https://micrometer.io)を使ったメトリクスが収集されます。 Datadog Java Agentはこのメトリクスは送信しません。このメトリクスをDatadogに送るためにPrometheusエンドポイントを利用します。 Datadog AgentはPrometheusエンドポイントを検出して、取得したメトリクスをDatadogに送信することができます。アプリ側は[Micrometer Prometheus](https://micrometer.io/docs/registry/prometheus)をライブラリに加えると`/actuator/prometheus`エンドポイントでPrometheus形式で公開できます(rest-servicesは設定済みです)。 Datadog AgentにPrometheus向けのエンドポイントを有効にするために、Helm Chartの次の設定を行います。 ```yaml datadog: # ... prometheusScrape: enabled: true serviceEndpoints: true ``` 次のコマンドで変更を適用します。 ``` helm template -n datadog datadog datadog/datadog -f values.yaml > datadog.yaml kubectl apply -n datadog -f datadog.yaml ``` Workloadに次のように`prometheus.io/...`アノテーションを追加します。 TAP 1.3までは`prometheus.io/port`を`8081`にしてください。 TAP 1.4の場合はデフォルトで`8080`ですが、Workloadに`apps.tanzu.vmware.com/auto-configure-actuators`ラベルを`true`で設定している場合、またはPlatformレベルで`springboot_conventions.autoConfigureActuators`を`true`に設定している場合は`8081`にしてください。 ```yaml apiVersion: carto.run/v1alpha1 kind: Workload metadata: labels: admission.datadoghq.com/enabled: "true" app.kubernetes.io/part-of: rest-service apps.tanzu.vmware.com/workload-type: web name: rest-service namespace: demo spec: build: env: - name: BP_DATADOG_ENABLED value: "true" env: - name: DD_SERVICE value: rest-service - name: DD_ENV valueFrom: configMapKeyRef: name: rest-service-config key: env - name: DD_VERSION value: "1.0" params: - name: annotations value: ad.datadoghq.com/workload.logs: '[{"source":"workload","service":"rest-service"}]' autoscaling.knative.dev/minScale: "1" prometheus.io/path: /actuator/prometheus #! <--- prometheus.io/port: "8080" #! <--- prometheus.io/scrape: "true" #! <--- source: git: ref: branch: main url: https://github.com/making/rest-service ``` このYAMLをapplyします。 ``` $ tanzu apps workload apply -f workload.yaml ❗ WARNING: Configuration file update strategy is changing. By default, provided configuration files will replace rather than merge existing configuration. The change will take place in the January 2024 TAP release (use "--update-strategy" to control strategy explicitly). 🔎 Update workload: ... 29, 29 | - name: annotations 30, 30 | value: 31, 31 | ad.datadoghq.com/workload.logs: '[{"source":"workload","service":"rest-service"}]' 32, 32 | autoscaling.knative.dev/minScale: "1" 33 + | prometheus.io/path: /actuator/prometheus 34 + | prometheus.io/port: "8080" 35 + | prometheus.io/scrape: "true" 33, 36 | source: 34, 37 | git: 35, 38 | ref: 36, 39 | branch: main ... ❓ Really update the workload "rest-service"? [yN]: y 👍 Updated workload "rest-service" To see logs: "tanzu apps workload tail rest-service --namespace demo --timestamp --since 1h" To get status: "tanzu apps workload get rest-service --namespace demo" ``` APM -> Servicesを確認すると、`/actuator/prometheus`へのアクセスが増えていることが確認できます。 image Metrics ExplorerでMicrometer由来のメトリクスが取れていることを実際に確認できます。 例えば、`http_server_requests_seconds_count`を`uri`、`method`、`status`タグ別に表示したグラフが次の図です。 image `jvm_memory_used_bytes`を`area`、`id`タグ別に表示したグラフが次の図です。 image Micrometerに関するダッシュボードは特に用意されていないので、自分で作成する必要があります。