JVM
Two methods are available for collecting JVM metrics: Jolokia (deprecated) and ddtrace. Our suggestions for choosing between them:
- ddtrace is the recommended way to collect JVM metrics. Jolokia still works, but it is more cumbersome to use and is therefore not recommended.
- To collect the JVM metrics of your own Java application, we recommend the ddtrace scheme, which collects link tracing (APM) data as well as JVM metrics.
Config
Collect JVM Metrics Through ddtrace
DataKit has a built-in statsd collector for receiving statsd protocol data sent over the network. Here we use ddtrace to collect metrics from the JVM and send them to the DataKit via statsd protocol.
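To make the statsd transport concrete, here is a minimal sketch of what a single gauge looks like on the wire in the DogStatsD text format, and how it could be sent over UDP. The helper names `format_gauge` and `send_metric` are hypothetical, the metric name and tag are only illustrative, and the default port assumes a listener on UDP 8125. ddtrace does this for you; the sketch is only useful for checking that a listener is reachable.

```python
import socket


def format_gauge(name, value, tags=None):
    """Build one DogStatsD gauge line: <name>:<value>|g|#key:value,key:value"""
    line = f"{name}:{value}|g"
    if tags:
        line += "|#" + ",".join(f"{k}:{v}" for k, v in tags.items())
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Send one datagram. UDP is fire-and-forget, so success here only
    means the packet was handed to the network stack."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))
```

Calling `send_metric(format_gauge("jvm.heap_memory", 1024, {"env": "staging"}))` emits a single test datagram to the local listener.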
Collector Configuration
The following statsd configuration is recommended for collecting ddtrace JVM metrics. Copy it to the conf.d/statsd directory and name it ddtrace-jvm-statsd.conf:
[[inputs.statsd]]
protocol = "udp"
## Address and port to host UDP listener on
service_address = ":8125"
## separator to use between elements of a statsd metric
metric_separator = "_"
drop_tags = ["runtime-id"]
metric_mapping = [
"jvm_:jvm",
"datadog_tracer_:ddtrace",
]
# There is no need to pay attention to the following configurations...
delete_gauges = true
delete_counters = true
delete_sets = true
delete_timings = true
## Percentiles to calculate for timing & histogram stats
percentiles = [50.0, 90.0, 99.0, 99.9, 99.95, 100.0]
## Parses tags in the datadog statsd format
## http://docs.datadoghq.com/guides/dogstatsd/
parse_data_dog_tags = true
## Parses datadog extensions to the statsd format
datadog_extensions = true
## Parses distributions metric as specified in the datadog statsd format
## https://docs.datadoghq.com/developers/metrics/types/?tab=distribution#definition
datadog_distributions = true
## Number of UDP messages allowed to queue up, once filled,
## the statsd server will start dropping packets
allowed_pending_messages = 10000
## Number of timing/histogram values to track per-measurement in the
## calculation of percentiles. Raising this limit increases the accuracy
## of percentiles but also increases the memory usage and cpu time.
percentile_limit = 1000
## Max duration (TTL) for each metric to stay cached/reported without being updated.
#max_ttl = "1000h"
[inputs.statsd.tags]
# some_tag = "your-tag-value"
# some_other_tag = "your-other-tag-value"
Currently, the collector can be turned on by injecting the collector configuration through a ConfigMap.
Notes on this configuration:
- `service_address` is set to `:8125` here, which is the destination address to which ddtrace sends JVM metrics.
- `drop_tags` discards `runtime-id` here because keeping it could cause an explosion in the number of timelines. If you really need this field, just remove it from `drop_tags`.
- `metric_mapping`: the raw data sent by ddtrace contains two types of metrics, whose names begin with `jvm_` and `datadog_tracer_` respectively, so we group them into two metric sets: `jvm`, and `ddtrace` for the tracer's own runtime metrics.
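The effect of `metric_separator`, `metric_mapping` and `drop_tags` described above can be illustrated with a small sketch. It is not DataKit's implementation: the function names are hypothetical, the metric names and tag values are only illustrative, and stripping the matched prefix from the field name is an assumption made for the example.

```python
def normalize(name, separator="_"):
    """Replace the dots in an incoming metric name with the separator."""
    return name.replace(".", separator)


def apply_mapping(name, mappings):
    """Return (metric_set, field) for the first matching 'prefix:metric_set'
    rule, or (None, name) when no rule matches.
    Assumption: the matched prefix is stripped from the field name."""
    for rule in mappings:
        prefix, metric_set = rule.split(":", 1)
        if name.startswith(prefix):
            return metric_set, name[len(prefix):]
    return None, name


def drop_tags(tags, dropped):
    """Remove the tags listed in `dropped` from a tag dictionary."""
    return {k: v for k, v in tags.items() if k not in dropped}


MAPPINGS = ["jvm_:jvm", "datadog_tracer_:ddtrace"]

metric_set, field = apply_mapping(normalize("jvm.heap_memory"), MAPPINGS)
tags = drop_tags({"runtime-id": "3f1c", "env": "staging"}, ["runtime-id"])
```

Because the `runtime-id` value differs from one JVM process to the next, dropping it keeps the remaining tag set stable across restarts.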
Start Java Application
One workable way to start the JVM is as follows:
java -javaagent:dd-java-agent.jar \
-Ddd.profiling.enabled=true \
-Ddd.logs.injection=true \
-Ddd.trace.sample.rate=1 \
-Ddd.service.name=my-app \
-Ddd.env=staging \
-Ddd.agent.host=localhost \
-Ddd.agent.port=9529 \
-Ddd.jmxfetch.enabled=true \
-Ddd.jmxfetch.check-period=1000 \
-Ddd.jmxfetch.statsd.host=127.0.0.1 \
-Ddd.jmxfetch.statsd.port=8125 \
-Ddd.version=1.0 \
-jar your-app.jar
Note:
- For the download of the `dd-java-agent.jar` package, see here
- It is recommended to set the following fields:
    - `service.name` indicates which application the JVM data comes from
    - `env` indicates which environment of an application the JVM data comes from (e.g. `prod`/`test`/`preprod`, etc.)
- The meaning of several options here:
    - `-Ddd.jmxfetch.check-period` denotes the collection frequency, in milliseconds
    - `-Ddd.jmxfetch.statsd.host=127.0.0.1` indicates the connection address of the statsd collector on the DataKit
    - `-Ddd.jmxfetch.statsd.port=8125` indicates the UDP connection port of the statsd collector on the DataKit, which defaults to 8125
    - `-Ddd.trace.health.xxx` controls the collection and sending of ddtrace's own metrics
    - If you want to turn on link tracing (APM), you can append the following two parameters (the DataKit HTTP address):
        - `-Ddd.agent.host=localhost`
        - `-Ddd.agent.port=9529`

Once the application is started this way, the JVM metrics exposed by DDTrace can be collected.
Info
The metrics actually collected are those described in DataDog's documentation.
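When the launch command is generated by a deployment script, the `-Ddd.*` flags can be assembled from a dictionary so that each option is set in one place. The following is only a sketch: `build_java_command` is a hypothetical helper, and the option values are copied from the example command above.

```python
import shlex


def build_java_command(agent_jar, app_jar, options):
    """Turn {'service.name': 'my-app'} into -Ddd.service.name=my-app flags
    and wrap them in a full `java` command line (as an argument list)."""
    cmd = ["java", f"-javaagent:{agent_jar}"]
    for key, value in options.items():
        cmd.append(f"-Ddd.{key}={value}")
    cmd += ["-jar", app_jar]
    return cmd


OPTIONS = {
    "service.name": "my-app",
    "env": "staging",
    "agent.host": "localhost",
    "agent.port": 9529,
    "jmxfetch.enabled": "true",
    "jmxfetch.check-period": 1000,       # collection period in milliseconds
    "jmxfetch.statsd.host": "127.0.0.1",
    "jmxfetch.statsd.port": 8125,
}

cmd = build_java_command("dd-java-agent.jar", "your-app.jar", OPTIONS)
print(shlex.join(cmd))  # shell-quoted form, for logging or copy and paste
```

The argument list can be passed directly to `subprocess.run(cmd)`, which avoids shell quoting issues.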
Metric
See here
The following metrics deserve a special explanation: `gc_major_collection_count`, `gc_minor_collection_count`, `gc_major_collection_time` and `gc_minor_collection_time`.
These metrics are counters. Each time they are collected, the previous result is subtracted from the current value and the difference is divided by the elapsed time.
As a result, the reported values are rates of change per second, not the actual values held in the JVM's MBeans.
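The subtraction and division described above can be sketched as follows. The function name and the sample values are made up for illustration, and the sketch does not handle a counter reset (for example after a JVM restart), which would produce a negative rate.

```python
def per_second_rate(prev_value, prev_ts, cur_value, cur_ts):
    """Convert two samples of a monotonically increasing counter into a
    per-second rate. Timestamps are in seconds."""
    elapsed = cur_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing timestamps")
    return (cur_value - prev_value) / elapsed


# The MBean reported 120 minor collections at t=0s and 150 at t=10s,
# so the reported value is 3.0 collections per second, not 150.
rate = per_second_rate(120, 0.0, 150, 10.0)
```

This is why a dashboard built on these metrics shows small fractional values even on a JVM whose MBean counters are in the thousands.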
Collect JVM Metrics Through Jolokia
The JVM collector can gather many metrics through JMX and report them to Guance to help analyze how Java applications are running.
Jolokia Config
Preconditions
Install or download Jolokia. A Jolokia jar package is already available in the data directory under the DataKit installation directory. Start the Java application with the Jolokia agent enabled.
Tested versions:
- JDK 20
- JDK 17
- JDK 11
- JDK 8
Go to the conf.d/samples directory under the DataKit installation directory, copy jvm.conf.sample and name the copy jvm.conf. An example follows:
[[inputs.jvm]]
# default_tag_prefix = ""
# default_field_prefix = ""
# default_field_separator = "."
# username = ""
# password = ""
# response_timeout = "5s"
## Optional TLS config
# tls_ca = "/var/private/ca.pem"
# tls_cert = "/var/private/client.pem"
# tls_key = "/var/private/client-key.pem"
# insecure_skip_verify = false
## Monitor Interval
# interval = "60s"
# Add agents URLs to query
urls = ["http://localhost:8080/jolokia"]
## v2+ override all measurement names to "jvm", default: v2
## If you want to use the old metric set, you can change it to "v1"
measurement_version = "v2"
## Add metrics to read
[[inputs.jvm.metric]]
name = "java_runtime"
mbean = "java.lang:type=Runtime"
paths = ["Uptime"]
[[inputs.jvm.metric]]
name = "java_memory"
mbean = "java.lang:type=Memory"
paths = ["HeapMemoryUsage", "NonHeapMemoryUsage", "ObjectPendingFinalizationCount"]
[[inputs.jvm.metric]]
name = "java_garbage_collector"
mbean = "java.lang:name=*,type=GarbageCollector"
paths = ["CollectionTime", "CollectionCount"]
tag_keys = ["name"]
[[inputs.jvm.metric]]
name = "java_threading"
mbean = "java.lang:type=Threading"
paths = ["TotalStartedThreadCount", "ThreadCount", "DaemonThreadCount", "PeakThreadCount"]
[[inputs.jvm.metric]]
name = "java_class_loading"
mbean = "java.lang:type=ClassLoading"
paths = ["LoadedClassCount", "UnloadedClassCount", "TotalLoadedClassCount"]
[[inputs.jvm.metric]]
name = "java_memory_pool"
mbean = "java.lang:name=*,type=MemoryPool"
paths = ["Usage", "PeakUsage", "CollectionUsage"]
tag_keys = ["name"]
[inputs.jvm.tags]
# some_tag = "some_value"
# more_tag = "some_other_value"
# ...
After configuration, restart DataKit.
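To check by hand that the Jolokia endpoint listed in `urls` answers, the MBeans and attributes from the configuration can be requested directly. The sketch below builds a bulk read request in the JSON form of the Jolokia protocol and posts it with the standard library. The function names are hypothetical, only two of the configured MBeans are listed, and DataKit may issue its requests differently.

```python
import json
import urllib.request

# MBeans and attributes copied from the [[inputs.jvm.metric]] entries above.
METRICS = [
    {"mbean": "java.lang:type=Memory",
     "paths": ["HeapMemoryUsage", "NonHeapMemoryUsage"]},
    {"mbean": "java.lang:type=Threading",
     "paths": ["ThreadCount", "DaemonThreadCount"]},
]


def build_read_requests(metrics):
    """One Jolokia 'read' request per MBean, asking for several attributes."""
    return [
        {"type": "read", "mbean": m["mbean"], "attribute": m["paths"]}
        for m in metrics
    ]


def query_jolokia(url, metrics, timeout=5.0):
    """POST the bulk request and return the decoded JSON response."""
    body = json.dumps(build_read_requests(metrics)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

With the sample configuration, `query_jolokia("http://localhost:8080/jolokia", METRICS)` should return one response object per request when the agent is reachable.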
Jolokia Metric
All data collected below carries a global tag named host by default (its value is the host name of the machine where DataKit runs). Other tags can be specified in the configuration through [inputs.jvm.tags].
jvm
JVM runtime, memory, garbage collector, threading, class loading, and memory-pool statistics collected through Jolokia Java platform MXBeans.
| Tags & Fields | Description |
|---|---|
| host (tag) | Hostname reported by the Jolokia agent or proxy. |
| jolokia_agent_url (tag) | Jolokia agent URL used to collect the JVM metrics. |
| name (tag) | Garbage collector or memory pool name associated with fields tagged by name. |
| CollectionCount | Total number of garbage collections that have occurred for the collector. Type: int (count). Unit: count. Tagged by: name. |
| CollectionTime | Approximate accumulated elapsed time spent in garbage collection. Type: int (count). Unit: time,ms. Tagged by: name. |
| CollectionUsagecommitted | Committed memory in the post-GC collection usage snapshot for this memory pool. Type: float (gauge). Unit: digital,B. Tagged by: name. |
| CollectionUsageinit | Initial memory size in the post-GC collection usage snapshot for this memory pool. Type: float (gauge). Unit: digital,B. Tagged by: name. |
| CollectionUsagemax | Maximum memory in the post-GC collection usage snapshot for this memory pool, or -1 if undefined. Type: float (gauge). Unit: digital,B. Tagged by: name. |
| CollectionUsageused | Used memory in the post-GC collection usage snapshot for this memory pool. Type: float (gauge). Unit: digital,B. Tagged by: name. |
| DaemonThreadCount | Current number of live daemon threads. Type: int (gauge). Unit: count. |
| HeapMemoryUsagecommitted | Heap memory currently committed for JVM use. Type: int (gauge). Unit: digital,B. |
| HeapMemoryUsageinit | Initial heap memory size requested by the JVM. Type: int (gauge). Unit: digital,B. |
| HeapMemoryUsagemax | Maximum heap memory available to the JVM, or -1 if undefined. Type: int (gauge). Unit: digital,B. |
| HeapMemoryUsageused | Current heap memory used by the JVM. Type: int (gauge). Unit: digital,B. |
| LoadedClassCount | Current number of classes loaded in the JVM. Type: int (gauge). Unit: count. |
| NonHeapMemoryUsagecommitted | Non-heap memory currently committed for JVM use. Type: int (gauge). Unit: digital,B. |
| NonHeapMemoryUsageinit | Initial non-heap memory size requested by the JVM. Type: int (gauge). Unit: digital,B. |
| NonHeapMemoryUsagemax | Maximum non-heap memory available to the JVM, or -1 if undefined. Type: int (gauge). Unit: digital,B. |
| NonHeapMemoryUsageused | Current non-heap memory used by the JVM. Type: int (gauge). Unit: digital,B. |
| ObjectPendingFinalizationCount | Approximate current number of objects pending finalization. Type: int (gauge). Unit: count. |
| PeakThreadCount | Peak live thread count since the JVM started or the peak was reset. Type: int (gauge). Unit: count. |
| PeakUsagecommitted | Committed memory in the peak usage snapshot for this memory pool. Type: int (gauge). Unit: digital,B. Tagged by: name. |
| PeakUsageinit | Initial memory size in the peak usage snapshot for this memory pool. Type: int (gauge). Unit: digital,B. Tagged by: name. |
| PeakUsagemax | Maximum memory in the peak usage snapshot for this memory pool, or -1 if undefined. Type: int (gauge). Unit: digital,B. Tagged by: name. |
| PeakUsageused | Used memory in the peak usage snapshot for this memory pool. Type: int (gauge). Unit: digital,B. Tagged by: name. |
| ThreadCount | Current number of live threads. Type: int (gauge). Unit: count. |
| TotalLoadedClassCount | Total number of classes loaded since the JVM started. Type: int (count). Unit: count. |
| TotalStartedThreadCount | Total number of threads created and started since the JVM started. Type: int (count). Unit: count. |
| UnloadedClassCount | Total number of classes unloaded since the JVM started. Type: int (count). Unit: count. |
| Uptime | Elapsed time since the JVM started. Type: int (gauge). Unit: time,ms. |
| Usagecommitted | Memory currently committed for this memory pool. Type: int (gauge). Unit: digital,B. Tagged by: name. |
| Usageinit | Initial memory size requested for this memory pool. Type: int (gauge). Unit: digital,B. Tagged by: name. |
| Usagemax | Maximum memory available for this memory pool, or -1 if undefined. Type: int (gauge). Unit: digital,B. Tagged by: name. |
| Usageused | Current memory used by this memory pool. Type: int (gauge). Unit: digital,B. Tagged by: name. |
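Field names such as HeapMemoryUsageused appear to be the attribute name from `paths` joined with a key of the composite value that the MBean returns (init, committed, max, used). A minimal sketch of that flattening follows, assuming an empty separator as the table suggests. The function name and the sample numbers are made up, and the collector's actual naming logic may differ.

```python
def flatten(attribute, value, separator=""):
    """Expand a composite MBean value into one field per key.
    Scalar values are returned unchanged under the attribute name."""
    if isinstance(value, dict):
        return {f"{attribute}{separator}{key}": v for key, v in value.items()}
    return {attribute: value}


# Shape of a java.lang:type=Memory / HeapMemoryUsage value (sample numbers).
heap = {"init": 268435456, "committed": 301989888,
        "max": 4294967296, "used": 52428800}

fields = {}
fields.update(flatten("HeapMemoryUsage", heap))
fields.update(flatten("ObjectPendingFinalizationCount", 0))
```

Here `fields` contains HeapMemoryUsageinit, HeapMemoryUsagecommitted, HeapMemoryUsagemax and HeapMemoryUsageused alongside the scalar ObjectPendingFinalizationCount.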
collector
Shared collector availability metric emitted by multiple collectors to report whether the target was successfully scraped.
| Tags & Fields | Description |
|---|---|
| instance (tag) | Server address of the instance. |
| job (tag) | Server name of the instance. |
| up | Whether the collector successfully scraped the target during the last collection cycle: 1 means true and 0 means false. Type: int (gauge). Unit: bool. |