# SDK Metrics System
## Concepts
### Metric
* A measure of some aspect of the SDK. Examples include request latency, number
  of pooled connections and retries executed.

* A metric is associated with a category. Some of the metric categories are
  `Default`, `HttpClient` and `Streaming`. This lets customers enable
  metrics only for the categories they are interested in.

Refer to the [Metrics List](./MetricsList.md) document for a complete list of
standard metrics collected by the SDK.

### Metric Collector

* `MetricCollector` is a typesafe aggregator of metrics. This is the primary
  interface through which other SDK components report metrics they emit, using
  the `reportMetric(SdkMetric,Object)` method.

* `MetricCollector` objects allow for nesting. This enables metrics to be
  collected in the context of other metric events. For example, for a single
  API call, there may be multiple request attempts if there are retries. Each
  attempt's associated metric events can be stored in their own
  `MetricCollector`, all of which are children of another collector that
  represents metrics for the entire API call.

  A child of a collector is created by calling its `childCollector(String)`
  method.

* The `collect()` method returns a `MetricCollection`. This object is
  essentially an immutable version of the tree formed by the collector and its
  children, which are also represented by `MetricCollection` objects.

  Note that calling `collect()` implies that child collectors are also
  collected.

* Each collector has a name. Often this will be used to describe the class of
  metrics that it collects, e.g. `"ApiCall"` and `"ApiCallAttempt"`.

* [Interface prototype](prototype/MetricCollector.java)

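The nesting and `collect()` behavior described above can be illustrated with a minimal, self-contained sketch. The `SketchMetricCollector` and `SketchMetricCollection` types are hypothetical stand-ins written for this example, not the prototype interfaces; they use plain string keys instead of typed `SdkMetric` keys, and the metric names (`RetryCount`, `HttpStatusCode`) are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CollectorDemo {
    // Builds one "ApiCall" collector with two "ApiCallAttempt" children.
    static SketchMetricCollection buildTree() {
        SketchMetricCollector apiCall = new SketchMetricCollector("ApiCall");
        apiCall.reportMetric("RetryCount", 1);

        SketchMetricCollector firstAttempt = apiCall.childCollector("ApiCallAttempt");
        firstAttempt.reportMetric("HttpStatusCode", 503);

        SketchMetricCollector secondAttempt = apiCall.childCollector("ApiCallAttempt");
        secondAttempt.reportMetric("HttpStatusCode", 200);

        // Collecting the parent also collects both children.
        return apiCall.collect();
    }

    public static void main(String[] args) {
        SketchMetricCollection tree = buildTree();
        System.out.println(tree.name + " " + tree.metrics);
        for (SketchMetricCollection child : tree.children) {
            System.out.println("  " + child.name + " " + child.metrics);
        }
    }
}

// Hypothetical immutable snapshot of a collector and its children.
final class SketchMetricCollection {
    final String name;
    final Map<String, Object> metrics;
    final List<SketchMetricCollection> children;

    SketchMetricCollection(String name, Map<String, Object> metrics,
                           List<SketchMetricCollection> children) {
        this.name = name;
        // Defensive copies wrapped in unmodifiable views keep the snapshot immutable.
        this.metrics = Collections.unmodifiableMap(new LinkedHashMap<>(metrics));
        this.children = Collections.unmodifiableList(new ArrayList<>(children));
    }
}

// Hypothetical mutable collector; not thread-safe, which a real implementation may need to be.
final class SketchMetricCollector {
    private final String name;
    private final Map<String, Object> metrics = new LinkedHashMap<>();
    private final List<SketchMetricCollector> children = new ArrayList<>();

    SketchMetricCollector(String name) {
        this.name = name;
    }

    void reportMetric(String metric, Object value) {
        metrics.put(metric, value);
    }

    SketchMetricCollector childCollector(String childName) {
        SketchMetricCollector child = new SketchMetricCollector(childName);
        children.add(child);
        return child;
    }

    SketchMetricCollection collect() {
        List<SketchMetricCollection> collectedChildren = new ArrayList<>();
        for (SketchMetricCollector child : children) {
            collectedChildren.add(child.collect());
        }
        return new SketchMetricCollection(name, metrics, collectedChildren);
    }
}
```

Because the snapshot copies its inputs, metrics reported to a collector after `collect()` returns do not change a collection that was already handed out.
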
### MetricPublisher

* A `MetricPublisher` publishes collected metrics to one or more systems outside
  of the SDK. It takes a `MetricCollection` object, potentially transforms the
  data into richer metrics, and converts it into the format the receiver expects.

* By default, the SDK will provide implementations to publish metrics to [Amazon
  CloudWatch](https://aws.amazon.com/cloudwatch/) and [Client Side
  Monitoring](https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/sdk-metrics.html)
  (also known as AWS SDK Metrics for Enterprise Support).

* Metric publishers are pluggable within the SDK, allowing customers to
  provide their own custom implementations.

* Metric publishers can differ in the list of metrics they publish, their
  publishing frequency, the configuration they need in order to publish, etc.

* [Interface prototype](prototype/MetricPublisher.java)

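To make the "pluggable publisher" idea concrete, here is a minimal sketch of a custom publisher. The `SketchMetricPublisher` interface is a hypothetical simplification invented for this example (the prototype interface takes a `MetricCollection`; this one takes a collection name and a flat map), and the `name.metric=value` line format is an assumed receiver format.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PublisherDemo {
    public static void main(String[] args) {
        InMemoryPublisher publisher = new InMemoryPublisher();

        Map<String, Object> metrics = new LinkedHashMap<>();
        metrics.put("ApiCallDuration", 42L);
        metrics.put("RetryCount", 1);

        publisher.publish("ApiCall", metrics);
        publisher.lines().forEach(System.out::println);
        publisher.close();
    }
}

// Hypothetical, simplified publisher contract used only in this sketch.
interface SketchMetricPublisher extends AutoCloseable {
    void publish(String collectionName, Map<String, Object> metrics);

    @Override
    void close();
}

// Transforms metrics into the format an (imaginary) receiver expects and stores them in memory.
final class InMemoryPublisher implements SketchMetricPublisher {
    private final List<String> lines = new ArrayList<>();
    private boolean closed;

    @Override
    public void publish(String collectionName, Map<String, Object> metrics) {
        if (closed) {
            throw new IllegalStateException("publisher is closed");
        }
        for (Map.Entry<String, Object> entry : metrics.entrySet()) {
            lines.add(collectionName + "." + entry.getKey() + "=" + entry.getValue());
        }
    }

    // Returns a copy so callers cannot mutate the publisher's internal state.
    List<String> lines() {
        return new ArrayList<>(lines);
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

A real publisher would replace the in-memory list with a call to the external system and would typically batch or buffer before sending.
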
## Enabling Metrics

The metrics feature is disabled by default. Metrics can be enabled and configured in the following ways:

### Option 1: Configuring MetricPublishers on a request

A publisher can be configured directly on the `RequestOverrideConfiguration`:

```java
MetricPublisher metricPublisher = CloudWatchMetricPublisher.create();
DynamoDbClient dynamoDb = DynamoDbClient.create();
dynamoDb.listTables(ListTablesRequest.builder()
                                     .overrideConfiguration(c -> c.addMetricPublisher(metricPublisher))
                                     .build());
```

The methods exposed for setting metric publishers follow the pattern established by `ExecutionInterceptor`s:

```java
class RequestOverrideConfiguration {
    // ...
    class Builder {
        // ...
        Builder metricPublishers(List<MetricPublisher> metricsPublishers);
        Builder addMetricPublisher(MetricPublisher metricsPublisher);
    }
}
```

### Option 2: Configuring MetricPublishers on a client

A publisher can be configured directly on the `ClientOverrideConfiguration`. A publisher specified in this way has
lower priority than one specified using **Option 1** above.

```java
MetricPublisher metricPublisher = CloudWatchMetricPublisher.create();
DynamoDbClient dynamoDb = DynamoDbClient.builder()
                                        .overrideConfiguration(c -> c.addMetricPublisher(metricPublisher))
                                        .build();
```

The methods exposed for setting metric publishers follow the pattern established by `ExecutionInterceptor`s:

```java
class ClientOverrideConfiguration {
    // ...
    class Builder {
        // ...
        Builder metricPublishers(List<MetricPublisher> metricsPublishers);
        Builder addMetricPublisher(MetricPublisher metricsPublisher);
    }
}
```

**Note:** As with the `httpClient` setting, calling `close()` on the `DynamoDbClient` *will not* close the configured
`metricPublishers`. You must close the `metricPublishers` yourself when you're done using them.

### Option 3: Configuring MetricPublishers using System Properties or Environment Variables

This option allows the customer to enable metric publishing by default, without needing to enable it via **Option 1**
or **Option 2** above. This means that a customer can enable metrics without changing their runtime code.

This option is enabled using an environment variable or system property. If both are specified, the system property
is used. If metrics are enabled at the client level using **Option 2** above, this option is ignored. Specifying
metric publishers at request time using **Option 1** overrides any publishers that have been enabled globally.

**System Property:** `aws.metricPublishingEnabled=true`

**Environment Variable:** `AWS_METRIC_PUBLISHING_ENABLED=true`

The value specified must be either `"true"` or `"false"`. Specifying any other string value results in
a value of `"false"` being used, and a warning being logged each time an SDK client is created.

When the value is `"false"`, no metrics will be published by a client.

When the value is `"true"`, metrics will be published by every client to a set of "global metric publishers". The set
of global metric publishers is loaded automatically using the same mechanism currently used to discover HTTP
clients. This means that including the `cloudwatch-metric-publisher` module and enabling the system property or
environment variable above is sufficient to enable metric publishing to CloudWatch on all AWS clients.

The set of global metric publishers is static and is used for *all* AWS SDK clients instantiated by the application
(while **Option 3** remains enabled). A JVM shutdown hook will be registered to invoke `MetricPublisher.close()` on
every publisher (in case the publishers use non-daemon threads that would otherwise block JVM shutdown).

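The flag resolution rules above (system property wins over environment variable; anything other than `"true"` or `"false"` falls back to `false` with a warning) could be sketched roughly as follows. The `MetricPublishingFlag` class and its method names are hypothetical, warnings are collected in a list rather than sent to a logger, and exact, case-sensitive matching of `"true"` and `"false"` is an assumption this document does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; only the property and variable names come from this document.
final class MetricPublishingFlag {
    static final String SYSTEM_PROPERTY = "aws.metricPublishingEnabled";
    static final String ENVIRONMENT_VARIABLE = "AWS_METRIC_PUBLISHING_ENABLED";

    // Pure function so the precedence rules can be exercised without touching the real environment.
    static boolean resolve(String systemProperty, String environmentVariable, List<String> warnings) {
        // The system property takes precedence when both are specified.
        String raw = systemProperty != null ? systemProperty : environmentVariable;
        if (raw == null) {
            return false; // the metrics feature is disabled by default
        }
        if ("true".equals(raw)) {
            return true;
        }
        if ("false".equals(raw)) {
            return false;
        }
        // Any other value is treated as "false", with a warning.
        warnings.add("Unsupported metric publishing value '" + raw + "', using 'false'");
        return false;
    }

    // Reads the actual JVM system property and process environment.
    static boolean resolveFromRuntime(List<String> warnings) {
        return resolve(System.getProperty(SYSTEM_PROPERTY), System.getenv(ENVIRONMENT_VARIABLE), warnings);
    }

    public static void main(String[] args) {
        List<String> warnings = new ArrayList<>();
        System.out.println("enabled: " + resolveFromRuntime(warnings));
        warnings.forEach(System.out::println);
    }
}
```
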
#### Updating a MetricPublisher to work as a global metric publisher

**Option 3** above references the concept of "global metric publishers", which are a set of publishers that are
discovered automatically by the SDK. This section outlines how global metric publishers are discovered and created.

Each `MetricPublisher` that supports loading when **Option 3** is enabled must:

1. Provide an `SdkMetricPublisherService` implementation. An `SdkMetricPublisherService` implementation is a class with
a zero-arg constructor, used to instantiate a specific type of `MetricPublisher` (e.g. a
`CloudWatchSdkMetricPublisherService` that is a factory for `CloudWatchMetricPublisher`s).
2. Provide a resource file: `META-INF/services/software.amazon.awssdk.metrics.SdkMetricPublisherService`. This file
contains the list of fully-qualified `SdkMetricPublisherService` implementation class names.

The `software.amazon.awssdk.metrics.SdkMetricPublisherService` interface that must be implemented by all global metric
publisher candidates is defined as:

```java
public interface SdkMetricPublisherService {
    MetricPublisher createMetricPublisher();
}
```

**`SdkMetricPublisherService` Example**

Enabling the `CloudWatchMetricPublisher` as a global metric publisher can be done by implementing the
`SdkMetricPublisherService` interface:

```java
package software.amazon.awssdk.metrics.publishers.cloudwatch;

// MetricPublisher is assumed to live in the same package as SdkMetricPublisherService.
import software.amazon.awssdk.metrics.MetricPublisher;
import software.amazon.awssdk.metrics.SdkMetricPublisherService;

public final class CloudWatchSdkMetricPublisherService implements SdkMetricPublisherService {
    @Override
    public MetricPublisher createMetricPublisher() {
        return CloudWatchMetricPublisher.create();
    }
}
```

And creating a `META-INF/services/software.amazon.awssdk.metrics.SdkMetricPublisherService` resource file in the
`cloudwatch-metric-publisher` module with the following contents:

```
software.amazon.awssdk.metrics.publishers.cloudwatch.CloudWatchSdkMetricPublisherService
```

#### Option 3 Implementation Details and Edge Cases

**How the SDK loads `MetricPublisher`s when Option 3 is enabled**

When a client is created with **Option 3** enabled (and **Option 2** "not specified"), the client retrieves the list of
global metric publishers to use via a static "global metric publisher list" singleton. This singleton is initialized
exactly once using the following process:

1. The singleton uses `java.util.ServiceLoader` to locate all `SdkMetricPublisherService` implementations configured
as described above. The classloader used with the service loader is chosen in the same manner as the one chosen for the
HTTP client service loader (`software.amazon.awssdk.core.internal.http.loader.SdkServiceLoader`). That is, the first
classloader present in the following list: (1) the classloader that loaded the SDK, (2) the current thread's classloader,
then (3) the system classloader.
2. The singleton creates an instance of every `SdkMetricPublisherService` located in this manner.
3. The singleton creates one `MetricPublisher` from each service instance by calling `createMetricPublisher()`.

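The loading process above can be sketched with `java.util.ServiceLoader` and the initialization-on-demand holder idiom, which the JVM guarantees runs exactly once. The `GlobalPublisher`, `GlobalPublisherService` and `GlobalMetricPublishers` names are hypothetical stand-ins for this example, and the holder idiom is one possible way to get once-only initialization, not necessarily the one the SDK uses. The shutdown hook mirrors the one described under **Option 3**.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.ServiceLoader;

final class GlobalPublishersDemo {
    public static void main(String[] args) {
        // With no META-INF/services entry for GlobalPublisherService, nothing is discovered.
        System.out.println("global publishers found: " + GlobalMetricPublishers.get().size());
    }
}

// Hypothetical stand-ins for MetricPublisher and SdkMetricPublisherService.
interface GlobalPublisher {
    void close();
}

interface GlobalPublisherService {
    GlobalPublisher createMetricPublisher();
}

final class GlobalMetricPublishers {
    private GlobalMetricPublishers() {
    }

    // The nested holder class is initialized lazily, and only once, on first access.
    private static final class Holder {
        static final List<GlobalPublisher> PUBLISHERS = load();
    }

    static List<GlobalPublisher> get() {
        return Holder.PUBLISHERS;
    }

    // First classloader present: (1) the one that loaded this class, (2) the thread's, (3) the system one.
    static ClassLoader chooseClassLoader() {
        ClassLoader own = GlobalMetricPublishers.class.getClassLoader();
        if (own != null) {
            return own;
        }
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        if (context != null) {
            return context;
        }
        return ClassLoader.getSystemClassLoader();
    }

    private static List<GlobalPublisher> load() {
        List<GlobalPublisher> publishers = new ArrayList<>();
        // ServiceLoader instantiates each service through its zero-arg constructor.
        for (GlobalPublisherService service
                : ServiceLoader.load(GlobalPublisherService.class, chooseClassLoader())) {
            publishers.add(service.createMetricPublisher());
        }
        // Close every publisher on JVM shutdown so non-daemon threads cannot block exit.
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> publishers.forEach(GlobalPublisher::close)));
        return Collections.unmodifiableList(publishers);
    }
}
```
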
**How Option 3 and Option 1 behave when Option 2 is "not specified"**

The SDK treats **Option 3** as the default set of client-level metric publishers to be
used when **Option 2** is "not specified". This means that if a customer: (1) enables global metric publishing using
**Option 3**, (2) does not specify client-level publishers using **Option 2**, and (3) specifies metric publishers at
the request level with **Option 1**, then the global metric publishers are still *instantiated* but will not be used.
This nuance prevents the SDK from needing to consult the global metric configuration with every request.

**How Option 2 is considered "not specified" for the purposes of considering Option 3**

Global metric publishers (**Option 3**) are only considered for use when **Option 2** is "not specified".

"Not specified" is defined to be when the customer either: (1) does not invoke
`ClientOverrideConfiguration.Builder.addMetricPublisher()` / `ClientOverrideConfiguration.Builder.metricPublishers()`,
or (2) invokes `ClientOverrideConfiguration.Builder.metricPublishers(null)` as the last `metricPublisher`-mutating
action on the client override configuration builder.

This definition purposefully excludes `ClientOverrideConfiguration.Builder.metricPublishers(emptyList())`. Setting
the `metricPublishers` to an empty list is equivalent to setting the `metricPublishers` to the `NoOpMetricPublisher`.

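The precedence between the three options can be summarized in a small sketch. Publishers are represented by plain strings for brevity, and `PublisherResolution` with its method names is hypothetical. Treating a `null` request-level list as "no request override" is an assumption made for the example; the document defines "not specified" only for the client level.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper illustrating Option 1 > Option 2 > Option 3.
final class PublisherResolution {
    private PublisherResolution() {
    }

    // Resolved once, when the client is built.
    static List<String> clientLevel(List<String> clientConfigured,
                                    boolean globalEnabled,
                                    List<String> globalPublishers) {
        if (clientConfigured != null) {
            // Option 2 was specified. An empty list means "publish nothing" (no-op), not "fall back".
            return clientConfigured;
        }
        // Option 2 is "not specified": Option 3 supplies the default, if it is enabled.
        return globalEnabled ? globalPublishers : Collections.emptyList();
    }

    // Resolved per request, without consulting the global configuration again.
    static List<String> forRequest(List<String> requestConfigured, List<String> clientLevel) {
        return requestConfigured != null ? requestConfigured : clientLevel;
    }

    public static void main(String[] args) {
        List<String> global = Arrays.asList("cloudwatch");
        List<String> client = clientLevel(null, true, global);
        System.out.println("client level: " + client);
        System.out.println("request level: " + forRequest(Arrays.asList("custom"), client));
    }
}
```
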
**Implementing an SdkMetricPublisherService that depends on AWS clients**

Any `MetricPublisher` that supports creation via an `SdkMetricPublisherService` and depends on an AWS service client
**must** disable metric publishing on those AWS service clients using **Option 2** (for example, by setting their
`metricPublishers` to an empty list) when it is created via the `SdkMetricPublisherService`. This prevents a scenario
where the global metric publisher singleton's initialization process depends on the singleton already being initialized.

## Modules
New modules are created to support the metrics feature.

### metrics-spi
* Contains the metrics interfaces and default implementations that don't require other dependencies
* This is a sub-module under `core`
* `sdk-core` has a dependency on `metrics-spi`, so customers will automatically get a dependency on this module.

### metrics-publishers
* This is a new module that contains implementations of all SDK-supported publishers
* Under this module, a new sub-module is created for each publisher (`cloudwatch-publisher`, `csm-publisher`)
* Customers have to **explicitly add a dependency** on these modules to use the SDK-provided publishers

## Performance
One of the main tenets for metrics is "Enabling default metrics should have
minimal impact on the application performance". The following design choices are
made to ensure enabling metrics does not affect performance significantly.

* When collecting metrics, a no-op metric collector is used if metrics are
  disabled. All methods in this collector are no-ops and return immediately.

* Metric publisher implementations can involve network calls and impact latency
  if done in a blocking way. Therefore, all SDK publisher implementations will
  process the metrics asynchronously so that they do not block the request thread.

* Performance tests will be written and run with each release to ensure that the
  SDK performs well even when metrics are enabled and being collected and
  published.

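The asynchronous hand-off described in the second bullet could look roughly like the sketch below. `AsyncPublisher` is a hypothetical class, metrics are plain strings, and appending to a list stands in for the network call. A single daemon worker thread is one possible design, chosen here so that the publisher never blocks JVM shutdown.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class AsyncPublishDemo {
    public static void main(String[] args) {
        AsyncPublisher publisher = new AsyncPublisher();
        publisher.publish("ApiCallDuration=42"); // returns immediately
        publisher.close();                       // waits for queued work to drain
        System.out.println(publisher.delivered());
    }
}

// Hypothetical publisher that moves the slow work off the request thread.
final class AsyncPublisher implements AutoCloseable {
    private final List<String> delivered = new CopyOnWriteArrayList<>();

    // One daemon worker preserves submission order and cannot keep the JVM alive.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
        Thread thread = new Thread(runnable, "metric-publisher");
        thread.setDaemon(true);
        return thread;
    });

    // Called on the request thread: only enqueues the work.
    void publish(String metric) {
        executor.execute(() -> delivered.add(metric)); // stand-in for a network call
    }

    List<String> delivered() {
        return delivered;
    }

    @Override
    public void close() {
        executor.shutdown();
        try {
            // Give already-queued metrics a bounded amount of time to be sent.
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```
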