Use the metrics API
The metrics API allows you to monitor and manage applications and infrastructure. It collects and aggregates data in real time and provides insight into your system's performance. DevOps teams can use the API to set up alerts, troubleshoot issues, and optimize resource allocation. Because the API exposes these performance metrics, you can detect anomalies and bottlenecks sooner and maintain the system proactively. The metrics API also integrates with a wide range of monitoring and visualization tools, which makes complex systems easier to observe.
As an administrator, you can use the metrics API with PromQL queries to directly query and obtain raw metrics that can be used by third-party viewers or graphical applications, such as Grafana, that may be running inside or outside of your cluster.
The metrics API collects information on services, users, and content items. It uses the Prometheus monitoring system to persist the collected information in a time-series database.
See Service metrics for more information about using the metrics API to optimize and troubleshoot services. See View metrics for more information about visualizing information collected by the metrics API.
Access the metrics API
The metrics API is exposed in the usagestatistics API in the ArcGIS Enterprise Administrator directory. To access the metrics API, follow these steps:
Go to the usagestatistics API in the administrator directory:
https://organization.example.com/<context>/admin/usagestatistics
Click Rest Metrics API.
The metrics API application opens on a new tab.
Sign in to the Prometheus web page.
By default, system-generated credentials are used to sign in. Before you sign in to the Prometheus dashboard, configure your organization and update your credentials. To update these credentials, use the Update Credentials operation in admin/usagestatistics.
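Third-party tools typically read Prometheus data through its HTTP API, where an instant query is a request to /api/v1/query with the PromQL expression in the query parameter. The following Python sketch only builds such a URL. The base address prometheus.example.com is a placeholder, and this topic does not state whether the page opened from Rest Metrics API serves the HTTP API at the same address, so treat that as an assumption to verify in your deployment. Authentication is intentionally left out.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace it with the address of your Prometheus page.
PROMETHEUS_BASE_URL = "https://prometheus.example.com"


def build_instant_query_url(promql: str, base_url: str = PROMETHEUS_BASE_URL) -> str:
    """Return a Prometheus HTTP API instant-query URL for a PromQL expression."""
    # urlencode percent-encodes characters such as parentheses, braces, and quotes.
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})


if __name__ == "__main__":
    print(build_instant_query_url("sum(requests_total)"))
```

The resulting URL can be passed to any HTTP client or configured as part of a data source in a viewer such as Grafana, together with the credentials you set through the Update Credentials operation.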
Information exposed by the metrics API
You can use Prometheus to retrieve metrics for ArcGIS Enterprise. The following tables summarize the available metrics and labels, which you can use in PromQL queries and result filters to access the information you need.
Metrics with the _created suffix report the Unix time, in seconds, at which the corresponding histogram or summary metric was first collected. This value provides insight into the life cycle of a metric. These metrics are not included in the tables below.
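Because a _created value is a Unix timestamp in seconds, it usually needs to be converted before it is readable. A minimal Python sketch of that conversion follows; the sample value is illustrative and was not taken from a real deployment.

```python
from datetime import datetime, timezone


def created_to_utc(created_seconds: float) -> datetime:
    """Convert the value of a *_created metric (Unix seconds) to a UTC datetime."""
    return datetime.fromtimestamp(created_seconds, tz=timezone.utc)


if __name__ == "__main__":
    sample_value = 1700000000.0  # illustrative value only
    print(created_to_utc(sample_value).isoformat())
```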
Service metrics
Service metrics are reported for each published service.
| Name | Description | Type | Labels |
|---|---|---|---|
| requests_failed_total | Number of failed requests | counter | apiType, folderName, instance, instancePool, job, nodeName, operation, podName, responseCached, serviceName, serviceType, username |
| requests_response_time_seconds_bucket | Cumulative number of observed requests completed in less than or equal to the bucket's upper bound, in seconds | histogram | apiType, folderName, instance, instancePool, job, le, nodeName, operation, podName, responseCached, serviceName, serviceType |
| requests_response_time_seconds_count | Total number of observed request response times across all buckets | histogram | apiType, folderName, instance, instancePool, job, nodeName, operation, podName, responseCached, serviceName, serviceType |
| requests_response_time_seconds_sum | Cumulative response time of all observed requests, in seconds | histogram | apiType, folderName, instance, instancePool, job, nodeName, operation, podName, responseCached, serviceName, serviceType |
| requests_response_time_seconds_counter_total | Sum of response time of requests in seconds | counter | apiType, folderName, instance, instancePool, job, nodeName, operation, podName, responseCached, serviceName, serviceType, username |
| requests_succeeded_total | Number of succeeded requests | counter | apiType, folderName, instance, instancePool, job, nodeName, operation, podName, responseCached, serviceName, serviceType |
| requests_total | Number of requests | counter | apiType, folderName, instance, instancePool, job, nodeName, operation, podName, responseCached, serviceName, serviceType, username |
| requests_usage_timedout_total | Number of requests that timed out due to exceeding the maximum usage time | counter | apiType, folderName, nodeName, operation, podName, serviceName, serviceType, username |
| requests_wait_time_seconds_bucket | Cumulative number of requests that had to wait to be processed for a time less than or equal to the bucket's upper bound, in seconds | counter | apiType, folderName, instance, instancePool, job, le, nodeName, podName, serviceName, serviceType |
| requests_wait_time_seconds_count | Total number of observed request wait times across all buckets | counter | apiType, folderName, instance, instancePool, job, nodeName, podName, serviceName, serviceType |
| requests_wait_time_seconds_sum | Cumulative wait time of all observed requests, in seconds | counter | apiType, folderName, instance, instancePool, job, nodeName, podName, serviceName, serviceType |
| requests_wait_timedout_total | Number of requests that timed out due to exceeding the maximum wait time | counter | apiType, folderName, nodeName, operation, podName, serviceName, serviceType, username |
| up | Running pods | counter | instance, job |
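The service metrics above combine into a few common PromQL expressions: a failure ratio from the two request counters, an average response time from the histogram's _sum and _count series, and a percentile from its _bucket series. The Python sketch below only assembles the query strings, using the standard PromQL functions rate and histogram_quantile. The service name and the five-minute window are illustrative assumptions, not values from this topic.

```python
def _rate(metric: str, service_name: str, window: str) -> str:
    """Build rate(<metric>{serviceName="..."}[<window>]) for one service."""
    return "rate(" + metric + '{serviceName="' + service_name + '"}[' + window + "])"


def failure_ratio(service_name: str, window: str = "5m") -> str:
    """Share of requests that failed over the window."""
    failed = _rate("requests_failed_total", service_name, window)
    total = _rate("requests_total", service_name, window)
    return f"sum({failed}) / sum({total})"


def average_response_time(service_name: str, window: str = "5m") -> str:
    """Mean response time in seconds: total observed seconds / number of observations."""
    seconds = _rate("requests_response_time_seconds_sum", service_name, window)
    observations = _rate("requests_response_time_seconds_count", service_name, window)
    return f"sum({seconds}) / sum({observations})"


def response_time_quantile(quantile: float, service_name: str, window: str = "5m") -> str:
    """Estimated response-time quantile, aggregated over the le bucket label."""
    buckets = _rate("requests_response_time_seconds_bucket", service_name, window)
    return f"histogram_quantile({quantile}, sum by (le) ({buckets}))"


if __name__ == "__main__":
    service = "Hosted/SampleService"  # hypothetical serviceName value
    print(failure_ratio(service))
    print(average_response_time(service))
    print(response_time_quantile(0.95, service))
```

Aggregating with sum by (le) keeps the bucket boundaries that histogram_quantile needs while merging the series from individual pods and operations.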
Organization metrics
| Name | Description | Type | Labels |
|---|---|---|---|
| items_added_total | Total number of items added | counter | componentName, instance, job, nodeName |
| items_count | Total number of items | gauge | componentName, instance, job, nodeName |
| items_deleted_total | Total number of items deleted | counter | componentName, instance, job, nodeName |
| items_updated_total | Total number of items updated | counter | componentName, instance, job, nodeName |
| items_viewed_total | Total number of all viewed items | counter | componentName, instance, job, nodeName |
| users_added_total | Total number of users added | counter | componentName, instance, job, nodeName |
| users_count | Total number of users | gauge | componentName, instance, job, nodeName |
| users_login_total | Total number of user logins | counter | componentName, instance, job, nodeName |
| users_logout_total | Total number of user logouts | counter | componentName, instance, job, nodeName |
Note:
The items_count metric does not include Esri content items, such as Living Atlas content.
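The metric type determines how an organization metric is best queried. A counter only grows, so the useful figure is usually its growth over a period, which the PromQL increase function reports. A gauge already holds the current value and can be read directly. The Python sketch below picks the expression from the types in the table above; the one-day window is an assumption chosen for illustration.

```python
# Types copied from the organization metrics table.
ORGANIZATION_METRIC_TYPES = {
    "items_added_total": "counter",
    "items_count": "gauge",
    "items_deleted_total": "counter",
    "items_updated_total": "counter",
    "items_viewed_total": "counter",
    "users_added_total": "counter",
    "users_count": "gauge",
    "users_login_total": "counter",
    "users_logout_total": "counter",
}


def organization_query(metric: str, window: str = "1d") -> str:
    """Return a PromQL expression suited to the metric's type."""
    metric_type = ORGANIZATION_METRIC_TYPES[metric]  # KeyError for unknown names
    if metric_type == "counter":
        # Growth of the counter over the window.
        return f"increase({metric}[{window}])"
    # A gauge is read as-is.
    return metric


if __name__ == "__main__":
    print(organization_query("users_login_total"))
    print(organization_query("items_count"))
```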
Label descriptions
| Label name | Description |
|---|---|
| apiType | The API where the metric is collected. |
| folderName | The name of the folder where the service is created. |
| instance | The cluster IP address and port of the service pod handling the request. |
| instancePool | The instance pool used by the service, either shared or dedicated. |
| job | The job name defined in the Prometheus configuration. |
| le | The upper bound of a bucket in a histogram metric. |
| nodeName | The name of the machine with the service handling the request. |
| operation | The requested operation, for example, a query or export. Operation information is only available for REST requests. SOAP requests only return a value of soap. |
| podName | The name of the service pod handling the request. |
| responseCached | Boolean value indicating whether the response was served from the cache. |
| serviceName | The name of the service, including the folder name in which the service resides. |
| serviceType | The type of service, for example, a MapService or FeatureService. |
| username | The username of the user making the request. |
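In PromQL, these labels filter a metric through matchers inside braces, for example requests_total{serviceType="MapService"}. The helper below is a small Python sketch that builds such a selector from keyword arguments and escapes backslashes and double quotes in the values. The label values shown are illustrative, and the helper supports only equality matchers.

```python
def _escape(value: str) -> str:
    """Escape characters that would end or corrupt a double-quoted PromQL string."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def selector(metric: str, **labels: str) -> str:
    """Build <metric>{label="value", ...} with labels in alphabetical order."""
    if not labels:
        return metric
    matchers = [name + '="' + _escape(value) + '"' for name, value in sorted(labels.items())]
    return metric + "{" + ", ".join(matchers) + "}"


if __name__ == "__main__":
    # Illustrative label values; use the values reported by your own services.
    print(selector("requests_total", serviceType="MapService", operation="export"))
```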