Version: v2.x

Metrics via Prometheus

Enable metrics endpoint

By default, the Prometheus metrics endpoint is disabled. To enable it, add metrics to the list of enabled APIs with the environment variable below:

HASURA_GRAPHQL_ENABLED_APIS=metadata,graphql,config,metrics

Secure the Prometheus metrics endpoint with a secret:

HASURA_GRAPHQL_METRICS_SECRET=<secret>

Pass the secret as a bearer token when querying the endpoint:

curl 'http://127.0.0.1:8080/v1/metrics' -H 'Authorization: Bearer <secret>'

Configure a secret

The metrics endpoint should be configured with a secret to prevent misuse and should not be exposed over the internet.
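As a rough Python equivalent of the curl call above, the sketch below builds the same authenticated request using only the standard library. The function names build_metrics_request and fetch_metrics are hypothetical, and the base URL and secret are placeholders to replace with your own values.

```python
import urllib.request


def build_metrics_request(base_url: str, secret: str) -> urllib.request.Request:
    """Build a GET request for /v1/metrics carrying the secret as a bearer token."""
    url = base_url.rstrip("/") + "/v1/metrics"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {secret}"})


def fetch_metrics(base_url: str, secret: str, timeout: float = 5.0) -> str:
    """Fetch the metrics text; raises urllib.error.HTTPError on a non-2xx response."""
    request = build_metrics_request(base_url, secret)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fetch_metrics("http://127.0.0.1:8080", "<secret>") against a running instance should return the plain-text metrics body. Reading the secret from an environment variable rather than hard-coding it keeps it out of source control.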

High-cardinality Labels

Starting in v2.26.0, Hasura GraphQL Engine exposes metrics with high-cardinality labels by default.

You can reduce the cardinality of metric labels if you are experiencing high memory usage, which can result from a large number of labels in the metrics (typically more than 10000).

Metrics exported

The following metrics are exported by Hasura GraphQL Engine:

Hasura Event Triggers Metrics

The following metrics can be used to monitor the performance of the Hasura Event Triggers system:

Subscription Metrics

The following metrics can be used to monitor the performance of subscriptions:

Hasura cache request count

Tracks cache hit and miss requests, which helps in monitoring and optimizing cache utilization.

Name: hasura_cache_request_count
Type: Counter
Labels: status: hit | miss
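One way to turn the two status values into a single utilization figure is a hit ratio. The sketch below assumes you have already read the hit and miss counter values yourself; cache_hit_ratio is a hypothetical helper, not part of Hasura.

```python
from typing import Optional


def cache_hit_ratio(hits: float, misses: float) -> Optional[float]:
    """Return hits / (hits + misses), or None when no cache requests were counted."""
    total = hits + misses
    if total <= 0:
        return None
    return hits / total
```

Because counters are cumulative, passing raw values gives the ratio since the process started. Passing the increase between two scrapes instead gives the ratio for that window.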

Hasura cron events invocation total

Total number of cron events invoked, representing the number of invocations made for cron events.

Name: hasura_cron_events_invocation_total
Type: Counter
Labels: status: success | failed

Hasura cron events processed total

Total number of cron events processed, representing the number of invocations made for cron events. Compare this to hasura_cron_events_invocation_total. A large difference between the two metrics indicates a high failure rate of the cron webhook.

Name: hasura_cron_events_processed_total
Type: Counter
Labels: status: success | failed
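The comparison between the two cron counters can be sketched as below. This assumes the endpoint returns the standard Prometheus text exposition format and that label values contain no closing brace; sum_metric and invocation_gap are hypothetical helpers, and how large a gap counts as "high" is left to you.

```python
import re

# Matches "name{labels} value" or "name value"; comment lines start with '#' and do not match.
_SAMPLE_LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)')


def sum_metric(exposition_text: str, name: str) -> float:
    """Sum every sample of the named metric across all of its label combinations."""
    total = 0.0
    for line in exposition_text.splitlines():
        match = _SAMPLE_LINE.match(line.strip())
        if match and match.group(1) == name:
            total += float(match.group(2))
    return total


def invocation_gap(exposition_text: str) -> float:
    """Invocations minus processed events, summed over the status label."""
    invoked = sum_metric(exposition_text, "hasura_cron_events_invocation_total")
    processed = sum_metric(exposition_text, "hasura_cron_events_processed_total")
    return invoked - processed
```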

Hasura GraphQL execution time seconds

Execution time of successful GraphQL requests (excluding subscriptions). If a growing share of requests falls into the higher buckets, consider tuning performance.

Name: hasura_graphql_execution_time_seconds
Type: Histogram (buckets: 0.01, 0.03, 0.1, 0.3, 1, 3, 10)
Labels: operation_type: query | mutation
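To judge whether requests are drifting into the higher buckets, you can compute the share of requests slower than a given bucket boundary. The sketch below assumes the usual Prometheus histogram convention, where each bucket holds the cumulative count of observations less than or equal to its upper bound; share_slower_than is a hypothetical helper and the counts in the usage note are made up.

```python
from typing import Mapping


def share_slower_than(
    cumulative: Mapping[float, float], total: float, threshold: float
) -> float:
    """Fraction of observations above `threshold`, which must be a bucket boundary.

    `cumulative` maps each bucket upper bound to its cumulative count;
    `total` is the overall observation count (the +Inf bucket).
    """
    if total <= 0:
        raise ValueError("no observations recorded")
    if threshold not in cumulative:
        raise ValueError(f"{threshold} is not a bucket boundary")
    return (total - cumulative[threshold]) / total
```

For example, with 100 requests in total of which 95 fall at or under the 1 second boundary, share_slower_than({1: 95}, 100, 1) gives 0.05.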

Hasura GraphQL requests total

Number of GraphQL requests received, representing the GraphQL query/mutation traffic on the server.

Name: hasura_graphql_requests_total
Type: Counter
Labels:
  • operation_type: query | mutation | subscription | unknown
  • response_status: success | failed

The unknown operation type is reported for queries that fail authorization, parsing, or certain validations. The response_status label is success for successful requests and failed for failed requests.
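Since this counter only ever grows, traffic is usually read as a rate between two scrapes. The sketch below is one way to compute that by hand; counter_rate is a hypothetical helper, and treating a drop in the value as a restart from zero is an assumption borrowed from how Prometheus handles counter resets.

```python
def counter_rate(previous: float, current: float, elapsed_seconds: float) -> float:
    """Average per-second increase of a counter between two scrapes."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    if current >= previous:
        increase = current - previous
    else:
        # The counter went backwards, so assume the server restarted and
        # the counter began again from zero.
        increase = current
    return increase / elapsed_seconds
```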

Hasura HTTP connections

Current number of active HTTP connections (excluding WebSocket connections), representing the HTTP load on the server.

Name: hasura_http_connections
Type: Gauge
Labels: none

Hasura one-off events invocation total

Total number of one-off events invoked, representing the number of invocations made for one-off events.

Name: hasura_oneoff_events_invocation_total
Type: Counter
Labels: status: success | failed

Hasura one-off events processed total

Total number of one-off events processed, representing the number of invocations made for one-off events. Compare this to hasura_oneoff_events_invocation_total. A large difference between the two metrics indicates a high failure rate of the one-off webhook.

Name: hasura_oneoff_events_processed_total
Type: Counter
Labels: status: success | failed

Hasura postgres connections

Current number of active PostgreSQL connections. Compare this to pool settings.

Name: hasura_postgres_connections
Type: Gauge
Labels:
  • source_name: name of the database
  • conn_info: connection url string (password omitted) or name of the connection url environment variable
  • role: primary | replica

Hasura source health

Health check status of a particular data source, corresponding to the output of /healthz/sources, with possible values 0 through 3 indicating, respectively: OK, TIMEOUT, FAILED, ERROR. See the Source Health Check API Reference for details.

Name: hasura_source_health
Type: Gauge
Labels: source_name: name of the database
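When surfacing this gauge in alerts or dashboards, it can help to translate the numeric value back into the status names listed above. A minimal sketch, assuming the gauge value arrives as a number; describe_source_health is a hypothetical helper, and the UNKNOWN fallback is an addition for values outside 0 through 3.

```python
# Mapping taken from the description above: 0 through 3 in order.
SOURCE_HEALTH_STATUS = {0: "OK", 1: "TIMEOUT", 2: "FAILED", 3: "ERROR"}


def describe_source_health(value: float) -> str:
    """Translate a hasura_source_health gauge value into its status name."""
    number = float(value)
    if not number.is_integer():
        return "UNKNOWN"
    return SOURCE_HEALTH_STATUS.get(int(number), "UNKNOWN")
```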

Hasura WebSocket connections

Current number of active WebSocket connections, representing the WebSocket load on the server.

Name: hasura_websocket_connections
Type: Gauge
Labels: none

GraphQL request execution time
  • Uses wall-clock time, so it includes time spent waiting on I/O.
  • Includes authorization, parsing, validation, planning, and execution (calls to databases, Remote Schemas).