GraphQL Observability with Hasura GraphQL Engine and Honeycomb
Observability, in the software world, means you can answer any question about what is happening inside a system just by observing metrics from outside the system, without having to modify the running deployment to support this.
Why an Observable System?
Without an observable system, it is difficult to understand what is going wrong unless you are already monitoring for known issues. At any point in time, you should be able to ask arbitrary questions about how your application is behaving.
GraphQL Observability
In a GraphQL application, these are the important metrics and context to capture:
- time of query and query execution time
- actual query payload / query hash
- response status codes of queries/mutations/subscriptions
- GraphQL server version
- ip_address from which the query originated
Specifically, in the case of Hasura GraphQL Engine, you might also want to capture context such as:
- user_id of the user who made the query
- role of the user
- metadata of the query
With this external information available, you can ask meaningful questions in a production deployment to find out what went wrong internally, or why your GraphQL backend is behaving the way it is. For example, if you see anomalies in query execution time for a particular query hash, you can try to identify what is wrong with that query (maybe there is a database bottleneck that you need to optimise).
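As a rough illustration of that kind of question, the sketch below groups log lines by query hash and summarises execution time for each. The sample records and their layout (`detail.query_hash`, `detail.query_execution_time`) are assumptions modelled on the field names used later in this post, and `execution_time_by_hash` is a hypothetical helper, not part of Hasura or Honeycomb.

```python
import json
from collections import defaultdict
from statistics import mean

# Hypothetical sample of graphql-engine JSON log lines. The field layout
# (detail.query_hash, detail.query_execution_time) is an assumption.
SAMPLE_LOGS = [
    '{"type": "http-log", "detail": {"status": 200, "query_hash": "a1", "query_execution_time": 0.02}}',
    '{"type": "http-log", "detail": {"status": 200, "query_hash": "a1", "query_execution_time": 0.04}}',
    '{"type": "http-log", "detail": {"status": 200, "query_hash": "b2", "query_execution_time": 2.5}}',
]


def execution_time_by_hash(lines):
    """Return {query_hash: (count, mean_seconds, max_seconds)}."""
    timings = defaultdict(list)
    for line in lines:
        detail = json.loads(line).get("detail", {})
        query_hash = detail.get("query_hash")
        elapsed = detail.get("query_execution_time")
        if query_hash is None or elapsed is None:
            continue  # skip records that do not describe a query
        timings[query_hash].append(float(elapsed))
    return {h: (len(t), mean(t), max(t)) for h, t in timings.items()}


if __name__ == "__main__":
    for query_hash, (count, avg, worst) in execution_time_by_hash(SAMPLE_LOGS).items():
        print(f"{query_hash}: n={count} mean={avg:.3f}s max={worst:.3f}s")
```

A hash whose mean or maximum stands out from the rest is a candidate for closer inspection of the underlying query.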
Having this information available for monitoring might still not be enough. To make it actionable, you can set up alerts: for example, triggers on requests/min or errors/min with thresholds, so that you get notified when the server behaves differently.
By the end of this tutorial, you should be able to set up Hasura GraphQL Engine with an observability tool, Honeycomb in this case, to help you understand, optimise and control your GraphQL server. Here’s a preview of what you will get at the end.
You can also create a visual view of your logs based on filters, for example to monitor error codes across a given time interval.
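To make the idea of an errors/min threshold concrete, here is a minimal sketch that counts non-200 responses per minute and reports the minutes that exceed a limit. It only illustrates the logic and is not how Honeycomb evaluates triggers. The record layout (`timestamp`, `detail.status`), the threshold value and the function names are all assumptions.

```python
import json
from collections import Counter
from datetime import datetime

ERRORS_PER_MINUTE_THRESHOLD = 2  # hypothetical limit; tune per application

# Hypothetical log lines; the timestamp and detail.status fields are assumed.
SAMPLE_LOGS = [
    '{"timestamp": "2018-10-01T10:00:05", "detail": {"status": 200}}',
    '{"timestamp": "2018-10-01T10:00:20", "detail": {"status": 400}}',
    '{"timestamp": "2018-10-01T10:01:02", "detail": {"status": 500}}',
    '{"timestamp": "2018-10-01T10:01:30", "detail": {"status": 400}}',
    '{"timestamp": "2018-10-01T10:01:45", "detail": {"status": 500}}',
]


def errors_per_minute(lines):
    """Count responses with a status other than 200, bucketed by minute."""
    counts = Counter()
    for line in lines:
        record = json.loads(line)
        status = record.get("detail", {}).get("status")
        if status is None or status == 200:
            continue
        ts = datetime.fromisoformat(record["timestamp"])
        counts[ts.replace(second=0, microsecond=0)] += 1
    return counts


def minutes_over_threshold(lines, threshold=ERRORS_PER_MINUTE_THRESHOLD):
    """Return the minutes whose error count is strictly above the threshold."""
    return sorted(m for m, n in errors_per_minute(lines).items() if n > threshold)


if __name__ == "__main__":
    for minute in minutes_over_threshold(SAMPLE_LOGS):
        print(f"ALERT: too many errors in the minute starting {minute.isoformat()}")
```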
Set up Hasura GraphQL Engine on GKE
This documentation assumes that you have Hasura GraphQL Engine installed on Google Kubernetes Engine. Instructions for deployment are available here.
Set up Honeycomb Agent on Kubernetes
Sign up on Honeycomb to start the Kubernetes agent setup. Honeycomb uses a Kubernetes secret to store the API key, which will be available on Honeycomb after signing up.
kubectl create secret generic -n kube-system honeycomb-writekey \
  --from-literal=key=<your-key>
We will use Kubernetes DaemonSets to automatically deploy the Honeycomb Agent on all nodes.
Once the DaemonSets extension is enabled, create a DaemonSet using the following command to deploy the agent:
$ kubectl apply -f https://honeycomb.io/download/kubernetes/logs/quickstart.yaml
Note: in production clusters, remove the mountpath configuration for `minikube-varlibdockercontainers`.
Sending cluster events and resource metrics
Execute the following command to start sending cluster events and resource metrics from Kubernetes.
$ kubectl create -f https://honeycomb.io/download/kubernetes/metrics/honeycomb-heapster.yaml
Sending cluster state metrics
Deploy kube-state-metrics and the Honeycomb kube-state-metrics adapter to collect more fine-grained data.
Start kube-state-metrics
$ kubectl create -f https://honeycomb.io/download/kubernetes/metrics/kube-state-metrics.yaml
Start the Honeycomb adapter
$ kubectl create -f https://honeycomb.io/download/kubernetes/metrics/honeycomb-state-metrics-adapter.yaml
Check the status of your new pods:
kubectl get pods -n kube-system
Head over to the Honeycomb UI and query the kubernetes-state-metrics dataset to verify if everything is working.
Sending GraphQL Engine logs to Honeycomb
Verify graphql-engine logs by executing the following command.
$ kubectl logs -l app=hasura-graphql-engine -c graphql-engine
Here -c is specified because the hasura-graphql-engine pod has multiple containers running.
Apply the following configmap.
echo '
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: honeycomb-agent-config
  namespace: kube-system
data:
  config.yaml: |-
    watchers:
      - dataset: kubernetes-logs
        labelSelector: app=hasura-graphql-engine
        containerName: graphql-engine
        parser: json
' | kubectl apply -f -
After updating the configmap, restart the agent pods:
$ kubectl delete pods -l k8s-app=honeycomb-agent -n kube-system
In the Honeycomb UI, go to the kubernetes-logs dataset and open the Schema tab.
Scroll down to see the options. Enable `Automatically unpack nested JSON` and set the level of nesting to `5`.
You can now get a detailed report on the logs and apply filters depending on your requirements.
You can see query_execution_time, query_hash, response_size, status code and so on for inspection.
Dashboard Filters
Now start applying filters to get granular details. Let’s apply a filter to get all queries that resulted in an error. The filter would be detail.status != 200. Here’s what the dashboard looks like.
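For comparison, the same filter can be sketched locally over raw log lines, which can be handy for a quick check before the data reaches Honeycomb. This is only an illustration: `failed_requests` is a hypothetical helper, and the assumption is that each log line is a JSON object with a nested `detail.status` field, as the filter above suggests.

```python
import json

# Hypothetical sample lines, including output that is not a request log.
SAMPLE_LOGS = [
    'server started',
    '{"type": "http-log", "detail": {"status": 200, "query_hash": "a1"}}',
    '{"type": "http-log", "detail": {"status": 400, "query_hash": "b2"}}',
    '{"type": "startup", "detail": "listening"}',
]


def failed_requests(lines):
    """Yield parsed records whose detail.status is present and not 200."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore output that is not JSON
        if not isinstance(record, dict):
            continue
        detail = record.get("detail")
        if isinstance(detail, dict) and "status" in detail and detail["status"] != 200:
            yield record


if __name__ == "__main__":
    for record in failed_requests(SAMPLE_LOGS):
        print(json.dumps(record))
```

Because `failed_requests` accepts any iterable of lines, it could also be fed `sys.stdin` with the output of the `kubectl logs` command shown earlier piped in.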
Now, let’s get a performance overview by filtering/sorting queries using the query_execution_time column. Here’s what a typical filter/sort would look like:
Alerts for Anomalies
Now that we have a holistic view of our GraphQL queries, let’s set up an alert to get notified via email when something is wrong.
We can apply a filter for query_execution_time ≥ 2 to get notified about slow queries. This is just an example; you can tweak the number of seconds to a value that suits your application’s requirements.
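The condition behind such an alert can be sketched as a simple threshold check with the slowest queries listed first. The 2-second default mirrors the example above. The sample records, the `detail.query_execution_time` layout and the `slow_queries` helper are assumptions for illustration only.

```python
import json

SLOW_QUERY_SECONDS = 2.0  # example threshold from the text; tune per application

# Hypothetical log lines; the detail.query_execution_time field is assumed.
SAMPLE_LOGS = [
    '{"detail": {"query_hash": "a1", "query_execution_time": 0.02}}',
    '{"detail": {"query_hash": "b2", "query_execution_time": 2.5}}',
    '{"detail": {"query_hash": "c3", "query_execution_time": 2.0}}',
    '{"detail": {"query_hash": "d4", "query_execution_time": 4.1}}',
]


def slow_queries(lines, threshold=SLOW_QUERY_SECONDS):
    """Return query details at or above the threshold, slowest first."""
    slow = []
    for line in lines:
        detail = json.loads(line).get("detail", {})
        elapsed = detail.get("query_execution_time")
        if elapsed is not None and elapsed >= threshold:
            slow.append(detail)
    return sorted(slow, key=lambda d: d["query_execution_time"], reverse=True)


if __name__ == "__main__":
    for detail in slow_queries(SAMPLE_LOGS):
        print(f'{detail["query_hash"]}: {detail["query_execution_time"]}s')
```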
Now the focus can be on application development, while this observability setup notifies us about what is going wrong in the unpredictable parts of our GraphQL server.
For further details, refer to Honeycomb’s Kubernetes integration docs.