17 minutos
Graph Node
Graph Node is the component which indexes Subgraphs, and makes the resulting data available to query via a GraphQL API. As such it is central to the indexer stack, and correct operation of Graph Node is crucial to running a successful indexer.
This provides a contextual overview of Graph Node, and some of the more advanced options available to indexers. Detailed documentation and instructions can be found in the Graph Node repository.
Graph Node
Graph Node is the reference implementation for indexing Subgraphs on The Graph Network, connecting to blockchain clients, indexing Subgraphs and making indexed data available to query.
Graph Node (and the whole indexer stack) can be run on bare metal, or in a cloud environment. This flexibility of the central indexing component is crucial to the robustness of The Graph Protocol. Similarly, Graph Node can be built from source, or indexers can use one of the provided Docker Images.
Base de datos PostgreSQL
The main store for the Graph Node, this is where Subgraph data is stored, as well as metadata about Subgraphs, and Subgraph-agnostic network data such as the block cache, and eth_call cache.
Clientes de red
Para indexar una red, Graph Node necesita acceso a un cliente de red a través de una API JSON-RPC compatible con EVM. Esta RPC puede conectarse a un solo cliente o puede ser una configuración más compleja que equilibre la carga entre varios clientes.
While some Subgraphs may just require a full node, some may have indexing features which require additional RPC functionality. Specifically Subgraphs which make eth_calls
as part of indexing will require an archive node which supports EIP-1898, and Subgraphs with callHandlers
, or blockHandlers
with a call
filter, require trace_filter
support (see trace module documentation here).
Network Firehoses - a Firehose is a gRPC service providing an ordered, yet fork-aware, stream of blocks, developed by The Graph’s core developers to better support performant indexing at scale. This is not currently an Indexer requirement, but Indexers are encouraged to familiarise themselves with the technology, ahead of full network support. Learn more about the Firehose here.
Nodos IPFS
Subgraph deployment metadata is stored on the IPFS network. The Graph Node primarily accesses the IPFS node during Subgraph deployment to fetch the Subgraph manifest and all linked files. Network indexers do not need to host their own IPFS node. An IPFS node for the network is hosted at https://ipfs.thegraph.com.
Servidor de métricas Prometheus
Para permitir la supervisión y la generación de informes, Graph Node puede registrar métricas opcionalmente en un servidor de métricas Prometheus.
Getting started from source
Install prerequisites
-
Rust
-
PostgreSQL
-
IPFS
-
Additional Requirements for Ubuntu users - To run a Graph Node on Ubuntu a few additional packages may be needed.
1sudo apt-get install -y clang libpq-dev libssl-dev pkg-config
Setup
- Start a PostgreSQL database server
1initdb -D .postgres2pg_ctl -D .postgres -l logfile start3createdb graph-node
-
Clone Graph Node repo and build the source by running
cargo build
-
Now that all the dependencies are setup, start the Graph Node:
1cargo run -p graph-node --release -- \2 --postgres-url postgresql://[USERNAME]:[PASSWORD]@localhost:5432/graph-node \3 --ethereum-rpc [NETWORK_NAME]:[URL] \4 --ipfs https://ipfs.thegraph.com
Introducción a Kubernetes
A complete Kubernetes example configuration can be found in the indexer repository.
Puertos
Cuando está funcionando, Graph Node muestra los siguientes puertos:
Port | Purpose | Routes | CLI Argument | Environment Variable |
---|---|---|---|---|
8000 | GraphQL HTTP server (for Subgraph queries) | /subgraphs/id/… /subgraphs/name/…/… | —http-port | - |
8001 | GraphQL WS (for Subgraph subscriptions) | /subgraphs/id/… /subgraphs/name/…/… | —ws-port | - |
8020 | JSON-RPC (for managing deployments) | / | —admin-port | - |
8030 | Subgraph indexing status API | /graphql | —index-node-port | - |
8040 | Prometheus metrics | /metrics | —metrics-port | - |
Important: Be careful about exposing ports publicly - administration ports should be kept locked down. This includes the the Graph Node JSON-RPC endpoint.
Configuración avanzada de Graph Node
At its simplest, Graph Node can be operated with a single instance of Graph Node, a single PostgreSQL database, an IPFS node, and the network clients as required by the Subgraphs to be indexed.
This setup can be scaled horizontally, by adding multiple Graph Nodes, and multiple databases to support those Graph Nodes. Advanced users may want to take advantage of some of the horizontal scaling capabilities of Graph Node, as well as some of the more advanced configuration options, via the config.toml
file and Graph Node’s environment variables.
config.toml
A TOML configuration file can be used to set more complex configurations than those exposed in the CLI. The location of the file is passed with the —config command line switch.
Cuando se utiliza un archivo de configuración, no es posible utilizar las opciones —postgres-url, —postgres-secondary-hosts y —postgres-host-weights.
A minimal config.toml
file can be provided; the following file is equivalent to using the —postgres-url command line option:
1[store]2[store.primary]3connection="<.. postgres-url argument ..>"4[deployment]5[[deployment.rule]]6indexers = [ "<.. list of all indexing nodes ..>" ]
Full documentation of config.toml
can be found in the Graph Node docs.
Graph Nodes múltiples
Graph Node indexing can scale horizontally, running multiple instances of Graph Node to split indexing and querying across different nodes. This can be done simply by running Graph Nodes configured with a different node_id
on startup (e.g. in the Docker Compose file), which can then be used in the config.toml
file to specify dedicated query nodes, block ingestors, and splitting Subgraphs across nodes with deployment rules.
Ten en cuenta que varios Graph Nodes pueden configurarse para utilizar la misma base de datos, que a su vez puede escalarse horizontalmente mediante sharding.
Reglas de deploy
Given multiple Graph Nodes, it is necessary to manage deployment of new Subgraphs so that the same Subgraph isn’t being indexed by two different nodes, which would lead to collisions. This can be done by using deployment rules, which can also specify which shard
a Subgraph’s data should be stored in, if database sharding is being used. Deployment rules can match on the Subgraph name and the network that the deployment is indexing in order to make a decision.
Ejemplo de configuración de reglas de deploy:
1[deployment]2[[deployment.rule]]3match = { name = "(vip|important)/.*" }4shard = "vip"5indexers = [ "index_node_vip_0", "index_node_vip_1" ]6[[deployment.rule]]7match = { network = "kovan" }8# No shard, so we use the default shard called 'primary'9indexers = [ "index_node_kovan_0" ]10[[deployment.rule]]11match = { network = [ "xdai", "poa-core" ] }12indexers = [ "index_node_other_0" ]13[[deployment.rule]]14# There's no 'match', so any Subgraph matches15shards = [ "sharda", "shardb" ]16indexers = [17 "index_node_community_0",18 "index_node_community_1",19 "index_node_community_2",20 "index_node_community_3",21 "index_node_community_4",22 "index_node_community_5"23 ]
Read more about deployment rules here.
Nodos de consulta dedicados
Los nodos pueden configurarse para ser explícitamente nodos de consulta incluyendo lo siguiente en el archivo de configuración:
1[general]2query = "<regular expression>"
Cualquier nodo cuyo —node-id coincida con la expresión regular se configurará para que solo responda a consultas.
Escalado de bases de datos mediante sharding
Para la mayoría de los casos de uso, una única base de datos Postgres es suficiente para soportar una instancia de graph-node. Cuando una instancia de graph-node supera una única base de datos Postgres, es posible dividir el almacenamiento de los datos de graph-node en varias bases de datos Postgres. Todas las bases de datos juntas forman el almacén de la instancia graph-node. Cada base de datos individual se denomina shard.
Shards can be used to split Subgraph deployments across multiple databases, and can also be used to use replicas to spread query load across databases. This includes configuring the number of available database connections each graph-node
should keep in its connection pool for each database, which becomes increasingly important as more Subgraphs are being indexed.
El Sharding resulta útil cuando la base de datos existente no puede soportar la carga que le impone Graph Node y cuando ya no es posible aumentar el tamaño de la base de datos.
It is generally better make a single database as big as possible, before starting with shards. One exception is where query traffic is split very unevenly between Subgraphs; in those situations it can help dramatically if the high-volume Subgraphs are kept in one shard and everything else in another because that setup makes it more likely that the data for the high-volume Subgraphs stays in the db-internal cache and doesn’t get replaced by data that’s not needed as much from low-volume Subgraphs.
En términos de configuración de las conexiones, comienza con max_connections en postgresql.conf establecido en 400 (o tal vez incluso 200) y mira las métricas de Prometheus store_connection_wait_time_ms y store_connection_checkout_count. Tiempos de espera notables (cualquier cosa por encima de 5ms) es una indicación de que hay muy pocas conexiones disponibles; altos tiempos de espera allí también serán causados por la base de datos que está muy ocupada (como alta carga de CPU). Sin embargo, si la base de datos parece estable, los tiempos de espera elevados indican la necesidad de aumentar el número de conexiones. En la configuración, el número de conexiones que puede utilizar cada instancia de Graph Node es un límite superior, y Graph Node no mantendrá conexiones abiertas si no las necesita.
Read more about store configuration here.
Bloques de procesamiento dedicado
If there are multiple nodes configured, it will be necessary to specify one node which is responsible for ingestion of new blocks, so that all configured index nodes aren’t polling the chain head. This is done as part of the chains
namespace, specifying the node_id
to be used for block ingestion:
1[chains]2ingestor = "block_ingestor_node"
Soporte de múltiples redes
The Graph Protocol is increasing the number of networks supported for indexing rewards, and there exist many Subgraphs indexing unsupported networks which an indexer would like to process. The config.toml
file allows for expressive and flexible configuration of:
- Redes múltiples
- Múltiples proveedores por red (esto puede permitir dividir la carga entre los proveedores, y también puede permitir la configuración de nodos completos, así como nodos de archivo, con Graph Node prefiriendo proveedores más baratos si una carga de trabajo dada lo permite).
- Detalles adicionales del proveedor, como features, autenticación y tipo de proveedor (para soporte experimental de Firehose)
The [chains]
section controls the ethereum providers that graph-node connects to, and where blocks and other metadata for each chain are stored. The following example configures two chains, mainnet and kovan, where blocks for mainnet are stored in the vip shard and blocks for kovan are stored in the primary shard. The mainnet chain can use two different providers, whereas kovan only has one provider.
1[chains]2ingestor = "block_ingestor_node"3[chains.mainnet]4shard = "vip"5provider = [6 { label = "mainnet1", url = "http://..", features = [], headers = { Authorization = "Bearer foo" } },7 { label = "mainnet2", url = "http://..", features = [ "archive", "traces" ] }8]9[chains.kovan]10shard = "primary"11provider = [ { label = "kovan", url = "http://..", features = [] } ]
Read more about provider configuration here.
Variables del entorno
Graph Node supports a range of environment variables which can enable features, or change Graph Node behaviour. These are documented here.
Deploy continuo
Los usuarios que están operando una configuración de indexación escalada con configuración avanzada pueden beneficiarse de la gestión de sus Graph Nodes con Kubernetes.
- The indexer repository has an example Kubernetes reference
- Launchpad is a toolkit for running a Graph Protocol Indexer on Kubernetes maintained by GraphOps. It provides a set of Helm charts and a CLI to manage a Graph Node deployment.
Operar Graph Node
Given a running Graph Node (or Graph Nodes!), the challenge is then to manage deployed Subgraphs across those nodes. Graph Node surfaces a range of tools to help with managing Subgraphs.
Logging
Graph Node’s logs can provide useful information for debugging and optimisation of Graph Node and specific Subgraphs. Graph Node supports different log levels via the GRAPH_LOG
environment variable, with the following levels: error, warn, info, debug or trace.
In addition setting GRAPH_LOG_QUERY_TIMING
to gql
provides more details about how GraphQL queries are running (though this will generate a large volume of logs).
Monitoring & alerting
Graph Node proporciona las métricas a través del endpoint Prometheus en el puerto 8040 por defecto. Grafana se puede utilizar para visualizar estas métricas.
The indexer repository provides an example Grafana configuration.
Graphman
graphman
is a maintenance tool for Graph Node, helping with diagnosis and resolution of different day-to-day and exceptional tasks.
The graphman command is included in the official containers, and you can docker exec into your graph-node container to run it. It requires a config.toml
file.
Full documentation of graphman
commands is available in the Graph Node repository. See [/docs/graphman.md] (https://github.com/graphprotocol/graph-node/blob/master/docs/graphman.md) in the Graph Node /docs
Working with Subgraphs
API de estado de indexación
Available on port 8030/graphql by default, the indexing status API exposes a range of methods for checking indexing status for different Subgraphs, checking proofs of indexing, inspecting Subgraph features and more.
The full schema is available here.
Rendimiento de indexación
El proceso de indexación consta de tres partes diferenciadas:
- Obtener eventos de interés del proveedor
- Procesar los eventos en orden con los handlers apropiados (esto puede implicar llamar a la cadena para obtener el estado y obtener datos del store)
- Escribir los datos resultantes en el store
These stages are pipelined (i.e. they can be executed in parallel), but they are dependent on one another. Where Subgraphs are slow to index, the underlying cause will depend on the specific Subgraph.
Causas habituales de la lentitud de indexación:
- Time taken to find relevant events from the chain (call handlers in particular can be slow, given the reliance on
trace_filter
) - Making large numbers of
eth_calls
as part of handlers - Gran cantidad de interacción con el depósito durante la ejecución
- Una gran cantidad de datos para guardar en el depósito
- Un gran número de eventos que procesar
- Tiempo de conexión a la base de datos lento, para nodos abarrotados
- El proveedor en sí mismo se está quedando rezagado con respecto a la cabeza de la cadena
- Lentitud en la obtención de nuevos recibos en la cabeza de la cadena desde el proveedor
Subgraph indexing metrics can help diagnose the root cause of indexing slowness. In some cases, the problem lies with the Subgraph itself, but in others, improved network providers, reduced database contention and other configuration improvements can markedly improve indexing performance.
Failed Subgraphs
During indexing Subgraphs might fail, if they encounter data that is unexpected, some component not working as expected, or if there is some bug in the event handlers or configuration. There are two general types of failure:
- Fallos deterministas: son fallos que no se resolverán con reintentos
- Fallos no deterministas: pueden deberse a problemas con el proveedor o a algún error inesperado de Graph Node. Cuando se produce un fallo no determinista, Graph Node reintentará los handlers que han fallado, retrocediendo en el tiempo.
In some cases a failure might be resolvable by the indexer (for example if the error is a result of not having the right kind of provider, adding the required provider will allow indexing to continue). However in others, a change in the Subgraph code is required.
Deterministic failures are considered “final”, with a Proof of Indexing generated for the failing block, while non-deterministic failures are not, as the Subgraph may manage to “unfail” and continue indexing. In some cases, the non-deterministic label is incorrect, and the Subgraph will never overcome the error; such failures should be reported as issues on the Graph Node repository.
Caché de bloques y llamadas
Graph Node caches certain data in the store in order to save refetching from the provider. Blocks are cached, as are the results of eth_calls
(the latter being cached as of a specific block). This caching can dramatically increase indexing speed during “resyncing” of a slightly altered Subgraph.
However, in some instances, if an Ethereum node has provided incorrect data for some period, that can make its way into the cache, leading to incorrect data or failed Subgraphs. In this case indexers can use graphman
to clear the poisoned cache, and then rewind the affected Subgraphs, which will then fetch fresh data from the (hopefully) healthy provider.
Si se sospecha de una inconsistencia en el caché de bloques, como un evento de falta de recepción tx:
graphman chain list
to find the chain name.graphman chain check-blocks <CHAIN> by-number <NUMBER>
will check if the cached block matches the provider, and deletes the block from the cache if it doesn’t.- If there is a difference, it may be safer to truncate the whole cache with
graphman chain truncate <CHAIN>
. - Si el bloque coincide con el proveedor, el problema puede depurarse directamente contra el proveedor.
- If there is a difference, it may be safer to truncate the whole cache with
Consulta de problemas y errores
Once a Subgraph has been indexed, indexers can expect to serve queries via the Subgraph’s dedicated query endpoint. If the indexer is hoping to serve significant query volume, a dedicated query node is recommended, and in case of very high query volumes, indexers may want to configure replica shards so that queries don’t impact the indexing process.
Sin embargo, incluso con un nodo de consulta dedicado y réplicas, ciertas consultas pueden llevar mucho tiempo para ejecutarse y, en algunos casos, aumentar el uso de memoria y afectar negativamente el tiempo de consulta de otros usuarios.
No existe una “bala de plata”, sino toda una serie de herramientas para prevenir, diagnosticar y tratar las consultas lentas.
Caché de consultas
Graph Node caches GraphQL queries by default, which can significantly reduce database load. This can be further configured with the GRAPH_QUERY_CACHE_BLOCKS
and GRAPH_QUERY_CACHE_MAX_MEM
settings - read more here.
Análisis de consultas
Problematic queries most often surface in one of two ways. In some cases, users themselves report that a given query is slow. In that case the challenge is to diagnose the reason for the slowness - whether it is a general issue, or specific to that Subgraph or query. And then of course to resolve it, if possible.
En otros casos, el desencadenante puede ser un uso elevado de memoria en un nodo de consulta, en cuyo caso el reto consiste primero en identificar la consulta causante del problema.
Indexers can use qlog to process and summarize Graph Node’s query logs. GRAPH_LOG_QUERY_TIMING
can also be enabled to help identify and debug slow queries.
Ante una consulta lenta, los Indexadores tienen algunas opciones. Por supuesto, pueden alterar su modelo de costes, para aumentar significativamente el coste de envío de la consulta problemática. Esto puede dar lugar a una reducción de la frecuencia de esa consulta. Sin embargo, a menudo esto no resuelve la raíz del problema.
Optimización a nivel de cuenta
Las tablas de bases de datos que almacenan entidades suelen ser de dos tipos: Las de tipo “transacción”, en las que las entidades, una vez creadas, no se actualizan nunca, es decir, almacenan algo parecido a una lista de transacciones financieras, y las de tipo “cuenta”, en las que las entidades se actualizan muy a menudo, es decir, almacenan algo parecido a cuentas financieras que se modifican cada vez que se registra una transacción. Las tablas tipo cuenta se caracterizan por contener un gran número de versiones de entidades, pero relativamente pocas entidades distintas. A menudo, en este tipo de tablas, el número de entidades distintas es del 1% del número total de filas (versiones de entidades)
For account-like tables, graph-node
can generate queries that take advantage of details of how Postgres ends up storing data with such a high rate of change, namely that all of the versions for recent blocks are in a small subsection of the overall storage for such a table.
The command graphman stats show <sgdNNNN
> shows, for each entity type/table in a deployment, how many distinct entities, and how many entity versions each table contains. That data is based on Postgres-internal estimates, and is therefore necessarily imprecise, and can be off by an order of magnitude. A -1
in the entities
column means that Postgres believes that all rows contain a distinct entity.
In general, tables where the number of distinct entities are less than 1% of the total number of rows/entity versions are good candidates for the account-like optimization. When the output of graphman stats show
indicates that a table might benefit from this optimization, running graphman stats show <sgdNNN> <table>
will perform a full count of the table - that can be slow, but gives a precise measure of the ratio of distinct entities to overall entity versions.
Once a table has been determined to be account-like, running graphman stats account-like <sgdNNN>.<table>
will turn on the account-like optimization for queries against that table. The optimization can be turned off again with graphman stats account-like --clear <sgdNNN>.<table>
It takes up to 5 minutes for query nodes to notice that the optimization has been turned on or off. After turning the optimization on, it is necessary to verify that the change does not in fact make queries slower for that table. If you have configured Grafana to monitor Postgres, slow queries would show up in pg_stat_activity
in large numbers, taking several seconds. In that case, the optimization needs to be turned off again.
For Uniswap-like Subgraphs, the pair
and token
tables are prime candidates for this optimization, and can have a dramatic effect on database load.
Removing Subgraphs
Se trata de una nueva funcionalidad, que estará disponible en Graph Node 0.29.x
At some point an indexer might want to remove a given Subgraph. This can be easily done via graphman drop
, which deletes a deployment and all it’s indexed data. The deployment can be specified as either a Subgraph name, an IPFS hash Qm..
, or the database namespace sgdNNN
. Further documentation is available here.