4 minutes
Subgraph Best Practice 3 - Improve Indexing and Query Performance by Using Immutable Entities and Bytes as IDs
TLDR
Using Immutable Entities and Bytes for IDs in our schema.graphql
file significantly improves indexing speed and query performance.
Immutable Entities
To make an entity immutable, we simply add (immutable: true)
to an entity.
type Transfer @entity(immutable: true) { id: Bytes! from: Bytes! to: Bytes! value: BigInt!}
By making the Transfer
entity immutable, graph-node is able to process the entity more efficiently, improving indexing speeds and query responsiveness.
Immutable Entities structures will not change in the future. An ideal entity to become an Immutable Entity would be an entity that is directly logging onchain event data, such as a Transfer
event being logged as a Transfer
entity.
Under the hood
Mutable entities have a ‘block range’ indicating their validity. Updating these entities requires the graph node to adjust the block range of previous versions, increasing database workload. Queries also need filtering to find only live entities. Immutable entities are faster because they are all live and since they won’t change, no checks or updates are required while writing, and no filtering is required during queries.
When not to use Immutable Entities
If you have a field like status
that needs to be modified over time, then you should not make the entity immutable. Otherwise, you should use immutable entities whenever possible.
Bytes as IDs
Every entity requires an ID. In the previous example, we can see that the ID is already of the Bytes type.
type Transfer @entity(immutable: true) { id: Bytes! from: Bytes! to: Bytes! value: BigInt!}
While other types for IDs are possible, such as String and Int8, it is recommended to use the Bytes type for all IDs due to character strings taking twice as much space as Byte strings to store binary data, and comparisons of UTF-8 character strings must take the locale into account which is much more expensive than the bytewise comparison used to compare Byte strings.
Reasons to Not Use Bytes as IDs
- If entity IDs must be human-readable such as auto-incremented numerical IDs or readable strings, Bytes for IDs should not be used.
- If integrating a Subgraph’s data with another data model that does not use Bytes as IDs, Bytes as IDs should not be used.
- Indexing and querying performance improvements are not desired.
Concatenating With Bytes as IDs
It is a common practice in many Subgraphs to use string concatenation to combine two properties of an event into a single ID, such as using event.transaction.hash.toHex() + "-" + event.logIndex.toString()
. However, as this returns a string, this significantly impedes Subgraph indexing and querying performance.
Instead, we should use the concatI32()
method to concatenate event properties. This strategy results in a Bytes
ID that is much more performant.
export function handleTransfer(event: TransferEvent): void { let entity = new Transfer(event.transaction.hash.concatI32(event.logIndex.toI32())) entity.from = event.params.from entity.to = event.params.to entity.value = event.params.value entity.blockNumber = event.block.number entity.blockTimestamp = event.block.timestamp entity.transactionHash = event.transaction.hash entity.save()}
Sorting With Bytes as IDs
Sorting using Bytes as IDs is not optimal as seen in this example query and response.
Query:
{ transfers(first: 3, orderBy: id) { id from to value }}
Query response:
{ "data": { "transfers": [ { "id": "0x00010000", "from": "0xabcd...", "to": "0x1234...", "value": "256" }, { "id": "0x00020000", "from": "0xefgh...", "to": "0x5678...", "value": "512" }, { "id": "0x01000000", "from": "0xijkl...", "to": "0x9abc...", "value": "1" } ] }}
The IDs are returned as hex.
To improve sorting, we should create another field on the entity that is a BigInt.
type Transfer @entity { id: Bytes! from: Bytes! # address to: Bytes! # address value: BigInt! # unit256 tokenId: BigInt! # uint256}
This will allow for sorting to be optimized sequentially.
Query:
{ transfers(first: 3, orderBy: tokenId) { id tokenId }}
Query Response:
{ "data": { "transfers": [ { "id": "0x…", "tokenId": "1" }, { "id": "0x…", "tokenId": "2" }, { "id": "0x…", "tokenId": "3" } ] }}
Conclusion
Using both Immutable Entities and Bytes as IDs has been shown to markedly improve Subgraph efficiency. Specifically, tests have highlighted up to a 28% increase in query performance and up to a 48% acceleration in indexing speeds.
Read more about using Immutable Entities and Bytes as IDs in this blog post by David Lutterkort, a Software Engineer at Edge & Node: Two Simple Subgraph Performance Improvements.