
Subgraph Best Practice 3 - Improve Indexing and Query Performance by Using Immutable Entities and Bytes as IDs

TLDR

Using Immutable Entities and Bytes for IDs in our schema.graphql file significantly improves indexing speed and query performance.

Immutable Entities

To make an entity immutable, add (immutable: true) to its @entity directive.

type Transfer @entity(immutable: true) {
  id: Bytes!
  from: Bytes!
  to: Bytes!
  value: BigInt!
}

By making the Transfer entity immutable, graph-node is able to process the entity more efficiently, improving indexing speeds and query responsiveness.

Immutable entities hold data that will never change after it is written. An ideal candidate is an entity that directly logs onchain event data, such as a Transfer event being logged as a Transfer entity.

Under the hood

Mutable entities have a ‘block range’ indicating their validity. Updating these entities requires graph-node to adjust the block range of previous versions, increasing database workload. Queries also need filtering to find only live entities. Immutable entities are faster because they are all live: since they won’t change, no checks or updates are required while writing, and no filtering is required during queries.
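The difference can be illustrated with a small in-memory model. This is a simplified sketch, not graph-node's actual storage code: the names Version, MutableTable, ImmutableTable and rowOperations are hypothetical, and throwing on a second write is an assumption about how such a store might enforce immutability.

```typescript
// Simplified model of the behaviour described above. All names here are
// illustrative and are not graph-node APIs.

interface Version<T> {
  data: T;
  fromBlock: number; // first block at which this version is valid
  toBlock: number | null; // null means the version is still live
}

class MutableTable<T> {
  private versions: { [id: string]: Version<T>[] } = {};
  rowOperations = 0; // counts row-level writes to make the extra work visible

  save(id: string, data: T, block: number): void {
    const history = this.versions[id] || [];
    const last = history.length > 0 ? history[history.length - 1] : null;
    if (last !== null && last.toBlock === null) {
      last.toBlock = block; // close the block range of the previous version
      this.rowOperations++;
    }
    history.push({ data: data, fromBlock: block, toBlock: null });
    this.rowOperations++;
    this.versions[id] = history;
  }

  // Queries must skip versions whose block range has been closed.
  get(id: string): T | undefined {
    const history = this.versions[id] || [];
    for (let i = 0; i < history.length; i++) {
      if (history[i].toBlock === null) return history[i].data;
    }
    return undefined;
  }
}

class ImmutableTable<T> {
  private rows: { [id: string]: T } = {};
  rowOperations = 0;

  save(id: string, data: T): void {
    if (Object.prototype.hasOwnProperty.call(this.rows, id)) {
      // Assumption of this sketch: a second write is rejected outright.
      throw new Error("entity " + id + " is immutable and already exists");
    }
    this.rows[id] = data; // a single insert, with no earlier version to close
    this.rowOperations++;
  }

  // Every stored row is live, so no filtering is needed.
  get(id: string): T | undefined {
    return this.rows[id];
  }
}
```

In this model, saving the same mutable entity twice costs three row operations (insert, close the old range, insert), while an immutable entity always costs exactly one.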

When not to use Immutable Entities

If you have a field like status that needs to be modified over time, then you should not make the entity immutable. Otherwise, you should use immutable entities whenever possible.

Bytes as IDs

Every entity requires an ID. In the previous example, we can see that the ID is already of the Bytes type.

type Transfer @entity(immutable: true) {
  id: Bytes!
  from: Bytes!
  to: Bytes!
  value: BigInt!
}

While other types for IDs are possible, such as String and Int8, the Bytes type is recommended for all IDs. Character strings take twice as much space as byte strings to store binary data, and comparisons of UTF-8 character strings must take the locale into account, which is much more expensive than the bytewise comparison used for byte strings.
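The factor of two in storage can be checked with a short calculation. The sketch below is illustrative only: exampleHash is a made-up 32-byte value, and hexToBytes is a hypothetical helper written for this example, not part of any Graph library.

```typescript
// Build a made-up 32-byte value as a hex string: "0x" followed by 64 hex digits.
let exampleHash = "0x";
for (let i = 0; i < 32; i++) {
  exampleHash += "ab";
}

// Decode a hex string (with or without a "0x" prefix) into raw bytes.
function hexToBytes(hex: string): Uint8Array {
  const clean = hex.indexOf("0x") === 0 ? hex.slice(2) : hex;
  if (clean.length % 2 !== 0) {
    throw new Error("hex string must have an even number of digits");
  }
  const out = new Uint8Array(clean.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(clean.slice(i * 2, i * 2 + 2), 16);
  }
  return out;
}

// Stored as text, every hex digit is one ASCII character (one byte in UTF-8).
const storedAsString = exampleHash.slice(2).length; // 64

// Stored as raw bytes, two hex digits collapse into a single byte.
const storedAsBytes = hexToBytes(exampleHash).length; // 32

const ratio = storedAsString / storedAsBytes; // 2
```

The "0x" prefix and any separator characters used in string IDs would add to the string form on top of this.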

Reasons to Not Use Bytes as IDs

If entity IDs must be human-readable, such as auto-incremented numerical IDs or readable strings, Bytes should not be used for IDs.
If a Subgraph’s data is being integrated with another data model that does not use Bytes as IDs, Bytes should not be used for IDs.
If indexing and querying performance improvements are not desired, Bytes as IDs offer no benefit.

Concatenating With Bytes as IDs

It is a common practice in many Subgraphs to use string concatenation to combine two properties of an event into a single ID, such as event.transaction.hash.toHex() + "-" + event.logIndex.toString(). However, because this produces a string, it significantly impedes Subgraph indexing and querying performance.

Instead, we should use the concatI32() method to concatenate event properties. This strategy results in a Bytes ID that is much more performant.

export function handleTransfer(event: TransferEvent): void {
  let entity = new Transfer(event.transaction.hash.concatI32(event.logIndex.toI32()))
  entity.from = event.params.from
  entity.to = event.params.to
  entity.value = event.params.value

  entity.blockNumber = event.block.number
  entity.blockTimestamp = event.block.timestamp
  entity.transactionHash = event.transaction.hash

  entity.save()
}
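To make the shape of the resulting ID concrete, here is a standalone TypeScript model of appending a 32-bit integer to a byte string. It is a sketch, not the library's implementation: the function names are local to this example, and the little-endian byte order is an assumption of the sketch.

```typescript
// Encode a 32-bit integer as 4 bytes.
// Assumption of this sketch: little-endian order (least significant byte first).
function i32ToBytesLE(value: number): Uint8Array {
  const out = new Uint8Array(4);
  out[0] = value & 0xff;
  out[1] = (value >> 8) & 0xff;
  out[2] = (value >> 16) & 0xff;
  out[3] = (value >> 24) & 0xff;
  return out;
}

// Model of "prefix bytes followed by the 4 bytes of the integer".
function concatI32(prefix: Uint8Array, value: number): Uint8Array {
  const out = new Uint8Array(prefix.length + 4);
  out.set(prefix, 0);
  out.set(i32ToBytesLE(value), prefix.length);
  return out;
}

// Render bytes as a 0x-prefixed hex string for display.
function toHex(bytes: Uint8Array): string {
  let hex = "0x";
  for (let i = 0; i < bytes.length; i++) {
    const digits = bytes[i].toString(16);
    hex += digits.length === 1 ? "0" + digits : digits;
  }
  return hex;
}

// A made-up 2-byte prefix stands in for the 32-byte transaction hash.
const exampleId = concatI32(new Uint8Array([0xde, 0xad]), 5);
const exampleIdHex = toHex(exampleId); // "0xdead05000000"
```

With a real 32-byte transaction hash as the prefix, the ID modeled this way is always exactly 36 bytes, whereas the string form needs at least 68 characters ("0x", 64 hex digits, "-", and the log index digits).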

Sorting With Bytes as IDs

Sorting using Bytes as IDs is not optimal as seen in this example query and response.

Query:

{
  transfers(first: 3, orderBy: id) {
    id
    from
    to
    value
  }
}

Query response:

{
  "data": {
    "transfers": [
      {
        "id": "0x00010000",
        "from": "0xabcd...",
        "to": "0x1234...",
        "value": "256"
      },
      {
        "id": "0x00020000",
        "from": "0xefgh...",
        "to": "0x5678...",
        "value": "512"
      },
      {
        "id": "0x01000000",
        "from": "0xijkl...",
        "to": "0x9abc...",
        "value": "1"
      }
    ]
  }
}

The IDs are returned as hex and are ordered byte by byte, which does not give a meaningful sequential order.
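The ordering in the response above can be reproduced with a plain bytewise comparison. In the sketch below, the helper names are hypothetical, and reading each ID as a little-endian 32-bit integer is an assumption made purely for illustration; in this example the IDs then decode to the value field of each result (256, 512, 1).

```typescript
// The three IDs from the response above, deliberately listed out of order.
const ids = ["0x01000000", "0x00020000", "0x00010000"];

// Decode a 0x-prefixed hex string into raw bytes.
function hexToBytes(hex: string): Uint8Array {
  const clean = hex.indexOf("0x") === 0 ? hex.slice(2) : hex;
  const out = new Uint8Array(clean.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(clean.slice(i * 2, i * 2 + 2), 16);
  }
  return out;
}

// Lexicographic, byte-by-byte comparison, as used for byte strings.
function compareBytes(a: Uint8Array, b: Uint8Array): number {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return a.length - b.length;
}

// Assumption for illustration: interpret 4 bytes as a little-endian unsigned integer.
function readU32LE(bytes: Uint8Array): number {
  return (bytes[0] | (bytes[1] << 8) | (bytes[2] << 16) | (bytes[3] << 24)) >>> 0;
}

// Sorting bytewise yields the same order as the query response.
const sortedIds = ids.slice().sort(function (a, b) {
  return compareBytes(hexToBytes(a), hexToBytes(b));
});
// ["0x00010000", "0x00020000", "0x01000000"]

// Decoded under the little-endian assumption, that order is not numeric.
const decoded = sortedIds.map(function (id) {
  return readU32LE(hexToBytes(id));
});
// [256, 512, 1]
```

The bytewise order places 256 and 512 before 1, which is why a separate numeric field is a better sort key when sequential ordering matters.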

To improve sorting, we should create another field on the entity that is a BigInt.

type Transfer @entity {
  id: Bytes!
  from: Bytes! # address
  to: Bytes! # address
  value: BigInt! # uint256
  tokenId: BigInt! # uint256
}

This allows results to be sorted sequentially by a numeric value instead of by the raw bytes of the ID.

Query:

{
  transfers(first: 3, orderBy: tokenId) {
    id
    tokenId
  }
}

Query Response:

{
  "data": {
    "transfers": [
      {
        "id": "0x…",
        "tokenId": "1"
      },
      {
        "id": "0x…",
        "tokenId": "2"
      },
      {
        "id": "0x…",
        "tokenId": "3"
      }
    ]
  }
}

Conclusion

Using both Immutable Entities and Bytes as IDs has been shown to markedly improve Subgraph efficiency. Specifically, tests have highlighted up to a 28% increase in query performance and up to a 48% acceleration in indexing speeds.

Read more about using Immutable Entities and Bytes as IDs in this blog post by David Lutterkort, a Software Engineer at Edge & Node: Two Simple Subgraph Performance Improvements.

