

Two Simple Subgraph Performance Improvements
This blog was originally published in April 2022 by Edge & Node, a core developer of The Graph.
Edge & Node and the rest of The Graph community are focused on improving subgraph indexing and querying performance. Edge & Node has worked on two features that help in these areas: immutable entities and using Bytes as the id of entities. The two features are independent of each other; immutable entities only require a small change to the subgraph schema, while using byte strings as the id also requires small changes to the mappings. Both are explained in more detail below.
Both of these enhancements improve indexing speed, reduce the amount of data that needs to be stored for a subgraph, and also speed up some queries. The indexing speedup depends on how much of its time the subgraph spends writing data to the database compared to other operations, such as making contract calls or reading entities from the database.
These new features are currently deployed on the hosted service and are part of graph-node versions 0.26.0 and later. Subgraph authors also need graph-cli version 0.28 or later and graph-ts version 0.26 or later to use these features.
Performance measurements
The Edge & Node team ran performance measurements in a controlled environment where we deployed four variants of the same subgraph: a base variant that uses ID as the type of id and no immutable entities, one variant that uses immutable entities, one variant that uses Bytes as the type of id, and one that uses both. The performance gains from these two features are striking.
We measured the average time it took to process a block and the amount of storage that the subgraph required. A lower average block time translates directly into faster syncing, and the reduced storage requirement not only saves disk space but should also have a positive impact on query speeds:
| Variant   | block avg | speedup | sync time | storage | reduction |
|-----------+-----------+---------+-----------+---------+-----------|
| base      | 268 ms    | -       | 352 hrs   | 143 GB  | -         |
| immutable | 217 ms    | 19%     | 292 hrs   | 89 GB   | 37%       |
| bytes     | 225 ms    | 16%     | 299 hrs   | 115 GB  | 20%       |
| both      | 194 ms    | 28%     | 259 hrs   | 74 GB   | 48%       |
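As a quick sanity check on the table, here is a minimal sketch that recomputes the speedup column from the average block times. It assumes the speedup is the relative reduction in average block time compared with the base variant; that reading is an assumption, although it reproduces the published figures. The function and variable names are made up for illustration. The storage column is left out because it appears to have been computed from unrounded measurements, so recomputing it from the rounded GB values can differ by a percentage point.

```typescript
// Average block processing times in milliseconds, copied from the table above.
const blockAvgMs = { base: 268, immutable: 217, bytes: 225, both: 194 };

// Relative reduction of `value` compared with `baseline`,
// rounded to a whole-number percentage.
function percentReduction(baseline: number, value: number): number {
  return Math.round((100 * (baseline - value)) / baseline);
}

// Speedup of each variant relative to the base variant.
const speedupPercent = {
  immutable: percentReduction(blockAvgMs.base, blockAvgMs.immutable),
  bytes: percentReduction(blockAvgMs.base, blockAvgMs.bytes),
  both: percentReduction(blockAvgMs.base, blockAvgMs.both),
};
```

Note that the combined gain is smaller than the sum of the two individual gains, since both features reduce the cost of the same database writes.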
Using immutable entities
Many entities represent on-chain data and are therefore immutable. Subgraph authors can indicate to the system that these entities will never be changed once they have been created by changing the @entity annotation in the subgraph's GraphQL schema to @entity(immutable: true). For example, for a Transfer entity, the schema would say
type Transfer @entity(immutable: true) {
  id: ID!
  from: Bytes!
  to: Bytes!
  amount: BigDecimal
}
When entity types are marked as immutable, graph-node can use database indexes that are much cheaper to build and maintain than the ones needed for normal mutable entity types. Of course, any attempt to modify an immutable entity will result in an indexing error.
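To make the write-once rule concrete, here is a toy in-memory model in plain TypeScript. It is not how graph-node is implemented and says nothing about the cheaper indexes; the ToyStore class and its save method are hypothetical names that only illustrate the behavior described above, where a second write to an immutable entity fails.

```typescript
// Toy model only: NOT graph-node's implementation.
type Entity = Record<string, string>;

class ToyStore {
  private rows = new Map<string, Entity>();
  private immutableTypes: Set<string>;

  constructor(immutableTypes: Set<string>) {
    this.immutableTypes = immutableTypes;
  }

  // Create or overwrite an entity; overwriting is rejected for immutable types.
  save(type: string, id: string, data: Entity): void {
    const key = `${type}:${id}`;
    if (this.immutableTypes.has(type) && this.rows.has(key)) {
      throw new Error(`cannot modify immutable entity ${key}`);
    }
    this.rows.set(key, data);
  }
}

// Usage: Transfer is immutable, Account is not.
const store = new ToyStore(new Set(["Transfer"]));
store.save("Transfer", "0x01", { amount: "10" });
store.save("Account", "0xaa", { balance: "1" });
store.save("Account", "0xaa", { balance: "2" }); // allowed: mutable type

let rejected = false;
try {
  store.save("Transfer", "0x01", { amount: "11" }); // second write fails
} catch (e) {
  rejected = true;
}
```

In a real subgraph, the practical consequence is that a mapping should create each immutable entity exactly once and never load and re-save it.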
Using Bytes as the id
Many subgraphs use binary data like addresses as the id of entities. Until now, graph-node only allowed ID (a synonym for String) as the type of the id field. Converting byte strings into character strings and using them as the id has several disadvantages: character strings take twice as much space as byte strings to store binary data, and comparisons of UTF-8 character strings must take the locale into account, which is much more expensive than the bytewise comparison used to compare byte strings.
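The size difference is easy to see outside of a subgraph. The following plain TypeScript sketch hex-encodes a 20-byte value, which is the size of an Ethereum address, and compares the two representations. The helper names toHex and compareBytes are made up for this example, and the byte values are arbitrary.

```typescript
// A 20-byte value such as an address; the contents are arbitrary here.
const addressBytes = new Uint8Array(20).fill(0xab);

// Hex-encode a byte array: two characters per byte plus the "0x" prefix.
function toHex(bytes: Uint8Array): string {
  let hex = "0x";
  bytes.forEach((b) => {
    hex += (b < 16 ? "0" : "") + b.toString(16);
  });
  return hex;
}

// Bytewise comparison: the first differing byte decides the order,
// and a shorter array that is a prefix of a longer one sorts first.
function compareBytes(a: Uint8Array, b: Uint8Array): number {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const diff = a[i] - b[i];
    if (diff !== 0) return diff;
  }
  return a.length - b.length;
}

const addressHex = toHex(addressBytes);
const sizes = { asBytes: addressBytes.length, asHexChars: addressHex.length };
```

Here the byte form is 20 bytes, while the hex string is 42 characters (40 hex digits plus the prefix). The comparison function needs no knowledge of character encodings or locales, which is what makes it cheap.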
It is now possible to use Bytes as the type for the id field of entities. It is highly recommended to use Bytes wherever possible and to reserve String for attributes that truly contain human-readable text, like the name of a token. Subgraph authors can simply change the type definition of the id attribute. Using the example from above:
type Transfer @entity(immutable: true) {
  id: Bytes!
  from: Bytes!
  to: Bytes!
  amount: BigDecimal
}
In addition, some code changes will be needed. The most obvious change is to remove a lot of calls to toHexString() so that code like
transfer.id = event.transaction.hash.toHexString()
becomes
transfer.id = event.transaction.hash
For entities whose id consists of the concatenation of a byte array with some counter, setting a string id to "${address}-${counter}" should be changed to simply appending the counter to the byte array, so that code like
let id = event.transaction.hash.toHexString().concat('-').concat(BigInt.fromI32(counter).toString())
becomes
let id = event.transaction.hash.concatI32(counter)
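For readers who want to see what such an id looks like at the byte level, here is a plain TypeScript sketch that appends a 32-bit integer to a byte array. It is a stand-in written for this post, not the graph-ts source; in particular, the little-endian byte order is an assumption of the sketch, and the 32-byte hash is filled with an arbitrary value.

```typescript
// Append a 32-bit signed integer to a byte array.
// Little-endian byte order is an assumption made for this sketch.
function concatI32(bytes: Uint8Array, value: number): Uint8Array {
  const out = new Uint8Array(bytes.length + 4);
  out.set(bytes, 0);
  new DataView(out.buffer).setInt32(bytes.length, value, true);
  return out;
}

// Stand-in for a 32-byte transaction hash; the contents are arbitrary.
const txHash = new Uint8Array(32).fill(0x11);
const counter = 1;

const transferId = concatI32(txHash, counter);
const transferIdLength = transferId.length;
```

The resulting id always has a fixed length of 36 bytes. The string form built from the hex-encoded hash, a dash, and the decimal counter is at least 68 characters long and grows with the number of digits in the counter.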
For entities that store aggregated data, for example, daily trade volumes and the like, the id usually contains the day number. Here, too, using a byte string as the id is beneficial. Determining the id would look like
let dayID = event.block.timestamp.toI32() / 86400
let id = Bytes.fromI32(dayID)
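The day-number arithmetic can be tried out in plain TypeScript as well. This is a sketch rather than mapping code: in a mapping the division operates on 32-bit integers, whereas plain TypeScript numbers are floating point, so the sketch uses Math.floor to get the same whole-day result. The helper names, the sample timestamp, and the little-endian encoding are assumptions made for illustration.

```typescript
const SECONDS_PER_DAY = 86400;

// Number of whole days since the Unix epoch for a timestamp in seconds.
function dayNumber(timestampSeconds: number): number {
  return Math.floor(timestampSeconds / SECONDS_PER_DAY);
}

// Encode a 32-bit integer as 4 bytes (little-endian is assumed here).
function i32ToBytes(value: number): Uint8Array {
  const out = new Uint8Array(4);
  new DataView(out.buffer).setInt32(0, value, true);
  return out;
}

// Hypothetical block timestamp, in seconds since the Unix epoch.
const sampleTimestamp = 1650000000;
const sampleDay = dayNumber(sampleTimestamp);
const sampleDayId = i32ToBytes(sampleDay);
```

Every block whose timestamp falls on the same UTC day maps to the same 4-byte id, so all trades of that day are aggregated into one entity.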
Finally, some constants that represent special addresses might have to be turned into byte strings, too, so that a definition like
const BEEF_ADDRESS = '0xdead...beef'
becomes
const BEEF_ADDRESS = Bytes.fromHexString('0xdead...beef')
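One side effect of this change is worth spelling out: two hex strings that differ only in letter case are different strings but decode to the same bytes. The plain TypeScript sketch below shows this with a hand-written parser. The helpers fromHex and bytesEqual are hypothetical stand-ins, not the graph-ts functions, and the short value 0xdeadbeef is an illustrative placeholder rather than the elided address above.

```typescript
// Parse a hex string (with or without "0x" prefix) into bytes.
function fromHex(hex: string): Uint8Array {
  const digits = hex.startsWith("0x") ? hex.slice(2) : hex;
  if (digits.length % 2 !== 0 || !/^[0-9a-fA-F]*$/.test(digits)) {
    throw new Error(`invalid hex string: ${hex}`);
  }
  const out = new Uint8Array(digits.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(digits.slice(2 * i, 2 * i + 2), 16);
  }
  return out;
}

// Byte arrays are objects, so they must be compared element by element.
function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  return a.every((value, i) => value === b[i]);
}

const lower = "0xdeadbeef";
const upper = "0xDEADBEEF";

const sameAsStrings = lower === upper;
const sameAsBytes = bytesEqual(fromHex(lower), fromHex(upper));
```

With string ids, a mismatch in letter case silently produces two different ids; with byte ids, both spellings refer to the same value.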
Edge & Node is a software development company and the initial team behind The Graph. We create and support protocols and dapps that are building the decentralized future. Learn more about the Edge & Node vision at edgeandnode.com and follow us on Twitter and LinkedIn.
About The Graph
The Graph is the indexing and query layer of web3. Developers build and publish open APIs, called subgraphs, that applications can query using GraphQL. The Graph currently supports indexing data from over 40 different networks including Ethereum, NEAR, Arbitrum, Optimism, ZkSync, Polygon, Avalanche, Celo, Fantom, Moonbeam, IPFS, Cosmos Hub and PoA with more networks coming soon. To date, 88,900+ subgraphs have been deployed on the hosted service. Tens of thousands of developers use The Graph for applications such as Uniswap, Synthetix, KnownOrigin, Art Blocks, Gnosis, Balancer, Livepeer, DAOstack, Audius, Decentraland, and many others.
The Graph Network’s self-service experience for developers launched in July 2021; since then, 800+ subgraphs have migrated to the Network, with 450+ Indexers serving subgraph queries, 11,300+ Delegators, and 2,500+ Curators to date. More than 5.6 million GRT has been signaled to date.
If you are a developer building a web3 application, you can use subgraphs for indexing and querying data from blockchains. The Graph allows applications to present data in a UI efficiently and allows other developers to use your subgraph too! You can deploy a subgraph to the network using the newly launched Subgraph Studio or query existing subgraphs that are in the Graph Explorer. The Graph would love to welcome you as an Indexer, Curator, or Delegator on The Graph’s mainnet. Join The Graph community by introducing yourself in The Graph Discord for technical discussions, join The Graph’s Telegram chat, and follow The Graph on Twitter, LinkedIn, Instagram, Facebook, Reddit, and Medium! The Graph’s developers and members of the community are always eager to chat with you, and The Graph ecosystem has a growing community of developers who support each other.
The Graph Foundation oversees The Graph Network. The Graph Foundation is overseen by the Technical Council. Edge & Node, StreamingFast, Semiotic Labs, The Guild, Messari and GraphOps are among the many organizations within The Graph ecosystem.