Substreams-Powered Subgraphs FAQ
Developed by StreamingFast, Substreams is an exceptionally powerful processing engine capable of consuming rich streams of blockchain data. Substreams allows you to refine and shape blockchain data for fast and seamless digestion by end-user applications. More specifically, Substreams is a blockchain-agnostic, parallelized, and streaming-first engine, serving as a blockchain data transformation layer. Powered by the Firehose, it enables developers to write Rust modules, build upon community modules, provide extremely high-performance indexing, and sink their data anywhere.
Go to the Substreams documentation to learn more about Substreams.
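To ground the idea of writing Rust modules, here is a minimal sketch of what a Substreams map module can look like. It assumes the `substreams` and `substreams-ethereum` crates (plus the `hex` crate for encoding), and hypothetical protobuf types `Transaction`/`Transactions` generated from a `.proto` file in the package; it is illustrative rather than a canonical implementation.

```rust
use substreams::errors::Error;
use substreams_ethereum::pb::eth::v2 as eth;

use crate::pb::example::{Transaction, Transactions}; // hypothetical generated types

#[substreams::handlers::map]
fn map_transactions(block: eth::Block) -> Result<Transactions, Error> {
    // Emit one record per transaction in the block; a real module would
    // typically decode specific events, calls, or storage changes here.
    let transactions = block
        .transaction_traces
        .iter()
        .map(|tx| Transaction {
            hash: hex::encode(&tx.hash),
            from: hex::encode(&tx.from),
            to: hex::encode(&tx.to),
        })
        .collect();
    Ok(Transactions { transactions })
}
```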
Substreams-powered subgraphs combine the power of Substreams with the queryability of subgraphs. When publishing a Substreams-powered Subgraph, the Substreams transformations output entity changes, which are compatible with subgraph entities.
If you are already familiar with subgraph development, note that Substreams-powered subgraphs can then be queried just as if they had been produced by the AssemblyScript transformation layer, with all the Subgraph benefits, like a dynamic and flexible GraphQL API.
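As a hedged sketch of the piece that makes a package subgraph-compatible, the snippet below shows a module (commonly named `graph_out`) that emits entity changes via the `substreams-entity-change` crate. The `Transactions` input and the table/field names are hypothetical and would need to match your own protobuf definitions and the subgraph's `schema.graphql`.

```rust
use substreams::errors::Error;
use substreams_entity_change::pb::entity::EntityChanges;
use substreams_entity_change::tables::Tables;

use crate::pb::example::Transactions; // hypothetical generated type

#[substreams::handlers::map]
fn graph_out(transactions: Transactions) -> Result<EntityChanges, Error> {
    let mut tables = Tables::new();
    for tx in transactions.transactions {
        // Each row becomes an entity change; the table and field names must
        // match the entities declared in the subgraph's schema.graphql.
        tables
            .create_row("Transaction", &tx.hash)
            .set("from", tx.from)
            .set("to", tx.to);
    }
    Ok(tables.to_entity_changes())
}
```

Graph Node then consumes this stream of entity changes and stores the resulting entities, making them queryable through GraphQL like any other subgraph data.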
Subgraphs are made up of data sources which specify onchain events, and how those events should be transformed via handlers written in AssemblyScript. These events are processed sequentially, based on the order in which events happen onchain.
By contrast, Substreams-powered subgraphs have a single data source which references a Substreams package, which is processed by Graph Node. Substreams have access to additional granular onchain data compared to conventional subgraphs, and can also benefit from massively parallelized processing, which can mean much faster processing times.
Substreams-powered subgraphs combine all the benefits of Substreams with the queryability of subgraphs. They bring greater composability and high-performance indexing to The Graph. They also enable new data use cases; for example, once you've built your Substreams-powered Subgraph, you can reuse your Substreams modules to output to different sinks such as PostgreSQL, MongoDB, and Kafka.
There are many benefits to using Substreams, including:
- Composable: You can stack Substreams modules like LEGO blocks, and build upon community modules, further refining public data.
- High-performance indexing: Orders of magnitude faster indexing through large-scale clusters of parallel operations (think BigQuery).
- Sink anywhere: Sink your data to anywhere you want: PostgreSQL, MongoDB, Kafka, subgraphs, flat files, Google Sheets.
- Programmable: Use code to customize extraction, do transformation-time aggregations, and model your output for multiple sinks.
- Access to additional data which is not available as part of the JSON RPC.
- All the benefits of the Firehose.
Developed by StreamingFast, the Firehose is a blockchain data extraction layer designed from scratch to process the full history of blockchains at speeds that were previously unseen. Providing a files-based and streaming-first approach, it is a core component of StreamingFast's suite of open-source technologies and the foundation for Substreams.
Go to the Firehose documentation to learn more about the Firehose.
There are many benefits to using Firehose, including:
- Lowest latency & no polling: In a streaming-first fashion, the Firehose nodes are designed to race to push out the block data first.
- Prevents downtimes: Designed from the ground up for High Availability.
- Never miss a beat: The Firehose stream cursor is designed to handle forks and to continue where you left off in any condition.
- Richest data model: Best data model that includes the balance changes, the full call tree, internal transactions, logs, storage changes, gas costs, and more.
- Leverages flat files: Blockchain data is extracted into flat files, the cheapest and most optimized computing resource available.
The Substreams documentation will teach you how to build Substreams modules.
Additional documentation will show you how to package them for deployment on The Graph.
Substreams codegen tooling will allow you to bootstrap a Substreams project without writing any code.
Rust modules are the equivalent of the AssemblyScript mappers in subgraphs. They are compiled to WASM in a similar way, but the programming model allows for parallel execution. They define the sort of transformations and aggregations you want to apply to the raw blockchain data.
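For the aggregation side specifically, Substreams provides store modules. The following is a hedged sketch of one, assuming the `substreams` crate and the hypothetical `Transactions` message from the earlier sketch; the key scheme and module name are illustrative.

```rust
use substreams::store::{StoreAdd, StoreAddInt64, StoreNew};

use crate::pb::example::Transactions; // hypothetical generated type

#[substreams::handlers::store]
fn store_transaction_counts(transactions: Transactions, store: StoreAddInt64) {
    // Keep a running count of transactions per sender. Because the store only
    // exposes a commutative operation (add), the engine can process block
    // ranges in parallel and merge the partial results afterwards.
    for tx in transactions.transactions {
        store.add(0, format!("sender:{}", tx.from), 1);
    }
}
```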
When using Substreams, the composition happens at the transformation layer, enabling cached modules to be re-used.
As an example, Alice can build a DEX price module, Bob can use it to build a volume aggregator for some tokens of his interest, and Lisa can combine four individual DEX price modules to create a price oracle. A single Substreams request will package all of these individuals' modules and link them together to offer a much more refined stream of data. That stream can then be used to populate a subgraph, and be queried by consumers.
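To make the composition concrete, here is a hedged sketch from the Rust side, using hypothetical `Prices`, `Transfers`, and `Volumes` protobuf types: Bob's module simply declares the output of Alice's price module as an input parameter, and the actual wiring between modules (including importing Alice's published package) is declared in the `substreams.yaml` manifest rather than in the code.

```rust
use substreams::errors::Error;

// Hypothetical protobuf types: `Prices` stands in for the output of an
// upstream (e.g. community-built) price module, `Transfers` for a local
// extraction module, and `Volumes` for this module's own output.
use crate::pb::example::{Prices, Transfers, Volume, Volumes};

#[substreams::handlers::map]
fn map_usd_volumes(prices: Prices, transfers: Transfers) -> Result<Volumes, Error> {
    // Join each transfer with the price for its token; which upstream modules
    // feed these two inputs is configured in the substreams.yaml manifest.
    let volumes = transfers
        .transfers
        .iter()
        .filter_map(|t| {
            prices
                .prices
                .iter()
                .find(|p| p.token == t.token)
                .map(|p| Volume {
                    token: t.token.clone(),
                    usd: t.amount * p.usd_price,
                })
        })
        .collect();
    Ok(Volumes { volumes })
}
```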
After building a Substreams-powered Subgraph, you can use the Graph CLI to deploy it in Subgraph Studio.
You can browse public repositories to find examples of Substreams and Substreams-powered subgraphs.
The integration promises many benefits, including extremely high-performance indexing and greater composability by leveraging community modules and building on them.