Substreams-powered subgraphs FAQ
Developed by StreamingFast, Substreams is an exceptionally powerful processing engine capable of consuming rich streams of blockchain data. Substreams allows you to refine and shape blockchain data for fast and seamless digestion by end-user applications. More specifically, Substreams is a blockchain-agnostic, parallelized, and streaming-first engine, serving as a blockchain data transformation layer. Powered by the Firehose, it enables developers to write Rust modules, build upon community modules, provide extremely high-performance indexing, and sink their data anywhere.
Go to the Substreams Documentation to learn more about Substreams.
Substreams-powered subgraphs combine the power of Substreams with the queryability of subgraphs. When publishing a Substreams-powered subgraph, the Substreams transformations can output entity changes that are compatible with subgraph entities.
If you are already familiar with subgraph development, note that Substreams-powered subgraphs can be queried just as if they had been produced by the AssemblyScript transformation layer, with all the subgraph benefits, such as a dynamic and flexible GraphQL API.
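For example, assuming a hypothetical `Transfer` entity defined in the subgraph's GraphQL schema (the entity and field names here are illustrative, not from any specific package), consumers could query a Substreams-powered subgraph exactly like any other subgraph:

```graphql
{
  transfers(first: 5, orderBy: blockNumber, orderDirection: desc) {
    id
    from
    to
    amount
    blockNumber
  }
}
```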
Subgraphs are made up of data sources which specify on-chain events and how those events should be transformed via handlers written in AssemblyScript. These events are processed sequentially, based on the order in which events happen on-chain.
By contrast, Substreams-powered subgraphs have a single data source which references a Substreams package, which is processed by Graph Node. Compared to conventional subgraphs, Substreams have access to additional granular on-chain data, and can also benefit from massively parallelized processing, which can mean much faster processing times.
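To make the single-data-source shape concrete, a Substreams-powered subgraph manifest might look roughly like the sketch below. The package file name, subgraph name, and network are illustrative assumptions; consult the Substreams-powered subgraphs documentation for the exact fields required by your Graph Node version.

```yaml
specVersion: 1.0.0
description: Example Substreams-powered subgraph
schema:
  file: ./schema.graphql
dataSources:
  # A single data source of kind `substreams`, in place of the
  # multiple contract/event data sources a conventional subgraph uses.
  - kind: substreams
    name: example_substreams
    network: mainnet
    source:
      package:
        # The module whose output is mapped to subgraph entity changes.
        moduleName: graph_out
        # Path to the compiled Substreams package (illustrative name).
        file: ./example-package-v0.1.0.spkg
    mapping:
      kind: substreams/graph-entities
      apiVersion: 0.0.5
```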
Substreams-powered subgraphs combine all the benefits of Substreams with the queryability of subgraphs. They bring greater composability and high-performance indexing to The Graph. They also enable new data use cases; for example, once you've built your Substreams-powered Subgraph, you can reuse your Substreams modules to output to different sinks such as PostgreSQL, MongoDB, and Kafka.
There are many benefits to using Substreams, including:
Composable: You can stack Substreams modules like LEGO blocks, and build upon community modules, further refining public data.
High-performance indexing: Orders of magnitude faster indexing through large-scale clusters of parallel operations (think BigQuery).
Sink anywhere: Sink your data anywhere you want: PostgreSQL, MongoDB, Kafka, subgraphs, flat files, Google Sheets.
Programmable: Use code to customize extraction, do transformation-time aggregations, and model your output for multiple sinks.
Access to additional data which is not available as part of the JSON-RPC API.
All the benefits of the Firehose.
Developed by StreamingFast, the Firehose is a blockchain data extraction layer designed from scratch to process the full history of blockchains at speeds that were previously unseen. Providing a files-based and streaming-first approach, it is a core component of StreamingFast's suite of open-source technologies and the foundation for Substreams.
Go to the documentation to learn more about the Firehose.
There are many benefits to using Firehose, including:
Lowest latency & no polling: In a streaming-first fashion, the Firehose nodes are designed to race to push out the block data first.
Prevents downtime: Designed from the ground up for high availability.
Never miss a beat: The Firehose stream cursor is designed to handle forks and to continue where you left off in any condition.
Richest data model: Includes balance changes, the full call tree, internal transactions, logs, storage changes, gas costs, and more.
Leverages flat files: Blockchain data is extracted into flat files, the cheapest and most optimized computing resource available.
The Substreams documentation will teach you how to build Substreams modules.
The Substreams-powered subgraphs documentation will show you how to package them for deployment on The Graph.
Rust modules are the equivalent of the AssemblyScript mappers in subgraphs. They are compiled to WASM in a similar way, but the programming model allows for parallel execution. They define the sort of transformations and aggregations you want to apply to the raw blockchain data.
See modules documentation for details.
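As an illustrative sketch of the programming model (the output type, protobuf path, and decoding helper below are hypothetical; only the `#[substreams::handlers::map]` attribute and the Ethereum `Block` type come from the StreamingFast crates), a map module receives raw block data and emits a refined stream:

```rust
use substreams::errors::Error;
// Generated protobuf types: `Block` comes from the Ethereum Firehose
// definitions; `Transfers` is a hypothetical output message you would
// define in your own .proto file.
use substreams_ethereum::pb::eth::v2::Block;
use crate::pb::example::Transfers;

// A map module: pure function from input to output. Because modules
// have no hidden state, Substreams can execute them in parallel
// across block ranges and cache their results for reuse.
#[substreams::handlers::map]
fn map_transfers(block: Block) -> Result<Transfers, Error> {
    let transfers = block
        .logs()
        // `decode_transfer` is a hypothetical helper that decodes an
        // ERC-20 Transfer event from a log, returning None otherwise.
        .filter_map(|log| decode_transfer(&log))
        .collect();
    Ok(Transfers { transfers })
}
```

Because the output is a plain protobuf message, the same module can feed a subgraph, a database sink, or another downstream module unchanged.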
When using Substreams, composition happens at the transformation layer, enabling cached modules to be reused.
As an example, Alice can build a DEX price module, Bob can use it to build a volume aggregator for tokens of interest, and Lisa can combine four individual DEX price modules to create a price oracle. A single Substreams request will package all of these individuals' modules and link them together to offer a much more refined stream of data. That stream can then be used to populate a subgraph, and be queried by consumers.
You can visit this GitHub repo to find examples of Substreams and Substreams-powered subgraphs.
The integration promises many benefits, including extremely high-performance indexing and greater composability by leveraging community modules and building on them.