The Graph Now Supports Solana with Substreams
The Graph Foundation is excited to announce support for Solana with substreams. The Solana developer community can now begin using The Graph to build lightning-fast dapps. By using the new substreams technology, developers can efficiently extract and interpret on-chain data from Solana’s mainnet-beta to feed their applications. Providing support with substreams is the first step in bringing subgraphs to Solana.
Substreams, which are fully open-source, empower Solana developers to build with on-chain data in brand new ways, thanks to their speed and efficiency. Developers can use substreams modules, coded in Rust, to build protocol-specific data streams or market-wide analytical datasets. They can also be used to power real-time notifications, and display long, time-series information. Breaking out of walled gardens, substreams devs can leverage streams created by others to save time, and can empower the whole web3 ecosystem by making their work openly available and composable. As a result, substreams give rise to new and innovative use cases throughout the Solana developer community.
Developed by StreamingFast, a core developer in The Graph ecosystem, substreams allow for extremely fast historical processing (in batch and in streaming). Substreams open the door to many benefits, including: feeding any data system through technology-specific sinks, reusing your Solana program’s Rust code to read on-chain data, a laser-focused debugging experience, communal and composable refinement of data streams, and reliable reorg-aware streams.
A true industry-shifting technology, substreams are poised to unlock subgraph performance with parallel data processing to greatly increase syncing speeds. Through a horizontally scalable parallel engine, substreams are capable of multiplying historical indexing performance by more than 100x.
Developers can utilize substreams to generate new and exciting use cases, such as cross-chain bridges, large-scale analytics, refined intelligence for block explorers, trading engines, and any application in need of a rich, consistent data stream.
A free-to-use hosted service for this technology will be available until it is deployed on The Graph Network.
“Solana support on The Graph has been long-awaited and we’re thrilled to be providing an efficient way to get historical Solana data using substreams, a new cutting-edge architecture for data streaming.”
How Do Substreams Work?
Substreams are new data sources on The Graph that more efficiently extract enriched data through modules built in Rust. While substreams can be used independently, there are ongoing efforts to integrate substreams to power subgraphs and be supported on The Graph Network.
The culmination of years of research and development, substreams were created by StreamingFast, a core dev that has worked across many chains to learn the needs for data-indexing architecture. Following the launch of Firehose, which revealed the potential efficiencies of extracting data to optimize indexing, substreams were created to unlock greater opportunities for dapp developers.
In an ETL (extract, transform, load) analogy, substreams are the transformation layer, whereas Firehose is the extraction layer. By contrast, subgraphs provide the full ETLQ experience, including the load and query layers.
RPC-based indexing technologies usually poll APIs exposed by the native chain clients. Firehose technology replaces those polling API calls with a stream of data that uses a push model, sending data to the indexing node faster. This increases the speed of syncing and indexing, and does away with most needs for archive nodes.
Substreams, which are blockchain agnostic, take things even further by enabling massively parallelized streaming data. Substreams can be combined in powerful new ways to feed data into subgraphs or end-user applications in a fraction of the time. Early testing on some subgraphs saw sync speed increases of over 100x with substreams parallelization.
Because substreams support stateful modules, analytic use cases can aggregate computations across the history of the chain, even in parallel, enabling new powerful ad-hoc analysis to be performed.
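The idea of aggregating across chain history in parallel can be sketched in a few lines. This is a simplified Python illustration, not the substreams engine or its API: the block data and the names `aggregate_segment`, `merge_partial_stores`, and `aggregate_parallel` are hypothetical. It assumes an additive store, so partial results computed over separate block ranges can be merged afterwards.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical, hand-made blocks: each carries a list of (account, amount) transfers.
BLOCKS = [
    {"number": n, "transfers": [("alice", 1), ("bob", 2)] if n % 2 == 0 else [("alice", 3)]}
    for n in range(1, 9)
]


def split_into_segments(blocks, segment_size):
    """Cut the block history into contiguous ranges that can be processed independently."""
    return [blocks[i:i + segment_size] for i in range(0, len(blocks), segment_size)]


def aggregate_segment(segment):
    """Stateful step for one range: sum transferred amounts per account."""
    partial = Counter()
    for block in segment:
        for account, amount in block["transfers"]:
            partial[account] += amount
    return partial


def merge_partial_stores(partials):
    """Combine per-range results; valid because addition is associative."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts rather than replacing them
    return total


def aggregate_parallel(blocks, segment_size=2, workers=4):
    """Process every range concurrently, then merge the partial stores."""
    segments = split_into_segments(blocks, segment_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(aggregate_segment, segments))
    return merge_partial_stores(partials)


if __name__ == "__main__":
    print(dict(aggregate_parallel(BLOCKS)))
```

The parallel result matches a single sequential pass over the same blocks because the merge step only adds partial sums. Aggregations that depend on ordering would need a different merge policy.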
Substreams can feed into many sinks, with Postgres and MongoDB already available, and graph-node integration on its way. Substreams can also easily be consumed by simple programs written in any language that supports gRPC (Python, Go, Rust, C, C#, C++, Java, Kotlin), feeding into any system you may already have.
With many Solana programs being written in Rust, instructions can be decoded in substreams using the same code you use to validate transactions on-chain, targeting WebAssembly instead of BPF.
Being fully deterministic, substreams have excellent caching capabilities. They allow you to leverage the cached state of previously executed modules to jump into the middle of history and zero in on a bug without starting over from the beginning. Once the dependencies of your module have been processed once, anyone can start building off of them, at any point in on-chain history. This massively improves agility and speed of iteration.
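To make the caching idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how the substreams engine stores state: `SnapshotCache`, `running_total`, the snapshot interval, and the block values are all hypothetical. The point is only that a deterministic module lets you keep periodic state snapshots and replay from the nearest one instead of from the first block.

```python
import hashlib


def running_total(state, block_value):
    """A deterministic module step: same state and input always give the same output."""
    return state + block_value


class SnapshotCache:
    """Keeps module state at fixed block intervals so later runs can start mid-history."""

    def __init__(self, interval):
        self.interval = interval
        self.snapshots = {}        # (module_id, block_number) -> state
        self.blocks_replayed = 0   # counts how much work was actually done

    def state_at(self, module_id, step, blocks, target):
        # Walk back from the target to the latest cached snapshot, if any.
        start, state = 0, 0
        boundary = (target // self.interval) * self.interval
        while boundary > 0:
            if (module_id, boundary) in self.snapshots:
                start, state = boundary, self.snapshots[(module_id, boundary)]
                break
            boundary -= self.interval
        # Replay only the blocks after the snapshot, caching new snapshots on the way.
        for number in range(start + 1, target + 1):
            state = step(state, blocks[number])
            self.blocks_replayed += 1
            if number % self.interval == 0:
                self.snapshots[(module_id, number)] = state
        return state


# Hypothetical chain: block n carries the value n. The module id is a hash of
# a label standing in for the module's code, so a code change invalidates the cache.
BLOCKS = {n: n for n in range(1, 1001)}
MODULE_ID = hashlib.sha256(b"running_total:v1").hexdigest()

if __name__ == "__main__":
    cache = SnapshotCache(interval=100)
    cache.state_at(MODULE_ID, running_total, BLOCKS, 950)   # cold run from block 1
    cache.state_at(MODULE_ID, running_total, BLOCKS, 960)   # resumes from block 900
    print(cache.blocks_replayed)
```

In this sketch the second request replays 60 blocks rather than 960, because it starts from the snapshot taken at block 900.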
Substreams also create new forms of in-flight composition. This means that modules taken from different authors can be combined together at the time of transformation, not at a later query time.
As an example, substreams make it possible to use a Serum price module developed by team A, combine it with a Metaplex sales module developed by team B, and then create a third module, developed by yourself, that outputs an enriched and refined USD volume of trades. Each stream would stay independently composable. So, if you need to access data on prices, you could just hook into the prices module; or if you need sales volumes, you could just hook into the sales volume module.
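The example above can be sketched as three small functions. This is a conceptual Python illustration only: real substreams modules are written in Rust, and the block layout, field names, and function names here are hypothetical stand-ins for the price, sales, and volume modules.

```python
# Hypothetical decoded block containing DEX trades and NFT sales.
BLOCK = {
    "dex_trades": [{"pair": "SOL/USD", "price_usd": 20.0}],
    "nft_sales": [
        {"collection": "collection_a", "amount_sol": 1.5},
        {"collection": "collection_a", "amount_sol": 2.5},
        {"collection": "collection_b", "amount_sol": 4.0},
    ],
}


def prices_module(block):
    """Team A: extract the latest USD price per trading pair."""
    return {trade["pair"]: trade["price_usd"] for trade in block["dex_trades"]}


def sales_module(block):
    """Team B: extract NFT sales denominated in SOL."""
    return [
        {"collection": sale["collection"], "amount_sol": sale["amount_sol"]}
        for sale in block["nft_sales"]
    ]


def usd_volume_module(prices, sales):
    """Your module: combine both upstream outputs into USD volume per collection."""
    sol_usd = prices["SOL/USD"]
    volume = {}
    for sale in sales:
        name = sale["collection"]
        volume[name] = volume.get(name, 0.0) + sale["amount_sol"] * sol_usd
    return volume


if __name__ == "__main__":
    prices = prices_module(BLOCK)   # usable on its own
    sales = sales_module(BLOCK)     # usable on its own
    print(usd_volume_module(prices, sales))
```

Each upstream output stays available by itself, so a consumer that only needs prices can call the prices step and ignore the rest.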
Lastly, reliability is baked into the substreams technology in the form of a cursor that accompanies every streamed payload. This cursor can be sent back in the next request in case of disconnection, just like a web cookie, and guarantees that you will never miss a reorg signal, even if the event happened while you were disconnected.
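A consumer that relies on the cursor might look roughly like the sketch below. It models the behavior in plain Python and does not use the real gRPC interface: `STREAM_LOG`, `stream_from`, the `new`/`undo` step names, and the readable cursor strings are assumptions made for illustration (real cursors should be treated as opaque values).

```python
# Hypothetical record of everything the stream emitted, in order.
# Block 3 is first delivered with value 99, then undone by a reorg and replaced.
STREAM_LOG = [
    {"cursor": "c1", "step": "new", "block": 1, "value": 10},
    {"cursor": "c2", "step": "new", "block": 2, "value": 20},
    {"cursor": "c3", "step": "new", "block": 3, "value": 99},
    {"cursor": "c4", "step": "undo", "block": 3, "value": 99},
    {"cursor": "c5", "step": "new", "block": 3, "value": 30},
    {"cursor": "c6", "step": "new", "block": 4, "value": 40},
]


def stream_from(cursor):
    """Yield every message after the given cursor, or from the start if it is None."""
    start = 0
    if cursor is not None:
        start = next(i for i, m in enumerate(STREAM_LOG) if m["cursor"] == cursor) + 1
    yield from STREAM_LOG[start:]


class Consumer:
    def __init__(self):
        self.cursor = None   # last cursor seen; persisted alongside the data
        self.blocks = {}     # block number -> value currently considered canonical

    def handle(self, message):
        if message["step"] == "new":
            self.blocks[message["block"]] = message["value"]
        else:
            # Undo signal: drop the data that came from the orphaned block.
            self.blocks.pop(message["block"], None)
        self.cursor = message["cursor"]

    def run(self, max_messages=None):
        """Consume the stream; max_messages simulates a dropped connection."""
        for count, message in enumerate(stream_from(self.cursor), start=1):
            self.handle(message)
            if max_messages is not None and count >= max_messages:
                break


if __name__ == "__main__":
    consumer = Consumer()
    consumer.run(max_messages=3)   # disconnects right after the soon-to-be-orphaned block
    consumer.run()                 # reconnects with the saved cursor
    print(consumer.blocks, consumer.cursor)
```

Because the consumer resumes from its saved cursor, the undo for block 3 is still delivered after the reconnect, and the stale value is replaced rather than silently kept.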
A full-fledged integration of substreams across chains as well as a subgraph-substreams integration to bring performance improvements to subgraphs is coming soon! When combining the speed and data composability of subgraphs and substreams with unpacked blockchain data from the Firehose, The Graph is unarguably the fastest and most efficient way to get data from blockchains.
How to Get Started Indexing Solana Data with The Graph
You can access Solana today from the hosted service, while we work to bring substreams as a native product of The Graph’s decentralized network economics. Here are some resources to help you get started:
- Substreams
- Substreams
- Substreams Playground on
- Chains & Endpoints
- Devcon 2022: Introducing The Graph Substreams for High-Performance Indexing
About StreamingFast
StreamingFast is a web3 builder and investor. As a core developer on The Graph, it excels at building massively scalable open-source software for processing and indexing blockchain data. Founded by a team of serial tech entrepreneurs, the company has deep expertise in large-scale data science. Its core innovations, Firehose and Substreams, form a files-based and streaming-first approach that enables high-performance indexing on high-throughput chains.
About The Graph
The Graph is the source of data and information for the decentralized internet. As the original decentralized data marketplace that introduced and standardized subgraphs, The Graph has become web3’s method of indexing and accessing blockchain data. Since its launch in 2018, tens of thousands of developers have built subgraphs for dapps across 90+ blockchains, including Ethereum, Solana, Arbitrum, Optimism, Base, Polygon, Celo, Fantom, Gnosis, and Avalanche.
What Is The Graph?
The Graph is a decentralized protocol for indexing and querying blockchain data, empowering developers to build scalable web3 applications without managing infrastructure. Subgraphs, open APIs that structure and expose on-chain data, have established The Graph as the standard for accessing blockchain information across 90+ networks, including Ethereum, Solana, Arbitrum, Optimism, and Polygon.
To enhance data processing, The Graph introduced Substreams, a high-performance, parallelized indexing technology. Substreams enables developers to:
- Extract blockchain data from networks like Ethereum, Solana, and Polygon.
- Apply custom data transformations.
- Stream or store processed data in databases or files.
By leveraging Substreams packages, developers can define precise extraction parameters, such as retrieving data from the Uniswap v3 smart contract, and direct the output to suit their needs.
As demand for data in web3 continues to grow, The Graph is pursuing a more expansive vision that includes new data services and query languages, ensuring the decentralized protocol can serve any use case, now and into the future.
Discover more about how The Graph is shaping the future of decentralized physical infrastructure networks (DePIN) and stay connected with the community.
The Graph Foundation oversees The Graph Network.