The Graph Now Supports Solana with Substreams

The Graph Foundation is excited to announce support for Solana with substreams. The Solana developer community can now begin using The Graph to build lightning-fast dapps. By using the new substreams technology, developers can efficiently extract and interpret on-chain data from Solana’s mainnet-beta to feed their applications. Providing support with substreams is the first step in bringing subgraphs to Solana.

Substreams, which are fully open-source, empower Solana developers to build with on-chain data in brand new ways, thanks to their speed and efficiency. Developers can use substreams modules, coded in Rust, to build protocol-specific data streams or market-wide analytical datasets. They can also be used to power real-time notifications, and display long, time-series information. Breaking out of walled gardens, substreams devs can leverage streams created by others to save time, and can empower the whole web3 ecosystem by making their work openly available and composable. As a result, substreams give rise to new and innovative use cases throughout the Solana developer community.

Developed by StreamingFast, a core developer in The Graph ecosystem, substreams allow for extremely fast historical processing (in batch and in streaming). Substreams open the door to many benefits, including: feeding any data systems through technology-specific sinks, reusing your Solana program’s Rust code to read on-chain data, a laser-focused debugging experience, communal and composable refinement of data streams, and reliable reorg-aware streams.

A true industry-shifting technology, substreams are poised to unlock subgraph performance with parallel data processing to greatly increase syncing speeds. Through a horizontally scalable parallel engine, substreams are capable of multiplying historical indexing performance by more than 100x.

Developers can utilize substreams to generate new and exciting use cases, such as cross-chain bridges, large-scale analytics, refined intelligence for block explorers, trading engines, and any application in need of a rich, consistent data stream.

A free-to-use hosted service for this technology will be available until it is deployed on The Graph Network.


Solana support on The Graph has been long-awaited and we’re thrilled to be providing an efficient way to get historical Solana data using substreams, a new cutting-edge architecture for data streaming.
Eva Beylin, Director of The Graph Foundation

How Do Substreams Work?

Substreams are new data sources on The Graph that more efficiently extract enriched data through modules built in Rust. While substreams can be used independently, there are ongoing efforts to integrate substreams to power subgraphs and be supported on The Graph Network.

The culmination of years of research and development, substreams were created by StreamingFast, a core developer that has worked across many chains and learned what a data-indexing architecture needs. Following the launch of Firehose, which revealed how much efficiency could be gained by optimizing data extraction for indexing, substreams were created to unlock greater opportunities for dapp developers.

In an ETL (extract, transform, load) analogy, substreams are the transformation layer, whereas Firehose is the extraction layer. By contrast, subgraphs provide the full ETLQ experience, including the load and query layers.

RPC-based indexing technologies usually poll the APIs of the native chain clients. Firehose replaces those polling calls with a push-model data stream that delivers data to the indexing node faster. This speeds up syncing and indexing, and removes most of the need for archive nodes.

Substreams, which are blockchain agnostic, take things even further by enabling massively parallelized streaming data. Substreams can be combined in powerful new ways to feed data into subgraphs or end-user applications in a fraction of the time. Early testing on some subgraphs saw sync speed increases of over 100x with substreams parallelization.

Because substreams support stateful modules, analytic use cases can aggregate computations across the history of the chain, even in parallel, enabling new powerful ad-hoc analysis to be performed.
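As a rough illustration of why stateful aggregation can run in parallel, here is a minimal sketch in plain Python. It does not use the Substreams Rust API. It assumes each segment of block history yields a partial store and that partial stores are merged with an associative operation (addition here). The toy data and the names `aggregate_segment`, `merge_stores`, and `parallel_totals` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy history: (block_number, program_name, amount).
BLOCKS = [
    (1, "serum", 10),
    (2, "metaplex", 5),
    (3, "serum", 7),
    (4, "serum", 3),
    (5, "metaplex", 20),
    (6, "serum", 1),
]


def aggregate_segment(segment):
    """Build a partial store (program -> total amount) for one block range."""
    store = Counter()
    for _block_number, program, amount in segment:
        store[program] += amount
    return store


def merge_stores(partials):
    """Fold partial stores together; addition is associative, so segments can be computed independently."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts
    return total


def parallel_totals(blocks, segment_size):
    """Split history into fixed-size segments, aggregate each in parallel, then merge."""
    segments = [blocks[i:i + segment_size] for i in range(0, len(blocks), segment_size)]
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(aggregate_segment, segments))
    return dict(merge_stores(partials))
```

Calling `parallel_totals(BLOCKS, 2)` gives the same totals as one sequential pass over the whole history, which is the property that makes splitting the work safe.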

Substreams can feed into many sinks, with Postgres and MongoDB already available, and graph-node integration on its way. Substreams can also easily be consumed by simple programs written in any language that supports gRPC (Python, Go, Rust, C, C#, C++, Java, Kotlin and more), feeding into any system you may already have.

Because many Solana programs are written in Rust, instructions can be decoded in substreams using the same code you use to validate transactions on-chain, targeting WebAssembly instead of BPF.

Because they are fully deterministic, substreams cache extremely well. You can leverage the cached state of previously executed modules to jump into the middle of history and zero in on a bug without starting over from the beginning. Once the dependencies of your module have been processed, anyone can start building on them at any point in on-chain history. This greatly improves agility and speed of iteration.
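The caching idea can be sketched in a few lines of Python. This is a conceptual model only, not how the Substreams engine stores its data. It assumes a module is identified by a hash of its code and that a snapshot of its state is cached at the end of every segment. `CACHE`, `CALLS`, `module_id`, `run_segment`, and `state_at` are all hypothetical names.

```python
import hashlib

# Hypothetical cache: (module_id, segment_start) -> running total at segment end.
CACHE = {}
CALLS = []  # segment starts that were actually computed, for illustration


def module_id(source_code):
    """Identify a module by a hash of its code, so changed code never reuses old cache entries."""
    return hashlib.sha256(source_code.encode()).hexdigest()[:12]


def run_segment(mod_id, start, blocks, previous_total):
    """Process one segment deterministically, reusing the cached snapshot if one exists."""
    key = (mod_id, start)
    if key not in CACHE:
        CALLS.append(start)
        CACHE[key] = previous_total + sum(blocks)
    return CACHE[key]


def state_at(mod_id, history, segment_size, target_segment):
    """Return the running total at the end of `target_segment` (0-based)."""
    total = 0
    for index in range(target_segment + 1):
        start = index * segment_size
        total = run_segment(mod_id, start, history[start:start + segment_size], total)
    return total
```

After a first call has populated the cache, asking for the state at any earlier or equal segment triggers no recomputation. That is the sense in which a deterministic module lets you jump into the middle of history.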

Substreams also create new forms of in-flight composition. This means that modules taken from different authors can be combined together at the time of transformation, not at a later query time.

As an example, substreams make it possible to use a Serum price module developed by team A, combine it with a Metaplex sales module developed by team B, and then create a third, enriched and refined USD volume of trades, developed by yourself. Each stream would stay independently composable. So, if you need to access data on prices, you could just hook into the prices module; or if you need sales volumes, you could just hook into the sales volume module.
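The composition described above might look like the following Python sketch. It is a conceptual illustration and not the Substreams module format. The block layout, the integer USD prices, and the function names `prices_module`, `sales_module`, and `usd_volume_module` are assumptions made for the example.

```python
# Hypothetical decoded blocks: prices are whole USD per token, sales are (token, amount).
BLOCKS = [
    {"slot": 100, "prices": {"SOL": 30}, "sales": [("SOL", 2), ("SOL", 1)]},
    {"slot": 101, "prices": {"SOL": 32}, "sales": [("SOL", 5)]},
]


def prices_module(blocks):
    """Team A's module (hypothetical): emit token prices per slot."""
    for block in blocks:
        yield {"slot": block["slot"], "prices": block["prices"]}


def sales_module(blocks):
    """Team B's module (hypothetical): emit sales per slot."""
    for block in blocks:
        yield {"slot": block["slot"], "sales": block["sales"]}


def usd_volume_module(price_stream, sales_stream):
    """Your module: join the two upstream streams slot by slot into a USD volume."""
    for priced, sold in zip(price_stream, sales_stream):
        volume = sum(amount * priced["prices"][token] for token, amount in sold["sales"])
        yield {"slot": priced["slot"], "usd_volume": volume}
```

Each upstream function remains usable on its own: a consumer that only needs prices can iterate over `prices_module(BLOCKS)` directly, while `usd_volume_module(prices_module(BLOCKS), sales_module(BLOCKS))` combines both during transformation.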

Lastly, reliability is baked into substreams in the form of a cursor that accompanies every streamed payload. If you are disconnected, you can send this cursor back in your next request, much like a web cookie, and it guarantees that you will never miss a reorg signal, even if the reorg happened while you were disconnected.
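A minimal Python sketch of cursor-based resumption follows. It is a toy model, not the Substreams gRPC API: real cursors are opaque values issued by the server, whereas here the cursor is simply a position in a hypothetical event log. `EVENTS`, `stream`, and `Consumer` are illustrative names.

```python
# Hypothetical server-side event log: (step, block_id).
EVENTS = [
    ("new", "100a"),
    ("new", "101a"),
    ("undo", "101a"),  # reorg: block 101a was dropped from the chain
    ("new", "101b"),
    ("new", "102b"),
]


def stream(cursor=None):
    """Yield (step, block_id, cursor) for every event after the given cursor."""
    start = 0 if cursor is None else int(cursor) + 1
    for position in range(start, len(EVENTS)):
        step, block_id = EVENTS[position]
        yield step, block_id, str(position)


class Consumer:
    """Tracks the current chain view and the last cursor it processed."""

    def __init__(self):
        self.chain = []
        self.cursor = None

    def consume(self, max_events=None):
        """Read from the stream; stopping after `max_events` simulates a disconnect."""
        for count, (step, block_id, cursor) in enumerate(stream(self.cursor)):
            if max_events is not None and count >= max_events:
                break
            if step == "new":
                self.chain.append(block_id)
            else:  # "undo": roll back the block that was reorged out
                self.chain.remove(block_id)
            self.cursor = cursor
```

If a consumer reads two events, disconnects, and later calls `consume()` again, the stored cursor makes the stream resume at the undo signal for `101a`, so the consumer's view converges on the post-reorg chain.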

A full-fledged integration of substreams across chains, as well as a subgraph-substreams integration that brings performance improvements to subgraphs, are coming soon! When combining the speed and data composability of subgraphs and substreams with unpacked blockchain data from the Firehose, The Graph is unarguably the fastest and most efficient way to get data from blockchains.

How to Get Started Indexing Solana Data with The Graph

You can access Solana today from the hosted service, while we work to bring substreams as a native product of The Graph’s decentralized network economics. Here are some resources to help you get started:

  • Substreams developer guide
  • Substreams documentation
  • Substreams Playground on GitHub
  • Chains & Endpoints documentation
  • Devcon 2022: Introducing The Graph Substreams for High-Performance Indexing video

About StreamingFast

StreamingFast is a web3 builder and investor. As a core developer on The Graph, it excels at building massively scalable open-source software for processing and indexing blockchain data. Founded by a team of serial tech entrepreneurs, the company has deep expertise in large-scale data science. Its core innovations, the Firehose and Substreams, take a files-based and streaming-first approach that enables high-performance indexing on high-throughput chains.

You can follow StreamingFast on Twitter and on Discord.

About The Graph

The Graph is the source of data and information for the decentralized internet. As the original decentralized data marketplace that introduced and standardized subgraphs, The Graph has become web3’s method of indexing and accessing blockchain data. Since its launch in 2018, tens of thousands of developers have built subgraphs for dapps across 40+ blockchains - including Ethereum, Arbitrum, Optimism, Base, Polygon, Celo, Fantom, Gnosis, and Avalanche.

As demand for data in web3 continues to grow, The Graph enters a New Era with a more expansive vision including new data services and query languages, ensuring the decentralized protocol can serve any use case - now and into the future.

Discover more about how The Graph is shaping the future of decentralized physical infrastructure networks (DePIN) and stay connected with the community. Follow The Graph on X, LinkedIn, Instagram, Facebook, Reddit, and Medium. Join the community on The Graph’s Telegram, and join technical discussions on The Graph’s Discord.

The Graph Foundation oversees The Graph Network. The Graph Foundation is overseen by the Technical Council. Edge & Node, StreamingFast, Semiotic Labs, The Guild, Messari, GraphOps, Pinax and Geo are eight of the many organizations within The Graph ecosystem.


Category
Graph Protocol
Author
The Graph Foundation
Published
November 3, 2022