The Graph R&D Roadmap

author
The Graph Foundation
March 23, 2022

The Graph is a web3 protocol for indexing and querying blockchain data. Since the launch of The Graph Network in December 2020, core contributors in The Graph community have been working to improve the protocol and to empower users to access blockchain data in a verifiable and decentralized way. Multiple independent teams are driving research and development, continually advancing the vision of The Graph and the mission of web3.

Today, The Graph Foundation is excited to share The Graph R&D Roadmap, a collaborative plan being executed by contributors from around the world, including Edge & Node, StreamingFast, Figment, The Guild, Semiotic AI, GraphOps, LimeChain, BlockScience, Prysm Group and other independent researchers. This roadmap draws on years of research and design across multiple teams to enhance The Graph Network and fulfill the data needs of dapp developers and consumers.

The developments detailed in this roadmap include: significant improvements to indexing performance and streaming architecture with Firehose, new data sources and chain support, work towards a SNARK proof for scalable state channels and verifiable querying, upgrades to Delegator and Curator mechanisms, layer 2 scaling, more gateways, improvements to the subgraph developer experience, Indexer tooling optimizations, and much more! Additionally, some of the milestones in this roadmap will enable Ethereum dapp developers to access historical data after the implementation of EIP-4444, an Ethereum client upgrade.

Roadmap

An integral part of continuing to push decentralization of The Graph is decentralizing the way teams collaborate on core elements of the protocol. To make it easier for teams developing The Graph to collaborate cross-functionally, R&D working groups were created for distinct focus areas. The Graph is a core component of the web3 stack, and the working groups cover key areas of the protocol that web3 developers rely on.

There are five distinct working groups that make up the focus areas of the roadmap:

  • Data & APIs
  • SNARK Force
  • Protocol Economics
  • Protocol & Network Operations
  • Indexer Experience

These working groups enable teams in The Graph community to contribute to different components of the protocol in parallel and scale coordination more efficiently. There are also many dependencies across working groups, where outcomes unlock efforts for others (illustrated below by the connecting arrows).

[Figure: the five working groups and the dependencies between them]

The Graph core contributors are building in public. Teams are collaborating in the Graph Protocol GitHub and meeting monthly at the Core R&D Call to discuss updates to work streams, engage in cross-functional brainstorming, and receive feedback from one another. Subscribe to The Graph Ecosystem Calendar to attend these calls! Progress for each working group can also be tracked in the new Core R&D Workspace dashboard and in GIPs hosted on Radicle, and feedback can be found in The Graph Forum.

Working Groups In-Depth

Data & APIs

The Data & APIs working group focuses on all things Graph Node and the subgraph ecosystem, ensuring data can be indexed and served in a performant manner. The Graph Network is home to many Indexers providing high quality indexing services that subgraph developers, dapps and data consumers can depend on.

One of the top priorities for The Graph is improving indexing performance and reliability to ensure indexing uptime, speed and scalability. Core contributors are focused on several work streams to improve performance: developing a parallel data execution model, introducing a novel hashing algorithm for Proofs of Indexing (POIs) that is 100x faster, adopting a novel streaming architecture, and building a new blockchain data extraction and ingestion mechanism through Firehose.

Firehose is a critical component of the indexing stack that enables highly efficient access to raw blockchain data. There is significant focus on standardizing the data extraction and developing multi-chain integrations with Firehose so all chains supported by The Graph can benefit from the streaming efficiency of the Firehose framework.

With the introduction of streaming architecture, the network will unlock unparalleled indexing speeds and new use cases such as token balances and transfers. This new framework is focused on data flows and will allow for efficient parallel data processing and sharing, introducing new capabilities such as pipelining, filtering, and aggregations through the Firehose and new interfaces. More details on this work stream will be shared as a GIP and during subsequent Core R&D Calls.
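To make the pipelining, filtering, and aggregation ideas concrete, here is a minimal Python sketch of a lazy pipeline that derives token balances from a stream of transfers. It is an illustration only: the `Transfer` record and both stage functions are hypothetical names, not Firehose or Graph Node APIs, and a real stream would arrive from the network rather than from an in-memory list.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, NamedTuple


class Transfer(NamedTuple):
    """Hypothetical decoded token transfer; not a real Firehose type."""
    block: int
    token: str
    sender: str
    receiver: str
    amount: int


def only_token(transfers: Iterable[Transfer], token: str) -> Iterator[Transfer]:
    """Filtering stage: pass through transfers of a single token."""
    return (t for t in transfers if t.token == token)


def running_balances(transfers: Iterable[Transfer]) -> Iterator[Dict[str, int]]:
    """Aggregation stage: yield a balance snapshot after every transfer."""
    balances: Dict[str, int] = defaultdict(int)
    for t in transfers:
        balances[t.sender] -= t.amount
        balances[t.receiver] += t.amount
        yield dict(balances)


stream = [
    Transfer(1, "GRT", "mint", "alice", 100),
    Transfer(2, "DAI", "mint", "bob", 50),
    Transfer(3, "GRT", "alice", "bob", 30),
]

# Stages compose lazily: each transfer flows through the whole pipeline
# before the next one is read. The "mint" source goes negative because
# token supply is not modeled in this sketch.
snapshots = list(running_balances(only_token(stream, "GRT")))
final_balances = snapshots[-1]
```

Because every stage is a generator, stages can be added or reordered without buffering the whole stream, which is the property that makes this style suitable for continuously arriving data.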

The Graph Client is a dedicated tool to make it easier to build dapps with subgraphs by wrapping and abstracting all operations supported by the GraphQL API. The client exposes a suite of features that enrich the developer experience: simplified interactions with the network, mechanisms to fetch data from multiple Indexers and GraphQL endpoints, improved built-in network fallback, new features like analytics and aggregations, and support for client-side composition. Client-side composition is particularly relevant for complex dapps that need data from several different subgraphs, as it folds multiple requests into a unified schema. The Graph Client’s functionalities will expand as new subgraph features and tooling are developed.
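The fallback idea can be sketched independently of The Graph Client's actual API, which is not shown here. The Python sketch below assumes a caller-supplied `transport` function (hypothetical) that posts a GraphQL query to an endpoint and returns the decoded JSON response. The endpoint URLs and the `tokens` query are placeholders.

```python
from typing import Any, Callable, Dict, Sequence

# A transport takes (endpoint, query) and returns the decoded JSON response.
Transport = Callable[[str, str], Dict[str, Any]]


def query_with_fallback(
    endpoints: Sequence[str], query: str, transport: Transport
) -> Dict[str, Any]:
    """Try each endpoint in order and return the first successful `data`."""
    errors = []
    for endpoint in endpoints:
        try:
            response = transport(endpoint, query)
        except Exception as exc:  # network failure, timeout, bad JSON, ...
            errors.append(f"{endpoint}: {exc}")
            continue
        # GraphQL reports query-level failures in an `errors` field.
        if response.get("errors") or "data" not in response:
            errors.append(f"{endpoint}: {response.get('errors')}")
            continue
        return response["data"]
    raise RuntimeError("all endpoints failed: " + "; ".join(errors))


def fake_transport(endpoint: str, query: str) -> Dict[str, Any]:
    """Stand-in for an HTTP POST; the first endpoint is always down."""
    if "indexer-a" in endpoint:
        raise ConnectionError("unreachable")
    return {"data": {"tokens": [{"id": "0x1"}]}}


result = query_with_fallback(
    ["https://indexer-a.example/subgraph", "https://indexer-b.example/subgraph"],
    "{ tokens(first: 1) { id } }",
    fake_transport,
)
```

Injecting the transport keeps the retry logic testable without network access. A production client would likely also add timeouts and smarter endpoint selection, which this sketch leaves out.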

Over the past year, support for more than 27 new chains has been added on The Graph’s hosted service, including NEAR, Avalanche, Polygon, Optimism and Arbitrum. Currently, Cosmos, Solana and Arweave integrations are in progress, with more chains to come. In addition to new chains, new data sources like IPFS are being worked on to better support indexing data such as NFT metadata.

The Subgraph Developer Experience is also constantly being improved so developers get the most functionality out of subgraphs and save time when querying blockchain data. Improved tooling is being developed for the subgraph lifecycle, including a Hardhat plugin, a richer GraphQL API, unit and integration testing, and native time series support.

SNARK Force

Over the last four years, core researchers have come together to form a SNARK Force, enabling Scalar state channel scalability with zero-knowledge proofs (ZKPs) and, in the long term, working towards using ZKPs for verifiable querying. Together with fraud proofs for indexing, these proofs will lower the trust assumption required to use The Graph to 1 of N network participants.

Verifiable queries make it much harder for a malicious Indexer to serve an incorrect query result. The implementation of verifiable queries will evolve the network’s trust model away from utilizing arbitration, and towards using cryptography to verify the accuracy of the data being served to dapps. Participants in the network will be able to succinctly verify any claims of malicious behavior.

These changes will enable The Graph to serve as a verifiable and decentralized solution for providing historical data following the EIP-4444 upgrade that removes data requirements for Ethereum clients. Dapps, data providers and any Ethereum users will be able to rely on The Graph Network for serving historical data across chains. At this time, the SNARK Force is seeking peer review of its zero-knowledge proof (ZKP) research. If you want to know more, you are invited to get in touch!

Scalar with Zero-Knowledge Proofs will enable the payments scale needed to support dapps. Scalar is the first state channel system designed to operate at the scale and robustness required by The Graph Network, with millions of parallel Scalar state channels powering the network’s query fees at any given time. Whenever there are many state channels, each channel needs to be resolved individually on-chain in the case of a dispute by posting provably correct data on-chain. As a result, it becomes economically unviable to perform disputes with a large number of channels. The Scalar ZKP will allow many state channels to be resolved at once with a single, low-gas transaction. This fee reduction dramatically reduces the trust requirements for using Scalar, which is vital for enabling direct consumer payments.

Protocol Economics

The Protocol Economics working group is responsible for the system level and incentive design of the protocol. This mandate includes evolving the protocol economics to support new features, such as subgraph composition, multi-blockchain or off-chain data sources. It also involves improving the design of existing mechanisms such as the query market, the curation market, as well as staking and delegation. The group is also focused on the implementations of the protocol economics and how these can be scaled to meet the growing needs of The Graph ecosystem.

The working group is a multi-disciplinary effort that combines expertise from engineering, product, user experience, economics and AI. As such, proposed protocol improvements are validated through user research, economic analysis, smart contract audits and simulations, in addition to community feedback.

Ongoing research and optimization of different subsystems will collectively amount to a much improved overall user experience for various stakeholders and a more balanced and healthier network.

The Protocol Economics working group is exploring how The Graph can leverage layer 2 rollups to scale the protocol logic. This will dramatically reduce costs for all protocol participants and have numerous second-order benefits, such as making it cheaper for Indexers to provide a high quality of service on a subgraph or enabling the curation and delegation markets to be more dynamic. Because The Graph relies on global state for several of its reward mechanisms, expanding to L2 is not as simple as replicating the protocol on L2, as can be done with some DeFi protocols. The strategy for expanding to layer 2 must be carefully executed, as the decentralized network is already relied upon by dapps in production.

Subgraph developer improvements include streamlining the developer experience of publishing and querying subgraphs through the decentralized network. Bonding curve mechanics are a key focus area of the group. Extensive research has gone into the concept of a principal-protected bonding curve that allows subgraph developers to publish and curate subgraphs, with peace of mind that their initial signal is protected from volatility. As a side effect, this will introduce better N-1 support, allowing developers to move signal between different versions of a subgraph to ensure no downtime occurs during subgraph upgrades. Similarly, signal renting functionality is being explored and tested to provide a more familiar subscription model for enlisting indexing resources in the network.
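As background on why signal on a bonding curve is exposed to volatility, the sketch below uses a generic Bancor-style purchase formula. This is not the principal-protected design described above, and the reserve ratio is a hypothetical value chosen for easy arithmetic rather than a parameter taken from the protocol.

```python
def shares_minted(
    supply: float, reserve: float, deposit: float, reserve_ratio: float
) -> float:
    """Bancor-style purchase: shares issued for `deposit` reserve tokens."""
    return supply * ((1 + deposit / reserve) ** reserve_ratio - 1)


RESERVE_RATIO = 0.5  # hypothetical; chosen so the numbers are easy to follow

# An early curator deposits 300 tokens into a small curve.
early = shares_minted(supply=100, reserve=100, deposit=300,
                      reserve_ratio=RESERVE_RATIO)

# A later curator deposits the same 300 tokens after the curve has grown
# (supply is now 100 + early, reserve is now 100 + 300).
late = shares_minted(supply=100 + early, reserve=400, deposit=300,
                     reserve_ratio=RESERVE_RATIO)
```

The same deposit buys fewer shares later on, so the value of an early position depends on what other curators do afterwards. That dependence is the kind of volatility a principal-protected curve aims to shield a developer's initial signal from.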

For Indexer improvements, recent work around stake rebates will introduce improved mechanisms that further encourage Indexers to serve queries in proportion to their stake in the network (the goal of the existing Cobb-Douglas query fee rebates). Because stale allocations (those not closed within 28 epochs) are detrimental to the network's health, new mechanisms to force-close such allocations are being explored to deter inactive Indexers. Relatedly, it is important that Indexers settle their queries to collect query fees, as this adds to network transparency and gives a better overall understanding of market dynamics. In addition to the above mechanisms, settlement will be supported through subsidized query settlements.
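A Cobb-Douglas rebate is usually written as the total rebate pool multiplied by the Indexer's fee share raised to an exponent alpha and its stake share raised to (1 - alpha). The Python sketch below assumes that standard form. The function name and the value of `ALPHA` are illustrative assumptions, not the protocol's actual implementation or parameters.

```python
def cobb_douglas_rebate(
    total_rebate: float,
    fees: float,
    total_fees: float,
    stake: float,
    total_stake: float,
    alpha: float,
) -> float:
    """Rebate = total * (fee share)^alpha * (stake share)^(1 - alpha)."""
    if total_fees <= 0 or total_stake <= 0:
        return 0.0
    fee_share = fees / total_fees
    stake_share = stake / total_stake
    return total_rebate * fee_share ** alpha * stake_share ** (1 - alpha)


ALPHA = 0.5  # hypothetical exponent, chosen only for easy arithmetic

# Fee share equals stake share (25% each): the Indexer recovers its 250.
balanced = cobb_douglas_rebate(1000, 250, 1000, 25, 100, ALPHA)

# 50% of the fees on 12.5% of the stake: only 250 of the 500 comes back.
overserving = cobb_douglas_rebate(1000, 500, 1000, 12.5, 100, ALPHA)
```

When the two shares match, the rebate equals the fees contributed for any alpha. An Indexer serving far more than its stake share recovers proportionally less, which is the incentive to serve queries in proportion to stake.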

Core contributors have been experimenting with different curation improvements, many of which were initially introduced by community members. One priority is reducing low-quality signaling by potentially implementing an initialization phase that fights volatility and reduces high signaling swings typically induced by impulsive curation behavior. Similarly, research around introducing a decaying capital gains tax as a countermeasure to fight economic attacks like MEV (sandwich attacks, specifically) might also bring additional benefits to the network by incentivizing long-term curation. Improving allocative efficiency aims to enable curation shares to be minted by those who have the greatest utility for them, while preserving the strong incentive to signal early on a subgraph.

Delegation improvements are focused on making delegation more efficient so active Delegators have a better experience when selecting Indexers. Improvements include batch delegation across Indexers, instant redelegation, and improving staking efficiency.

Finally, to validate the improvements described above, the economics working group continues to invest its time in modeling the protocol, both with classical economic techniques and with simulations, including agent-based models. This latter effort is also aimed at designing agents that optimize over one or more of the protocol’s mechanisms. These agents can be leveraged by other working groups such as Protocol & Network Operations and Indexer Experience to provide participants like Consumers, Subgraph Developers and Indexers better tools for interacting with the protocol in an automated fashion.

Protocol & Network Operations

The Protocol & Network Operations working group is dedicated to the instantiation, maintenance, and optimization of the different subsystems of the protocol. It operates at the intersection between protocol economics, Indexer behavior, and payments infrastructure, unifying these layers.

Protocol Automations include automated dispute resolution and integrations with keepers, bridges, and watchers to improve the current dispute management process in the network. While verifiable proofs are still in development, disputes are resolved through arbitration, in which Indexers sign their claims about their indexing and offending Indexers are slashed.

The Protocol Engineering working group is responsible for improving the network gateways, the infrastructure that assists with GRT allocations and payments, and data availability in the network. Additionally, significant improvements to the payments experience are in the works, like enabling subscription-based payments for subgraph consumers and Scalar with ZKPs to reduce gas costs and enable support for a high volume of disputes.

An Epoch Block Oracle is also in development to enable the effective tracking of multi-chain indexing rewards, so Indexers earn accurate rewards for each chain and subgraph they serve. The Epoch Block Oracle will ensure a canonical source of truth about time for the purpose of closing allocations across chains with Proofs of Indexing. Furthermore, there is ongoing research on an Availability Oracle, which is required when introducing determinism for data sources with unreliable availability, such as NFT metadata on IPFS.

As Scalar is upgraded with ZKPs (SNARK Force working group), the protocol will take a significant step forward to decentralizing the gateways efficiently.

Indexer Experience

The Indexer Experience working group is focused on all things related to Indexer interactions with the network and protocol. Focus areas are the design, development, and optimization of Indexer operations and tooling to augment the Indexer experience.

Doubling down on using machine learning (ML) to optimize the network has led to significant progress in predicting more efficient costs per query. This is valuable for Indexers who rely on Agora to manually model their query costs. The result of months of R&D is an automated cost modeling framework that Indexers can leverage to automatically generate Agora cost models according to actual resource consumption. The models are derived using an instrumented instance of Postgres that reports advanced resource usage per SQL query.

Today, most Indexers resort to tooling to validate POIs against those already submitted on-chain for closed allocations. POI cross-checking will allow Indexers to opt in to POI sharing in the network so that Indexers can collectively detect state divergence in real time. This is important for Indexers, as they can opt out of serving data when such state divergence is detected. This will reduce the risk of having queries disputed, which is particularly relevant as network volume steadily increases.
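One way to picture the cross-check is as a comparison of the POIs that different Indexers report for the same subgraph deployment at the same block. The Python sketch below is an assumption about how such a check could look, not the protocol's mechanism. The function name, Indexer ids, and POI values are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def find_divergent(pois: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return the most common POI and the Indexers that disagree with it.

    `pois` maps an Indexer id to the POI it reports for one subgraph
    deployment at one block. Ties are broken by first occurrence.
    """
    if not pois:
        raise ValueError("no POIs reported")
    consensus, _ = Counter(pois.values()).most_common(1)[0]
    divergent = sorted(i for i, poi in pois.items() if poi != consensus)
    return consensus, divergent


reported = {
    "indexer-1": "0xabc",
    "indexer-2": "0xabc",
    "indexer-3": "0xdef",
    "indexer-4": "0xabc",
}
consensus_poi, outliers = find_divergent(reported)
```

The most common POI is not guaranteed to be the correct one, so a mismatch is best treated as a signal to investigate or to pause serving, not as proof of who is wrong.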

The adoption of orchestration and automation tools such as Firehose tooling is key for Indexers to efficiently manage the lifecycle of serving queries and maintaining underlying infrastructure. Proper tooling ensures a highly available and secure setup that is crucial for the health of the network, and for dapps to feel confident in Indexer reliability. As new frameworks are integrated into the stack, this working group will ensure Indexers have the means to efficiently operate new services.

As gas fees are one of the most significant costs for Indexers operating on the network, improving gas efficiency of all Indexer interactions with the network has several positive knock-on effects. It will serve to improve decentralization by making smaller Indexer operations more economically viable, increase data integrity on the network by leaving less dust on the network (particularly query fees), and give Indexers more flexibility to support more subgraphs. One focus of the Indexer Experience working group will be on improving the efficiency of indexing operations by implementing flexible batching of all allocation management transactions and collaborating with other working groups to research and test other chain deployment options.
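Part of the saving from batching comes from the fixed 21,000 gas that Ethereum charges for every transaction, which a batch pays once instead of once per operation. The sketch below is rough arithmetic only: the per-operation gas figure is hypothetical, and `batch_overhead` stands in for whatever extra dispatch cost a batching contract might add.

```python
from typing import List

INTRINSIC_GAS = 21_000  # base gas charged for every Ethereum transaction


def gas_unbatched(op_gas: List[int]) -> int:
    """Each operation is sent as its own transaction."""
    return sum(INTRINSIC_GAS + gas for gas in op_gas)


def gas_batched(op_gas: List[int], batch_overhead: int = 0) -> int:
    """All operations share one transaction and one intrinsic charge."""
    return INTRINSIC_GAS + batch_overhead + sum(op_gas)


# Ten allocation operations at a hypothetical 80,000 gas each.
operations = [80_000] * 10
saved = gas_unbatched(operations) - gas_batched(operations)
```

With zero overhead, the saving is the intrinsic charge multiplied by one less than the number of operations. Real savings depend on the batching contract and on calldata costs, which this sketch ignores.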

Another class of improvements to the Indexer Experience is upgrades to advanced Indexer decision-making strategies, like the new automated allocation optimizer capable of maximizing the efficiency of Indexer operations. Indexers vary in stake, delegation and delegation parameters, so they spend a significant amount of time optimizing their Indexer setup and economics to balance between appropriately sized infrastructure (to serve the network in a performant and reliable way) and capital-efficient allocation strategies to maximize rewards. Automated strategy optimizers will help Indexers identify their ideal allocation strategy to best meet consumers’ needs, stay competitive in the network, and effectively implement their business strategies in a complex market.
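To illustrate the kind of problem an allocation optimizer solves, the sketch below uses a simplified reward model. It assumes each subgraph receives rewards in proportion to its curation signal and that an Indexer's cut is its share of the stake allocated there. The greedy strategy, the names, and the numbers are illustrative assumptions and do not describe the actual optimizer, which would also need to account for gas costs and query fees.

```python
from typing import Dict, NamedTuple


class Subgraph(NamedTuple):
    signal: float     # curation signal on the subgraph
    allocated: float  # stake already allocated by other Indexers


def expected_reward(sg: Subgraph, own: float, total_signal: float) -> float:
    """Share of one unit of issuance earned by allocating `own` stake."""
    if own <= 0:
        return 0.0
    return (sg.signal / total_signal) * own / (sg.allocated + own)


def greedy_allocate(
    subgraphs: Dict[str, Subgraph], stake: float, step: float
) -> Dict[str, float]:
    """Hand out stake in `step`-sized chunks to the best marginal gain."""
    total_signal = sum(sg.signal for sg in subgraphs.values())
    allocation = {name: 0.0 for name in subgraphs}

    def gain(name: str) -> float:
        sg, current = subgraphs[name], allocation[name]
        return (expected_reward(sg, current + step, total_signal)
                - expected_reward(sg, current, total_signal))

    for _ in range(int(stake // step)):
        best = max(subgraphs, key=gain)
        allocation[best] += step
    return allocation


market = {
    "a": Subgraph(signal=300, allocated=100),
    "b": Subgraph(signal=100, allocated=100),
}
plan = greedy_allocate(market, stake=100, step=10)
```

In this example most of the stake goes to the higher-signal subgraph, but not all of it, because each additional unit allocated to the same subgraph earns less than the one before.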

Collectively, core contributors work with The Graph Foundation and The Graph Council to align on the protocol roadmap and prioritized working groups, each building a piece of the decentralized, permissionless and verifiable indexing and query layer of the web3 stack. Anyone interested in contributing to the protocol can get involved in the community and apply for a protocol grant!

About The Graph

The Graph is the indexing and query layer of web3. Developers build and publish open APIs, called subgraphs, that applications can query using GraphQL. The Graph currently supports indexing data from 31 different networks including Ethereum, NEAR, Arbitrum, Optimism, Polygon, Avalanche, Celo, Fantom, Moonbeam, IPFS, and PoA, with more networks coming soon. To date, over 38,000 subgraphs have been deployed on the hosted service, and subgraphs can now be deployed directly on the network. Over 28,000 developers have built subgraphs for applications such as Uniswap, Synthetix, Zora, KnownOrigin, Art Blocks, Gnosis, Balancer, Livepeer, DAOstack, Audius, Decentraland, and many others.
