How to Become an Effective Indexer on The Graph Network

As web3 proliferates, open and reliable access to blockchain data continues to crystallize as one of the core components of a truly decentralized internet. The Graph Network upholds decentralization in the web3 stack by enabling users to query blockchain data in a decentralized way, thereby empowering developers to build more robust, secure, and resilient dapps that don’t require trust in centralized entities.

The individual geographically distributed contributors that power data access with The Graph are called Indexers. Indexers operate nodes on The Graph Network, indexing subgraphs (open source APIs) that organize blockchain data. Dapps make queries to retrieve data and Indexers serve those queries, equipping developers to create lightning-fast user experiences for users of their dapps.

This guide helps you learn the operational requirements, best practices, and communication tactics to become a best-in-class Indexer on The Graph Network.

Minimum Requirements for Indexers

There are four main requirements for becoming an effective Indexer on The Graph Network.

Stake

You must stake at least 100,000 GRT to be an Indexer. However, this is only the minimum requirement and there is no upper limit. Keep in mind that Delegators can delegate to your Indexer (staking GRT on your Indexer), thereby raising your stake amount by up to 16x. Since Indexers allocate a portion of stake to the subgraphs they index, a larger stake enables Indexers to serve more subgraphs and may offer more flexibility.
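The stake arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes "up to 16x" means that delegated GRT counts toward your usable stake only up to 16 times your own stake.

```python
MIN_SELF_STAKE_GRT = 100_000  # minimum Indexer stake quoted above
DELEGATION_RATIO = 16         # assumption: delegation counts up to 16x self stake


def max_usable_delegation(self_stake: int) -> int:
    """Upper bound on delegated GRT that adds to the Indexer's usable stake."""
    if self_stake < MIN_SELF_STAKE_GRT:
        raise ValueError("self stake is below the 100,000 GRT minimum")
    return self_stake * DELEGATION_RATIO


def allocatable_stake(self_stake: int, delegated: int) -> int:
    """Self stake plus the portion of delegation that fits under the cap."""
    return self_stake + min(delegated, max_usable_delegation(self_stake))


if __name__ == "__main__":
    # An Indexer at the minimum stake with 2,000,000 GRT delegated
    print(allocatable_stake(100_000, 2_000_000))
```

Under these assumptions, delegation beyond the cap does not increase what the Indexer can allocate, which is one reason to grow self stake alongside delegation.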

Skills

Ideally, Indexers should have a background in DevOps or experience operating blockchain nodes. Indexers should be comfortable deploying cloud or hosted servers, or running their own server hardware. Familiarity with running a validator node and working on Linux is also suggested.

Hardware

To balance efficiency, query output, and speed, many Indexers start with a setup of 16 CPUs, a 1 TB disk, and 32 GB of RAM. You can learn more about the hardware requirements and recommendations via The Graph Academy.

Infrastructure Software

Familiarity with container and scaling technologies such as Docker and Kubernetes is advantageous; however, it is also possible to deploy the Graph software on bare metal. Infrastructure software requirements include a PostgreSQL database, Graph Node, Indexer agent, Indexer service, Prometheus metrics server, and potentially more depending on your specific requirements and setup.

Best Practices for Indexers on The Graph Network

After covering the basic setup, skills, hardware, infrastructure software, and staked GRT, aspiring Indexers must learn to operate successfully on The Graph’s decentralized network. An effective Indexer is selective and intentional about the choices they make regarding subgraph allocations, attracting delegation, and indexing of subgraph data.

Allocations on a Subgraph

As part of basic indexing operations, Indexers allocate a portion of their stake to subgraphs they wish to index and serve queries for. It is recommended that Indexers allocate stake in subgraphs proportionally to the curation signal on those subgraphs and query fees generated. Curators signal on subgraphs by staking GRT, and thereby indicating to Indexers which subgraphs should be prioritized. This can change from epoch to epoch.
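As a rough illustration of proportional allocation, the sketch below splits a stake across subgraphs according to a weight per subgraph. The function name and the example subgraph names are hypothetical, and the weight here is simply the curation signal; an Indexer could just as well fold query fees into the weight.

```python
def proportional_allocations(total_stake: float, weights: dict[str, float]) -> dict[str, float]:
    """Split total_stake across subgraphs in proportion to their weights.

    `weights` maps a subgraph name to a non-negative weight, for example
    its curation signal in GRT.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one subgraph needs a positive weight")
    return {
        name: total_stake * weight / total_weight
        for name, weight in weights.items()
    }


if __name__ == "__main__":
    # Hypothetical signal values, in GRT
    signal = {"subgraph-a": 30_000, "subgraph-b": 10_000}
    for name, amount in proportional_allocations(1_000_000, signal).items():
        print(name, amount)
```

Because signal can change from epoch to epoch, a calculation like this would need to be rerun whenever allocations are reviewed.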

Allocations affect how the protocol determines the amount of GRT that will be distributed to Indexers, Curators, and Delegators. Each allocation must be closed within 28 epochs with a valid Proof of Indexing (PoI). Epochs last 24 hours, and the PoI helps ensure that Indexers are accurately indexing the subgraphs they have allocated to. Allocations are tracked by epoch, and each epoch starts and ends at a specific block.
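To keep track of the 28-epoch window, an Indexer might use a small helper like the one below. It is a sketch using only the figures quoted above (28 epochs, 24 hours per epoch); the function names and the epoch numbers in the example are hypothetical.

```python
MAX_ALLOCATION_EPOCHS = 28  # allocations must be closed within 28 epochs
EPOCH_HOURS = 24            # epoch length quoted above


def epochs_left(created_epoch: int, current_epoch: int) -> int:
    """Epochs remaining before an allocation reaches the 28-epoch limit."""
    age = current_epoch - created_epoch
    if age < 0:
        raise ValueError("current epoch is earlier than the creation epoch")
    return max(MAX_ALLOCATION_EPOCHS - age, 0)


def hours_left(created_epoch: int, current_epoch: int) -> int:
    """The same window expressed in hours."""
    return epochs_left(created_epoch, current_epoch) * EPOCH_HOURS


if __name__ == "__main__":
    # An allocation opened in epoch 100, checked in epoch 110
    print(epochs_left(100, 110), "epochs,", hours_left(100, 110), "hours")
```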

At the end of an epoch, allocation rewards are distributed to Indexers based on how many queries they served and based on how much stake Indexers have allocated on subgraphs.

Lifecycle of Query Fee Rebates

Query fee rebates are GRT payments from dapps and users for querying data. These payments are mediated by a gateway and are split between the Indexer and Delegators.

Query fees are collected by the gateway when allocations are closed and accumulated in a rebate pool. The rebate pool is designed to encourage Indexers to allocate stake proportionally to the amount of query fees they earn for the network. Once an allocation has been closed, the rebates are available to be claimed by the Indexer. Upon claiming, the query fee rebates are distributed to the Indexer and their Delegators based on the query fee cut and the delegation pool proportions.
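The final distribution step can be sketched as follows. This is a simplified model, not the protocol's implementation: it assumes the query fee cut is the fraction of the rebate the Indexer keeps, and that the remainder is shared among Delegators in proportion to their share of the delegation pool. All names are hypothetical.

```python
def split_rebate(
    rebate: float,
    query_fee_cut: float,
    delegations: dict[str, float],
) -> tuple[float, dict[str, float]]:
    """Return (indexer_amount, per_delegator_amounts) for a claimed rebate.

    query_fee_cut: assumed to be the fraction (0..1) kept by the Indexer.
    delegations:   GRT delegated by each Delegator.
    """
    if not 0 <= query_fee_cut <= 1:
        raise ValueError("query_fee_cut must be between 0 and 1")
    total_delegated = sum(delegations.values())
    if total_delegated == 0:
        # No delegation pool to share with, so the Indexer keeps everything
        return rebate, {}
    indexer_amount = rebate * query_fee_cut
    pool_amount = rebate - indexer_amount
    shares = {
        name: pool_amount * amount / total_delegated
        for name, amount in delegations.items()
    }
    return indexer_amount, shares


if __name__ == "__main__":
    indexer, delegators = split_rebate(1_000, 0.25, {"alice": 300, "bob": 100})
    print(indexer, delegators)
```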

Indexing Accurate Data

Indexers are incentivized to provide accurate data. When an allocation is closed, Indexers can be challenged by Fishermen. A Fisherman is anyone in The Graph ecosystem who discovers false data being served by an Indexer. The Fisherman must deposit 10,000 GRT to create a dispute. If the dispute is accepted, the Indexer’s reward will be slashed and the Fisherman will receive 50% of the slashed rewards while the other 50% is burned. If the dispute is rejected, the Fisherman’s 10,000 GRT will be burned.

If Indexers serve false data, they risk being slashed by 2.5% of their self stake. This slash doesn’t apply to delegated stake, but Indexers might lose the support of their Delegators if they’re perceived as malicious.
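The dispute figures above can be worked through with a short calculation. The sketch below combines the two paragraphs under one assumption: that the slashed amount is the 2.5% of self stake, half of which goes to the Fisherman and half of which is burned. Function and constant names are hypothetical, and amounts are whole GRT for simplicity.

```python
SLASH_BPS = 250               # 2.5% of self stake, in basis points
FISHERMAN_REWARD_PCT = 50     # share of the slashed amount paid to the Fisherman
DISPUTE_DEPOSIT_GRT = 10_000  # deposit required to open a dispute


def dispute_outcome(indexer_self_stake: int, accepted: bool) -> dict[str, int]:
    """GRT slashed, paid to the Fisherman, and burned for one dispute."""
    if accepted:
        slashed = indexer_self_stake * SLASH_BPS // 10_000
        to_fisherman = slashed * FISHERMAN_REWARD_PCT // 100
        return {
            "slashed": slashed,
            "to_fisherman": to_fisherman,
            "burned": slashed - to_fisherman,
        }
    # Rejected dispute: the Indexer is untouched and the deposit is burned
    return {"slashed": 0, "to_fisherman": 0, "burned": DISPUTE_DEPOSIT_GRT}


if __name__ == "__main__":
    print(dispute_outcome(1_000_000, accepted=True))
    print(dispute_outcome(1_000_000, accepted=False))
```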

Attract Delegators and Support The Graph community

Indexers are incentivized to serve more queries and power decentralized applications to earn more for their efforts. The capacity to serve queries increases with how much GRT is delegated to Indexers by Delegators.

There are multiple ways Delegators are able to evaluate an Indexer. Some are objective and based on the quality of the data provided, while others are subjective and based on community participation and responsiveness to messages on Telegram and Discord.

Effective Reward Cut

Effective reward cut is what the Indexer takes from the Indexer rewards received from the delegation pool. At the time of this writing, most Indexers typically offer an effective reward cut of 7-12%. Indexers that take lower reward cuts leave more GRT for Delegators, but this will offer the Indexer a smaller percentage of GRT rewards.
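For Delegators comparing Indexers, one commonly used way to compute an effective reward cut is sketched below. This formula is an assumption, not something stated in this post: it compares what Delegators actually receive with what they would receive if rewards were split purely by stake. The parameter `reward_cut` is taken to be the fraction of indexing rewards the Indexer keeps before the rest goes to the delegation pool; all names are hypothetical.

```python
def effective_reward_cut(reward_cut: float, self_stake: float, delegated: float) -> float:
    """Effective cut = 1 - (Delegators' actual share / their stake-weighted share).

    reward_cut: assumed fraction (0..1) of indexing rewards kept by the Indexer.
    """
    if not 0 <= reward_cut <= 1:
        raise ValueError("reward_cut must be between 0 and 1")
    if delegated <= 0:
        raise ValueError("effective cut is undefined without delegation")
    fair_share = delegated / (self_stake + delegated)
    actual_share = 1 - reward_cut
    return 1 - actual_share / fair_share


if __name__ == "__main__":
    # Hypothetical Indexer: 1M GRT self stake, 4M GRT delegated, keeps 28% of rewards
    print(round(effective_reward_cut(0.28, 1_000_000, 4_000_000), 4))
```

Under this model, an Indexer that keeps a fraction of rewards exactly equal to its own share of the total stake has an effective cut of zero.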

Indexers also have a second cut, which is the effective cut of the Query Fees that the Indexer generates. As the network grows and the total amount of Query revenue in the network increases, this number will become more important when deciding if an Indexer is worth delegating to, or not.

Marketing

One of the most effective ways to attract delegations is by having an ENS domain associated with your indexing node. When Delegators select an Indexer on The Graph Explorer, they’ll see either your wallet address or your ENS domain.

ENS domains make your indexing node human-readable and recognizable. This will make it easier for Delegators to find and recommend your node, and is therefore a great anchoring point of consistency for marketing and community outreach as an Indexer.

Ideal Indexer Community Participation

One of the most important aspects of being an Indexer is participating in and maintaining consistent communication in The Graph community. Some of the most effective Indexers operate in an abundance mindset, helping each other run infrastructure efficiently. Community activity also helps to build trust with Delegators who stake their GRT with Indexers when they delegate on the network.

There are a number of ways to get involved and participate.

Indexer Office Hours

Indexers and those interested in indexing are encouraged to attend Indexer Office Hours. There is a meeting every Tuesday at 17:00-18:00 UTC. Any and all questions are welcome.

The meeting takes place on The Graph Discord in the Stage video channel.

It’s an informal drop-in event for anyone to get advice on indexing. It’s also an opportunity to get answers to specific questions you may have related to indexing.

All Indexer Office Hours are recorded, and you can view all the past calls on YouTube.

Discord Activity

Indexers are strongly encouraged to be active on the Indexer Discord channels in order to exchange tips with other Indexers and stay up to date with changes to the protocol.

Governance Participation

The Graph is an open protocol and the community decides its future. Indexers are also encouraged to participate and voice their opinions on various governance proposals (known as Graph Improvement Proposals, or GIPs) in The Graph Forum.

Anyone can voice their support or opposition to governance proposals and propose changes themselves. This ensures an open community where the best ideas rise to the top.

Core Dev Calls

Every month, contributors across various core dev teams and other working groups in The Graph ecosystem gather on core dev calls to discuss major updates in the protocol and brainstorm around various R&D tracks together.

Indexers are encouraged to listen in on these conversations live, or watch the recordings, in order to learn about upcoming improvements to The Graph Network and developments regarding the Indexer experience.

Subscribe to The Graph Foundation’s ecosystem calendar to learn when each core dev call takes place, and review The Graph Core R&D Workspace in order to follow each working group’s progress on significant workstreams and access meeting notes and recordings.

Indexers provide an invaluable resource to web3 by ensuring open, decentralized access to data. While becoming an Indexer requires a little more effort than running a traditional validator node, it is an extremely fulfilling way to contribute to the web3 revolution. Thanks to the supportive community and plenty of resources, there are powerful tools that help streamline and simplify the indexing process.

This section gives an overview of these tools and their functions, and shows you how to get started.

Grafana

One of the most important aspects of indexing is querying and monitoring data. Grafana is open source software that allows you to do this with ease, enabling you to monitor metrics, logs, and traces.

Grafana has many useful features that streamline the indexing process:

  • Alerts - get notified when a specific event happens via PagerDuty, SMS, email, VictorOps, OpsGenie, or Slack. If there’s an issue with a subgraph or an unexpected number of queries, you will know immediately.
  • Import dashboards and plugins - discover and use hundreds of dashboards and plugins from the community, including members of The Graph ecosystem.
  • Provisioning - automate dashboard set up with a single script.
  • Permissions - manage multiple teams with different levels of access to dashboards. If your company is running an Indexer node, multiple teams within your organization can access the appropriate data.

One of Grafana’s strengths is how easily it works with Prometheus, a systems monitoring and alerting toolkit. The Prometheus metrics server is widely used by Indexers across the network. Graph Node and the Indexer components report their metrics to the metrics server.

If you’d like to begin working with Grafana, you can start with the official tutorial.

Logging tools

The Graph’s official documentation instructs Indexers to set up their server infrastructure with Terraform on Google Cloud. This allows Indexers to manage and analyze log data in real-time using Google Logs.

Alternatively, some Indexers have also used Grafana Logs. Others have chosen not to utilize logging tools in any capacity.

Indexers are encouraged to adjust their setup to best suit their specific needs.

Graphman

Graphman is a command line tool that helps Indexers resolve problems when working with a particular subgraph. The Graphman command is included with the official containers.

Graphman enables Indexers to maintain the most accurate data. When a new version of a subgraph is deployed, the data from the previous version still exists in the database. With Graphman, Indexers can remove the unused deployment, ensuring that the new version’s data is the only data available.

Graphman also includes commands to modify assignments. Each deployment is assigned to a specific Graph Node instance to be indexed. Graphman allows you to change the instance by either reassigning or modifying it.

If you’re not already using Graphman in your project, you can create a configuration file here.

PostgreSQL

A basic understanding of PostgreSQL is required to create custom indexes. Graph Node uses a PostgreSQL database to store subgraph data. The Indexer service and agent also use it to store state channel data, cost models, and indexing rules.

PostgreSQL’s statistics collector also comes in handy for collecting and reporting server activity. It enables Indexers to gradually optimize their setup based on server activity.

AllocationOpt

One of the most important decisions an Indexer makes is how they allocate their stake on The Graph. AllocationOpt is a library that helps optimize how an Indexer should allocate their stake by automating smaller decisions.

You can learn more about AllocationOpt here.

Resources and Guides

The Graph community is open and accessible. Countless community members provide free tutorials, tools, and code snippets. Since running an indexing node can be difficult, Indexers have worked tirelessly to provide each other with valuable resources to make the process as simple as possible.

This section provides further guides and resources that will help you become a thriving Indexer.

Indexer Guides

These guides are specifically designed for Indexers:

Infrastructure Guides

Indexers use a variety of different infrastructure setups to run a node. While these guides don’t come directly from The Graph ecosystem, they are invaluable resources for setting up the infrastructure to be an Indexer:

Tools and Resources

Resources for becoming an Indexer on The Graph:

Start your indexing journey

Becoming an Indexer is one of the greatest opportunities to contribute to blockchain data accessibility as a public good. By contributing to web3’s indexing and querying layer, you’ll be joining a historic cohort of revolutionaries working to create an internet that is more equitable, interoperable, and collaborative. Indexing and serving data in a decentralized, open-source way unshackles the world from siloed, extractive, and often manipulative centralized stores of data. To dive deeper into starting your own Indexer operation, consult The Graph Docs or continue to learn more via The Graph Academy.

About The Graph

The Graph is the source of data and information for the decentralized internet. As the original decentralized data marketplace that introduced and standardized subgraphs, The Graph has become web3’s method of indexing and accessing blockchain data. Since its launch in 2018, tens of thousands of developers have built subgraphs for dapps across 40+ blockchains, including Ethereum, Arbitrum, Optimism, Base, Polygon, Celo, Fantom, Gnosis, and Avalanche.

As demand for data in web3 continues to grow, The Graph enters a New Era with a more expansive vision including new data services and query languages, ensuring the decentralized protocol can serve any use case, now and into the future.

Discover more about how The Graph is shaping the future of decentralized physical infrastructure networks (DePIN) and stay connected with the community. Follow The Graph on X, LinkedIn, Instagram, Facebook, Reddit, and Medium. Join the community on The Graph’s Telegram, and join technical discussions on The Graph’s Discord.

The Graph Foundation oversees The Graph Network. The Graph Foundation is overseen by the Technical Council. Edge & Node, StreamingFast, Semiotic Labs, The Guild, Messari, GraphOps, Pinax and Geo are eight of the many organizations within The Graph ecosystem.


Category: Graph Protocol
Author: The Graph Foundation
Published: September 22, 2022