

The Subgraph DevEx Blueprint: 7 Tips to Create Great Subgraph Development Experiences
Introduction
For every piece of software—be it a mobile app, web app, or API server—there are efficient and inefficient ways of building it.
The difference between well-written and badly written software isn't only visible in the code and project structure; it shows up clearly in the system's performance. Two applications built for the same purpose and performing identical functions can differ hugely in performance, and the gap can often be traced back to how they were built. Take an API server, for instance: the way the database is queried can make a dramatic difference in performance, and defining proper indexes for your data model can significantly improve query speed.
Subgraphs on The Graph protocol, just like every other piece of software, must be built with both efficiency and developer experience in mind. Poor implementation choices in subgraph development can lead to slower indexing times, increased resource consumption, and a frustrating developer experience for those consuming the API. In this article, we'll dive into proven strategies for building subgraphs that are both efficient to run and intuitive for developers to query, ensuring your blockchain data is both accessible and performant.
Before we dive into this article proper, here are the key concepts we'll be covering:
- Subgraph-friendly smart contract
- Reverse lookups & Meaningful relationships between entities
- Data consistency across entities
- Count tracking
- Timestamp and ordering considerations
- Smart contract derived values
- Poor entity relationships leading to N+1 query problems
This is by no means a silver bullet for building subgraphs. It is a collection of best practices, optimization techniques, and common pitfalls to avoid when building an efficient subgraph. That said, let's go through them one after the other.
1. Subgraph-friendly smart contract
Building a subgraph starts way before you write the first line of GraphQL schema. It begins with how your smart contract is designed and the events it emits. Think of your smart contract as the primary data source, and the events as the storytellers that will feed your subgraph with rich, meaningful information. The goal is simple: design your smart contract to be a data-friendly companion to your subgraph. This means emitting comprehensive events that capture every change in your contract's state. Why is this important? Because every additional external call your subgraph makes to the smart contract is a performance hit waiting to happen.
What Makes a Smart Contract Subgraph-Friendly?
A subgraph is like an off-chain copy of your smart contract state: it tries to rebuild that state exactly from the events the contract emits. Hence, a subgraph-friendly smart contract emits an event for every single state change, such as staking, unstaking, reward claims, and configuration updates. Taken together, these events give the subgraph an accurate sequence of changes, so it can stay in sync with your smart contract's state.
What makes an Event Subgraph-Friendly?
A subgraph-friendly event is like a detailed postcard. It doesn't just say "something happened," it tells you exactly what happened, who was involved, and provides all the context you might need. Let's break down the key characteristics:
- Complete Data Payload: Your events should contain all relevant information about the state change. For example, if a stake happens, don't just emit the ID. Include the address, amount, and every other detail that might be useful for indexing and building the state.
- Avoid External Lookups: Every time your subgraph needs to make an additional call to the smart contract to fetch more data, you're adding latency and potential points of failure. A well-designed event eliminates these extra steps.
Example of a non-subgraph-friendly event:
event Stake(uint256 indexed positionId);
Example of a subgraph-friendly event:
event Stake(uint256 indexed positionId, address indexed user, uint256 amount);
See the difference? The second event provides complete information about the stake that happened, reducing the need for additional contract calls.
While being comprehensive, also be mindful of gas costs. Find the right balance between detailed events and contract execution cost.
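To make the difference concrete, here is a minimal TypeScript sketch that models both event shapes in memory. It is not real mapping code: the types `LeanStakeEvent`, `RichStakeEvent`, and `StakeRecord`, and the `PositionReader` callback standing in for a contract call, are all hypothetical names introduced for illustration.

```typescript
// Hypothetical in-memory model; a real mapping would use generated event classes.
interface LeanStakeEvent { positionId: string }
interface RichStakeEvent { positionId: string; user: string; amount: string }
interface StakeRecord { id: string; user: string; amount: string }

// Stand-in for a call back to the contract (adds latency and can fail).
type PositionReader = (positionId: string) => { user: string; amount: string };

// Lean event: the handler has to read the contract to learn who staked and how much.
function indexLeanStake(event: LeanStakeEvent, readPosition: PositionReader): StakeRecord {
  const onChain = readPosition(event.positionId);
  return { id: event.positionId, user: onChain.user, amount: onChain.amount };
}

// Rich event: everything needed is already in the payload, so no call is made.
function indexRichStake(event: RichStakeEvent): StakeRecord {
  return { id: event.positionId, user: event.user, amount: event.amount };
}

// Count how many contract reads each approach needs for the same stake.
let contractReads = 0;
const reader: PositionReader = (_positionId) => {
  contractReads += 1;
  return { user: "0xabc", amount: "1000" };
};

const fromLean = indexLeanStake({ positionId: "1" }, reader);
const fromRich = indexRichStake({ positionId: "1", user: "0xabc", amount: "1000" });
console.log(contractReads, fromLean, fromRich);
```

Both handlers end up with the same record, but only the lean event forces a contract read. Multiplied across every event in a contract's history, that extra read is where indexing time goes.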
2. Reverse lookups & Meaningful relationships Between Entities
In subgraph development, the relationships between your entities are like the social networks of your data. Just as meaningful connections make a social network valuable, well-designed entity relationships make your subgraph powerful and easy to query.
GraphQL shines when you can easily traverse and explore relationships between different entities. This is where meaningful relationships and reverse lookups come into play. The goal is to structure your schema in a way that makes data exploration intuitive and efficient.
What Are Meaningful Relationships?
Imagine you're building a staking platform. Instead of just having isolated entities, you want to create natural connections:
type User @entity {
  id: ID!
  # Reverse lookup of stakes owned by this user
  stakes: [Stake!]! @derivedFrom(field: "user")
}

type Stake @entity {
  id: ID!
  amount: BigInt!
  # Direct link to the user who created this stake
  user: User!
}
With these relationships defined, you can easily traverse the data with queries like:
{
  users(first: 5) {
    id
    stakes {
      id
      amount
    }
  }
}
Performance Implications of Nested Relationships
While relationships make querying intuitive, be cautious with deeply nested relationships. Each level of nesting can impact query performance, especially when dealing with large datasets. For example, this query could become expensive:
{
  protocols {
    pools {
      tokens {
        pairs {
          swaps {
            # Deep nesting can impact performance
          }
        }
      }
    }
  }
}
Instead, consider flattening relationships where possible and providing direct access paths to frequently accessed data.
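As a rough illustration of a direct access path, the sketch below builds a flat query for one pool's swaps instead of walking down from the protocol. It assumes the schema gives `Swap` a direct `pool` reference, which the nested example does not show; the names `POOL_SWAPS_QUERY` and `poolSwapsRequest` are made up for this example.

```typescript
// Assumption: Swap has a direct `pool: Pool!` field, so swaps can be filtered by pool ID.
const POOL_SWAPS_QUERY = `
  query PoolSwaps($poolId: String!, $first: Int!) {
    swaps(first: $first, where: { pool: $poolId }) {
      id
    }
  }
`;

// Builds the request body for a single, flat query against the subgraph endpoint.
function poolSwapsRequest(poolId: string, first: number) {
  return { query: POOL_SWAPS_QUERY, variables: { poolId: poolId, first: first } };
}

const request = poolSwapsRequest("0xpool", 100);
console.log(JSON.stringify(request.variables));
```

The query touches a single entity type, so its cost grows with the number of swaps requested rather than with every level of the hierarchy above them.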
The Power of Reverse Lookups
Reverse lookups allow you to create bi-directional relationships without storing redundant data. The @derivedFrom directive is your best friend here. It creates a virtual relationship that can be queried just like a direct relationship, but without the additional storage overhead.
Here's why you should use reverse lookups in your subgraph:
- One-to-Many Relationships: Perfect for scenarios like users and their stakes (as seen in the previous example), pools and their swaps, etc.
- Data Deduplication: Instead of storing the same information in multiple places, you store the relationship once. Reverse lookup fields are read-only, so there is no need to set or update them explicitly when you create or update an entity.
- Efficient Querying: Enable easy traversal between related entities.
Meaningful entity relationships and reverse lookups are about creating a natural, intuitive data graph that makes querying as smooth as possible. It's not just about connecting entities, but about telling a coherent story with your data.
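One way to picture a reverse lookup is as a filter computed on read. The sketch below is a conceptual in-memory model only, not how the indexer implements `@derivedFrom` internally; `StakeRow`, `stakeTable`, and `derivedStakes` are hypothetical names.

```typescript
// Only the Stake side stores the link to its user.
interface StakeRow { id: string; user: string; amount: string }

const stakeTable: StakeRow[] = [
  { id: "s1", user: "alice", amount: "100" },
  { id: "s2", user: "bob", amount: "50" },
  { id: "s3", user: "alice", amount: "25" },
];

// Conceptual equivalent of `stakes: [Stake!]! @derivedFrom(field: "user")`:
// nothing is stored on the user, the list is derived from the stakes when read.
function derivedStakes(userId: string): StakeRow[] {
  return stakeTable.filter((stake) => stake.user === userId);
}

const aliceStakeIds = derivedStakes("alice").map((stake) => stake.id);
console.log(aliceStakeIds);
```

Because the user never holds its own list of stakes, a handler that creates a stake only has to save the stake, and the user's `stakes` field picks it up automatically.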
3. Data Consistency Across Entities
In subgraph development, data consistency is not just a best practice—it's a critical requirement. When an event occurs in your smart contract, it's crucial to update every single entity that is affected by that event. Failing to do so is like telling a story with missing pages—the narrative becomes broken and unreliable.
Consider a staking platform with multiple interconnected entities:
type StakingContract @entity {
  id: ID!
  totalStaked: BigInt!
  totalPositionCount: Int!
  totalOpenPositionCount: Int!
}

type User @entity {
  id: ID!
  stakePositionCount: Int!
  openPositionCount: Int!
  totalStakedAmount: BigInt!
  stakePositions: [StakePosition!]! @derivedFrom(field: "user")
}

type Withdrawal @entity {
  id: ID!
  stakePosition: StakePosition!
  reward: BigInt!
}

type StakePosition @entity {
  id: String!
  user: User!
  amount: BigInt!
  withdrawal: Withdrawal
  status: String!
}
When a user stakes tokens, you must update multiple entities:
export function handleStake(event: StakeEvent): void {
  // Load or create entities
  let stakingContract = loadOrCreateStakingContract(event.address)
  let user = loadOrCreateUser(event.params.user.toHexString())

  // Create new stake position
  let stakePosition = new StakePosition(event.params.positionId.toString())

  // Update StakingContract global state to track total protocol metrics
  stakingContract.totalStaked = stakingContract.totalStaked.plus(event.params.amount)
  stakingContract.totalPositionCount += 1
  stakingContract.totalOpenPositionCount += 1

  // Update User entity to maintain user-specific statistics
  user.stakePositionCount += 1
  user.openPositionCount += 1
  user.totalStakedAmount = user.totalStakedAmount.plus(event.params.amount)

  // Create and configure stake position with initial state
  stakePosition.user = user.id
  stakePosition.amount = event.params.amount
  stakePosition.status = 'OPEN'

  // Persist every entity touched by this event
  stakingContract.save()
  user.save()
  stakePosition.save()
}

Likewise, when the user withdraws the stake position:

export function handleWithdrawal(event: WithdrawalEvent): void {
  // Load or create entities
  let stakingContract = loadOrCreateStakingContract(event.address)
  let user = loadOrCreateUser(event.params.user.toHexString())
  let stakePosition = StakePosition.load(event.params.positionId.toString())

  // load() returns null for an unknown ID; nothing to update in that case
  if (stakePosition == null) return

  // Update StakingContract global state to reflect withdrawal
  stakingContract.totalStaked = stakingContract.totalStaked.minus(event.params.amount)
  stakingContract.totalOpenPositionCount -= 1

  // Update User entity to maintain accurate user statistics
  user.openPositionCount -= 1
  user.totalStakedAmount = user.totalStakedAmount.minus(event.params.amount)

  // Create the withdrawal entity to track withdrawal details
  // (entity IDs are strings, so the transaction hash is converted to hex)
  let withdrawal = new Withdrawal(event.transaction.hash.toHexString())
  withdrawal.stakePosition = stakePosition.id
  withdrawal.reward = event.params.reward

  // Update stake position to reflect withdrawn state
  stakePosition.withdrawal = withdrawal.id
  stakePosition.status = 'EXITED'

  // Persist every entity touched by this event
  stakingContract.save()
  user.save()
  withdrawal.save()
  stakePosition.save()
}
Notice that every stake and withdrawal event updates several entities, because each of those entities holds values affected by the operation that emitted the event.
Key Principles
- Update All Related Entities: Every event should update all the entities it affects
- Maintain Derived State: Keep global and user-level statistics in sync
Potential Risks of Inconsistent Updates
- Mismatched user or global state statistics
- Unreliable reporting and analytics
- Difficult debugging
Remember, in a subgraph, your goal is to create an exact, trustworthy replica of your smart contract's state. Consistency is key.
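A quick way to catch inconsistent updates is to check the aggregates against the underlying rows, for example in a test or a monitoring script on the consumer side. The sketch below assumes that a withdrawal removes the position's full amount, so a user's `totalStakedAmount` equals the sum of their open positions; `PositionRow`, `UserStats`, and `findInconsistencies` are hypothetical names, and amounts are strings as they would arrive from a query.

```typescript
interface PositionRow { id: string; user: string; amount: string; status: string }
interface UserStats {
  id: string;
  stakePositionCount: number;
  openPositionCount: number;
  totalStakedAmount: string;
}

// Returns the names of the user fields that disagree with the positions.
function findInconsistencies(user: UserStats, positions: PositionRow[]): string[] {
  const own = positions.filter((p) => p.user === user.id);
  const open = own.filter((p) => p.status === "OPEN");
  const openTotal = open.reduce((sum, p) => sum + BigInt(p.amount), BigInt(0));

  const problems: string[] = [];
  if (own.length !== user.stakePositionCount) problems.push("stakePositionCount");
  if (open.length !== user.openPositionCount) problems.push("openPositionCount");
  if (openTotal !== BigInt(user.totalStakedAmount)) problems.push("totalStakedAmount");
  return problems;
}

const samplePositions: PositionRow[] = [
  { id: "1", user: "alice", amount: "100", status: "OPEN" },
  { id: "2", user: "alice", amount: "40", status: "EXITED" },
];

// Stats as they should look after one stake is withdrawn.
const consistent: UserStats = { id: "alice", stakePositionCount: 2, openPositionCount: 1, totalStakedAmount: "100" };
// Stats left behind by a withdrawal handler that forgot to update the user.
const stale: UserStats = { id: "alice", stakePositionCount: 2, openPositionCount: 2, totalStakedAmount: "140" };

console.log(findInconsistencies(consistent, samplePositions));
console.log(findInconsistencies(stale, samplePositions));
```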
4. Count Tracking
In subgraph development, keeping track of counts isn't just about having nice statistics—it's a crucial aspect that impacts both data overview and user experience. As we saw in the previous section with the staking example, tracking counts like totalPositionCount and openPositionCount serves multiple important purposes.
Why Track Counts?
Data Overview
Having count fields in your entities provides quick insights into your protocol's state. Instead of running complex queries or calculating totals on the fly, developers can easily access summarized data. For example:
- Total number of users
- Total active positions
- Total transactions processed
Atomic Updates for Counts
When updating counts, it's crucial to ensure that all related count updates happen atomically within the same event handler. This means:
- All count updates should be part of the same transaction
- If any part of the update fails, all updates should be rolled back
- Counts across different entities should always stay in sync
Here's an example schema that demonstrates count tracking:
type Protocol @entity {
  id: ID!
  totalUsers: Int!
  totalTransactions: Int!
  activePositions: Int!
}

type User @entity {
  id: ID!
  positionCount: Int!
  activePositionCount: Int!
  positions: [Position!]! @derivedFrom(field: "user")
}

# ...other entities here
With these counts, a dApp developer can easily implement pagination:
const ITEMS_PER_PAGE = 10;
const totalPages = Math.ceil(user.positionCount / ITEMS_PER_PAGE);
Remember to keep these counts up to date with every relevant event. Accurate count tracking ensures reliable pagination and statistics across your subgraph's consumers.
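Building on the page-count calculation, a consumer might turn a page number into the `first` and `skip` arguments of a subgraph query. This is a minimal sketch; the helper name `pageVariables` and the decision to clamp out-of-range pages are assumptions made for illustration.

```typescript
// Converts a 1-based page number into `first`/`skip` query variables,
// using the tracked count to clamp requests for pages that do not exist.
function pageVariables(page: number, totalCount: number, perPage: number = 10) {
  const totalPages = Math.max(1, Math.ceil(totalCount / perPage));
  const clampedPage = Math.min(Math.max(1, page), totalPages);
  return {
    first: perPage,
    skip: (clampedPage - 1) * perPage,
    totalPages: totalPages,
  };
}

// A user with positionCount = 25 has three pages of ten.
const thirdPage = pageVariables(3, 25);
const beyondLast = pageVariables(9, 25);
console.log(thirdPage, beyondLast);
```

Without a stored count, the consumer would have to fetch every position just to learn how many pages exist.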
5. Timestamp and Ordering Considerations
In subgraph development, how you store your data directly impacts how you can query it. One crucial aspect often overlooked is the ability to order entities effectively. Let's look into two key considerations that can make or break your querying experience.
Timestamp Fields
When you want to order data chronologically (a common requirement), give the entity its own timestamp field. Subgraph queries have at best limited support for ordering by the fields of a nested entity, so the timestamp should live on the main entity itself.
Here's a schema and query example:
type Position @entity {
  id: ID!
  user: User!
  amount: BigInt!
  createdAt: BigInt! # Timestamp for ordering
}

type User @entity {
  id: ID!
  positions: [Position!]! @derivedFrom(field: "user")
}

{
  positions(orderBy: createdAt, orderDirection: desc) {
    id
    amount
    user {
      id
    }
  }
}
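A timestamp on the entity also makes time-window queries straightforward. The sketch below, with the hypothetical names `RECENT_POSITIONS_QUERY` and `recentPositionsVariables`, asks for positions created within a recent window; it assumes `createdAt` holds a Unix timestamp in seconds and passes the `BigInt` variable as a string.

```typescript
// Newest positions first, limited to those created at or after $since.
const RECENT_POSITIONS_QUERY = `
  query RecentPositions($since: BigInt!, $first: Int!) {
    positions(
      first: $first
      orderBy: createdAt
      orderDirection: desc
      where: { createdAt_gte: $since }
    ) {
      id
      amount
      createdAt
    }
  }
`;

// BigInt values are sent as strings, so the cutoff is stringified here.
function recentPositionsVariables(nowSeconds: number, windowSeconds: number, first: number) {
  return { since: String(nowSeconds - windowSeconds), first: first };
}

// Positions from the 24 hours before a fixed example time.
const lastDay = recentPositionsVariables(1700000000, 24 * 60 * 60, 50);
console.log(lastDay);
```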
Numeric Value Types
When storing numeric values that you might want to order by, the type choice is critical. Always use Int or BigInt for numeric values—never String. Here's why:
type Token @entity {
  id: ID!
  amount: BigInt! # ✅ Good: Will order correctly
  amountString: String! # ❌ Bad: Will cause unexpected ordering
}
With string numeric values, you'll get counterintuitive results:
- "10" comes before "2" in string ordering
- "100" comes before "20" in string ordering
This can lead to confusing results when ordering token amounts, user balances, or any numeric data.
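You can reproduce the effect in a few lines of TypeScript. The sketch below sorts the same four values once as strings and once as numbers; it mirrors what happens conceptually when a field is typed `String` instead of `BigInt`, and the variable names are illustrative only.

```typescript
const amounts = ["10", "2", "100", "20"];

// Default sort compares strings character by character.
const sortedAsStrings = amounts.slice().sort();

// Comparing the numeric values gives the order people expect.
const sortedAsNumbers = amounts.slice().sort((a, b) => {
  const x = BigInt(a);
  const y = BigInt(b);
  return x < y ? -1 : x > y ? 1 : 0;
});

console.log(sortedAsStrings); // ["10", "100", "2", "20"]
console.log(sortedAsNumbers); // ["2", "10", "20", "100"]
```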
Remember to always choose the appropriate type for your data based on how you plan to query and order it. Your future self (and other developers) will thank you!
6. Smart Contract Derived Values
In subgraph development, one critical pitfall to avoid is storing values that are derived from smart contract or blockchain state. These values can become stale without any event to trigger an update, leading to inconsistencies between your subgraph and the actual blockchain state.
Understanding the Problem
Let's look at a staking contract example with different time-based states:
- Stake Cooldown Period: 24 hours
- Stake Lock Period: 30 days
- Withdrawal Waiting Period: 24 hours
Here's what NOT to do:
type StakePosition @entity {
  id: ID!
  amount: BigInt!
  status: String! # ⚠️ BAD: Storing computed state like "COOLDOWN", "LOCKED", "WITHDRAWAL_READY"
}
Why Avoid Storing Derived State?
Consider this timeline:
- User stakes at time T
- Subgraph stores status as "COOLDOWN"
- 24 hours pass...
- Status should be "LOCKED", but no event occurred to update it because the change is purely time-based
- Subgraph now shows incorrect state
The Solution
Store the raw data and compute the state when needed:
type StakePosition @entity {
  id: ID!
  amount: BigInt!
  createdAt: BigInt! # Timestamp when stake was created
  coolDownDuration: BigInt! # Duration for cooldown state
  lockDuration: BigInt! # Duration for locked state
  withdrawalWaitingPeriod: BigInt! # Duration for withdrawal waiting period
}
And in the application consuming the subgraph data, you could compute the state when needed like so:
// Assumes the position's BigInt fields (returned by the subgraph as strings)
// have already been converted to native bigint values.
function getStakeState(position: StakePosition, blockTimestamp: bigint): string {
  const cooldownEnds = position.createdAt + position.coolDownDuration;
  const lockEnds = cooldownEnds + position.lockDuration;

  if (blockTimestamp < cooldownEnds) return "COOLDOWN";
  if (blockTimestamp < lockEnds) return "LOCKED";
  return "WITHDRAWAL_READY";
}
This approach ensures your data stays accurate regardless of time passage, as the state is computed based on current blockchain conditions. Remember: Store the facts, compute the state. This principle helps maintain data consistency in your subgraph over time.
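In practice the consuming application receives `BigInt` fields as strings, so the conversion is part of the computation. Here is a self-contained sketch of that flow; `RawStakePosition`, `stakeStateAt`, and the sample values (a 24-hour cooldown and a 30-day lock, in seconds) are assumptions chosen to match the example above.

```typescript
// Shape of a position as it might arrive in a query response: BigInt fields are strings.
interface RawStakePosition {
  id: string;
  createdAt: string;
  coolDownDuration: string;
  lockDuration: string;
}

// Computes the state for a given time instead of reading a stored status.
function stakeStateAt(raw: RawStakePosition, nowSeconds: bigint): string {
  const cooldownEnds = BigInt(raw.createdAt) + BigInt(raw.coolDownDuration);
  const lockEnds = cooldownEnds + BigInt(raw.lockDuration);

  if (nowSeconds < cooldownEnds) return "COOLDOWN";
  if (nowSeconds < lockEnds) return "LOCKED";
  return "WITHDRAWAL_READY";
}

const samplePosition: RawStakePosition = {
  id: "1",
  createdAt: "1700000000",
  coolDownDuration: "86400",   // 24 hours
  lockDuration: "2592000",     // 30 days
};

// The same stored facts yield different states as time moves on.
const oneHourIn = stakeStateAt(samplePosition, BigInt("1700003600"));
const afterCooldown = stakeStateAt(samplePosition, BigInt("1700086400"));
const afterLock = stakeStateAt(samplePosition, BigInt("1702678400"));
console.log(oneHourIn, afterCooldown, afterLock);
```

No event fired between these three calls, yet each returns the right state because only the inputs that never change were stored.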
7. N+1 Query Problems in Entity Relationships
In subgraph development, the N+1 query problem is a performance challenge that occurs when your entity relationships force the need for multiple separate queries to fetch related data. This often happens when entities are not properly structured or when relationships aren't efficiently defined.
Understanding the N+1 Problem
Let's look at a problematic example:
type Pool @entity {
  id: ID!
  totalLiquidity: BigInt!
}

type Token @entity {
  id: ID!
  poolId: String! # ⚠️ BAD: Plain ID string instead of a relationship to Pool
  symbol: String!
}
With this structure, to get pool data for multiple tokens, you'd need:
- 1 query to fetch all tokens
- N queries to fetch each token's pool information
The Solution: Proper Entity Relationships
Here's how to structure it correctly:
type Pool @entity {
  id: ID!
  totalLiquidity: BigInt!
  tokens: [Token!]! @derivedFrom(field: "pool")
}

type Token @entity {
  id: ID!
  pool: Pool! # ✅ GOOD: Direct relationship to pool
  symbol: String!
}
Now you can fetch tokens with their pool data in a single query:
{
  tokens {
    id
    symbol
    pool {
      totalLiquidity
    }
  }
}
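The request counts can be compared with a small simulation. The sketch below uses in-memory arrays in place of a subgraph endpoint, and each `fetch...` function stands for one GraphQL request; all names are hypothetical, and the N+1 path assumes tokens expose only a plain pool ID string.

```typescript
interface PoolRow { id: string; totalLiquidity: string }
interface TokenRow { id: string; symbol: string; poolId: string }

const poolRows: PoolRow[] = [
  { id: "p1", totalLiquidity: "500" },
  { id: "p2", totalLiquidity: "900" },
];
const tokenRows: TokenRow[] = [
  { id: "t1", symbol: "AAA", poolId: "p1" },
  { id: "t2", symbol: "BBB", poolId: "p1" },
  { id: "t3", symbol: "CCC", poolId: "p2" },
];

let requestCount = 0;

// One request: all tokens, pool known only by ID.
function fetchTokens(): TokenRow[] {
  requestCount += 1;
  return tokenRows;
}

// One request per call: a single pool by ID.
function fetchPool(id: string): PoolRow | undefined {
  requestCount += 1;
  return poolRows.filter((pool) => pool.id === id)[0];
}

// One request: tokens with their pool resolved through the relationship.
function fetchTokensWithPools() {
  requestCount += 1;
  return tokenRows.map((token) => ({
    id: token.id,
    symbol: token.symbol,
    pool: poolRows.filter((pool) => pool.id === token.poolId)[0],
  }));
}

// N+1 pattern: one request for the tokens, then one per token for its pool.
requestCount = 0;
const viaLoop = fetchTokens().map((token) => fetchPool(token.poolId));
const nPlusOneRequests = requestCount;

// Relationship pattern: a single nested query.
requestCount = 0;
const viaRelationship = fetchTokensWithPools();
const singleQueryRequests = requestCount;

console.log(nPlusOneRequests, singleQueryRequests);
```

With three tokens the loop already costs four requests against one, and the gap widens linearly as the token list grows.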
Conclusion
Building an efficient subgraph requires careful consideration at every step - from smart contract design to entity relationships and data consistency. The practices outlined in this article serve as a foundation for creating subgraphs that are both performant and developer-friendly.
Key takeaways:
- Design smart contracts with comprehensive events that minimize the need for external calls
- Structure entity relationships thoughtfully, using reverse lookups and avoiding deep nesting
- Maintain data consistency by updating all related entities atomically
- Track counts properly and update them atomically
- Store raw data instead of derived values
- Design relationships to avoid N+1 query problems
Remember that these best practices not only ensure technical excellence but also enhance the developer experience by making queries simpler, faster, and more reliable.
While this article covers many important aspects of subgraph development, the field is constantly evolving. Stay current with The Graph protocol's documentation and community discussions for the latest best practices and optimizations.
By following these guidelines, you'll be well-equipped to build subgraphs that are efficient, maintainable, and a joy for developers to work with.
About The Graph
The Graph is the leading indexing and query protocol powering the decentralized internet. Since launching in 2018, it has empowered tens of thousands of developers to effortlessly build and leverage subgraphs across countless blockchains, including Ethereum, Solana, Arbitrum, Optimism, Base, Polygon, Celo, Soneium, and Avalanche. With powerful tools like Substreams and Token API, The Graph delivers high-performance, real-time access to onchain data. From low-latency indexing to rapid token data, it serves as the premier solution for building composable, data-driven dapps.
Discover more about how The Graph is shaping the future of decentralized physical infrastructure networks (DePIN) and stay connected with the community.
