The Subgraph DevEx Blueprint: 7 Tips to Create Great Subgraph Development Experiences

Introduction

For every piece of software—be it a mobile app, web app, or API server—there are efficient and inefficient ways of building it.

The difference between well-written and badly written software isn't only visible in the code and project structure; it shows up clearly in the system's performance. Two applications built for the same purpose and performing identical functions can have a huge gap in their performance, and this can often be traced back to how they were built. Take an API server, for instance: the way the database is queried can make a dramatic difference in performance, and defining proper indexes for your data model can significantly improve query speed.

Subgraphs on The Graph protocol, just like every other piece of software, must be built with both efficiency and developer experience in mind. Poor implementation choices in subgraph development can lead to slower indexing times, increased resource consumption, and a frustrating developer experience for those consuming the API. In this article, we'll dive into proven strategies for building subgraphs that are both efficient to run and intuitive for developers to query, ensuring your blockchain data is both accessible and performant.

Before we dive into this article proper, here are the key concepts we'll be covering:

Subgraph-friendly smart contract
Reverse lookups & Meaningful relationships between entities
Data consistency across entities
Count tracking
Timestamp and ordering considerations
Smart contract derived values
Poor entity relationships leading to N+1 query problems

This is not a silver bullet for building subgraphs, but a collection of best practices, optimization techniques, and common pitfalls to avoid when building an efficient subgraph. With that said, let's go through them one at a time.

1. Subgraph-Friendly Smart Contract

Building a subgraph starts way before you write the first line of GraphQL schema. It begins with how your smart contract is designed and the events it emits. Think of your smart contract as the primary data source, and the events as the storytellers that will feed your subgraph with rich, meaningful information. The goal is simple: design your smart contract to be a data-friendly companion to your subgraph. This means emitting comprehensive events that capture every change in your contract's state. Why is this important? Because every additional external call your subgraph makes to the smart contract is a performance hit waiting to happen.

What Makes a Smart Contract Subgraph-Friendly?

A subgraph is like an off-chain copy of your smart contract state: it tries to rebuild that state from the events the contract emits. Hence, a subgraph-friendly smart contract emits an event for every single state change: staking, unstaking, reward claims, configuration updates, and so on. Combined, these events give the subgraph an accurate sequence of changes, so it can stay in sync with the state of your smart contract.

What Makes an Event Subgraph-Friendly?

A subgraph-friendly event is like a detailed postcard. It doesn't just say "something happened," it tells you exactly what happened, who was involved, and provides all the context you might need. Let's break down the key characteristics:

Complete Data Payload: Your events should contain all relevant information about the state change. For example, if a stake happens, don't just emit the ID. Include the address, amount, and every other detail that might be useful for indexing and building the state.
Avoid External Lookups: Every time your subgraph needs to make an additional call to the smart contract to fetch more data, you're adding latency and potential points of failure. A well-designed event eliminates these extra steps.

Example of a non-subgraph-friendly event:

event Stake(uint256 indexed positionId);

Example of a subgraph-friendly event:

event Stake(uint256 indexed positionId, address indexed user, uint256 amount);

See the difference? The second event provides complete information about the stake that happened, reducing the need for additional contract calls.

While being comprehensive, also be mindful of gas costs. Find the right balance between detailed events and contract execution cost.
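
To make the cost of external lookups concrete, here is a minimal sketch in plain TypeScript (not actual mapping code) that contrasts the two event shapes. The `StakingContractReader` interface and its `getPosition` method are hypothetical stand-ins for a contract binding; the only point is that the thin event forces one extra read per event, while the rich event needs none.

```typescript
// Hypothetical shapes for illustration; real mappings use generated event classes.
interface ThinStakeEvent { positionId: string }
interface RichStakeEvent { positionId: string; user: string; amount: string }
interface StakeRecord { id: string; user: string; amount: string }

// Hypothetical stand-in for a contract binding; every call is an extra round trip.
interface StakingContractReader {
  getPosition(positionId: string): { user: string; amount: string };
}

// Thin event: the handler has to go back to the contract for the missing fields.
function handleThinStake(event: ThinStakeEvent, contract: StakingContractReader): StakeRecord {
  const onChain = contract.getPosition(event.positionId);
  return { id: event.positionId, user: onChain.user, amount: onChain.amount };
}

// Rich event: everything needed is already in the payload.
function handleRichStake(event: RichStakeEvent): StakeRecord {
  return { id: event.positionId, user: event.user, amount: event.amount };
}

// A fake contract that counts how often it is called.
let contractCalls = 0;
const fakeContract: StakingContractReader = {
  getPosition: (_positionId: string) => {
    contractCalls += 1;
    return { user: "0xabc", amount: "1000" };
  },
};

const fromThin = handleThinStake({ positionId: "1" }, fakeContract);
const fromRich = handleRichStake({ positionId: "1", user: "0xabc", amount: "1000" });
```

Both handlers end up with the same record, but only the thin one had to touch the contract to get there.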

2. Reverse Lookups & Meaningful Relationships Between Entities

In subgraph development, the relationships between your entities are like the social networks of your data. Just as meaningful connections make a social network valuable, well-designed entity relationships make your subgraph powerful and easy to query.

GraphQL shines when you can easily traverse and explore relationships between different entities. This is where meaningful relationships and reverse lookups come into play. The goal is to structure your schema in a way that makes data exploration intuitive and efficient.

What Are Meaningful Relationships?

Imagine you're building a staking platform. Instead of just having isolated entities, you want to create natural connections:

type User @entity {
  id: ID!
  # Reverse lookup of stakes owned by this user
  stakes: [Stake!]! @derivedFrom(field: "user")
}

type Stake @entity {
  id: ID!
  amount: BigInt!
  # Direct link to the user who created this stake
  user: User!
}

With these relationships defined, you can easily traverse the data with queries like:

{
  users(first: 5) {
    id
    stakes {
      id
      amount
    }
  }
}

Performance Implications of Nested Relationships

While relationships make querying intuitive, be cautious with deeply nested relationships. Each level of nesting can impact query performance, especially when dealing with large datasets. For example, this query could become expensive:

{
  protocols {
    pools {
      tokens {
        pairs {
          swaps {
            # Deep nesting can impact performance
          }
        }
      }
    }
  }
}

Instead, consider flattening relationships where possible and providing direct access paths to frequently accessed data.
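
As a sketch of what a direct access path can look like from the client side, the helper below builds a flat query that fetches swaps for one pool directly instead of walking down from the protocol. It assumes a hypothetical `Swap` entity with a `pool` relationship plus `amount` and `timestamp` fields; adjust the names to your own schema.

```typescript
// Builds a flat query against a hypothetical `swaps` collection filtered by pool ID.
function buildSwapsByPoolQuery(poolId: string, first: number): string {
  return `{
  swaps(first: ${first}, where: { pool: "${poolId}" }, orderBy: timestamp, orderDirection: desc) {
    id
    amount
    timestamp
  }
}`;
}

const swapsQuery = buildSwapsByPoolQuery("0xpool", 20);
```

The resulting query has a single level of selection, so its cost depends on the `first` argument rather than on how many pools, tokens, and pairs sit above the swaps.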

The Power of Reverse Lookups

Reverse lookups allow you to create bi-directional relationships without storing redundant data. The @derivedFrom directive is your best friend here. It creates a virtual relationship that can be queried just like a direct relationship, but without the additional storage overhead.

Here's why you should use reverse lookups in your subgraph:

One-to-Many Relationships: Perfect for scenarios like users and their stakes (as seen in the previous example), pools and their swaps, etc.
Data Deduplication: Instead of storing the same information in multiple places, you store it once on the owning entity. Reverse lookup fields are read-only, so there is no need to explicitly set or update them when you create or update an entity.
Efficient Querying: Enable easy traversal between related entities.

Meaningful entity relationships and reverse lookups are about creating a natural, intuitive data graph that makes querying as smooth as possible. It's not just about connecting entities, but about telling a coherent story with your data.
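
Conceptually, a derived field behaves like a lookup performed at query time rather than a stored list. The plain TypeScript sketch below mimics that idea with in-memory arrays; it illustrates the concept only and is not a description of how Graph Node is implemented. The type and function names are made up for the example.

```typescript
// `user` holds the User id, mirroring the forward relationship on Stake.
interface StoredStake { id: string; amount: string; user: string }
// Note that no `stakes` array is stored on the user.
interface StoredUser { id: string }

// Resolve the reverse lookup on demand by filtering on the forward field.
function resolveStakes(user: StoredUser, allStakes: StoredStake[]): StoredStake[] {
  return allStakes.filter((stake) => stake.user === user.id);
}

const stakes: StoredStake[] = [
  { id: "1", amount: "100", user: "0xalice" },
  { id: "2", amount: "250", user: "0xbob" },
  { id: "3", amount: "50", user: "0xalice" },
];

const aliceStakes = resolveStakes({ id: "0xalice" }, stakes);
```

Because the list is computed from the `user` field on each stake, adding a new stake never requires touching the user record.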

3. Data Consistency Across Entities

In subgraph development, data consistency is not just a best practice—it's a critical requirement. When an event occurs in your smart contract, it's crucial to update every single entity that is affected by that event. Failing to do so is like telling a story with missing pages—the narrative becomes broken and unreliable.

Consider a staking platform with multiple interconnected entities:

type StakingContract @entity {
  id: ID!
  totalStaked: BigInt!
  totalPositionCount: Int!
  totalOpenPositionCount: Int!
}

type User @entity {
  id: ID!
  stakePositionCount: Int!
  openPositionCount: Int!
  totalStakedAmount: BigInt!
  stakePositions: [StakePosition!]! @derivedFrom(field: "user")
}

type Withdrawal @entity {
  id: ID!
  stakePosition: StakePosition!
  reward: BigInt!
}

type StakePosition @entity {
  id: ID!
  user: User!
  amount: BigInt!
  withdrawal: Withdrawal
  status: String!
}

When a user stakes tokens, you must update multiple entities:

export function handleStake(event: StakeEvent): void {
  // Load or create entities
  let stakingContract = loadOrCreateStakingContract(event.address)
  let user = loadOrCreateUser(event.params.user.toHexString())

  // Create new stake position
  let stakePosition = new StakePosition(event.params.positionId.toString())

  // Update StakingContract global state to track total protocol metrics
  stakingContract.totalStaked = stakingContract.totalStaked.plus(event.params.amount)
  stakingContract.totalPositionCount += 1
  stakingContract.totalOpenPositionCount += 1

  // Update User entity to maintain user-specific statistics
  user.stakePositionCount += 1
  user.openPositionCount += 1
  user.totalStakedAmount = user.totalStakedAmount.plus(event.params.amount)

  // Create and configure stake position with initial state
  stakePosition.user = user.id
  stakePosition.amount = event.params.amount
  stakePosition.status = 'OPEN'

  // Save all entities atomically
  stakingContract.save()
  user.save()
  stakePosition.save()
}

Likewise, when the user withdraws the stake position:

export function handleWithdrawal(event: WithdrawalEvent): void {
  // Load or create entities
  let stakingContract = loadOrCreateStakingContract(event.address)
  let user = loadOrCreateUser(event.params.user.toHexString())
  let stakePosition = StakePosition.load(event.params.positionId.toString())

  // load() returns null if the position was never indexed; skip instead of crashing
  if (stakePosition == null) return

  // Update StakingContract global state to reflect withdrawal
  stakingContract.totalStaked = stakingContract.totalStaked.minus(event.params.amount)
  stakingContract.totalOpenPositionCount -= 1

  // Update User entity to maintain accurate user statistics
  user.openPositionCount -= 1
  user.totalStakedAmount = user.totalStakedAmount.minus(event.params.amount)

  // Create the withdrawal entity to track withdrawal details
  // (entity IDs are strings, so the transaction hash is converted first)
  let withdrawal = new Withdrawal(event.transaction.hash.toHexString())
  withdrawal.stakePosition = stakePosition.id
  withdrawal.reward = event.params.reward

  // Update stake position to reflect withdrawn state
  stakePosition.withdrawal = withdrawal.id
  stakePosition.status = 'EXITED'

  // Save all entities atomically
  stakingContract.save()
  user.save()
  withdrawal.save()
  stakePosition.save()
}

Notice how for every stake and withdrawal event, several entities were updated, and this is because these entities hold values that are affected by the changes caused by the operations that emitted the event.
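
The handlers above rely on `loadOrCreateStakingContract` and `loadOrCreateUser`, which the article does not show. One possible shape for such a helper is sketched below in plain TypeScript against an in-memory store. The `userStore` object is a hypothetical stand-in for the subgraph store; in a real mapping you would use the generated `User.load(id)` and `new User(id)` instead. The detail worth noting is that every counter is initialized on creation, so later increments start from a known value.

```typescript
interface UserEntity {
  id: string;
  stakePositionCount: number;
  openPositionCount: number;
  totalStakedAmount: bigint;
}

// Hypothetical in-memory stand-in for the subgraph store.
const userStore: Record<string, UserEntity | undefined> = {};

function loadOrCreateUser(id: string): UserEntity {
  const existing = userStore[id];
  if (existing !== undefined) return existing;

  // First time this user is seen: start every aggregate at zero.
  const created: UserEntity = {
    id: id,
    stakePositionCount: 0,
    openPositionCount: 0,
    totalStakedAmount: BigInt(0),
  };
  userStore[id] = created;
  return created;
}
```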

Key Principles

Update All Related Entities: Every event should update all the entities it affects
Maintain Derived State: Keep global and user-level statistics in sync
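
One way to catch a missed update early is to assert the relationships that should always hold between aggregate fields. The sketch below, in plain TypeScript with hypothetical type names, checks that the user-level totals add up to the contract-level totals from the schema above. A check like this could run in unit tests against the output of your handlers.

```typescript
interface ContractStats { totalStaked: bigint; totalOpenPositionCount: number }
interface UserStats { id: string; openPositionCount: number; totalStakedAmount: bigint }

// Returns a list of violations; an empty list means the aggregates agree.
function findInconsistencies(contract: ContractStats, users: UserStats[]): string[] {
  let stakedSum = BigInt(0);
  let openSum = 0;
  for (const user of users) {
    stakedSum += user.totalStakedAmount;
    openSum += user.openPositionCount;
  }

  const problems: string[] = [];
  if (stakedSum !== contract.totalStaked) {
    problems.push("totalStaked does not match the sum of user totals");
  }
  if (openSum !== contract.totalOpenPositionCount) {
    problems.push("totalOpenPositionCount does not match the sum of user counts");
  }
  return problems;
}
```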

Potential Risks of Inconsistent Updates

Mismatched user or global state statistics
Unreliable reporting and analytics
Difficult debugging

Remember, in a subgraph, your goal is to create an exact, trustworthy replica of your smart contract's state. Consistency is key.

4. Count Tracking

In subgraph development, keeping track of counts isn't just about having nice statistics—it's a crucial aspect that impacts both data overview and user experience. As we saw in the previous section with the staking example, tracking counts like totalPositionCount and openPositionCount serves multiple important purposes.

Why Track Counts?

Data Overview

Having count fields in your entities provides quick insights into your protocol's state. Instead of running complex queries or calculating totals on the fly, developers can easily access summarized data. For example:

Total number of users
Total active positions
Total transactions processed

Atomic Updates for Counts

When updating counts, it's crucial to ensure that all related count updates happen atomically within the same event handler. This means:

All count updates should be part of the same transaction
If any part of the update fails, all updates should be rolled back
Counts across different entities should always stay in sync

Here's an example schema that demonstrates count tracking:

type Protocol @entity {
  id: ID!
  totalUsers: Int!
  totalTransactions: Int!
  activePositions: Int!
}

type User @entity {
  id: ID!
  positionCount: Int!
  activePositionCount: Int!
  positions: [Position!]! @derivedFrom(field: "user")
}

# ...other entities here
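
A subtle point with counts such as `totalUsers` is that they should only increase the first time a user is seen. Here is a minimal sketch of that logic in plain TypeScript, with in-memory objects standing in for the subgraph store. The `onPositionOpened` function and the store objects are hypothetical names chosen for this example.

```typescript
interface ProtocolCounts { totalUsers: number; totalTransactions: number; activePositions: number }
interface UserCounts { id: string; positionCount: number; activePositionCount: number }

const protocolCounts: ProtocolCounts = { totalUsers: 0, totalTransactions: 0, activePositions: 0 };
const usersById: Record<string, UserCounts | undefined> = {};

// Applies every count change caused by one "position opened" event in a single place.
function onPositionOpened(userId: string): void {
  let user: UserCounts | undefined = usersById[userId];
  if (user === undefined) {
    user = { id: userId, positionCount: 0, activePositionCount: 0 };
    usersById[userId] = user;
    protocolCounts.totalUsers += 1; // only for first-time users
  }
  user.positionCount += 1;
  user.activePositionCount += 1;
  protocolCounts.activePositions += 1;
  protocolCounts.totalTransactions += 1;
}

onPositionOpened("0xalice");
onPositionOpened("0xalice");
onPositionOpened("0xbob");
```

Keeping all of the increments in one function makes it harder for a user-level count and a protocol-level count to drift apart.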

With these counts, a dApp developer can easily implement pagination:

const ITEMS_PER_PAGE = 10;
const totalPages = Math.ceil(user.positionCount / ITEMS_PER_PAGE);

Remember to keep these counts up to date with every relevant event. Accurate count tracking ensures reliable pagination and statistics across your subgraph's consumers.
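
Building on the snippet above, a client could turn a page number into the `first` and `skip` arguments that subgraph queries accept. The sketch below shows one way to do it; `pageArgs` is a hypothetical helper name. For very large collections, paging by filtering on an ordered field is generally a better fit than large `skip` values.

```typescript
const ITEMS_PER_PAGE = 10;

// Converts a 1-based page number into `first`/`skip` arguments, clamped to the valid range.
function pageArgs(page: number, totalItems: number): { first: number; skip: number; totalPages: number } {
  const totalPages = Math.max(1, Math.ceil(totalItems / ITEMS_PER_PAGE));
  const safePage = Math.min(Math.max(1, Math.floor(page)), totalPages);
  return { first: ITEMS_PER_PAGE, skip: (safePage - 1) * ITEMS_PER_PAGE, totalPages: totalPages };
}

// Page 3 of a user with 25 positions.
const thirdPage = pageArgs(3, 25);
```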

5. Timestamp and Ordering Considerations

In subgraph development, how you store your data directly impacts how you can query it. One crucial aspect often overlooked is the ability to order entities effectively. Let's look into two key considerations that can make or break your querying experience.

Timestamp Fields

When you want to order data chronologically (a common requirement), having a timestamp field in your entity is crucial. Ordering by fields of related entities is limited in subgraph queries (newer Graph Node versions support it only one level deep), so the most reliable approach is to keep the timestamp on the main entity itself.

Here's a schema and query example:

type Position @entity {
  id: ID!
  user: User!
  amount: BigInt!
  createdAt: BigInt! # Timestamp for ordering
}

type User @entity {
  id: ID!
  positions: [Position!]! @derivedFrom(field: "user")
}

{
  positions(orderBy: createdAt, orderDirection: desc) {
    id
    amount
    user {
      id
    }
  }
}
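
In a mapping, the usual source for such a field is the timestamp of the block that contained the event. The plain TypeScript sketch below mirrors that shape with hypothetical local types (`StakeEventLike`, `PositionRecord`); in a real AssemblyScript handler the value would come from `event.block.timestamp`.

```typescript
interface StakeEventLike {
  params: { positionId: string; user: string; amount: bigint };
  block: { timestamp: bigint }; // seconds since the Unix epoch
}
interface PositionRecord { id: string; user: string; amount: bigint; createdAt: bigint }

// Copy the block timestamp onto the entity itself so it can be used in `orderBy`.
function toPosition(event: StakeEventLike): PositionRecord {
  return {
    id: event.params.positionId,
    user: event.params.user,
    amount: event.params.amount,
    createdAt: event.block.timestamp,
  };
}

const position = toPosition({
  params: { positionId: "7", user: "0xalice", amount: BigInt(500) },
  block: { timestamp: BigInt(1700000000) },
});
```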

Numeric Value Types

When storing numeric values that you might want to order by, the type choice is critical. Always use Int or BigInt for numeric values—never String. Here's why:

type Token @entity {
  id: ID!
  amount: BigInt!        # ✅ Good: Will order correctly
  amountString: String!  # ❌ Bad: Will cause unexpected ordering
}

With string numeric values, you'll get counterintuitive results:

"10" comes before "2" in string ordering
"100" comes before "20" in string ordering

This can lead to confusing results when ordering token amounts, user balances, or any numeric data.
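
The effect is easy to reproduce. The sketch below sorts the same values once as strings and once as integers. It runs in plain TypeScript and only illustrates the ordering rule, not Graph Node's internals.

```typescript
const amounts: string[] = ["2", "10", "100", "20"];

// Lexicographic comparison: characters are compared left to right.
const asStrings = amounts.slice().sort();
// ["10", "100", "2", "20"]

// Numeric comparison: parse to bigint first, then compare by value.
const asNumbers = amounts.slice().sort((a, b) => {
  const x = BigInt(a);
  const y = BigInt(b);
  return x < y ? -1 : x > y ? 1 : 0;
});
// ["2", "10", "20", "100"]
```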

Remember to always choose the appropriate type for your data based on how you plan to query and order it. Your future self (and other developers) will thank you!

6. Smart Contract Derived Values

In subgraph development, one critical pitfall to avoid is storing values that are derived from smart contract or blockchain state. These values can become stale without any event to trigger an update, leading to inconsistencies between your subgraph and the actual blockchain state.

Understanding the Problem

Let's look at a staking contract example with different time-based states:

Stake Cooldown Period: 24 hours
Stake Lock Period: 30 days
Withdrawal Waiting Period: 24 hours

Here's what NOT to do:

type StakePosition @entity {
  id: ID!
  amount: BigInt!
  status: String!  # ⚠️ BAD: Storing computed state like "COOLDOWN", "LOCKED", "WITHDRAWAL_READY"
}

Why Avoid Storing Derived State?

Consider this timeline:

User stakes at time T
Subgraph stores status as "COOLDOWN"
24 hours pass...
Status should be "LOCKED", but no event occurred to update it because the change is purely time-based.
Subgraph now shows incorrect state

The Solution

Store the raw data and compute the state when needed:

type StakePosition @entity {
  id: ID!
  amount: BigInt!
  createdAt: BigInt!              # Timestamp when stake was created
  coolDownDuration: BigInt!      # Duration for cooldown state
  lockDuration: BigInt!          # Duration for locked state
  withdrawalWaitingPeriod: BigInt! # Duration for withdrawal waiting period
}

And in the application consuming the subgraph data, you could compute the state when needed like so:

// Assumes the BigInt fields have already been parsed into native bigint values
// (subgraph responses return BigInt fields as strings).
function getStakeState(position: StakePosition, blockTimestamp: bigint): string {
  const cooldownEnds = position.createdAt + position.coolDownDuration;
  const lockEnds = cooldownEnds + position.lockDuration;

  if (blockTimestamp < cooldownEnds) return "COOLDOWN";
  if (blockTimestamp < lockEnds) return "LOCKED";
  return "WITHDRAWAL_READY";
}

This approach ensures your data stays accurate regardless of time passage, as the state is computed based on current blockchain conditions. Remember: Store the facts, compute the state. This principle helps maintain data consistency in your subgraph over time.
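
On the client, `BigInt` fields from a subgraph arrive as strings in the JSON response, so they need to be parsed before doing arithmetic. A possible usage sketch follows. The `RawStakePosition` type and the `parsePosition` helper are hypothetical, and the state function is repeated under a different name so the block runs on its own.

```typescript
interface RawStakePosition { id: string; createdAt: string; coolDownDuration: string; lockDuration: string }
interface ParsedStakePosition { id: string; createdAt: bigint; coolDownDuration: bigint; lockDuration: bigint }

// Convert the string-encoded integers from the response into native bigint values.
function parsePosition(raw: RawStakePosition): ParsedStakePosition {
  return {
    id: raw.id,
    createdAt: BigInt(raw.createdAt),
    coolDownDuration: BigInt(raw.coolDownDuration),
    lockDuration: BigInt(raw.lockDuration),
  };
}

// Same logic as getStakeState: derive the state from stored facts and the current time.
function stakeStateAt(position: ParsedStakePosition, nowSeconds: bigint): string {
  const cooldownEnds = position.createdAt + position.coolDownDuration;
  const lockEnds = cooldownEnds + position.lockDuration;
  if (nowSeconds < cooldownEnds) return "COOLDOWN";
  if (nowSeconds < lockEnds) return "LOCKED";
  return "WITHDRAWAL_READY";
}

// 24-hour cooldown (86400 s) followed by a 30-day lock (2592000 s).
const parsed = parsePosition({ id: "1", createdAt: "1000", coolDownDuration: "86400", lockDuration: "2592000" });
const stateNow = stakeStateAt(parsed, BigInt(Math.floor(Date.now() / 1000)));
```

The same stored record yields a different state depending on the timestamp passed in, without any new event being indexed.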

7. N+1 Query Problems in Entity Relationships

In subgraph development, the N+1 query problem is a performance challenge that occurs when your entity relationships force the need for multiple separate queries to fetch related data. This often happens when entities are not properly structured or when relationships aren't efficiently defined.

Understanding the N+1 Problem

Let's look at a problematic example:

type Pool @entity {
  id: ID!
  totalLiquidity: BigInt!
}

type Token @entity {
  id: ID!
  poolId: String!   # ❌ BAD: Only a plain ID string, no relationship to Pool
  symbol: String!
}

With this structure, to get pool data for multiple tokens, you'd need:

1 query to fetch all tokens
N queries to fetch each token's pool information

The Solution: Proper Entity Relationships

Here's how to structure it correctly:

type Pool @entity {
  id: ID!
  totalLiquidity: BigInt!
  tokens: [Token!]! @derivedFrom(field: "pool")
}

type Token @entity {
  id: ID!
  pool: Pool!       # ✅ GOOD: Direct relationship to pool
  symbol: String!
}

Now you can fetch tokens with their pool data in a single query:

{
  tokens {
    id
    symbol
    pool {
      totalLiquidity
    }
  }
}
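
To see the difference in request counts, the sketch below simulates both access patterns against a fake in-memory client. `FakeClient` and its methods are hypothetical and exist only to count round trips; they are not part of any Graph tooling.

```typescript
interface PoolRow { id: string; totalLiquidity: string }
interface TokenRow { id: string; symbol: string; poolId: string }

class FakeClient {
  requests = 0;
  private pools: PoolRow[];
  private tokens: TokenRow[];

  constructor(pools: PoolRow[], tokens: TokenRow[]) {
    this.pools = pools;
    this.tokens = tokens;
  }

  // One round trip: tokens only, with the pool referenced by a plain ID.
  fetchTokens(): TokenRow[] {
    this.requests += 1;
    return this.tokens;
  }

  // One round trip per call: a single pool looked up by ID.
  fetchPool(id: string): PoolRow | undefined {
    this.requests += 1;
    return this.pools.filter((p) => p.id === id)[0];
  }

  // One round trip: tokens with their pool resolved in the same request.
  fetchTokensWithPool(): Array<{ id: string; symbol: string; pool: PoolRow | undefined }> {
    this.requests += 1;
    return this.tokens.map((t) => ({
      id: t.id,
      symbol: t.symbol,
      pool: this.pools.filter((p) => p.id === t.poolId)[0],
    }));
  }
}

const poolRows: PoolRow[] = [{ id: "p1", totalLiquidity: "1000" }, { id: "p2", totalLiquidity: "2000" }];
const tokenRows: TokenRow[] = [
  { id: "t1", symbol: "AAA", poolId: "p1" },
  { id: "t2", symbol: "BBB", poolId: "p1" },
  { id: "t3", symbol: "CCC", poolId: "p2" },
];

// N+1 pattern: one request for the list, then one per token.
const nPlusOneClient = new FakeClient(poolRows, tokenRows);
for (const token of nPlusOneClient.fetchTokens()) {
  nPlusOneClient.fetchPool(token.poolId);
}

// Nested pattern: one request in total.
const nestedClient = new FakeClient(poolRows, tokenRows);
const tokensWithPools = nestedClient.fetchTokensWithPool();
```

With three tokens, the first pattern costs four requests and the second costs one, and the gap grows linearly with the number of tokens.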

Conclusion

Building an efficient subgraph requires careful consideration at every step - from smart contract design to entity relationships and data consistency. The practices outlined in this article serve as a foundation for creating subgraphs that are both performant and developer-friendly.

Key takeaways:

Design smart contracts with comprehensive events that minimize the need for external calls
Structure entity relationships thoughtfully, using reverse lookups and avoiding deep nesting
Maintain data consistency by updating all related entities atomically
Track counts properly and update them atomically
Keep timestamps on the entity itself and use numeric types for values you order by
Store raw data instead of derived values
Design relationships to avoid N+1 query problems

Remember that these best practices not only ensure technical excellence but also enhance the developer experience by making queries simpler, faster, and more reliable.

While this article covers many important aspects of subgraph development, the field is constantly evolving. Stay current with The Graph protocol's documentation and community discussions for the latest best practices and optimizations.

By following these guidelines, you'll be well-equipped to build subgraphs that are efficient, maintainable, and a joy for developers to work with.

About The Graph

The Graph is the leading indexing and query protocol powering the decentralized internet. Since launching in 2018, it has empowered tens of thousands of developers to effortlessly build Subgraphs and leverage Substreams across countless blockchains, including Ethereum, Solana, Arbitrum, Optimism, Base, Polygon, Celo, Soneium, and Avalanche. With powerful tools like Substreams and Token API, The Graph delivers high-performance, real-time access to onchain data. From low-latency indexing to rapid token data, it serves as the premier solution for building composable, data-driven dapps.

Category: Developer Corner
Author: Adekunle Michael Ajayi
Published: April 17, 2025
Updated: April 17, 2025
