The Graph Developer Newsletter #3

🏗️ Subscribe to get The Graph Developer Newsletter delivered straight to your inbox 🏗️

gm web3 devs.

We appreciate you building on the infrastructure provided by The Graph, and we are looking forward to sharing new developments that will improve your building experience.

Thanks for stopping by!

-

Before we go through these sections, please give us your feedback with this easy Developer Survey:

Developer Survey

Building The Graph Network is a collaborative effort and the core devs would love to hear your feedback!

How likely are you to recommend The Graph's decentralized network to a fellow web3 developer? Just click the emoji to indicate your vote.

A short survey follows that you can answer or dismiss.

🙁  😐  🙂

Topics in The Graph Developer Newsletter #3 include:

  • Easier Payments for Queries: Use your credit card, debit card, or other payment methods to pay for query fees in fiat.
  • News from the ecosystem: zkEVMs
  • File Data Sources for parallel fetching of off-chain data during indexing
  • “And/or” filters: Syntactic sugar for GraphQL queries
  • Substreams: An overview of Substreams and their power
  • GraphQL Validations overview
  • Graph Node 0.30.0 Update

Easier Payments for Queries

By integrating Banxa with The Graph, we’ll be able to use good ol’ credit or debit cards and other payment methods to buy GRT with fiat directly on Arbitrum and automatically add it to the billing balance.

For the crypto natives amongst us, it is still possible to pay directly in GRT.

After listening closely to the community, we identified a need to improve the billing experience and remove the requirement to hold GRT in order to pay for queries. 🫡

Watch this short video demo, then check out the Banxa integration docs.

News from the ecosystem: zkEVMs

It is no secret that the broader Ethereum ecosystem is very excited about zk-rollups. Compared to optimistic rollups like Arbitrum and Optimism, zk-rollups promise faster transaction finality and smaller storage consumption on Ethereum mainnet. In March, two of the most anticipated zk-rollups launched on mainnet, and Subgraph Studio already supports indexing their data:

  • zkSync Era: With network: zksync-era in the subgraph manifest.
  • Polygon zkEVM: With network: polygon-zkevm in the subgraph manifest.

Note: To build subgraphs for these protocols, make sure to have the latest @graphprotocol/graph-cli version installed: 0.44.0

File Data Sources Allow Parallel Retrieval of Off-Chain Data During Indexing

Previously, getting data from IPFS paused indexing until the file was found and retrieved. This pause often took several minutes, and sometimes the retrieval never resolved at all, resulting in failed subgraphs or missing data.

With File Data Sources, parallel retrievals and retries are now possible, empowering users to fetch data both on-chain and off-chain from IPFS at the same time!

A “day-1” use case for File Data Sources is the ability to seamlessly fetch NFT metadata from IPFS and store it in a subgraph in order to aggregate and filter by traits. Retrieving and storing this metadata without slowing down indexing will make subgraphs more resilient and the developer experience more streamlined.
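The scheduling idea behind this can be illustrated with a toy, synchronous model. This is only a sketch of the concept, not Graph Node's implementation: FileDataSourceQueue, fakeIpfs, and the CIDs are hypothetical names made up for this example, and a real subgraph would use the graph-ts APIs described in the documentation.

```typescript
// Toy model of non-blocking file retrieval. Every name here is hypothetical.
type FileHandler = (cid: string, content: string) => void;
type Fetcher = (cid: string) => string | undefined;

class FileDataSourceQueue {
  private pending: string[] = [];
  private tryFetch: Fetcher;
  private handler: FileHandler;

  constructor(tryFetch: Fetcher, handler: FileHandler) {
    this.tryFetch = tryFetch;
    this.handler = handler;
  }

  // Called while handling a block: registers the file and returns immediately.
  create(cid: string): void {
    this.pending.push(cid);
  }

  // Called between blocks: tries each pending file once, keeps failures for a later retry.
  poll(): void {
    const stillPending: string[] = [];
    for (const cid of this.pending) {
      const content = this.tryFetch(cid);
      if (content === undefined) {
        stillPending.push(cid);
      } else {
        this.handler(cid, content);
      }
    }
    this.pending = stillPending;
  }

  pendingCount(): number {
    return this.pending.length;
  }
}

// Fake IPFS lookup: "QmSlow" only resolves on its third attempt.
const attempts: { [cid: string]: number } = {};
const fakeIpfs: Fetcher = (cid) => {
  const attempt = (attempts[cid] || 0) + 1;
  attempts[cid] = attempt;
  if (cid === "QmSlow" && attempt < 3) return undefined;
  return `{"name":"${cid}"}`;
};

const storedMetadata: string[] = [];
const indexedBlocks: number[] = [];
const fileQueue = new FileDataSourceQueue(fakeIpfs, (cid) => {
  storedMetadata.push(cid);
});

for (let block = 1; block <= 3; block++) {
  if (block === 1) {
    fileQueue.create("QmFast");
    fileQueue.create("QmSlow");
  }
  indexedBlocks.push(block); // on-chain indexing never waits for a file
  fileQueue.poll();
}
```

In this model all three blocks are indexed even though one file needs several attempts, and the slow file is still stored once it becomes available.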

To use these features, update graph-cli to version 0.44.0 and graph-ts to version 0.29.3.

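In a nutshell, File Data Sources are a special type of Data Source Template: they are declared under templates in the subgraph manifest, and each one names a handler that runs once the file's content has been retrieved.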

There is a more detailed explanation of File Data Sources in the official documentation. File Data Sources are still new and experimental. If you have any questions or suggestions about them, feel free to reply directly to this email.

“And/Or” Filters Added to GraphQL Queries

The Guild has added the ability to simplify our GraphQL queries by adding native “And/Or” functionality. These filters are one of the most requested query features.

Here is an example of how to use the And/Or filter:
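As a minimal sketch, the filters can be written directly inside the where argument of a query. The entity and field names below (challenges, number, outcome) are hypothetical and depend entirely on your subgraph schema; toRequestBody is an illustrative helper, not part of any library.

```typescript
// Hypothetical entity and field names; adapt them to your own subgraph schema.

// "or": matches challenges with number >= 100 OR outcome "succeeded".
const orQuery = `
{
  challenges(
    where: { or: [{ number_gte: 100 }, { outcome: "succeeded" }] }
  ) {
    id
    number
    outcome
  }
}`;

// "and": matches challenges that satisfy BOTH conditions.
const andQuery = `
{
  challenges(
    where: { and: [{ number_gte: 100 }, { outcome: "succeeded" }] }
  ) {
    id
    number
    outcome
  }
}`;

// Illustrative helper: wraps a query in the usual GraphQL-over-HTTP JSON body,
// ready to be sent as a POST request to a subgraph query URL.
function toRequestBody(query: string): string {
  return JSON.stringify({ query: query });
}

const orRequestBody = toRequestBody(orQuery);
```

Without the or operator, the same result would typically require two separate queries whose results are merged on the client.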

A more in-depth explanation and further examples can be found in the official docs at thegraph.com/docs.

Substreams: The Next Frontier in Indexing

At Graph Day on June 2, 2022 in San Francisco, Sebastian Lorenz from The Graph Council announced the developer preview of Substreams. Technical terms like ETL-Q and parallelization were used to describe a new era of indexing, with speed gains of up to 100x 🤯. This month, StreamingFast announced the general availability of Substreams.

But what are Substreams exactly? Why are they so much faster than subgraphs? Can we use them already?

wen substreams?

Alexandre Bourget, CTO of StreamingFast, the second core dev team working on The Graph, gave a great presentation about Substreams at Devcon VI in Bogotá, which is available in the Devcon archive. Let’s dive into the technology and learn where it currently stands and how it could fit into your system architecture.

A subgraph basically is an ETL-Q process:

  • Extract: Blockchain data is extracted via the JSON-RPC interface. While this is a great, lightweight protocol for interacting directly with a blockchain node, it is slow and cumbersome for extracting data in bulk. A further limitation of that interface is the lack of an efficient way to validate data integrity.
  • Transform: Extracted data is then handed over to the so-called mappings. They take the raw data, transform it and finally hand it over to the Graph Node to store. These mappings run sequentially, block after block, which makes them slow and hard to speed up.
  • Load: The Graph Node takes that data from the mappings and stores (loads) it into a PostgreSQL database. Database interactions can become a bottleneck for high throughput subgraphs too.
  • Query: Finally, we are able to access that data through a GraphQL interface that is automatically generated according to the database schema.
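The four stages above can be sketched as a toy, in-memory pipeline. This is purely illustrative and rests on loud assumptions: the chain is a hard-coded array instead of a JSON-RPC node, the store is a plain object instead of PostgreSQL, the query is a function instead of GraphQL, and all names are hypothetical.

```typescript
// Toy ETL-Q pipeline. Nothing here is Graph Node code; all names are made up.
interface RawLog { block: number; to: string; value: number; }
interface TransferEntity { id: string; block: number; to: string; value: number; }
type EntityStore = { [id: string]: TransferEntity };

// Extract: stand-in for fetching the logs of one block over JSON-RPC.
function extract(chain: RawLog[], block: number): RawLog[] {
  return chain.filter((log) => log.block === block);
}

// Transform: stand-in for a mapping that turns raw logs into entities.
function transform(logs: RawLog[]): TransferEntity[] {
  return logs.map((log, index) => ({
    id: `${log.block}-${index}`,
    block: log.block,
    to: log.to,
    value: log.value,
  }));
}

// Load: stand-in for writing entities to the database.
function load(store: EntityStore, entities: TransferEntity[]): void {
  for (const entity of entities) {
    store[entity.id] = entity;
  }
}

// Query: stand-in for the generated GraphQL API.
function queryTransfersTo(store: EntityStore, to: string): TransferEntity[] {
  return Object.keys(store)
    .map((id) => store[id])
    .filter((entity) => entity.to === to);
}

const toyChain: RawLog[] = [
  { block: 1, to: "0xaaa", value: 5 },
  { block: 1, to: "0xbbb", value: 7 },
  { block: 2, to: "0xaaa", value: 1 },
];

const entityStore: EntityStore = {};
// Blocks are processed strictly one after another, as mappings are.
for (let block = 1; block <= 2; block++) {
  load(entityStore, transform(extract(toyChain, block)));
}

const transfersToA = queryTransfersTo(entityStore, "0xaaa");
```

The sequential loop is the part to notice: each block has to pass through all stages before the next one starts, which is what makes this design hard to speed up.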

Now enter substreams:

The StreamingFast team reimagined that process: Following engineering best practices, they broke the problem down into smaller pieces and created specialized technologies to achieve these high performance improvements:

  • Firehose: Simply put, the Firehose circumvents the JSON-RPC interface completely by extracting raw blockchain data from the nodes directly into a stream and into flat files. This enables a streaming-first approach that is highly scalable. On top of that, it makes it possible to check data integrity thanks to direct access to the Merkle roots. Another cool feature of the Firehose is that it standardizes the data schema across blockchains: not only EVM-compatible blockchains are supported, but also NEAR, Cosmos, Solana, Arweave, and Aptos. Other blockchains can be integrated easily and several integrations are currently in progress. Stay tuned 📡.
  • Substreams: Substreams are similar to a MapReduce process that consumes the raw blockchain stream from the Firehose. For readers coming from a JavaScript background, they are also similar to RxJS. Developers write composable Rust modules, which are compiled to WebAssembly and combined into a Substream. This enables high reusability, module-level caching, and parallelization. The output of a Substream is a stream of transformed, refined, strongly typed data. Think of a token price feed, for example. One Substream can be the input of another Substream.
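To make the MapReduce analogy concrete, here is a toy model of a stateless "map" step feeding an aggregating "store" step. Real Substreams modules are written in Rust and compiled to WebAssembly; this TypeScript sketch only illustrates why such a design can be parallelized, and every name in it (mapTransfers, storeVolumes, mergeVolumes, the block data) is hypothetical.

```typescript
// Toy illustration of map/store composition. Not the Substreams API.
interface Transfer { token: string; amount: number; }
interface ToyBlock { number: number; transfers: Transfer[]; }
type Volumes = { [token: string]: number };

// "Map" step: stateless, depends only on one block, so any block range
// can be processed independently of the others.
function mapTransfers(block: ToyBlock): Transfer[] {
  return block.transfers.filter((transfer) => transfer.amount > 0);
}

// "Store" step: folds mapped output into key/value state by adding amounts.
function storeVolumes(state: Volumes, transfers: Transfer[]): Volumes {
  const next: Volumes = {};
  for (const token of Object.keys(state)) {
    next[token] = state[token];
  }
  for (const transfer of transfers) {
    next[transfer.token] = (next[transfer.token] || 0) + transfer.amount;
  }
  return next;
}

// Runs both steps over a contiguous range of blocks.
function processRange(blocks: ToyBlock[]): Volumes {
  let state: Volumes = {};
  for (const block of blocks) {
    state = storeVolumes(state, mapTransfers(block));
  }
  return state;
}

// Because addition is associative, partial results from separate ranges
// can be merged into the same final state.
function mergeVolumes(a: Volumes, b: Volumes): Volumes {
  const merged: Volumes = {};
  for (const token of Object.keys(a)) {
    merged[token] = a[token];
  }
  for (const token of Object.keys(b)) {
    merged[token] = (merged[token] || 0) + b[token];
  }
  return merged;
}

const toyBlocks: ToyBlock[] = [
  { number: 1, transfers: [{ token: "GRT", amount: 10 }] },
  { number: 2, transfers: [{ token: "GRT", amount: 5 }, { token: "DAI", amount: 3 }] },
  { number: 3, transfers: [{ token: "DAI", amount: 4 }] },
  { number: 4, transfers: [{ token: "GRT", amount: 1 }] },
];

const sequentialVolumes = processRange(toyBlocks);
const firstHalf = processRange(toyBlocks.slice(0, 2));
const secondHalf = processRange(toyBlocks.slice(2));
const parallelVolumes = mergeVolumes(firstHalf, secondHalf);
```

Processing the two halves separately and merging them yields the same volumes as one sequential pass, which is the property that lets independent block ranges be worked on at the same time.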

Firehose and Substreams are already available today. In fact, Cosmos, Arweave, and NEAR subgraphs are powered by Firehose.

But what now? As hinted in the image above, we now have a stream of data, but we have not stored that data anywhere yet. This is where sinks come into play. A sink can be a database, a Slack channel, your MEV/trading bot, or anything else that can consume a data stream. The first and most obvious option is to bridge the stream to subgraphs and reuse the database and GraphQL query interface. What this could look like is described in the draft GIP from Adam Fuller: Substreams into Subgraphs: a simple integration. Some first experiments have been done and the results look great. This feature is currently in beta; if you would like to use it, simply reply to this email.

In terms of enabling Firehose, Substreams, and more data services on The Graph Network, Jannis Pohlmann has already laid out the longer-term vision in the draft GIP-0041, A World Of Data Services. Stay tuned 📡.

Speaking of substreams: Chris Steege from Messari wrote an excellent technical article about their approach and usage of substreams: Parallel Indexing Of Blockchain Data With Substreams.

GraphQL Validations: Preparing for the next iteration of GraphQL APIs on subgraphs

GraphQL has become an industry standard and there is a rich ecosystem of tools, not only from The Guild, to make our lives as developers easier. In order to make these tools play nicely together, users of GraphQL should adhere to GraphQL validation.

Previous versions of Graph Node did not implement all validations and provided responses to queries even if they were not valid. In cases of ambiguity, graph-node simply ignored the invalid components of a GraphQL operation.

GraphQL Validations support is a pillar for upcoming features and for the performance of The Graph Network at scale.

It will also ensure determinism of query responses, a key requirement on The Graph Network.

Although there is no date specified for enabling GraphQL validations, developers are encouraged to start updating their query logic in advance because enabling the GraphQL Validations is expected to break some existing queries sent to subgraph query URLs.

The Guild wrote a thorough GraphQL Validations migration guide that you can find here. The simplest way to check if queries would pass validation is to replace the GraphQL endpoint with this one:

https://api-next.thegraph.com/subgraphs/name/<GITHUB_USER>/<SUBGRAPH_NAME>
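As a small sketch, switching a client to the preview endpoint can be reduced to a URL rewrite. The helper below assumes that your current query URL is served from https://api.thegraph.com/subgraphs/name/ (an assumption, adjust it if your endpoint differs); toValidationPreviewUrl and the example-user/example-subgraph path are hypothetical.

```typescript
// Assumption: the current query URL uses the api.thegraph.com host.
const CURRENT_PREFIX = "https://api.thegraph.com/subgraphs/name/";
// Preview endpoint with GraphQL validations enabled, as given in this newsletter.
const PREVIEW_PREFIX = "https://api-next.thegraph.com/subgraphs/name/";

// Hypothetical helper: rewrites a query URL to its validation preview counterpart.
function toValidationPreviewUrl(queryUrl: string): string {
  if (queryUrl.indexOf(PREVIEW_PREFIX) === 0) {
    return queryUrl; // already pointing at the preview endpoint
  }
  if (queryUrl.indexOf(CURRENT_PREFIX) !== 0) {
    throw new Error(`Unexpected query URL: ${queryUrl}`);
  }
  return PREVIEW_PREFIX + queryUrl.slice(CURRENT_PREFIX.length);
}

// An operation that declares a variable it never uses. The GraphQL
// specification's "All Variables Used" rule makes this invalid, so it is the
// kind of query worth testing against the preview endpoint.
const queryWithUnusedVariable = `
query Tokens($first: Int) {
  tokens {
    id
  }
}`;

const previewUrl = toValidationPreviewUrl(
  "https://api.thegraph.com/subgraphs/name/example-user/example-subgraph"
);
```

Sending your existing queries to the rewritten URL in a test environment is a cheap way to find the ones that need to be updated before validations are enforced.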

The Guild is also offering free assistance to migrate your code-base. Just reply to this email if you would like to get in touch with them.

Updates across products:

Graph Node v0.30.0

Graph Node v0.30.0 was shipped by Edge & Node last month. Its features are already released and available both via the Subgraph Studio developer preview URL and on The Graph Network. We talked about File Data Sources and And/Or filters above. Here are some more highlights:

  • New Graph Node installations now mandate PostgreSQL to use C locale and UTF-8 encoding. This does not affect most subgraph developers or data consumers. However, if you are running local development scripts or CI pipelines, you may have to adjust your database initialization parameters. This can be done with initdb -E UTF8 --locale=C
  • Sorting by child entities (a.k.a. nested sorting). We can now order by properties of child entities as described here.
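As a brief sketch of nested sorting, the orderBy value joins the child field and its property with a double underscore. The tokens and owner names below are hypothetical and depend on your schema, and nestedOrderBy is just an illustrative helper.

```typescript
// Illustrative helper: builds the orderBy value for a property of a child entity.
function nestedOrderBy(childField: string, property: string): string {
  return `${childField}__${property}`;
}

// Hypothetical schema: each token has an owner, and each owner has a name.
// The query sorts tokens by the name of their owner.
const tokensByOwnerName = `
{
  tokens(orderBy: ${nestedOrderBy("owner", "name")}, orderDirection: asc) {
    id
    owner {
      name
    }
  }
}`;
```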

Full release notes for Graph Node v0.30.0

-

Happy hacking!

Marcus & Simon from Edge & Node, working on The Graph




Category: Graph Builders
Author: Marcus Rein
Published: April 1, 2023
