The Evolution of Subgraphs: What’s New and What’s Coming Next

Events, The Graph | Oct 08, 2024

The Graph Builders Office Hours #50 featured a subgraph showcase highlighting new and upcoming features.

TL;DR: Learn about the latest advancements in subgraph technology, including timeseries and aggregations, indexed argument filtering, and parallelized ETH calls, as well as the future potential of modular, reusable subgraphs and improvements to IPFS robustness within The Graph ecosystem.

The Graph ecosystem is evolving rapidly, offering more powerful tools for developers to build, manage, and optimize subgraphs. In episode 50 of The Graph Builders Office Hours, Marcus Rein and Alex Pakalniskis (both from Edge & Node) discuss advancements in subgraph technology.

This article summarizes the key updates, upcoming developments, and the future potential of subgraphs on the decentralized web.

Alex’s presentation notes

Key enhancements in subgraphs

Timeseries and aggregations

Introducing timeseries and aggregations is one of the most significant updates in subgraph technology. This experimental feature, available from spec version 1.1.0 onwards, allows developers to store raw data points as timeseries and define how these data points are aggregated over time.

Overview

  • Timeseries entities: These entities record data points with timestamps and are immutable. They are defined with @entity(timeseries: true) in the GraphQL schema. The id field must be of type Int8 and the timestamp field of type Timestamp. graph-node sets both automatically, with the timestamp taken from the current block.
  • Aggregation entities: These entities perform pre-declared calculations on the timeseries data points on an hourly or daily basis and store the results for easy access via GraphQL. Aggregations are defined with the @aggregation annotation.

Example

type Data @entity(timeseries: true) {
  id: Int8!
  timestamp: Timestamp!
  price: BigDecimal!
}

type Stats @aggregation(intervals: ["hour", "day"], source: "Data") {
  id: Int8!
  timestamp: Timestamp!
  sum: BigDecimal! @aggregate(fn: "sum", arg: "price")
}

How it works

  • Raw data points: Mappings for this schema will add data points by creating Data entities, similar to normal entities. graph-node will automatically populate the Stats aggregations whenever a given hour or day ends.
  • Dimensions and aggregates: Aggregations can contain dimensions (fields that group the data) and aggregates (fields with the @aggregate directive that perform calculations). For example, a TokenStats aggregation could group data points by token and calculate the total volume and last price in USD.
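To make the TokenStats idea concrete, here is a minimal TypeScript sketch of the rollup. It is a conceptual model only: graph-node performs this work internally, and every name in it (TokenPoint, TokenStats, rollup) is an assumption made for illustration. The token field plays the role of a dimension, while totalVolume and priceUSD stand in for "sum" and "last" aggregates.

```typescript
// Conceptual model only: graph-node performs this rollup internally.
// TokenPoint, TokenStats and rollup are hypothetical names.

interface TokenPoint {
  timestamp: number; // seconds since the Unix epoch
  token: string; // dimension: groups the data points
  amount: number; // rolled up with "sum"
  priceUSD: number; // rolled up with "last"
}

interface TokenStats {
  timestamp: number; // start of the interval
  token: string;
  totalVolume: number;
  priceUSD: number;
}

const HOUR = 3600;

function rollup(points: TokenPoint[], intervalSeconds: number): TokenStats[] {
  const buckets = new Map<string, TokenStats>();
  // Process points in timestamp order so that "last" means the latest point.
  const ordered = points.slice().sort((a, b) => a.timestamp - b.timestamp);
  for (const p of ordered) {
    const start = Math.floor(p.timestamp / intervalSeconds) * intervalSeconds;
    const key = `${p.token}:${start}`; // one bucket per (dimension, interval)
    const bucket = buckets.get(key);
    if (bucket === undefined) {
      buckets.set(key, {
        timestamp: start,
        token: p.token,
        totalVolume: p.amount,
        priceUSD: p.priceUSD,
      });
    } else {
      bucket.totalVolume += p.amount; // "sum"
      bucket.priceUSD = p.priceUSD; // "last"
    }
  }
  return Array.from(buckets.values());
}

const hourly = rollup(
  [
    { timestamp: 7200, token: "A", amount: 10, priceUSD: 2 },
    { timestamp: 7300, token: "A", amount: 5, priceUSD: 3 },
    { timestamp: 7400, token: "B", amount: 1, priceUSD: 100 },
    { timestamp: 10800, token: "A", amount: 7, priceUSD: 4 },
  ],
  HOUR,
);
console.log(hourly);
```

The sample data produces three rows: token A and token B each get a bucket for the hour starting at 7200, and token A gets a second bucket for the hour starting at 10800.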

Advanced features

During the session, Marcus raises a key question: “How do these aggregations handle the complexity of managing cumulative data across different intervals?”

Alex replies: “The answer lies in the flexibility of the @aggregate directive, which supports a cumulative flag. This allows developers to decide whether the aggregation should only sum data within the interval or continue accumulating values over time.”
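The difference between the two modes can be sketched in a few lines of TypeScript. This is a conceptual model under stated assumptions, not graph-node code: sumPerInterval and cumulativeSum are hypothetical helpers that show what a per-interval sum and a cumulative sum would each report for the same data points.

```typescript
// Conceptual model only; function and variable names are hypothetical.

// Sum of the data points that fall inside each interval, keyed by interval start.
function sumPerInterval(
  points: { timestamp: number; price: number }[],
  intervalSeconds: number,
): Map<number, number> {
  const sums = new Map<number, number>();
  for (const p of points) {
    const start = Math.floor(p.timestamp / intervalSeconds) * intervalSeconds;
    sums.set(start, (sums.get(start) ?? 0) + p.price);
  }
  return sums;
}

// Running total: each interval's value also includes every earlier interval.
function cumulativeSum(perInterval: Map<number, number>): Map<number, number> {
  const starts = Array.from(perInterval.keys()).sort((a, b) => a - b);
  const result = new Map<number, number>();
  let running = 0;
  for (const start of starts) {
    running += perInterval.get(start) ?? 0;
    result.set(start, running);
  }
  return result;
}

const samplePoints = [
  { timestamp: 0, price: 1 },
  { timestamp: 1800, price: 2 },
  { timestamp: 3600, price: 4 },
  { timestamp: 7200, price: 8 },
];
const perHour = sumPerInterval(samplePoints, 3600);
const runningTotal = cumulativeSum(perHour);
console.log(perHour, runningTotal);
```

For the sample data, the per-interval sums are 3, 4 and 8, while the cumulative version reports 3, 7 and 15 for the same three hours.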

Example GraphQL query

{
  stats(interval: "hour", where: { timestamp_gt: 1704085200 }) {
    id
    timestamp
    sum
  }
}

This query retrieves the hourly Stats entries with timestamps greater than 1704085200 and returns the aggregated sum for each interval.
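A client could send this query as an ordinary GraphQL POST request. The sketch below only builds the request options. buildStatsRequest is a hypothetical helper, and the endpoint mentioned in the comment is a placeholder, not a real subgraph URL.

```typescript
// Hypothetical helper: builds the fetch() options for the stats query above.
function buildStatsRequest(interval: "hour" | "day", afterTimestamp: number) {
  const query = `{
  stats(interval: "${interval}", where: { timestamp_gt: ${afterTimestamp} }) {
    id
    timestamp
    sum
  }
}`;
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  };
}

const statsRequest = buildStatsRequest("hour", 1704085200);
// Usage, with a placeholder URL:
//   const response = await fetch("https://example.com/my-subgraph", statsRequest);
console.log(statsRequest.body);
```

Keeping the interval and timestamp as function parameters makes it easy to switch between hourly and daily buckets without editing the query text by hand.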

Indexed argument filtering

Indexed argument filtering is another breakthrough feature, which allows developers to precisely filter blockchain events based on the values of indexed arguments. This feature enables more efficient data management by reducing data bloat and improving the speed of subgraphs.

For example, in token transfer events, developers can now specify which addresses to monitor, ensuring that only relevant data is indexed. This pre-filtering capability is particularly beneficial in scenarios where subgraphs handle large volumes of data but only need specific information.

Example

eventHandlers:
  - event: Transfer(indexed address,indexed address,uint256)
    handler: handleDirectedTransfer
    topic1: ['0xAddressA'] # first indexed argument: sender address
    topic2: ['0xAddressB'] # second indexed argument: receiver address

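With this configuration, the subgraph indexes only Transfer events whose first indexed argument (the sender) is 0xAddressA and whose second indexed argument (the receiver) is 0xAddressB. Transfers in the opposite direction, or involving any other address, are skipped before they reach the handler, which saves indexing resources.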
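The matching logic can be modeled in TypeScript as follows. This is a sketch under assumptions, not graph-node's implementation: EventLog, TopicFilter, addressToTopic and matchesFilter are hypothetical names, filter values are assumed to be plain 20-byte addresses, and the first topic is a placeholder string rather than a real signature hash. It relies on two facts about EVM logs: the first topic holds the event signature hash, and an indexed address argument is stored as a 32-byte topic, left-padded with zeros.

```typescript
// Conceptual model only; graph-node applies the filter itself.
// EventLog, TopicFilter and the helper names are hypothetical.

interface EventLog {
  // topics[0] is the event signature hash; topics[1..3] hold indexed arguments.
  topics: string[];
}

interface TopicFilter {
  topic1?: string[];
  topic2?: string[];
  topic3?: string[];
}

// An indexed address is a 32-byte topic: the 20-byte address left-padded with zeros.
function addressToTopic(address: string): string {
  const hex = address.toLowerCase().replace(/^0x/, "");
  return "0x" + hex.padStart(64, "0");
}

// Values within one topic are alternatives (OR); all listed topics must match (AND).
function matchesFilter(entry: EventLog, filter: TopicFilter): boolean {
  const wanted = [filter.topic1, filter.topic2, filter.topic3];
  return wanted.every((allowed, i) => {
    if (allowed === undefined) return true; // no constraint on this position
    const actual = (entry.topics[i + 1] ?? "").toLowerCase();
    return allowed.some((value) => addressToTopic(value) === actual);
  });
}

const addressA = "0x00000000000000000000000000000000000000aa";
const addressB = "0x00000000000000000000000000000000000000bb";
const addressC = "0x00000000000000000000000000000000000000cc";

// Builds a Transfer-like log; the first topic is a placeholder, not a real hash.
function transferLog(from: string, to: string): EventLog {
  return { topics: ["0xPlaceholderSignatureHash", addressToTopic(from), addressToTopic(to)] };
}

const directedFilter: TopicFilter = { topic1: [addressA], topic2: [addressB] };
console.log(matchesFilter(transferLog(addressA, addressB), directedFilter));
console.log(matchesFilter(transferLog(addressB, addressA), directedFilter));
```

Only the transfer from address A to address B passes the filter. The reverse direction fails because each topic position is checked separately.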

Parallelized ETH calls

ETH calls have traditionally been a bottleneck in subgraph performance, particularly when processing large volumes of data. To address this, The Graph introduced declarative ETH calls, allowing these calls to be executed in parallel. This improvement boosts the efficiency of subgraphs, especially in high-transaction environments.

While still in its early stages, the ability to parallelize ETH calls is expected to make subgraphs more responsive and capable of handling complex operations faster.

Please note: Edge & Node is no longer pursuing this work as of September 23, 2024.

Example

calls:
  # Declared on an event handler, in the form label: Contract[address].function(args)
  global0X128: Pool[event.address].feeGrowthGlobal0X128()
  global1X128: Pool[event.address].feeGrowthGlobal1X128()

Because both calls are declared in the manifest rather than made from inside the mapping code, graph-node knows about them before the handler runs and can issue them in parallel instead of one at a time.
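The benefit of declaring calls up front can be illustrated with a small TypeScript model. It is only a sketch: fakeEthCall is a hypothetical stub with artificial latency, not a graph-node API, and the returned values are made up. The counter records how many calls are in flight at once under each style.

```typescript
// Conceptual model only: a stubbed contract call with artificial latency.
// fakeEthCall and the counters are hypothetical, not a graph-node API.

let inFlight = 0;
let maxInFlight = 0;

function fakeEthCall(label: string, value: number): Promise<[string, number]> {
  inFlight += 1;
  maxInFlight = Math.max(maxInFlight, inFlight);
  return new Promise((resolve) => {
    setTimeout(() => {
      inFlight -= 1;
      resolve([label, value]);
    }, 10);
  });
}

// Imperative style: the second call starts only after the first returns.
async function sequentialCalls(): Promise<Map<string, number>> {
  const first = await fakeEthCall("global0X128", 1);
  const second = await fakeEthCall("global1X128", 2);
  return new Map([first, second]);
}

// Declared style: every call is known up front, so all can be issued at once.
async function parallelCalls(): Promise<Map<string, number>> {
  const results = await Promise.all([
    fakeEthCall("global0X128", 1),
    fakeEthCall("global1X128", 2),
  ]);
  return new Map(results);
}

// Runs one strategy and reports the highest number of simultaneous calls.
async function peakConcurrency(
  run: () => Promise<Map<string, number>>,
): Promise<number> {
  maxInFlight = 0;
  await run();
  return maxInFlight;
}
```

Running peakConcurrency(sequentialCalls) reports a peak of one call in flight, while peakConcurrency(parallelCalls) reports two. With real RPC latency, the sequential version waits for the sum of the call times and the parallel version for roughly the slowest single call.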

Marcus highlights the importance of this feature, asking: “What kind of performance gains can we expect when handling high-frequency ETH calls?”

Alex explains: “While exact metrics are still being gathered, the expected improvements are substantial, particularly for subgraphs dealing with high-volume data.”

What’s next for subgraphs?

Modular subgraphs as reusable data artifacts

One of the most exciting prospects is the concept of subgraphs as modular, reusable data artifacts. This development will allow developers to use existing subgraphs or components of subgraphs as data sources, which can then be extended and customized for specific needs. This modular approach will enable developers to build on top of each other’s work, fostering a more interconnected and efficient ecosystem.

This shift could revolutionize how we use subgraphs, turning them into flexible building blocks that developers can combine to create sophisticated applications. The potential for reusability and composability in subgraphs opens up new avenues for innovation and collaboration within the decentralized web.

Example

dataSources:
  - kind: subgraph
    name: Foo
    network: mainnet
    source:
      id: 'Qmblahblahblah'
      startBlock: 123456
    mapping:
      kind: subgraph/entities
      apiVersion: 0.0.y
      language: wasm/assemblyscript
      entityHandlers:
        - entity: FooSpecificEntity
          handler: handleFooSpecificEntity
        - entity: SomeOtherFooEntity
          handler: handleSomeOtherFooEntity
      file: ./src/fooMappings.ts

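This manifest declares a subgraph data source named Foo on mainnet. It reads from the source subgraph identified by the given deployment ID, starting at block 123456, and routes two of that subgraph's entity types to handlers written in AssemblyScript.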
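One way to picture how entityHandlers might behave is as a lookup table from entity type to handler function. The TypeScript below is a conceptual model only. The real handler signature for subgraph data sources was not shown in the session, so SourceEntity, EntityHandler and dispatch are all hypothetical names.

```typescript
// Conceptual model only; the real handler signature for subgraph data sources may differ.
// SourceEntity, EntityHandler and dispatch are hypothetical names.

interface SourceEntity {
  type: string; // entity type name in the source subgraph
  id: string;
  fields: Record<string, unknown>;
}

type EntityHandler = (entity: SourceEntity) => void;

// Records which handlers ran, so the effect of dispatching is visible.
const handled: string[] = [];

// Mirrors the entityHandlers list in the manifest: entity type to handler.
const entityHandlers = new Map<string, EntityHandler>([
  ["FooSpecificEntity", (e) => { handled.push(`foo:${e.id}`); }],
  ["SomeOtherFooEntity", (e) => { handled.push(`other:${e.id}`); }],
]);

// Routes each entity to its handler; entity types without a handler are skipped.
function dispatch(entities: SourceEntity[]): void {
  for (const entity of entities) {
    const handler = entityHandlers.get(entity.type);
    if (handler !== undefined) handler(entity);
  }
}

dispatch([
  { type: "FooSpecificEntity", id: "1", fields: {} },
  { type: "Unrelated", id: "2", fields: {} },
  { type: "SomeOtherFooEntity", id: "3", fields: {} },
]);
console.log(handled);
```

In this model the downstream subgraph only reacts to the entity types it lists, which is what lets it build on part of another subgraph without reindexing everything that subgraph tracks.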

Marcus poses a thought-provoking question: “How will this modularity impact the way developers collaborate across different projects?”

Alex responds: “The modular subgraphs approach encourages reuse and collaboration, allowing developers to build on top of each other’s work, leading to more cohesive and interconnected decentralized applications.”

IPFS robustness improvements

While not as flashy as other updates, ongoing work to improve IPFS robustness is crucial for ensuring the reliability and scalability of The Graph’s infrastructure. The migration to the IPFS Gateway API and the separation of file hosting for subgraph definition files versus optional data sources will help reduce collisions and enhance overall performance.

The future of subgraphs: a playground for innovation

Beyond these updates, The Graph ecosystem is exploring innovative ideas, such as a plugin system for subgraphs. This system would allow developers to extend subgraph functionality with minimal effort, potentially integrating features like ENS name resolution or decentralized oracle price feeds.

How it works

  • Plugins: Developers can specify plugins in the subgraph manifest, which extend the functionality of subgraphs with new features like ENS name lookups or price feeds.

Example

plugIns:
  - ENS
  - Chronicle

In this proposed syntax, listing the ENS and Chronicle plugins would extend the subgraph with ENS name resolution and oracle price feeds.

During the discussion, Marcus asks: “What kind of new possibilities does this plugin system open up for developers?”

Alex replies: “The plugin system is expected to significantly lower the barrier for extending subgraph functionality, enabling more customized and innovative solutions within the decentralized web.”

Stay in touch

From aggregation and filtering enhancements to the potential of modular subgraphs, these advancements will equip developers with powerful tools for building the decentralized applications of the future.

To stay updated or dive deeper into these developments, connect with the community at Graph Builders on X, explore resources on Graph BuildersDAO Discord, and catch up on all the discussions with the playlist for Graph Builders Office Hours.

💡 This article answers questions like:
- What new features have been introduced to subgraphs?
- What features are planned for subgraphs in the future?
- What do timeseries and aggregations allow a developer to do?
- What is indexed argument filtering and what does it do?
- How could subgraphs be modular and reusable in the future?

Author

I am a dedicated member of the Graph Advocates DAO and proud to be a part of the Graphtronauts community. As a passionate crypto investor and enthusiast, I have delved into the world of decentralized technologies, with a strong focus on The Graph protocol. My journey includes writing insightful blogs for Graphtronauts and contributing to the development of subgraph documentation for various projects within The Graph ecosystem. Most recently, I have taken on the role of a Pinax technical writer, further expanding my commitment to advancing the adoption and understanding of blockchain and Graph-based technologies.

https://twitter.com/PaulBarba12
https://github.com/PaulieB14
https://hey.xyz/u/paulieb
https://medium.com/@paulieb.eth/about
