The Graph Indexer Office Hours #155

Events · May 03, 2024
TL;DR: This recap of Indexer Office Hours covers the announcement of the new Geth release v1.14.0, which includes Firehose support, and a lively discussion about indexer automation.

Opening remarks

Hello everyone, and welcome to the latest edition of Indexer Office Hours! Today is April 30, and we’re here for episode 155.

GRTiQ 166

Don’t miss the GRTiQ Podcast with Connor Howe, founder at Enso Finance, a platform providing financial infrastructure for developers seeking to integrate DeFi interactions into their dApps.

Repo watch

The latest updates to important repositories

Execution Layer Clients

  • Geth: New release v1.14.0:
    • Geth v1.14.0 (Asteria) is a major release with new features and some breaking changes. Please read the release notes before updating.
    • State Trie Representation Change: Default state trie representation has shifted from hash mode to path mode, improving historical state pruning and controlling state growth.
    • Live Tracing Feature: A new live-tracing feature was introduced, allowing transaction tracers to be embedded directly into the block processing pipeline. This enhances flexibility but requires modifications at the source code level.
    • Transaction Propagation Improvement: Changes to transaction propagation to minimize reordering and reduce nonce gaps, improving network consistency.
    • Dropping Pre-Merge Network Support: No longer supports running pre-merge networks, focusing on post-merge network operations driven by a consensus client.
    • Reductions in Automatic Operations: Stops automatic construction of pending blocks and removes support for filtering pending logs, moving to more on-demand operations.
    • Beacon Chain Light Client Integration: Ships with a beacon chain light client that follows the beacon chain using REST API of beacon nodes, suitable for light client operations but not for validation or staking.
    • Go Version Update: Upgraded to use Go v1.22 by default, dropping support for Go v1.20.
  • Matthew from Pinax shared in the chat:
    • The Geth release includes Firehose support!
    • From the release notes: Geth v1.14.0 introduces a brand new live-tracing feature, where one or more transaction tracers might be injected into the block processing pipeline, ensuring that tracing and execution happen in lockstep (#29189). Since Go does not have a cross-platform OS native plugin infrastructure, adding live tracers needs to be done at the Geth source code level, and Geth itself subsequently rebuilt. That said, the advantage is that such tracers have full execution flexibility to do whatever they like and however they like. Please see the live-tracer changelog and docs for details.
      • Live tracing runs in lockstep with block execution, one waiting for the other. You should never run a live tracer on a validating node, or any other where latency is important. The recommended practice is to have tracers collect and export the bare minimum data needed and do any post-processing in external systems where latency is not relevant.
      • The live tracer work required a number of breaking internal API changes. If you had your own native tracers implemented before this change, the changelog contains the necessary steps needed to update your old code for the new APIs.
    • ^^^ this supports Firehose.
    • So any fork of Geth now gets Firehose support!
  • Nethermind: New release v1.26.0:
    • This release focuses on substantial backend improvements, particularly in state management and sync efficiency. It also includes better support features and an ongoing commitment to security and performance enhancements.
    • State Design Upgrade to Half-Path: Introduces a new state design called half-path, which aims to improve validators’ performance and reduce archive node sync times and database size.
    • Performance Improvements:
      • Block processing times improved by approximately 30% to 50%.
      • State database size reduced by about 25%.
      • Overall database size post-snap sync reduced by approximately 50 GB.
      • State database growth rate significantly decreased.
    • Migration to Half-Path:
      • Default state design for newly synced nodes.
      • Existing nodes remain on the old hash design unless manually switched.
      • Migration options include full resync or full pruning with specific settings adjustments for efficiency.
    • Snap Serving Support: Nethermind now supports snap serving, enhancing network redundancy and reducing reliance on Geth.
    • New Docker Image: Introduction of a chiseled rootless Docker image for enhanced security.
    • Deprecations:
      • Removal of Snappy dependency.
      • Several sync options and metrics deprecated, with a shift to more general, labeled metrics.
  • Celo: New release v1.8.4:
    • The v1.8.3 release introduced a bug in the effectiveGasPrice calculation for the RPC output. This release fixes that bug. Only the reported values were affected; all fee payments were correct the entire time.
  • Arbitrum Nitro: New releases:
    • v2.3.4-rc.3:
      • Compared to the previous release candidate, this one improves EIP-4844 batch posting and adds support for Redis streams as a block validation work queue.
    • v2.3.4-rc.4:

Consensus Layer Clients

Information on the different clients

  • Teku: New release 24.4.4:
    • This is a recommended update; it contains performance improvements and removes support for the Goerli network (--network=goerli).
  • Nimbus: New release v24.4.0:

Graph Stack

Graph Orchestration Tooling

Join us every other Wednesday at 5 PM UTC for Launchpad Office Hours and get the latest updates on running Launchpad.

The next one is on May 8. Bring all your questions!

Blockchains Operator Upgrade Calendar

The Blockchains Operator Upgrade Calendar is your one-stop solution for tracking hard fork updates and scheduled maintenance for various protocols within The Graph ecosystem.

Simplify your upgrade process and never miss a deadline again.

Add to your calendar!

Protocol watch

The latest updates on important changes to the protocol

Forum Governance

Contracts Repository

Open discussion

Sunbeam has begun

Marcus, Developer Relations at Edge & Node, reminded everyone that the upgrade window is now open (it closes on June 12).

Now is the time to reach out to hosted subgraph owners and remind them to upgrade to the decentralized network.

Also, suggest that they optimize their subgraphs after upgrading.

Indexer automation

Abel moderated a discussion on indexer automation, a topic Marc-André from Ellipfra raised last week. Since he raised it, Marc-André was given the floor to clarify his concerns.

With the large influx of new subgraphs, and more arriving every week, it's getting difficult to handle them all and determine which subgraphs to index. The current tooling revolves around the indexer figuring out the optimal allocation size for every subgraph. It doesn't provide enough information to make those decisions quickly and easily, and decision-making has become a bottleneck for indexers.

Marc-André described his workflow and the factors he considers:

  • Can I expect any query traffic coming from the subgraph?
  • Is the source a reputable project or team?
  • Is the subgraph resource-heavy or difficult to index?
  • Game theory, or prediction: what total allocation will this subgraph attract? I have to predict how much allocation I will see at the end of the day, next week, or at the end of the epoch, because I will size my allocation differently if I expect a lot of other allocations. If I expect to be the only one indexing that subgraph, my allocation will be very small.
  • Am I the first to allocate? I have to be very careful about sizing it accordingly. Sometimes, I will use Explorer to see if anyone else managed to sync the subgraph.
  • Does it look difficult, or will it take a long time to index this subgraph? How heavy is it? Is it using a lot of resources?
  • Is it killing my RPC node? Is my database on its knees?
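
Several of these checks could, in principle, be encoded as a programmatic pre-filter before any human review. The sketch below is a toy illustration: the thresholds, fields, and deployment hashes are made up, and in practice the inputs would come from the network subgraph, Explorer, and the indexer's own monitoring.

```python
# Toy pre-filter over subgraph candidates; thresholds and fields are made up.
from dataclasses import dataclass

@dataclass
class SubgraphCandidate:
    deployment: str          # IPFS hash of the deployment
    signalled_grt: float     # current curation signal on the subgraph
    indexer_count: int       # how many indexers have already allocated
    expected_queries: bool   # reputable project with likely query traffic?
    synced_by_others: bool   # has anyone on Explorer reached chain head?

def worth_considering(c: SubgraphCandidate) -> bool:
    # No signal and no expected traffic: not worth the resources.
    if c.signalled_grt < 100 and not c.expected_queries:
        return False
    # Being first to allocate carries sync risk; want evidence it syncs at all.
    if c.indexer_count == 0 and not c.synced_by_others:
        return False
    return True

candidates = [
    SubgraphCandidate("QmAaa", 5_000.0, 3, True, True),
    SubgraphCandidate("QmBbb", 10.0, 0, False, False),
]
shortlist = [c.deployment for c in candidates if worth_considering(c)]
print(shortlist)  # only "QmAaa" survives the filter
```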

Alexis from Semiotic Labs shared his process in the chat:

  1. Run the allocation optimizer without filters, examine all the subgraphs it gives me, and off-chain sync what I like.
  2. Remove the failed ones.
  3. Run the allocation optimizer with a whitelist containing only the subgraphs I got to chain head, then allocate.

All this is handled using his Python wrapper script.
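
As a rough illustration of that three-step flow, here is a minimal sketch of such a wrapper. The index-node status query is the standard graph-node one; the whitelist file and the optimizer invocation are placeholders, not the actual Semiotic tooling interface.

```python
"""Sketch of a wrapper around the allocation optimizer (hypothetical paths
and invocation; only the graph-node status query is the standard one)."""
import json
import subprocess
import urllib.request

STATUS_URL = "http://localhost:8030/graphql"  # graph-node index-node status endpoint

def indexing_statuses() -> list:
    """Ask the local index node which deployments are healthy and synced."""
    query = "{ indexingStatuses { subgraph health synced fatalError { message } } }"
    req = urllib.request.Request(
        STATUS_URL,
        data=json.dumps({"query": query}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]["indexingStatuses"]

# Step 1 happens out of band: run the optimizer without filters and
# off-chain sync the deployments that look attractive.

# Step 2: drop the deployments that failed during off-chain sync.
healthy = [s["subgraph"] for s in indexing_statuses()
           if s["health"] == "healthy" and s["synced"]]

# Step 3: re-run the optimizer with a whitelist of deployments at chain head.
with open("whitelist.json", "w") as f:
    json.dump(healthy, f)
# Placeholder command; the real invocation depends on your optimizer setup.
subprocess.run(["./run-allocation-optimizer", "--whitelist", "whitelist.json"],
               check=True)
```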

Vincent from Data Nexus shared in the chat:

I have filters in my allocation tool that I use to get close to what we’re looking for. It is limited to network subgraph data though.

Derek from Data Nexus shared:

There is a tool that gives you a view of how many entities a subgraph has. That doesn’t translate to how many table records, but it at least gives you a sense of how many entities are in there. You can use this as a general feel for how big the subgraph is, assuming the Subgraph Studio indexer was able to sync it and is reporting the data.
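
That per-deployment entity count is exposed by graph-node's index-node status API. A small sketch, assuming you have a status endpoint to query (the URL and deployment hash below are placeholders):

```python
"""Read a deployment's entity count from a graph-node index-node status
endpoint (the URL and deployment hash are placeholders)."""
import json
import urllib.request

STATUS_URL = "http://localhost:8030/graphql"  # assumed status endpoint

def entity_count(deployment: str) -> int:
    query = """
    query ($ids: [String!]!) {
      indexingStatuses(subgraphs: $ids) { subgraph entityCount }
    }"""
    payload = {"query": query, "variables": {"ids": [deployment]}}
    req = urllib.request.Request(
        STATUS_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        statuses = json.load(resp)["data"]["indexingStatuses"]
    # entityCount comes back as a string (BigInt); 0 if the node has no data.
    return int(statuses[0]["entityCount"]) if statuses else 0

print(entity_count("QmExampleDeploymentHash"))  # rough proxy for subgraph size
```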

Abel asked Derek:

Q1: You started leveraging QoS data and mashing it with the network subgraph and your local subgraph deployment table, which is interesting. I would love to learn more about that. How is that useful? What are you using it for?

Derek:

With the number of subgraphs we've been seeing, it's been a little tricky to keep up with everything. So I've been thinking: what if we take the other route? We intentionally partition a certain portion of our stake specifically for query fee production.

A large percentage goes toward whatever gives us the best return on indexing rewards, but then there's a small percentage where we'll put 1,000 GRT allocations. Presumably, that's all that should be needed in many cases to claim the full, or near-full, query fee amount.

So I'm trying to get an idea of which of these subgraphs to start syncing. Sometimes they have signal, but in my talks with different consumers, a lot of them don't yet understand that mechanism well. And now that the Edge & Node upgrade indexer is already allocating to these subgraphs, especially ones that have come over from the hosted service and are already at chain head, a consumer could, in theory, start querying without any signal and never really need to add signal, because there's already at least one indexer on the subgraph.

So we synced the QoS subgraph, and I've been writing queries that join it with other subgraphs and with what we've synced locally. From this, we can identify which subgraph produced how many query fees in the last 24 hours. We can also determine whether we're allocated to it, whether we've synced it, whether we're syncing it, or whether we haven't started syncing.

We also look at how many indexers are on it, to get an idea of how many query fees we could potentially earn, knowing that this allocation is not going to produce indexing rewards; it is purely there to generate query fees.
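
The shape of that join might look something like the sketch below. The QoS entity and field names here are hypothetical stand-ins, since the actual QoS subgraph schema isn't covered in the call; the point is cross-referencing fee data with local indexing state.

```python
"""Sketch of the QoS-to-local join. The QoS entity and field names are
hypothetical stand-ins for the real QoS subgraph schema."""
import json
import urllib.request

QOS_URL = "http://localhost:8000/subgraphs/name/qos"  # locally synced QoS subgraph (assumed name)
STATUS_URL = "http://localhost:8030/graphql"          # index-node status endpoint

def gql(url: str, query: str) -> dict:
    req = urllib.request.Request(url, data=json.dumps({"query": query}).encode(),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]

# Hypothetical query: fees per deployment over the last 24 hours.
fees = gql(QOS_URL, """{
  deploymentDailyFees(first: 1000, orderBy: queryFees, orderDirection: desc) {
    deployment
    queryFees
  }
}""")["deploymentDailyFees"]

# What this indexer is actually syncing locally.
local = {s["subgraph"]
         for s in gql(STATUS_URL, "{ indexingStatuses { subgraph } }")["indexingStatuses"]}

# The join: fee-producing deployments we have not started syncing yet.
for point in fees:
    if point["deployment"] not in local:
        print(point["deployment"], point["queryFees"])
```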

Payne from Stake Squid asked the team that built the allocation optimizer:

Q2: Do you know if it has a function that also creates allocations for subgraphs? For example, say you want to allocate to all the subgraphs on the network. Can we make it do that?

Chris from GraphOps:

Yes. If you set the pinned list to a long set of subgraphs, it should optimize over that list, and the resulting output will create allocations for everything. The ones in the optimization set will have the appropriate allocation size, and the ones that fall outside of the optimization set will get a 0.1 GRT allocation size.

Hope from GraphOps typed in the chat: You just have to provide all the network names you support as an indexer.
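
Putting those two answers together, a pinned list could be generated from the network subgraph and handed to the optimizer config. A sketch, with assumed field names on the network subgraph and an assumed local endpoint:

```python
"""Build a long pinned list from the network subgraph (field names and the
local endpoint are assumptions) and write it out for the optimizer config."""
import json
import urllib.request

NETWORK_SUBGRAPH = "http://localhost:8000/subgraphs/name/graph-network"  # assumed local sync
SUPPORTED_NETWORKS = {"mainnet", "arbitrum-one", "gnosis"}  # chains this indexer supports

query = """{
  subgraphDeployments(first: 1000) {
    ipfsHash
    manifest { network }
  }
}"""
req = urllib.request.Request(NETWORK_SUBGRAPH,
                             data=json.dumps({"query": query}).encode(),
                             headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    deployments = json.load(resp)["data"]["subgraphDeployments"]

pinned = [d["ipfsHash"] for d in deployments
          if d["manifest"]["network"] in SUPPORTED_NETWORKS]

# Hand this list to wherever your optimizer config expects its pinned set.
with open("pinnedlist.json", "w") as f:
    json.dump(pinned, f, indent=2)
```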

Chris from GraphOps asked:

Q3: Do you think we need better tools to drive the indexer agent?

Marc-André:

In the early days, the design intention was for the indexer agent to handle all these things automatically. We all quickly saw that we needed a human in the loop to validate everything. That's what I'm doing; I'm revalidating everything. But if we're not going to remove the human from the loop altogether, we at least need to ensure that the agent, or whatever else, is able to make the vast majority of allocation and unallocation decisions on its own.
