The Graph Indexer Office Hours #155

Events · May 03, 2024
TL;DR: This recap of Indexer Office Hours covers the announcement of the new Geth release v1.14.0, which includes Firehose support, and a lively discussion about indexer automation.

Opening remarks

Hello everyone, and welcome to the latest edition of Indexer Office Hours! Today is April 30, and we’re here for episode 155.

GRTiQ 166

Don’t miss the GRTiQ Podcast with Connor Howe, founder at Enso Finance, a platform providing financial infrastructure for developers seeking to integrate DeFi interactions into their dApps.

Repo watch

The latest updates to important repositories

Execution Layer Clients

  • Geth: New release v1.14.0:
    • Geth v1.14.0 (Asteria) is a major release with new features and some breaking changes. Please read the release notes before updating.
    • State Trie Representation Change: Default state trie representation has shifted from hash mode to path mode, improving historical state pruning and controlling state growth.
    • Live Tracing Feature: A new live-tracing feature was introduced, allowing transaction tracers to be embedded directly into the block processing pipeline. This enhances flexibility but requires modifications at the source code level.
    • Transaction Propagation Improvement: Changes to transaction propagation to minimize reordering and reduce nonce gaps, improving network consistency.
    • Dropping Pre-Merge Network Support: No longer supports running pre-merge networks, focusing on post-merge network operations driven by a consensus client.
    • Reductions in Automatic Operations: Stops automatic construction of pending blocks and removes support for filtering pending logs, moving to more on-demand operations.
    • Beacon Chain Light Client Integration: Ships with a beacon chain light client that follows the beacon chain using REST API of beacon nodes, suitable for light client operations but not for validation or staking.
    • Go Version Update: Upgraded to use Go v1.22 by default, dropping support for Go v1.20.
  • Matthew from Pinax shared in the chat:
    • The Geth release includes Firehose support!
    • From the release notes: Geth v1.14.0 introduces a brand new live-tracing feature, where one or more transaction tracers might be injected into the block processing pipeline, ensuring that tracing and execution happen in lockstep (#29189). Since Go does not have a cross-platform OS native plugin infrastructure, adding live tracers needs to be done at the Geth source code level, and Geth itself subsequently rebuilt. That said, the advantage is that such tracers have full execution flexibility to do whatever they like and however they like. Please see the live-tracer changelog and docs for details.
      • Live tracing runs in lockstep with block execution, one waiting for the other. You should never run a live tracer on a validating node, or any other where latency is important. The recommended practice is to have tracers collect and export the bare minimum data needed and do any post-processing in external systems where latency is not relevant.
      • The live tracer work required a number of breaking internal API changes. If you had your own native tracers implemented before this change, the changelog contains the necessary steps needed to update your old code for the new APIs.
    • ^^^ this supports Firehose.
    • So any fork of Geth now gets Firehose support!
  • Nethermind: New release v1.26.0:
    • This release focuses on substantial backend improvements, particularly in state management and sync efficiency. It also includes better support features and an ongoing commitment to security and performance enhancements.
    • State Design Upgrade to Half-Path: Introduces a new state design called half-path, which aims to improve validators’ performance and reduce archive node sync times and database size.
    • Performance Improvements:
      • Block processing times improved by approximately 30% to 50%.
      • State database size reduced by about 25%.
      • Overall database size post-snap sync reduced by approximately 50 GB.
      • State database growth rate significantly decreased.
    • Migration to Half-Path:
      • Default state design for newly synced nodes.
      • Existing nodes remain on the old hash design unless manually switched.
      • Migration options include full resync or full pruning with specific settings adjustments for efficiency.
    • Snap Serving Support: Nethermind now supports snap serving, enhancing network redundancy and reducing reliance on Geth.
    • New Docker Image: Introduction of a chiseled rootless Docker image for enhanced security.
    • Deprecations:
      • Removal of Snappy dependency.
      • Several sync options and metrics deprecated, with a shift to more general, labeled metrics.
  • Celo: New release v1.8.4:
    • The v1.8.3 release introduced a bug in the effectiveGasPrice calculation for the RPC output. This release fixes that bug. Only the reported values were affected; all fee payments were correct the entire time.
  • Arbitrum Nitro: New releases:
    • v2.3.4-rc.3:
      • Compared to the previous release candidate, this one improves EIP-4844 batch posting and adds support for Redis streams as a block validation work queue.
    • v2.3.4-rc.4:

Consensus Layer Clients

Information on the different clients

  • Teku: New release 24.4.4:
    • This is a recommended update; it contains performance improvements and removes support for the Goerli network (--network=goerli).
  • Nimbus: New release v24.4.0:

Graph Stack

Graph Orchestration Tooling

Join us every other Wednesday at 5 PM UTC for Launchpad Office Hours and get the latest updates on running Launchpad.

The next one is on May 8. Bring all your questions!

Blockchains Operator Upgrade Calendar

The Blockchains Operator Upgrade Calendar is your one-stop solution for tracking hard fork updates and scheduled maintenance for various protocols within The Graph ecosystem.

Simplify your upgrade process and never miss a deadline again.

Add to your calendar!

Protocol watch

The latest updates on important changes to the protocol

Forum Governance

Contracts Repository

Open discussion

Sunbeam has begun

Marcus, Developer Relations at Edge & Node, reminded everyone that the upgrade window is now open (it closes on June 12).

Now is the time to reach out to hosted subgraph owners and remind them to upgrade to the decentralized network.

Also, suggest that they optimize their subgraphs after upgrading.

Indexer automation

Abel moderated a discussion on indexer automation, a topic Marc-André from Ellipfra raised last week. Since he raised it, Marc-André was given the floor to clarify his concerns.

With the large influx of new subgraphs, and more arriving every week, it's getting difficult to handle them all and determine which subgraphs to index. The current tooling revolves around the indexer figuring out the optimal allocation size for every subgraph. It doesn't provide enough information to make those decisions quickly and easily, and decision-making has become a bottleneck for indexers.

Marc-André described his workflow and the factors he considers:

  • Can I expect any query traffic coming from the subgraph?
  • Is the source a reputable project or team?
  • Is the subgraph resource-heavy or difficult to index?
  • Game theory, or prediction: what total allocation will this subgraph attract? I have to predict how much allocation I will see at the end of the day, next week, or at the end of the epoch, because I will size my allocation differently if I expect a lot of other allocations. If I expect to be the only one indexing that subgraph, my allocation will be very small.
  • Am I the first to allocate? I have to be very careful about sizing it accordingly. Sometimes, I will use Explorer to see if anyone else managed to sync the subgraph.
  • Does it look difficult, or will it take a long time to index this subgraph? How heavy is it? Is it using a lot of resources?
  • Is it killing my RPC node? Is my database on its knees?
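
Several of these checks could, in principle, be encoded as a programmatic pre-filter before any human review. The sketch below is a toy illustration: the thresholds, fields, and deployment hashes are made up, and in practice the inputs would come from the network subgraph, Explorer, and the indexer's own monitoring.

```python
# Toy pre-filter over subgraph candidates; thresholds and fields are made up.
from dataclasses import dataclass

@dataclass
class SubgraphCandidate:
    deployment: str          # IPFS hash of the deployment
    signalled_grt: float     # current curation signal on the subgraph
    indexer_count: int       # how many indexers have already allocated
    expected_queries: bool   # reputable project with likely query traffic?
    synced_by_others: bool   # has anyone on Explorer reached chain head?

def worth_considering(c: SubgraphCandidate) -> bool:
    # No signal and no expected traffic: not worth the resources.
    if c.signalled_grt < 100 and not c.expected_queries:
        return False
    # Being first to allocate carries sync risk; want evidence it syncs at all.
    if c.indexer_count == 0 and not c.synced_by_others:
        return False
    return True

candidates = [
    SubgraphCandidate("QmAaa", 5_000.0, 3, True, True),
    SubgraphCandidate("QmBbb", 10.0, 0, False, False),
]
shortlist = [c.deployment for c in candidates if worth_considering(c)]
print(shortlist)  # only "QmAaa" survives the filter
```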

Alexis from Semiotic Labs shared his process in the chat:

  1. Run the allocation optimizer without filters, examine all the subgraphs it gives me, and off-chain sync what I like.
  2. Remove the failed ones.
  3. Run the allocation optimizer with a whitelist containing only the subgraphs I got to chain head, then allocate.

All this is handled using his Python wrapper script.
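
As a rough illustration of that three-step flow, here is a minimal sketch of such a wrapper. The index-node status query is the standard graph-node one; the whitelist file and the optimizer invocation are placeholders, not the actual Semiotic tooling interface.

```python
"""Sketch of a wrapper around the allocation optimizer (hypothetical paths
and invocation; only the graph-node status query is the standard one)."""
import json
import subprocess
import urllib.request

STATUS_URL = "http://localhost:8030/graphql"  # graph-node index-node status endpoint

def indexing_statuses() -> list:
    """Ask the local index node which deployments are healthy and synced."""
    query = "{ indexingStatuses { subgraph health synced fatalError { message } } }"
    req = urllib.request.Request(
        STATUS_URL,
        data=json.dumps({"query": query}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]["indexingStatuses"]

# Step 1 happens out of band: run the optimizer without filters and
# off-chain sync the deployments that look attractive.

# Step 2: drop the deployments that failed during off-chain sync.
healthy = [s["subgraph"] for s in indexing_statuses()
           if s["health"] == "healthy" and s["synced"]]

# Step 3: re-run the optimizer with a whitelist of deployments at chain head.
with open("whitelist.json", "w") as f:
    json.dump(healthy, f)
# Placeholder command; the real invocation depends on your optimizer setup.
subprocess.run(["./run-allocation-optimizer", "--whitelist", "whitelist.json"],
               check=True)
```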

Vincent from Data Nexus shared in the chat:

I have filters in my allocation tool that I use to get close to what we’re looking for. It is limited to network subgraph data though.

Derek from Data Nexus shared:

There is a tool that gives you a view of how many entities a subgraph has. That doesn’t translate to how many table records, but it at least gives you a sense of how many entities are in there. You can use this as a general feel for how big the subgraph is, assuming the Subgraph Studio indexer was able to sync it and is reporting the data.
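
That per-deployment entity count is exposed by graph-node's index-node status API. A small sketch, assuming you have a status endpoint to query (the URL and deployment hash below are placeholders):

```python
"""Read a deployment's entity count from a graph-node index-node status
endpoint (the URL and deployment hash are placeholders)."""
import json
import urllib.request

STATUS_URL = "http://localhost:8030/graphql"  # assumed status endpoint

def entity_count(deployment: str) -> int:
    query = """
    query ($ids: [String!]!) {
      indexingStatuses(subgraphs: $ids) { subgraph entityCount }
    }"""
    payload = {"query": query, "variables": {"ids": [deployment]}}
    req = urllib.request.Request(
        STATUS_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        statuses = json.load(resp)["data"]["indexingStatuses"]
    # entityCount comes back as a string (BigInt); 0 if the node has no data.
    return int(statuses[0]["entityCount"]) if statuses else 0

print(entity_count("QmExampleDeploymentHash"))  # rough proxy for subgraph size
```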

Abel asked Derek:

Q1: You started leveraging QoS data and mashing it with the network subgraph and your local subgraph deployment table, which is interesting. I would love to learn more about that. How is that useful? What are you using it for?

Derek:

With the number of subgraphs we've been seeing, it's been a little tricky to keep up with everything. So I've been thinking: what if we take the other route? We intentionally partition a certain portion of our stake specifically for query fee production.

A large percentage goes toward whatever gives us the best return on indexing rewards, but then there's a small percentage where we'll put 1,000 GRT allocations. Presumably, that's all that should be needed in many cases to claim the full, or near-full, query fee amount.

So I'm trying to get an idea of which of these subgraphs to start syncing. Sometimes they have signal, but in my talks with different consumers, a lot of them don't yet understand that mechanism well. And now that the Edge & Node upgrade indexer is already allocating to these subgraphs, especially ones that have come over from the hosted service and are already at chain head, a consumer could, in theory, start querying without any signal and never really need to add signal, because there's already at least one indexer on the subgraph.

So we synced the QoS subgraph, and I've been writing queries that join it with other subgraphs and with what we've synced locally. From this, we can identify which subgraph produced how many query fees in the last 24 hours. We can also determine whether we're allocated to it, whether we've synced it, whether we're syncing it, or whether we haven't started syncing.

We also look at how many indexers are on it, to get an idea of how many query fees we could potentially earn, knowing that this allocation is not going to produce indexing rewards; it is purely there to generate query fees.
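
The shape of that join might look something like the sketch below. The QoS entity and field names here are hypothetical stand-ins, since the actual QoS subgraph schema isn't covered in the call; the point is cross-referencing fee data with local indexing state.

```python
"""Sketch of the QoS-to-local join. The QoS entity and field names are
hypothetical stand-ins for the real QoS subgraph schema."""
import json
import urllib.request

QOS_URL = "http://localhost:8000/subgraphs/name/qos"  # locally synced QoS subgraph (assumed name)
STATUS_URL = "http://localhost:8030/graphql"          # index-node status endpoint

def gql(url: str, query: str) -> dict:
    req = urllib.request.Request(url, data=json.dumps({"query": query}).encode(),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]

# Hypothetical query: fees per deployment over the last 24 hours.
fees = gql(QOS_URL, """{
  deploymentDailyFees(first: 1000, orderBy: queryFees, orderDirection: desc) {
    deployment
    queryFees
  }
}""")["deploymentDailyFees"]

# What this indexer is actually syncing locally.
local = {s["subgraph"]
         for s in gql(STATUS_URL, "{ indexingStatuses { subgraph } }")["indexingStatuses"]}

# The join: fee-producing deployments we have not started syncing yet.
for point in fees:
    if point["deployment"] not in local:
        print(point["deployment"], point["queryFees"])
```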

Payne from Stake Squid asked the team that built the allocation optimizer:

Q2: Do you know if it has a function that also creates allocations for subgraphs? For example, say you want to allocate to all the subgraphs on the network. Can we make it do that?

Chris from GraphOps:

Yes. If you set the pinned list to a long set of subgraphs, it should optimize over that list, and the resulting output will create allocations for everything. The ones in the optimization set will have the appropriate allocation size, and the ones that fall outside of the optimization set will get a 0.1 GRT allocation size.

Hope from GraphOps typed in the chat: You just have to provide all the network names you support as an indexer.
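
Putting those two answers together, a pinned list could be generated from the network subgraph and handed to the optimizer config. A sketch, with assumed field names on the network subgraph and an assumed local endpoint:

```python
"""Build a long pinned list from the network subgraph (field names and the
local endpoint are assumptions) and write it out for the optimizer config."""
import json
import urllib.request

NETWORK_SUBGRAPH = "http://localhost:8000/subgraphs/name/graph-network"  # assumed local sync
SUPPORTED_NETWORKS = {"mainnet", "arbitrum-one", "gnosis"}  # chains this indexer supports

query = """{
  subgraphDeployments(first: 1000) {
    ipfsHash
    manifest { network }
  }
}"""
req = urllib.request.Request(NETWORK_SUBGRAPH,
                             data=json.dumps({"query": query}).encode(),
                             headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    deployments = json.load(resp)["data"]["subgraphDeployments"]

pinned = [d["ipfsHash"] for d in deployments
          if d["manifest"]["network"] in SUPPORTED_NETWORKS]

# Hand this list to wherever your optimizer config expects its pinned set.
with open("pinnedlist.json", "w") as f:
    json.dump(pinned, f, indent=2)
```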

Chris from GraphOps asked:

Q3: Do you think we need better tools to drive the indexer agent?

Marc-André:

In the early days, the design intention was for the indexer agent to handle all these things automatically. We all quickly saw that we needed a human in the loop to validate everything. That's what I'm doing; I'm revalidating everything. But if we're not going to remove the human from the loop altogether, we at least need to ensure that the agent, or whatever else, is able to make the vast majority of allocation and unallocation decisions on its own.
