The Graph Indexer Office Hours #165

Events | Jul 11, 2024
TL;DR: IOH #165 was an open session for indexers to share challenges and learn how others in the group are tackling them. Discover what the future may hold for the indexer agent and watch a demo of the allocation wizard in Indexer Tools.

Opening remarks

Hello everyone, and welcome to episode 165 of Indexer Office Hours!

GRTiQ 176

Don’t miss the GRTiQ Podcast with Emmi Aguilar, a member of the Graphtronauts and host of the GrapHER Club Podcast.

The Graphtronauts represent one of the largest communities of delegators and passionate contributors within The Graph ecosystem, and under Emmi’s leadership, they’ve launched a new podcast spotlighting women in web3.

Repo watch

The latest updates to important repositories

Execution Layer Clients

  • Erigon v2.0: New release v2.60.3:
    • This release brings improvements and bug fixes, including a graceful exit from the consensus layer downloading stage and fixes for eth/tracers.
    • Ana | GraphOps stated that they are running version 2.60.3 successfully on all of their testnets.
  • Avalanche: New release v1.11.9:
    • This version, backward compatible with v1.11.0, includes updated health metrics using labels, added consensus poll termination metrics, a new --version-json flag, and several bug fixes and improvements to logging, protobuf dependencies, and CI jobs.
  • Arbitrum-nitro: New release v3.0.4-beta.1:
    • This release fixes batch posting for Anytrust DAS chains, and fixes the optional Stylus program lazy recreation feature.

Graph Orchestration Tooling

Join us every other Wednesday at 5 PM UTC for Launchpad Office Hours and get the latest updates on running Launchpad.

The next one is on July 17. Bring all your questions!

Protocol watch

The latest updates on important changes to the protocol

Forum Governance

Contracts Repository

Network Subgraphs

Project watch

Presentation slides

Indexer components

Graph Node

In-progress:

  • Allow subgraphs to query and leverage data from other subgraphs
  • Parallelizing subgraph block range ingestion (performance enhancement)

Up next:

  • Better observability (to have a better sense of where things are slow or breaking)

More regular project watch updates are coming soon.

Open discussion

Whiteboard Activity: Old-School Clinic Style

This session invited indexers to bring topics to the community, whether to ask questions, get support, or help troubleshoot issues.

Note: Content has been lightly edited and condensed.

[Timestamp 10:42]

Vince | Nodeify posted to the whiteboard:

  • For managing bloating, what platforms do people use? Is everyone just on Grafana and Prometheus?
  • Are there any cool allocation tools out there that I haven’t heard of?
  • What are people using to do their allocations now that the number of subgraphs is quite large?

Vince expanded on his questions:

If you’re trying to index a large number of subgraphs, some of these grafting situations are pretty challenging. You’ve got five, six, or seven grafts and that might be something you’re on to now, but in a month, when you’re redoing a bunch of allocations and the network has moved on from whatever you’re doing, APR isn’t as great, and you’re looking to optimize, you have all this leftover stuff. After a few months, it’s quickly building up. What are your strategies around that?

  • Are you just buying more disks?
  • Are you spending an entire day pruning and deleting unnecessary items from your database?
  • For monitoring, are people using an ELK stack? Is everyone using Prometheus?
  • Does anyone have any cool dashboards you want to share?
  • With the high number of subgraphs now, what are you using to allocate effectively? Are you doing it the manual way? Are you using Vincent’s Indexer Tools? Are you using a script?

Derek | Data Nexus: We’re pretty religious about the Indexer Tools system that Vincent built.

Jim | WaveFive shared: I did 110 allocations today and I’m completely reliant on the Indexer Tools by Vincent. Without that, I would be in a tough spot right now because it would take too long to do that many allocations.

Is there a demo of Indexer Tools connected to your own agent? I haven’t done that yet, and I’m interested to know what that gives me. I know I can do things like off-chain subgraph management.

I’m also interested in a conversation with a wide array of indexers about alpha. We’re at the point now where if you don’t start building tools that allow you to scale your decision-making around allocation queries, you’re just not going to be able to keep up. You’d have to hire a team of people to do your allocations for you, and it doesn’t make sense; it’s never going to scale.

There’s also the question of how much work you do out in the open and how much work you do privately. I think there’s an argument to be made that the Foundation should fund some more tools around the automation space.

How much do we think the indexer agent should take on? How far should our heroes like Ford go with the automation that’s in the indexer agent?

I think there’s probably a good use case for adding graft automation to off-chain subgraphs.

Ford: Yeah, in my opinion, one of the top priorities for indexer agent is auto-handling of graft dependencies. It does not do that right now.

More automation in the indexer agent is something I’ve been thinking about for a long time. One option would be a tighter coupling between the indexer agent and the allocation optimizer, so the agent could automatically call the optimizer to optimize its decisions, or maybe an interface so you could use a different optimizer as well.

The agent has been missing some indication of how expensive a subgraph is and how long it will take to sync. Currently, an indexer has to consider and check this on their own: is it worth it to sync this subgraph? I think we’re getting closer to starting to automate some of that and bring in some information from the network, maybe from the quality of service subgraph, for how long a subgraph is taking to sync.

The top priority right now for the indexer agent is working through the long-running pain points, like auto-handling graft dependencies and making sure it’s easy to collect all query fees on all networks and subgraphs on non-supported networks.
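As an illustration only (this was not shown on the call and is not how the indexer agent works today), the core of graft dependency handling can be sketched as walking a graft chain back to its root, so the oldest base is synced first. The function name, the mapping format, and the deployment IDs below are all hypothetical.

```python
from typing import Dict, List, Optional


def graft_sync_order(target: str, graft_base: Dict[str, Optional[str]]) -> List[str]:
    """Return deployments in the order they must be synced.

    graft_base maps a deployment ID to the deployment it grafts onto
    (None when it has no graft base). The oldest base comes first and
    the target deployment comes last.
    """
    chain: List[str] = []
    seen = set()
    current: Optional[str] = target
    while current is not None:
        if current in seen:
            # A deployment cannot (directly or indirectly) graft onto itself.
            raise ValueError(f"graft cycle detected at {current}")
        seen.add(current)
        chain.append(current)
        current = graft_base.get(current)
    chain.reverse()
    return chain


# Hypothetical chain: QmC grafts onto QmB, which grafts onto QmA.
example = {"QmC": "QmB", "QmB": "QmA", "QmA": None}
print(graft_sync_order("QmC", example))  # ['QmA', 'QmB', 'QmC']
```

With five to seven grafts in a chain, as Vince described, an ordering like this is what an indexer currently has to work out by hand before allocating to the final deployment.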

Comments from the chat

Marc-André | Ellipfra: It’s not so much an optimization tool that we need (I think we have a few available already) but more of a tool to support the workflow… handle grafts, failed subgraphs, reallocate, create small allocations for low-rewards-high-qps subgraphs. Vincent’s tool might be the way. I need to try it.

Cleaning up old deployments

Derek | Data Nexus posted to the whiteboard: What are others doing for cleaning up old deployments/graft bases?

Comments from the chat:

Vincent | Data Nexus: For cleaning up old deployments, I’ll usually use a combo of a psql query we have and xargs to remove them all quickly. Pretty manual process right now.

Marc-André’s note: Working on a script to list those so they can be unallocated and eventually dropped. Payne from Stake Squid has a script that removes the indexes and puts the subgraphs on slow storage.

Gemma | LunaNova: Yeah, our tidy-up-dead-subgraphs process probably takes me 1-2 hours.
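To make the listing step concrete, here is a minimal sketch of how cleanup candidates might be computed. It is not any indexer's actual script: the function name, inputs, and deployment IDs are hypothetical, and it assumes you can already export your local deployments, your active allocations, and a graft-base mapping from your own database. It conservatively keeps every graft base of an allocated deployment; whether a base can be dropped once the graft copy has finished is left to the operator.

```python
from typing import Dict, Iterable, Optional, Set


def removable_deployments(
    local: Iterable[str],
    allocated: Iterable[str],
    graft_base: Dict[str, Optional[str]],
) -> Set[str]:
    """Return local deployments that are neither allocated nor a graft base
    (direct or indirect) of an allocated deployment."""
    keep: Set[str] = set()
    for deployment in allocated:
        current: Optional[str] = deployment
        # Walk the graft chain so every base an active deployment relies on is kept.
        while current is not None and current not in keep:
            keep.add(current)
            current = graft_base.get(current)
    return set(local) - keep


# Hypothetical data: QmC is allocated and grafts onto QmB -> QmA; QmOld is left over.
local = ["QmA", "QmB", "QmC", "QmOld"]
allocated = ["QmC"]
graft_base = {"QmC": "QmB", "QmB": "QmA", "QmA": None}
print(sorted(removable_deployments(local, allocated, graft_base)))  # ['QmOld']
```

The resulting list could then be reviewed and fed into whatever removal step you already use, such as the psql and xargs combination Vincent mentioned.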

The case for sharing tools

[Timestamp 26:17]

Jim | WaveFive posed a question for Vincent and Derek: Is Indexer Tools going to be the way that you want to “dog food” automation for Data Nexus? How much of that do you want to do in public, and how much do you want to do privately? Do you see this stuff as alpha, or do you see it as a sort of public good?

Vincent | Data Nexus: We’re pretty squarely in the mindset that we’re in this together as indexers. For The Graph protocol to compete against centralized competitors, we need to work together as a unit, fix the pain points, and optimize queries.

If you keep something that may be considered alpha to yourself, the rest of the network may suffer, and it won’t be consistent.

I always thought Indexer Tools was a public good from the beginning, and I don’t see that changing anytime soon.

Derek | Data Nexus: We have a decent amount of GRT staked in the protocol, and we’re very invested in The Graph being the leader in this industry and expanding its lead. With the current level of traffic, it doesn’t really make sense to hide alpha to the detriment of the network. For our own future benefit, with the stake that we have, it makes sense to help other indexers because that’ll grow the network and ultimately benefit us more than if we hid this type of information.

One note that Vincent didn’t mention is that he submitted a grant request to the Foundation for building Indexer Tools. I’m not sure about its status. We’ve added little features here and there for our internal needs, but if we start including additional feature sets from other indexers, it makes sense to consider some kind of grant.

Regarding the earlier point about connecting with the agent, I absolutely love it. It’s useful for time management. With the indexer tools, we’ve been able to shift some tasks from Vincent’s heavy workload to others on the team, and the process is really clean.

Demo of allocation wizard on Indexer Tools

[Timestamp 37:57]

Derek goes through how the allocation wizard works in Indexer Tools.

He shows how this feature simplifies a highly technical task into one that a non-technical person could do. If you’re looking to expand your team or you have someone non-technical on your team, they could be using this feature to oversee all of the allocations for your indexer.

Jim shared that the tool also makes it easy to balance your allocations and leave space for other indexers.

Alternative to Google Docs

[Timestamp 50:07]

Vincent from Data Nexus shared some links in the chat about dDocs, an alternative to Google Docs powered by Fileverse.

Vincent: A decentralized Google Docs, I feel like that’s something that The Graph protocol, that we, would like to support. I think we would probably like to head in the direction of supporting products like this.

Pain points and feedback for core devs

[Timestamp 52:37]

Vincent | Data Nexus: I’m eagerly awaiting control over deployment names in combo with deployment rules as well. That will be so helpful for managing subgraphs with significant query traffic.

Marc-André | Ellipfra: Handling 100 subgraphs is one thing… 400 subgraphs is painful. We need to be able to support 4000 soon.

Derek | Data Nexus: Regarding grafts, is there a way that we can ‘migrate’ the old deployment rather than doing the copy process?

  • Marc-André | Ellipfra replied: Yeah, why not? Would be so much faster. But sometimes you have two subgraphs that need the same graft.

Latest Indexer Agent release (GitHub repo)

Jim | WaveFive 🌊: The latest indexer agent, some folks are complaining about very aggressive health checking causing issues with allocating?

Josh Kauffman | StreamingFast.io confirmed: We have to jump between versions to get around this. Super annoying.

Ford clarified: Just a config.

Jim | WaveFive 🌊: I’ve been avoiding upgrading but time is ticking on base allocations 😄

Ford: Worth updating defaults though.

Marc-André | Ellipfra: Avoiding upgrading past 0.20.22 due to this issue and other odd things I did not have time to troubleshoot.

Josh Kauffman | StreamingFast.io: Ford, what’s the config line so I can pass that over? I thought we had played with that, but maybe not the right one (or more than one?)

Ford: We did push the release out quickly before our break to fix some things. Need to put out a more detailed changelog.

We can keep discussing requests in the indexer-software channel. Here’s the board for visibility into the next projects we are prioritizing.

Author

We're a web3 service provider specializing in blockchain indexing operations. Our mission is to enable creators to achieve their true potential with web3 technology. We want to help developers reliably access blockchain data in a consistent format so you can create amazing experiences for your applications.
