TL;DR: In episode 160 of IOH, the open discussion features the co-authors of the white paper The Graph as AI Infrastructure. Ani Patel and Sam Green introduce the Inference and Agent Services and answer a multitude of questions from indexers.
Opening remarks
Hello everyone, and welcome to the latest edition of Indexer Office Hours! Today is June 4, and we’re here for episode 160.
GRTiQ 171
Catch the GRTiQ Podcast with James Hendler, professor and director at Rensselaer Polytechnic Institute and a pioneer of the Semantic Web.
Repo watch
The latest updates to important repositories
Execution Layer Clients
- Erigon v2.0: New release v2.60.1:
- Breaking change: the 2nd argument, blockNrOrHash, was removed from eth_estimateGas. Moreover, eth_estimateGas now correctly defaults maxFeePerGas to the block’s base fee (a minimal example of the new call shape follows this list).
- Previously, eth_callMany and other RPC endpoints could return incorrect results because baseFee was overwritten by maxFeePerGas.
- Fixes a regression with JSON marshalling of checkpoints that caused the root hash to be 0 and triggered unwinds.
- Fixes “method handler crashed” for debug_traceCall.
- Fixes unintended Polygon bor miner gas price and txpool price limit sanitization on Ethereum.
- Enables DNS p2p discovery on Holesky.
- Fixes an infinite loop in bor block retirement.
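For indexers that call Erigon’s RPC directly, here’s a minimal sketch of the new eth_estimateGas call shape after v2.60.1. The endpoint URL and addresses are placeholder assumptions, not part of the release notes.

```python
# Minimal sketch of an eth_estimateGas call against Erigon >= v2.60.1.
# Assumption: a local Erigon endpoint at RPC_URL; addresses are placeholders.
import json
import urllib.request

RPC_URL = "http://localhost:8545"

payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "eth_estimateGas",
    # Pre-v2.60.1 callers could pass a second blockNrOrHash param
    # (e.g. "latest"); that argument was removed, so send only the
    # call object. maxFeePerGas is omitted here because it now
    # defaults to the block's base fee.
    "params": [{
        "from": "0x0000000000000000000000000000000000000001",
        "to": "0x0000000000000000000000000000000000000002",
        "value": "0x1",
    }],
}

req = urllib.request.Request(
    RPC_URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))  # e.g. {"jsonrpc": "2.0", "id": 1, "result": "0x5208"}
```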
- sfeth/fireeth: New releases:
- v2.6.1:
- Bumped firehose-core to v1.5.1 and substreams to v1.7.3.
- Improved bootstrapping from live blocks for chains with very slow or very fast blocks (affects relayer, firehose, and substreams tier1).
- Substreams: fixed slow responses close to HEAD in production mode.
- v2.6.0:
- The Substreams engine is now able to run Rust code that depends on solana_program to decode Solana data and on alloy/ethers-rs to decode Ethereum data.
- Substreams clients now enable gzip compression over the network (already supported by servers).
- The Substreams binary type can now optionally be composed with runtime extensions by appending +<extension>[,<extensions>…] to the binary type. Extensions are key[=value] pairs that are runtime-specific (a small parsing sketch follows this item).
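To illustrate that extension grammar, here’s a tiny parser sketch; the binary-type string in the example is hypothetical, since the release notes don’t name concrete extensions.

```python
# Parses the extended binary type grammar "<type>+<ext>[,<ext>...]"
# described above, where each extension is "key[=value]".
def parse_binary_type(binary_type: str) -> tuple[str, dict[str, str | None]]:
    base, _, ext_str = binary_type.partition("+")
    extensions: dict[str, str | None] = {}
    if ext_str:
        for ext in ext_str.split(","):
            key, _, value = ext.partition("=")
            extensions[key] = value if value else None
    return base, extensions

# Hypothetical example; real extension names are runtime-specific:
print(parse_binary_type("wasm/rust-v1+runtime=wasmtime,debug"))
# ('wasm/rust-v1', {'runtime': 'wasmtime', 'debug': None})
```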
- Arbitrum-nitro: New releases:
- v2.4.0-rc.4:
- Compared to the last release candidate, this release disables sequencer profiling by default due to Go tracing issues on ARM.
- v2.4.0-rc.3:
- Compared to the last release candidate, this release fixes Pebble-based nodes and adds support for downloading snapshots split across multiple files.
- v2.4.0-rc.2:
- Halves the EIP-4844 batch size to fill only 3 batches.
- The Docker image specifies default flags in its entry point, which should be replicated if you’re overriding it: /usr/local/bin/nitro --validation.wasm.allowed-wasm-module-roots /home/user/nitro-legacy/machines,/home/user/target/machines
- Heimdall: New release v1.0.7-beta:
- This release contains minor improvements and fixes over the previous release.
Graph Orchestration Tooling
Join us every other Wednesday at 5 PM UTC for Launchpad Office Hours and get the latest updates on running Launchpad.
The next one is on June 19. Bring all your questions!
Protocol watch
The latest updates on important changes to the protocol
Contracts Repository
- [WIP] Graph Horizon #944 (open)
- chore: SAM deployments #961 (open)
- chore: add governance tests #979 (merged)
- fix: unit tests #978 (merged)
Open discussion
Ani and Sam from Semiotic Labs, co-authors of the recently released The Graph as AI Infrastructure white paper, joined the session to provide more context and answer the community’s questions.
The white paper presents the case for two new AI services built on The Graph: the Inference Service and the Agent Service.
Inference is when you run an AI model: you are running inference whenever you type something into ChatGPT and hit enter. The Inference Service will enable AI models, such as large language models (LLMs) like ChatGPT or other open models, to be brought online on The Graph. We will enable compute providers, currently indexers, to serve these models.
The Agent Service builds on top of the Inference Service and connects it with The Graph’s data. This will be a Graph-native way to build various AI dApps. In the paper, we also discuss verifiability and things like retrieval-augmented generation (RAG) and knowledge graphs in the appendix. The paper’s main thrust is to introduce these two AI services and show how we can transition The Graph from a compute provider focused on indexing to a general compute provider.
We’ve started building the Inference Service, and later, we’ll build the Agent Service.
The Inference Service takes pre-trained AI models that we call open. We’re not talking about open-source AI in general but about open models, where all the data and code used to train them are publicly available. This excludes models like Llama, where only the weights are available. We’ll support a common file format for AI models, allowing indexers to run these models on GPUs or CPUs and respond with results (a minimal handler sketch follows below).
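To make that concrete, here’s a hedged sketch of what an indexer-side inference handler could look like. The file format hasn’t been decided (see Q1 below), so ONNX and every name in the sketch are illustrative assumptions, not the actual design.

```python
# Sketch of an indexer-side inference handler, assuming ONNX as the
# common model format (purely illustrative; the format is undecided).
import numpy as np
import onnxruntime as ort

def run_inference(model_path: str, inputs: dict[str, np.ndarray]) -> list:
    # CPUExecutionProvider keeps this CPU-only; a GPU indexer would
    # swap in CUDAExecutionProvider instead.
    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    output_names = [o.name for o in session.get_outputs()]
    return session.run(output_names, inputs)

# Hypothetical usage with a model file the indexer holds locally:
# outputs = run_inference("model.onnx", {"input": np.zeros((1, 3), np.float32)})
```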
For support from Semiotic for Inference or Agent services, join the graph-ai Discord channel.
Questions
These are some of the questions posed during the discussion. To hear all of them, listen to the recording.
Q1: About the open format, has it been decided yet, or is it still in the design phase?
- Everything is still in the design phase. When we say AI, we’re not specifically referring to large language models (LLMs); they are included, of course. We’re interested in supporting any AI model, such as those for portfolio optimization.
Q2: Are we looking at data sources outside of blockchain data?
- The Inference Service isn’t necessarily hooked up to any data; it’s just compute. You can think of it as running a model where a user types text, hits enter, and gets a response with no data involved. The Agent Service, however, will integrate with The Graph’s data sources and potentially support data outside of The Graph, though this might not be natively supported in the Agent Service.
Q3: What is the timeline? When should indexers be ready to provide GPU power?
- We’re still in the planning phase and working with our dev team. Until last week, we were focused on subgraph SQL, so we’re still figuring things out. We’ll try to make a public announcement once we have a clearer timeline, but I don’t have a great answer for you right now.
Q4: Does this new announcement change the expectations for indexers?
- In the initial phase of the inference service, we expect that many indexers with CPU setups may invest in GPUs to run AI models. As time progresses, competitive forces will likely lead to some specialization among indexers, with GPUs providing better performance. At a certain point, indexers will need to decide whether to improve their AI offerings or focus on optimizing their data pipelines, which will also be valuable for the Agent Service.
Q5: As indexers, what should we expect as entry requirements to participate in the initial deployment of the AI inference service?
- Regarding entry-level requirements, many useful models can run on CPUs, so there aren’t many initial requirements. However, competition will likely raise the bar over time.
Q6: How will factors like inference speed play into the ISA and pricing?
- Pricing AI models is more complex than pricing SQL queries. For example, pre-loading a model into memory affects latency, and various optimizations can change pricing dynamics. The decentralization aspect adds complexity, since indexers can struggle to price even their GraphQL queries. We have some early thoughts on accounting for these variables, but it’s a complicated issue.
- Verifiability is crucial. Historically, achieving determinism in AI results has been challenging, but it is essential for verifiability. Methods to achieve determinism exist, like using specific packages for CPU-based models. Once we have determinism, we can explore verification methods like consensus or fraud proofs (a naive consensus sketch follows below). However, using zero-knowledge proofs (ZKPs) for LLMs isn’t practical due to the significant time required to generate proofs.
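As a rough illustration of the consensus idea (not a design from the paper), here’s a naive sketch: assuming deterministic inference, a gateway could compare output digests from several indexers and accept the majority result. All names are hypothetical.

```python
# Naive consensus check over inference outputs, assuming determinism:
# identical inputs on honest indexers must yield identical bytes.
import hashlib

def digest(output: bytes) -> str:
    return hashlib.sha256(output).hexdigest()

def verify_by_consensus(outputs: list[bytes], quorum: float = 2 / 3) -> bytes:
    votes: dict[str, int] = {}
    for out in outputs:
        votes[digest(out)] = votes.get(digest(out), 0) + 1
    winner, count = max(votes.items(), key=lambda kv: kv[1])
    if count / len(outputs) < quorum:
        raise ValueError("No quorum: inference results disagree")
    return next(out for out in outputs if digest(out) == winner)

# Three indexers agree, so the shared result is accepted:
print(verify_by_consensus([b"result", b"result", b"result"]))
```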
Q7: What are your thoughts on decentralized AI and the dangers it might pose?
- There are two sides to this issue: ensuring censorship resistance and addressing potential risks. AI safety is a significant concern, and we are considering how to tackle these questions. Initially, only specific entities may deploy new models as we work through safety considerations. This is a broad, unsolved problem in AI research, and we will follow the latest trends and incorporate state-of-the-art safety measures.
- Based on these evolving considerations, indexers will ultimately decide whether to support specific models. In the future, indexers will have to decide about supporting models based on factors like safety and rewards. Even with open-source models that might yield high rewards, there’s an inherent risk if those models are considered dangerous. This creates a dilemma for indexers: they could earn significant revenue by running these models, but they also have to weigh the potential risks and ethical considerations. If one indexer chooses not to run a risky model, another might, leading to questions about how the protocol should respond to such scenarios.
Q8: Can you talk more about how these inference capabilities can be coupled with data that indexers already have to offer unique capabilities?
- Here’s an example of an agent converting natural language to SQL. (Referring to Figure 2: A simplified view of a SQL agent from the white paper, page 8.)
- If you’ve used our Agentc, you might recognize this process. A user requests data, such as the last ten trades on Uniswap. This query goes to an AI indexer, which uses a model to generate the corresponding SQL query. The SQL Gateway executes this query and retrieves the data, which is then returned to the user.
- This process could be reversed. For instance, if you already have SQL data, you could analyze it using an AI model. The SQL Gateway fetches the data, and the AI Gateway processes it to generate useful insights, perhaps through Python scripts that produce visualizations or statistical results.
- For a more complex scenario, consider this example: A natural language query interacts with multiple agents and data sources. The question, “What are some common trades that people make?” is refined and processed through various steps:
- Geo Agent: Retrieves contextual information from a knowledge graph.
- Vector DB Agent: Uses past queries to generate more accurate SQL.
- SQL Agent: Executes the refined SQL query to fetch the data.
These interactions demonstrate how AI services can be deeply integrated with The Graph’s data.
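As a hedged illustration of that multi-agent flow, here’s a sketch with stand-in stubs. The white paper describes the architecture rather than a concrete API, so every function name here is a hypothetical placeholder.

```python
# Stubbed sketch of the multi-agent pipeline described above.
def geo_agent(question: str) -> str:
    """Enrich the question with knowledge-graph context (stub)."""
    return f"{question} [+ knowledge-graph context]"

def vector_db_agent(question: str) -> list[str]:
    """Retrieve similar past queries to guide SQL generation (stub)."""
    return ["SELECT pair, COUNT(*) FROM trades GROUP BY pair"]

def sql_agent(question: str, examples: list[str]) -> str:
    """Generate and execute the refined SQL query (stub)."""
    return "query results"

def answer(question: str) -> str:
    enriched = geo_agent(question)        # 1. add context
    examples = vector_db_agent(enriched)  # 2. recall similar SQL
    return sql_agent(enriched, examples)  # 3. generate + run SQL

print(answer("What are some common trades that people make?"))
```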
Q9: What are your thoughts on navigating the trade-off between The Graph becoming more of a general-purpose compute platform versus offering more of a focused set of data services?
- The power of The Graph is in the data that we have. One of the reasons that AI is so compelling is that we have economies of scale in The Graph because we’ve already built this big network of specialized compute providers. We have a large network of people running heavy-duty machines, processing a lot of data.
- We have a payment system, a way to route queries, a market, and a token in place. Adding AI to what we’ve already built is a relatively small step compared to building an entirely new decentralized AI inference and agent protocol from scratch.
Agentc
Ani showed Agentc, a playground designed to demonstrate the capabilities of blockchain data analysis, enhanced by SQL and powered by LLMs.
You can ask Agentc questions, and it will take your natural language question, convert it to SQL, and run it. For example, you could ask, “What were the top 10 trades on Uniswap V3 last month?”
It goes through a process similar to the one in the SQL agent diagram above: it takes the natural language question, passes it to an AI model (in this case, GPT-4), generates the SQL, and runs it against the database (a minimal sketch of this step follows).
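Here’s a minimal sketch of that natural-language-to-SQL step, assuming the OpenAI Python client; the schema string and prompt are illustrative, not Agentc’s actual implementation.

```python
# Sketch of the text-to-SQL step, assuming an OpenAI client and a
# hypothetical trades schema; Agentc's real prompt and schema differ.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SCHEMA = "uniswap_v3_swaps(tx_hash, block_time, token_in, token_out, amount_usd)"

def to_sql(question: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": f"Translate the user's question into SQL for the "
                        f"schema {SCHEMA}. Return only the SQL."},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(to_sql("What were the top 10 trades on Uniswap V3 last month?"))
```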
Here’s a recorded demo:
Q10: Do you see the future where consumers could provide LoRA adapters to load on top of a base model like Llama 3?
- Yes, that’s totally possible. It probably won’t be our focus initially, and we’re not really looking at enabling fine-tuning at this point. If we do, it would likely be its own AI service. But in the future, if people want to do stuff like that, I don’t see any reason why we couldn’t enable it.
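If that direction were ever pursued, loading a consumer-supplied adapter could look roughly like this sketch using Hugging Face transformers and peft; the model and adapter identifiers are placeholders, and nothing here reflects a committed design.

```python
# Sketch of layering a consumer-provided LoRA adapter onto a shared
# base model; model and adapter identifiers are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Meta-Llama-3-8B"  # placeholder base model id

base_model = AutoModelForCausalLM.from_pretrained(BASE)
tokenizer = AutoTokenizer.from_pretrained(BASE)

# The adapter path would come from the consumer's request:
model = PeftModel.from_pretrained(base_model, "path/to/consumer-lora-adapter")
```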
Q11: Will building the agent/inference service require an entirely new protocol infrastructure, or do you think The Graph’s current build is easily extensible to these new services?
- There are pieces we can reuse and pieces we can’t. For example, a lot of the indexer service and TAP logic can stay the same, although we may need to extend it to be jobs-based instead of request-response-based, which is fairly doable (a toy illustration of that distinction follows below). The gateway, too, has many reusable components. However, we’ll need to change how the ISA works and replace the Graph Node concept entirely for AI services.
- In terms of reusability, we can keep much of the indexer service and code from the gateway, but the Graph Node equivalent will be built from scratch for AI.
- The TAP aspect of payments will be reusable, but significant nuances exist in reusing a lot of the payment infrastructure. Prepayments won’t work well in this case, and we’ll need post-payments. TAP is currently structured for prepayments, so this will require adaptation.
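To illustrate the jobs-based vs. request-response distinction, here’s a toy sketch only; none of these types exist in the indexer stack, and all names are hypothetical.

```python
# Toy contrast between today's request-response shape and a jobs-based
# shape that fits long-running inference work.
import uuid
from dataclasses import dataclass, field

def handle_request(prompt: str) -> str:
    """Request-response: the result comes back in the same call."""
    return f"inference result for: {prompt}"

@dataclass
class JobQueue:
    """Jobs-based: submit now, poll for the result later."""
    _results: dict[str, str] = field(default_factory=dict)

    def submit(self, prompt: str) -> str:
        job_id = str(uuid.uuid4())
        # A real service would enqueue work; this stub completes inline.
        self._results[job_id] = handle_request(prompt)
        return job_id

    def poll(self, job_id: str) -> str | None:
        return self._results.get(job_id)

queue = JobQueue()
job = queue.submit("summarize the last 10 Uniswap trades")
print(queue.poll(job))
```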
Q12: Can you talk more about how team members at Semiotic are using ChatGPT and other AI solutions to work harder, better, faster, and stronger?
- Sam uses it to summarize technical PDFs when he doesn’t want to read entire documents. Sometimes, he asks for pseudocode from technical papers, although that often fails. He uses ChatGPT to create code skeletons for projects. For example, he completed a hackathon project with ChatGPT where you could describe an image or NFT and generate an NFT from that description. When writing, he incorporates feedback by copying what he wrote and the feedback into ChatGPT and asking it to merge them.
- Ani doesn’t see much current utility in large language models. He’s more interested in other aspects of AI, so he doesn’t use tools like Copilot much; he found that he spent too much time reviewing its code, making it slower in the long run. He uses AI for other areas like portfolio optimization, recommender systems, and multi-agent teaming.
Q13: Can you give us any direction on the next step internally to push the ball down the road on this exciting initiative?
- We aim to let product lead development. We’ve started many conversations with people to understand what will help the community the most. We also want to publish content to address misunderstandings about inference and the utility of agents in AI. We need to clarify misconceptions through blog posts or other content.
- From a development perspective, the first step we plan to take is to focus on creating a “walking skeleton”—a basic pipeline that effectively handles one model. This will serve as a foundation we can build upon rather than attempting to tackle the full scope of a decentralized market of inference providers all at once. We’ll focus on practical steps like storing models on IPFS, determining the best formats for easy indexing, and likely containerizing models for ease of use.
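The model-fetch step of that walking skeleton might look like this sketch, assuming models are pinned to IPFS and addressed by CID; the gateway URL, CID, and file name are placeholder assumptions.

```python
# Fetch a model artifact from IPFS by CID via a public gateway.
# Gateway, CID, and destination file name are placeholders.
import urllib.request

IPFS_GATEWAY = "https://ipfs.io/ipfs/"

def fetch_model(cid: str, dest: str = "model.onnx") -> str:
    urllib.request.urlretrieve(IPFS_GATEWAY + cid, dest)
    return dest

# model_path = fetch_model("<model CID>")  # hypothetical CID
```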
Q14: You mentioned a wide scope of many types of AI models. What are the first targeted use cases? LLMs?
- Initially, we’ll likely focus on LLMs because they’re more accessible for users to interact with and understand. This will provide a clearer demonstration of our capabilities compared to more specialized models, which might be harder for users to experiment with.