TL;DR: This session of Indexer Office Hours was all about query caching and performance, with special guest David Lutter from Edge & Node.
Opening remarks
Hello everyone, and welcome to the latest edition of Indexer Office Hours! Today is May 14, and we’re here for episode 157.
GRTiQ special release
Want to learn more about the Sunrise Upgrade Program? Don’t miss this special release of the GRTiQ Podcast with Marian Walter, Partnership Manager at Edge & Node.
Repo watch
The latest updates to important repositories
Execution Layer Clients
- Erigon: New release v2.60.0:
- For users using the prune=hrtc flag or any prune flags with non-zero prune.r.older value, follow the release notes for steps to upgrade to fix the pruning of logs issue introduced in 2.59.x.
- Erigon 3 has been in R&D for a long time and is getting closer to release. This release, v2.60.0, will be the last significant release based on Erigon 2. There might be patch v2.60.x releases for critical issues, but by and large, we intend all future development to be based on Erigon 3. The code of Erigon 3 now lives in branch main, which is our default branch now. We advise all forks of Erigon to either switch their development to Erigon 3 or use branch release/2.60 if they choose to stay on Erigon 2.
- You can read more about Erigon 3 in Merging Erigon 3 and Erigon 4, and possibly Caplin.
- Geth: New releases:
- v1.14.2:
- This is a maintenance release containing bug fixes.
- When using geth --dev with a custom genesis block, the genesis file must now set difficulty and terminal total difficulty to zero.
- eth_feeHistory was changed to apply a limit on the number of requested percentiles (#29644).
- eth_createAccessList now honors request cancellation and terminates background work (#29686).
- eth_estimateGas takes tx blobs into account for low-balance scenarios.
- v1.14.3:
- Identical to v1.14.2 but needed for publishing v1.14 on the Ubuntu PPA.
- sfeth/fireeth: New releases:
- v2.5.0:
- Substreams bumped to v1.6.0
- Index Modules and Block Filter are now supported. See Substreams Foundational Models for an example implementation.
- Note: Upgrading will require changing the tier1 and tier2 versions concurrently, as the internal protocol has changed.
- v2.5.1:
- Substreams bumped to v1.6.1
- Revert sanity check to support the special case of a Substreams with only “params” as input. This allows a chain-agnostic event to be sent, along with the clock.
- Fixes error handling when resolved start-block == stop-block and stop-block is defined as non-zero.
Graph Orchestration Tooling
Join us every other Wednesday at 5 PM UTC for Launchpad Office Hours and get the latest updates on running Launchpad.
The next one is on May 22. Bring all your questions!
Protocol watch
The latest updates on important changes to the protocol
Forum Governance
Forum Research
Core Dev updates:
Contracts Repository
- [WIP] Horizon: add subgraph data service #946 (open)
- [WIP] Horizon: add escrow and payments #968 (open)
- fix: horizon dispute manager tests #969 (merged)
- feat: remove minimum allocation duration restriction #902 (merged)
Open discussion
David Lutter from Edge & Node joined to talk about query caching and performance.
Before his presentation, he gave tips on how subgraph authors can make their subgraphs more performant (a schema sketch illustrating them follows the list):
- Every subgraph should use immutable entities.
- They should also use bytes to store hex data instead of strings, including the ID.
- Declare aggregations in your subgraphs. This has a good impact on indexing speed as well as query performance.
- Declare ETH calls in the manifest to run them in parallel.
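Here is a minimal, hypothetical schema sketch illustrating the first three tips. The entity and field names are made up for the example, and declared ETH calls live in the subgraph manifest rather than the schema, so they are not shown:

```graphql
# Immutable entity that stores addresses as Bytes (including the id)
# instead of hex strings.
type Transfer @entity(immutable: true) {
  id: Bytes!
  from: Bytes!
  to: Bytes!
  amount: BigDecimal!
}

# Raw timeseries entity that feeds the aggregation below.
type TransferEvent @entity(timeseries: true) {
  id: Int8!
  timestamp: Timestamp!
  amount: BigDecimal!
}

# Declared aggregation: graph-node rolls the raw events up by hour and day,
# which helps both indexing speed and query performance.
type TransferStats @aggregation(intervals: ["hour", "day"], source: "TransferEvent") {
  id: Int8!
  timestamp: Timestamp!
  totalAmount: BigDecimal! @aggregate(fn: "sum", arg: "amount")
}
```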
Overview of his presentation:
- Query caching and how to configure it
- Protect against slow and memory-hungry queries
- Diagnose database overload
Query caching, slide 1
- Built into graph-node with very conservative defaults
- Caches by block constraint – one entry per block constraint
- Cache normal queries, but not index-node status queries
- Cache key: (block pointer, IPFS hash, query text)
- Block pointer from block constraint
- Query text after variable interpolation
- Immaterial query changes are cache busters (e.g., very fine-grained timestamps)
Graph Node has query caching built in. The default configuration is a little too conservative, and I’ll talk more about the configurations available, but there are actually two caches. Let’s first discuss how a GraphQL query is taken apart.
A GraphQL query can request data at different block heights: you can put block constraints in your query, so one part of the query runs against block 10 and another part runs against block 20.
For caching purposes, we take the query apart and group things by the block at which they need to run, and the stuff for each block height becomes one entry in the cache.
So if you have a query that queries against block 10 for some stuff and then against block 20 for another, those are potentially two cache entries.
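For example, a query like the following hypothetical one (the tokens field is made up) would produce two cache entries, one per block constraint:

```graphql
{
  # Grouped and cached under the entry for block 10
  oldTokens: tokens(block: { number: 10 }, first: 5) {
    id
  }
  # Grouped and cached under the entry for block 20
  newTokens: tokens(block: { number: 20 }, first: 5) {
    id
  }
}
```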
The caching is only done for normal GraphQL queries; things like index-node status queries don’t get cached. There are a couple more exceptions, but as a general rule, normal GraphQL queries get cached and everything else is ignored for caching.
The cache key is the block pointer at which that query should be executed, the IPFS hash of the deployment, and then the actual query text. If the query has variables, we first interpolate the variables into the query, and that’s the query text that gets used as the cache key.
There is a downside: two queries that are basically the same but differ in minor ways get cached independently. Something that happens a lot is people querying for a specific timestamp at second resolution, even though they know the data is aggregated every hour.
If those timestamps are too fine-grained, you’re actually guaranteed cache misses even though the query could be served from the cache.
That’s one thing to raise with people who use subgraphs: when they write their frontends, they should be conscious of the timestamps they use in queries and round them to something fairly coarse to increase the chance of hitting the cache.
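As a sketch of what that means in practice (the hourlyStats field and its filter are hypothetical), the only difference between these two queries is how the timestamp is rounded, but only the second one has a realistic chance of being a cache hit on repeated requests:

```graphql
# Cache-unfriendly: a second-resolution timestamp makes the query text,
# and therefore the cache key, different on almost every request.
query Unrounded {
  hourlyStats(where: { timestamp: 1715694123 }) {
    totalAmount
  }
}

# Cache-friendly: rounded down to the hour the data is aggregated at
# (1715694123 rounded down to a multiple of 3600 is 1715691600).
query Rounded {
  hourlyStats(where: { timestamp: 1715691600 }) {
    totalAmount
  }
}
```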
Query caching, slide 2
- Two-level cache
- Recent blocks: one cache per network, no change when full
- Older blocks: LFU cache for all networks, evict when full
- Herd cache
- Avoid running same query multiple times simultaneously
The query cache is really two caches. There’s the recent blocks cache: queries against the last few blocks, the highest blocks we see queries for, go into it. Historical stuff, queries that look at block heights from a week or 30 days ago, goes into a different LFU cache. LFU means Least Frequently Used.
That’s the eviction strategy: we stuff things into that cache, and when it gets full, we throw out stuff that hasn’t been used in a while. There’s one recent blocks cache for each network, and when that cache is full, that’s just too bad. We don’t evict from it; we just wait until the chain has progressed, and then we throw the entire cache for that block away.
The older cache for the older blocks does evict when it gets full and throws out things that haven’t been used for a while.
There’s actually a third cache, and I’m just mentioning that here because you might see that lingo somewhere. It’s called the herd cache, and that addresses the problem of getting the same query twice at almost the same time.
The first one might still be running SQL queries and processing, and now you get a second, identical query. Instead of doing all the work again, we stuff it into this herd cache, which basically says the second query should just wait until the first query is done and then reuse the result. That’s just a clever mechanism to avoid some database load.
Query cache configuration
The first column is the name of the environment variable, the second column is the default value, and the third column is what we use in the hosted service. What we have set there is just an example of what you might want to set.
| Variable | Default | Hosted service | Description |
| --- | --- | --- | --- |
| GRAPH_CACHED_SUBGRAPH_IDS | * | * | Which subgraphs to cache (Qm hashes) |
| GRAPH_QUERY_CACHE_BLOCKS | 1 | 6 | How many blocks the recent blocks cache keeps |
| GRAPH_QUERY_CACHE_MAX_MEM | 1000 | 3000 | Total cache size in MB (recent and LFU separately) |
| GRAPH_QUERY_CACHE_MAX_ENTRY_RATIO | 3 | 3 | Max size of one cache entry (shard size / this) |
| GRAPH_QUERY_BLOCK_CACHE_SHARDS | 128 | 8 | Number of cache shards for the recent blocks cache |
| GRAPH_QUERY_LFU_CACHE_SHARDS | ⬆ | ø | Number of cache shards for the LFU cache |
| GRAPH_QUERY_CACHE_STALE_PERIOD | 100 | 1000 | Evict from LFU cache after this many queries |
These are all the environment variables I could find for configuring caching. The first one is GRAPH_CACHED_SUBGRAPH_IDS. You can either cache queries for all subgraphs, or, if you set it to a comma-separated list of Qm hashes, only cache for those deployments.
You can also use that to turn off caching completely; just set that variable to something that’s not a Qm hash. If you set that to the string none, then nothing will get cached.
The first column with values is the default for these settings, and the second column is the setting that’s being used by a fairly big Graph Node installation that I’m very familiar with.
Question in the chat: Does it work as in-memory cache only?
Answer: Yes, it’s only in-memory cache. That caching doesn’t write to disk anywhere.
The next setting, GRAPH_QUERY_CACHE_BLOCKS, controls the recent blocks cache: for how many blocks does it keep data? If the chain head is at block 20 and you set this to one, then we only cache queries that go against block 20. If you set it to six, as in the hosted service, we cache for the last six blocks.
GRAPH_QUERY_CACHE_MAX_MEM sets how much memory the cache can use. It’s in megabytes, so a thousand means a gigabyte. That size is used for the recent blocks cache and the LFU cache, so the total memory use will be twice that number.
GRAPH_QUERY_CACHE_MAX_ENTRY_RATIO is a defense against queries whose results can be really big. They might plug up your cache, and it might be better not to put really big things in the cache. This ratio says: if the query result is bigger than the size of a cache shard divided by this number, we don’t even think about caching it.
The cache shard settings, GRAPH_QUERY_BLOCK_CACHE_SHARDS and GRAPH_QUERY_LFU_CACHE_SHARDS, have to do with internal locking. Every time we do something with a cache, we need to take a mutex lock in the code, and under heavy load you can get really bad contention around these locks.
So, instead of having one lock, we shard the cache and have many locks. With the default setting we would have 128 locks, and different queries use different shards of the cache based on the cache key. If the query load isn’t all that high, a setting of eight is plenty. Setting this really high reduces the size of each cache shard, which in turn reduces the size of queries that can be cached.
The formula to figure out how big a cache shard is:
GRAPH_QUERY_CACHE_MAX_MEM / (# networks * GRAPH_QUERY_CACHE_BLOCKS * GRAPH_QUERY_BLOCK_CACHE_SHARDS)
So a query that’s bigger than that size won’t ever be cached because it won’t fit in the cache.
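As a rough worked example with the hosted-service values above, and assuming a single network (an assumption just for illustration): 3000 MB / (1 * 6 * 8) = 62.5 MB per shard of the recent blocks cache, and with GRAPH_QUERY_CACHE_MAX_ENTRY_RATIO at 3, any single result larger than roughly 62.5 / 3 ≈ 20.8 MB will never be cached.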
GRAPH_QUERY_CACHE_STALE_PERIOD determines how frequently we try to evict things from the LFU cache, the cache for historical blocks.
Unfortunately, it’s not possible right now to set the maximum memory per subgraph (it might also be desirable to set it per network); the maximum is just a global setting.
Cache logs
DEBG Rotating query cache, stats for last block,
avg_insert_time_ms: 0.00, fill_ratio: 0.01,
dead_inserts: 4, avg_hits: 2.14, entries: 7,
network: mainnet, shard: 60,
query_id: …, subgraph_id: …, component: GraphQlRunner
DEBG Evicted LFU cache, evict_time_ms: 0,
accesses: 9184, hit_rate: 27%, stale_update: false,
weight_evicted: 465, weight: 23437194, entries_evicted: 1, entries: 2359, network: …, shard: 72,
query_id: …, subgraph_id: …, component: GraphQlRunner
These are examples of the logs you’ll see when caching is on and we do things with the cache. The first one, rotating the query cache, happens for the recent blocks cache.
Once we determine that a certain block height shouldn’t be in that recent block cache anymore because we’ve seen queries for higher block numbers, we produce a log message that tells you a little bit about how well that cache worked, like average hits. Dead inserts mean we put things in the cache, but they were never used, and so that tells a little bit about the recent blocks cache.
Similarly, the second message is logged when we evict from the LFU cache: it says how many accesses to that cache there were, how much got kicked out (entries_evicted and weight_evicted), and how big the cache is (weight). The weights are in bytes.
Cache metric query_cache_status_count
| cache_status | Meaning |
| --- | --- |
| hit | Found in cache |
| insert | Not in cache, but inserted |
| shared | Waited for herd cache |
| miss | Not in cache, not inserted |
Cache hit percentage:
sum(rate(query_cache_status_count{cache_status="hit"}[1m])) / sum(rate(query_cache_status_count[1m])) * 100
Hit means we found it in the cache. Insert means it wasn’t in the cache yet, but we put it there. Shared means the query was served via the herd cache: when we got that query, we were already running the same query and just waited until the first one finished. Miss means it wasn’t in the cache, and we didn’t insert it for whatever reason.
For example, if you turn caching off completely with the subgraph ID setting, you should only see values under the miss label. One thing we put on a dashboard internally is the cache hit percentage. It’s unfortunately not huge; we see a 20–30% cache hit ratio in the hosted service.
Protect against slow and memory-hungry queries
| Variable | Default | Hosted service | Description |
| --- | --- | --- | --- |
| GRAPH_GRAPHQL_MAX_FIRST | 1000 | 1000 | Max first in a GraphQL query |
| GRAPH_GRAPHQL_MAX_SKIP | ø | 5000 | Max skip in a GraphQL query |
| GRAPH_GRAPHQL_MAX_DEPTH | ø | 50 | Max nesting depth of a GraphQL query |
| GRAPH_GRAPHQL_QUERY_TIMEOUT | ø | 105 | Max time in seconds for an entire GraphQL query |
| GRAPH_SQL_STATEMENT_TIMEOUT | ø | 102 | Max time in seconds for one SQL query |
These settings protect against long-running queries. Max first and max skip take fixed values and limit those fields in GraphQL queries: you can fetch only a thousand things and skip over at most 5,000 things, because anything beyond that is too slow. Max depth limits how deeply a GraphQL query can be nested.
You can set two timeouts. The GraphQL query timeout is how long an entire GraphQL query is allowed to take, and the SQL statement timeout is how long an individual SQL query, for a GraphQL query, is allowed to take.
The values: the first column with values is the default, so the timeouts are unlimited by default, and the second column is what we use in the hosted service.
The values in the hosted service come from our having Cloudflare in front. I think that just times out queries after 100 seconds anyway, so timing out on the Graph Node side after a little more than 100 seconds just saves us a bunch of unnecessary work.
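As a quick illustration of the first and skip limits (the transfers field and its arguments are hypothetical), this paginated query stays within the hosted-service settings above:

```graphql
{
  # first is capped at 1000 and skip at 5000 by the limits above;
  # asking for more than that is rejected by graph-node.
  transfers(first: 1000, skip: 4000, orderBy: id) {
    id
    amount
  }
}
```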
| Variable | Effect |
| --- | --- |
| GRAPH_GRAPHQL_WARN_RESULT_SIZE | Log a warning |
| GRAPH_GRAPHQL_ERROR_RESULT_SIZE | Abort the query with an error |
Metric: sum by (le) (query_result_size_bucket)
WARN Large query result,
query_id: …, size: 110723418, subgraph_id: …, component: GraphQlRunner
In the next Graph Node release, all SQL queries will only query for the specific columns that are needed for the GraphQL query.
Diagnose database overload
You can listen to the last section on diagnosing database overload at 32:52 in the recording and view the presentation slides.
Questions
Q: Do we use this cache behind an indexer service so it can count paid queries?
- Answer: Yes. This cache is built into Graph Node and is completely transparent to anything that queries Graph Node; to the outside world, it doesn’t make a difference whether the query was served from the cache or we had to go to the database to compute the result.
Q: What does the up arrow mean in the query cache configuration table?
- Answer: The default for GRAPH_QUERY_LFU_CACHE_SHARDS is the value of GRAPH_QUERY_BLOCK_CACHE_SHARDS. If it’s not set, we just use whatever the block cache shards setting is.
Q: Did I hear correctly that the load manager is not used on the hosted service?
- Answer: No, the load manager is used on the hosted service. It actually kicks in quite a bit, but it’s just this jailing mechanism that we don’t use on the hosted service because it seemed a little scary.
Q: About scalability, can we scale it horizontally?
- Answer: You can have multiple Graph Nodes that serve queries.
Q: Can you talk more about the query semaphore wait mechanism? What are “permits”?
- Answer: So a semaphore, when you set it up, has a certain number of permits. Say you set it up with like 10 permits, then 10 things can acquire that semaphore and do whatever code is protected by the semaphore, and then the 11th one gets blocked and has to wait until somebody who has a permit hands it back, and then they can proceed.
- So basically, a semaphore is a lock that a limited number of holders can acquire. By setting the number of permits to the number of connections in the pool, and acquiring the semaphore fairly early in query processing, one GraphQL query gets the semaphore and does its thing; that way it essentially has a reserved connection in the pool even though it’s not holding onto the connection the whole time. The mechanism is supposed to ensure that a GraphQL query, once we start processing it, doesn’t get blocked while waiting for a database connection.