TL;DR: In this week’s recap of Indexer Office Hours, special guest John Graber joins to answer indexer questions about PostgreSQL. This session is loaded with insights on best practices for configuration tuning, database optimization, third-party monitoring tools and so much more.
Opening remarks
Hello everyone, and welcome to episode 172 of Indexer Office Hours!
GRTiQ 183
Catch the GRTiQ Podcast with Garvit Goel, Founder at Electron Labs, a web3 solution focused on solving the final bottleneck in Ethereum’s scaling roadmap using zero-knowledge proofs.
Repo watch
The latest updates to important repositories
Execution Layer Clients
- sfeth/fireeth: New release v2.7.0 :
- This release enhances Firehose and Substreams by introducing new endpoint information services and implementing a well-known chain list, which automates and standardizes chain identification.
- The update also includes chain-specific advertising features, improved startup validation, performance optimizations for Substreams, and various bug fixes, all aimed at streamlining setup processes and reducing configuration errors across different blockchain networks.
- Arbitrum-nitro New release v3.1.1 :
- This release significantly enhances Stylus support, adds fast confirmation capabilities, introduces experimental PathDB support, and improves Anytrust DAS and validator stability.
- The update also includes various configuration changes, metrics improvements, and internal optimizations aimed at enhancing performance and security across different aspects of the Arbitrum network.
Consensus Layer Clients
Information on the different clients
- Prysm: New release v5.1.0 :
- This Prysm release introduces significant Electra upgrade-related changes, adds new features like multiple beacon node support in the REST API, and implements various performance optimizations and bug fixes.
- Notable deprecations include the planned removal of gRPC gateway and renaming of associated flags, while key bug fixes address issues with sync committee performance, slasher database disk usage, and various API-related improvements.
Protocol watch
Forum Governance
- GIP-0066: Introducing Graph Horizon – A data services protocol
- GIP-0068: Subgraph Service – A subgraph data service in Graph Horizon
Contracts Repository
- View 34 pull requests from this week in the Indexer Office Hours agenda.
Project watch
- Indexer components release v0.21.4
- Hopefully, we’ll have some exciting Graph Node features to share shortly
Announcement
Nick from The Graph Foundation encouraged everyone to attend The Graph core devs call on Thursday, August 29 at 5 PM UTC.
During the call, there were updates relevant to indexers and developers on Horizon and exploring subgraph data services in a Horizon environment.
- More on Graph Horizon
- Core Devs Call recordings (in case you missed it)
Open discussion
John Graber, Principal DBA at Edge & Node, answers questions from the indexer community on PostgreSQL.
Much of the content below is directly from John’s Markdown presentation. Some of the comments have been lightly edited and condensed.
PostgreSQL for Fun & Profit [11:05]
About John
- Moved to Postgres from Oracle in 2013
- Long neglected and recently resurrected GitHub: graphbear – Overview
Today’s agenda
Take our first steps together to help us all, myself included, become better graph node DBAs.
- Running Postgres is one thing.
- Understanding how to best deploy and maintain it to run indexer workloads is another.
- We are all probably guilty of “elephant abuse” in one way to another, even if we don’t know it yet.
Share from an infrastructural and architectural perspective (at a basic 101 level to start).
- Begin with the low-hanging fruit:
- Initial focus on the submitted questions that have the biggest impact for the least effort and least risk.
- Stick to the basics as much as possible (for now):
- “Rules of thumb” as much as possible.
- We need to develop simple, solid foundations first.
- Throw in any simple tips that come to mind along the way.
We’ll continue to learn and improve over the months and years as we build out configurations that are specifically good for graph nodes on Postgres.
DISCLAIMER: Do NOT apply what we discuss today en masse and without testing!
Questions collected from indexers
What are some best practices for PostgreSQL configuration tuning? [19:08]
- Implement the “rule of thumb” recommendations for the OS and PostgreSQL gradually.
- Take the time to observe the outcome of each change before moving on to others.
- Determine if your change has benefitted the system or not:
- Are your queries running faster?
- How is the memory being used?
- How does the CPU load look?
- Get the system configuration and architecture dialed in before spending too much time on queries.
- Query plans will often change as a result of config changes.
- Indexers are in a unique space: you don’t get to tune queries, and as you dial things in, your problem queries will change.
- Continuous improvement through periodic review of metrics and reports to inform further config changes.
Are there any exercises in database optimization that deliver the majority of gains for the least effort and risk? [22:35]
The basic memory parameters are a nice way to start:
- shared_buffers
- 0.25 x RAM
- effective_cache_size
- 0.75 x RAM
- This is used by the planner; it’s not committed memory
- maintenance_work_mem
- (0.15 x RAM) / autovacuum_max_workers
- max workers commonly = 6
- work_mem
- (0.25 x RAM) / max_connections
- Be careful with this one
- Increase slowly
- max connections is a tricky one with graph nodes (at present)
- Ideally, the max connections is one connection per CPU, but that is too expensive for most, so maybe two or three per CPU, and beyond that, you want to have a pooler in front of it. We do have pooling within graph node itself, so I want to be careful about saying there’s a blanket calculation for how many max connections you should have. I don’t know how your systems are running, so I can’t be certain what to recommend.
- work mem is important because every operation in a query can get this much RAM for itself, so we go with this rule of thumb on a connection basis because you don’t actually know how many operations a particular query will have at one time.
Drop unused indexes
- We slap an index on everything when a new subgraph is deployed, but end up with a tremendous overhead of unused indexes that impact each schema and the size and speed of your database.
- Unused indexes also cause a significant load on the autovacuum process, and you want to avoid that.
- Drop unused indexes! (If you get nothing else from today besides this, you’re still winning.)
- I’m hoping to put something together to automate dropping unused indexes at Edge & Node and will share lessons learned. Maybe this is something we’ll come up with as a community that we can all run on a periodic basis.
Disk layout
Requires a little more work, but it’s worth it. Try it and see how it goes for your system.
- You probably want/need to increase the OS read_ahead_kb for disks hosting data files.
- cat /sys/block/<device name>/queue/read_ahead_kb is commonly 128.
- echo 4096 > /sys/block/<device name>/queue/read_ahead_kb will set it on the fly.
- Needs to be added to /etc/rc.local to make it permanent.
- Increasing disk read ahead improves I/O throughput by reducing the number of requests to disk.
At the very least, pgwal (formerly pgxlog) and pgdata should be different disks and mount with noatime.
- pgwal should be the fastest disk (but don’t skimp on speed for pgdata either).
- pgwal doesn’t have to be a huge disk (a terabyte would be a safety net).
- Nice-to-haves:
- Indexes in their own tablespaces and put that tablespace on its own disk.
- pglog on its own disk (though putting it on the OS volume usually isn’t a bad second option).
Other postgresql.conf parameters
- Not diving into settings for WAL, autovacuum, bgwriter, etc. in this presentation.
- Usually a bigger discussion than we have time for today.
- Happy to share my usual defaults as a starting point for your testing.
Kernel parameters
- Yes, there are OS kernel parameters that can be modified for DB workloads.
- A little more risk in this area, and it’s been several years since I last looked into them.
- Also happy to share my old settings with a #YMMV (your mileage may vary) tag.
Pause for questions from the chat [34:27]
Vince | Nodeify: Do you recommend EDB-tuned profile?
John | Edge & Node: I haven’t tried it lately, so I’d have to look at it to see where it’s at. I’m cautious of anything that automatically comes back with “these are what your parameters should be.” That said, it’s a nice reference point, you could run it and see what it gives you and work from there, but I wouldn’t take it as absolutely true.
Vince shared some of the recommendations from the EBD-tuned profile in the chat:
mkdir /etc/tuned/edbpostgres
# Create the profile file:
echo "
[main]
summary=Tuned profile for EDB PostgreSQL Instances
[bootloader]
cmdline=transparent_hugepage=never
[cpu]
governor=performance
energy_perf_bias=performance
min_perf_pct=100
[sysctl]
vm.swappiness = 10
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 250
vm.dirty_ratio = 10
vm.dirty_background_ratio = 3
vm.overcommit_memory=0
net.ipv4.tcp_timestamps=0
[vm]
transparent_hugepages=never
" > /etc/tuned/edbpostgres/tuned.conf
John: I like this, okay, yeah, vm.swappiness is 10, the dirty_expire_centisecs is 500. We used to recommend 499 to be specific. So yeah, I think these are useful as a starting point, but don’t implement it wholesale. For these kernel parameters, throw them on a system you can test before implementing.
Hau | Pinax: Is Edge & Node still tuning your Postgres today? Like never-ending improvements?
John: We’re just getting going with the tuning. We’re now just reaching a point where after Sunrise, we better understand the loads on our system. So yes, we will be making never-ending improvements, but they will slow down eventually. Our goal is to standardize to a size and configuration of server and Postgres settings to go with it.
stake-machine.eth: Best file sharing still XFS?
John: I am personally very interested in trying ZFS. I have been for years but I’ve never had the opportunity to implement it anywhere. It seems like it would be a fantastic fit for Postgres. But everything I’ve ever deployed was either on Ext4 or XFS with no issues.
Matthew Darwin | Pinax: We’re mostly using ZFS.
Back to questions collected from indexers
What, if any, third-party monitoring tools do you recommend for PostgreSQL? [38:26]
We’re using Grafana, and the dashboards we have tend to be homegrown. I would like to move to community-created dashboards that we can implement in Grafana.
- Crunchy Data is a core contributor in the Postgres space and they have a suite of tools including pgBadger, which is near and dear to my heart.
- Take a look at Crunchy Data’s pgMonitor: PostgreSQL Monitoring, Metrics Collection and Alerting Resources.
- Sidenote: Crunchy Data’s pgBackRest tool is also my favorite backup and restore solution.
- Haven’t tried Cybertec’s pgwatch2: PostgreSQL metrics monitor/dashboard, but it looks interesting.
- Check out the CEO of Cybertec’s blog posts: Articles by Hans-Jürgen Schönig.
How can we continuously identify slow queries and bottlenecks in our graph database? [41:38]
When you need a live view to investigate why something is so slow right now:
- Dashboards in something like Grafana (such as pgMonitor provides, if I remember correctly) can be very helpful to take a first look.
- Is my memory being chewed up?
- Is it CPU load?
- If you want to get your hands dirty in psql or PGAdmin4:
- pg_stat_activity is a go-to view to query initially.
- What’s idle? What’s waiting? What’s the longest running?
- You could write your own scripts (or ask an AI to do it?) to report on things like locking, like an ancient one I have in my Gists.
Otherwise:
Why pgBadger?
- Easy to set up as standalone
- Also included in pgMonitor suite of tools
- It’s the key to continuous improvement
- Hands down my favorite tool for finding:
- Slow queries
- Frequent—high load—queries
- Queries spilling onto disk (temp files)
pgBadger goes through your log files. So you can ship your log files to another server where you’ve got pgBadger set up, so you’re not loading your database, and it will chew through your log files and tell you:
- Top-end queries by time
- Top-end queries by frequency
- Temp file sizes that are being created on disk
This is the continuous improvement part.
pgBadger is one of the most important tools for anyone who’s running Postgres.
It’s free, and it really helps you zoom in on what your database is doing every day and how you can improve it. I highly recommend it.
What are some common causes of performance degradation in PostgreSQL? [46:25]
Degredation over time is generally caused by:
- Unused indexes
- autovac settings that can’t keep up, which result in:
- Bloat
- Poor statistics
- Transaction ID (xid) wraparound will result in catastrophic failure
- If you see something in your log files about emergency vacuum to prevent transaction ID wraparound, I would dig into that. But most dashboards show you where you’re at on those transaction IDs.
Should we invest time in learning to optimize query execution plans? [48:26]
- Not really, no.
- In Postgres, there isn’t a way to specifically influence what the query plan should be as you might with something like hints in Oracle.
But that said…
Try to get to the point where you can at least have a basic understanding of the output of:
- explain analyze
- There’s a site, https://explain.depesz.com/, where you can paste in your explain analyze plan and it will give you a visual of where you’re losing the most time.
- pgBadger
So that you have an idea of indexes you might try—and/or (at a higher level) if cranking up statistics might help.
Lastly,
There are configuration parameters that can also affect plans, like the enable_* settings:
- Can be set for the whole instance or at the session level.
- I would not recommend playing with these.
There are also some *_cost parameters that can be set:
- But again, not something I’d usually spend time on upfront.
- I tend to maybe look at tweaking these when everything else has been dialed in.
What’s the best way to set up and use PostgreSQL’s query planner statistics? [51:45]
- Short answer: You don’t. Postgres handles this for you.
- While there are ways to affect the planner, I would not recommend experimenting in that space yet.
- I can share some common postgresql.conf settings for this later.
- As long as your stats are up to date because your autovac workers are keeping up, that’s about it.
- For heavily updated tables, you might consider a cron job to manually run analyze or vacuum analyze on those specific tables.
- It’s not often that you need to crank the statistics settings instance-wide, but sometimes it might help to set it at the table level.
Pause for questions from the chat [54:01]
Vince | Nodeify: What do you think about tools like this? PostgreSQL Tuner
John | E&N: I’m kind of neutral on that one as well. As I said, I shy away from just implementing these kinds of tools. If you want to run it and let it give you suggestions or recommendations, go for it.
Postgres tuning scripts don’t give me a warm and fuzzy feeling, but I acknowledge that time and effort went into them, and they do produce some useful advice or starting points, especially if you don’t know where to start from. Be careful and implement slowly.
stake-machine.eth: Is it a problem if “Table blocks has not been auto analyzed for 10 days”?
Derek | Data Nexus: Largely depends on the number of writes happening to that table. If it’s a trim 100-record table, that shouldn’t be an issue. If it’s constantly writing millions per day, it would make sense to check it out.
John: It depends on the activity on that table. If the table hasn’t been being updated frequently, that might be fine. If it’s been analyzed and it hasn’t had a ton of updates and there are settings that determine what a ton of updates is, sometimes that is done by percentage of rows in the table before it needs to be autovacuumed. But yeah, I would check into it and see if it’s a table that does or does not get updated frequently. If it does and it hasn’t been autoanalyzed in 10 days, then I would start looking into it. Maybe I’ll run vacuum analyze on it manually and at least get the statistics up to date and then maybe dig in further to see why it wasn’t getting analyzed. Is it my autovacuum settings?
Oh, I see Derek has already chimed in on that. Yep, exactly.
Matthew Darwin | Pinax: Question to indexers: (Derek, Marc-André), do you see points in the day where there is lower than normal query volume?
Marc-André | Ellipfra: Surprisingly, not hugely. Weekends are definitely slow. And I find this surprising too.
Matthew: This kind of pattern tells me there are more people working on crypto in “their day job” rather than “as a hobby.”
Derek | Data Nexus: Our biggest variances come from when we run into node issues (Arbitrum falling behind, etc.), but it seems to be less related to time of the day and user interactions. Some consumers have automated systems to query the subgraph at a specified rate.
Marc-André | Ellipfra: What does a good streaming replication setup look like? Synchronous/unsynchronous? Third party or pg-native?
John: For streaming replication, I have never done anything but synchronous. I always want my read replica up to date. Depending on what the business needs are, you might consider removing the need for the primary to have confirmation that it’s been written on the replica before the primary considers the transaction committed. That can speed things up on the primary, but there’s a remote chance that the primary dies and something didn’t made it to the replica. It depends on your business requirements.
I’ve never used anything other than pg-native replication. I’m always thinking of my replica as a read replica and usually want to be able to failover to it in case of emergency.
John didn’t have time to cover all the questions, so this discussion will be continued… 😃
No Comments