TL;DR: Pinax datasets simplify accessing blockchain data, making it easy for analysts to focus on analysis without needing specialized tools.
Accessing blockchain data for analysis is complex, especially for analysts who lack specialized knowledge of blockchain data extraction, yet demand for this data keeps growing. Pinax will soon offer datasets that make it effortless for anyone to access blockchain data for analysis and statistics.
Are you a data analyst who wants to dive into blockchain without the hassle of learning specialized tools? We’ve got you covered!
What are datasets?
Pinax datasets assemble blockchain data in a format familiar to data analysts—one that is easy to access and eliminates the need for specialized tools.
The idea with datasets is we do that work for you. We extract the data using Firehose and Substreams, and all the raw blockchain data for a particular chain is put into a format that’s easily accessible for a data analyst.
Daniel Keyes, CEO and co-founder of Pinax
We offer advanced data streaming services like Firehose (extracts the data) and Substreams (transforms the data), but these still require a high level of expertise to operate.
Pinax’s datasets bridge this gap by providing ready-to-use data formats that are stored in S3 buckets as Parquet files. These are easily integrated into existing databases and compatible with all major database management systems.
Why datasets?
Traditional methods for extracting blockchain data pose significant barriers. They require infrastructure operator expertise that many data analysts lack.
Pinax datasets handle the complex extraction process, making the data readily available in an accessible format. This approach lowers the barrier for data analysts, allowing them to save time and focus on analysis rather than data extraction.
With datasets, we extract the data and format it for easy access. This saves analysts the time and effort required to run complex tools. We update the datasets daily and plan to offer more granular updates over time.
Daniel Keyes
Why S3 buckets as Parquet files?
We store data in S3 buckets as Parquet files because Parquet’s columnar format and strong compression keep storage efficient and costs low. The columnar layout also improves query performance: a query reads only the columns it needs, which makes the format well suited to big data analytics. In addition, S3’s scalability and integration with AWS services enable cost-effective data processing at scale.
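To make the columnar idea concrete, here is a minimal sketch in plain Python. It is not Parquet and it is not how Pinax datasets are built; it only mimics the difference between storing records row by row and storing each field as its own column. The sample records, field names, and JSON encoding are all assumptions made up for illustration.

```python
import json
import zlib

# Hypothetical transaction-like records, invented for this illustration.
rows = [
    {"block": 1_000_000 + i, "from_addr": f"0x{i % 7:040x}", "value": i * 10, "gas": 21000}
    for i in range(1000)
]

# Row-oriented layout: each record keeps all of its fields together.
row_bytes = "\n".join(json.dumps(r) for r in rows).encode()

# Column-oriented layout: the values of each field are stored contiguously.
columns = {name: [r[name] for r in rows] for name in rows[0]}
column_bytes = {name: json.dumps(vals).encode() for name, vals in columns.items()}

# Summing one field from the row layout means parsing every full record.
total_from_rows = sum(
    json.loads(line)["value"] for line in row_bytes.decode().splitlines()
)

# With the column layout, only the bytes of the "value" column are read.
total_from_columns = sum(json.loads(column_bytes["value"]))

print("totals match:", total_from_rows == total_from_columns)
print("bytes scanned, row layout:   ", len(row_bytes))
print("bytes scanned, column layout:", len(column_bytes["value"]))

# Similar values sit next to each other in a column, which tends to help compression.
print("compressed size, row layout:   ", len(zlib.compress(row_bytes)))
print("compressed size, column layout:", sum(len(zlib.compress(b)) for b in column_bytes.values()))
```

Both layouts give the same total, but the column layout gets there by touching a small fraction of the bytes. Real Parquet readers apply the same principle with a binary format, per-column encodings, and file metadata, so the exact numbers printed here say nothing about actual dataset sizes.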
S3 is a widely adopted standard across all major data analysis platforms, so the datasets slot into existing workflows and data pipelines. Data analysts can add them to their current infrastructure and scale their operations without introducing significant overhead or complexity.
Why use Pinax for datasets?
Pinax specializes in blockchain data extraction across multiple chains, leveraging our extensive experience to offer datasets in a uniform, consistent format. Our expertise ensures that the data you receive is reliable, comprehensive, and ready for analysis without the need for you to do any complex setup.
By choosing Pinax, you’re partnering with a company uniquely positioned to simplify blockchain data access and provide high-quality datasets tailored to your needs, delivered as Parquet files in easily accessible S3 buckets.
Our specialty is that blockchain extraction part, which most companies actually do not want to do because you have to run infrastructure. You have to run an archive node. You have to run servers. People just want the final data product.
Denis Carrière, CTO and co-founder of Pinax
What sets us apart?
Pinax stands out by specializing in blockchain data extraction and using its own bare metal infrastructure. This allows us to deliver high-quality data at a lower cost.
With Pinax, users benefit from consistent data across all EVM chains and full compatibility with major databases via Parquet files stored in S3 buckets. The potential for verifiable data through cryptographic proofs further enhances the reliability of our datasets.
Our cost efficiency comes from running our own infrastructure and using Firehose and Substreams for data extraction. This allows us to offer high-quality data at a lower cost compared to others relying on cloud services.
Daniel Keyes
Who will use Pinax datasets?
Our datasets are designed for a wide range of users, from blockchain developers to traditional data analysts. Initial users include companies like Messari, which has benefited from the ease of access and quality of data provided by our datasets.
Whether it’s macro analytics for platforms like Messari and Dune, forensic investigations, accounting, or AI chatbot training, Pinax datasets provide consistent, high-quality data across all EVM chains.
Messari is one of our first customers. They currently pull data from various providers, which was time-consuming and costly. With datasets, they can focus on analysis, using our consistent and reliable data.
Denis Carrière
Broader applications in the future
Datasets have potential applications beyond blockchain analysis, including forensic investigations and AI model training. The ability to provide comprehensive historical data makes datasets a valuable resource for various use cases, such as finance analytics or tax preparation software.
We see datasets being used in blockchain forensics to trace stolen funds and in AI to train models on historical blockchain data. These applications highlight the versatility and value of datasets.
Denis Carrière
Try our prototype and stay tuned
Are you interested in exploring our Ethereum datasets prototype? Contact us via email or join our Discord server to get access and connect with our community!
Follow us on X and LinkedIn so you’ll be the first to hear news and updates about our dataset product. Partner with us to access consistent, reliable blockchain data across multiple chains and elevate your data analysis capabilities.
💡 This article answers questions like:
- What are Pinax datasets, and how do they simplify access to blockchain data?
- Why is accessing blockchain data traditionally complex for data analysts?
- How do Pinax datasets bridge the gap between raw blockchain data and data analysis?
- What technologies do Pinax datasets use to extract and format blockchain data?
- What are the benefits of using Pinax datasets for blockchain analysis, and who are they for?