This page outlines the performance characteristics and tested limits of Parseable in real-world scenarios. It helps you understand how Parseable scales with data volume, query concurrency, and storage size. Whether you're planning capacity or benchmarking against your workloads, this page gives you actionable insights into ingestion throughput, query latency, resource utilization, and architectural boundaries.
We include:
- Benchmark Results and Performance Metrics: Ingestion speed, query response time, and system resource consumption under load.
Performance Metrics
As part of our continuous performance improvement efforts, we benchmarked ParseableDB against other popular OLAP systems using the ClickBench suite.
The goal: reliably quantify ParseableDB’s query and ingestion performance in a standardized environment.
Parseable now ranks among the fastest databases on ClickBench, outperforming several OLAP systems in structured query benchmarks.
Environment
We ran the benchmarks on the two most popular instance types used in ClickBench:
- c6a.4xlarge: 16 vCPUs, 32 GiB RAM
- c6a.metal: 192 vCPUs, 384 GiB RAM
Dataset
- Source: ClickBench dataset (~22 GB compressed)
- Decompressed size: ~216 GB
- Total rows: ~100 million
- File format: hits.json (JSON Lines)
Ingestion Preparation
- Split Data
  - The original hits.json (216 GB) is split into 39,999 files, each with 2,500 rows
  - Rationale: Smaller files improve ingestion parallelism and avoid large object overhead
- Format Conversion
  - Each file is converted into a valid JSON array for optimal ingestion into ParseableDB
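The two preparation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the benchmark: the function name `split_jsonl_to_arrays`, the `hits_00000.json` output naming, and the output directory are assumptions made for the example. Only the 2,500-row chunk size and the JSON Lines to JSON array conversion come from the description above.

```python
import json
from itertools import islice
from pathlib import Path


def split_jsonl_to_arrays(source, out_dir, rows_per_file=2500):
    """Split a JSON Lines file into files that each hold one JSON array.

    Hypothetical helper: the name and output layout are illustrative only.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with open(source, "r", encoding="utf-8") as handle:
        # Skip blank lines so every chunk holds only real records.
        records = (line for line in handle if line.strip())
        index = 0
        while True:
            # Pull the next batch of lines without loading the whole file.
            chunk = list(islice(records, rows_per_file))
            if not chunk:
                break
            # Parsing each line validates it before it is written back out.
            rows = [json.loads(line) for line in chunk]
            target = out / f"hits_{index:05d}.json"  # assumed naming scheme
            with open(target, "w", encoding="utf-8") as sink:
                json.dump(rows, sink)  # one valid JSON array per file
            written.append(target)
            index += 1
    return written
```

A call such as `split_jsonl_to_arrays("hits.json", "chunks")` streams the input, so memory use stays bounded by one chunk rather than by the 216 GB file.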
Benchmark Workflow
1. Ingest Data
./ingestion.sh
This script performs parallel ingestion of all prepared JSON files into ParseableDB.
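The contents of ingestion.sh are not reproduced on this page. As a rough sketch of what parallel ingestion can look like, the Python below posts each prepared file to Parseable's HTTP ingest endpoint from a thread pool. It assumes the `/api/v1/ingest` endpoint with an `X-P-Stream` header and HTTP basic authentication; the base URL, stream name, credentials, worker count, and the helper names `build_request`, `send`, and `ingest_all` are all placeholders for illustration, and the actual script may work differently.

```python
import base64
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_request(base_url, stream, username, password, payload):
    """Build a POST request carrying one JSON array (assumed endpoint and headers)."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/api/v1/ingest",
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-P-Stream": stream,  # target stream name (placeholder value)
            "Authorization": f"Basic {token}",
        },
    )


def send(request):
    """Send one request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.status


def ingest_all(paths, base_url, stream, username, password, workers=8, sender=send):
    """Ingest every file concurrently; results come back in input order."""

    def ingest_one(path):
        payload = Path(path).read_bytes()
        return sender(build_request(base_url, stream, username, password, payload))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(ingest_one, paths))
```

The `sender` parameter exists so the upload step can be swapped out, for example with a stub during a dry run. Threads are a reasonable fit here because the work is dominated by network I/O rather than CPU.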
2. Execute Queries
We used the standard 43 queries provided in queries.sql from ClickBench.
./run_query.sh
Each query is executed three times:
- Cold Run: The first execution, with the page cache cleared
- Hot Run: The average of the next two executions, which benefit from the page cache
All timings are recorded in result.csv.
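To make the cold and hot definitions concrete, the sketch below reduces three timings per query to one cold value and one hot value and writes them to a CSV file. The column names and the helper names `summarize_runs` and `write_results` are assumptions for illustration; the actual layout of result.csv produced by run_query.sh may differ.

```python
import csv
from statistics import mean


def summarize_runs(timings):
    """Return (cold, hot) from three timings in seconds.

    cold is the first run; hot is the mean of the second and third runs.
    """
    if len(timings) != 3:
        raise ValueError("expected exactly three timings per query")
    cold, *warm = timings
    return cold, mean(warm)


def write_results(rows, path="result.csv"):
    """Write one line per query; rows is an iterable of (query_id, timings)."""
    with open(path, "w", newline="") as sink:
        writer = csv.writer(sink)
        # Assumed header; not necessarily the columns of the real result.csv.
        writer.writerow(["query", "cold_seconds", "hot_seconds"])
        for query_id, timings in rows:
            cold, hot = summarize_runs(timings)
            writer.writerow([query_id, f"{cold:.3f}", f"{hot:.3f}"])
```

Keeping the cold run separate from the hot average matters because the first execution includes disk reads that later executions avoid.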
Results
Parseable demonstrates:
- Fast ingestion of JSON-based semi-structured data
- Low-latency query execution, even on large datasets
- Efficient memory and CPU usage due to a tight internal schema-on-read pipeline
Observability vs OLAP Benchmarks
While ClickBench provides a standardized baseline, it's important to note:
- The ClickBench dataset (~216 GB) is small compared to real-world observability workloads (often >5 TB/day)
- ClickBench queries are analytical and do not target log search or time-series analytics
- Observability workloads require high ingestion throughput, support for semi-structured and unstructured data, and low-latency textual search
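A quick back-of-the-envelope calculation shows the scale gap in the first point. The sketch below converts a daily volume into a sustained ingest rate and estimates how long the ClickBench dataset would take to arrive at that rate. It assumes decimal units (1 TB = 10^12 bytes) and a perfectly even arrival rate, which real workloads rarely have; the function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_mb_per_s(bytes_per_day):
    """Average ingest rate in MB/s needed to absorb a given daily volume."""
    return bytes_per_day / SECONDS_PER_DAY / 1e6


def seconds_to_ingest(dataset_bytes, bytes_per_day):
    """Time for a dataset to arrive at the average rate of a daily volume."""
    return dataset_bytes / (bytes_per_day / SECONDS_PER_DAY)


# 5 TB/day and 216 GB are the figures quoted above (decimal units assumed).
rate = sustained_rate_mb_per_s(5e12)
duration = seconds_to_ingest(216e9, 5e12)
```

Under these assumptions, 5 TB/day works out to roughly 58 MB/s sustained, and a volume equal to the whole ClickBench dataset arrives in about an hour.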
Limits
For optimal performance, we recommend the following specifications for each node type:
- Prism (leader): 4 vCPU, 8 GiB memory
- Query: 16 vCPU, 32 GiB memory
- Ingest: 8 vCPU, 16 GiB memory
- Index: 16 vCPU, 32 GiB memory