Apache IoTDB at the Core: New Records in TPCx-IoT Benchmarking

On August 6th, 2024, the Transaction Processing Performance Council (TPC) released TPCx-IoT benchmark results, highlighting TimechoDB (based on Apache IoTDB) for setting new records in both performance and system costs.[1]

Figure: TPCx-IoT Top Performance Results

What Is TPC?

The TPC is a non-profit organization established in August 1988. It focuses on creating industry standards for data-centric workloads and providing vendor-neutral performance data to the industry.[2] Over the past decades, the TPC has strongly influenced how the computing industry uses industry-standard benchmarks, and its benchmarks are widely regarded as an authoritative yardstick for database functionality and performance.

The purpose of TPC benchmarks is to provide relevant, objective, and verifiable performance data to industry users. To achieve that purpose, the TPC Benchmark Specifications require that benchmark tests be implemented with systems, products, technologies and pricing that:

  • Are commercially available;

  • Are generally available to all users;

  • Are relevant to the market segment that the individual TPC benchmark models;

  • Would plausibly be implemented by a significant number of users in the market segment the benchmark models.

Why Is TPCx-IoT Important?

IoT adoption across industries has triggered a massive influx of data, which must be analyzed to yield actionable insights. A typical IoT topology consists of three tiers: edge devices, gateway systems, and a backend data center. While existing workloads address backend data centers, there has been no realistic or proven benchmark for comparing solutions at the gateway level. The TPC developed the TPC Express Benchmark™ IoT (TPCx-IoT) to fill this gap.[3]

TPCx-IoT provides an objective measure of the hardware, operating system, data storage, and data management systems involved. It gives the industry verifiable performance, price-performance, and availability metrics for systems that ingest and persist massive amounts of data from a large number of devices while providing real-time insights, as is typical of IoT gateway systems running commercially available software and hardware.

The benchmark uses the operational model of a typical electric utility provider with thousands of power substations: the system under test ingests massive amounts of data from large numbers of devices while running real-time analytic queries. Its flexible design allows TPCx-IoT to be used to assess a broad range of system topologies and implementation methodologies in a technically rigorous and directly comparable manner.
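To make the two headline metrics concrete, here is a minimal sketch of how a throughput figure and a price-performance figure relate. It assumes throughput is the number of ingested records divided by the elapsed time of the measured run, and that price-performance is the total system price per thousand units of throughput; the authoritative definitions are in the TPCx-IoT specification. The function names and all numbers are hypothetical and are not taken from any published result.

```python
def iotps(records_ingested: int, elapsed_seconds: float) -> float:
    """Throughput: ingested records per second (assumed definition)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return records_ingested / elapsed_seconds


def price_per_kiotps(total_system_price: float, throughput_iotps: float) -> float:
    """Price-performance: system price per 1,000 units of throughput (assumed definition)."""
    if throughput_iotps <= 0:
        raise ValueError("throughput_iotps must be positive")
    return total_system_price / (throughput_iotps / 1000.0)


# Hypothetical run: 1,000,000 records ingested in 50 seconds on a 5,000-unit system.
throughput = iotps(1_000_000, 50.0)
cost = price_per_kiotps(5000.0, throughput)
print(f"throughput={throughput:.0f} records/s, price-performance={cost:.2f} per k-records/s")
```

Under these assumptions, a system can improve its price-performance either by ingesting faster on the same hardware or by reaching the same throughput on cheaper hardware.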

Benchmark Results: Leading Databases and TimechoDB’s Record Performance

The TPCx-IoT benchmark currently supports notable NoSQL databases, including Machbase, known for its scalability; Lindorm, a cloud-native database; and HBase, a popular distributed big data store. TimechoDB, based on Apache IoTDB, was newly tested and now ranks at the top of the TPCx-IoT performance results.

This performance is attributed to the design of Apache IoTDB, which combines several advanced techniques. The key factors include, but are not limited to, the following features[4]:

  • LSM-Based Storage Engine: IoTDB's specialized LSM (Log-Structured Merge) storage engine is optimized for write-heavy IoT workloads, particularly for the vast amounts of time series data typical in IoT applications. It defers sorting, encoding, and compression until just before data is flushed to disk, creating a pipeline that fully leverages CPU and disk resources.

  • High-Performance Batch Write Interface: IoTDB natively supports a high-performance batch write interface that buffers data to minimize the overhead associated with frequent function calls and data structure allocations. With batching, data from multiple devices collected over a period of time in TPCx-IoT can be ingested in a single transmission.

  • Compact TsFile Format: The novel Apache TsFile format, based on columnar storage, employs efficient encoding and compression algorithms for each column of data. It enables fast sequential reading and writing of time series data, making it ideal for workloads that require high throughput.

  • Multi-Level Statistical Information: Both in memory and within TsFiles, IoTDB maintains multi-level statistical information that is well suited to range queries in IoT scenarios.
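The ideas in the list above can be illustrated with a toy, in-memory sketch. It is not IoTDB's implementation or API: `ToyStore`, `Chunk`, and all method names are hypothetical, delta encoding stands in for the real encoding and compression algorithms, and a single level of per-chunk statistics stands in for the multi-level statistics. The sketch shows batches appended without sorting, sorting and encoding deferred until flush, and a range query that uses chunk statistics to skip or shortcut work.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Chunk:
    """A flushed, time-sorted run of one series with precomputed statistics."""
    first_time: int
    last_time: int
    count: int
    value_sum: float
    time_deltas: Tuple[int, ...]  # gaps between consecutive timestamps
    values: Tuple[float, ...]

    def points(self) -> List[Tuple[int, float]]:
        """Decode the delta-encoded timestamps back into (time, value) pairs."""
        out, t = [], self.first_time
        for delta, value in zip((0,) + self.time_deltas, self.values):
            t += delta
            out.append((t, value))
        return out


class ToyStore:
    """Hypothetical write buffer plus flushed chunks; queries see flushed data only."""

    def __init__(self) -> None:
        self._buffer: Dict[str, List[Tuple[int, float]]] = {}
        self.chunks: Dict[str, List[Chunk]] = {}

    def write_batch(self, series: str, times: List[int], values: List[float]) -> None:
        """Append a whole batch in one call; no sorting or encoding happens here."""
        if len(times) != len(values):
            raise ValueError("times and values must have the same length")
        self._buffer.setdefault(series, []).extend(zip(times, values))

    def flush(self) -> None:
        """Sort, encode, and summarize buffered points just before 'persisting' them."""
        for series, pts in self._buffer.items():
            if not pts:
                continue
            pts.sort(key=lambda p: p[0])
            times = [t for t, _ in pts]
            vals = [v for _, v in pts]
            deltas = tuple(b - a for a, b in zip(times, times[1:]))
            chunk = Chunk(times[0], times[-1], len(pts), sum(vals), deltas, tuple(vals))
            self.chunks.setdefault(series, []).append(chunk)
        self._buffer.clear()

    def sum_in_range(self, series: str, start: int, end: int) -> float:
        """Sum values with start <= time <= end, using statistics where possible."""
        total = 0.0
        for chunk in self.chunks.get(series, []):
            if chunk.last_time < start or chunk.first_time > end:
                continue  # statistics show no overlap: skip without decoding
            if start <= chunk.first_time and chunk.last_time <= end:
                total += chunk.value_sum  # fully covered: statistics alone suffice
            else:
                total += sum(v for t, v in chunk.points() if start <= t <= end)
        return total


store = ToyStore()
store.write_batch("substation1.voltage", [30, 10, 20], [3.0, 1.0, 2.0])  # out of order
store.flush()
store.write_batch("substation1.voltage", [50, 40], [5.0, 4.0])
store.flush()
print(store.sum_in_range("substation1.voltage", 20, 40))
```

In this sketch the write path does only an append per batch, which keeps ingestion cheap, while the cost of ordering and encoding is paid once per flush. Only chunks that partially overlap the query range need to be decoded.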

Figure: An Overview of the IoTDB System

TimechoDB’s test results further demonstrate the growing importance of optimized data management for IoT applications. Building on the design of IoTDB, TimechoDB adds further optimizations in functionality, usability, and stability, and it achieved outstanding performance on the TPCx-IoT benchmark. Together, these qualities suggest that TimechoDB not only excels in technical benchmarks but also provides reliable, high-performance data storage and retrieval for the practical demands of real-world IoT applications.

Conclusion

The TPCx-IoT benchmark results confirm TimechoDB’s ability to deliver industry-leading performance in handling IoT data. Its combination of innovative technologies, such as the LSM-based storage engine and the TsFile format, enables efficient data ingestion, storage, and real-time analytics, which are essential in modern IoT applications. These results further highlight the growing need for optimized, scalable solutions to manage the enormous influx of time series data in IoT environments. TimechoDB, based on Apache IoTDB, stands at the forefront of this evolution, providing a reliable and powerful database solution for enterprises looking to gain real-time insights from their IoT data.

References

[1] TPCx-IoT Top Performance Results: https://www.tpc.org/tpcx-iot/results/tpcxiot_perf_results5.asp?version=2

[2] Origins of the TPC: https://www.tpc.org/information/about/history5.asp

[3] Introduction of TPCx-IoT: https://www.tpc.org/tpcx-iot/default5.asp

[4] Apache IoTDB: A Time Series Database for IoT Applications: https://dl.acm.org/doi/10.1145/3589775