Case Study: Optimizing Changan Automobile's V2X Platform with Apache IoTDB

(This article is provided by the Changan Automobile Intelligent Technology Research Institute.)

Business Scenario Introduction

Changan Automobile is one of the leading automotive manufacturers in China, renowned for its extensive product line and innovative technologies. Changan Automobile not only offers a variety of passenger and commercial vehicles but also stands at the forefront of intelligent connected vehicle technology, particularly in the development of vehicle-to-everything (V2X) platforms.

The Changan Automobile Intelligent Technology Research Institute plays a critical role in the company's transformation towards intelligent technology and has developed V2X services as a core component of its intelligent strategy. Built on cloud computing, big data, the Internet of Things (IoT), and artificial intelligence (AI), the V2X platform enables vehicles to interconnect with the external environment, other vehicles, and traffic infrastructure. Its core platform, VOT, supports real-time online connectivity for millions of vehicles, millisecond-level communication, and comprehensive ecosystem integration. With these features, VOT enables real-time data collection, massive data analysis and computation, and real-time vehicle fault warnings, which together support safe driving and significantly enhance the user experience.

The V2X services include:

  • Core V2X Platform VOT: Designed on a large-scale cloud-native architecture, this platform encompasses core services like remote vehicle control, vehicle status monitoring, event communication, service orchestration, and a rules engine. Utilizing the IoT-native time-series database IoTDB, it ensures stable connectivity for millions of vehicles, handles data concurrency of millions of points per second, and offers high terminal compatibility. It serves as the cloud brain for all Changan vehicles.

  • Data Analysis Platform: Upgraded with Apache Doris, this platform supports real-time processing of data at a scale of billions of records per day and enables queries on billions of data points with second-level response times. This platform has significantly improved the user experience, provided real-time vehicle fault warnings, and ensured safe driving for Changan Automobile.

  • Yunqi Lakehouse Big Data Platform: The company has developed a connected vehicle big data platform based on the Yunqi Lakehouse. This platform addresses the high costs, difficult data usage, and complex maintenance that come with massive data volumes and rapid business growth.

Figure: Overview of Changan Automobile's V2X Services

With the rapid expansion of Changan Automobile's major brands (including Avita, DeepBlue, and Qiyuan) and the exponential growth in intelligent connected vehicles, the VOT platform is under unprecedented pressure. This surge challenges the platform's ability to process data concurrently, reduces efficiency, and drives up storage costs for both real-time and historical data. Vehicle condition data, a core category of vehicle data, has seen an exponential rise in reporting volume due to the vast number of connected vehicles. The daily real-time uplink volume of vehicle condition data has now reached 200 TB, driven by the high number of daily active users. As the core data storage engine of Changan Automobile's V2X services, Apache IoTDB supports high-concurrency read and write operations and manages the long-term storage of historical data.

Business Requirements and Pain Points

Low Performance in Massive Concurrent Writes

Currently, with about 2 million active users during off-peak times, the V2X platform's real-time vehicle condition data upload concurrency has stabilized at hundreds of thousands. The varying needs for vehicle condition templates across different models make dynamic storage an urgent issue.

Compared to traditional vehicles, intelligent vehicles exchange several dozen times more data. With nearly 10 million daily active users, Changan Automobile's V2X platform handles over 500,000 data transmissions per second. Under this data pressure, traditional databases face the dual challenges of high server resource load and low write performance.

Poor Flexibility in Storage and Query

Changan Automobile's previous vehicle condition data storage engine, HBase, showed significant disadvantages when facing these challenges. Its data model is based on row keys, column families, and timestamps, requiring all data access patterns to be designed around this model. If the data access pattern does not align with HBase's data model, query efficiency can decrease. Furthermore, HBase does not support join operations or complex transactions like traditional relational databases, making it less than ideal for applications requiring complex queries. Additionally, HBase queries often involve full table scans, consuming substantial resources and time in large tables. While this issue can be mitigated by using filters to reduce the scanned data volume, it remains a performance bottleneck.
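The cost of a query that does not follow the row-key order can be sketched with a toy model. The Python below is not HBase code: the key layout `<vehicle_id>#<timestamp>`, the sample data, and the function names are all assumptions chosen for illustration. It contrasts a lookup that follows the key order with one that ignores it.

```python
from bisect import bisect_left

# Hypothetical rows keyed as "<vehicle_id>#<timestamp>", kept sorted by key
# the way a row-key-ordered store keeps them.
rows = sorted(
    (
        (f"{vin}#{ts:010d}", {"speed": 40 + ts % 5})
        for vin in ("VIN001", "VIN002", "VIN003")
        for ts in range(1_700_000_000, 1_700_000_005)
    ),
    key=lambda row: row[0],
)
keys = [key for key, _ in rows]


def prefix_scan(prefix):
    """Read only the contiguous key range that starts with `prefix`."""
    start = bisect_left(keys, prefix)
    matched, touched = [], 0
    for key, value in rows[start:]:
        touched += 1
        if not key.startswith(prefix):
            break  # keys are sorted, so nothing further can match
        matched.append((key, value))
    return matched, touched


def filter_scan(predicate):
    """Query that cannot use the key order: every row is inspected."""
    matched = [(k, v) for k, v in rows if predicate(k, v)]
    return matched, len(rows)


by_vehicle, touched_a = prefix_scan("VIN002#")
by_time, touched_b = filter_scan(lambda key, _: key.endswith("#1700000002"))
print(len(by_vehicle), touched_a)  # 5 6
print(len(by_time), touched_b)     # 3 15
```

In this sketch, "all readings of one vehicle" touches only the rows around the matching range, while "all vehicles at one timestamp" has to inspect the whole table, which is the mismatch the paragraph above describes.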

High Cost of Historical Data Storage

HBase, as a column-based storage solution, is suitable for sparse data storage but is inefficient for high-frequency updates and small-batch random read-write operations. Although HBase supports compression algorithms such as GZIP and Snappy to reduce storage space usage, compression increases CPU usage and slows data reads and writes, so HBase struggles to meet the real-time processing demands of large data volumes.

Strained Central Computing Resources

Changan Automobile's original vehicle condition data architecture was based on purely cloud-based HBase storage and relied heavily on the Hadoop ecosystem, which is not lightweight. All computation costs are tightly bound to this ecosystem, placing significant load on core cloud resources. Additionally, although HBase's single-master cluster architecture can still connect to other regions during failures, the master node takes a long time to recover, which degrades performance along the computation chain. HBase's complex architecture also makes it difficult to deploy at edge nodes, so computing pressure remains concentrated in the cloud.

Reasons for Choosing Apache IoTDB

Robust Concurrent Processing with Dynamic Templates

Unlike HBase, which uses fixed templates, IoTDB offers a time-series storage structure that supports CRUD operations on dynamic metadata templates. This dynamic capability allows for the sharing of physical quantity metadata, optimizing both storage and usage costs. IoTDB also supports high concurrency, with a single server capable of handling tens of thousands of concurrent connections per second and high write throughput. A single core can handle tens of thousands of write requests per second, and a single server can achieve write performance of tens of millions of data points per second. In a clustered environment, the write performance can scale linearly, reaching hundreds of millions of data points per second.
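The idea of sharing measurement metadata through a template can be sketched as a small in-memory model. This is a conceptual illustration only, not IoTDB's API: the `Template` and `Registry` classes, the device paths, and the measurement names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    """A named set of measurements shared by every device attached to it."""
    name: str
    measurements: dict = field(default_factory=dict)  # measurement -> data type

    def add(self, measurement, dtype):
        self.measurements[measurement] = dtype

    def drop(self, measurement):
        self.measurements.pop(measurement, None)


class Registry:
    """Maps device paths to templates; metadata is stored once per template."""

    def __init__(self):
        self.templates = {}
        self.device_template = {}  # device path -> template name

    def create_template(self, name, measurements):
        self.templates[name] = Template(name, dict(measurements))

    def attach(self, device, template_name):
        self.device_template[device] = template_name

    def schema_of(self, device):
        return self.templates[self.device_template[device]].measurements


registry = Registry()
registry.create_template("model_a", {"speed": "FLOAT", "soc": "FLOAT"})
for i in range(1000):
    # Hypothetical device paths, one per vehicle of the same model.
    registry.attach(f"root.vot.model_a.VIN{i:04d}", "model_a")

# Updating the template once changes the schema seen by all 1000 vehicles.
registry.templates["model_a"].add("tire_pressure", "FLOAT")
print(sorted(registry.schema_of("root.vot.model_a.VIN0999")))
# ['soc', 'speed', 'tire_pressure']
```

The design point is that the per-vehicle record holds only a reference to the template, so adding or removing a measurement for a vehicle model is one metadata change rather than one change per vehicle.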

Efficient Real-time Read/Write and Compression

IoTDB employs advanced time-series data compression techniques, such as Gorilla encoding, which achieve a high compression ratio while maintaining fast data read and write. This reduces the storage burden for historical vehicle condition data and meets the real-time data usage requirements of the V2X platform.
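Why Gorilla-style encoding compresses slowly changing sensor values well can be seen from its XOR step. The sketch below is a simplified illustration of that one step, not IoTDB's implementation: it XORs the IEEE-754 bit patterns of consecutive values and reports how few bits actually differ. The function names and sample values are assumptions.

```python
import struct


def float_bits(x):
    """Return the 64-bit IEEE-754 pattern of a float as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_profile(values):
    """For each consecutive pair, XOR the bit patterns and report
    (leading_zeros, trailing_zeros, meaningful_bits)."""
    profile = []
    prev = float_bits(values[0])
    for v in values[1:]:
        cur = float_bits(v)
        x = prev ^ cur
        if x == 0:
            # Identical value: a Gorilla-style encoder stores a single flag bit.
            profile.append((64, 0, 0))
        else:
            leading = 64 - x.bit_length()
            trailing = (x & -x).bit_length() - 1
            profile.append((leading, trailing, 64 - leading - trailing))
        prev = cur
    return profile


print(xor_profile([12.0, 12.0, 24.0]))
# [(64, 0, 0), (11, 52, 1)]
```

A repeated reading contributes no differing bits, and the step from 12.0 to 24.0 differs in a single bit, so an encoder only needs to record the position and content of that short window instead of a full 64-bit value.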

Edge-Cloud Computing Architecture

IoTDB's lightweight architecture is suitable for edge devices, offering efficient data management and storage capabilities. At edge nodes, IoTDB supports low-latency queries, making real-time data analysis possible. Data from terminal devices is collected, processed, and stored in real time at the edge using IoTDB, where subsequent analysis tasks also run. The processed data can then be uploaded to the cloud IoTDB, meeting the demands for large-scale data storage, high-speed data ingestion, and complex data analysis in the V2X field.

The deployment of IoTDB on the edge and in the cloud allows for time-series data management across various environments, enhancing data quality and reducing cloud computing costs.

Overview of IoTDB Time Series Data Management Process

The original solution for Changan Automobile's V2X platform involved a straightforward method of reporting vehicle conditions. Real-time vehicle condition data was forwarded through gateways and stored in Redis, while historical vehicle data was stored in HBase.

The new IoTDB-based solution adopts an edge-cloud collaborative computing approach. In this model, some vehicle condition data is processed at the terminal, where it can be integrated, subjected to simple calculations, and temporarily stored based on specific requirements (such as converting data formats to meet national standards or integrating periodic data). Configured data is then uploaded to the cloud, where it is distributed via a rules engine. Leveraging IoTDB's high real-time performance, the system simultaneously handles real-time data pushing, stores real-time data in Redis, archives historical data in IoTDB, and provides unified query interfaces for data access.
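The terminal-side step of integrating periodic data before upload can be sketched as a simple windowed aggregation. This is a minimal sketch under stated assumptions: the 10-second window, the choice of a mean, and the function name are illustrative and are not taken from the VOT platform.

```python
from collections import defaultdict
from statistics import mean


def aggregate_windows(samples, window_s):
    """Group (timestamp_s, value) samples into fixed windows and return
    one (window_start, mean_value, sample_count) tuple per window."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % window_s].append(value)
    return [
        (start, mean(values), len(values))
        for start, values in sorted(buckets.items())
    ]


# Twenty one-second readings collapse into two records to upload.
raw = [(t, float(t % 10)) for t in range(100, 120)]
uploads = aggregate_windows(raw, window_s=10)
print(uploads)
# [(100, 4.5, 10), (110, 4.5, 10)]
```

Keeping the sample count alongside the mean lets the cloud side tell a complete window from a sparse one.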

Figure: VOT Platform Architecture

Benefits of Implementing IoTDB

High Concurrent Write Capability for Vehicle Condition Reporting

For Changan Automobile's scenario of real-time reporting and querying of vehicle condition data for millions of online vehicles, IoTDB achieves a write capacity of over 8 million entries per second and supports horizontal scaling to handle even greater loads.


Currently, Changan Automobile's VOT platform connects to 2 million vehicles in real time, generating a daily data volume of up to 150 billion records. At this scale, the new IoTDB-based system maintains a write latency in the millisecond range, ensuring fast and reliable data ingestion.
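These figures can be related to each other with some back-of-the-envelope arithmetic. The snippet below uses only the numbers quoted in this section; treating the load as evenly spread over the day is an assumption, and real traffic will peak well above the average.

```python
RECORDS_PER_DAY = 150_000_000_000  # "up to 150 billion records" per day
VEHICLES = 2_000_000               # "2 million vehicles"
WRITE_CAPACITY = 8_000_000         # "over 8 million entries per second"
SECONDS_PER_DAY = 86_400

avg_records_per_s = RECORDS_PER_DAY / SECONDS_PER_DAY
records_per_vehicle_per_day = RECORDS_PER_DAY / VEHICLES
headroom = WRITE_CAPACITY / avg_records_per_s

print(f"{avg_records_per_s:,.0f} records/s on average")
# 1,736,111 records/s on average
print(f"{records_per_vehicle_per_day:,.0f} records per vehicle per day")
# 75,000 records per vehicle per day
print(f"{headroom:.1f}x write capacity relative to the daily average")
# 4.6x write capacity relative to the daily average
```

Under that even-load assumption, the stated write capacity leaves several times the average rate available for peak-hour bursts.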

The platform accumulates approximately 200 TB of data daily. After real-time storage processing by IoTDB, the data volume is compressed to about 30 TB, achieving a compression ratio of around 10:1. IoTDB sustains this performance with the current data volume, which covers nearly 90 days.

Efficient Historical Data Queries

For the trillions of historical vehicle condition records at Changan Automobile, IoTDB keeps query latency under 50 milliseconds, fully meeting performance requirements.


The VOT platform's data processing architecture is designed to handle high concurrency and large data volumes effectively. By leveraging IoTDB's robust ecosystem integration, the platform uses advanced data indexing and query optimization techniques to enable rapid data retrieval and analysis.

Moreover, the platform integrates machine learning algorithms for intelligent prediction and vehicle maintenance, further enhancing the efficiency and accuracy of data processing. The application of these technologies not only improves data processing speed and reduces operational costs but also provides users with a more stable and reliable service experience.

Figure: Qiyuan App Homepage