NECSTFridayTalk – Leveraging Structural Indexes for High-Performance JSON Data Processing on GPUs
Luca Danelutti
NECSTLab Researcher
DEIB - NECSTLab Meeting Room (Building 20)
On Line via Facebook
April 28th, 2023
12.30 pm
Contacts:
Marco Santambrogio
Research Line:
System architectures
Summary
On April 28th, 2023, at 12.30 pm, a new NECSTFridayTalk appointment, "Leveraging Structural Indexes for High-Performance JSON Data Processing on GPUs", will be held in the DEIB NECSTLab Meeting Room by Luca Danelutti, a Computer Science graduate of the Politecnico di Milano and Researcher at the NECSTLab.
Big data and machine learning have become increasingly popular, leading to the need for a standard format for data transmission. JSON (JavaScript Object Notation) is a widely used format for representing structured data across applications. However, as JSON documents grow in size and complexity, JSONPath querying and JSON file parsing have become performance bottlenecks. To address this, we propose GpJSON, an engine that exploits the massively parallel architecture of GPUs to process large JSON documents in parallel from high-level programming languages. Our solution relies on custom auxiliary data structures, such as structural indexes, and on a batching approach to accelerate both parsing and querying. A comparison with current state-of-the-art JSON engines and libraries available in high-level programming languages shows that our solution is 3x to 20x faster while offering the same or an easier interface.
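To give a rough idea of what a structural index is, the sketch below records the offsets of the JSON structural characters (braces, brackets, colons and commas) that lie outside string literals, so that a later query can jump between fields without re-scanning the text. It is a minimal sequential illustration in Python and not GpJSON's actual implementation: the function name build_structural_index and the sample document are assumptions made for this example, and a GPU engine would typically build such an index in parallel over chunks of the input.

```python
def build_structural_index(text: str) -> list[int]:
    """Return offsets of structural characters that lie outside string literals.

    Hypothetical helper for illustration only; not part of GpJSON.
    """
    structural = set("{}[]:,")
    positions = []
    in_string = False   # True while scanning inside a "..." literal
    escaped = False     # True right after a backslash inside a string
    for offset, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False          # this character is escaped, skip it
            elif ch == "\\":
                escaped = True           # next character is escaped
            elif ch == '"':
                in_string = False        # closing quote
        elif ch == '"':
            in_string = True             # opening quote
        elif ch in structural:
            positions.append(offset)     # structural character outside a string
    return positions


doc = '{"a": "x,y", "b": [1, 2]}'
index = build_structural_index(doc)
print(index)  # [0, 4, 11, 16, 18, 20, 23, 24]
```

Note that the comma inside the string "x,y" (offset 8) is not indexed, which is the property that lets a query engine use the index to locate keys and values safely.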
The NECSTLab is a DEIB laboratory, with different research lines on advanced topics in computing systems: from architectural characteristics, to hardware-software codesign methodologies, to security and dependability issues of complex system architectures.
Every week, the “NECSTFridayTalk” invites researchers, professionals or entrepreneurs to share their work experiences and the projects they are working on in the field of computing systems.
The event will be held online via Facebook.