Big Data Benchmarking


Tilmann Rabl
Postdoctoral Researcher
Middleware Systems Research Group
University of Toronto


Date and Location: Monday, March 31st, 2014, 11:15am - 12:15pm @ SB 225.

Abstract

In many fields of research and business, ever-growing amounts of data are stored and processed. The pace at which storage prices have fallen and methods for monetizing large-scale data analysis have been discovered has come as a surprise to traditional database system vendors. This has led to the development of big data systems. Big data tasks are typically end-to-end problems, but due to the pace of development and the lack of standards, a plethora of system components has been developed and an endless number of combinations is deployed. This makes comparing big data systems a hard task.

In his talk, Tilmann will present his work on big data benchmarking and data generation. One of the leading efforts in this field is BigBench, an end-to-end benchmark for big data analytics. It comprises a set of queries that are specific to big data workloads and a data model that contains structured, semi-structured, and unstructured data. Tilmann will give further details on the data generation for BigBench, which is done using PDGF. Finally, he will give an overview of other activities in this area.

Biography

Tilmann Rabl is a postdoctoral researcher at the Middleware Systems Research Group at the University of Toronto. His research focuses on big data storage management, new hardware for big data systems, big data analytics, database systems architecture, and benchmarking. During his PhD studies, he developed the Parallel Data Generation Framework (PDGF), a generic data generator for benchmarking. For his work on data generation, he received a Technical Contribution Award from the Transaction Processing Performance Council (TPC). PDGF is the basis of the data generator for a new TPC benchmark for data integration and has been commercialized by the start-up company bankmark. In his doctoral research, Tilmann focused on data distribution in distributed databases. His doctoral thesis was nominated for the SPEC Distinguished Dissertation Award 2012 and received an honorable mention. Tilmann is a co-founder and member of the steering committee of the Workshop on Big Data Benchmarking Series and the Big Data Benchmarking Community.