Session 2.3 - Doctoral Symposium: Presentations by Ph.D. Students

Chair: Dr. Michael Rodeh, VP, Infrastructure Technologies, IBM Corporate

Speaker: Nathaniel Azuelos, Technion.


Abstract: With the multiplication of cores on a single die, modern chips can be viewed much like the uniprocessors of yore. In these new architectures, each processor can be seen as the equivalent of a pipeline stage. As such, techniques that were developed to accelerate hardware architectures can find their equivalent at a coarser level. Indeed, each core can be assigned a coarse-grained piece of code, and data travels from core to core according to the dependencies between code blocks. The dataflow programming model is well suited to this purpose and has been applied in a variety of ways. We propose to leverage the advantages of a flexible dataflow programming model to implement a variety of speculation techniques, such as coarse-grain branch prediction, value speculation, speculative software pipelining and transactional memory. At such a coarse grain, approximate prediction is often sufficient and provides valuable flexibility. We thus associate with our speculation mechanism a measure of tolerance, where a modest, user-defined level of inaccuracy is permitted. The user provides a measure of accuracy, and the speculation is confirmed or rejected according to that measure. Throughout all these techniques, intelligent scheduling policies are of paramount importance, as are the efficient use of dynamic memory and the system optimizations it consequently requires. We also broach the subject of fault tolerance in this context. Throughout, we implement our techniques and optimizations atop the OpenMP framework, namely the Barcelona Supercomputing Center's NANOS runtime environment and Mercurium compiler. We hope our work will lead to increased performance and efficiency in the rapidly evolving world of parallel computing.
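
The tolerance idea in the abstract above can be illustrated with a minimal Python sketch. It is not the speaker's implementation and does not use OpenMP, NANOS or Mercurium; a thread pool stands in for cores, and the names `speculate`, `slow_mean`, `scale` and `close_enough` are hypothetical. A consumer task starts early on a predicted value, and its result is kept only if a user-supplied accuracy check accepts the prediction once the real value is known.

```python
from concurrent.futures import ThreadPoolExecutor


def speculate(producer, consumer, predicted, within_tolerance):
    """Run `consumer` on a predicted value while `producer` computes the real one.

    Returns (result, confirmed), where `confirmed` says whether the
    speculative work was kept.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        actual_future = pool.submit(producer)
        speculative_future = pool.submit(consumer, predicted)
        actual = actual_future.result()
        if within_tolerance(predicted, actual):
            # Speculation confirmed: keep the work done on the predicted value.
            return speculative_future.result(), True
        # Speculation rejected: discard the speculative result and redo the work.
        return consumer(actual), False


def slow_mean():
    # Stand-in for an expensive upstream task; evaluates to 50.5.
    return sum(range(1, 101)) / 100


def scale(x):
    # Stand-in for the downstream task that depends on the upstream value.
    return 2 * x


def close_enough(predicted, actual, tolerance=0.05):
    # User-defined accuracy measure: relative error of at most 5%.
    return abs(predicted - actual) <= tolerance * abs(actual)


good_result, good_confirmed = speculate(slow_mean, scale, 50.0, close_enough)
bad_result, bad_confirmed = speculate(slow_mean, scale, 10.0, close_enough)
print(good_result, good_confirmed)  # 100.0 True  (approximate result accepted)
print(bad_result, bad_confirmed)    # 101.0 False (recomputed from the real value)
```

Note that a confirmed speculation returns the approximate result (100.0 rather than 101.0), which is the trade-off the user-defined tolerance makes explicit.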


Lecture Title:  A Parallel Twig Join Algorithm for XML Processing using a GPGPU

Speaker: Lila Shnaiderman, Technion.


Abstract: “With an increasing amount of data and demand for fast query processing, the efficiency of database operations continues to be a challenging task. A common approach is to leverage parallel hardware platforms. With the introduction of general-purpose GPU (Graphics Processing Unit) computing, massively parallel hardware has become available within commodity hardware. XML is based on a tree-structured data model. Naturally, the most popular XML querying language (XPath) uses patterns of selection predicates on multiple elements, related by a tree structure. These are often abstracted by twig patterns. Finding all occurrences of such an XML query twig pattern in an XML document is a core operation for XML query processing. We present a new algorithm, GPU-Twig, for matching twig patterns in large XML documents, using a GPU. Our algorithm uses the data and task parallelism of the GPU to perform memory-intensive tasks, whereas the CPU is used to perform I/O and resource management. We therefore efficiently exploit both the high-bandwidth GPU memory interface and the lower-bandwidth CPU main memory. We present the results of extensive experiments with the GPU-Twig algorithm on large XML documents using the DBLP and XMark benchmarks. Experimental results indicate that GPU-Twig significantly reduces the running time of queries in comparison with other algorithms on CPU-based platforms and multicore-based platforms under various settings and scenarios.”
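
To make the notion of a twig pattern concrete, here is a naive, sequential Python sketch. It is not GPU-Twig and involves no GPU; it only shows what "finding occurrences of a twig pattern" means. The assumptions are that a pattern is written as a nested `(tag, [child patterns])` tuple, that every edge is ancestor-descendant (the XPath `//` axis), and that only the elements matching the pattern root are returned. The sample document and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def descendants(node):
    """Yield every proper descendant of `node` in document order."""
    for child in node:
        yield child
        yield from descendants(child)


def matches(node, pattern):
    """True if `node` roots an occurrence of the twig `pattern`."""
    tag, children = pattern
    if node.tag != tag:
        return False
    # Each branch of the twig must be satisfied by some descendant.
    return all(
        any(matches(desc, child) for desc in descendants(node))
        for child in children
    )


def find_twig(root, pattern):
    """Return all elements of the document that root a match of `pattern`."""
    return [node for node in root.iter() if matches(node, pattern)]


# Hypothetical DBLP-like document, made up for this example.
DOC = """
<dblp>
  <article><author>A</author><title>T1</title></article>
  <article><title>T2</title></article>
  <book><author>B</author><title>T3</title></book>
</dblp>
"""

root = ET.fromstring(DOC.strip())
# Twig for the XPath query //article[.//author][.//title]
pattern = ("article", [("author", []), ("title", [])])
titles = [hit.findtext("title") for hit in find_twig(root, pattern)]
print(titles)  # ['T1']
```

This brute-force search re-scans subtrees repeatedly; avoiding that cost on large documents is the kind of problem that dedicated twig join algorithms address.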


Lecture Title:  Ruby on Semantic Web --- solving the impedance mismatch between Object Oriented Programming and the Semantic Web

Speaker: Vadim Eisenberg, Technion.
Supervisor: Yaron Kanza


Abstract: Two of the hardest problems in developing data-processing applications are: (1) integrating data from heterogeneous sources, and (2) handling the inherent discrepancies between the data models of the sources and the models of programming languages, e.g., the object-relational impedance mismatch. The Semantic Web is a set of technologies (RDF, RDFS, OWL, SPARQL) that facilitate data integration. However, it does not solve the impedance mismatch problem. It merely trades the object-relational impedance mismatch for an object-RDF impedance mismatch. In the talk I will illustrate a solution to the impedance mismatch problem: a hybrid of RDF and objects.
Essentially, we modify an object-oriented language so that RDF data items become first-class citizens of the language, and objects of the language become first-class citizens of RDF. The benefits of the hybrid model and of the modified programming language are as follows: (1) it becomes natural to use the language as a persistent programming language, where persistence issues are handled implicitly; (2) tools from both models (such as optimizers, syntax checkers, and query and reasoning engines) can be applied to the artifacts of the unified model; and (3) programming is done over a single unified conceptual model.
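
The flavor of such a hybrid can be hinted at with a small sketch, written here in Python rather than Ruby and not taken from the talk. It assumes a toy in-memory triple set; the classes `Graph` and `Resource` and the `ex:` identifiers are hypothetical. Attribute access on an object is translated directly into RDF-style (subject, predicate, object) triples, so the program and the data share one model.

```python
class Graph:
    """A minimal in-memory set of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def resource(self, uri):
        """Return an object view of the subject identified by `uri`."""
        return Resource(self, uri)


class Resource:
    """Object view of one RDF subject: attribute names act as predicates."""

    def __init__(self, graph, uri):
        # Bypass __setattr__ so these two fields are not stored as triples.
        object.__setattr__(self, "_graph", graph)
        object.__setattr__(self, "_uri", uri)

    def __getattr__(self, predicate):
        # Invoked only when normal lookup fails, i.e. for predicate names.
        values = sorted(
            o for s, p, o in self._graph.triples
            if s == self._uri and p == predicate
        )
        if not values:
            raise AttributeError(predicate)
        return values

    def __setattr__(self, predicate, value):
        # Assignment adds a triple; a predicate may hold several values.
        self._graph.triples.add((self._uri, predicate, value))


g = Graph()
alice = g.resource("ex:alice")
alice.name = "Alice"
alice.knows = "ex:bob"
alice.knows = "ex:carol"

print(alice.knows)                   # ['ex:bob', 'ex:carol']
print(g.resource("ex:alice").name)   # ['Alice']
```

Because every assignment lands in the shared triple set, a second view of the same subject sees the same data, which loosely mirrors the implicit persistence described in benefit (1).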