Differences
— |
news:timo_bingmann_visiting [2018-10-02 13:02] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ====== Timo Bingmann visiting ====== | ||
+ | |||
+ | Timo Bingmann from KIT will visit from October 10 to October 11 and give a talk on "//Thrill: High-Performance Algorithmic Distributed Batch Data | ||
+ | Processing with C++//" in OH14 R202 on October 10 at 14:00. | ||
+ | |||
+ | **Abstract** - We present ongoing work on a new distributed Big Data processing | ||
+ | framework called Thrill. It is a C++ framework consisting of a set of | ||
+ | basic scalable algorithmic primitives like mapping, reducing, sorting, | ||
+ | merging, joining, and additional MPI-like collectives. This set of | ||
+ | primitives goes beyond traditional Map/Reduce and can be combined into | ||
+ | larger, more complex algorithms, such as WordCount, PageRank, k-means | ||
+ | clustering, and suffix sorting. These complex algorithms can then be run | ||
+ | on very large inputs using a distributed computing cluster. One of the | ||
+ | main design goals of Thrill is to lose very little performance when | ||
+ | composing primitives, so that small data types are well supported. | ||
+ | Thrill thus raises the questions of a) how to design algorithms using | ||
+ | the scalable primitives, b) whether additional primitives should be | ||
+ | added, and c) whether the existing ones can be improved using new ideas to | ||
+ | reduce communication volume, and how they can be scaled to very large | ||
+ | cluster sizes. | ||
+ | |||