Special Seminar: Sherif Sakr


Department of Computer Science

14.08.2017 / 11:00 - 11:30
lecture room 1171-72, Maarintie 8, 02150, Espoo, FI

Sherif Sakr will give a presentation with the title "Big Data: State-of-the-art: Research Challenges and Opportunities" in lecture room 1171-72, TUAS building.


For about a decade, the Hadoop framework has been recognized as the de facto standard for big data analytics and processing systems. It has been widely employed as an effective solution for harnessing the resources and power of large computing clusters in various application domains. Due to its wide success, major technology companies (e.g., IBM, Oracle and Microsoft) have decided to support the Hadoop framework in their commercial big data processing platforms. Additionally, many emerging startups, such as Cloudera, MapR, Trifacta and Platfora, among many others, have designed their services and solutions based on the Hadoop framework. Recently, however, both the research and industrial communities have identified various limitations of the Hadoop framework, and it has become acknowledged that Hadoop cannot be the one-size-fits-all solution for the various big data analytics challenges. In this talk, we present our view on Big Data 2.0 processing platforms, which represent a new generation of engines (e.g., Spark, Flink, Giraph, GraphLab, Storm) that are domain-specific, dedicated to specific verticals (e.g., structured data, big graphs, data streams), and slowly replacing the Hadoop framework in various contexts. We classify these systems and discuss their technical details and suitable application scenarios. We present an overview of the big data systems we have developed and our research results from the last few years. In addition, we highlight some of the open challenges in the domain of big data processing systems.