Enabling Real-Time Analytics for Hadoop Data Lakes with GridGain

This webinar discusses how an in-memory computing platform such as GridGain or Apache Ignite can modernize existing data lake architectures, enabling real-time analytics that spans operational, historical, and streaming data sets.

Data lakes, such as those powered by Hadoop, are an excellent choice for analytics and reporting at scale. Hadoop scales horizontally and cost-effectively and handles long-running operations that span large data sets. However, the continual growth of real-time analytics requirements, where operations need to complete in seconds rather than minutes, or in milliseconds rather than seconds, has brought new challenges to Hadoop-based solutions.

In this session, Denis Magda, GridGain VP of Product and Apache Ignite PMC Chair, describes:

  • How to choose the right deployment mode and responsibilities when working with GridGain and Hadoop
  • How to determine which operations should be handled by GridGain and which should be sent to Hadoop
  • How to use Spark DataFrames to run federated (also known as cross-database) queries that span GridGain and Hadoop
  • How to perform initial data loading from Hadoop to GridGain
  • How to set up bi-directional synchronization between Hadoop and GridGain

 

Speakers
Denis Magda
VP, Developer Relations in R&D at GridGain; Apache Ignite committer and PMC member

Denis Magda is an open-source software enthusiast who began his journey in the technology evangelism group at Sun Microsystems and then joined the Java engineering team at Oracle. During his years at Sun and Oracle, Denis became a seasoned Java professional, deepening and expanding his knowledge of the technology by contributing to the Java Development Kit, architecting Java solutions, and building local Java communities. Denis now continues his journey by supporting the Apache Software Foundation and working with GridGain Systems. For the foundation, he contributes to Apache Ignite as a committer and a member of the Project Management Committee. As the head of the GridGain Developer Relations team, Denis works with software engineers and architects to help them develop their expertise in in-memory computing. You will find Denis at conferences, workshops, and other events, sharing his knowledge about Apache Ignite, distributed systems, and open-source communities.