Distributed Data Lakehouse: Are You Building One?

The fast-growing data industry has produced fragmented solutions and increasingly complex data platforms. Data silos now span data centers, regions, and clouds. There is strong demand for a simpler solution that unifies data lakes and provides efficient data access and management. Alluxio is a large distributed system that forms a new layer between compute engines and storage systems. It provides complete virtualization across all data sources, so applications can read data without needing to know where it is located.

In this talk, we present an approach to architecting an efficient data platform for multiple data pipelines with Spark, Presto, TensorFlow, PyTorch, and Alluxio. The platform is portable across environments, whether private or public clouds, for optimal cost and performance. We will also dive into a few examples of production-level adoption to show how this architecture is used at scale.

Agenda

8:30 - 9:15 am PST: Building a distributed data lakehouse

9:15 - 9:30 am PST: Q&A session

 


The event is finished.

Date

Jan 25 2023

Time

8:30 am - 9:30 am

Organizer

VirtueTech Inc