Oracle In-database Hadoop: When MapReduce Meets RDBMS

The MapReduce programming model lets developers with no experience in parallel and distributed
systems use the resources of a large, multi-CPU system. The Oracle RDBMS has supported the MapReduce paradigm for years through SQL analytics, user-defined pipelined table functions, and aggregation objects. Apache Hadoop is an implementation of the MapReduce model.
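
To make the model concrete, here is a minimal word-count sketch in plain Java. It is an illustration only: it uses neither the Hadoop API nor the Oracle prototype described in this session, and the class and method names (`WordCountSketch`, `map`, `reduce`, `run`) are hypothetical. The developer supplies only the map and reduce functions; the grouping step in `run` stands in for the work a framework would distribute across CPUs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {
    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: sum all counts emitted for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver (single-threaded stand-in for a framework):
    // run map over each line, group values by key, then reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("map reduce map", "reduce in the database")));
    }
}
```

Because `map` handles each line independently and `reduce` handles each key independently, a real framework can run many instances of both in parallel without changes to the user code.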

In this session, we describe a prototype implementation of Oracle In-database Hadoop that lets you
write and execute Hadoop-compatible Java applications directly in the database.
The major advantages of our implementation include:
(1) source compatibility with Hadoop,
(2) minimal dependency on the Apache Hadoop infrastructure,
(3) seamless integration of MapReduce functionality in Oracle SQL, and
(4) better parallelism and efficiency, because data is pipelined through table functions with no intermediate materialization.

Kuassi Mensah

Oracle Corporation

Kuassi is the Director of Product Management in the Oracle Database organization.

He looks after the connectivity of Java applications and frameworks to the Oracle database, including Spring Boot, connection pooling (UCP, HikariCP), Java in the database, microservices and serverless functions, asynchronous and reactive DB access, zero downtime, multi-tenancy, sharded DB, turning database tables into Hadoop and Spark data sources, and the DB Kubernetes Operator.
