DataFrame - a Swiss Army Knife of Java Data Processing

Track: Core Java

Abstract

Can we use big data techniques without big data infrastructure? As Java developers, we deal with data processing all the time: analyzing app logs, extracting data from Excel, and copying tables between databases, to name a few examples. Yet “standard” Java falls short of the processing capabilities offered by heavier, more complex tools like Spark or Flink.

This talk is about the “DataFrame” - a two-dimensional in-memory table structure that provides filtering, column and row operations, joins, aggregations, window functions, and more. I will use the open-source DFLib library (https://dflib.org) and a Jupyter notebook to demonstrate how to do data processing and visualization in a Java app with DataFrames without much fuss.
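
To make the idea concrete, here is a minimal plain-Java sketch of filtering and aggregation over an in-memory table, using only the standard library. The MiniFrame class and its filter and sumBy methods are hypothetical names invented for this illustration; they are not DFLib's API, and the log rows are made-up sample data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical toy type for illustration only; this is NOT DFLib's API.
public class MiniFrame {
    private final List<String> columns;
    private final List<Object[]> rows;

    public MiniFrame(List<String> columns, List<Object[]> rows) {
        this.columns = List.copyOf(columns);
        this.rows = List.copyOf(rows);
    }

    // Number of rows in the table.
    public int height() {
        return rows.size();
    }

    private int indexOf(String column) {
        int i = columns.indexOf(column);
        if (i < 0) {
            throw new IllegalArgumentException("Unknown column: " + column);
        }
        return i;
    }

    // Filtering: keep only the rows whose value in 'column' passes the condition.
    public MiniFrame filter(String column, Predicate<Object> condition) {
        int i = indexOf(column);
        List<Object[]> kept = new ArrayList<>();
        for (Object[] row : rows) {
            if (condition.test(row[i])) {
                kept.add(row);
            }
        }
        return new MiniFrame(columns, kept);
    }

    // Aggregation: group rows by 'keyColumn' and sum the numeric 'valueColumn'.
    public Map<Object, Double> sumBy(String keyColumn, String valueColumn) {
        int k = indexOf(keyColumn);
        int v = indexOf(valueColumn);
        Map<Object, Double> sums = new LinkedHashMap<>();
        for (Object[] row : rows) {
            sums.merge(row[k], ((Number) row[v]).doubleValue(), Double::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for parsed app log entries.
        MiniFrame log = new MiniFrame(
                List.of("level", "service", "millis"),
                List.of(
                        new Object[]{"ERROR", "auth", 120},
                        new Object[]{"INFO", "auth", 15},
                        new Object[]{"ERROR", "billing", 300},
                        new Object[]{"ERROR", "auth", 80}));

        MiniFrame errors = log.filter("level", "ERROR"::equals);
        System.out.println(errors.height());                    // prints 3
        System.out.println(errors.sumBy("service", "millis"));  // prints {auth=200.0, billing=300.0}
    }
}
```

Even this toy version shows the appeal: the table is treated as a whole, and each operation returns a new result instead of mutating state inside hand-written nested loops.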

Andrus Adamchik

Andrus is a passionate open-source developer and a member of the Apache Software Foundation. He started programming in Java back in 1998 and has since founded a number of open-source projects: Apache Cayenne - a developer-friendly ORM, Bootique.io - a lightweight Java app platform, Agrest.io - a framework for dynamic REST services, and DFLib - a DataFrame structure for Java. In his day job, Andrus is an IT entrepreneur running a software company called ObjectStyle.
