AI’s Big Leap: From Sidekick to Autonomous, Multi-repo Refactoring Powerhouse

Abstract

Generative AI can be a force multiplier for developers, but it also comes with limitations. Developers are expected to co-create with AI and check the generated output, or risk hallucinations running wild. This can aid development on a local machine, but what happens when you try to apply these tools on a massive scale?

For mass-scale code operations, AI needs agency: the ability to operate with some degree of autonomy. In this session, we’ll cover how you can use techniques such as retrieval-augmented generation (RAG), the lossless semantic tree (LST), which is the richest code data source for Java, and OpenRewrite rules-based recipes to drive more efficient and accurate AI model output for refactoring and analyzing large codebases.

We’ll specifically address how you can use AI embeddings as a powerful tool to visualize, analyze, and even sample your codebase more intelligently. We’ll demonstrate using embeddings to perform searches, cluster data, and get a bird’s-eye view of your codebase, as well as to diagnose issues and recommend OpenRewrite recipes that fix them. We’ll also show you how GenAI can help at scale by assisting with writing deterministic OpenRewrite recipes.

Come learn the technical underpinnings for reliably using AI at scale for code modernization.

Justine Gehring

Justine is a researcher in the field of machine learning for code (ML4Code) and Graph Neural Networks (GNNs). She obtained her master’s from McGill and Mila, where her research focused on generating code under challenging circumstances, such as library-specific code. Presently, Justine is a research engineer at Moderne, focusing on leveraging AI for large-scale code refactoring and impact analysis. She also oversees Moderne’s partnership with Mila - Quebec Artificial Intelligence Institute.