Justin Lubin

Research Mission To enable scientists to write code with only domain expertise, not programming expertise.

Hi there! I’m Justin! I am a PhD candidate in computer science advised by Sarah E. Chasins at UC Berkeley. I worked with Ravi Chugh as an undergrad at UChicago. I am very grateful for their excellent mentorship and strive to pay it forward!

Research

I co-design programming systems with domain experts. These systems empower them to write the code they need by themselves. To make new user interactions possible in these systems, I develop programming language theory informed by what I learn from deeply embedding with domain experts, as well as my qualitative and quantitative human-computer interaction research.

I’m currently immersed in the world of biology to foster a substantive, ongoing, and reciprocal relationship between the fields of programming languages and experimental biology. I have been collaborating most closely with the lovely folks of the Nuñez Lab to advance research in biology and to answer the following question with them:

How can we work with scientists to build a programming system for them and with them to empower them to write the code they need without having programming expertise?

Conference and Journal Publications

For more, please see my CV or Google Scholar! (∗ = equal contribution, † = research mentee)

PLDI ’25
Programming by Navigation
- Justin Lubin
- Parker Ziegler
- Sarah E. Chasins
- pdf
- presentation
- errata
- supplement
- repo
Abstract

When a program synthesis task starts from an ambiguous specification, the synthesis process often involves an iterative specification refinement process. We introduce the Programming by Navigation Synthesis Problem, a new synthesis problem adapted specifically for supporting iterative specification refinement in order to find a particular target solution. In contrast to prior work, we prove that synthesizers that solve the Programming by Navigation Synthesis Problem show all valid next steps (Strong Completeness) and only valid next steps (Strong Soundness). To meet the demands of the Programming by Navigation Synthesis Problem, we introduce an algorithm to turn a type inhabitation oracle (in the style of classical logic) into a fully constructive program synthesizer. We then define such an oracle via sound compilation to Datalog. Our empirical evaluation shows that this technique results in an efficient Programming by Navigation synthesizer that solves tasks that are either impossible or too large for baselines to solve. Our synthesizer is the first to guarantee that its specification refinement process satisfies both Strong Completeness and Strong Soundness.
PLDI ’25
Fast Direct Manipulation Programming with Patch-Reconciliation Correspondence
- Parker Ziegler
- Justin Lubin
- Sarah E. Chasins
- pdf
- presentation
- supplement
- repo
Abstract

Direct manipulation programming gives users a way to write programs without directly writing code, by using the familiar GUI-style interactions they know from direct manipulation interfaces. To date, direct manipulation programming environments have relied on two core components: (1) a patch component, which updates the program based on a GUI interaction, and (2) a forward evaluator, which executes the patched program to produce an updated program output. This architecture has worked for developing short-running programs—i.e., programs that reliably execute in <1 second—generating outputs such as SVG and HTML documents. However, direct manipulation programming has not yet been applied to long-running programs (e.g., data visualization, mapping), perhaps because executing such programs in response to every GUI interaction would mean crossing outside of interactive speeds. We propose extending direct manipulation programming to long-running programs by pairing a standard patch component (patch) with a corresponding reconciliation component (recon). recon directly updates the program output in response to a GUI interaction, obviating the need for forward evaluation.

We introduce corresponding patch and recon procedures for the domain of geospatial data visualization and prove them sound—that is, we show that the output produced by recon is identical to the output produced by forward-evaluating a patch-modified program. recon can operate both incrementally and in parallel with patch. Our implementation of our patch-recon instantiation achieves a 2.92× median reduction in interface latency compared to forward evaluation on a suite of real-world geospatial visualization tasks. Looking forward, our results suggest implementations based on patch-recon correspondence are a viable path for extending direct manipulation programming to additional programming domains.
PLDI ’24
Equivalence by Canonicalization for Synthesis-Backed Refactoring
- Justin Lubin
- Jeremy Ferguson
- ∗†
- Jacob Yim
- ∗†
- Kevin Ye
- ∗†
- Sarah E. Chasins
- pdf
- presentation
- supplement
- repo
Abstract

We present an enumerative program synthesis framework called component-based refactoring that can refactor “direct” style code that does not use library components into equivalent “combinator” style code that does use library components. This framework introduces a sound but incomplete technique to check the equivalence of direct code and combinator code called equivalence by canonicalization that does not rely on input-output examples or logical specifications. Moreover, our approach can repurpose existing compiler optimizations, leveraging decades of research from the programming languages community. We instantiated our new synthesis framework in two contexts: (i) higher-order functional combinators such as map and filter in the statically-typed functional programming language Elm and (ii) high-performance numerical computing combinators provided by the NumPy library for Python. We implemented both instantiations in a tool called Cobbler and evaluated it on thousands of real programs to test the performance of the component-based refactoring framework in terms of execution time and output quality. Our work offers evidence that synthesis-backed refactoring can apply across a range of domains without specification beyond the input program.
UIST ’22
Exploring the Learnability of Program Synthesizers by Novice Programmers
- Dhanya Jayagopal
- ∗†
- Justin Lubin
- ∗
- Sarah E. Chasins
- pdf
- presentation
- slides
Abstract

Modern program synthesizers are increasingly delivering on their promise of lightening the burden of programming by automatically generating code, but little research has addressed how we can make such systems learnable to all. In this work, we ask: What aspects of program synthesizers contribute to and detract from their learnability by novice programmers? We conducted a thematic analysis of 22 observations of novice programmers, during which novices worked with existing program synthesizers, then participated in semi-structured interviews. Our findings shed light on how their specific points in the synthesizer design space affect these tools’ learnability by novice programmers, including the type of specification the synthesizer requires, the method of invoking synthesis and receiving feedback, and the size of the specification. We also describe common misconceptions about what constitutes meaningful progress and useful specifications for the synthesizers, as well as participants’ common behaviors and strategies for using these tools. From this analysis, we offer a set of design opportunities to inform the design of future program synthesizers that strive to be learnable by novice programmers. This work serves as a first step toward understanding how we can make program synthesizers more learnable by novices, which opens up the possibility of using program synthesizers in educational settings as well as developer tooling oriented toward novice programmers.
OOPSLA ’21
How Statically-Typed Functional Programmers Write Code
- Justin Lubin
- Sarah E. Chasins
- pdf
- presentation
- slides
- poster
Abstract

How working statically-typed functional programmers write code is largely understudied. And yet, a better understanding of developer practices could pave the way for the design of more useful and usable tooling, more ergonomic languages, and more effective on-ramps into programming communities. The goal of this work is to address this knowledge gap: to better understand the high-level authoring patterns that statically-typed functional programmers employ. We conducted a grounded theory analysis of 30 programming sessions of practicing statically-typed functional programmers, 15 of which also included a semi-structured interview. The theory we developed gives insight into how the specific affordances of statically-typed functional programming affect domain modeling, type construction, focusing techniques, exploratory and reasoning strategies, and expressions of intent. We conducted a set of quantitative lab experiments to validate our findings, including that statically-typed functional programmers often iterate between editing types and expressions, that they often run their compiler on code even when they know it will not successfully compile, and that they make textual program edits that reliably signal future edits that they intend to make. Lastly, we outline the implications of our findings for language and tool design. The success of this approach in revealing program authorship patterns suggests that the same methodology could be used to study other understudied programmer populations.
ICFP ’20
Program Sketching with Live Bidirectional Evaluation
- Justin Lubin
- Nick Collins
- Cyrus Omar
- Ravi Chugh
- pdf
- presentation
- slides
- website
- honors thesis
- poster
- repo
Abstract

We present a system called Smyth for program sketching in a typed functional language whereby the concrete evaluation of ordinary assertions gives rise to input-output examples, which are then used to guide the search to complete the holes. The key innovation, called live bidirectional evaluation, propagates examples "backward" through partially evaluated sketches. Live bidirectional evaluation enables Smyth to (a) synthesize recursive functions without trace-complete sets of examples and (b) specify and solve interdependent synthesis goals. Eliminating the trace-completeness requirement resolves a significant limitation faced by prior synthesis techniques when given partial specifications in the form of input-output examples.

To assess the practical implications of our techniques, we ran several experiments on benchmarks used to evaluate Myth, a state-of-the-art example-based synthesis tool. First, given expert examples (and no partial implementations), we find that Smyth requires on average 66% of the number of expert examples required by Myth. Second, we find that Smyth is robust to randomly-generated examples, synthesizing many tasks with relatively few more random examples than those provided by an expert. Third, we create a suite of small sketching tasks by systematically employing a simple sketching strategy to the Myth benchmarks; we find that user-provided sketches in Smyth often further reduce the total specification burden (i.e. the combination of partial implementations and examples). Lastly, we find that Leon and Synquid, two state-of-the-art logic-based synthesis tools, fail to complete several tasks on which Smyth succeeds.
UIST ’19
Sketch-n-Sketch: Output-Directed Programming for SVG
- Brian Hempel
- Justin Lubin
- Ravi Chugh
- pdf
- demo
- repo
Abstract

For creative tasks, programmers face a choice: Use a GUI and sacrifice flexibility, or write code and sacrifice ergonomics?

To obtain both flexibility and ease of use, a number of systems have explored a workflow that we call output-directed programming. In this paradigm, direct manipulation of the program's graphical output corresponds to writing code in a general-purpose programming language, and edits not possible with the mouse can still be enacted through ordinary text edits to the program. Such capabilities provide hope for integrating graphical user interfaces into what are currently text-centric programming environments.

To further advance this vision, we present a variety of new output-directed techniques that extend the expressive power of Sketch-n-Sketch, an output-directed programming system for creating programs that generate vector graphics. To enable output-directed interaction at more stages of program construction, we expose intermediate execution products for manipulation and we present a mechanism for contextual drawing. Looking forward to output-directed programming beyond vector graphics, we also offer generic refactorings through the GUI, and our techniques employ a domain-agnostic provenance tracing scheme.

To demonstrate the improved expressiveness, we implement a dozen new parametric designs in Sketch-n-Sketch without text-based edits. Among these is the first demonstration of building a recursive function in an output-directed programming setting.
ICSE ’18
Deuce: A Lightweight User Interface for Structured Editing
- Brian Hempel
- Justin Lubin
- Grace Lu
- Ravi Chugh
- pdf
- demo
- repo
Abstract

We present a structure-aware code editor, called Deuce, that is equipped with direct manipulation capabilities for invoking automated program transformations. Compared to traditional refactoring environments, Deuce employs a direct manipulation interface that is tightly integrated within a text-based editing workflow. In particular, Deuce draws (i) clickable widgets atop the source code that allow the user to structurally select the unstructured text for subexpressions and other relevant features, and (ii) a lightweight, interactive menu of potential transformations based on the current selections. We implement and evaluate our design with mostly standard transformations in the context of a small functional programming language. A controlled user study with 21 participants demonstrates that structural selection is preferred to a more traditional text-selection interface and may be faster overall once users gain experience with the tool. These results accord with Deuce's aim to provide human-friendly structural interactions on top of familiar text-based editing.

Drafts

Bio Paper

(Under review.)
Programmable epigenome editing by transient delivery of CRISPR epigenome editor ribonucleoproteins
- Da Xu
- ∗
- Swen Besselink
- ∗
- Gokul N. Ramadoss
- Philip H. Dierks
- Justin P. Lubin
- Rithu K. Pattali
- Jinna I. Brim
- Anna E. Christenson
- Peter J. Colias
- Izaiah J. Ornelas
- Carolyn D. Nguyen
- Sarah E. Chasins
- Bruce R. Conklin
- James K. Nuñez
- preprint
Abstract

Programmable epigenome editors modify gene expression in mammalian cells by altering the local chromatin environment at target loci without inducing DNA breaks. However, the large size of CRISPR-based epigenome editors poses a challenge to their broad use in biomedical research and as future therapies. Here, we present Robust ENveloped Delivery of Epigenome-editor Ribonucleoproteins (RENDER) for transiently delivering programmable epigenetic repressors (CRISPRi, DNMT3A-3L-dCas9, CRISPRoff) and activator (TET1-dCas9) as ribonucleoprotein complexes into human cells to modulate gene expression. After rational engineering, we show that RENDER induces durable epigenetic silencing of endogenous genes across various human cell types, including primary T cells. Additionally, we apply RENDER to epigenetically repress endogenous genes in human stem cell-derived neurons, including the reduction of the neurodegenerative disease associated V337M-mutated Tau protein. Together, our RENDER platform advances the delivery of CRISPR-based epigenome editors into human cells, broadening the use of epigenome editing in fundamental research and therapeutic applications.

Fun Stuff

Stop by the Elm Town podcast to hear a lovely conversation I had with the host, Jared M. Smith!
Take a listen to some of my music or play my small musical video game!

Research

Conference and Journal Publications

Drafts

Programmable epigenome editing by transient delivery of CRISPR epigenome editor ribonucleoproteins

Fun Stuff