Justin Lubin

Research Mission: To enable scientists to write code with only domain expertise, not programming expertise.

Hi there, I’m Justin! I am a PhD candidate in computer science advised by Sarah E. Chasins at UC Berkeley. I worked with Ravi Chugh as an undergrad at UChicago. I am very grateful for their excellent mentorship and strive to pay it forward!

Research

I co-design programming systems with scientists. These systems empower scientists without a background in computing to write the code they need by themselves. To make new user interactions possible in these systems, I develop programming language theory informed by what I learn from my deep embedding with domain experts and from my mixed-methods human-computer interaction research.

I’m currently immersed in the world of biology to foster a substantive, ongoing, and reciprocal relationship between the fields of programming languages and experimental biology. I have been collaborating most closely with the lovely folks of the Nuñez Lab to advance research in biology and to answer the following question with them:

How can we work with scientists to build a programming system, for them and with them, that empowers them to write the code they need without programming expertise?

Conference and Journal Publications

For more, please see my CV or Google Scholar! (∗ = equal contribution, † = research mentee)

PLDI ’26
Navigating AND–OR Graph Modifications to Debug Failing Proof Search
- Justin Lubin
- Marlena Preigh†
- Max Willsey
- Sarah E. Chasins
- official pdf+doi coming soon!
- preprint
- supplement
- repo
Abstract

Proof search powers our most advanced programming tools, from type systems, to search tactics for interactive theorem provers, to Datalog-backed program analyses. Although proof search tooling is powerful and now pervasive, debugging it is hard, even for experts. When proof search cannot prove the goal, the programmer’s best source of information is a massive AND–OR graph representing the tool’s internal state during the proof search process. The difficulty of understanding and debugging this vast trace of internal state locks programmers out of exactly the high-assurance automated reasoning tools we want them to adopt.

We propose a new formulation of proof search debugging, which: (i) views AND–OR graphs as partial representations of the underlying proof system, (ii) treats debugging as a process of applying modifications to this proof system, and (iii) uses a debugging tool to solicit these modifications until the resulting proof system proves the original goal. This approach unifies decades of ad-hoc strategies in a single general-purpose framework and is applicable to the diverse range of programming tools that use proof search. Our framework can express existing “why-not” debugging strategies as well as new strategies, and we evaluate such strategies on 284 AND–OR graphs. We find that a strategy that enforces a property called Strong Soundness reduces the number of decisions by 1.4×–3.2× compared to an unsound baseline, and a new property we call Strong Completeness Modulo Observability enables pruning to further reduce decisions by 1.0×–2.8× for an overall reduction of 2.0×–3.8×.
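As a rough illustration of the setting described in this abstract (a toy sketch, not the paper's framework or tool), the Python snippet below models a tiny propositional proof system as an AND–OR structure: a goal is provable if any one of its rules applies (OR), and a rule applies if all of its premises are provable (AND). The goal names, the `rules` dictionary, and the `provable` function are hypothetical and exist only for this example. It shows a failing proof search and a single modification to the proof system after which the original goal becomes provable.

```python
# Toy AND-OR proof search (hypothetical example, not the paper's system).
# `rules` maps each goal to a list of alternative rules (OR);
# each rule is a list of premises that must all hold (AND).
# A rule with no premises is a fact.

def provable(goal, rules, seen=frozenset()):
    """Return True if `goal` can be proved from `rules`."""
    if goal in seen:  # do not revisit a goal already on the current path
        return False
    seen = seen | {goal}
    return any(
        all(provable(premise, rules, seen) for premise in premises)
        for premises in rules.get(goal, [])
    )

rules = {
    "typed(e)": [["typed(f)", "typed(x)"]],  # one rule with two premises
    "typed(f)": [[]],                        # a fact
}

# The search fails because nothing proves typed(x).
before = provable("typed(e)", rules)

# One candidate modification: add typed(x) as a fact.
modified = {**rules, "typed(x)": [[]]}
after = provable("typed(e)", modified)
```

In this toy setting, debugging amounts to finding a modification such as the added fact; the paper's contribution concerns how a tool should solicit such modifications, which this sketch does not attempt to model.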
CHI ’26
Programming by Scaffolded Demonstration with Perpend
- Angela Bi∗
- Eric Rawn∗
- Justin Lubin
- Sarah E. Chasins
- pdf
- demo
- repo
Abstract

Output-centric programming paradigms such as Direct Manipulation Programming, Programming By Demonstration, and Programming By Example enable users to author programs by constructing an intended output. However, sometimes the purpose of a programming interaction is to discover an “intended output” in the first place (e.g., exploratory data analysis, improvisational creative coding, early-stage prototyping). We argue that one role for output-centric programming here is scaffolding the user in demonstrating their next program editing step by selecting among possible modifications to their current program. We call this Programming By Scaffolded Demonstration (PBSD). To explore PBSD, we built Perpend, a programming environment for p5.js. In a user study with nine artists, we juxtapose Perpend with an existing Direct Manipulation editor, exploring how participants used Perpend to situate themselves within a space of possible programs, shift focus between program text and visual output, and shape their exploration by modifying their program structure.
Nat. Commun. ’25

Biology paper
Programmable epigenome editing by transient delivery of CRISPR epigenome editor ribonucleoproteins
- Da Xu∗
- Swen Besselink∗
- Gokul N. Ramadoss
- Philip H. Dierks
- Justin P. Lubin
- Rithu K. Pattali
- Jinna I. Brim
- Anna E. Christenson
- Peter J. Colias
- Izaiah J. Ornelas
- Carolyn D. Nguyen
- Sarah E. Chasins
- Bruce R. Conklin
- James K. Nuñez
- repo
Abstract

Programmable epigenome editors modify gene expression in mammalian cells by altering the local chromatin environment at target loci without inducing DNA breaks. However, the large size of CRISPR-based epigenome editors poses a challenge to their broad use in biomedical research and as future therapies. Here, we present Robust ENveloped Delivery of Epigenome-editor Ribonucleoproteins (RENDER) for transiently delivering programmable epigenetic repressors (CRISPRi, DNMT3A-3L-dCas9, CRISPRoff) and activator (TET1-dCas9) as ribonucleoprotein complexes into human cells to modulate gene expression. After rational engineering, we show that RENDER induces durable epigenetic silencing of endogenous genes across various human cell types, including primary T cells. Additionally, we apply RENDER to epigenetically repress endogenous genes in human stem cell-derived neurons, including the reduction of the neurodegenerative disease associated V337M-mutated Tau protein. Together, our RENDER platform advances the delivery of CRISPR-based epigenome editors into human cells, broadening the use of epigenome editing in fundamental research and therapeutic applications.
PLDI ’25
Programming by Navigation
- Justin Lubin
- Parker Ziegler
- Sarah E. Chasins
Selected for MIT PL Review 2026.
- pdf
- presentation
- slides
- errata
- supplement
- repo
Abstract

When a program synthesis task starts from an ambiguous specification, the synthesis process often involves an iterative specification refinement process. We introduce the Programming by Navigation Synthesis Problem, a new synthesis problem adapted specifically for supporting iterative specification refinement in order to find a particular target solution. In contrast to prior work, we prove that synthesizers that solve the Programming by Navigation Synthesis Problem show all valid next steps (Strong Completeness) and only valid next steps (Strong Soundness). To meet the demands of the Programming by Navigation Synthesis Problem, we introduce an algorithm to turn a type inhabitation oracle (in the style of classical logic) into a fully constructive program synthesizer. We then define such an oracle via sound compilation to Datalog. Our empirical evaluation shows that this technique results in an efficient Programming by Navigation synthesizer that solves tasks that are either impossible or too large for baselines to solve. Our synthesizer is the first to guarantee that its specification refinement process satisfies both Strong Completeness and Strong Soundness.
PLDI ’25
Fast Direct Manipulation Programming with Patch-Reconciliation Correspondence
- Parker Ziegler
- Justin Lubin
- Sarah E. Chasins
- pdf
- presentation
- supplement
- repo
Abstract

Direct manipulation programming gives users a way to write programs without directly writing code, by using the familiar GUI-style interactions they know from direct manipulation interfaces. To date, direct manipulation programming environments have relied on two core components: (1) a patch component, which updates the program based on a GUI interaction, and (2) a forward evaluator, which executes the patched program to produce an updated program output. This architecture has worked for developing short-running programs—i.e., programs that reliably execute in <1 second—generating outputs such as SVG and HTML documents. However, direct manipulation programming has not yet been applied to long-running programs (e.g., data visualization, mapping), perhaps because executing such programs in response to every GUI interaction would mean crossing outside of interactive speeds. We propose extending direct manipulation programming to long-running programs by pairing a standard patch component (patch) with a corresponding reconciliation component (recon). recon directly updates the program output in response to a GUI interaction, obviating the need for forward evaluation.

We introduce corresponding patch and recon procedures for the domain of geospatial data visualization and prove them sound—that is, we show that the output produced by recon is identical to the output produced by forward-evaluating a patch-modified program. recon can operate both incrementally and in parallel with patch. Our implementation of our patch-recon instantiation achieves a 2.92× median reduction in interface latency compared to forward evaluation on a suite of real-world geospatial visualization tasks. Looking forward, our results suggest implementations based on patch-recon correspondence are a viable path for extending direct manipulation programming to additional programming domains.
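To make the soundness property in this abstract concrete, here is a minimal Python sketch in a deliberately tiny, hypothetical domain (scaling a list of numbers), not the geospatial visualization domain or the implementation from the paper. The names `evaluate`, `patch`, and `recon` and the dictionary-based "program" are assumptions made only for illustration.

```python
# Toy patch-recon correspondence (hypothetical domain, not the paper's).
# A "program" is a dict holding a scale factor and input data;
# its output is the list of scaled values.

def evaluate(program):
    """Forward evaluator: run the whole program to produce its output."""
    return [program["scale"] * x for x in program["data"]]

def patch(program, new_scale):
    """Patch: update the program in response to a GUI interaction."""
    return {**program, "scale": new_scale}

def recon(output, old_scale, new_scale):
    """Recon: update the existing output directly, without re-running
    the program. Assumes old_scale is nonzero."""
    factor = new_scale / old_scale
    return [y * factor for y in output]

program = {"scale": 2, "data": [1, 2, 3]}
output = evaluate(program)

# A GUI interaction changes the scale from 2 to 4.
patched = patch(program, 4)
reconciled = recon(output, program["scale"], 4)

# Soundness in this toy setting: recon's output matches the output of
# forward-evaluating the patched program.
sound = reconciled == evaluate(patched)
```

The point of the correspondence is that `recon` touches only the existing output, so an interface could display `reconciled` immediately while `patch` updates the program text separately.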
PLDI ’24
Equivalence by Canonicalization for Synthesis-Backed Refactoring
- Justin Lubin
- Jeremy Ferguson∗†
- Jacob Yim∗†
- Kevin Ye∗†
- Sarah E. Chasins
- pdf
- presentation
- supplement
- repo
Abstract

We present an enumerative program synthesis framework called component-based refactoring that can refactor “direct” style code that does not use library components into equivalent “combinator” style code that does use library components. This framework introduces a sound but incomplete technique to check the equivalence of direct code and combinator code called equivalence by canonicalization that does not rely on input-output examples or logical specifications. Moreover, our approach can repurpose existing compiler optimizations, leveraging decades of research from the programming languages community. We instantiated our new synthesis framework in two contexts: (i) higher-order functional combinators such as map and filter in the statically-typed functional programming language Elm and (ii) high-performance numerical computing combinators provided by the NumPy library for Python. We implemented both instantiations in a tool called Cobbler and evaluated it on thousands of real programs to test the performance of the component-based refactoring framework in terms of execution time and output quality. Our work offers evidence that synthesis-backed refactoring can apply across a range of domains without specification beyond the input program.
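The general idea of checking equivalence by canonicalization can be sketched in a few lines of Python. This is a toy illustration over a hypothetical arithmetic expression language, not Cobbler's canonicalizer or the compiler optimizations it reuses: two expressions are declared equivalent when they rewrite to the same canonical form. Here the only rewrites are flattening and sorting the operands of `add` and `mul`, so the check is sound for integer arithmetic but incomplete.

```python
# Toy equivalence-by-canonicalization check (hypothetical language).
# Expressions are int literals, variable names (str), or tuples of the
# form ("add" | "mul", operand, operand, ...).

def canonicalize(expr):
    """Flatten nested uses of the same operator and sort the operands,
    which is valid because add and mul are associative and commutative."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    flat = []
    for arg in map(canonicalize, args):
        if isinstance(arg, tuple) and arg[0] == op:
            flat.extend(arg[1:])  # (a + b) + c becomes add(a, b, c)
        else:
            flat.append(arg)
    return (op, *sorted(flat, key=repr))

def equivalent(e1, e2):
    """Sound but incomplete: True only if the canonical forms match."""
    return canonicalize(e1) == canonicalize(e2)

# x + (y + 1) and (1 + y) + x share a canonical form.
same = equivalent(("add", "x", ("add", "y", 1)),
                  ("add", ("add", 1, "y"), "x"))

# x * (y + z) and x*y + x*z are semantically equal, but this toy
# canonicalizer has no distributivity rule, so it cannot relate them.
missed = equivalent(("mul", "x", ("add", "y", "z")),
                    ("add", ("mul", "x", "y"), ("mul", "x", "z")))
```

No input-output examples or logical specifications are involved: the check is purely a comparison of rewritten syntax, which is why adding more rewrites can only increase the number of equivalences detected.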
UIST ’22
Exploring the Learnability of Program Synthesizers by Novice Programmers
- Dhanya Jayagopal∗†
- Justin Lubin∗
- Sarah E. Chasins
- pdf
- presentation
- slides
Abstract

Modern program synthesizers are increasingly delivering on their promise of lightening the burden of programming by automatically generating code, but little research has addressed how we can make such systems learnable to all. In this work, we ask: What aspects of program synthesizers contribute to and detract from their learnability by novice programmers? We conducted a thematic analysis of 22 observations of novice programmers, during which novices worked with existing program synthesizers, then participated in semi-structured interviews. Our findings shed light on how their specific points in the synthesizer design space affect these tools’ learnability by novice programmers, including the type of specification the synthesizer requires, the method of invoking synthesis and receiving feedback, and the size of the specification. We also describe common misconceptions about what constitutes meaningful progress and useful specifications for the synthesizers, as well as participants’ common behaviors and strategies for using these tools. From this analysis, we offer a set of design opportunities to inform the design of future program synthesizers that strive to be learnable by novice programmers. This work serves as a first step toward understanding how we can make program synthesizers more learnable by novices, which opens up the possibility of using program synthesizers in educational settings as well as developer tooling oriented toward novice programmers.
OOPSLA ’21
How Statically-Typed Functional Programmers Write Code
- Justin Lubin
- Sarah E. Chasins
- pdf
- presentation
- slides
- poster
Abstract

How working statically-typed functional programmers write code is largely understudied. And yet, a better understanding of developer practices could pave the way for the design of more useful and usable tooling, more ergonomic languages, and more effective on-ramps into programming communities. The goal of this work is to address this knowledge gap: to better understand the high-level authoring patterns that statically-typed functional programmers employ. We conducted a grounded theory analysis of 30 programming sessions of practicing statically-typed functional programmers, 15 of which also included a semi-structured interview. The theory we developed gives insight into how the specific affordances of statically-typed functional programming affect domain modeling, type construction, focusing techniques, exploratory and reasoning strategies, and expressions of intent. We conducted a set of quantitative lab experiments to validate our findings, including that statically-typed functional programmers often iterate between editing types and expressions, that they often run their compiler on code even when they know it will not successfully compile, and that they make textual program edits that reliably signal future edits that they intend to make. Lastly, we outline the implications of our findings for language and tool design. The success of this approach in revealing program authorship patterns suggests that the same methodology could be used to study other understudied programmer populations.
ICFP ’20
Program Sketching with Live Bidirectional Evaluation
- Justin Lubin
- Nick Collins
- Cyrus Omar
- Ravi Chugh
- pdf
- presentation
- slides
- website
- honors thesis
- poster
- repo
Abstract

We present a system called Smyth for program sketching in a typed functional language whereby the concrete evaluation of ordinary assertions gives rise to input-output examples, which are then used to guide the search to complete the holes. The key innovation, called live bidirectional evaluation, propagates examples "backward" through partially evaluated sketches. Live bidirectional evaluation enables Smyth to (a) synthesize recursive functions without trace-complete sets of examples and (b) specify and solve interdependent synthesis goals. Eliminating the trace-completeness requirement resolves a significant limitation faced by prior synthesis techniques when given partial specifications in the form of input-output examples.

To assess the practical implications of our techniques, we ran several experiments on benchmarks used to evaluate Myth, a state-of-the-art example-based synthesis tool. First, given expert examples (and no partial implementations), we find that Smyth requires on average 66% of the number of expert examples required by Myth. Second, we find that Smyth is robust to randomly-generated examples, synthesizing many tasks with relatively few more random examples than those provided by an expert. Third, we create a suite of small sketching tasks by systematically employing a simple sketching strategy to the Myth benchmarks; we find that user-provided sketches in Smyth often further reduce the total specification burden (i.e. the combination of partial implementations and examples). Lastly, we find that Leon and Synquid, two state-of-the-art logic-based synthesis tools, fail to complete several tasks on which Smyth succeeds.
UIST ’19
Sketch-n-Sketch: Output-Directed Programming for SVG
- Brian Hempel
- Justin Lubin
- Ravi Chugh
- pdf
- demo
- repo
Abstract

For creative tasks, programmers face a choice: Use a GUI and sacrifice flexibility, or write code and sacrifice ergonomics?

To obtain both flexibility and ease of use, a number of systems have explored a workflow that we call output-directed programming. In this paradigm, direct manipulation of the program's graphical output corresponds to writing code in a general-purpose programming language, and edits not possible with the mouse can still be enacted through ordinary text edits to the program. Such capabilities provide hope for integrating graphical user interfaces into what are currently text-centric programming environments.

To further advance this vision, we present a variety of new output-directed techniques that extend the expressive power of Sketch-n-Sketch, an output-directed programming system for creating programs that generate vector graphics. To enable output-directed interaction at more stages of program construction, we expose intermediate execution products for manipulation and we present a mechanism for contextual drawing. Looking forward to output-directed programming beyond vector graphics, we also offer generic refactorings through the GUI, and our techniques employ a domain-agnostic provenance tracing scheme.

To demonstrate the improved expressiveness, we implement a dozen new parametric designs in Sketch-n-Sketch without text-based edits. Among these is the first demonstration of building a recursive function in an output-directed programming setting.
ICSE ’18
Deuce: A Lightweight User Interface for Structured Editing
- Brian Hempel
- Justin Lubin
- Grace Lu
- Ravi Chugh
- pdf
- demo
- repo
Abstract

We present a structure-aware code editor, called Deuce, that is equipped with direct manipulation capabilities for invoking automated program transformations. Compared to traditional refactoring environments, Deuce employs a direct manipulation interface that is tightly integrated within a text-based editing workflow. In particular, Deuce draws (i) clickable widgets atop the source code that allow the user to structurally select the unstructured text for subexpressions and other relevant features, and (ii) a lightweight, interactive menu of potential transformations based on the current selections. We implement and evaluate our design with mostly standard transformations in the context of a small functional programming language. A controlled user study with 21 participants demonstrates that structural selection is preferred to a more traditional text-selection interface and may be faster overall once users gain experience with the tool. These results accord with Deuce's aim to provide human-friendly structural interactions on top of familiar text-based editing.

Preprints

Biology paper
LEMONmethyl-seq: Targeted long-read DNA methylation profiling reveals dynamics of CRISPR epigenome editing and endogenous DNA methylation patterns
- Anna E. Christenson
- Nikita S. Divekar
- Justin P. Lubin
- Luis G. Palma
- Peter J. Colias
- Rithu K. Pattali
- Da Xu
- Akane Hubbard
- Katie Lin
- Ngan T. Phan
- Bernardo D. Moreno
- Sarah E. Chasins
- S. John Liu
- James K. Nuñez
- repo
Abstract

Background: DNA methylation is the most prevalent epigenetic modification in human cells and undergoes dynamic changes during cell differentiation, disease progression, and aging. Here, we introduce Locus-Enriched Mapping Of Nucleotide methylation (LEMONmethyl-seq): an optimized, cost-effective pipeline for single-nucleotide detection of DNA methylation using locus-specific amplification and long-read DNA sequencing.

Results: We apply LEMONmethyl-seq to profile DNA methylation of endogenous gene promoters across different cell types along with DNA methylation establishment and long-range propagation induced by CRISPR epigenome editing technologies. We profile dynamic changes in DNA methylation patterns on transposable element genomic loci during global epigenetic resetting in stem cells, and we identify site-specific enrichment of non-canonical CpH methylation on genomic sites in stem cells and cultured neurons. Lastly, we apply LEMONmethyl-seq to profile DNA methylation across the MGMT promoter, a clinical biomarker for glioblastoma. We identify additional differentially methylated sites correlated with chemotherapeutic sensitivity, which may be clinically relevant.

Conclusions: Together, LEMONmethyl-seq serves as a cost-effective, long-read DNA methylation sequencing pipeline that advances methods for detecting DNA methylation patterns and dynamics in mammalian cells. We envision its broad use for studying chromatin pathways, diagnostics, and therapeutic applications.
Biology paper
Transcriptional regulation of disease-relevant microglial activation programs
- Amanda McQuade
- Reet Mishra
- Venus Hagan
- Weiwei Liang
- Peter J. Colias
- Vincent Cele Castillo
- Justin P. Lubin
- Verena Haage
- Victoria Marshe
- Masashi Fujita
- Layla Gomes
- Thomas Ta
- Olivia Teter
- Sarah E. Chasins
- Philip L. De Jager
- James K. Nuñez
- Martin Kampmann
- popular press coverage
Abstract

Microglia, the brain’s innate immune cells, can adopt a wide variety of activation states relevant to health and disease. Dysregulation of microglial activation occurs in numerous brain disorders, and driving or inhibiting specific states could be therapeutic. To discover regulators of microglial activation states, we conducted CRISPR interference screens in iPSC-derived microglia for inhibitors and activators of six microglial states. We identified transcriptional regulators for each of these states and characterized 31 regulators at the single-cell transcriptomic and cell-surface proteome level in two distinct iPSC-derived microglia models. Finally, we functionally characterized several regulators. STAT2 knockdown inhibits interferon response and lysosomal function. PRDM1 knockdown drives disease-associated and lipid-rich signatures and enhanced phagocytosis. DNMT1 knockdown results in widespread loss of methylation, activating negative regulators of interferon signaling. These findings provide a framework to direct microglial activation to selectively enrich microglial activation states, define their functional outputs, and inform future therapies.

Fun Stuff

Stop by the Elm Town podcast to hear a lovely conversation I had with the host, Jared M. Smith!
Take a listen to some of my music or play my small musical video game!

Justin Lubin (he/him)
