New AI Model

mDD-0: mRNA Discrete Diffusion for Generation of Stable mRNA Sequences

A generative model for complete mRNA sequences that deliver custom therapeutic payloads

Authors: Alyssa Morrow, Michal Jastrzebski, Jake Wintermute

Contributors: Siqi Zhao, Justin Gardin, Lood van Niekerk, Valentin Zulkower, Elise Flynn, Joshua Moller, Porfirio Quintero Cadena, Hao Shen, Dana Merrick, Ankit Gupta, Seth Ritter

Ginkgo Bioworks is excited to announce mDD-0, a generative AI model for designing full-length mRNA sequences. Trained on a large dataset of genomic sequences spanning hundreds of species, mDD-0 learns to generate novel mRNA sequences using discrete diffusion. We created mDD-0 with a unique multimodel architecture that lets it learn jointly from the different regions of an mRNA sequence, including the coding sequence (CDS) and the 3' and 5' untranslated regions (UTRs), to produce an integrated mRNA model.

This new model makes mRNA easier to engineer. With mDD-0 you can:

  • Generate diverse mRNA sequences that resemble native genomic mRNA.

  • Fine-tune on payload-specific functional features to improve mRNA stability, protein expression, and translation efficiency.

  • Experimentally validate your mRNA designs by partnering with Ginkgo to access our data generation platform.

Access to the base mDD-0 model is available through Ginkgo's Model API and you can find additional documentation here. If you're interested in building on mDD-0 with additional training data to enhance the efficacy of your payload of interest, contact us today!

mDD-0: mRNA Discrete Diffusion for Generation of Stable mRNA Sequences

Recent advances in AI have enabled remarkable progress in the design and optimization of messenger RNA (mRNA) for RNA vaccines and therapeutics. Work on mRNA design to date has largely focused on individual components of the sequence, and designing mRNA end to end presents a distinct challenge. Specifically, a functional mRNA requires joint optimization of the coding sequence (CDS) for a protein of interest together with the 3' and 5' untranslated regions (UTRs).
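To make the three-region structure concrete, here is a minimal Python sketch of how a full-length mRNA could be represented and sanity-checked. The `MRNA` class and its field names are hypothetical and are not part of mDD-0 or Ginkgo's Model API; the checks rely only on standard molecular biology (a CDS starts with AUG, is read in codons of three nucleotides, and ends with one of the stop codons UAA, UAG, or UGA).

```python
from dataclasses import dataclass
from typing import List

# The three standard stop codons in RNA notation.
STOP_CODONS = {"UAA", "UAG", "UGA"}


@dataclass(frozen=True)
class MRNA:
    """Hypothetical container for the three regions of an mRNA."""
    utr5: str  # 5' untranslated region
    cds: str   # coding sequence, start codon through stop codon
    utr3: str  # 3' untranslated region

    def sequence(self) -> str:
        # Regions are concatenated in 5'-to-3' order.
        return self.utr5 + self.cds + self.utr3

    def cds_problems(self) -> List[str]:
        """Return a list of basic CDS issues; empty means none found."""
        problems = []
        if len(self.cds) % 3 != 0:
            problems.append("CDS length is not a multiple of 3")
        if not self.cds.startswith("AUG"):
            problems.append("CDS does not start with AUG")
        # Split into complete codons, ignoring any trailing partial codon.
        usable = len(self.cds) - len(self.cds) % 3
        codons = [self.cds[i:i + 3] for i in range(0, usable, 3)]
        if not codons or codons[-1] not in STOP_CODONS:
            problems.append("CDS does not end with a stop codon")
        if any(c in STOP_CODONS for c in codons[:-1]):
            problems.append("CDS contains a premature stop codon")
        return problems


if __name__ == "__main__":
    example = MRNA(utr5="GGGAAA", cds="AUGGCUUAA", utr3="CCCUUU")
    print(example.sequence())
    print(example.cds_problems())
```

A check like this only covers the CDS. The UTRs have no comparably simple validity rule, which is part of why optimizing all three regions jointly is harder than optimizing any one of them.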

Diffusion models have recently emerged as a powerful generative framework for protein design and sequence generation; one example is Ginkgo's recently released antibody discrete diffusion model. Building on these advances, we introduce mRNA discrete diffusion (mDD-0), a discrete diffusion model for the generation of mRNA sequences.
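For readers new to discrete diffusion, the toy Python sketch below illustrates one common formulation, absorbing-state (masking) diffusion, on nucleotide tokens. This post does not state which noising process mDD-0 uses, so treat the masking scheme, the `MASK` symbol, and every function name here as assumptions made for illustration. In particular, `toy_denoiser` guesses uniformly at random and only stands in for a trained network.

```python
import random

ALPHABET = "ACGU"  # RNA nucleotides
MASK = "_"         # hypothetical absorbing "mask" token


def mask_tokens(tokens, t, rng):
    """Forward (noising) step: mask each token independently with probability t."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    return [MASK if rng.random() < t else tok for tok in tokens]


def toy_denoiser(tokens, rng):
    """Stand-in for a trained model: fill each masked position uniformly at random."""
    return [rng.choice(ALPHABET) if tok == MASK else tok for tok in tokens]


def generate(length, steps, rng):
    """Reverse (denoising) loop: start fully masked and reveal tokens gradually."""
    tokens = [MASK] * length
    for step in range(steps, 0, -1):
        proposal = toy_denoiser(tokens, rng)
        # Keep a shrinking fraction of positions masked; at step 1 this is 0,
        # so every remaining mask is replaced by the proposal.
        keep_mask_prob = (step - 1) / step
        tokens = [
            MASK if tok == MASK and rng.random() < keep_mask_prob else prop
            for tok, prop in zip(tokens, proposal)
        ]
    return "".join(tokens)


if __name__ == "__main__":
    rng = random.Random(0)
    print("".join(mask_tokens(list("AUGGCUUAA"), t=0.5, rng=rng)))
    print(generate(length=12, steps=4, rng=rng))
```

In a real model the denoiser would be a neural network trained to predict the original tokens from partially masked sequences, and the reveal schedule would follow the training noise schedule rather than the simple `(step - 1) / step` rule used here.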

Access, documentation & example usage

Access to the base mDD-0 model is available through Ginkgo's Model API, and you can find additional documentation here. If you're interested in building on mDD-0 with additional training data to enhance the delivery of your payload of interest, contact us today!