EvolutionaryScale Introduces ESM3: A Frontier Multimodal Generative Language Model that Reasons Over the Sequence, Structure, and Function of Proteins
What is ESM3's primary function in protein engineering?

ESM3 is a generative language model designed for protein engineering. Its primary function is to simulate evolutionary processes in order to create functional proteins that differ substantially from known ones. Because it reasons jointly over sequence, structure, and function, ESM3 can generate proteins that follow complex prompts, with potential applications in drug discovery, materials science, and carbon capture.
What unique protein did ESM3 generate?

ESM3 generated a new green fluorescent protein (GFP) called esmGFP, which shares only 58% sequence identity with the closest known fluorescent protein. Natural fluorescent proteins that are similarly distant from one another are separated by more than 500 million years of evolution.
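Distances like the one quoted above are usually expressed as percent sequence identity over an alignment. The following is a minimal sketch of that arithmetic, not the procedure used for esmGFP: the function name `percent_identity` and the toy sequences are hypothetical, and it assumes the two sequences are already aligned, with `-` marking gaps and gapped columns excluded from the denominator (other conventions exist).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns where both residues match.

    Assumes the inputs are pre-aligned strings of equal length, using '-'
    as the gap character. Columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Keep only columns where both sequences have a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0

    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Toy example: 4 of the 5 gap-free columns match.
print(percent_identity("MSKG-E", "MSRGAE"))
```

Under this convention, a value of 58% would mean that 42% of the compared positions differ.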
How does ESM3 integrate sequence, structure, and function?

ESM3 integrates sequence, structure, and function by representing each of them as a track of discrete tokens at both the input and the output. The model is a stack of transformer blocks in which all tracks are fused into a single latent space. Geometric attention in the first block allows the model to be conditioned on atomic coordinates. Together, these design choices enable ESM3 to generate diverse, high-quality proteins that differ significantly from known natural proteins.
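The idea of fusing several token tracks into one latent space can be illustrated with a small sketch. This is not ESM3's implementation: the vocabulary sizes, the width `D_MODEL`, the random embedding tables, and the function `fuse_tracks` are all hypothetical, and summing per-track embeddings is just one simple way to combine tracks. The transformer blocks and geometric attention are omitted entirely.

```python
import random

random.seed(0)

# Hypothetical vocabulary sizes and latent width, chosen only for illustration.
VOCAB = {"sequence": 32, "structure": 4096, "function": 256}
D_MODEL = 8


def make_table(rows: int, cols: int) -> list[list[float]]:
    """Random embedding table standing in for learned parameters."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# One embedding table per track.
embed = {name: make_table(size, D_MODEL) for name, size in VOCAB.items()}


def fuse_tracks(tokens: dict[str, list[int]]) -> list[list[float]]:
    """Sum per-track token embeddings into one latent vector per residue."""
    lengths = {len(ids) for ids in tokens.values()}
    if len(lengths) != 1:
        raise ValueError("all tracks must cover the same number of residues")
    length = lengths.pop()

    latent = [[0.0] * D_MODEL for _ in range(length)]
    for name, ids in tokens.items():
        for pos, tok in enumerate(ids):
            row = embed[name][tok]
            for d in range(D_MODEL):
                latent[pos][d] += row[d]
    return latent


# A toy four-residue protein described on three tracks (made-up token ids).
tokens = {
    "sequence": [5, 9, 3, 7],
    "structure": [100, 2047, 8, 15],
    "function": [0, 0, 12, 12],
}
latent = fuse_tracks(tokens)
print(len(latent), len(latent[0]))
```

In this sketch each residue position ends up with a single vector that carries information from all three tracks, which is the property the shared latent space provides; a real model would pass these vectors through its transformer blocks and decode them back into per-track tokens.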