De Novo protein design, a revolution in writing the new language of life
Published on May 26, 2026

Preface
For the first time in human history, we are not imitating nature but creating entirely new protein molecules from scratch. We are opening a door that has never been opened before. This is no longer 'reading comprehension' of the code of life; it is writing an entirely new chapter ourselves.
Imagine this: you are an architect, but your task is not to build a house based on any existing blueprint. Instead, you start directly from physical principles and functional requirements, and 'draw' a building that has never existed before—where every atom in this building must be precisely arranged, or the entire structure will collapse.
This is the challenge faced by de novo protein design.
"De novo" comes from Latin, meaning "from the beginning." De novo protein design refers to designing proteins with entirely new structures and functions from scratch, relying solely on physical principles and computational methods, without using any natural protein templates.
Proteins are the 'molecular machines' of life—they catalyze reactions, transmit signals, build structures, and defend against pathogens... Behind almost every intricate life process, proteins are 'at work.' Nature has spent billions of years evolving the rich and diverse world of proteins we see today. But many of the challenges humanity faces—cancer, Alzheimer’s disease, environmental pollution, the need for novel materials—will not wait for us for millions of years.
De novo protein design is, in essence, a way to 'cheat' the timescales of evolution.
From 'patching' to 'creating from scratch': a leap in a field
Traditional protein engineering essentially 'patches and repairs' nature’s existing work: through site-directed mutagenesis, directed evolution, and other methods, it makes local optimizations to natural proteins. Although effective, this approach is limited by the 'evolutionary baggage' of natural proteins: their structural plasticity is limited, major functional leaps are hard to achieve, and some entirely new functions simply do not exist in natural proteins.
The ambition of de novo protein design is much greater: not to modify, but to create. It allows researchers to directly define the target function and then 'reverse-design' the protein sequences and structures that can achieve this function. This process is not constrained by evolutionary history and can explore structural and functional spaces that nature has never 'tried.'
The development of this field has gone through three key stages:

The Development of De novo Design
01
Classical Era (1990s–2020): Early research relied mainly on physics-based energy functions. The Rosetta software suite was the core tool of this era, searching sequence-structure space by minimizing a force-field energy. In 2003, scientists successfully designed Top7, the first stable protein with a novel fold designed entirely from scratch without any natural protein template, a milestone for the field. The bottleneck of classical methods, however, was their high computational cost and limited success rate.
02
Structure Prediction-Driven Era (2020–2022): Breakthroughs in protein structure prediction models, such as AlphaFold, brought a new paradigm to protein design. Researchers developed the "hallucination" strategy: they iteratively optimize an input sequence until the structure prediction model predicts the target structure with high confidence, and in this way work backward to sequences expected to fold into the desired conformation.
03
Generative AI Era (2022–present): Protein design has now entered a new era centered on generative artificial intelligence. Diffusion models generate protein backbone structures directly from random noise, graph neural networks design amino acid sequences for a given backbone, and structure prediction tools validate the resulting designs. Together, these three tools form the standard "generate–design–validate" workflow.
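The "generate–design–validate" workflow can be illustrated with a toy sketch in Python. Everything in it is an assumption made for illustration: secondary-structure strings stand in for backbones, the residue groupings are arbitrary, and the functions generate_backbone, design_sequence, predict_structure, and validate are hypothetical stand-ins for a diffusion model, a sequence-design network, and a structure predictor. None of them corresponds to a real tool or API.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Toy assumption: which residues count as helix- or strand-favoring.
POOLS = {"H": "AELMQKR", "E": "VIYFWT"}


def generate_backbone(length, rng):
    """Stage 1 (stand-in for a diffusion model): sample a string of H/E segments."""
    backbone = []
    while len(backbone) < length:
        element = rng.choice("HE")          # H = helix, E = strand
        run = rng.randint(4, 8)             # segment length
        backbone.extend(element * run)
    return "".join(backbone[:length])


def design_sequence(backbone, rng, error_rate=0.2):
    """Stage 2 (stand-in for a sequence-design network): choose a residue per position.

    With probability error_rate the designer picks a random residue,
    which mimics an imperfect design step.
    """
    residues = []
    for ss in backbone:
        if rng.random() < error_rate:
            residues.append(rng.choice(AMINO_ACIDS))
        else:
            residues.append(rng.choice(POOLS[ss]))
    return "".join(residues)


def predict_structure(sequence):
    """Stage 3a (stand-in for a structure predictor): label each residue H, E, or -."""
    labels = []
    for aa in sequence:
        if aa in POOLS["H"]:
            labels.append("H")
        elif aa in POOLS["E"]:
            labels.append("E")
        else:
            labels.append("-")
    return "".join(labels)


def validate(backbone, sequence):
    """Stage 3b: fraction of positions where the prediction matches the intended backbone."""
    predicted = predict_structure(sequence)
    matches = sum(p == b for p, b in zip(predicted, backbone))
    return matches / len(backbone)


def run_pipeline(n_designs=20, length=30, threshold=0.8, seed=0):
    """Generate, design, and validate; keep designs whose agreement meets the threshold."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_designs):
        backbone = generate_backbone(length, rng)
        sequence = design_sequence(backbone, rng)
        score = validate(backbone, sequence)
        if score >= threshold:
            accepted.append((score, sequence, backbone))
    return sorted(accepted, reverse=True)
```

Calling run_pipeline() returns the accepted (score, sequence, backbone) tuples, best first. The point of the sketch is only the shape of the workflow: designs that fail the validation step are discarded before anything reaches the lab.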
How fast is this field moving? In 2025, a team led by Nobel laureate David Baker achieved, for the first time, the AI-driven de novo design of serine hydrolases with complex active sites: not modified natural enzymes, but entirely new enzyme molecules that do not exist in nature. As researchers have said, the central question in protein design is shifting from "how to design proteins" to "what proteins should be designed".
Why is De Novo Design So Important?
If you think that 'designing proteins from scratch' is just an academic curiosity, you may need to reconsider its value.
First, it breaks through the functional limits of natural proteins. Natural enzymes evolved to catalyze reactions that occur in nature, so for synthetic molecules, such as novel plastics or specific drug intermediates, they often fall short. De novo design can directly target the desired chemical reaction and 'customize' the active site required for catalysis. This opens up new possibilities for biomanufacturing.
Second, it significantly shortens the research and development cycle. Traditional protein drug development usually takes months or even years and relies on extensive wet-lab experiments for repeated screening and validation. AI-driven de novo design can rapidly explore a vast number of candidates in silico and then pass only the few most promising ones to experimental verification, substantially improving R&D efficiency.
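The screening funnel described here, many candidates scored computationally and only a handful sent to the lab, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: toy_score is a hypothetical scoring function (closeness of the hydrophobic fraction to an arbitrary target), and random sequences stand in for designed candidates. A real pipeline would use physically or statistically grounded scores.

```python
import heapq
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWY")   # toy grouping, for illustration only


def toy_score(sequence, target_fraction=0.4):
    """Hypothetical in-silico score: higher when the hydrophobic fraction is near the target."""
    fraction = sum(aa in HYDROPHOBIC for aa in sequence) / len(sequence)
    return 1.0 - abs(fraction - target_fraction)


def random_candidates(n, length, seed=0):
    """Stand-in for a design step: n random sequences of the given length."""
    rng = random.Random(seed)
    return ["".join(rng.choice(AMINO_ACIDS) for _ in range(length)) for _ in range(n)]


def shortlist(candidates, k, score_fn=toy_score):
    """Keep only the k best-scoring candidates for experimental testing."""
    return heapq.nlargest(k, candidates, key=score_fn)


# Usage: score 10,000 virtual candidates, send 5 to the wet lab.
pool = random_candidates(10_000, length=50)
picks = shortlist(pool, k=5)
```

heapq.nlargest avoids sorting the whole pool, which matters little at this scale but keeps the funnel cheap as the virtual library grows.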
Third, it can create 'superpower' proteins that do not exist in nature. Researchers can design industrial enzymes with stability far beyond that of natural proteins, binding proteins that precisely target specific disease sites, and even smart protein materials with 'switch' functions. As scientists envision: 'Imagine a future where you can press a button and design a custom enzyme for your specific application.'
The application prospects of this field are extremely broad. In medicine, de novo design can create entirely new binding proteins that precisely target disease sites for treatments of cancer, autoimmune diseases, and more. In industry, from super enzymes that degrade plastics to biocatalysts that fix carbon dioxide, from food proteins with higher and more stable sweetness to programmable biomaterials, de novo design is becoming a core technology engine for synthetic biology and green biomanufacturing.
MatwingsVenus™: Taking De Novo Design from 'the art of a few' to 'accessible infrastructure'

MatwingsVenus™ (Xiaowu™)
After discussing so many exciting prospects, a practical problem must be faced: the threshold for de novo protein design is extremely high.
Traditional protein design workflows require researchers to master expertise in multiple fields such as structural biology, computational chemistry, and bioinformatics, while also needing to utilize a variety of computational tools and databases. Moreover, design results must go through cumbersome wet-lab verification. A complete 'design-validate-iterate' cycle often takes months or even longer and relies on the manpower and resources of large institutions.
It is against this backdrop that a platform aimed at changing the game was born.
On April 24, 2026, Shanghai Matwings Technology, a company focused on AI-driven full-stack protein research and development, officially launched MatwingsVenus™ (Xiaowu™), a conversational AI agent for protein research and development. The core innovation of the platform lies in its deep integration of AI computational design, automated wet-lab experiments, and expert knowledge, establishing an intelligent closed R&D loop of 'design is validation, validation is iteration.'
Specifically, the competitiveness of MatwingsVenus™ (Xiaowu™) is reflected in three dimensions:
First, the 'conversational' interface lowers the barrier to use. Users only need to state their task objectives in natural language, and the system automatically breaks the task down and schedules the corresponding design, prediction, analysis, and screening capabilities. It can carry out in-depth research, enzyme mining, directed evolution, de novo design, and coordination with automated wet-lab experiments. The platform integrates 200 protein design tools, tens of billions of real, labeled protein data records, 50 platform-certified experts, and 30 skills fine-tuned by experts in various fields. Resources that previously only top teams could mobilize are now within reach of individual developers.
Second, the 'dry-wet loop' bridges the most critical step, validation. In the traditional model, once computational design is complete, scientists must manually transfer the sequence information and contact outsourced labs for synthesis and characterization, which makes the whole process slow and error-prone. With MatwingsVenus™ (Xiaowu™), once the AI agent completes a design, the results are fed directly into automated experimental procedures through a self-built communication mechanism, driving robots to perform sample preparation, protein purification, and functional testing. The experimental results then flow back into the next round of AI design, forming an iterative cycle in which AI drives the wet-lab experiments and the wet-lab experiments feed the AI.
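The iterative structure of such a dry-wet loop can be sketched as follows. This is not the platform's implementation, which the article does not describe. It is a toy model under explicit assumptions: simulated_assay is a hypothetical stand-in for an automated experiment (it scores a sequence against a hidden optimum), and propose_variants is a hypothetical stand-in for the AI design step (it proposes single-point mutants of the best sequence so far).

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def simulated_assay(sequence, hidden_optimum):
    """Stand-in for the automated wet lab: fraction of positions matching a hidden optimum."""
    return sum(a == b for a, b in zip(sequence, hidden_optimum)) / len(hidden_optimum)


def propose_variants(parent, n_variants, rng):
    """Stand-in for the design step: single-point mutants of the current best sequence."""
    variants = []
    for _ in range(n_variants):
        pos = rng.randrange(len(parent))
        variants.append(parent[:pos] + rng.choice(AMINO_ACIDS) + parent[pos + 1:])
    return variants


def closed_loop(start, hidden_optimum, rounds=10, batch_size=24, seed=0):
    """Run design -> experiment -> feedback for a fixed number of rounds."""
    rng = random.Random(seed)
    best_seq = start
    best_score = simulated_assay(start, hidden_optimum)
    history = [best_score]
    for _ in range(rounds):
        # Design: propose a batch based on what is known so far.
        batch = propose_variants(best_seq, batch_size, rng)
        # Experiment: measure every variant in the batch.
        measured = [(simulated_assay(seq, hidden_optimum), seq) for seq in batch]
        # Feedback: the best measurement seeds the next design round.
        top_score, top_seq = max(measured)
        if top_score > best_score:
            best_seq, best_score = top_seq, top_score
        history.append(best_score)
    return best_seq, history
```

Because each round only replaces the current best when a measured variant beats it, the recorded history never decreases. The design choice worth noting is that measurements, not predictions, decide what the next round builds on.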
Third, practical validation has proven the platform's capabilities.
In a de novo design project for an immune regulatory receptor target, Matwings Technology used the MatwingsVenus™ (Xiaowu™) platform with the target structure and functional requirements as inputs, and AI agents automatically completed the entire computational process, including backbone screening, interface design, sequence optimization, and druggability prediction. The project yielded dozens of brand-new binder molecules that showed blocking activity in in vitro cell assays, along with functional inhibition and the potential for high affinity. In another case, the sweetness modification of a protein, the platform adopted a multi-round iterative strategy of "agent design, automated experiments, AI feedback, agent redesign." Ultimately, the sweetness of several samples increased more than tenfold compared to the wild type, while heat resistance remained high, at about 75°C.
As industry analysts have pointed out, the capital market values not just the algorithm itself but the "data flywheel" barrier it builds: in AI protein design, whoever holds high-quality wet-lab feedback data has the deepest moat. Matwings Technology has completed Series A financing of over 100 million yuan and over 200 million yuan, which reflects the market's recognition of this full-stack "AI design, automated verification" model.
Outlook: As "molecule makers" move toward "machine makers"

Outlook for the Future
Looking back at the development of de novo protein design, from the birth of the first entirely new fold Top7 in 2003, to AI designing functional enzymes from scratch by 2025, and now to conversational intelligent agents enabling individual developers to access top-level design capabilities—this field has spent more than twenty years completing a leap from "whether it can be done" to "how to do it better" and then to "how to let everyone do it."
Looking ahead to the next five to ten years, protein design is expected to move from "making molecules" to "making machines": researchers are likely to design complex protein nanomachines with functions far beyond what natural evolution can produce, bringing transformative applications to medicine, materials science, and energy. Super enzymes that can degrade plastics can be designed, more efficient and greener biosynthetic pathways can be created, and even "therapeutic enzymes" capable of precisely cutting pathogen-specific molecules can be developed.
The vision that MatwingsVenus™ (Xiaowu™) represents is this: when protein design shifts from "driven by large platforms" to "accessible to individuals," and when everyone can develop the proteins they want through dialogue with AI, what we are witnessing may be not just an upgrade of tools but the true arrival of the "developer era" in life sciences.
In this shift from "understanding life" to "designing life," de novo protein design is no longer merely an experimental topic for scientists, but is becoming a core engine driving the future development of the bioeconomy.