AI-Driven Enzyme Engineering Platform: From Trial and Error to Computation | MatwingsVenus™（晓鹜™）

In an unremarkable laboratory, a thermostatic chamber is constantly maintained at 65 degrees Celsius. Inside, a specially engineered enzyme is rapidly breaking down fragments of plastic bottles, completely 'digesting' the plastic in just a few hours. This high-temperature-resistant plastic-degrading enzyme was not screened from nature; its creation followed a completely different path—it was mined and optimized by a large artificial intelligence model, accomplishing in one step the leap from sequence design to functional verification.

The Vast Sequence Space of Proteins

1. The Dilemma of Traditional Enzyme Engineering: Why Is It So Difficult to Modify an Enzyme?

The story of enzyme engineering is, at its core, a long history of trial and error.

Enzymes are almost everywhere in industry: stain-removing proteases in laundry detergents, lipases in biodiesel production, chiral synthases in pharmaceuticals, and amylases in food processing. Yet the enzymes nature provides often fall short of industrial expectations. They may work vigorously in mild physiological environments, but under the high temperatures, strong acids, or strong alkalis of industrial production their performance drops sharply. Modifying enzymes became the only way forward.

Traditional enzyme engineering mainly relies on two paths. Directed evolution replicates the logic of natural selection: first, a vast variant library is constructed through random mutation, and then mutants with improved functions are selected from them. Rational design advances in a knowledge-driven direction, using the enzyme's three-dimensional structural information to perform targeted modifications of active sites or key residues. Both paths come at a high cost.

The bottleneck in directed evolution lies in the screening stage. The sequence space generated by random mutations is suffocatingly vast. Liu Hao, CTO of Matwings Technology, gave an example: for a protein composed of 361 amino acids, substituting just one amino acid yields nearly 7,000 possible variants; substituting two raises the number to over 23 million; substituting three pushes it to about 53.3 billion. With traditional methods, only about 1% of screened mutants meet the requirements, and a development cycle often lasts several months or even years.
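The figures in this example can be reproduced with a short calculation. The sketch below assumes that each substituted position can change to any of the 19 other standard amino acids and that the order of substitutions does not matter; the function name is illustrative only.

```python
from math import comb

NUM_ALTERNATIVES = 19  # assumption: 20 standard amino acids minus the original residue


def count_variants(length: int, num_substitutions: int) -> int:
    """Count variants with exactly `num_substitutions` substituted positions."""
    position_choices = comb(length, num_substitutions)       # which positions change
    residue_choices = NUM_ALTERNATIVES ** num_substitutions  # what each one changes to
    return position_choices * residue_choices


for k in (1, 2, 3):
    print(k, f"{count_variants(361, k):,}")
# 1 6,859
# 2 23,457,780
# 3 53,335,172,460
```

Each additional substitution multiplies the library size by a factor in the thousands, which is why exhaustive screening stops being feasible after two or three positions.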

Rational design appears smarter, but it relies heavily on high-quality 3D structural data and a deep human understanding of protein folding dynamics. When it encounters enzymes of unknown structure, or systems with complex catalytic mechanisms and large conformational changes, this path often fails. An enzyme composed of just 100 amino acids has a sequence space of about 10^130 possible combinations (20^100), far beyond the physical limits of experimental screening. More challenging still, functional sequences are extremely sparse in this space: only about one in every 10^13 random sequences is likely to be catalytic. The random mutation strategy that traditional directed evolution relies on ultimately amounts to counting on luck in an almost empty space.
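To give a sense of scale, the size of the full sequence space can be estimated in log10 terms. This is a rough sketch: the library size of one billion variants is an assumed figure chosen for illustration, not a number from the text.

```python
from math import log10

ALPHABET_SIZE = 20  # standard amino acids


def log10_sequence_space(length: int) -> float:
    """log10 of the number of possible sequences of the given length."""
    return length * log10(ALPHABET_SIZE)


def log10_fraction_covered(length: int, library_size: float) -> float:
    """log10 of the fraction of sequence space that one library can sample."""
    return log10(library_size) - log10_sequence_space(length)


print(round(log10_sequence_space(100), 1))         # 130.1, i.e. about 10^130 sequences
print(round(log10_fraction_covered(100, 1e9), 1))  # -121.1 for an assumed 10^9-variant library
```

Even a library of a billion variants would cover only about one part in 10^121 of the space for a 100-residue enzyme.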

Even once the mathematical hurdle of sequence space is cleared, multi-objective optimization remains a major obstacle in practical engineering. Industrial applications require an enzyme to satisfy several criteria at once, such as activity, thermal stability, substrate selectivity, and tolerance, and these indicators often compete with one another. Over-optimizing thermal stability may reduce the dynamic flexibility of the active site and cause loss of activity. Increasing selectivity for one substrate may weaken catalytic ability toward others. Finding the best balance point on the Pareto front is almost impossible on manual experience alone.
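For readers unfamiliar with the term, a Pareto front is the set of candidates that no other candidate beats on every objective at once. The sketch below illustrates the idea with two objectives; the variant names and measurements are invented for illustration only.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    name: str
    activity: float        # relative activity, higher is better
    melting_temp_c: float  # thermal stability proxy, higher is better


def dominates(a: Variant, b: Variant) -> bool:
    """True if `a` is at least as good as `b` on both objectives and better on one."""
    at_least_as_good = a.activity >= b.activity and a.melting_temp_c >= b.melting_temp_c
    strictly_better = a.activity > b.activity or a.melting_temp_c > b.melting_temp_c
    return at_least_as_good and strictly_better


def pareto_front(variants: list[Variant]) -> list[Variant]:
    """Keep only the variants that no other variant dominates."""
    return [v for v in variants if not any(dominates(o, v) for o in variants)]


# Hypothetical measurements, not real data.
candidates = [
    Variant("wild_type", 1.00, 55.0),
    Variant("mut_A", 1.40, 52.0),
    Variant("mut_B", 0.80, 68.0),
    Variant("mut_C", 0.70, 60.0),
    Variant("mut_D", 1.10, 63.0),
]
print([v.name for v in pareto_front(candidates)])
# ['mut_A', 'mut_B', 'mut_D']
```

The three surviving variants each represent a different trade-off; choosing among them still requires a decision about which property matters more for the application.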

2. How AI Rewrites the Rules of Enzyme Engineering: A Paradigm Leap from "Trial and Error" to "Computation"

With AI involved, enzyme engineering has gradually shifted from a "science" dependent on experience and luck to a predictable and highly efficient "engineering." This shift is reflected in three key levels.

Generative AI first makes "design from scratch" possible. Traditional methods can only patch up known natural enzymes, but generative models can start from random noise and directly create entirely new enzyme sequences that have never existed in nature. A study published in Nature Communications in 2026 offers an illustration: researchers used protein language models to generate a series of novel tryptophan synthases (TrpBs) that not only fold correctly and show catalytic activity; several variants also have a substrate scope that surpasses versions optimized through multiple rounds of directed evolution. AI has learned to create, not just optimize.

Second, multi-objective optimization algorithms address the long-standing "robbing Peter to pay Paul" dilemma. With techniques such as reinforcement learning, enzyme design can be modeled as a sequential decision process: the algorithm automatically adjusts sequences according to multidimensional indicators such as activity, thermal stability, and substrate selectivity, searching for the best compromise between conflicting goals.
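A learning algorithm of this kind needs a single reward signal, so the competing indicators have to be combined somehow. One common option is a weighted sum, sketched below. This is an illustrative assumption rather than the method of any particular platform, and the scores and weights are invented.

```python
def reward(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of property scores, each assumed to be normalized to [0, 1]."""
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical predicted scores for two candidate sequences.
candidates = {
    "variant_1": {"activity": 0.9, "stability": 0.4, "selectivity": 0.7},
    "variant_2": {"activity": 0.6, "stability": 0.9, "selectivity": 0.6},
}

activity_first = {"activity": 0.6, "stability": 0.2, "selectivity": 0.2}
stability_first = {"activity": 0.2, "stability": 0.6, "selectivity": 0.2}


def best(weights: dict[str, float]) -> str:
    """Name of the candidate with the highest reward under the given weights."""
    return max(candidates, key=lambda name: reward(candidates[name], weights))


print(best(activity_first))   # variant_1
print(best(stability_first))  # variant_2
```

Changing the weights changes which candidate wins, which is how a design goal such as "prioritize heat resistance" can be expressed to the optimizer.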

The third transformation is the high-speed closed loop, or "flywheel." After the AI system completes a design, it connects automatically to the experimental workflow, and experimental data is fed back in real time to refine the model. The directed evolution cycle, which used to take half a year, has been compressed to a few weeks or even days. Results from basic research are even more striking: AI-driven methods have been reported to reduce laboratory workload to one ten-thousandth of that of traditional processes. A 2026 industry frontier review further organizes AI applications in enzyme engineering into three core task families: functional modeling (enzyme versus non-enzyme discrimination, EC number prediction, kinetic parameter estimation, and so on), structural modeling (near-atomic-level 3D prediction of enzymes and complexes), and property modeling (thermal stability, pH tolerance, selectivity, binding affinity, and so on). The review also extends these capabilities from single-enzyme modeling to multi-enzyme pathway design.
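The design, test, and feedback cycle can be sketched as a simple loop. This is a deliberately minimal toy under stated assumptions: the "learn" step only promotes the best measured variant to be the next parent, whereas a real platform would retrain a predictive model on the new data. The mutation labels and the `toy_assay` function are hypothetical stand-ins for wet-lab measurements.

```python
from typing import Callable


def closed_loop(
    parent: frozenset[str],
    mutation_pool: list[str],
    assay: Callable[[frozenset[str]], int],
    rounds: int,
) -> tuple[frozenset[str], list[int]]:
    """Greedy design-test-learn loop; returns the final variant and score history."""
    history = [assay(parent)]
    for _ in range(rounds):
        # Design: propose every single-mutation extension of the current parent.
        proposals = [parent | {m} for m in mutation_pool if m not in parent]
        if not proposals:
            break
        # Test: measure each proposal (stand-in for an automated experiment).
        measured = {p: assay(p) for p in proposals}
        # Learn: feed the results back by promoting the best variant.
        best = max(measured, key=measured.get)
        if measured[best] <= history[-1]:
            break  # nothing improved on the parent, so stop early
        parent = best
        history.append(measured[best])
    return parent, history


# Hypothetical additive effects on relative activity (percent of wild type).
EFFECTS = {"M1": 50, "M2": 30, "M3": -20}


def toy_assay(variant: frozenset[str]) -> int:
    return 100 + sum(EFFECTS[m] for m in variant)


final, scores = closed_loop(frozenset(), list(EFFECTS), toy_assay, rounds=5)
print(sorted(final), scores)
# ['M1', 'M2'] [100, 150, 180]
```

The loop stops on its own once no proposal beats the current parent, which is the point at which a real system would either widen the search or hand the result to validation.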

3. How Platformization Is Implemented: The Specific Technical Logic

Optimized Enzyme Activity, Stability and Selectivity

For theoretical breakthroughs to be truly usable by industry, they must be refined into tool platforms that can be accessed and called. AI-assisted enzyme engineering platforms have already demonstrated clear practical capabilities in biomanufacturing and industrial enzyme development. Their core technical logic can be summarized as a dual drive of "AI algorithms and automated experiments." Matwings Technology's conversational protein R&D agent MatwingsVenus™ provides a concrete example: it integrates AI design capabilities with an automated wet-lab platform, allowing users to complete the entire closed loop from model inference to experimental validation and iterative optimization through natural language dialogue.

This platform integrates over 200 protein design tools, a database of tens of billions of real, labeled proteins, and Skills modules optimized by domain experts. When a user enters a task goal such as "Help me find a high-temperature resistant plastic degrading enzyme," the system automatically decomposes the instruction and schedules enzyme mining, directed evolution, de novo design, and experimental coordination in sequence, running the entire R&D process in an orderly manner.
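The scheduling idea, in which one goal is passed through an ordered series of stages that share a common context, can be sketched as follows. Every name here is hypothetical and the stage bodies are placeholders; this is not the MatwingsVenus™ API, and the natural-language decomposition step is not modeled.

```python
from typing import Any, Callable

Context = dict[str, Any]
Stage = Callable[[Context], Context]


def mine_enzymes(ctx: Context) -> Context:
    # Placeholder: a real stage would search sequence databases.
    return {**ctx, "candidates": ["candidate_seq_1", "candidate_seq_2"]}


def directed_evolution(ctx: Context) -> Context:
    # Placeholder: a real stage would propose improved mutants.
    return {**ctx, "variants": [f"{c}_evolved" for c in ctx["candidates"]]}


def de_novo_design(ctx: Context) -> Context:
    # Placeholder: a real stage would generate entirely new sequences.
    return {**ctx, "variants": ctx["variants"] + ["de_novo_design_1"]}


def plan_experiments(ctx: Context) -> Context:
    # Placeholder: a real stage would hand sequences to an automated lab.
    return {**ctx, "experiment_queue": list(ctx["variants"])}


PIPELINE: list[tuple[str, Stage]] = [
    ("enzyme_mining", mine_enzymes),
    ("directed_evolution", directed_evolution),
    ("de_novo_design", de_novo_design),
    ("experiment_planning", plan_experiments),
]


def run_pipeline(goal: str) -> Context:
    """Run each stage in order, passing the shared context along."""
    ctx: Context = {"goal": goal, "completed": []}
    for name, stage in PIPELINE:
        ctx = stage(ctx)
        ctx["completed"] = ctx["completed"] + [name]
    return ctx


result = run_pipeline("high-temperature resistant plastic degrading enzyme")
print(result["completed"])
print(result["experiment_queue"])
```

Because every stage reads from and writes to the same context, no manual export or format conversion is needed between steps.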

The real technical challenge is not any single model but the automated integration of the entire process. A traditional enzyme engineering project typically runs as follows: search a database for sequences, use alignment tools to analyze key residues, switch to a prediction model to generate structures, switch again to design tools to write mutants, and finally export the sequences to an outsourced lab for synthesis and testing. A single mistake in manual data export, format conversion, or email delivery along the way can waste weeks of effort. What the integrated platform does is eliminate the friction of these manual handoffs.

Enzyme engineering has one special need, capturing the dynamic interactions between enzymes and substrates across different conformations, and specialized AI methods are now tackling it. The open-source collaboration platform PoseX, released at ICLR 2026, uses high-precision AI algorithms to simulate protein conformational changes, helping researchers quickly design "super enzymes" with high temperature resistance, high conversion rates, and high selectivity. Directed evolution previously required multiple rounds of wet-lab iteration; much of the screening and optimization can now be done in silico.

The combination of these underlying technologies has been repeatedly validated in real projects. Matwings Technology's self-developed series of large models has performed strongly on key indicators such as thermal stability and catalytic activity for specific industrial enzymes. Within a few months, the company developed several proteins whose alkali resistance or activity exceeds, by several-fold, that of the best products of leading international companies. By integrating AI enzyme mining and directed evolution capabilities into a platform, the company has delivered over 30 industrialization projects, with clients including Fortune Global 500 companies and several domestic listed companies. Its technical services have extended from innovative drug R&D to more than ten industries, such as in vitro diagnostics, food and beverage, beauty and skincare, and laundry and textiles, covering fields including industrial enzyme preparations and synthetic biology.

4. Why AI-Assisted Enzyme Engineering Platforms Are an Industrial Necessity

The driving force on the demand side is simple: industry needs better enzymes, and it needs them fast, at low cost, and with a high success rate. Market research data puts the global industrial enzyme market at approximately $7.66 billion in 2025 and projects growth to about $19.98 billion by 2032, at a reported compound annual growth rate of nearly 9.4%. Food and beverages, detergents, animal feed, bioenergy, pulp and paper, textile processing: almost every industrial sector is calling for enzymes that are higher-performing, more stable, and more cost-effective, and AI-assisted platforms are helping to meet these demands faster.

The growing maturity of the technology is equally noteworthy. A frontier review on enzyme engineering published in early 2026 pointed out that the field is upgrading from "single-enzyme modeling" to "multi-enzyme pathway design," integrating sequence, structure, reaction environment, and system-level constraints into a continuous "modeling-design-validation" framework. The review divides the work into three core tasks: functional modeling, structural modeling, and property modeling. AI has already demonstrated quantifiable capability breakthroughs in all three. Packaging these capabilities into "ready-to-use" engineering tools is precisely the core value of the platform.

5. From "Making Molecules" to "Building Systems": The Next Five Years of Enzyme Engineering Platforms

AI-Powered Enzyme Engineering Workflow

At the forefront of academia, the role of AI in enzyme engineering is evolving from 'assistant tool' to 'collaborative scientist.' Hong Liang summarizes this in three stages: the mature standalone tool is the 'past tense'; the AI agent-assisted platform is the 'present tense'; and the 'AI co-research scientist' that can proactively propose scientific hypotheses and autonomously design verification pathways is the 'future tense.'

At the industrial level, AI-assisted enzyme engineering platforms are upgrading from 'molecular modification tools' to 'biomanufacturing infrastructure.' Whether the task is identifying and relieving metabolic pathway bottlenecks in synthetic biology, or moving bio-based chemicals and materials from the laboratory to industrial fermentation tanks, AI enzyme engineering platforms play a core supporting role. The construction of industry standards has also accelerated. In 2026, the University of Science and Technology of China, together with the China Institute of Standardization, compiled the "Intelligent Research Platform Standards White Paper" (2026 edition), which establishes a standardized framework across six dimensions: basic general standards, data, models and AI foundation, experimental infrastructure, platform security, and platform ecosystem construction. The aim is to move intelligent research platforms from isolated 'bonsai' to a connected 'landscape.'

6. From Craftsmanship to Engineering: The True Turning Point of Enzyme Engineering

Still in that laboratory, a plastic-degrading enzyme designed by AI is quietly working in a constant-temperature tank at 65 degrees Celsius. A few years ago, humans could only marvel at such enzymes—either struggling for years to isolate them from extreme environments or going through round after round of blind screening with traditional methods, wasting precious time in long waiting periods.

Today, a natural sequence and an industrial demand are input into an intelligent platform. AI searches for enzymes for you, directs their evolution, predicts thermal stability, and even arranges experimental verification when you hesitate. You are no longer someone casting a net in the dark ocean; someone has already pointed out the course for you.

The core value of AI-assisted enzyme engineering platforms is not faster algorithms, but an entire set of infrastructure that liberates 'research productivity' from tedious labor. It makes enzyme design predictable, programmable, and scalable, transforming a technology that relied on experience and patience for decades into a true engineering discipline.