Zymergen: Marrying Automation, Machine Learning, and Genomics

Pharma Tech Outlook: Pharma Tech Magazine

Joshua Hoffman, CEO, Zymergen

It is astonishing that there are more ways to tweak the DNA of a single-celled microbe than there are atoms in the universe. With the advent of technologies such as artificial intelligence (AI) and machine learning, the possibilities for testing individual DNA changes are vast. Microbes are used to manufacture products as diverse as active pharmaceutical ingredients, fragrances, and flavors, and they can be engineered to make a wide spectrum of products via fermentation. But there is a challenge: getting cells to produce a given product at a commercially feasible cost can be tricky for pharmaceutical and other companies. To engineer a cell during product development, scientists must make a precise set of genetic changes to the cell’s genome. The inherent complexity of biology, coupled with the enormous search space, slows the process down and drives up cost. The classical approach is also hypothesis-led, not data-driven.

The question to ask is, “Instead of following a hypothesis-led approach, what if we could approach genetic engineering the way Google approaches search, supplanting hypotheses with algorithms that search the full space systematically?” California-based Zymergen has built a distinctive solution that combines automation, machine learning, and genomics to design, engineer, and optimize microbes. The firm uses computational methods to remove the constraints of hypothesis-led discovery. It leverages machine learning to navigate the genomic search space, making it possible to reach discoveries beyond the bounds of human intuition. Its proprietary algorithms guide the Zymergen team to the precise set of genetic changes required to engineer cells so that they make a product of interest more efficiently.

Ever since its inception in 2013, Zymergen has successfully edited the DNA of a number of microbes, helping its clients improve efficiency and time to market. In recent years, however, the company has moved on to producing wholly new materials that can be made at scale through processes such as industrial fermentation. This was previously impossible without the development of technologies like data science, robotics, and machine learning. The robotics enable high-throughput testing of genetic changes at a rate of thousands per month. While robotics helps to eliminate the errors that are unavoidable when the work is done manually, machine learning assists in finding patterns that help pharma companies achieve breakthroughs in DNA research.

In a class by itself, the Zymergen platform combines genetics, molecular biology, data science, and automation to improve organisms for industrial applications. The design, architecture, and functionality of the platform allow companies to make discoveries through high-throughput design, analysis, and experimentation, all of which are enabled by its technology stack. It can deliver value in multiple ways, for instance by optimizing production economics. For one Fortune 500 client, Zymergen delivered 4X the improvement in just one year compared with the decades-old traditional methods the client had used earlier. The result was a 2X increase in net margin. For another client, Zymergen cut the time to market to three years, against the seven years the client had anticipated with its existing internal capabilities. There are several such case studies of clients who have benefitted from Zymergen’s platform in their efforts to discover and develop novel products.

Uniquely flexible, the platform taps into a new set of chemical building blocks from biology, either to discover valuable natural products or to develop wholly new ones. Built on three core capabilities (data science, automation, and genetic engineering) and tied into the data infrastructure, the platform can accommodate a wide range of microbes and genetic information.

Equipped with the largest catalog of genetic diversity, called Genetic Libraries, Zymergen’s platform generates a very large volume of data, supported by a robust data infrastructure. It then feeds that data into its machine learning algorithms to guide the next test. For effective cell engineering, Zymergen employs a range of microbiology techniques to make the targeted changes. All in all, Zymergen removes the impediments of hypothesis-led discovery to address the fundamental problems of the industry.

Coding Microbes

In its effort to improve the accuracy of genetic edits, Zymergen is continually improving its existing algorithms (and building new ones) to produce strains with improved chemical production. The resulting strains could reduce reliance on chemicals extracted from fossil fuels. Zymergen’s data science ecosystem consists of data pipelines and machine learning algorithms. As new data flow in, the data pipelines immediately update the learning models to generate new strain recommendations. These are presented to the scientific team for review, and the scientists in turn submit the strain designs to the build team. Throughout this cycle, strains are continuously improved through miniaturized high-throughput experiments. The team at Zymergen engineers and tests different microbe variants simultaneously. The data derived from these experiments are preserved and analyzed to inform the next round of designs. Strains that look promising at the smallest scale are promoted to larger bench-top fermentation testing, and the ones that show potential there are transferred to commercial-scale tanks. Spotting the winning strains early helps Zymergen’s clients save millions of dollars.
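
The promotion step described above (smallest scale, then bench-top, then commercial tank) can be pictured with a small sketch. This is an illustration only, not Zymergen's software: the `Strain` class, the scale names, the `score` values, and the single fixed threshold are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical scale names, ordered smallest to largest, following the prose.
SCALES = ["plate", "bench_top", "commercial_tank"]


@dataclass(frozen=True)
class Strain:
    name: str
    scale: str = "plate"  # every new strain starts at the smallest scale


def promote(strain: Strain, score: float, threshold: float) -> Strain:
    """Move a strain up one scale if its measured score meets the threshold."""
    idx = SCALES.index(strain.scale)
    if score >= threshold and idx < len(SCALES) - 1:
        return Strain(strain.name, SCALES[idx + 1])
    return strain  # not promising enough, or already at the largest scale


def screening_round(strains, scores, threshold):
    """Apply the promotion rule to every strain tested in one round."""
    return [promote(s, scores[s.name], threshold) for s in strains]


strains = [Strain("s1"), Strain("s2"), Strain("s3")]
scores = {"s1": 1.2, "s2": 0.7, "s3": 1.5}  # made-up measurements
after_round = screening_round(strains, scores, threshold=1.0)
print([(s.name, s.scale) for s in after_round])
```

In a real pipeline the scores would come from experiments and the recommendations from learned models; the sketch only shows the funnel in which a few strains advance while the rest stay behind.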

Optimizing the genome is a significant but challenging part of the process because the space to be explored is enormous. Typically, a microbial genome comprises about four million characters of DNA, roughly half of which are associated with genes, and the other half with known biological functions. Companies can now do what once seemed almost impossible: edit each gene using Zymergen’s high-throughput strain engineering platform. They need only cultivate and test 4,000 organisms on the platform. According to Zymergen, however, it is not advisable to stop at a single edit. To achieve superior performance, companies must consider multiple combinations of edits.
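
A quick calculation shows why combinations of edits enlarge the search space so sharply. The sketch below assumes, for illustration, that the 4,000 organisms correspond to 4,000 distinct single edits and that a combination is simply an unordered set of them; neither assumption comes from Zymergen.

```python
import math

# Assumption: one candidate edit per organism tested, per the figure above.
N_SINGLE_EDITS = 4000


def count_combinations(k: int, n: int = N_SINGLE_EDITS) -> int:
    """Number of unordered ways to pick k distinct edits out of n."""
    return math.comb(n, k)


for k in (1, 2, 3):
    print(f"{k} edit(s) at a time: {count_combinations(k):,} candidates")
```

Under these assumptions, single edits give 4,000 candidates, pairs give 7,998,000, and triples give 10,658,668,000, which is why the combinations cannot all be tested and some form of guided search is needed.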

An Advanced Workflow

Zymergen’s workflow is comparable to Git workflows (recipes for accomplishing tasks in a consistent manner) and to software engineering. At the outset, Zymergen starts with what it calls human-interpretable ideas about genetic changes, in which each gene is perturbed; these ideas are later translated into a low-level DNA language. This is done using Helix, an in-house system written in Python. Helix not only generates the information needed to craft and verify the construction of DNA, it also converts the strain descriptions into DNA designs. At Zymergen, code quality is taken very seriously, and every line of code the engineers write is reviewed for quality and accuracy. Nothing is more damaging than spending days on a 10,000-line pull request arguing about variable names. When done well, code reviews can be quick and a useful tool for catching issues early. When done poorly, they are the opposite.
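
To make the idea of turning a human-readable strain description into a DNA design concrete, here is a toy sketch. It does not reflect Helix's actual interface, which is not public in this article: the `RegionSwap` class, the `compile_edit` function, and the example sequences are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionSwap:
    """Hypothetical high-level edit: replace one region with a new sequence."""
    start: int     # 0-based start of the region to replace
    end: int       # exclusive end of the region to replace
    new_part: str  # replacement DNA sequence


def compile_edit(genome: str, edit: RegionSwap) -> str:
    """Translate the high-level edit into the full edited DNA string."""
    if not (0 <= edit.start <= edit.end <= len(genome)):
        raise ValueError("edit region falls outside the genome")
    if set(edit.new_part) - set("ACGT"):
        raise ValueError("replacement contains non-DNA characters")
    return genome[:edit.start] + edit.new_part + genome[edit.end:]


parent = "ATGCCCGGGTTTAAA"  # made-up parent sequence
design = compile_edit(parent, RegionSwap(start=3, end=9, new_part="TATAAT"))
print(design)
```

Keeping the description (`RegionSwap`) separate from the generated sequence is what allows designs to be reviewed and versioned like source code, which is the parallel to software workflows drawn above.
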
Dedicated to synthesizing information into concrete requirements, the team sits at the intersection of three internal stakeholder groups: leadership, users, and engineering. It is entrusted with providing feedback on the needs of the business, its pain points with the hardware and software, and the feasibility and cost of delivery, among other things. Though far younger than the disruptive enterprise behemoths, Zymergen has come far enough in its journey to join their ranks. As a company with a culture of goodwill, diversity, and humility, Zymergen welcomes other companies that wish to come and see its vision. With many accolades to its credit, the company was recently named one of the World Economic Forum’s (WEF) Technology Pioneers, a title that recognizes innovative companies with the potential to transform the future and society.


Emeryville, CA

Joshua Hoffman, CEO

Founded in 2013 and based in the San Francisco Bay Area, Zymergen integrates automation, machine learning, and genomics to rapidly accelerate the pace of scientific advancement. Zymergen treats the genome as a search space, leveraging machine learning to make discoveries far beyond the bounds of human intuition. In doing so, Zymergen delivers economic value, material diversity, and performance capabilities that were previously impossible. Ever since its inception, Zymergen has successfully edited the DNA of a number of microbes, helping its clients improve efficiency and time to market.