4th Annual Machine Learning Meets Biology
MLxBio4 Tesseract

September 27, 2019

Terra Gallery, San Francisco


The 4th Annual Machine Learning Meets Biology conference continues our tradition of exciting talks and in-depth discussions at the interface between biotechnology and information science. New applications, fascinating opportunities and great networking.

This is an invitation-only event that brings together a small (~100 attendees) distinguished group of Machine Learning / Biotech nerds from both academia and industry.

We have confirmed speakers from Google, UC Berkeley, Grail, Chr. Hansen, Synthace, UCSD, Auransa, Medicago, Illumina and more on the way.

Email: communications@atum.bio

AGENDA & SPEAKERS


11:00 - 12:00

REGISTRATION. LUNCH

12:00 - 12:30

Jonathan Allen, Brian Van Essen, ATOM - LLNL

12:30 - 13:00

Annalisa Pawlosky, Google GAS

13:00 - 13:30

Jennifer Listgarten, UC Berkeley

13:30 - 13:50

Markus Gershater, Synthace

13:50 - 14:30

BREAK

14:30 - 15:00

Pierre-Olivier Lavoie, Medicago

15:00 - 15:30

Christian Jäckel, Chr Hansen

15:30 - 16:00

Elizabeth Brunk, UC San Diego

16:00 - 16:30

Arash Jamshidi, Grail

16:30 - 17:00

Pek Lum, Auransa

17:00 - 17:30

Serafim Batzoglou, Illumina

17:30 - 19:00

COCKTAIL HOUR


Confirmed attendees from:

AbCellera, AdiMab, ADM, Alector, Amyris, Anaptys, Ancestry, Aromyx, ATOM, ATUM, Auransa, BASF, bluebird, Bolt, Calysta, Cargill, Chr Hansen, CytomX, Cytovance, Chan Zuckerberg Biohub, EMD Millipore, Firmenich, Gigagen, Google, Grail, IBM, IgM, Illumina, Insitro, Pfizer, Kaiser Permanente, Lexent, Ligand, Medicago, NASA, Novozymes, Numerate, Perfect Day, Pionyr, QLSF, Quantapore, Stanford, Sutro, Synthace, Target, Tenaya, Thermo Fisher, UC Berkeley, UC San Diego, Vir, xCella, Zymergen and more.

ABSTRACTS


12:00 - 12:30 - Jonathan Allen, Brian Van Essen, ATOM - Lawrence Livermore Natl. Labs

Developments in Machine Learning for Applications in Biomedicine

New developments in machine learning and underlying capabilities in high-performance computing are beginning to have significant impact in multiple areas of biology and medicine. We will describe ongoing work at Lawrence Livermore National Laboratory and at the ATOM Consortium (LLNL, Frederick National Lab, UCSF, and GSK) to accelerate drug design and to better understand cancer mechanisms. These applications will take advantage of emerging machine learning and computing technologies that extend the scale and complexity of data-driven modeling.


12:30 - 13:00 - Annalisa Pawlosky, Google GAS

Opportunities and Challenges of Applying ML to Biological Experiments

Google Accelerated Science (GAS) is an organization focused on producing breakthroughs in the natural sciences, from quantum physics to biological sciences, by applying machine learning (ML) and large-scale computational analysis to large-scale physical experimentation. GAS brings together experimentalists and analysts to design scientific studies and build models in novel ways to answer today’s most exciting questions. In the process, we encountered significant challenges in classifying patterns in experimental data and in generating predictions with transferable performance in real-world applications. Today’s talk discusses the mistakes made along the way, lessons learned, and strategies we have implemented to improve ML model predictions or physically manipulate the experimental conditions to yield ML-compatible data sets.


13:00 - 13:30 - Jennifer Listgarten, UC Berkeley

Machine Learning for Protein Engineering (and Chemistry)

With the advent of more and more high-throughput technologies to measure protein properties of interest, such as binding, expression, and fluorescence, the time for machine learning to act synergistically with protein design is here. I will describe our work on accelerating the design/optimization of proteins (and small molecules) with machine learning approaches, a sort of in silico approach to the method of Directed Evolution, which won the 2018 Nobel Prize in Chemistry.


13:30 - 13:50 - Markus Gershater, Synthace

Computer Aided Biology: Tools for Machine Learning-Driven R&D

To unlock use of ML in biology, we must make the production and collation of sophisticated experimental datasets routine. Bringing together the diverse lab automation and software tools needed for this will enable all bioscience companies to transform their R&D, without having to invest heavily in building internal tools.


14:30 - 15:00 - Pierre-Olivier Lavoie, Medicago

GeneGPS algorithm development: Increasing protein expression in Nicotiana benthamiana

Medicago is a leading clinical-stage biopharmaceutical company using novel plant-based manufacturing and virus-like particle (VLP) technologies to rapidly develop innovative vaccines and protein-based therapeutics for infectious diseases and emerging public health challenges. We have worked with ATUM on the optimization of the GeneGPS™ algorithm to accurately predict the preferred gene sequence for protein expression in our plant-based expression technology. Combining design of experiments (DoE), exact empirical measurements, and machine learning tools, the collaboration has delivered an algorithm capable of effective gene design that allows high expression of biopharmaceutical proteins. The optimized GeneGPS™ algorithm has now demonstrated robust and predictable protein expression in our Proficia® plant-based manufacturing platform.


15:00 - 15:30 - Christian Jäckel, Chr Hansen

Improvement of Industrial Enzymes by Machine-Learning Guided Directed Evolution

High-throughput screening of large mutant libraries has become the dominant strategy to find sparse solutions for enzyme improvement in the nearly infinite protein sequence space. Unfortunately, high-throughput compatible assays are often weak predictors for enzyme performance in complex industrial applications, such as dairy products. I will present an example of how machine learning can exploit assay data most efficiently to enable the development of industrial enzyme products with low screening effort.


15:30 - 16:00 - Elizabeth Brunk, UC San Diego

Tackling multi-omic data integration with machine learning

Standardized multi-omics datasets are becoming increasingly available in the biological sciences, but major impediments prevent these big data resources from realizing their impact. Advanced data integration and modern machine learning methods bring the promise of leveraging high-throughput omics data to make accurate predictions. I will present recent developments in integrative systems that interpret omics data and provide mechanistic, translational insights for biological and medical science.


16:00 - 16:30 - Arash Jamshidi, Grail

Development of a multicancer early detection blood test

GRAIL is a healthcare company whose mission is to detect cancer early, when it can be cured. To achieve this mission, GRAIL is developing a blood test capable of detecting and localizing multiple deadly cancer types early. The test utilizes high-intensity next-generation sequencing and population-scale clinical studies, and leverages key scientific developments in cell-free nucleic acid cancer biology, assays, and machine-learning algorithms.


16:30 - 17:00 - Pek Lum, Auransa

Using AI and Machine Learning to Solve the Patient Heterogeneity Puzzle

Patient heterogeneity is a challenge for drug discovery and development. Our goal is to intelligently categorize patients into more homogeneous subpopulations, allowing for the discovery of precision medicines and, ultimately, more precise clinical trials. We are leveraging our SMarTR™ Engine to build an internal pipeline of novel drug candidates.


17:00 - 17:30 - Serafim Batzoglou, Illumina

AI for Linking Genomic Variants to Genetic Disease

The association of genomic variants with genetic disease is to date poorly characterized. At our AI and population genomics laboratory, our goal is to increase the actionability of WGS in rare and common genetic disease. PrimateAI is a deep learning method that leverages variation in primate genomes to classify variants in human proteins as benign or pathogenic. Motivated by its effectiveness, we launched the Primate Genome Conservation initiative, aiming to sequence 1000 primate genomes in the first year and 10,000 within a three-year period. We also developed SpliceAI, which predicts mRNA splicing from primary sequence and is capable of identifying mutations that disrupt splicing or generate new cryptic splice sites. Our methods can be applied to enhance the power of burden tests to discover new genes associated with common disease.



Have a question?
Let's talk.

ATUM customer support scientists are available to discuss cloning strategies, gene design constraints, bioinformatics analyses, and other molecular biology/biotechnology concerns.

Call

Corporate Headquarters
(Newark, California)
+1 650 853 8347

Email

We generally reply within a few hours.

info@atum.bio