The 4th Annual Machine Learning Meets Biology conference continues our tradition of exciting talks and in-depth discussions at the interface between biotechnology and information science. Expect new applications, fascinating opportunities and great networking.
This is an invitation-only event that brings together a small (~100 attendees) distinguished group of Machine Learning / Biotech nerds from both academia and industry.
We have confirmed speakers from Google, UC Berkeley, Grail, Chr. Hansen, Synthace, UCSD, Auransa, Medicago and Illumina, with more on the way.
11:00 - 12:00 | REGISTRATION. LUNCH
12:00 - 12:30 | Jonathan Allen, Brian Van Essen, ATOM - LLNL
12:30 - 13:00 | Annalisa Pawlosky, Google GAS
13:00 - 13:30 | Jennifer Listgarten, UC Berkeley
13:30 - 13:50 | Markus Gershater, Synthace
13:50 - 14:30 | BREAK
14:30 - 15:00 | Pierre-Olivier Lavoie, Medicago
15:00 - 15:30 | Christian Jäckel, Chr Hansen
15:30 - 16:00 | Elizabeth Brunk, UC San Diego
16:00 - 16:30 | Arash Jamshidi, Grail
16:30 - 17:00 | Pek Lum, Auransa
17:00 - 17:30 | Serafim Batzoglou, Illumina
17:30 - 19:00 | COCKTAIL HOUR
AbCellera, AdiMab, ADM, Alector, Amyris, Anaptys, Ancestry, Aromyx, ATOM, ATUM, Auransa, BASF, bluebird, Bolt, Calysta, Cargill, Chr Hansen, CytomX, Cytovance, Chan Zuckerberg Biohub, EMD Millipore, Firmenich, Gigagen, Google, Grail, IBM, IgM, Illumina, Insitro, Pfizer, Kaiser Permanente, Lexent, Ligand, Medicago, NASA, Novozymes, Numerate, Perfect Day, Pionyr, QLSF, Quantapore, Stanford, Sutro, Synthace, Target, Tenaya, Thermo Fisher, UC Berkeley, UC San Diego, Vir, xCella, Zymergen and more.
New developments in machine learning and underlying capabilities in high-performance computing are beginning to have a significant impact in multiple areas of biology and medicine. We will describe ongoing work at Lawrence Livermore National Laboratory and at the ATOM Consortium (LLNL, Frederick National Lab, UCSF, and GSK) to accelerate drug design and to better understand cancer mechanisms. These applications will take advantage of emerging machine learning and computing technologies that extend the scale and complexity of data-driven modeling.
Google Accelerated Science (GAS) is an organization focused on producing breakthroughs in the natural sciences, from quantum physics to biological sciences, by applying machine learning (ML) and large-scale computational analysis to large-scale physical experimentation. GAS brings together experimentalists and analysts to design scientific studies and build models in novel ways to answer today’s most exciting questions. In the process, we encountered significant challenges in classifying patterns in experimental data and in generating predictions whose performance transfers to real-world applications. Today’s talk discusses the mistakes made along the way, lessons learned, and strategies we have implemented to improve ML model predictions or physically manipulate the experimental conditions to yield ML-compatible data sets.
With the advent of more and more high-throughput technologies to measure protein properties of interest, such as binding, expression, and fluorescence, the time for machine learning to act synergistically with protein design is here. I will describe our work on accelerating the design and optimization of proteins (and small molecules) with machine learning approaches: a sort of in silico approach to the method of Directed Evolution, which won the 2018 Nobel Prize in Chemistry.
To unlock use of ML in biology, we must make the production and collation of sophisticated experimental datasets routine. Bringing together the diverse lab automation and software tools needed for this will enable all bioscience companies to transform their R&D, without having to invest heavily in building internal tools.
Medicago is a leading clinical-stage biopharmaceutical company using novel plant-based manufacturing and virus-like particle (VLP) technologies to rapidly develop innovative vaccines and protein-based therapeutics for infectious diseases and emerging public health challenges. We have worked with ATUM on the optimization of the GeneGPS™ algorithm to accurately predict the preferred gene sequence for protein expression in our plant-based expression technology. Combining design of experiments (DoE), exact empirical measurements, and machine learning tools, the collaboration has delivered an algorithm capable of effective gene design that allows high expression of biopharmaceutical proteins. The optimized GeneGPS™ algorithm has now demonstrated robust and predictable protein expression in our Proficia® plant-based manufacturing platform.
High-throughput screening of large mutant libraries has become the dominant strategy to find sparse solutions for enzyme improvement in the nearly infinite protein sequence space. Unfortunately, high-throughput compatible assays are often weak predictors for enzyme performance in complex industrial applications, such as dairy products. I will present an example of how machine learning can exploit assay data most efficiently to enable the development of industrial enzyme products with low screening effort.
Standardized, multi-omics datasets are becoming increasingly available in the biological sciences, but major impediments still prevent big data resources from realizing their impact. Advanced data integration and modern machine learning methods bring the promise of leveraging high-throughput omics data to make accurate predictions. I will present recent developments in integrative systems that interpret omics data and provide mechanistic, translational insights for biological and medical science.
GRAIL is a healthcare company whose mission is to detect cancer early, when it can be cured. To achieve this mission, GRAIL is developing a blood test capable of detecting and localizing multiple deadly cancer types early, using high-intensity next-generation sequencing, population-scale clinical studies, and key scientific developments in cell-free nucleic acid cancer biology, assays, and machine-learning algorithms.
Patient heterogeneity is a challenge for drug discovery and development. Our goal is to intelligently categorize patients into more homogeneous subpopulations, allowing for the discovery of precision medicines and, ultimately, more precise clinical trials. We are leveraging our SMarTR™ Engine to build an internal pipeline of novel drug candidates.
The association of genomic variants with genetic disease is to date poorly characterized. At our AI and population genomics laboratory, our goal is to increase the actionability of WGS in rare and common genetic disease. PrimateAI is a deep learning method that leverages variation in primate genomes to classify variants in human proteins as benign or pathogenic. Motivated by its effectiveness, we launched the Primate Genome Conservation initiative, aiming to sequence 1000 primate genomes in the first year and 10,000 within a three-year period. We also developed SpliceAI, which predicts mRNA splicing from primary sequence and can identify mutations that disrupt splicing or generate new cryptic splice sites. Our methods can be applied to enhance the power of burden tests to discover new genes associated with common disease.