February 11, 2018

Thank you for inviting me

I’m happy to be here among future biotechnologists

If you are here, then I guess

  • You want to be a biotechnologist
    • Get a good job
    • Amusing
    • Well paid
  • You want to learn new things
    • Get a degree
  • You want to do something meaningful
    • Be rich and famous
    • At least a little famous

Everybody is talking about biotech

This is a special time for Biotech

“Biotechnology in the realm of history” J Pharm Bioall Sci 2011

Biotech is historic

Agriculture, Medicinal Plants, Bread, Wine

Biotech is modern

Genomics, Transcriptomics, Proteomics, Metabolomics, and …

Where will it end?

  • Will you do something meaningful?
  • Will you have a nice job?
  • Will you be well paid?
It is hard to make predictions, especially about the future
Danish proverb

So I will make predictions about the past

My name is Andrés Aravena

I am

  • Assistant Professor at the Molecular Biology and Genomics Department, Istanbul University
  • Mathematical Engineer, U. of Chile
  • PhD Informatics, U Rennes 1, France
  • PhD Mathematical Modeling, U. of Chile
  • not a Biologist
  • but an Applied Mathematician who can speak “biologist language”

Before coming to Turkey I worked on

  • Big and small computers
  • Telecommunication Networks
  • Between 2003 and 2014 I was the chief research engineer
    • of the main bioinformatics group
    • in the top research center (CMM)
    • in the top university (University of Chile)
    • of my country

I come from Chile

Small country of ~17 million people

Universities ranked 200-300 in the world

Spanish colony 500 years ago (so language is Spanish)

Independent Republic 200 years ago

First Latin American country to recognize the Turkish Republic

OECD member, like Turkey

Everyday life very similar to Turkey

Chilean Exports 2016

Chilean Economy: Exports

1st world producer of copper

3rd world producer of salmon

Fruits: peaches, grapes, apples, avocado

Wine: exported worldwide

All these industries depend on Biotechnology

How can you improve these industries

using Biotechnology and Bioinformatics?

Biotech in the Mining Industry

Copper is heated and melted

to separate it from other compounds

This is
very expensive

… and polluting

(this smoke is sulphuric acid)

Solution: Bioleaching

The use of bacteria to extract elements from ore

Bioleaching is much better than melting copper

  • Less contamination
  • Less expensive

The goal is to understand and improve the involved bacteria so this technology can be used extensively

Enables building new mines

It is like discovering oil reserves for the country

Who is there?

Question 1

Bioidentification

Monitoring the presence of good bacteria

We need to control the “industrial ecosystem”

Metagenomic approach: We want to detect on site, without isolation

  • Fast, unbiased

Key problem: Design probes that match a taxonomic branch, not a specific strain

The probes should be tolerant to mutations that occur in environmental samples with many strains

Alignment tools do not scale to large datasets
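
One alignment-free way to look for candidate probes is to compare short subsequences (k-mers) directly. The sketch below is only a toy illustration, not the method used in the project: it keeps the k-mers that occur in every target sequence and in none of the non-target sequences. The sequences, the function names and the probe length `k` are all made up for the example.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_probes(targets, non_targets, k):
    """k-mers shared by every target sequence and absent from all non-targets."""
    shared = set.intersection(*(kmers(s, k) for s in targets))
    for s in non_targets:
        shared -= kmers(s, k)
    return sorted(shared)


# Hypothetical toy data: three strains of the branch we want to detect,
# and one unrelated organism that the probe must not match.
targets = [
    "ACGTTGCAAGGCTA",
    "TTACGTTGCAAGCC",
    "GACGTTGCAAGTTT",
]
non_targets = ["GGACGTTGCATT"]

probes = candidate_probes(targets, non_targets, k=8)
print(probes)  # ['CGTTGCAA', 'GTTGCAAG']
```

A k-mer that survives is present in all strains of the branch despite their differences elsewhere, which is the property a branch-level probe needs. Real probe design also has to consider melting temperature, secondary structure and near matches.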

Design of probes for complex samples

You can design oligonucleotides using computers

  • In 2006 it took one day on 32 processors (one processor month)
  • You can do it faster and cheaper using Cloud Computing

You can use them in qPCR or in microarrays. They detect specific species and functions

Automatic Interpretation of Results

using a Statistical Classification Model

A “robot” prepares the report

The tool can be used by untrained personnel

Simple installation at the mine enables continuous monitoring of the industrial process

We published it

We patented it

after some years

  • USA, Number: US 7 853 408 B2, Date: 14/12/2010
  • South Africa, Number: 2006/06828, Date: 26/03/2008
  • Australia, Number: 2006203551, Date: 15/09/2011
  • Mexico, Number: PXMX 32/2006, Date: November 2012
  • Peru, Number: PE 5838, Date: 29/10/2010
  • China, Number: 200810095172.6, Date: 2013
  • Chile, Number: DPI-660-2007, Date: 06/05/2013
  • Argentina, Number: AR056179

We did the same with Wine production

Chilean wine travels long distances to final markets

Any yeast contamination means big economic losses
(people stop buying all Chilean brands)

We designed a qPCR kit for rapid detection of yeast contamination

It is currently sold to winemakers for Quality Control

What did we learn: Metagenomics

Genomics of the ecosystem

Microorganisms live in the most diverse environments.
They are the key to:

  • develop new biotechnology
  • manage our natural resources
  • improve our health
  • understand our past

But only ~5% of them can be grown in the lab.

Big Data in Biology

Since most microbes cannot be isolated, you can do this:

  • Extract all DNA from the environment
  • Sequence all DNA
  • Identify the taxonomy/function of each read
  • Cluster similar sequences together
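
The last step, clustering similar sequences, can be sketched with a very simple greedy scheme. This is a minimal illustration under stated assumptions, not a production pipeline: similarity is measured as the Jaccard index of k-mer sets, and the reads, the value of `k` and the `threshold` are invented for the example.

```python
def kmers(seq, k=4):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Shared k-mers divided by total distinct k-mers (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cluster(reads, k=4, threshold=0.5):
    """Put each read in the first cluster whose seed read is similar enough."""
    clusters = []  # list of (seed k-mer set, member reads)
    for read in reads:
        profile = kmers(read, k)
        for seed_profile, members in clusters:
            if jaccard(profile, seed_profile) >= threshold:
                members.append(read)
                break
        else:
            # No existing cluster was close enough: this read seeds a new one.
            clusters.append((profile, [read]))
    return [members for _, members in clusters]


# Hypothetical reads coming from two different organisms.
reads = ["ACGTACGTAC", "CGTACGTACG", "TTTTGGGGCC", "TTTGGGGCCC"]
clusters = greedy_cluster(reads)
print(clusters)
# [['ACGTACGTAC', 'CGTACGTACG'], ['TTTTGGGGCC', 'TTTGGGGCCC']]
```

The result depends on the order of the reads and on the threshold, which is one reason why real metagenomic clustering is harder than this sketch suggests.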

Discovering unknown organisms

Most of the sequences from environmental samples are unique

That is, they do not correspond to any known organism

How can you find the “closest” organisms?

Here “closest” is in the phylogenetic sense

This is my research question

Today: DNA from Archaeological sites

  • Ancient DNA can be extracted from bones and teeth of specimens
  • Metagenomic sequencing
  • Genetic record of the specimen, its surrounding environment and ecological changes during long periods of time.
  • Human aDNA shows migration patterns
  • Ancestry shows social structure

Nobody knows what is there

20% to 90% of reads do not match human

They can tell us about

  • Diseases
  • Diet
  • Economy
  • Climate

These are complex samples

Often without reference genome

We need to assign taxonomy to DNA reads of new organisms

You can simulate post-mortem decay and see if taxonomy can be identified
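
A minimal sketch of such a simulation, assuming a deliberately simplified damage model: the DNA is cut into short fragments, and cytosines near the 5' end of each fragment are read as thymine (deamination). The function name, the reference sequence and all parameter values (`mean_len`, `deam_rate`, `min_len`) are assumptions for illustration, not values from our study.

```python
import random


def simulate_decay(genome, n_reads, mean_len, deam_rate, rng, min_len=20):
    """Return (start, read) pairs: short fragments with C->T damage near the 5' end."""
    reads = []
    for _ in range(n_reads):
        # Fragmentation: exponentially distributed length, clamped to [min_len, len(genome)].
        length = int(rng.expovariate(1.0 / mean_len))
        length = max(min_len, min(len(genome), length))
        start = rng.randrange(len(genome) - length + 1)
        fragment = list(genome[start:start + length])
        # Deamination: the chance of C->T halves with each base away from the 5' end.
        for i, base in enumerate(fragment):
            if base == "C" and rng.random() < deam_rate * 0.5 ** i:
                fragment[i] = "T"
        reads.append((start, "".join(fragment)))
    return reads


rng = random.Random(42)  # fixed seed so the run is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(500))  # made-up reference
damaged = simulate_decay(genome, n_reads=100, mean_len=60, deam_rate=0.3, rng=rng)
for start, read in damaged[:3]:
    print(start, len(read), read[:20])
```

The damaged reads can then be passed to a taxonomic classifier, and the assigned taxa compared with the known origin of the reference.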

Results: we can assign genus, but not yet species

Who we are

  • Post-doc: Emrah Kırkdök (PhD Gebze Tech Univ)
  • Bordeaux Bioinformatics Center CBiB - Université Bordeaux - CNRS
  • Archaeological Research Laboratory, Stockholm University, Sweden
  • Dept. of Biological Sciences, Middle East Technical University, Ankara, Turkey

How can we improve the process?

Question 2

You can do genomics

How do the bacteria work?

To improve the process we need to see inside the black box. We sequenced the complete genome of 3 bacteria

  • Acidithiobacillus ferrooxidans
  • Acidithiobacillus thiooxidans
  • Leptospirillum ferrooxidans

We paid over USD 150K. Today you can do it for USD 5K

You can measure gene expression

You can model metabolism: FBA

You can predict which genes code for enzymes

Each enzyme catalyzes a reaction, with a known stoichiometry

Every reaction gives an equation

All equations plus boundary conditions give a model to predict metabolite concentrations
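
The idea that every reaction gives an equation can be sketched with a toy pathway. Flux balance analysis (FBA) assumes the cell is at steady state, so for each internal metabolite production and consumption cancel out. The example below only builds the balance equations and checks them for two flux vectors; the linear programming step that FBA uses to choose the best fluxes is left out. The pathway, metabolite names and flux values are invented for illustration.

```python
# Each reaction maps metabolites to stoichiometric coefficients
# (negative = consumed, positive = produced). Hypothetical toy pathway.
reactions = {
    "uptake": {"A": +1},            # -> A
    "r1": {"A": -1, "B": +1},       # A -> B
    "r2": {"B": -1, "C": +2},       # B -> 2 C
    "export": {"C": -1},            # C ->
}
metabolites = ["A", "B", "C"]


def stoichiometric_matrix(reactions, metabolites):
    """One row per metabolite, one column per reaction."""
    return [[coeffs.get(m, 0) for coeffs in reactions.values()] for m in metabolites]


def net_production(S, fluxes):
    """Rate of change of each metabolite for the given reaction fluxes (S times v)."""
    return [sum(s * v for s, v in zip(row, fluxes)) for row in S]


S = stoichiometric_matrix(reactions, metabolites)
balanced = [10, 10, 10, 20]     # export runs twice as fast because r2 makes 2 C
unbalanced = [10, 10, 10, 10]   # C accumulates, so this is not a steady state

print(net_production(S, balanced))    # [0, 0, 0]
print(net_production(S, unbalanced))  # [0, 0, 10]
```

In a genome-scale model the matrix has thousands of columns, and the boundary conditions are limits on the uptake and export fluxes.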

We can predict how the cell adapts to environmental changes

You can model regulation

You can predict which genes code for transcription factors and combine with expression data to find the “most probable” regulatory network

Genomics of natural resources: Salmon

The genome of a natural resource is as valuable as the resource itself

Salmon farming is important in Chile

Some towns depend 100% on the Salmon industry

And when fish get sick, people lose their jobs

You can collect all your data

Or you can sequence everything

DNA sequencing is cheap

A DNA sequencer on every desktop

The first computers were big and expensive

Only in a few universities, used by experts

Then there was one in every office… and home

Today everybody has one… in the pocket

A PlayStation has more power than the biggest computer of 1998

Can the same happen with DNA sequencing?

The next iPhone

Today you can buy a DNA sequencer the size of an iPhone

… at the price of an iPhone

Next step: people will make apps for DNA sequencers

Computing power grows exponentially

Moore’s Law
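
To get a feeling for exponential growth, here is a small calculation. It assumes the usual statement of Moore's law, a doubling roughly every two years; the `doubling_time` parameter and the function name are choices made for this example.

```python
def growth_factor(years, doubling_time=2.0):
    """How many times larger a quantity is after `years` of steady doubling."""
    return 2 ** (years / doubling_time)


print(growth_factor(10))  # 32.0
print(growth_factor(20))  # 1024.0
```

Ten years give a factor of 32, and twenty years give a factor of about one thousand.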

Big Picture

  • Sequencing cost is going down
  • Moore’s law: computing is cheap
  • There is a phase transition: we changed from “solid” to “liquid”
  • Rules are changing very fast
  • For example, patents are obsolete

Producing data is cheap

There is already a lot of public data

  • Tara Oceans published 7.8 Terabytes of metagenomic data
    • equivalent to 1,660 DVDs
  • Anybody can discover new knowledge there
    • You can do it!
    • And also many other people
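
The DVD figure can be checked with a quick calculation, assuming decimal terabytes and single-layer DVDs of 4.7 GB each (both are assumptions of this sketch).

```python
TERABYTE = 10 ** 12             # bytes, decimal definition
DVD_CAPACITY = 4.7 * 10 ** 9    # bytes on a single-layer DVD (assumed)

dataset = 7.8 * TERABYTE
dvds = dataset / DVD_CAPACITY
print(round(dvds))  # 1660
```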

This is a race

It doesn’t depend on hands and wallets

It depends on brains and guts

All science is Data Science

But Data Science is not about Data

It is about extracting

  • Information
  • Knowledge
  • Wisdom

How to do a good job

my opinion

Ask a good question

  • Solve a problem relevant to your community

  • About a general issue, not too particular
  • See “the forest”, not only “some trees”

Measure new things

If you use

  • the same instruments as everybody
  • on the same organisms as everybody
  • with the same questions as everybody

it is like being a cover band. We can play well, but we will not make a real impact

Create a new instrument: What I cannot create, I cannot understand

Extract new knowledge from data

  • Build a model and validate it

  • Follow complex ideas

  • If it is easy, anybody else can do it
    • Never say “Kolay gelsin” (“may it come easy”)
    • Better say “İyi çalışmalar” (“good work”)

Collaborate

Really new ideas come from other fields

You do not need to be an expert on everything

You need to speak with other experts

  • English
  • Ideas

Get feedback from referees: publish

Thank you