Genome engineering technology
Bacterial immune system:
This new genome engineering technology starts with a bacterial immune system, which means understanding how bacteria fight off a viral infection. It turns out that a lot of bacteria have, in their chromosome, repetitive sequences that are interspaced with sequences derived from viruses. These had been noticed by microbiologists who were sequencing bacterial genomes, but nobody knew what the function of these sequences might be until it was noticed that they also tend to occur with a series of genes that often encode proteins with homology to enzymes that do interesting things like DNA repair. So the hypothesis was that this system, which came to be called CRISPR (an acronym for clustered regularly interspaced short palindromic repeats, which describes this type of repetitive locus), could actually be an acquired immune system in bacteria, one that might allow sequences to be integrated from viruses and then somehow used later to protect the cell from an infection with that same virus.
Incorporation of foreign DNA:
The first step is the incorporation of viral sequences into these genomic loci. What
emerged over the next several years was that these CRISPR systems really are
acquired immune systems in bacteria. Until this point no one knew that bacteria
could actually have a way to adapt to viruses that get into the cell, but this
is a way that they do it. It involves detecting foreign DNA that gets injected
into the cell by a virus, and the CRISPR system then allows integration of
short pieces of those viral DNA molecules into the CRISPR locus.
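As a rough illustration of this acquisition step, the sketch below models a CRISPR locus as a list of spacers separated by a repeat. The class name `CrisprLocus`, the repeat sequence, the spacer length, and the choice to store the newest spacer first are all assumptions made for the example, not details taken from the talk.

```python
from dataclasses import dataclass, field

# Hypothetical repeat sequence, for illustration only.
REPEAT = "GTTTTAGAGC"


@dataclass
class CrisprLocus:
    """Toy model of a CRISPR locus: repeats interspaced with viral spacers."""
    repeat: str = REPEAT
    spacers: list = field(default_factory=list)

    def acquire(self, viral_dna: str, start: int, length: int = 8) -> str:
        """Copy a short piece of viral DNA into the locus as a new spacer."""
        spacer = viral_dna[start:start + length]
        if start < 0 or len(spacer) < length:
            raise ValueError("not enough viral DNA at that position")
        # Assumption of this sketch: the newest spacer is stored first.
        self.spacers.insert(0, spacer)
        return spacer

    def array(self) -> str:
        """Render the locus as repeat-spacer-repeat-..."""
        parts = [self.repeat]
        for spacer in self.spacers:
            parts += [spacer, self.repeat]
        return "-".join(parts)

    def remembers(self, dna: str) -> bool:
        """True if any stored spacer occurs in the given DNA."""
        return any(spacer in dna for spacer in self.spacers)


locus = CrisprLocus()
phage = "ATGCCGTTAGGCTAACGTTAGC"  # made-up viral sequence
locus.acquire(phage, start=4)
print(locus.array())
print(locus.remembers(phage))
```

The point of the model is only that the locus becomes a record of past infections: once a spacer is stored, the same viral sequence can be recognized again.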
Transcription of CRISPR sequences:
In the second step, known as CRISPR RNA biogenesis, these CRISPR sequences are
transcribed in the cell into pieces of RNA that are subsequently used together
with proteins encoded by the cas genes (the CRISPR-associated genes) to form
interference complexes. These complexes can use the information in the form of
these RNA molecules to base-pair with matching sequences in viral DNA. So this
is a very nifty way that bacteria have come up with to take their invaders and
turn the sequence information against them. In my own laboratory we have been
very interested for a long time in understanding how RNA molecules are used to
help cells regulate the expression of proteins from the genome.
Cas9 protein:
The work focused on an organism called Streptococcus pyogenes, which is a
bacterium that can cause very severe infections in humans. What was curious in
this bug was that it has a CRISPR system in which a single gene, encoding a
protein known as Cas9, had been shown genetically to be required for the
function of the CRISPR system, but nobody knew at the time what the function of
that protein was. And so we got together and recruited people from our
respective research labs to start testing the function of Cas9. Cas9 is
actually a fascinating protein that has the ability to interact with DNA and
generate a double-stranded break in DNA at sequences that match the sequence in
a guide RNA. In this slide, what you are seeing is the guide RNA, with the
sequence of the guide in orange, which base-pairs with one strand of the
double-helical DNA. Very importantly, this RNA interacts with a second RNA
molecule called tracrRNA that forms a structure that recruits the Cas9 protein.
Single guide RNA:
So those two RNAs and a single protein are what are required in nature for this
protein to recognize what would normally be viral DNAs in the cell, and the
protein is able to cut these up, literally by breaking the double-helical DNA.
It is possible to generate a simpler system than nature has done by linking
together these two RNA molecules, giving a system that would be a single
protein and a single guiding RNA. So the idea was basically to take these two
RNAs and link them together to create what we call a single guide RNA.
Plasmid DNA:
This would be a programmable DNA-cleaving enzyme, and to test it the idea was
to generate short single guide RNAs that recognize different sites in the
circular DNA molecule that you see here. The guide RNAs were designed to
recognize sequences in the plasmid, that circular DNA molecule, which was then
incubated with two different restriction (or cutting) enzymes: one called SalI,
which cuts the DNA sort of upstream at the far end of the DNA, and the second
being the RNA-guided Cas9, directed to these different sites. After this
incubation reaction with plasmid DNA, an agarose gel allows us to separate the
cleaved molecules of DNA, and in each of these reaction lanes we get a
different-sized DNA molecule released from this doubly digested plasmid, in
which the size of the DNA corresponds to cleavage at the different sites
directed by these guide RNA sequences.
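The arithmetic behind this readout can be sketched in a few lines. Cutting a circular molecule at two positions always gives two linear fragments whose sizes add up to the plasmid length, so moving the Cas9 site changes the fragment sizes seen on the gel. The function name `double_digest_fragments`, the plasmid length, and all site positions below are hypothetical values chosen for the example, not numbers from the experiment.

```python
def double_digest_fragments(plasmid_len: int, sali_site: int, cas9_site: int) -> list:
    """Return the two fragment sizes (in bp, smallest first) produced by
    cutting a circular plasmid once at sali_site and once at cas9_site."""
    for site in (sali_site, cas9_site):
        if not 0 <= site < plasmid_len:
            raise ValueError("cut site lies outside the plasmid")
    if sali_site == cas9_site:
        raise ValueError("two distinct cut sites are needed")
    # One fragment is the arc between the sites, the other is the rest of the circle.
    arc = abs(sali_site - cas9_site)
    return sorted((arc, plasmid_len - arc))


# Hypothetical 5000 bp plasmid with SalI at position 100 and three guide RNA sites.
for cas9_site in (1600, 2600, 4100):
    print(cas9_site, double_digest_fragments(5000, 100, cas9_site))
```

Each guide RNA gives a different pair of sizes, which is why each lane of the gel shows a different band.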
Double-stranded breaks:
What this showed is that Cas9 is a programmable DNA-cutting enzyme and that we
can program it with a short piece of RNA to cleave essentially any
double-stranded DNA sequence. The reason that an enzyme that can be programmed
to generate double-stranded DNA breaks at any sequence matters is that there
was a long-standing set of experiments in the scientific community showing that
cells have ways of repairing double-stranded DNA breaks that lead to changes in
the genomic information in DNA. This slide shows what happens after a
double-stranded break is generated by any kind of enzyme that might do this,
including the Cas9 system.
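To make "programming with a short piece of RNA" concrete, here is a minimal sketch that treats target recognition as a string search: the guide RNA is converted to its DNA equivalent and looked up on both strands, and the break is then placed in the middle of the matched site. This is a deliberate simplification. It ignores any further requirements of the real enzyme that the talk does not go into, and the function names, sequences, and the mid-site break position are assumptions of the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the sequence of the opposite DNA strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def find_targets(dna: str, guide_rna: str) -> list:
    """Return (position, strand) for every site that matches the guide."""
    site = guide_rna.upper().replace("U", "T")  # DNA equivalent of the guide
    hits = []
    for strand, query in (("+", site), ("-", reverse_complement(site))):
        pos = dna.find(query)
        while pos != -1:
            hits.append((pos, strand))
            pos = dna.find(query, pos + 1)
    return hits


def cleave(dna: str, position: int, site_len: int) -> tuple:
    """Break the DNA in the middle of the matched site (simplified break position)."""
    break_at = position + site_len // 2
    return dna[:break_at], dna[break_at:]


dna = "TTTTGACCTGAAGGCATCGTAAAA"   # made-up target DNA
guide = "GACCUGAAGGCAUCG"           # made-up guide sequence
for position, strand in find_targets(dna, guide):
    print(strand, cleave(dna, position, len(guide)))
```

Changing only the `guide` string redirects the cut to a different site, which is the sense in which the enzyme is programmable.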
Non-homologous end joining:
Those double-stranded breaks in a cell are detected and repaired by two types
of pathways. One, on the left, involves non-homologous end joining, in which
the ends of the DNA are chemically ligated back together, usually with the
introduction of a small insertion or deletion at the site of the break.
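A toy model of this outcome, under stated assumptions: the break is rejoined with either a few random bases inserted or a few bases lost from the left end. The even split between insertion and deletion, the size limit `max_indel`, and the function name are arbitrary choices for the sketch and say nothing about real repair frequencies.

```python
import random


def nhej_repair(dna: str, break_at: int, rng: random.Random, max_indel: int = 3) -> str:
    """Rejoin the two ends of a break, introducing a small insertion or deletion."""
    left, right = dna[:break_at], dna[break_at:]
    size = rng.randint(1, max_indel)  # indel size in bases
    if rng.random() < 0.5:
        # Insertion: a few random bases are added at the junction.
        inserted = "".join(rng.choice("ACGT") for _ in range(size))
        return left + inserted + right
    # Deletion: a few bases are lost from the end of the left fragment.
    return left[:max(0, len(left) - size)] + right


rng = random.Random(0)  # seeded so the run is repeatable
original = "ACGTACGTACGT"
for _ in range(3):
    print(nhej_repair(original, break_at=6, rng=rng))
```

Because the repaired sequence differs from the original by a few bases at the break, this pathway is a natural way to disrupt a sequence at a chosen site.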
Homology-directed repair:
Another way that repair occurs is through homology-directed repair, in which a
donor DNA molecule that has sequences matching those flanking the site of the
double-stranded break can be integrated into the genome at the site of the
break, introducing new genetic information into the genome.
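The role of the matching flanks can be sketched as follows. The donor is modeled as left arm, payload, right arm, and the payload is integrated only if both arms match the genome on either side of the break. The function name `hdr_repair`, the fixed arm length, and the sequences are hypothetical and chosen only to keep the example short.

```python
def hdr_repair(genome: str, break_at: int, donor: str, arm_len: int) -> str:
    """Insert the donor payload at the break if the donor arms match the flanks."""
    if arm_len <= 0 or len(donor) < 2 * arm_len:
        raise ValueError("donor is too short for the requested arm length")
    if break_at < arm_len or break_at + arm_len > len(genome):
        raise ValueError("break is too close to the end of the genome")
    left_arm = donor[:arm_len]
    payload = donor[arm_len:len(donor) - arm_len]
    right_arm = donor[len(donor) - arm_len:]
    left_flank = genome[break_at - arm_len:break_at]
    right_flank = genome[break_at:break_at + arm_len]
    if left_arm != left_flank or right_arm != right_flank:
        raise ValueError("donor arms do not match the sequence flanking the break")
    # New genetic information ends up between the two original flanks.
    return genome[:break_at] + payload + genome[break_at:]


genome = "AAACCCGGGTTT"            # made-up genome, break between CCC and GGG
donor = "CCC" + "TATA" + "GGG"     # arms match the flanks, TATA is the payload
print(hdr_repair(genome, break_at=6, donor=donor, arm_len=3))
```

In contrast to end joining, the result here is determined by the donor, which is what makes this pathway useful for writing a specific change.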
Different tools for repair system:
If we can generate double-stranded breaks at targeted sites in the DNA of a
cell, then together with all of the genome sequencing data that are now
available, where we know the whole genetic sequence of a cell and where a
mutation occurred that causes a disease, for example, you could actually use a
technology like this to introduce DNA that would fix a mutation, or generate a
mutation you might like to study in a research setting. So the power of this
technology is really the idea that we can now generate these types of
double-stranded breaks at sites that we choose as scientists by programming
Cas9, and then allow the cell to make repairs that introduce genomic changes at
the sites of these breaks. But the challenge was how to generate the breaks in
the first place, and a number of different strategies had been produced for
doing this in different labs. Two specific examples are shown here, one called
zinc finger nucleases and the other TAL effector domains. These are both
programmable ways to generate double-stranded breaks in DNA that rely on
protein-based recognition of DNA sequences. These proteins are modular and can
be generated in different combinations of modules to recognize different DNA
sequences. It works as a technology, but it requires a lot of protein
engineering to do so.
CRISPR/Cas9 enzyme:
What is different about this CRISPR/Cas9 enzyme is that it is an RNA-programmed protein, so a single protein can be used for any site of DNA where we would like to generate a break, simply by changing the sequence of the guide RNA associated with Cas9. So instead of relying on protein-based recognition of DNA, we're relying on RNA-based recognition of DNA. The system is simple enough to use that anybody with basic molecular biology training can take advantage of it to do genome engineering. So this is a tool that really fills out an essential and previously missing component of what we could call biology's IT toolbox. That toolbox includes the ability to sequence DNA and look at its structure (we have known about the double helix since the 1950s), and in the last few decades it has been possible to use enzymes like restriction enzymes and the polymerase chain reaction to isolate and amplify particular segments of DNA. Now with Cas9 we have a technology that enables facile genome engineering and that is available to labs around the world for experiments they might want to do. So this is a summary of the technology: it is a two-component system that relies on RNA-DNA base pairing for recognition. Very importantly, because of the way that this system works, it is actually quite straightforward to do something called multiplexing, which means we can program Cas9 with multiple different guide RNAs in the same cell to generate multiple breaks and do things like cut out large segments of a chromosome and simply delete them in one experiment.
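A minimal sketch of the multiplexing idea, assuming two guide RNAs and a chromosome represented as a plain string: each guide is located by exact match, a break is placed in the middle of each site, and the segment between the two breaks is dropped when the outer ends are rejoined. The function names, the mid-site break position, and all sequences are hypothetical simplifications for illustration.

```python
def break_point(chromosome: str, guide_rna: str) -> int:
    """Locate the guide's matching site and return a break position inside it."""
    site = guide_rna.upper().replace("U", "T")  # DNA equivalent of the guide
    pos = chromosome.find(site)
    if pos == -1:
        raise ValueError("guide has no matching site: " + guide_rna)
    return pos + len(site) // 2  # simplified: break in the middle of the site


def multiplex_delete(chromosome: str, guide_a: str, guide_b: str) -> tuple:
    """Cut at two guide-directed sites and rejoin the outer ends.

    Returns (edited_chromosome, deleted_segment)."""
    first, second = sorted(
        (break_point(chromosome, guide_a), break_point(chromosome, guide_b))
    )
    return chromosome[:first] + chromosome[second:], chromosome[first:second]


# Made-up chromosome with two target sites flanking a stretch to remove.
chromosome = "AAAA" + "GACCTG" + "TTTTTTTTTT" + "CATCGA" + "CCCC"
edited, deleted = multiplex_delete(chromosome, "GACCUG", "CAUCGA")
print(edited)
print(deleted)
```

The same protein is used for both cuts; only the two guide sequences differ, which is why adding more targets costs so little.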
Conclusion
This has been taken up across the field of biology and genetics, with many labs
around the world adopting this technology for all sorts of very interesting and
creative kinds of applications. This is a slide that's actually almost out of
date now, but just to give you a sense of the way that the field has really
taken off: we published our original work on Cas9 in 2012, and up until that
point there was very little research going on in CRISPR biology anywhere. It
was a very small field. Then you can see that starting in 2013 and extending
until now there has been this incredible explosion in publications from labs
that are using this as a genome engineering technology. So it's been really
very exciting for me as a basic scientist to see what started as a fundamental
research project turn into a technology that turns out to be very enabling for
all sorts of exciting experiments. This slide shows some of the ways this
technology is being used. On the left-hand side, of course, there is lots of
basic biology that can be done now with the engineering of model organisms and
different kinds of cell lines that are cultured in the laboratory to study the
behavior of cells. There is also biotechnology, being able to make targeted
changes in plants and various kinds of fungi that could be very useful for
different sorts of industrial applications. And then of course there is
biomedicine, with lots of interest in the potential to use this technology as a
tool for coming up with novel therapies for human disease, which I think is
very exciting and is really something that is on the horizon already. This
slide really just indicates that, in the future, a lot of interesting and
creative kinds of directions are coming along in different labs, both in
academic research laboratories and increasingly in commercial labs, that are
going to enable the use of this technology for all sorts of applications.