The DNA Code and Codons
What is the DNA code?
The DNA code is really the 'language of life.' It contains the instructions for making a living thing. The DNA code is made up of a simple alphabet consisting of only four 'letters' and 64 three-letter 'words' called codons. It may be hard to believe that most of the wonderful diversity of life is based on a 'language' simpler than English—but it’s true.
This code isn't literally made up of letters and words. Instead, the four letters represent four individual molecules called nucleotides, or bases: thymine (T), adenine (A), cytosine (C), and guanine (G). The order, or sequence, of these bases creates a unique genetic code.
The codon 'words' in the genetic code are each three nucleotides long, and there are 64 of them. If you do the math (4 × 4 × 4 = 64), this is as many three-letter words as you can make with just four letters. ATG and CCC are a couple of examples of codons.
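The counting argument above can be checked with a short Python sketch. The variable names are illustrative only; the script simply lists every ordered three-letter combination of the four DNA letters.

```python
from itertools import product

NUCLEOTIDES = "TCAG"  # the four DNA 'letters'

# Every ordered choice of three letters, repeats allowed: 4 * 4 * 4 = 64.
codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]

print(len(codons))      # 64
print("ATG" in codons)  # True
print("CCC" in codons)  # True
```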
Just as human languages like English have more than letters and words, such as punctuation and capitalization, so does the genetic code. For example, instead of capitalizing the start of a sentence, the genetic code almost always signals the start of new instructions with ATG, one of those three-letter codons.
And instead of periods, genes end with one of three different codons: TAG, TAA, or TGA. Other parts of the DNA that are not codons can also act as a sort of punctuation, signaling, for example, when, where, and how strongly a gene should be read.
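As a rough illustration of these start and stop signals, the sketch below scans a DNA string for the first ATG and then reads three letters at a time until it reaches TAG, TAA, or TGA. The function name and the example sequence are made up for illustration, and real gene finding is considerably more involved than this.

```python
STOP_CODONS = {"TAG", "TAA", "TGA"}

def find_open_reading_frame(dna):
    """Return the codons from the first ATG through the first in-frame
    stop codon, or an empty list if either signal is missing."""
    start = dna.find("ATG")
    if start == -1:
        return []
    codons = []
    # Step through the sequence three letters at a time from the start codon.
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        codons.append(codon)
        if codon in STOP_CODONS:
            return codons
    return []  # reached the end without finding a stop codon

# A made-up sequence: two extra letters, then ATG CCC TTT TAG, then two more.
print(find_open_reading_frame("GGATGCCCTTTTAGCC"))
# ['ATG', 'CCC', 'TTT', 'TAG']
```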
How Does DNA Encode Information?
One of the key ways that DNA encodes information inside of cells is through genes. Humans have around 20,000 genes. Each gene has the instructions for making a specific protein, and each protein does a specific job in the cell.
For example, the lactase gene has the instructions for making the lactase protein. The lactase protein breaks down lactose, the sugar found in milk. People whose lactase gene is turned off are lactose intolerant.
The instructions for making these proteins are encoded in the three-nucleotide codons discussed earlier. But just as a set of instructions has to be read before anything gets built, the instructions encoded in the DNA must also be read.
For example, the DNA that carries the code for the lactase protein cannot break down lactose by itself. Instead, to digest lactose, a cell must first read the gene and then make the lactase protein.
The first step in reading a gene is to transfer the information from DNA to messenger RNA (mRNA) using a protein called RNA polymerase (in humans, the polymerase that reads genes like lactase is RNA polymerase II). This process is called transcription.
The mRNA then heads over to a protein-making machine in the cell called a ribosome. There, the mRNA is translated into the specific protein for which it carries the instructions. The lactase mRNA, for example, is translated into the protein lactase at the ribosome.
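The transcription step can be sketched in a few lines of Python. This is a simplification under stated assumptions: the input is taken to be the gene's coding strand, whose sequence matches the mRNA except that RNA uses uracil (U) where DNA uses thymine (T). The function name and the toy sequence are hypothetical, and the real process carried out by RNA polymerase involves far more than a letter swap.

```python
def transcribe(coding_strand):
    """Return the mRNA sequence for a DNA coding strand (T becomes U)."""
    return coding_strand.upper().replace("T", "U")

# A made-up nine-letter 'gene', not a real lactase sequence.
print(transcribe("ATGCCCTAG"))  # AUGCCCUAG
```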
What Do Codons Code For?
A codon is a sequence of three nucleotides on a strand of DNA or RNA. Each codon is like a three-letter word, and all of these codons together make up the DNA (or RNA) instructions. Because there are only four nucleotides in DNA and RNA, there are only 64 possible codons.
Of the 64 codons, 61 code for amino acids, which are the building blocks for proteins. Proteins are made by attaching a series of amino acids together. Each protein is different because of the order and number of amino acids it has. So the DNA code is really just the instructions for stringing together the right number and type of amino acids in the right order.
The three codons that do not code for amino acids are called stop codons. Think of them as periods at the end of a sentence. They serve as the stop signal that tells the ribosome that it has come to the end of the protein instructions and to stop adding amino acids. In RNA, the nucleotide base thymine (T) is replaced by the nucleotide base uracil (U). The three stop codons in mRNA are UAG, UAA, and UGA.
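To make the reading process concrete, here is a minimal Python sketch of translation using the standard genetic code. The 64-letter string lists the one-letter amino acid symbols in the conventional U, C, A, G codon order, with `*` marking a stop codon. The names `CODON_TABLE` and `translate` are illustrative, and the input mRNA is made up.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for UUU, UUC, UUA, UUG, UCU, ... GGG; '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Read an mRNA three letters at a time and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: the 'period' at the end
            break
        protein.append(amino_acid)
    return "".join(protein)

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
print(stop_codons)                # ['UAA', 'UAG', 'UGA']
print(translate("AUGUUUUCUUAG"))  # MFS (Met, Phe, Ser, then stop)
```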
While 61 codons code for amino acids, proteins are built from only 20 standard amino acids, so there are more codons than strictly necessary. This is known as redundancy. An amino acid can have more than one codon that codes for it. For example, both UUU and UUC code for the amino acid phenylalanine (Phe).
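The redundancy can be counted directly. The sketch below rebuilds the standard codon table (amino acids as one-letter symbols, `*` for stop) and tallies how many codons map to each amino acid; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Count the codons for each amino acid, leaving out the three stop codons.
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(sum(codons_per_amino_acid.values()))  # 61 codons code for amino acids
print(len(codons_per_amino_acid))           # 20 different amino acids
print(codons_per_amino_acid["F"])           # 2 (phenylalanine: UUU and UUC)
print(codons_per_amino_acid["L"])           # 6 (leucine has the most, along with Ser and Arg)
```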
Redundancy helps lessen the impact of changes in the DNA. For a protein to work optimally, it needs to have the right amino acid in the right place. Any changes in a gene that change one amino acid into another can cause a protein to stop working.
While this might not be a big deal for the lactase gene (you just have to take Lactaid when you drink milk), for other genes the effects can be more severe. Sickle cell anemia is a case where a single amino acid change in the beta globin protein, caused by a change in the beta globin gene, leads to the disease.
Redundancy makes mutations less likely to change an amino acid, and thus less likely to cause disease. Some changes in the DNA, called silent mutations, result in the same amino acid. If a C replaces the last U in UCU to form UCC, for instance, the codon will still code for the same amino acid: serine (Ser). Having more than one codon per amino acid can therefore prevent the creation of a nonfunctional protein.
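A small Python sketch can show the difference between a silent mutation and one that changes the amino acid. It uses the standard codon table with one-letter amino acid symbols; the helper name `is_silent` is hypothetical. The second example uses the well-known sickle cell change in beta globin, where the codon GAG (glutamic acid, E) becomes GUG (valine, V).

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def is_silent(original_codon, mutated_codon):
    """True if both codons code for the same amino acid."""
    return CODON_TABLE[original_codon] == CODON_TABLE[mutated_codon]

print(is_silent("UCU", "UCC"))  # True: both code for serine (S)
print(is_silent("GAG", "GUG"))  # False: E becomes V, as in sickle cell anemia
```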
How Many Possible Codons Are There?
Most organisms, including humans, share a genetic code with 64 codons that work the same way. In fact, this shared code even goes by the name 'Universal Genetic Code.' One example is ACG, which codes for the amino acid threonine (Thr) in humans, cats, and plants.
However, research has shown that some bacteria have codons that code differently. For example, the stop codon UGA can code for the amino acid glycine (Gly) in some bacteria. Likewise, UGA can encode tryptophan (Trp) in the mitochondria of some organisms.
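One way to picture these exceptions is as a copy of the standard table with a single entry changed. The sketch below is only an illustration of that idea: the names `mito_like` and `bacterial_like` are hypothetical, the mRNA is made up, and real variant codes (for example, mitochondrial ones) can differ from the standard code in more than one codon.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical variant tables: only UGA is reassigned here.
mito_like = {**STANDARD, "UGA": "W"}       # UGA read as tryptophan
bacterial_like = {**STANDARD, "UGA": "G"}  # UGA read as glycine

def translate(mrna, table):
    """Translate an mRNA with the given codon table, stopping at '*'."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

mrna = "AUGUGAUUUUAA"  # AUG UGA UUU UAA
print(translate(mrna, STANDARD))        # M   (UGA is a stop)
print(translate(mrna, mito_like))       # MWF (UGA is tryptophan)
print(translate(mrna, bacterial_like))  # MGF (UGA is glycine)
```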
What Does DNA Provide the Code For?
Only about two percent of the DNA inside your cells actually codes for proteins. The rest is sometimes called junk DNA, but scientists may have been a bit hasty in giving it that name. This non-coding DNA has many different functions in the cell, such as regulating genes. Non-coding DNA can help turn genes on and off, provide a place for proteins to bind so they can do their work, and so on. Studying non-coding DNA is an active area of research right now.