By Robert Hazen, George Mason University
DNA has a ladder-like, double-helix structure whose rungs are complementary base pairs: A-T or C-G. Because the four bases are read in groups of three, called codons, the complete genetic vocabulary consists, in a sense, of 64 different words. How was that vocabulary deciphered?

Our Genetic Vocabulary
DNA’s double-helix structure has two vertical sides made of alternating sugar and phosphate groups. They form the strong uprights of the ladder, and the rungs are the base pairs. Because four different letters make up the base pairs (A, C, G, and T), we have a four-letter alphabet that conveys information. And because the double helix can be split apart, with each half rebuilt into a complete double helix, there are two where before there was one; the molecule can duplicate its information as well.
Thus, reading those letters three at a time, our complete genetic vocabulary consists of 4 × 4 × 4 = 64 different codons. That is our complete genetic code. It turns out that 61 of the codons specify amino acids, while the remaining three act as ‘punctuation’: they are stop codons that mark the end of a gene’s protein-coding message. One codon, ATG (AUG on the RNA side), codes for the amino acid methionine and also marks the beginning of a gene’s coding sequence, so newly made proteins begin with methionine. The code is quite redundant.
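The counting above can be checked with a short Python sketch. It enumerates every three-letter word over the four-letter alphabet and sets aside the stop codons. The names `BASES`, `STOP_CODONS`, and `START_CODON` are illustrative, and the specific stop codons (TAA, TAG, TGA) are taken from the standard genetic code rather than from the text.

```python
from itertools import product

# The four-letter DNA alphabet.
BASES = "ACGT"

# Every possible three-letter word (codon): 4 * 4 * 4 combinations.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Assumption: the three stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# ATG codes for methionine and also signals the start of a coding sequence.
START_CODON = "ATG"

# Codons that specify an amino acid are everything except the stops.
sense_codons = [codon for codon in codons if codon not in STOP_CODONS]

print(len(codons))        # 64
print(len(sense_codons))  # 61
```

Note that the start codon is counted among the 61 amino-acid codons, because it does double duty as the codon for methionine.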
This is a transcript from the video series The Joy of Science. Watch it now, on Wondrium.
Using the Same Genetic Code
As we all know, there are only 20 different amino acids, but there are 61 different codons that specify amino acids. In many cases, 3 or 4 or more of those codons specify the same amino acid. So there’s a certain safety built in; one can have certain kinds of mutations and still have the correct protein being built. By 1964, it was discovered that this genetic code is common to every living thing on Earth. Apart from a few minor variations, every living organism known uses the same genetic code.
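The redundancy described here can be made concrete with a small Python sketch. It assumes the standard genetic code written in a common compact form: the 64 codons ordered with bases T, C, A, G, and a 64-character string of one-letter amino acid symbols in which `*` stands for a stop codon. The names `CODON_TABLE` and `degeneracy` are illustrative.

```python
from collections import Counter
from itertools import product

# Assumption: standard genetic code, codons ordered T, C, A, G
# (first base changes slowest, third base fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons specify each amino acid (stop codons excluded).
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(len(degeneracy))           # 20 distinct amino acids
print(sum(degeneracy.values()))  # 61 codons that specify an amino acid
print(degeneracy["L"])           # leucine has 6 codons
print(degeneracy["M"])           # methionine has only 1 (ATG)

# A "safe" mutation: changing the third base of GGA to C still gives glycine.
print(CODON_TABLE["GGA"], CODON_TABLE["GGC"])  # G G
```

The last line illustrates the built-in safety: a change in the third base of a codon often leaves the amino acid unchanged.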
Hence, we can see that the genetic code defines a fixed correspondence between three bases, on the one hand, and an amino acid, on the other: each codon specifies exactly one amino acid, even though most amino acids have several codons. The next step was to deduce the cellular mechanism by which messenger RNA, which is still just a string of information, serves as a template for actually making the protein.
Transfer and Ribosomal RNA
That brings us to the question: how does one physically construct the protein from amino acids? We have a message stretched out on a messenger RNA, and then we need transfer RNA. Transfer RNA is a fascinating two-ended molecule. On one end, it has three exposed bases (the anticodon), which pair with the three complementary bases of a codon. The other end of the molecule carries an amino acid. Picture this structure: the three bases at one end match a codon, and the other end holds the corresponding amino acid, so amino acids can be strung together in the order set by the sequence of bases on the messenger RNA.
In this simplified picture, there are 61 different kinds of transfer RNA, one for each of the 61 different codons, and each one corresponds to a specific amino acid. Assembly of a protein with 30 amino acids thus requires 30 transfer RNA molecules, one for each position in the chain; it also requires a messenger RNA with 3 times 30 bases, or 90 bases, of coding sequence. (Messenger RNA is single-stranded, so these are bases rather than base pairs.) Hence, one starts seeing this correspondence: the linking-up of transfer RNA, which matches codons to amino acids.
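The arithmetic and the codon-by-codon matching can be sketched in Python. This is a toy model under stated assumptions, not a description of the cellular machinery: the function `translate` and the example `message` are hypothetical, the table is the standard genetic code in the RNA alphabet (U in place of T), and reading simply starts at the first AUG and ends at the first stop codon.

```python
from itertools import product

# Assumption: standard genetic code in the RNA alphabet, codons ordered
# U, C, A, G; one-letter amino acid symbols, '*' marks a stop codon.
RNA_BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
RNA_CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(RNA_BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon or the end of the message."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is made
    protein = []
    # Step through the message three bases at a time, as each
    # transfer RNA would match one codon in turn.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = RNA_CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: the chain is complete
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical message: a start codon, 29 alanine codons, then a stop codon.
message = "AUG" + "GCU" * 29 + "UAA"
protein = translate(message)

print(len(protein))      # 30 amino acids
print(3 * len(protein))  # 90 coding bases
print(len(message))      # 93 bases including the stop codon
```

In this sketch the stop codon adds three bases beyond the 90 that encode amino acids, which is why the example message is 93 bases long.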
Finally, we need a third kind of RNA: ribosomal RNA. The ribosomes are structures in every cell which contain this ribosomal RNA, along with about 50 different proteins; this makes a chemical machinery for linking together the amino acids. This ribosome looks something like two balls, a smaller one perched on top of a larger one, and there is a groove where the two balls meet. The messenger RNA fits exactly into that groove between the two balls.
Similar Genetic Mechanism

As the long messenger RNA molecule feeds into the ribosome, appropriate transfer RNA molecules fit into grooves on the side of the larger ball, one after another, and the ball rotates, passing the messenger RNA through that groove.
As it does so, one after another, the transfer RNAs fit onto the surface of the ribosome; they link their amino acids together, and as the messenger RNA is pulled through the ribosome, assembly of the protein occurs at the bottom side of the ribosome. It’s an amazing process. What’s really remarkable is that every living thing uses the exact same genetic mechanism. Human genes can be inserted into yeast, or into E. coli, and the appropriate human protein will be produced by those simple, single-celled organisms.
Deciphering Success
Thus, one can say that, in the short span of half a century, molecular biologists have clearly deciphered the four-letter genetic alphabet. They’ve learned its complete 64-word vocabulary, and they have even learned a few sentences in that vocabulary. And yet, it was clearly the discovery of DNA’s structure which marked a historic transition point in the biological sciences and led us down this path.
Prior to 1953, when the structure was published, genetics was studied at the level of organisms, at the level of chromosomes, and at the level of molecules, and each level was studied by a completely different group. We had botanists and zoologists studying the organisms; we had cellular biologists looking at chromosomes; and we had people trained in organic chemistry looking at the molecules in chromosomes and trying to understand their chemistry. These groups had completely different training.
It was only after the DNA breakthrough that the field of genetics was unified by a common conceptual basis, a simple chemical basis for understanding how genetics works. It also helped us overcome the big challenge of how to read the complete genetic vocabulary and the book of life.