Molecular Biology: From DNA to RNA to Protein
4.1 Introduction: “Inner Life of the Genome”
The genome of one of the most well-studied prokaryotes, E. coli, is 4.6 million base pairs long (approximately 1.6 mm at 0.34 nm per base pair, if cut and stretched out).
Let’s do a rough calculation about how much DNA a eukaryotic cell has.
- A somatic cell has 2 full copies of every chromosome. The total number of base pairs for ALL chromosomes combined is 6 × 10^9 bp.
- We know the length of each base pair is 0.34 nm (see DNA structure chapter!)
If you were to take the DNA of all the chromosomes, stretch it out, and lay them end to end then what will the total length of ALL the DNA in the cell be?
Hopefully, you arrived at about 2.0 meters! For reference, that is approximately the height of LeBron James. Now consider the size of the eukaryotic nucleus, which ranges from 2 to 10 microns! For a good frame of reference, a SINGLE grain of salt is about 100 microns across, which is still 10 times larger than the upper range for nucleus size!
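The rough calculation above can be checked in a few lines of Python. This is a minimal sketch that uses only the numbers given in the text (6 × 10^9 bp, 0.34 nm per base pair, and a 10 micron nucleus); the function and variable names are illustrative.

```python
# Back-of-the-envelope DNA length from a base-pair count.
NM_PER_BP = 0.34  # length of one base pair in nanometers (from the text)


def dna_length_m(base_pairs: float) -> float:
    """Return the stretched-out length of DNA in meters."""
    return base_pairs * NM_PER_BP * 1e-9  # 1 nm = 1e-9 m


human_diploid_bp = 6e9        # all chromosomes of a somatic cell combined
nucleus_diameter_m = 10e-6    # upper end of the 2 to 10 micron range

length_m = dna_length_m(human_diploid_bp)
print(f"Total DNA length: {length_m:.2f} m")
print(f"DNA length / nucleus diameter: {length_m / nucleus_diameter_m:,.0f}x")
```

The DNA comes out roughly 200,000 times longer than the widest nucleus it has to fit into, which is the packaging problem in a single number.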
This results in an engineering problem that cells need to solve.
Problem 1: How does all of the DNA fit inside a bacterial cell and even more amazingly inside the nucleus of a eukaryotic cell?
At the same time, consider all the different cells in the body. The human body contains many different cell types, but each cell type shares the same genomic sequence. Despite sharing this sequence, cells not only develop into distinct types but also maintain the same cell type over time and across divisions.
They are also specialized in their function. Liver cells, for example, produce enzymes that help with detoxification, but they do not produce antibodies; that is the job of a different cell type.
These cells have different expression patterns due to the temporal and spatial regulation of genes. (See the video below for a good analogy).
This leads to the second problem to contend with: once the genome has been folded and packaged, how can small sections of it be opened so that the machinery that reads and expresses the information encoded in the genes (the transcriptional machinery!) can gain access?
The epigenome (“epi” means above in Greek, so epigenome means above genome) is the set of chemical modifications or marks that influence gene expression and are transferred across cell divisions and, in some limited cases, across generations of organisms.
In this chapter, you will learn about the solution to the ‘Packaging Problem’ and how cells regulate access to genes.
The solution involves interactions of DNA with specific proteins, leading to the formation of a nucleoprotein complex called CHROMATIN.
We will focus exclusively on Eukaryotic chromatin, although bacterial chromosomes (which are circular) also coil and supercoil with the help of proteins.
Levels 1 and 2 are factual and knowledge-based. Level Up, indicated by the “Target” symbol, is the goal.
When you have mastered the information in this chapter, you should be able to:
Level 1 and 2 (Knowledge and Comprehension)
- Draw/Label nucleosomes in a 10nm fiber to identify core histones, linker DNA, core DNA.
- Outline the steps by which a nucleosome is assembled.
- List the five major types of histone proteins, and describe what role each of them plays in the nucleosome
- Explain what form of chromatin is present during interphase.
- Define the roles of modifying amino acid side chains in altering nucleosome structure and how the structure of chromatin can be used for regulating gene expression.
  - What modifications would lead to activation of gene expression?
⊕ Level Up (Application, Analysis, Synthesis)
- Analyze/Predict/Interpret experimental data connected with DNA organization into chromatin.
- Analyze/Predict/Interpret experimental data connecting chromatin structure with gene-expression.
- Predict how mutations in key histones can alter chromatin structure /gene expression.
- Outline an experiment to purify histones from chromatin.
4.2 Types of Chromatin in Eukaryotic Cells
When we visualize chromosomes we usually think of the characteristic highly condensed structures (X shaped).
DNA in this form exists only transiently, for a brief phase in mitosis, when it is maximally condensed. These discrete packets are ideal for the job at hand, which is to separate accurately and segregate to opposite poles of the cell. For the majority of the time, a cell is in fact in interphase and not dividing.
During interphase [which can be further subdivided into Gap 1 (G1), S phase, and Gap 2 (G2)], cells are transcribing (making RNA) and translating (making proteins), sometimes on demand, depending on the needs of the cell.
Not all parts of the genome, however, are active. During interphase (Figure 4.1 below) chromatin exists in different states of condensation: the more condensed heterochromatin and the more dispersed euchromatin.
The transition between these chromatin forms involves changes in the amounts and types of proteins bound to the chromatin that can occur during gene regulation, i.e. when genes are turned on or off.
Experiments to be described later showed that active genes tend to be in the more dispersed euchromatin where enzymes of replication and transcription have easier access to the DNA. Transcriptionally inactive genes are heterochromatic, obscured by additional chromatin proteins present in heterochromatin.
4.3 Chromatin Organization
We can define three levels of chromatin organization in general terms:
- DNA wrapped around histone proteins (nucleosomes) like “beads on a string”.
- Multiple nucleosomes coiled (condensed) into 30 nm fiber (solenoid) structures.
- Higher-order packing of the 30 nm fiber into the eventual familiar metaphase chromosome.
Before we can discuss how these aspects were determined we should introduce the proteins that play a central role in the folding of DNA- the histones.
There are 5 major classes of histones: H1, H2A, H2B, H3, and H4. Histones are basic proteins containing many lysine and arginine amino acids. Their positively charged side chains enable these amino acids to bind to the acidic, negatively charged phosphodiester backbone of double-helical DNA. About a gram of histones is associated with each gram of DNA.
Histones are among the most highly conserved proteins. Very few amino acid differences distinguish a human histone from histones in a mouse, sea urchin, or yeast cell. For instance, the H4 from cows differs from H4 in peas by only 2 amino acids. This points to the fundamental role these proteins play.
Histones are characteristic of eukaryotes (i.e., organisms with a nucleus and nuclear envelope). Bacteria do not have them, although many archaea carry histone homologs.
4.3.2 First level of organization- The Nucleosome
Aspects of chromatin structure were determined by gentle disruption of the nuclear envelope, followed by salt extraction of the released chromatin. Salt extraction dissociates most of the proteins from the chromatin. The results of a low-salt extraction are shown in Fig. 4.2 (below).
When the low salt extract is centrifuged and the pellet resuspended, the remaining chromatin looks like beads on a string. DNA-wrapped nucleosomes are the beads, which are in turn linked by uniform lengths of the metaphorical DNA ‘string’.
Roger Kornberg, a son of Nobel Laureate Arthur Kornberg (discoverer of the first DNA polymerase enzyme of replication; see the next chapter), participated in the discovery and characterization of nucleosomes while still a post-doc!
The techniques he used are some of the ones we discussed in earlier chapters! Two types of experiments were used. One involved partial digestion of chromatin with micrococcal nuclease (an enzyme that degrades DNA).
Electrophoresis of DNA extracted from digests of nucleosome beads-on-a-string preparations generated DNA fragments about 200 base pairs in length.
In contrast, the digestion of DNA not associated with proteins yielded a continuous smear of randomly sized fragments.
These results suggested that the binding of proteins to DNA in chromatin protects regions of the DNA from nuclease digestion so that the enzyme can attack DNA only at sites separated by approximately 200 base pairs.
This led him to propose a model in which the basic unit of chromatin, the nucleosome (the beads), consists of ~200 bp of DNA: a stretch wound around a protein core plus a stretch of linker DNA that leads to the next nucleosome.
The length of the DNA extracted from the nucleosomes varies between organisms, ranging from 170 to 240 bp, but the variation results entirely from the length of the linker DNA between the nucleosomes.
The DNA that is wrapped around the histone proteins of the core is always approximately 147 base pairs long.
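The logic of the nuclease experiment can be illustrated with a toy simulation. This is a simplified sketch, not a model of the real enzyme: it assumes exactly one accessible cut site per 200 bp repeat in chromatin, every position accessible in protein-free DNA, and a hypothetical 4,000 bp stretch of DNA. All names are illustrative.

```python
import random

REPEAT_BP = 200  # approximate spacing between accessible sites (from the text)


def fragment_lengths(cut_sites, total_bp):
    """Lengths of the pieces left after cutting at the given positions."""
    edges = [0] + sorted(cut_sites) + [total_bp]
    return [end - start for start, end in zip(edges, edges[1:])]


def partial_digest(allowed_sites, n_cuts, total_bp, rng):
    """Cut at a random subset of the allowed sites (a partial digestion)."""
    return fragment_lengths(rng.sample(allowed_sites, n_cuts), total_bp)


rng = random.Random(0)  # fixed seed so the run is repeatable
total_bp = 4000         # hypothetical stretch of DNA

# Chromatin: the nuclease only reaches linker DNA, modeled here as one
# accessible site every REPEAT_BP base pairs.
linker_sites = list(range(REPEAT_BP, total_bp, REPEAT_BP))
chromatin = partial_digest(linker_sites, 8, total_bp, rng)

# Protein-free DNA: every internal position is accessible.
naked_sites = list(range(1, total_bp))
naked = partial_digest(naked_sites, 8, total_bp, rng)

print("chromatin fragments:", sorted(chromatin))
print("naked DNA fragments:", sorted(naked))
```

Under these assumptions every chromatin fragment is a multiple of 200 bp, which would appear as a ladder of discrete bands on a gel, while the protein-free DNA gives arbitrary lengths, which would appear as a smear.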
The nucleosome core particle consists of an octameric protein complex (two copies each of H2A, H2B, H3, and H4) with the 147 bp DNA wound around it.
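Because the core DNA is constant, the linker length follows directly from the repeat length. The sketch below works through that subtraction for the range quoted above and also estimates how many nucleosomes a diploid human cell would need, assuming a uniform 200 bp repeat across all 6 × 10^9 bp (a simplification; names are illustrative).

```python
CORE_DNA_BP = 147  # DNA wrapped around the histone octamer (from the text)


def linker_length_bp(repeat_length_bp: int) -> int:
    """Linker DNA per nucleosome = repeat length minus the constant core DNA."""
    if repeat_length_bp < CORE_DNA_BP:
        raise ValueError("repeat length cannot be shorter than the core DNA")
    return repeat_length_bp - CORE_DNA_BP


# Repeat lengths spanning the 170 to 240 bp range mentioned in the text.
linkers = {repeat: linker_length_bp(repeat) for repeat in (170, 200, 240)}
for repeat, linker in linkers.items():
    print(f"repeat {repeat} bp -> linker {linker} bp")

# Rough nucleosome count, assuming a uniform 200 bp repeat genome-wide.
nucleosomes_per_cell = 6e9 / 200
print(f"~{nucleosomes_per_cell:,.0f} nucleosomes per diploid human cell")
```

The linker therefore ranges from roughly 23 to 93 bp, and a single human cell needs on the order of 30 million nucleosomes.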
Identifying the proteins that make up the core of the nucleosome came next and relied on familiar techniques: separating proteins and running them on SDS-PAGE gels.
Assembly of the Nucleosome
Today, researchers know that nucleosomes have a common structure comprising two copies each of histones H2A, H2B, H3, and H4 that come together to form a histone octamer. Approximately 1.7 turns of DNA, or about 147 base pairs, wrap around this histone core.
All core histones share a conserved structure known as the histone fold, which consists of three alpha-helices connected by 2 short loops.
The N-terminal regions of the proteins, which do not participate in the fold, are referred to as histone tails and play a crucial role in the regulation of chromatin structure.
The assembly of the nucleosome occurs in a step-wise process with specific associations:
Two subcomplexes form first. H3 interacts with H4 to form a heterodimer, and H2A and H2B interact to form H2A.H2B heterodimers. The interaction between histones uses the histone folds in what is termed the handshake fold.
Two H3.H4 heterodimers then interact to form a tetramer, and the DNA begins to wrap around it. The H2A.H2B dimers then cap the complex at the top and bottom (Figure 4.3).
In the figure below core histones are color-coded (yellow = H2A, red = H2B, blue = H3, green = H4) and cylinders represent helices.
Before you continue you should
- Watch the Lecture videos that cover the material above. (if you haven’t already)
The three-dimensional structure of the nucleosome was solved using X-ray crystallography by Dr. Karolin Luger, which provided a deeper understanding of chromatin organization. The structure showed that DNA is held on the histone octamer by:
- Electrostatic interactions of negatively charged DNA (phosphodiester backbone) with positively charged amino acids of histones.
- Additional hydrogen bonds between the histones and the DNA, most of them to the sugar-phosphate backbone rather than to the bases.
This association is consistent with the need to package any type of DNA without regard to sequence.
However, there are areas of the genome where nucleosomes position themselves preferentially. These include regions with A-T rich sequences that are more accommodating of the ‘bend’ or ‘compression’ that occurs in the minor groove as the DNA wraps around the histone octamer.
4.3.3 30-nm Fibers and Higher-Order Chromatin
The packaging of DNA into nucleosomes shortens the fiber length about sevenfold, not enough to fit in the nucleus just yet. Therefore, chromatin is further coiled into an even shorter, thicker fiber, termed the “30-nanometer fiber,” because it is approximately 30 nanometers in diameter (Figure 4.4).
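The roughly sevenfold figure can be recovered with simple arithmetic. This sketch assumes that each ~200 bp repeat ends up occupying about 10 nm along the beads-on-a-string fiber (an approximation based on the 10 nm fiber diameter, not a measured value from this chapter); the other numbers come from the text.

```python
NM_PER_BP = 0.34                # length of one base pair (from the text)
REPEAT_BP = 200                 # DNA per nucleosome repeat (from the text)
FIBER_NM_PER_NUCLEOSOME = 10.0  # assumed fiber length taken up by one bead

# Length of one repeat if the DNA were fully stretched out.
extended_nm = REPEAT_BP * NM_PER_BP

# How many times shorter the DNA becomes once wrapped into nucleosomes.
packing_ratio = extended_nm / FIBER_NM_PER_NUCLEOSOME

# Apply that ratio to the total DNA of a diploid human cell (6 x 10^9 bp).
total_dna_m = 6e9 * NM_PER_BP * 1e-9
after_nucleosomes_m = total_dna_m / packing_ratio

print(f"extended repeat: {extended_nm:.0f} nm")
print(f"packing ratio:   {packing_ratio:.1f}-fold")
print(f"fiber length:    {after_nucleosomes_m:.2f} m")
```

Under these assumptions the ratio is about 6.8, and the 2 meters of DNA shrink only to about 0.3 meters of 10 nm fiber, still vastly longer than a 10 micron nucleus. Further levels of folding are clearly needed.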
Experimentally this structure can be observed after a high salt chromatin extraction. As shown in the illustration, increasing the salt concentration of an already extracted nucleosome preparation will cause the ‘necklace’ to fold into the 30nm solenoid structure.
In recent years, histone H1, also known as the ‘linker histone’, has been shown to play a role in establishing or stabilizing the 30 nm structure.
In experiments where H1 is present, the 30-nanometer fibers form readily. H1 binds DNA where the DNA joins and leaves the histone octamer and helps lock the DNA into place, acting as a clamp around the nucleosome.
Additionally, the tails of the core histones have been shown to be important for the formation of 30nm fibers.
In fact, there are at least five levels (orders) of chromatin structure (Fig. 4.5).
We discussed the first two (#1 and #2 in Figure 4.5) above, but other extraction protocols revealed further aspects of chromatin structure, shown in #3 and #4 in Figure 4.5. We know the most about the 10 nm fiber and have yet to fully elucidate how the chains of nucleosomes fold into the final condensed form.
Non-histone proteins such as condensins, cohesins, and scaffold proteins help to further coil and compact the DNA until it resembles the metaphase chromosome.
Remember that the final metaphase chromosome shape is established and made only in preparation for metaphase. The bulk of the time the cell is in interphase and chromosomes do not resemble those discrete units we are used to thinking of when we think of chromosomes!
Concepts in Context: Shape of the Genome
In this video, you learn how new tools have allowed scientists to probe chromatin organization by finding points of contact between different regions.
COMPLETE: Don’t forget to complete the assignments associated with Concepts in Context within CANVAS.
- The primary structure of chromatin is a 10 nm fiber, the “beads on a string” structure.
- The ‘bead’ is a nucleosome, which includes about 200 bp of DNA wrapped around a histone core consisting of 2 copies each of the core histones (H2A, H2B, H3, and H4).
- The DNA closest to the histone core, the core DNA, is 147 bp.
- The next level of chromatin is the 30 nm fiber, formed by interactions between neighboring nucleosomes.
- Histone H1 is associated with linker DNA and assists in the formation of 30 nm fibers.
- 30nm fibers are folded into higher-order structures (loops upon a proteinaceous scaffold, and then loops are further condensed to form metaphase chromosomes)
- Non-Histone proteins are used for higher-order structures.
4.4 Regulation of Chromatin Structure
STUDY TIP: Watch the Lecture Video first. In this part of the chapter, I will highlight just some key points of how chromatin structure is regulated from the lecture videos.
Link here: Dr. Mehta: Regulation of Chromatin Structure
To recap, we have just seen that chromosomal DNA associates with histones, forming an organized complex known as chromatin. This accomplishes the goal of allowing the DNA to fit into a much smaller volume, within a eukaryotic cell’s nucleus.
However, compaction of the DNA also limits its accessibility to the proteins (such as transcription factors, which we will learn about soon) that are needed to transcribe a gene [Gene Expression].
In order for gene expression to occur, the chromatin structure needs to move back and forth between condensed and decondensed forms.
Changes to chromatin structure can be brought about in 2 ways. They are not mutually exclusive, which adds layers of complexity to regulation!
- Via Histone modifications
- Via Chromatin remodeling complexes.
4.4.1 Histone Modifications
The amino acid residues of histone tails are subject to many post-translational modifications, and the list is ever-growing. In particular, histones contain a large percentage of the basic amino acids lysine and arginine, as well as serine and threonine residues.
The side chains of these residues are modified by the action of enzymes.
The most studied modifications include acetylation (adding acetyl groups) of lysines, phosphorylation (adding phosphate groups) of serines, and methylation (adding methyl groups) of lysines and arginines.
The addition of the groups is facilitated by enzymes (collectively known as epigenetic “writers”) whilst de-modifying enzymes (or “erasers”) remove these marks. (see table below)
|Enzyme|Role|Activity|
|Histone acetyltransferases (HATs)|Writer|Add acetyl groups to histones|
|Histone deacetylases (HDACs)|Eraser|Remove acetyl groups from histones|
|Histone methyltransferases|Writer|Add methyl groups to histones|
|Histone demethylases|Eraser|Remove methyl groups from histones|
These opposing activities enable a highly dynamic regulation of gene expression as modifications can be added or removed depending on whether a particular writer or eraser is recruited to a specific location of the genome.
Modifications can occur at different amino acids on different histones and can create more than 100 unique potential changes in the histones.
Modifications also create unique patterns, or signatures, that specific binding proteins recognize.
Proteins with BROMODOMAINS recognize acetylated lysine residues.
Proteins with CHROMODOMAINS recognize methylated residues.
To add to the complexity, these proteins and modifying enzymes are often part of larger multiprotein complexes and can recruit another ‘molecular machine’: the chromatin remodeling complexes. These complexes further reposition nucleosomes, as discussed below.
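The relationships described in this section, which enzyme adds a mark, which removes it, and which protein domain recognizes it, can be collected into a small lookup structure. This is only an organizational sketch of what the text states; the class and field names are illustrative, and real marks are far more numerous and site-specific.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoneMark:
    chemical_group: str  # group attached to the histone tail
    writer: str          # enzyme class that adds the mark
    eraser: str          # enzyme class that removes the mark
    reader_domain: str   # protein domain that recognizes the mark


# Only the two marks whose writers, erasers, and readers are named in the text.
MARKS = {
    "acetylation": HistoneMark(
        "acetyl",
        "histone acetyltransferase (HAT)",
        "histone deacetylase (HDAC)",
        "bromodomain",
    ),
    "methylation": HistoneMark(
        "methyl",
        "histone methyltransferase",
        "histone demethylase",
        "chromodomain",
    ),
}


def describe(mark_name: str) -> str:
    """Summarize who writes, erases, and reads a given mark."""
    mark = MARKS[mark_name]
    return (
        f"{mark_name}: added by {mark.writer}, removed by {mark.eraser}, "
        f"recognized by proteins with a {mark.reader_domain}"
    )


for name in MARKS:
    print(describe(name))
```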
4.4.2 Chromatin Remodeling Complexes
- As the name suggests, these are protein complexes meaning they contain multiple different proteins working together.
- They ‘remodel’ chromatin- alter chromatin structure by repositioning nucleosomes. The mechanisms include
- evicting nucleosome components (such as H2A–H2B dimers)
- ejecting full nucleosomes (creating nucleosome-free regions)
- replacing with variant histone subunits.
There are many families of chromatin remodeling complexes with a variety of names. While they are diverse in protein composition, they all have at least one enzymatic ATPase subunit, which allows them to use the energy released from ATP hydrolysis to reposition nucleosomes. (1)
Additionally, as mentioned above, they are often recruited to chromatin by ‘reading’ the ‘marks’ left by histone modifiers, or they contain proteins with histone-modifying and histone-recognition activities.
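The remodeling outcomes listed above (repositioning, dimer eviction, full ejection, and variant exchange) can be pictured with a toy data model. This is purely a bookkeeping sketch under simplifying assumptions: positions are hypothetical coordinates in base pairs, ATP use is not modeled, and H2A.Z is used only as one example of a histone variant. All names are illustrative.

```python
from dataclasses import dataclass, field

CANONICAL_OCTAMER = ("H2A", "H2A", "H2B", "H2B", "H3", "H3", "H4", "H4")


@dataclass
class Nucleosome:
    position_bp: int  # hypothetical start coordinate of the core DNA
    histones: list = field(default_factory=lambda: list(CANONICAL_OCTAMER))


def slide(nuc: Nucleosome, shift_bp: int) -> None:
    """Reposition a nucleosome along the DNA."""
    nuc.position_bp += shift_bp


def evict_dimer(nuc: Nucleosome) -> None:
    """Remove one H2A-H2B dimer, leaving six histones behind."""
    nuc.histones.remove("H2A")
    nuc.histones.remove("H2B")


def exchange_variant(nuc: Nucleosome, canonical: str, variant: str) -> None:
    """Replace one canonical histone with a variant subunit."""
    nuc.histones[nuc.histones.index(canonical)] = variant


def eject(array: list, index: int) -> list:
    """Remove a whole nucleosome, creating a nucleosome-free region."""
    return array[:index] + array[index + 1:]


array = [Nucleosome(0), Nucleosome(200), Nucleosome(400), Nucleosome(600)]
slide(array[0], 20)
evict_dimer(array[1])
exchange_variant(array[2], "H2A", "H2A.Z")
array = eject(array, 3)

for nuc in array:
    print(nuc.position_bp, nuc.histones)
```

Each function corresponds to one outcome from the list; in the cell, a single remodeling complex may combine several of them.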
4.5 Links to Medicine
Given the intimate role chromatin structure plays in regulating gene expression, it is not surprising that changes in acetylation signaling resulting from misregulated HATs or HDACs can cause abnormal gene expression patterns and have been identified in numerous cancers. (2)
Some examples are
- Genetic alterations in HATs in hematological and solid cancers (3)
- Hyper-active HDACs in many cancers (4)
- Components of several chromatin-remodeling complexes are highly mutated in cancer.
Not surprisingly there are a wide variety of small-molecule inhibitors targeting acetylation signaling pathways in development for use as anti-cancer drugs (see table below).
- Watch the Lecture videos that cover the material above. This will help to clarify or reinforce certain concepts if they were unclear.
- Complete any associated practice exercises
- Start work on Problem Set
References and Attributions
This chapter contains material taken from the following CC-licensed content and Public Domain content. Changes include rewording, removing paragraphs and replacing them with original material, and combining material.
Images in this chapter unless otherwise noted are from the textbook Basic Cell and Molecular Biology: What We Know & How We Found Out – 4e by Gerald Bergtrom licensed CC-BY. Available here: https://open.umn.edu/opentextbooks/textbooks/cell-and-molecular-biology-2e-what-we-know-how-we-found-out.
Attributions provided within the original text for the diagrams are included below
Fig. 4.2: Low salt fractionation of interphase nuclei yields 10nm nucleosome beads on a string.
- Upper; From Bergtrom et al., (1977) J. Ultrastr. Res. 60:395-406: Research by G. Bergtrom;
- Lower left; CC-BY-SA 3.0; Adapted from: https://commons.wikimedia.org/wiki/File:Chromatin_nucleofilaments_%28detail%29.png
Fig. 4.3: High salt chromatin extraction from nuclei or high salt treatment of 10 nm filaments yields 30 nm solenoid structures, essentially coils of 10 nm filaments.
- Electron micrograph of nucleus, From Bergtrom et al., (1977) J. Ultrastr. Res. 60:395-406: Research by G. Bergtrom
- CC-BY-SA 3.0; Adapted from: https://commons.wikimedia.org/wiki/File:Chromatin_nucleofilaments_%28detail%29.png
- CC BY-SA 4.0; Alt: Adapted from Richard Wheeler https://en.wikipedia.org/w/index.php?curid=53563761
Fig. 4.5: Five different levels (orders) of chromatin structure. CC-BY-SA 3.0; Adapted From https://en.wikipedia.org/wiki/Chromatin.
(1) Clapier, C., Iwasa, J., Cairns, B. et al. Mechanisms of action and regulation of ATP-dependent chromatin-remodeling complexes. Nat Rev Mol Cell Biol 18, 407–422 (2017). https://doi.org/10.1038/nrm.2017.26
(2) Gong F, Chiu LY, Miller KM (2016) Acetylation Reader Proteins: Linking Acetylation Signaling to Genome Maintenance and Cancer. PLOS Genetics 12(9): e1006272. https://doi.org/10.1371/journal.pgen.1006272
(3) Di Cerbo V, Schneider R. Cancers with wrong HATs: the impact of acetylation. Briefings in functional genomics. 2013;12(3):231–43. pmid:23325510
(4) Gui CY, Ngo L, Xu WS, Richon VM, Marks PA. Histone deacetylase (HDAC) inhibitor activation of p21WAF1 involves changes in promoter-associated proteins, including HDAC1. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(5):1241–6. pmid:14734806