Molecular Biology: From DNA to RNA to Protein
7 Regulation of Gene Expression – Prokaryotes
Each somatic cell in the body generally contains the same DNA. A few exceptions include red blood cells, which contain no DNA in their mature state, and some immune system cells that rearrange their DNA while producing antibodies. In general, however, the genes that determine whether you have green eyes, brown hair, and how fast you metabolize food are the same in the cells in your eyes and your liver, even though these organs function quite differently. If each cell has the same DNA, how is it that cells or organs are different? Why do cells in the eye differ so dramatically from cells in the liver?
Similarly, all cells in two pure bacterial cultures inoculated from the same starting colony contain the same DNA, with the exception of changes that arise from spontaneous mutations. How is it that the same bacterial cells within two pure cultures exposed to different environmental conditions can exhibit different phenotypes?
Gene regulation is how a cell controls which genes, out of the many genes in its genome, are “turned on” (expressed).
Before you begin, make sure you look at the learning objectives.
Level 1 and 2 (Knowledge and Comprehension)
Draw a picture illustrating the general structure of an operon, and identify its parts.
Know the difference between positive and negative control. What is the difference between inducible and repressible operons?
Briefly describe the lac operon and how it controls the metabolism of lactose.
What is catabolite repression? How does it allow a bacterial cell to use glucose in preference to other sugars?
Level Up (Application, Analysis, Synthesis)
1. Predict for the following types of transcriptional control whether the protein produced by the regulator gene will be synthesized initially as an active repressor or as an inactive repressor.
Negative control in a repressible operon
Negative control in an inducible operon
2. Predict for the following types of transcriptional control whether the protein produced by the regulator gene will be synthesized initially as an active form or inactive form.
Positive control in a repressible operon
Positive control in an inducible operon
NOTE: The mechanism of prokaryotic regulation was deciphered with the use of bacterial mutants. (See Link to learning).
3. Predict the effect of mutations in the following elements on the transcription of an operon
A mutation at the operator prevents the regulator protein from binding; the regulator protein is a repressor AND the operon is a repressible operon.
A mutation at the operator prevents the regulator protein from binding; the regulator protein is a repressor AND the operon is an inducible operon.
4. Use genetic data (phenotypes of mutant strains) for a fictitious operon to determine
Type of operon (inducible, repressible)
Which sequences are the promoter sequences.
Which sequences correspond to regulatory gene.
Which sequences are the structural genes.
5. Identify the level of transcription of a lac operon under given cellular conditions
6. Predict the effect of mutations within the following elements on the transcription of the Lac operon under different conditions.
CAP (such that it can no longer bind the CAP site)
Lac-I gene (repressor protein lac-I non-functional)
Promoter of lac-operon
7.2 Overview of Regulation of Gene Expression
Define the term regulation as it applies to genes
For a cell to function properly, necessary proteins must be synthesized at the proper time.
In a given cell type, not all genes encoded in the DNA are transcribed into RNA or translated into protein because specific cells in our body have specific functions. Specialized proteins that make up the eye (iris, lens, and cornea) are only expressed in the eye, whereas the specialized proteins in the heart (pacemaker cells, heart muscle, and valves) are only expressed in the heart.
The process of turning on a gene to produce RNA and protein is called gene expression. Whether in a simple unicellular organism or a complex multicellular organism, each cell controls when and how its genes are expressed.
The expression of a gene is a highly regulated process. In multicellular organisms, it allows for cellular differentiation; in single-celled organisms like prokaryotes, it primarily ensures that a cell’s resources are not wasted making proteins that the cell does not need at that time.
Elucidating the mechanisms controlling gene expression is important to the understanding of human health. Malfunctions in this process in humans lead to the development of cancer and other diseases.
Understanding the interaction between the gene expression of a pathogen and that of its human host is important for the understanding of a particular infectious disease.
Prokaryotic versus Eukaryotic Gene Expression
In prokaryotic cells, the control of gene expression is mostly at the transcriptional level. Recall that prokaryotic organisms are single-celled organisms that lack a cell nucleus, and their DNA, therefore, floats freely in the cell cytoplasm. To synthesize a protein, the processes of transcription and translation occur almost simultaneously. When the resulting protein is no longer needed, transcription stops. As a result, the primary method to control what type of protein and how much of each protein is expressed in a prokaryotic cell is the regulation of DNA transcription. All of the subsequent steps occur automatically. When more protein is required, more transcription occurs.
Recall from Chapter 6 that in eukaryotic cells, the DNA is contained inside the cell’s nucleus, where it is transcribed into RNA. The newly synthesized RNA is then transported out of the nucleus into the cytoplasm, where ribosomes translate the RNA into protein. The processes of transcription and translation are thus physically separated by the nuclear membrane. As a result, regulation of gene expression in eukaryotes can occur at multiple stages:
When the DNA is uncoiled and loosened from nucleosomes to bind transcription factors (epigenetic level)
When the RNA is transcribed (transcriptional level)
When the RNA is processed and exported to the cytoplasm after it is transcribed (post-transcriptional level)
When the RNA is translated into protein (translational level), or
After the protein has been made (post-translational level).
Figure 7.1. Prokaryotic transcription and translation occur simultaneously in the cytoplasm, and regulation occurs at the transcriptional level. Eukaryotic gene expression is regulated during transcription and RNA processing, which take place in the nucleus, and during protein translation, which takes place in the cytoplasm. Further regulation may occur through post-translational modifications of proteins.
The differences in the regulation of gene expression between prokaryotes and eukaryotes are summarized in the table below.
Differences in the Regulation of Gene Expression of Prokaryotic and Eukaryotic Organisms
Prokaryotic organisms | Eukaryotic organisms
Lack a membrane-bound nucleus; DNA is found in the cytoplasm | DNA is confined to the nuclear compartment
RNA transcription and protein formation occur almost simultaneously | RNA transcription occurs prior to protein formation, and it takes place in the nucleus. Translation of RNA to protein occurs in the cytoplasm.
Gene expression is regulated primarily at the transcriptional level | Gene expression is regulated at many levels (epigenetic, transcriptional, nuclear shuttling, post-transcriptional, translational, and post-translational)
In this chapter, we consider systems of gene regulation in bacteria. Before we do, let’s look at examples of why understanding the regulation of gene expression in bacteria is relevant.
Clinical and Biological Relevance
It is becoming increasingly clear that bacterial cells live in communities, interacting with other cells of their own species and of different species. They also exhibit community behaviors, such as the coordinated expression of genes within cells. One example of such community behavior is quorum sensing.
Here, bacteria ‘count’ the presence of others and, when an appropriate cell density (a quorum) is reached, turn on the expression of genes together. This behavior is medically important. For example, some microbial species, such as Staphylococcus aureus, can encase their community within a self-produced matrix of hydrated extracellular polymeric substances that include polysaccharides, proteins, nucleic acids, and lipid molecules. These encasements are known as biofilms. Organisms within the biofilm are more than 1000-fold more resistant to antibiotics and more able to evade the host immune response than are free-living bacteria. Patients implanted with prosthetic devices, including simple bladder catheters, are especially at risk for biofilm formation.
Bacterial genes with related functions—such as the genes that encode the enzymes that catalyze the many steps in a single biochemical pathway— are regulated together and found next to each other on the DNA. This cluster of genes shares ONE promoter and a regulatory sequence (explained below) that controls the transcription of the unit.
The organization of genes in this manner is called an OPERON.
Transcription of the OPERON forms a polycistronic mRNA (Figure 7.2) – one mRNA that contains the information to make more than one protein. The promoter has simultaneous control over the regulation of the transcription of these structural genes because they will either all be needed at the same time, or none will be needed.
Grouping related genes under a common control mechanism allows bacteria to rapidly adapt to changes in the environment.
The organization of an operon is illustrated below in Figure 7.2.
The genes that encode proteins used in metabolism or biosynthesis, or that play a structural role in the cell, are called STRUCTURAL GENES.
Each operon includes DNA sequences that influence its own transcription; these are located in a region called the regulatory region.
The regulatory region includes the promoter and the region surrounding the promoter, to which transcription factors, proteins encoded by regulatory genes, can bind. Transcription factors influence the binding of RNA polymerase to the promoter and allow its progression to transcribe structural genes. Collectively these sequences are often called regulatory elements.
A repressor is a transcription factor that suppresses the transcription of a gene in response to an external stimulus by binding to a DNA sequence within the regulatory region called the operator, which is located between the RNA polymerase binding site of the promoter and the transcriptional start site of the first structural gene. Repressor binding physically blocks RNA polymerase from transcribing structural genes.
Conversely, an activator is a transcription factor that increases the transcription of a gene in response to an external stimulus by facilitating RNA polymerase binding to the promoter.
An inducer, the third type of regulatory molecule, is a small molecule that either activates or represses transcription by interacting with a repressor or an activator.
Other genes in prokaryotic cells are needed all the time. These gene products will be constitutively expressed or turned on continually. Most constitutively expressed genes are “housekeeping” genes responsible for the overall maintenance of a cell.
The genes that code for these regulatory proteins (activators and repressors) are called regulatory genes. Each regulatory gene is transcribed from its own promoter, but it is often found next to the operon it regulates on the same DNA.
How does one transcript code for multiple proteins?
Just like DNA replication and transcription have different start and stop signals, translation also has its own start and stop signals.
DNA replication starts at origins (this is on DNA), transcription starts at promoters (also on DNA) and translation begins on mRNA. The coding information for protein is buried within the mRNA and does not start at the transcriptional start site.
Just as DNA has extra sequences, like the promoter, that enable RNA polymerase to bind and that signal where mRNA transcription is to begin, the mRNA has at its 5′ end a leader sequence (the untranslated region on the 5′ end) that carries a ribosome binding site.
Translation of the protein begins at the translational START codon (we will revisit this when we learn about translation in upcoming modules) and ends at the translational STOP codon. The region between the two is the open reading frame.
Thus a polycistronic transcript carries many such open reading frames, each beginning with a translation initiation codon and consisting of a linear sequence of codons that specifies the protein.
A more accurate representation of an operon is shown below. Note that there is only ONE transcription start site (one promoter, one long message), but the message contains multiple start and stop codons; each region between a start codon and its stop codon is the code for one protein (polycistronic).
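To make the idea of several open reading frames on one message concrete, here is a minimal Python sketch. The mRNA sequence, the variable names, and the helper functions are all invented for illustration; they are not taken from a real operon. The sketch also simplifies translation initiation: it treats every AUG it meets as a start codon, whereas a real ribosome relies on the ribosome binding site in front of each open reading frame.

```python
# Hypothetical polycistronic mRNA built from invented pieces (illustration only).
LEADER = "GGAGGUAACC"        # 5' untranslated leader (assumed sequence)
ORF1 = "AUGGCUAAAUAA"        # start codon, two codons, stop codon
SPACER = "CCGGAGGUCC"        # untranslated region between the two ORFs
ORF2 = "AUGUUUGGCCCAUGA"     # start codon, three codons, stop codon
MRNA = LEADER + ORF1 + SPACER + ORF2

STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_stop(mrna, start):
    """Read codon by codon from a start codon; return the index just past the stop codon."""
    for j in range(start + 3, len(mrna) - 2, 3):
        if mrna[j:j + 3] in STOP_CODONS:
            return j + 3
    return None  # no in-frame stop codon was found


def find_orfs(mrna):
    """Scan 5' to 3' and return (start, end) index pairs, one per open reading frame."""
    orfs = []
    i = 0
    while i + 3 <= len(mrna):
        if mrna[i:i + 3] == "AUG":
            end = find_stop(mrna, i)
            if end is not None:
                orfs.append((i, end))
                i = end  # continue scanning after this ORF
                continue
        i += 1
    return orfs


for start, end in find_orfs(MRNA):
    print(start, end, MRNA[start:end])
```

One transcript, made from one promoter, yields two separate reading frames here, each with its own start and stop codon.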
Check your understanding
Answer the questions below to check your understanding and decide whether to (1) study the previous section again or (2) move on to the next section.
Another term for DNA sequences that regulate transcription is cis-elements, because they must be located on the same piece of DNA as the genes they regulate.
Binding sites for proteins involved in transcriptional regulation of the operon (promoters, operators, and activator binding sites) are called cis-elements.
On the other hand, the proteins that bind to these cis-elements are called trans-regulators because (as diffusible molecules) they do not necessarily need to be encoded on the same piece of DNA as the genes they regulate.
Cis and trans-acting elements are concepts that will be relevant when predicting the effect of mutations. See Link to Learning.
Recall that regulation of gene expression or operons occurs in response to some environmental signal. Therefore, operons can be:
Inducible: Where transcription is normally off (not taking place); something must happen to induce transcription or turn it on. Usually, this is in response to a metabolite (a small molecule undergoing metabolism) that regulates the operon.
Repressible: Where transcription is normally on (mRNA made, proteins made); something must happen to repress transcription, or turn it off.
The type of control (how these operons can be induced or repressed) is defined by the mechanism it uses.
For operons under negative control: The regulatory protein used is a repressor. Genes are expressed unless they are turned OFF by a repressor that binds to DNA and inhibits transcription. Thus the operon will be turned OFF when the repressor is present, but ON when the repressor is absent or somehow inactivated.
For operons under positive control: The genes are expressed only when an active regulator protein, e.g. an activator, is present. Thus the operon will be turned off when the positive regulatory protein is absent or inactivated.
Let’s take the basic parts and see if we can build the logic of gene regulatory circuitry.
We have a NEGATIVE INDUCIBLE OPERON.
Inducible = the operon is normally OFF (inhibited) = no genes are expressed.
Negative Control = Repressor protein is used to control the expression of genes within this operon.
How it would work:
The operon is normally off because the repressor protein is bound to the operator, preventing RNA polymerase from binding.
A small molecule called an inducer accumulates and binds to the repressor protein. The binding of the inducer alters the shape of the repressor, preventing it from binding to DNA and thus turning transcription ON (inducing it)! The repressor is inactivated.
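The logic of the negative inducible operon described above can be written as a tiny Python sketch. The function name and its parameters are hypothetical, chosen only for this illustration, and the model reduces each component to present/absent or functional/non-functional.

```python
def negative_inducible_operon_on(inducer_present, repressor_functional=True):
    """Return True if the structural genes are transcribed (simplified yes/no model)."""
    # The repressor sits on the operator only if it is functional
    # and has NOT been inactivated by the inducer.
    repressor_on_operator = repressor_functional and not inducer_present
    # Negative control: transcription proceeds whenever the operator is free.
    return not repressor_on_operator


print(negative_inducible_operon_on(inducer_present=False))  # operon OFF
print(negative_inducible_operon_on(inducer_present=True))   # operon ON
```

Setting repressor_functional=False in this sketch mimics a mutant whose repressor cannot work; the operator is then always free, regardless of the inducer.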
Concepts in Context
Bonnie Bassler discovered that bacteria “talk” to each other, using a chemical language that lets them coordinate defense and mount attacks. The find has stunning implications for medicine, industry — and our understanding of ourselves.
COMPLETE: Don’t forget to complete the associated assignment on CANVAS
Check your understanding
Use this quiz to check your understanding and decide whether to (1) study the previous section further or (2) move on to the next section.
Figure out the logic of control for Positive Inducible, Negative Repressible, and Positive Repressible operons.
7.4 The lac Operon: An Example of Regulation of Bacterial Gene Expression
One of the best-understood examples of gene regulation is the lac operon of Escherichia coli, which is often used as a model system in genetics and has real, practical applications in molecular biology.
The lac operon contains three enzyme-coding structural genes and regulatory elements. The enzymes work together to allow E. coli to digest the disaccharide lactose, and the regulatory elements control the transcription of these enzymes.
The preferred carbon and energy source for E. coli is glucose, but E. coli will metabolize lactose if no glucose is present in the growth medium. (Figure 7.3)
The structural genes of the operon, lacZ, lacY, and lacA, are all transcribed together into a single polycistronic mRNA strand.
Let’s take a closer look at the structure of the lac operon and the function of the Y, Z, and A proteins (See Figure 7.4 A).
The lacZ gene encodes β-galactosidase, the enzyme that breaks lactose (a disaccharide) into galactose and glucose. (Figure 7.3)
The lacY gene encodes lactose permease, a membrane protein that facilitates lactose entry into the cells.
The lacA gene encodes a transacetylase whose role in lactose energy metabolism is not well understood.
The I gene (lacI), to the left of the lacZ gene, is the coding sequence for the repressor protein.
The operator (O) sequence separating the I and Z genes is a transcription regulatory DNA sequence.
The promoter (P) sequence is where RNA polymerase must bind to begin mRNA transcription.
The beauty of the operon system lies in the fact that it ensures that the structural genes only get transcribed under specific environmental conditions.
In the case of the lac operon, two conditions must be met in order for the operon to be expressed:
1) Glucose must be absent AND 2) lactose must be present.
Thus the bacteria must have a lactose sensor and glucose sensor to determine when conditions are appropriate to begin transcribing the lac operon.
We will begin with how lactose is sensed and used first.
7.4.1 Negative Regulation of the Lac Operon by Lactose
The regulatory protein known as the lactose operon repressor or LacI is the built-in lactose sensor!
LacI is always made and present in E. coli cells!
In the absence of lactose in the growth medium, the repressor protein binds tightly to the operator DNA.
Since the operator partially overlaps with the promoter, the presence of LacI blocks RNA polymerase from accessing the promoter and hence blocks transcription. Under these conditions, little or no transcript is made. (Figure 7.4 B)
Let’s look more closely at how the repressor prevents RNA polymerase from binding to the promoter. When RNA polymerase binds to the promoter, it physically contacts a stretch of DNA that extends upstream to roughly position −40 relative to the start site of transcription (recall that the sigma factor contacts the −35 and −10 sequences) and downstream to roughly position +20.
Meanwhile, the stretch of DNA contacted by the repressor, the operator, overlaps with the downstream region of the promoter, covering the transcription start site and extending past the end of the promoter (Figure 7.5). Thus, when the repressor binds to the operator, it physically occludes RNA polymerase.
LacI is therefore a classic example of negative regulation in which the binding of a regulatory protein to an operon decreases transcription.
How does the lac operon escape repression to turn on the synthesis of β-galactosidase when lactose is present in the growth medium instead of glucose?
Here lactose itself, through its derivative allolactose, serves as the inducer! If cells are grown in the presence of lactose, some of the lactose entering the cells is converted to allolactose. (This conversion is carried out by β-galactosidase!)
Allolactose binds to the repressor sitting on the operator DNA and changes its shape so that it can no longer bind to the operator region. RNA polymerase can then transcribe the lac operon genes, as illustrated in Figure 7.4 C and the image below.
7.4.2 Positive Regulation of the Lac Operon; Induction by Catabolite Activation
Bacteria typically have the ability to use a variety of substrates as carbon sources. However, because glucose is usually preferable to other substrates, bacteria have mechanisms to ensure that alternative substrates are only used when glucose has been depleted.
Recall that 2 conditions must be met in order for the lac-operon to be expressed.
1) Glucose must be absent AND 2) lactose must be present.
How does the bacterial cell sense the availability of glucose?
Here, the lac operon uses positive regulation with the help of an activator called CAP (catabolite activator protein, also known as cAMP receptor protein), which increases transcription when it is bound to cAMP.
CAP binds to a site (CAP Binding Site) just upstream of the promoter such that both CAP and RNA polymerase can sit side-by-side on the DNA. This is in contrast to the repressor, whose binding site overlaps with the binding site for RNA polymerase.
If the inducer is present, then, as we have seen, the LacI repressor is not bound to the operator and hence RNA polymerase should be able to bind to the promoter and initiate transcription.
Why does RNA polymerase require the assistance of CAP to bind to the promoter in the presence of an inducer?
The answer is that the lac promoter is a poor match to the −35 and −10 consensus sequences. As you will recall, the ideal −35 and −10 sequences are 5’-TTGACA-3’ and 5’-TATAAT-3’, respectively. The promoter for the lac operon differs from these ideal sequences at three positions.
Hence, the lac promoter is an intrinsically weak promoter to which RNA polymerase only weakly binds.
This is the basis for positive control; an activator compensates for the promoter’s poor match to the consensus sequence by helping to facilitate the binding of RNA polymerase.
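The count of three mismatches can be checked with a short Python sketch. The consensus hexamers are the ones quoted above. The lac promoter hexamers used here (TTTACA and TATGTT) are an assumption taken from general references on the wild-type lac promoter, not from this chapter, and the names in the code are illustrative.

```python
# Consensus sequences quoted in the text.
CONSENSUS = {"-35": "TTGACA", "-10": "TATAAT"}

# ASSUMPTION: lac promoter hexamers as commonly reported elsewhere,
# not given in this chapter.
LAC_PROMOTER = {"-35": "TTTACA", "-10": "TATGTT"}


def mismatches(seq_a, seq_b):
    """Count positions at which two equal-length sequences differ."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


per_element = {
    name: mismatches(CONSENSUS[name], LAC_PROMOTER[name]) for name in CONSENSUS
}
total = sum(per_element.values())
print(per_element, total)
```

Under these assumed sequences, one mismatch falls in the −35 element and two in the −10 element, which agrees with the three differences stated in the text.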
How does CAP facilitate the binding of RNA polymerase? It does so by directly contacting the RNA polymerase, and the favorable free energy from this protein-protein interaction helps to stabilize the binding of RNA polymerase to the otherwise weak promoter. Situations such as these in which an activator stabilizes the binding of RNA polymerase to DNA are often referred to as recruiting RNA polymerase.
Just as the affinity of the LacI repressor for DNA is governed by a small molecule, the inducer allolactose, the ability of CAP to adhere to its binding site is strongly influenced by a small molecule, 3’,5’-cyclic adenosine monophosphate (cyclic AMP, or cAMP).
When glucose is available, cellular levels of cAMP are low in the cells and CAP is in an inactive conformation. When glucose is scarce, the accumulating cAMP binds to the catabolite activator protein (CAP). The complex binds to the promoter region of the lac operon. The binding of the CAP-cAMP complex to this site increases the binding ability of RNA polymerase to the promoter region to initiate the transcription of the structural genes.
Thus, in the case of the lac operon, for transcription to occur, lactose must be present (removing the lac repressor protein) and glucose levels must be depleted (allowing the binding of an activating protein). The result is the synthesis of higher levels of lac enzymes that facilitate efficient cellular use of lactose as an alternative to glucose as an energy source.
Maximal activation of the lac operon in high lactose and low glucose is shown below.
When glucose levels are high, there is catabolite repression of operons encoding enzymes for the metabolism of alternative substrates. Because of low cAMP levels under these conditions, there is an insufficient amount of the CAP-cAMP complex to activate the transcription of these operons.
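The combined effect of the two sensors can be summarized in a small Python sketch. The function name and the three output labels ("off", "low", "high") are choices made for this illustration; real transcription levels are continuous, and the "low" label for the glucose-plus-lactose case is the usual textbook simplification of a free but weak promoter that lacks CAP-cAMP help.

```python
def lac_operon_transcription(glucose_present, lactose_present):
    """Return a coarse transcription level for the lac operon (simplified model)."""
    # Lactose sensor: without lactose there is no allolactose,
    # so LacI stays bound to the operator.
    repressor_bound = not lactose_present
    # Glucose sensor: when glucose is scarce, cAMP accumulates
    # and the CAP-cAMP complex binds near the promoter.
    cap_camp_bound = not glucose_present

    if repressor_bound:
        return "off"   # RNA polymerase is blocked; little or no transcript
    if cap_camp_bound:
        return "high"  # operator free AND polymerase recruited by CAP-cAMP
    return "low"       # operator free, but the weak promoter gets no help


for glucose in (True, False):
    for lactose in (True, False):
        print(glucose, lactose, lac_operon_transcription(glucose, lactose))
```

Only the combination of no glucose and lactose present gives the "high" output, matching the two conditions stated above.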
Let’s look at some of the classic experiments that led to our understanding of E. coli gene regulation in general, and of the lac operon in particular.
LINK TO LEARNING: How we know
LO: Predict the effect of mutations in the following elements on the transcription of an operon.
In molecular biology, one of the most common methods for figuring out a gene’s function is to mutate it and measure the resulting effects on its organism’s phenotype.
François Jacob and Jacques Monod first described the “operon model” for the genetic control of lactose metabolism in E. coli in 1961. Jacob and Monod deduced the structure of the operon genetically by analyzing the interactions of mutations that interfered with the normal regulation of lactose metabolism.
They knew that wild-type E. coli would not make the β-galactosidase, β-galactoside permease, or β-galactoside transacetylase proteins when grown on glucose.
Of course, they also knew that the cells would switch to lactose for growth and reproduction if they were deprived of glucose! They then searched for and isolated different E. coli mutants that could not grow on lactose, even when there was no glucose in the growth medium.
Jacob and Monod deduced the structure and various regulatory elements using genetics.
You already know how the lac operon is regulated, and therefore should be able to predict the effect of mutations on various components of the lac operon.
Crucial to the experiments described in the video was the creation of partial diploid strains of E. coli, in which two copies of the operon were present: one on the chromosome and one on a plasmid.
These partially diploid prokaryotes are called merodiploids (“mero-” comes from the Greek word for “part”, or “partial”). Merodiploids can be produced in a lab setting.
Merodiploids of E. coli are a fantastic research tool. They allow us to examine how wild-type and mutated alleles interact within a living organism. Genetic tests using such diploids distinguished between mutations in the genes coding for trans-acting elements and mutations within the regulatory sequences.
UN-INDUCIBLE mutants: Mutations in the regulatory circuit that abolish expression of the operon
CONSTITUTIVE mutants: A mutant in which a protein is produced at a constant level, as if continuously induced; a bacterial regulatory mutant in which an operon is transcribed in the absence of inducer; a mutant in which a regulated enzyme is in a continuously active form.
For example, mutations in the lacI gene (which codes for the repressor) and in the operator sequence can both produce constitutive mutants.
The partial diploid strains would be useful in distinguishing between the options.
Cis-acting mutations: Only affect those genes on the contiguous stretch of DNA. Mutations in promoter sequences, regulatory sequences (operator) were identified as cis-acting mutations.
Cis-dominant: A site or mutation that affects the properties of its own molecule of DNA, often indicating that it does not encode a diffusible product.
Trans-acting mutations: Repressors and activators are trans-acting; that is, they affect the expression of their regulated genes no matter on which DNA molecule in the cell these are located.
For In-Class Activity/Practice Problems
You will be looking at a variety of mutations that can occur in lac operon genes and discussing the effects of those mutations on E. coli. To do this, we’ll be using the following symbols to represent the individual components of the lac operon:
I P O Z Y A
Since the function of lac A is not well defined, we’ll be leaving it out of this model more often than not.
When all the sequences are wild type, the lac operon functions normally. We’ll represent this using the following notation:
I+ P+ O+ Z+ Y+ A+
If a given gene is mutated, we’ll change the superscript above that gene. Listed below are the specific mutations
Null mutation: Denoted by X– (where X can be any genetic element on the operon). DNA sequences with this mutation have completely lost their normal activity. In protein-coding genes, this means no functional protein is produced. In regulatory sequences, this means that the binding sites are non-functional.
Constitutive activity: Denoted by Oc, this mutation is specific to the operator region. Operator-constitutive mutations prevent the repressor protein from binding to the operator region. This results in transcription of the operon whether or not lactose is present, because the repressor is unable to block RNA polymerase from binding to the promoter.
When Merodiploids are used the following notation is used:
I+ P+ O+ Z+ Y+ A+ /I+ P+ O+ Z+ Y+ A+
In this notation, we show a chromosomal lac operon and the plasmid lac operon side by side. Again, we’ve included the lacA gene here for completeness but will be leaving it out of our exercises.
Because merodiploids have two copies of a given set of genes, mutations affect them differently. For example, if a single copy of a protein-coding gene is inactivated, the second copy may still continue to produce viable protein, effectively masking the mutation. Here is where “trans-acting” and “cis-acting” become relevant!
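The difference between cis-acting and trans-acting mutations can be explored with a minimal Python sketch of a merodiploid. Everything in it is a simplifying assumption made for illustration: each operon copy is a dictionary with the keys I, P, O, and Z; allowed values are "+", "-", and "c" (for the operator only); glucose is assumed to be absent; lacY and lacA are left out; and expression is treated as simply yes or no.

```python
def beta_gal_made(copies, lactose_present):
    """Predict whether functional beta-galactosidase is made (simplified yes/no model).

    copies: list of dicts, one per operon copy, with keys "I", "P", "O", "Z".
    """
    # TRANS: repressor protein made from any I+ copy diffuses through the cell
    # and can act on the operator of every copy.
    functional_repressor = any(copy["I"] == "+" for copy in copies)
    repressor_active = functional_repressor and not lactose_present

    for copy in copies:
        # CIS: promoter and operator only affect the genes on their own DNA.
        promoter_ok = copy["P"] == "+"
        operator_binds_repressor = copy["O"] == "+"  # "c" or "-" cannot bind it
        blocked = repressor_active and operator_binds_repressor
        if promoter_ok and not blocked and copy["Z"] == "+":
            return True
    return False


# I- P+ O+ Z+ / I+ P+ O+ Z-  (repressor from the second copy acts in trans)
strain_a = [
    {"I": "-", "P": "+", "O": "+", "Z": "+"},
    {"I": "+", "P": "+", "O": "+", "Z": "-"},
]
# I+ P+ Oc Z+ / I+ P+ O+ Z-  (Oc acts only on its own copy: cis-dominant)
strain_b = [
    {"I": "+", "P": "+", "O": "c", "Z": "+"},
    {"I": "+", "P": "+", "O": "+", "Z": "-"},
]

print(beta_gal_made(strain_a, lactose_present=False))
print(beta_gal_made(strain_b, lactose_present=False))
```

In this model the I– mutation in strain_a is masked by the wild-type I+ copy, while the Oc mutation in strain_b is not masked by the wild-type operator on the other copy.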
For extra practice:
Try this simulation (System requirements: this requires Java but should operate on most computers)
Watch the Lecture videos that cover the material above. This will help to clarify or reinforce certain concepts if they were unclear.
Begin work on Problem Set/Practice Problems
References and Attributions
This chapter contains material taken from the following CC-licensed content. Changes include rewording, removing paragraphs and replacing with original material, and combining material from the sources.
1. Bergtrom, Gerald, “Cell and Molecular Biology 4e: What We Know and How We Found Out” (2020). Cell and Molecular Biology 4e: What We Know and How We Found Out – All Versions. 13.