Central Dogma Review
Following images from http://www.emc.maricopa.edu/faculty/farabee/BIOBK/BioBookPROTSYn.html
Prokaryotic RNA polymerase scans DNA for a promoter sequence that consists of a specific set of about 13 nucleotides.
How often would we expect a promoter sequence to occur by random chance?
(Note: Prokaryotic genomes are only a few million nucleotides in length)
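One rough way to answer the question is to treat the genome as a random sequence in which each of the four nucleotides is equally likely at every position. That is a simplifying assumption, and the genome length of 4,000,000 used below is only an illustrative stand-in for "a few million". Under those assumptions, a minimal sketch of the estimate is:

```python
# Assumed values for illustration only.
PROMOTER_LENGTH = 13          # "about 13" nucleotides, from the text
GENOME_LENGTH = 4_000_000     # stand-in for "a few million" nucleotides


def expected_random_matches(motif_length, genome_length):
    """Expected number of exact matches to one fixed motif on one strand,
    assuming independent positions and equal base frequencies (1/4 each)."""
    p_match = 0.25 ** motif_length
    return p_match * genome_length


expected = expected_random_matches(PROMOTER_LENGTH, GENOME_LENGTH)
print(f"1 chance in {4 ** PROMOTER_LENGTH:,} per position")
print(f"expected matches per strand: {expected:.3f}")
```

With these numbers the expected count is well below one (roughly 0.06), so a specific 13-nucleotide promoter is unlikely to appear in such a genome purely by chance.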
Amino Acid Encodings
Glycine (G, GLY)
Prokaryote Coding Example
DNA Sequence:        TAC CAC GTG GAC TGA GGA CTC CTC ACT
                     ATG GTG CAC CTG ACT CCT GAG GAG TGA
mRNA Sequence:       AUG GUG CAC CUG ACU CCU GAG GAG UGA
Amino Acid Sequence: Met Val His Leu Thr Pro Glu Glu Stop
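The worked example above can be reproduced with a short script. This is only a sketch: the function names transcribe and translate are made up for illustration, and it assumes the first DNA line is the template strand written so that each base lines up with the mRNA base it pairs with.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter names, matching the abbreviations used in the example.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Stop",
}


def transcribe(template_dna):
    """Pair each template base with its RNA complement (A-U, T-A, C-G, G-C)."""
    pairing = str.maketrans("ACGT", "UGCA")
    return template_dna.replace(" ", "").upper().translate(pairing)


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    names = []
    for i in range(0, len(mrna) - 2, 3):
        name = THREE_LETTER[CODON_TABLE[mrna[i:i + 3]]]
        names.append(name)
        if name == "Stop":
            break
    return names


mrna = transcribe("TAC CAC GTG GAC TGA GGA CTC CTC ACT")
print(mrna)
print(" ".join(translate(mrna)))
```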
You try:
DNA Sequence:        CAG AAT GAG TTG ATG GGG CTA ATC
                     GTC
mRNA Sequence:       GUC
Amino Acid Sequence: Val
Prokaryotic Genomes
Density of genes is very high - 85% to 88% of nucleotides are gene coding regions
Example: E. coli:
Prokaryotic Gene Structure
Promoter Elements - Prokaryotic RNA polymerases are assemblies of several different proteins:
β' (beta-prime) - binds to DNA templates
β (beta) - links one nucleotide to another
α (alpha) - holds all subunits together
σ (sigma) - recognizes the specific nucleotide sequences of promoters
Proteins β', β, and α are evolutionarily well conserved across bacterial species
σ is less well conserved, with several significantly different variants in a cell
The ability to make RNA polymerases from different σ factors allows the cell to turn on or off the expression of whole sets of genes, e.g., in E. coli:
σ factor | Gene family | -35 sequence | -10 sequence |
σ70 | General | TTGACA | TATAAT |
σ32 (σH) | Heat Shock | CTTGAAA | CCCATNTA |
σ54 (σN) | Nitrogen stress | CTGGCAC | TTGCA |
σ28 (σF) | Flagella synthesis | CTAAA | GCCGATAA |
σ38 (σS) | Stationary phase genes | CGTCAA | n.a. |
σ20 (σFecI) | Iron-dicitrate transport | n.a. | n.a. |
σ24 (σE) | Extracytoplasmic proteins | n.a. | n.a. |
N is any nucleotide
How well an RNA polymerase recognizes a gene's promoter is directly related to how readily it initiates the process of transcription.
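As a rough illustration of promoter recognition, one can score how closely a stretch of DNA matches the -35 and -10 consensus sequences from the table. The sketch below is a toy model, not how the polymerase or any published tool works. The function names are hypothetical, and the allowed spacing of 15 to 19 nucleotides between the two boxes is an assumption that does not come from the text above.

```python
def matches(site, consensus):
    """Count positions where site agrees with consensus; N matches any base."""
    return sum(c == "N" or s == c for s, c in zip(site, consensus))


def best_promoter_hit(dna, minus35="TTGACA", minus10="TATAAT", spacers=range(15, 20)):
    """Return (score, start of -35 box, start of -10 box) for the best match.

    The score is the number of matching positions across both boxes.
    The spacer range is an assumed parameter, not taken from the table.
    """
    best = None
    for i in range(len(dna) - len(minus35) + 1):
        box35 = dna[i:i + len(minus35)]
        for gap in spacers:
            j = i + len(minus35) + gap
            box10 = dna[j:j + len(minus10)]
            if len(box10) < len(minus10):
                continue  # the -10 box would run off the end of the sequence
            score = matches(box35, minus35) + matches(box10, minus10)
            if best is None or score > best[0]:
                best = (score, i, j)
    return best


# Made-up test sequence: perfect sigma-70 boxes separated by 17 nucleotides.
example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GG"
print(best_promoter_hit(example))
```

A higher score stands in for a promoter that the polymerase recognizes well. Swapping in the -35 and -10 sequences of another σ factor scores the same DNA against that factor instead.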
Operon
operon - a group of genes with a related function that share a single promoter, so that when one gene is transcribed, all are
Example: lactose operon - a set of three genes involved in the metabolism of the sugar lactose. Transcription of the operon results in the synthesis of one long polycistronic RNA molecule that contains the coding information needed by the ribosomes to make all three proteins.
Regulatory Proteins
Individual regulatory proteins also help bacterial gene expression respond to specific environmental circumstances at a finer scale than different σ factors, with affinities for only a handful of different promoters, could provide.
Example: Lactose level regulation
lactose repressor protein (pLacI) - a negative regulator that binds to a specific nucleotide sequence, the operon's "operator sequence", located immediately downstream of the -10 sequence of the operon's promoter.
pLacI - binds when the lactose levels in the cell are low, and acts as a roadblock that prevents RNA polymerases from transcribing any of the downstream coding sequences.
cyclic-AMP receptor protein (CRP) - a positive regulator that affects the lactose operon's sensitivity to glucose levels (glucose is a preferred sugar that is utilized more efficiently than lactose).
CRP binds to the lactose operon's promoter when the glucose level in the cell is low. It aids the attraction of the RNA polymerase to the lactose operon.
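The combined effect of the two regulators can be summarized as a small decision table. The sketch below is a deliberately simplified on/off model of the behavior described above. The function name and the three output labels are illustrative choices, and real expression levels vary continuously rather than in three steps.

```python
def lac_operon_state(lactose_present, glucose_low):
    """Simplified model of lactose operon transcription.

    pLacI binds the operator when lactose is low and blocks transcription.
    CRP binds the promoter when glucose is low and helps recruit RNA polymerase.
    """
    repressor_bound = not lactose_present
    crp_bound = glucose_low
    if repressor_bound:
        return "off"
    return "high" if crp_bound else "low"


for lactose in (False, True):
    for glucose_is_low in (False, True):
        state = lac_operon_state(lactose, glucose_is_low)
        print(f"lactose present={lactose}, glucose low={glucose_is_low}: {state}")
```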
Open Reading Frames (ORF) in Prokaryotes
There are 3 "stop" codons (UAA, UAG, and UGA) out of 64.
If all codons are expected to occur with the same frequency in a random DNA sequence, then what is the probability that a 60 codon sequence does not contain a "stop" codon?
(61/64)^60 = 0.056, or 5.6%
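The same calculation can be repeated for other lengths to see how quickly long stop-free runs become unlikely in random sequence. This is a minimal sketch under the equal-codon-frequency assumption stated in the question, and the function name is made up for illustration.

```python
def prob_no_stop(n_codons, n_stop=3, n_total=64):
    """Probability that n_codons random codons contain no stop codon,
    assuming all 64 codons are equally likely and independent."""
    return ((n_total - n_stop) / n_total) ** n_codons


for length in (60, 100, 200, 300):
    print(f"{length} codons: {prob_no_stop(length):.6f}")
```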
Since most prokaryotic proteins are longer than 60 amino acids, what would be an easy way to look for genes in a DNA sequence?
The codon AUG is usually used as the "start codon". For example, 83% of E. coli genes start with AUG, with the remainder starting with UUG or GUG.
Prokaryotic genes have the Shine-Dalgarno sequence 5'-AGGAGGU-3' immediately downstream of the transcriptional start site and just upstream of the first start codon.
These are the ribosome "loading" sites.
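Taken together, these points suggest a simple gene-finding heuristic: look for long stretches that begin with a start codon and run without a stop codon, then check for a ribosome loading site just upstream. The sketch below scans only the three reading frames of the given strand and works on DNA letters (T in place of U). The function names, the 20-nucleotide upstream window, and the use of the six-letter core AGGAGG are assumptions made for illustration.

```python
STOPS = {"TAA", "TAG", "TGA"}
STARTS = {"ATG", "GTG", "TTG"}


def find_orfs(dna, min_codons=60):
    """Return (start, end) pairs for ORFs on the given strand.

    start is the index of the first base of the start codon; end is
    exclusive and includes the stop codon. Only ORFs with at least
    min_codons codons before the stop are reported.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon in STARTS:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


def has_shine_dalgarno(dna, start, window=20, motif="AGGAGG"):
    """Check for the motif within window bases upstream of a start codon."""
    upstream = dna[max(0, start - window):start].upper()
    return motif in upstream


# Made-up toy sequence; min_codons is lowered so the short ORF is reported.
toy = "CCAGGAGGTCC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
for orf_start, orf_end in find_orfs(toy, min_codons=5):
    print(orf_start, orf_end, has_shine_dalgarno(toy, orf_start))
```

A fuller scan would also examine the reverse complement strand, giving six reading frames in total.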
Termination Sequences in Prokaryotes
Most (> 90%) prokaryotic operons contain sequences that signal the termination of transcription, called intrinsic terminators.
Properties of intrinsic terminators:
They contain an inverted repeat, e.g., 5'-CGGATG|CATCCG-3', so called because 5'-CGGATG-3' reads 5'-CATCCG-3' on its complementary strand
Inverted repeats are typically 7 to 20 nucleotides long and rich in G's and C's
RNA molecules transcribed from inverted repeats can adopt a stable secondary structure, as shown in Figure 6.4. During transcription, this secondary structure can cause the RNA polymerase to pause for an average of 1 minute.
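The inverted-repeat property can be checked mechanically: the second half of the repeat must equal the reverse complement of the first half. The sketch below searches for such pairs. The function names, the default stem length, and the optional loop between the two halves are illustrative assumptions, and the code does not test for GC richness.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence read 5' to 3' on the complementary strand."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, stem=6, max_loop=4):
    """Return (left_start, right_start, left_half) for each inverted repeat.

    stem is the length of each half; up to max_loop unpaired bases are
    allowed between the two halves.
    """
    hits = []
    for i in range(len(seq) - 2 * stem + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == target:
                hits.append((i, j, left))
    return hits


print(reverse_complement("CGGATG"))
print(find_inverted_repeats("AACGGATGCATCCGTT", stem=6, max_loop=0))
```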
GC Content in Prokaryotic Genomes
Measuring GC content is useful for identifying bacterial species since the GC content varies from 25% to 75% across prokaryotic species.
Each bacterial species' GC content seems to be independently shaped by the mutational biases of its DNA polymerases and DNA repair mechanisms working over long periods of time.
As a result, the relative ratio of G/C to A/T base pairs is generally uniform throughout any bacterium's genome.
However, bacteria also evolve through large-scale acquisitions of genes (tens to hundreds of thousands of nucleotides in length) from other organisms through a process called horizontal gene transfer.
So, bacterial genomes are patchworks of regions with distinctive GC contents that reflect the evolutionary history of the bacteria.
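One simple way to look for such patches is to compute GC content in windows along the genome and compare each window with the genome-wide value. The sketch below does only the windowed calculation. The function names and the default window and step sizes are arbitrary choices for illustration.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window=1000, step=500):
    """Return (window_start, gc_fraction) for each full window along seq."""
    return [
        (i, gc_content(seq[i:i + window]))
        for i in range(0, len(seq) - window + 1, step)
    ]


# Made-up toy sequence: a GC-rich patch next to an AT-rich patch.
toy_genome = "GC" * 10 + "AT" * 10
print(gc_content(toy_genome))
print(windowed_gc(toy_genome, window=10, step=10))
```

Windows whose GC fraction departs sharply from the genome-wide value are candidates for horizontally acquired regions, although judging whether a difference is significant is beyond this sketch.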
Prokaryotic Gene Density
Finding genes in a prokaryotic genome is relatively easy since: