The most common recognition pattern between transcription factors and DNA is an interaction between an alpha-helical domain of the factor and about five base pairs within the major groove.
Note: AP1 is a dimer formed from Fos and Jun, which belong to two different families of transcription factors. Jun can form both homodimers and heterodimers (the heterodimers have 10-fold higher binding affinity than the homodimers). Fos can form only heterodimers and cannot bind to DNA without a partner. CREB is the cyclic AMP response element binding protein. TFIIIA is an RNA polymerase III general transcription factor.
Eukaryotic transcription factors contain a
variety of structural motifs that interact with specific DNA sequences.
As with most bacterial activators and repressors, alpha helices
in the DNA-binding domain of eukaryotic transcription factors
are oriented so that they lie in the major groove of DNA where
protein atoms make specific hydrogen bonds and van der Waals interactions
with atoms in the DNA. Interactions with sugar-phosphate backbone
atoms and, in some cases, with atoms in the DNA minor groove contribute
to binding. X-ray crystallographic analyses of several complexes
between specific protein-binding sites in DNA and isolated transcription
factor DNA-binding domains have revealed a number of structural
motifs that can present an alpha helix to the major groove.
Transcription factors often are classified according to the type
of DNA-binding domain they contain. Most of the structural classes
of DNA-binding domains have characteristic consensus amino acid
sequences. Consequently, newly characterized transcription factors
frequently can be classified once the corresponding genes or cDNAs
are cloned and sequenced. A few of the more common classes of
DNA-binding domains whose three-dimensional structures have been
determined are described here. Many additional classes are recognized,
and new classes are still being characterized. The genomes of
higher eukaryotes may encode dozens of classes of DNA-binding
domains and literally hundreds of transcription factors (Figure
AY).
Zinc-Finger Proteins. A number of different proteins have regions that
fold around a central Zn2+ ion, producing a compact domain from
a relatively short length of the polypeptide chain. Termed a zinc
finger (zif), this structural motif was first recognized in
DNA-binding domains. It is now known to occur in proteins that
do not bind to DNA. To date, three classes of zinc-finger proteins
have been identified.
Fig. ET6 shows the
structure of one type of zinc finger DNA-binding domain, termed
the C2H2 (or classic) zinc finger. The name
is derived from the sequence of the repeating unit initially identified
in the DNA-binding domain of transcription factor IIIA, which
is required for transcription of 5S rRNA genes by RNA polymerase
III. Each repeating unit has the consensus sequence
(Tyr/Phe)-X-Cys-X(2-4)-Cys-X(3)-(Phe/Tyr)-X(5)-Leu-X(2)-His-X(3-4)-His,
where X is any amino acid and the number in parentheses gives the
count of X residues. Each repeating unit binds one zinc ion through
the two cysteine (C) and two histidine (H) side chains. The name
"zinc finger" was coined because a two-dimensional diagram
of the structure resembles a finger. When the three-dimensional
structure was solved, it became clear that the binding of the
zinc ion by the two cysteines and two histidines folds the relatively
short polypeptide sequence into a compact domain, which can insert
its alpha helix into the major groove of DNA.
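The consensus sequence given above can be turned into a simple pattern search. The following is a minimal sketch, not a validated motif scanner: the function name `find_c2h2_fingers` and the test sequence are hypothetical, and the test sequence was written by hand to fit the consensus rather than taken from a real protein. Real zinc fingers often deviate from the consensus, so a strict pattern like this one would miss some of them.

```python
import re

# Consensus from the text, one-letter amino acid codes:
# (Tyr/Phe) X Cys X2-4 Cys X3 (Phe/Tyr) X5 Leu X2 His X3-4 His
C2H2_PATTERN = re.compile(r"[YF].C.{2,4}C.{3}[FY].{5}L.{2}H.{3,4}H")


def find_c2h2_fingers(protein_seq):
    """Return (start_index, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(protein_seq)]


if __name__ == "__main__":
    # Hypothetical sequence built to satisfy the consensus, flanked by filler.
    example = "AAYKCPECGKSFSQKSNLQKHQRTHGG"
    for start, finger in find_c2h2_fingers(example):
        print(start, finger)
```

Because the two cysteines and two histidines are fixed positions in the pattern, a match also marks the four residues expected to coordinate the zinc ion.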
The C2H2 zinc finger is one of the most
common DNA-binding motifs in eukaryotic transcription factors.
More than a thousand of these consensus sequences are in the current
protein database. The repeating units in these proteins can interact
with successive groups of base pairs, primarily within the major
groove, as the protein wraps around the DNA double helix.
Leucine-Zipper Proteins. Another structural motif present
in a large class of transcription factors is exemplified by the
DNA-binding domain of yeast Gcn4. The first transcription factors
recognized in this class contained the hydrophobic amino acid
leucine at every seventh position in the C-terminal portion of
their DNA-binding domains (Fig. ET4). These
proteins bind to DNA as dimers, and mutagenesis of the leucines
showed that they were required for dimerization. Consequently,
the name leucine zipper was coined to denote this structural
motif.
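The "leucine at every seventh position" rule lends itself to a short scan. The sketch below is illustrative only: the function name `heptad_leucine_runs`, the default threshold of four repeats, and the toy sequence are assumptions, not values from the text.

```python
def heptad_leucine_runs(protein_seq, min_repeats=4):
    """Find runs of leucine (L) spaced exactly seven residues apart.

    Returns a list of (start_index, number_of_leucines) for each maximal
    run containing at least `min_repeats` leucines.
    """
    runs = []
    for i, residue in enumerate(protein_seq):
        if residue != "L":
            continue
        # Skip leucines that merely continue a run starting seven residues back.
        if i >= 7 and protein_seq[i - 7] == "L":
            continue
        count = 1
        j = i + 7
        while j < len(protein_seq) and protein_seq[j] == "L":
            count += 1
            j += 7
        if count >= min_repeats:
            runs.append((i, count))
    return runs


if __name__ == "__main__":
    # Toy sequence: a leucine followed by six alanines, repeated four times.
    toy = "LAAAAAA" * 4
    print(heptad_leucine_runs(toy))
```

A hit from a scan like this suggests a candidate zipper at most. As the text notes, dimerization was established experimentally by mutating the leucines.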
X-ray crystallographic analysis of complexes between DNA and the
Gcn4 DNA-binding domain has shown that the dimeric protein contains
two extended alpha helices that "grip" the DNA molecule,
much like a pair of scissors, at two adjacent major grooves separated
by about half a turn of the double helix (Fig. ET5). The portions of the alpha helices contacting the
DNA include basic residues that interact with phosphates in the
DNA backbone and additional residues that interact with specific
bases in the major groove.
Gcn4 forms dimers via hydrophobic interactions between the C-terminal
regions of the alpha helices, forming a coiled-coil structure.
This structure is common in proteins containing amphipathic alpha
helices in which hydrophobic amino acid residues are regularly
spaced alternately three or four positions apart in the sequence.
As a result of this characteristic spacing, the hydrophobic side
chains form a stripe down one side of the alpha helix. The hydrophobic
stripes make up the interacting surfaces between the alpha-helical
monomers in a coiled-coil dimer.
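The link between the alternating three-and-four spacing and a single stripe can be checked with a little arithmetic. The sketch below assumes an ideal alpha helix with about 3.6 residues per turn, that is, 100 degrees of rotation per residue. This value is a standard textbook figure but is not stated in this section, and the function name `helical_wheel_angles` is hypothetical.

```python
def helical_wheel_angles(positions, degrees_per_residue=100.0):
    """Angle of each residue around the helix axis, in degrees (0 to 360).

    Assumes an ideal alpha helix: 3.6 residues per turn, so each residue
    is rotated 100 degrees relative to the previous one.
    """
    return [(p * degrees_per_residue) % 360.0 for p in positions]


if __name__ == "__main__":
    # Hydrophobic positions spaced alternately three and four residues apart.
    hydrophobic_positions = [0, 3, 7, 10, 14]
    print(helical_wheel_angles(hydrophobic_positions))
```

Under this assumption the five positions fall at 0, 300, 340, 280, and 320 degrees, all within an arc of about 80 degrees, which is the stripe described above. The slow drift of 20 degrees per seven residues is consistent with the helices winding around each other in a coiled coil.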
As noted above, the first transcription factors in this class
to be analyzed contained leucine residues at every seventh position
in the dimerization region and thus were named leucine-zipper
proteins. However, additional DNA-binding proteins containing
other hydrophobic amino acids in these positions subsequently
were identified. Like leucine-zipper proteins, they form dimers
containing a C-terminal coiled-coil region and N-terminal DNA-binding
domain. The term basic zipper (bZip) now is frequently
used to refer to all proteins with these common structural features.
Many basic-zipper transcription factors are heterodimers of two
different polypeptide chains, each containing one basic-zipper
region.
Heterodimeric Transcription Factors Increase Regulatory Diversity
Three types of DNA-binding proteins discussed
in the previous section can form heterodimers: C4 zinc-finger
proteins, basic-zipper proteins, and helix-loop-helix proteins.
Other classes of transcription factors whose structures have not
yet been determined also form heterodimeric proteins. One consequence
of heterodimer formation is an expansion of the number of potential
DNA sequences that a family of factors can bind. Heterodimer formation
also allows different combinations of activation domains to be
brought together at regulatory sequences. In addition, there are
examples of basic-zipper and helix-loop-helix proteins that block
DNA binding when they dimerize with another polypeptide otherwise
capable of binding DNA. When these inhibitory factors are expressed,
they repress transcriptional activation by the factors with which
they interact.
The rules governing the interactions of members of a transcription-factor
class are complex. In the example shown in Fig. 11-52, factors A, B, and C can interact with each other,
but the inhibitory factor interacts only with factor A. This combinatorial
complexity expands both the number of DNA sites from which these
factors can activate transcription and the ways in which they
can be regulated. This is not possible for transcription factors
that bind only as monomers or homodimers.
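The combinatorial gain can be made concrete by enumerating the dimers. The sketch below assumes that each of factors A, B, and C can also pair with itself, which the text does not state explicitly, and it writes the inhibitory factor as "I". The function name `dimer_species` is hypothetical.

```python
from itertools import combinations_with_replacement


def dimer_species(dna_binders, inhibitor_partners):
    """Enumerate dimers formed by a family of factors.

    dna_binders: factors that can pair with themselves and with each other
                 (homodimerization is an assumption of this sketch).
    inhibitor_partners: factors that the inhibitory factor "I" can bind.
    Returns (active_dimers, blocked_dimers) as lists of labels.
    """
    active = ["".join(pair)
              for pair in combinations_with_replacement(sorted(dna_binders), 2)]
    blocked = [f"{factor}-I" for factor in sorted(inhibitor_partners)]
    return active, blocked


if __name__ == "__main__":
    active, blocked = dimer_species(["A", "B", "C"], ["A"])
    print("DNA-binding dimers:", active)
    print("Inhibited complexes:", blocked)
```

With three factors, homodimers alone would give three DNA-binding species. Allowing heterodimers gives six, and the inhibitor adds one non-binding complex that removes factor A from the pool.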
Although the three-dimensional structures of
the DNA-binding domains from numerous eukaryotic transcription
factors have been determined, the structure of not one activation
domain has yet been solved. Nonetheless, activation domains defined
by mutation analysis exhibit common amino acid sequence features
in some cases.
For example, the activation domains of Gal4, Gcn4, and most other yeast
transcription factors analyzed so far are rich in acidic amino acids
(aspartic and glutamic acids). Deletion analyses of numerous transcription factors from
mammals and Drosophila have identified several classes
of activation domains. Some are glutamine rich, some are proline
rich, and some are rich in the closely related amino acids serine
and threonine, both of which have hydroxyl groups. However, some
strong activation domains that are not particularly rich in any
specific amino acid also have been identified.
Most activation domains characterized in yeast transcription factors
also stimulate transcription in mammalian cells, whereas a number
of mammalian activation domains do not stimulate transcription
when tested in yeast. Thus, while some activation mechanisms function
in all eukaryotic cells, some mechanisms may have evolved since
the divergence of yeast and animals.
Comparisons between the sequences of many transcription factors suggest that common types of motifs can be found that are responsible for binding to DNA. The motifs are usually quite short and comprise only a small part of the protein structure. Motifs have also been identified that are responsible for activating transcription via interactions between proteins of the transcription apparatus.
We have detailed information about several groups of proteins that regulate transcription by using particular motifs to bind DNA:
The steroid receptors are defined as a group by a functional relationship: each receptor is activated by binding a particular steroid. The glucocorticoid receptor is the most fully analyzed. Together with other receptors, such as the thyroid hormone receptor or the retinoic acid receptor, the steroid receptors are members of a superfamily of transcription factors with the same general mode of action.
The zinc finger motif comprises a DNA-binding domain. It was originally recognized in factor TFIIIA, which is required for RNA polymerase III to transcribe 5S rRNA genes. It has since been identified in several other transcription factors (and presumed transcription factors). A distinct form of the motif is found also in the steroid receptors.
The helix-turn-helix motif was originally identified as the DNA-binding domain of phage repressors. One alpha-helix lies in the major (wide) groove of DNA; the other lies at an angle across DNA. A related form of the motif is present in the homeodomain, a sequence first characterized in several proteins coded by genes concerned with developmental regulation in Drosophila. It is also present in genes for mammalian transcription factors.
The amphipathic helix-loop-helix (HLH) motif has been identified in some developmental regulators and in genes coding for eukaryotic DNA-binding proteins. Each amphipathic helix presents a face of hydrophobic residues on one side and charged residues on the other side. The length of the connecting loop varies from 12-28 amino acids. The motif enables proteins to dimerize, and a basic region near this motif contacts DNA.
Leucine zippers consist of a stretch of amino acids with a leucine residue in every seventh position. A leucine zipper in one polypeptide interacts with a zipper in another polypeptide to form a dimer. Adjacent to each zipper is a stretch of positively charged residues that is involved in binding to DNA.
A factor is tissue-specific because it is synthesized only in a particular type of cell. This is typical of factors that regulate development, such as homeodomain proteins.
The activity of a factor may be directly controlled by modification. HSTF is converted to the active form by phosphorylation. AP1 (a heterodimer between the subunits Jun and Fos) is converted to the active form by phosphorylating the Jun subunit.
A factor is activated or inactivated by binding a ligand. The steroid receptors are prime examples. Ligand binding may influence the localization of the protein (causing transport from cytoplasm to nucleus), as well as determining its ability to bind to DNA.
One transcription factor is produced as a protein bound to the nuclear envelope and endoplasmic reticulum. The absence of sterols (such as cholesterol) causes the cytosolic domain to be cleaved; it then translocates to the nucleus and provides the active form of the transcription factor.
Availability of a factor may vary. For example, the factor NF-kappa B (which activates immunoglobulin kappa genes in B lymphocytes) is present in many cell types, but it is sequestered in the cytoplasm by the inhibitory protein I-kappa B. In B lymphocytes, NF-kappa B is released from I-kappa B and moves to the nucleus, where it activates transcription.
A dimeric factor may have alternative partners.
One partner may cause it to be inactive; synthesis of the active
partner may displace the inactive partner. Such situations may
be amplified into networks in which various alternative partners
pair with one another, especially among the HLH proteins.
Hormone binding to a nuclear receptor regulates its activity as a transcription factor. This regulation differs in some respects for heterodimeric and homodimeric nuclear receptors.
When heterodimeric nuclear receptors are bound to their cognate sites in DNA, they act as repressors or activators of transcription depending on whether hormone occupies the ligand-binding site. Examples include RXR-VDR, responsive to vitamin D3; RXR-TR, responsive to thyroid hormone; and RXR-RAR, responsive to retinoic acid. RXR is the common nuclear-receptor monomer in each of these heterodimers. In the absence of hormone, these nuclear receptors direct histone deacetylation at nearby nucleosomes. In the presence of hormone, the ligand-binding domain undergoes a dramatic conformational change. In the ligand-bound conformation, these nuclear receptors can direct hyperacetylation of histones in nearby nucleosomes, thereby reversing the repressing effects of the hormone-free ligand-binding domain. The N-terminal activation domain in these receptors then probably interacts with additional factors, stimulating the cooperative assembly of an initiation complex.
In contrast to heterodimeric nuclear receptors, which are located exclusively in the nucleus, homodimeric receptors are found in both the cytoplasm and the nucleus, and their activity is regulated by controlling their transport from the cytoplasm to the nucleus. The hormone-dependent translocation of the homodimeric glucocorticoid receptor (GR) was demonstrated in the transfection experiments shown in Figure BA1. The GR hormone-binding domain alone mediates this transport. Subsequent studies showed that, in the absence of hormone, the glucocorticoid receptor is anchored in the cytoplasm as a large protein aggregate complexed with inhibitor proteins, including Hsp90, a protein related to Hsp70, the major heat-shock chaperone. In this situation, the receptor cannot interact with target genes; hence, no transcriptional activation occurs. Binding of hormone releases the glucocorticoid receptor from its cytoplasmic anchor, allowing it to enter the nucleus, where it can bind to response elements associated with target genes (Figure BA2). Once the receptor with bound hormone interacts with a response element, it activates transcription by directing histone hyperacetylation and facilitating cooperative assembly of an initiation complex.
Most eukaryotic transcription factors that
have been studied extensively are activators, which stimulate
transcription. However, proteins that repress transcription also
have been identified in eukaryotes. Some repressor proteins function
by binding to DNA sequences that overlap activator-binding sites.
Other repressors function by binding to sequences that overlap
a transcription start site, much like prokaryotic repressors.
In both cases, binding of a repressor molecule to a specific DNA
site blocks binding of proteins required to initiate transcription.
In many cases, however, eukaryotic repressors inhibit transcription
without interfering with the binding of an activator or general
transcription factor. One important example is the protein encoded
by the Wilms' tumor (WT1) gene, which is expressed
preferentially in the developing kidney. Children who inherit
mutations in both the maternal and paternal WT1 genes,
so that they produce no functional WT1 protein, invariably develop
kidney tumors early in life. The WT1 protein has a C2H2
zinc-finger DNA-binding domain, and binding sites for the protein
were discovered in the control region of the gene encoding a transcription
activator called EGR-1. The experiment outlined in Fig. 11-55 demonstrated that WT1 protein repressed transcription
of a reporter gene linked to the EGR-1 promoter region.
Eukaryotic transcription repressors like WT1 appear to be the
functional converse of activators. They can inhibit transcription
from a gene they do not normally regulate when their cognate binding
sites are placed within a few hundred base pairs of the gene's
start site. This effect was demonstrated in an experiment with
Krüppel protein, which represses transcription of several
genes during embryonic development of Drosophila. Like
activators, many eukaryotic repressors have two functional domains:
a DNA-binding domain and a repression domain. When the Krüppel
repression domain was fused to the DNA-binding domain of the E.
coli lac repressor, the resulting fusion protein inhibited
transcription of a reporter gene linked to upstream lac
operator sites. As discussed previously for activation domains,
a variety of amino acid sequences can function as repression domains.
Little information is yet available on the mechanism by which
eukaryotic repressors inhibit transcription.