17.3: Additional Levels of Regulating Transcription

Eukaryotes regulate transcription via promoter sequences close to the transcription unit (as in prokaryotes) and also use more distant enhancer sequences to provide more variation in the timing, level, and location of transcription. However, there are still additional levels of genetic control, and these levels are often interconnected.

Chromatin Dynamics

Despite the simplified way in which we often represent DNA in figures such as those in this chapter, DNA is almost always associated with various chromatin proteins. For example, histones remain associated with the DNA even during transcription. Thus the rate of transcription is also controlled by the accessibility of DNA to RNA polymerase and regulatory proteins. In regions where the chromatin is highly compacted, it is unlikely that any gene will be transcribed, even if all the necessary cis- and trans-factors are present in the nucleus. The extent of chromatin compaction in various regions is regulated through the action of chromatin remodeling proteins. These protein complexes include enzymes that add chemical tags, such as methyl or acetyl groups, to various DNA-bound proteins, or remove them. These modifications alter the local chromatin density and thus the availability of the DNA for transcription. Acetylated histones, for example, tend to be associated with actively transcribed genes, whereas deacetylated histones are associated with genes that are silenced (Figure 15).

Likewise, methylation of DNA itself is also associated with transcription regulation. Cytosine bases, particularly when followed by a guanine (CpG sites), are important targets for DNA methylation (Figure 16). Methylated cytosine within clusters of CpG sites is often associated with transcriptionally inactive DNA.

The modification of DNA and its associated proteins is enzymatically reversible (acetylation/deacetylation; methylation/demethylation) and is thus a cyclical activity. Regulating these modifications provides another layer through which eukaryotic cells control the transcription of specific genes.

Chapter 17 Regulation of Gene Transcription and Keratinocyte Differentiation by Anandamide

Anandamide (AEA) is a member of an endogenous class of lipid mediators, known as endocannabinoids, which are involved in various biological processes. In particular, AEA regulates cell growth, differentiation, and death. Accumulating evidence demonstrates that AEA also controls epidermal differentiation, one of the best characterized mechanisms of cell specialization. The epidermis is a keratinized multistratified epithelium that functions as a barrier to protect the organism from dehydration, mechanical trauma, and microbial insults. Its function is established during embryogenesis and is maintained throughout the life span of the organism by a complex and tightly controlled program, termed epidermal terminal differentiation (or cornification). Whereas the morphological changes that occur during cornification have been extensively studied, the molecular mechanisms that underlie this process remain poorly understood.

In this chapter, we summarize current knowledge about the molecular regulation of proliferation and terminal differentiation in mammalian epidermis. In this context, we show that endocannabinoids are finely regulated by, and can interfere with, the differentiation program. In addition, we review the role of AEA in the control of cornification, and show that AEA acts by maintaining transcriptional repression through increased DNA methylation.

Chromatin modifier enzymes, the histone code and cancer

In all organisms, cell proliferation is orchestrated by coordinated patterns of gene expression. Transcription results from the activity of the RNA polymerase machinery and depends on the ability of transcription activators and repressors to access chromatin at specific promoters. Over the last decades, increasing evidence has implicated aberrant transcription regulation in the development of human cancers. In fact, transcription regulatory proteins are often identified in oncogenic chromosomal rearrangements and are overexpressed in a variety of malignancies. Most transcription regulators are large proteins containing multiple structural and functional domains, some with enzymatic activity. These activities modify the structure of the chromatin, occluding certain DNA regions and exposing others for interaction with the transcription machinery. Thus, chromatin modifiers represent an additional level of transcription regulation. In this review we focus on several families of transcription activators and repressors that catalyse histone post-translational modifications (acetylation, methylation, phosphorylation, ubiquitination and SUMOylation) and on how these enzymatic activities might alter the correct cell proliferation program, leading to cancer.


Identification of genes that affect DNA methylation

In order to identify genes that affect DNA methylation, we employed an approach that consists of two parts. First, we identified predictive genetic variants for the expression of each gene in our data and aggregated these into single predictive scores termed genetic instruments (GIs) [13]. Second, we used these GIs as causal anchors to establish directed effects of gene expression on genome-wide DNA methylation levels, while ensuring that these associations were specific by accounting for linkage disequilibrium (LD) and pleiotropy among neighboring GIs (see Fig. 1 for an overview of the successive steps in the analysis).

Fig. 1. Flowchart showing the successive steps leading to the identification of 818 genes that affect DNA methylation in trans.

To construct the genetic instruments, we used data on 3357 unrelated individuals with available genotype and RNAseq data derived from whole blood. We focused the analysis on 11,830 expressed genes (median counts per million > 1). In the training set (1/3 of the data, 1119 individuals), we obtained a GI for the expression of each gene, which consisted of 1 or more SNPs selected by applying LASSO regression to nearby genetic variants [18]. We corrected the expression data for age, sex, biobank, blood cell composition, and five principal components. We then assessed the predictive ability of the constructed GIs in a separate test set of 2238 individuals by predicting their gene expression values using the GIs derived in the training set. Of the 11,830 tested GIs, 8644 were sufficiently predictive of expression levels in the test set to serve as valid GIs (F-statistic > 10, median R² = 0.04; Additional file 1: Table S1) [19].
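The F-statistic criterion can be related to explained variance with a short calculation. The sketch below assumes, as a simplification that the text does not state, that each GI enters an ordinary least-squares regression as a single predictor in the 2238-person test set; the function names are illustrative only.

```python
def f_statistic(r2: float, n: int, k: int = 1) -> float:
    """F-statistic of an OLS fit with k predictors, n samples, and explained variance r2."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def min_r2_for_f(f_min: float, n: int, k: int = 1) -> float:
    """Smallest explained variance that reaches f_min (algebraic inverse of f_statistic)."""
    return f_min * k / (f_min * k + (n - k - 1))


N_TEST = 2238  # test-set size reported in the text

# Median reported R^2 of 0.04, treated as a single-predictor fit (assumption).
print(round(f_statistic(0.04, N_TEST), 1))   # 93.2
# Explained variance needed to just reach the F > 10 cut-off.
print(round(min_r2_for_f(10, N_TEST), 4))    # 0.0045
```

Under this assumption, an instrument explaining less than about half a percent of expression variance would fall below the F > 10 cut-off, while the median instrument sits well above it.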

Next, we tested for an association between all 8644 predictive GIs and genome-wide DNA methylation levels at 428,126 autosomal CpG sites in trans (> 10 Mb distance from the tested gene), using genotype and DNA methylation data (Illumina 450k array) derived from whole blood of 4056 unrelated individuals (3251 samples overlapped with RNAseq data). These associations were computed using linear regression, while correcting for age, sex, blood cell composition, biobank, and five principal components, and test statistics were corrected for bias and inflation [20]. These analyses resulted in directed associations between 2223 genes and 5284 CpGs (Bonferroni correction, P < 1.4 × 10⁻¹¹; Additional file 2: Table S2). Although directed, the associations resulting from this analysis may not be specific for a single gene, as linkage disequilibrium (LD) and/or pleiotropy may result in GIs that are predictive of multiple neighboring genes [13]. We therefore adjusted all significant GI-CpG pairs for all neighboring GIs (< 1 Mb) to account for correlation induced by LD/pleiotropy among neighboring genes. This enabled us to identify the specific gene in a region driving the directed association. Next, we removed genes with potential residual pleiotropic effects on the expression of neighboring significant genes (F > 5). Together, these two steps led to the removal of 1387 genes and 2844 CpGs (Additional file 3: Table S3). Finally, we excluded effects of long-range pleiotropy and LD (by rerunning the analysis for CpGs affected by multiple genes from the same chromosome, including all these genes in the model; this removed 6 genes and 13 CpGs), and residual effects of white blood cell composition (by correcting for genetic variants known to be associated with WBC; this removed 12 genes and 43 CpGs; Additional file 4: Fig. S1) [21, 22].
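The reported significance threshold and the step-wise filtering counts can be checked with simple arithmetic. The sketch below assumes a family-wise error rate of 0.05, which the text does not state explicitly; under that assumption the thresholds agree with the reported 1.4 × 10⁻¹¹ (all GI-CpG pairs) and with the gene-level 1.2 × 10⁻⁷ (one gene against all CpGs) used later in the text.

```python
ALPHA = 0.05        # assumed family-wise error rate (not stated in the text)
N_GIS = 8644        # predictive genetic instruments (from the text)
N_CPGS = 428_126    # autosomal CpG sites tested (from the text)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P-value threshold under Bonferroni correction."""
    return alpha / n_tests


genome_wide = bonferroni_threshold(ALPHA, N_GIS * N_CPGS)
gene_level = bonferroni_threshold(ALPHA, N_CPGS)
print(f"{genome_wide:.2e}")  # 1.35e-11
print(f"{gene_level:.2e}")   # 1.17e-07

# Step-wise filtering: (genes, CpGs) removed at each stage, as reported in the text.
genes, cpgs = 2223, 5284
for removed_genes, removed_cpgs in [(1387, 2844), (6, 13), (12, 43)]:
    genes -= removed_genes
    cpgs -= removed_cpgs
print(genes, cpgs)  # 818 2384
```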

The final result of our step-wise analysis was a collection of 818 genes with directed and specific associations with DNA methylation levels of 2384 unique target CpGs in trans (Bonferroni correction, P < 1.4 × 10⁻¹¹; Additional file 5: Table S4). The target CpGs were located in 1915 distinct regions (consecutive probes within < 1 kb), and for genes affecting DNA methylation at more than 1 CpG site, on average 33% of the target CpGs were co-localized (< 1 kb) with at least one other target CpG (Additional file 6: Table S5).
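One way to read "consecutive probes within < 1 kb" is as a chaining rule: sort the CpGs on each chromosome and start a new region whenever the gap to the previous site reaches 1 kb. The sketch below implements that reading. The function names and the example coordinates are hypothetical, and the authors' exact procedure may differ.

```python
from collections import defaultdict


def group_into_regions(cpgs, max_gap=1000):
    """Chain CpGs into regions: adjacent sites on a chromosome less than max_gap bp apart."""
    by_chrom = defaultdict(list)
    for chrom, pos in cpgs:
        by_chrom[chrom].append(pos)

    regions = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        current = [positions[0]]
        for pos in positions[1:]:
            if pos - current[-1] < max_gap:
                current.append(pos)               # extends the current region
            else:
                regions.append((chrom, current))  # gap too large: close the region
                current = [pos]
        regions.append((chrom, current))
    return regions


def colocalized_fraction(regions):
    """Fraction of CpGs that share a region with at least one other CpG."""
    total = sum(len(sites) for _, sites in regions)
    shared = sum(len(sites) for _, sites in regions if len(sites) > 1)
    return shared / total


# Hypothetical target CpGs as (chromosome, position) pairs.
targets = [("chr6", 1_000), ("chr6", 1_400), ("chr6", 5_000),
           ("chr19", 200), ("chr19", 900), ("chr19", 1_950)]
regions = group_into_regions(targets)
print(len(regions))                              # 4
print(round(colocalized_fraction(regions), 2))   # 0.67
```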

The validity of these results was corroborated by a comparison with previous trans-methylation QTL studies in blood. Although not designed to infer genes that are specifically responsible for associations, such studies are expected to produce partly overlapping outcomes. We found that 1638 target CpGs identified in our study were reported in three previous independent trans-meQTL studies (OR = 103, P < 1 × 10⁻³²) [23,24,25]. For the great majority of overlapping CpGs, the corresponding GI and trans-meQTL SNP were in close proximity (Additional file 4: Table S6, Additional file 7: Table S7, Additional file 8: Table S8, Additional file 9: Table S9).

We performed post hoc power analyses to assess the power we had to detect varying effect sizes for each gene tested (Additional file 4: Fig. S2 and Additional file 1: Table S1) [26]. In the uncorrected analysis (not corrected for neighboring GIs), we had > 0.8 power to detect effect sizes of 1 SD (1 standard deviation change in DNA methylation upon 1 standard deviation change in expression) for about 85% of the tested genes, and for about 50% of the genes (4475), we had > 0.8 power to detect effect sizes of 0.5 SD (Additional file 4: Fig. S2). Correcting for neighboring GIs is required to identify specific genes (instead of genomic regions with multiple correlated genes), but does so at the cost of reduced power. Correction left 5685 genes (compared to 7299) with power > 0.8 to detect effect sizes of 1 SD and left 3061 genes (compared to 4475) with > 0.8 power to detect effect sizes of 0.5 SD (Additional file 4: Fig. S2). This analysis shows that for the majority of tested genes, we were well-powered to detect large effects, and for over a third of the genes, we were well-powered to detect medium effect sizes. We included the explained variance and power across varying effect sizes for each gene in Additional file 1: Table S1.

Function of genes that affect DNA methylation in trans

As shown in Fig. 2, a considerable fraction (N = 308) of the identified genes affected multiple CpGs in trans (Additional file 6: Table S5). We observed that these genes, often consistently, either increased or decreased DNA methylation at their target CpGs (Fig. 2a). For 30 out of 37 genes that were associated with 10 or more CpGs, the direction of effect was significantly skewed towards increased (19 genes) or decreased (11 genes) methylation levels, respectively (binomial test, FDR < 0.05, Additional file 10: Table S10). We first considered two previously hypothesized molecular roles of the identified genes: transcription factors [27] and core epigenetic factors [6], which we will now discuss in more detail.
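The direction-of-effect test can be illustrated with an exact two-sided binomial test against an equal chance of increased and decreased methylation. The sketch below uses hypothetical counts and omits the FDR correction across genes that the text applies afterwards; the function name is illustrative.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int) -> float:
    """Exact two-sided binomial P-value for k successes in n trials under p = 0.5."""
    tail = min(k, n - k)
    # Probability of a result at least as extreme in one direction.
    one_sided = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    # The null distribution is symmetric, so doubling gives the two-sided value.
    return min(1.0, 2 * one_sided)


# Hypothetical gene: 12 target CpGs, 11 of them with increased methylation.
print(round(binomial_two_sided_p(11, 12), 4))  # 0.0063
# A perfectly balanced gene shows no skew.
print(binomial_two_sided_p(6, 12))             # 1.0
```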

Fig. 2. A considerable fraction of the identified genes (N = 308) affected multiple target CpGs in trans. a Each dot represents a gene with trans DNA methylation effects. The x-axis shows the number of affected target CpGs with decreased methylation levels upon increased gene expression, and the y-axis shows the number of affected target CpGs with increased methylation levels upon increased gene expression. The figure in the right upper corner is a zoomed-in version in which only genes that affect fewer than 25 CpG sites in either direction are displayed. b Bars represent the number of genes with either 1, 2, 3–5, or more than 5 target CpGs. The percentage of genes that are annotated as transcription factors increases with the number of target CpGs.

Transcription factors

We found that the identified genes (818) were enriched for transcription factors (TFs) (N = 127, odds ratio = 2.74, P = 3.1 × 10⁻¹⁸) using a manually curated list of TFs [27]. This enrichment was not explained by TFs having stronger genetic instruments; in fact, non-TFs had stronger instruments than TFs (P = 6.3 × 10⁻⁸; Additional file 4: Fig. S3). As shown in Fig. 3a, this enrichment was driven by TFs that were associated with multiple target CpGs, and there was a stronger TF enrichment with an increasing number of target CpGs. In total, 80 (63%) of the significant TFs in our data affected more than 1 CpG site, which was a significant enrichment compared to the non-TF genes (OR = 3.45, P = 3.1 × 10⁻¹⁰). We further found that the target CpGs of TFs frequently co-localized. For TFs affecting more than 1 CpG, on average, 45% of the target CpGs were co-localized with at least one other target CpG (< 1 kb), which was a significant enrichment compared to non-TFs (average non-TFs = 25%, OR = 2.5, P = 2.2 × 10⁻²¹). The majority of TFs either consistently increased or consistently decreased DNA methylation at their target CpGs: a significant skew in the direction of effect was present for 20 out of 23 TFs that were associated with at least 10 CpGs (6 consistently decreased, and 14 consistently increased DNA methylation at the target CpGs, respectively). TFs affecting the most CpGs included NFKB1, a key immune regulator (142 target CpGs; 127 regions, where a region groups CpGs spaced less than 1 kb apart); ZBTB38, a methyl-binding TF (49 target CpGs; 34 regions); and ZNF202, a zinc finger protein involved in lipid metabolism (37 target CpGs; 19 regions). One hundred out of the 127 (79%) TFs belonged to the C2H2 zinc finger family (odds ratio = 3.07, P = 5.2 × 10⁻⁷), of which the majority (N = 70) contained a KRAB domain. In line with the enrichment for TFs and zinc fingers, the GO terms Nucleic Acid binding (N = 99, P = 1.1 × 10⁻¹⁴), DNA Binding (N = 114, P = 4.7 × 10⁻⁹), Metal Ion binding (N = 146, P = 1.4 × 10⁻⁸), and transcription factor activity (N = 73, P = 4.4 × 10⁻⁸) were overrepresented in the gene set (Additional file 11: Table S11).
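The enrichment statistics above are odds ratios from 2 × 2 tables (identified versus other tested genes, annotated versus not annotated). Because the background counts are not given in the text, the sketch below uses entirely hypothetical counts and does not reproduce any reported value. It also swaps in a Wald confidence interval on the log scale for simplicity, which is not necessarily the method the authors used.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald confidence interval for a 2 x 2 table.

    a: identified and annotated       b: identified, not annotated
    c: background and annotated       d: background, not annotated
    """
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # standard error of log(OR)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: 30 of 100 identified genes and 100 of 900 other genes are annotated.
odds_ratio, lower, upper = odds_ratio_ci(30, 70, 100, 800)
print(round(odds_ratio, 2))  # 3.43
print(lower > 1.0)           # True: the interval excludes 1, indicating enrichment
```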

Fig. 3. a Enrichment (odds ratio) for transcription factors among identified genes with either 1, 2, 3–5, or more than 5 target CpGs. Error bars represent 95% confidence intervals. b Transcription factor binding site enrichment. Each dot represents a transcription factor, with on the x-axis the logarithm of the number of target CpGs for that transcription factor (at a gene-level significance level, P < 1.2 × 10⁻⁷), and on the y-axis the odds ratio for the enrichment of the target CpGs in its experimentally determined binding sites (ChIP-seq). The size of the dots represents the significance (FDR), and TFs for which the target CpGs were significantly enriched in their binding sites are colored in blue.

To assess whether TFs may affect DNA methylation directly (i.e., at their binding sites), we leveraged existing ChIP-seq data [28]. For each TF, we determined the overlap between the target CpGs (at a gene-level significance threshold: P < 1.2 × 10⁻⁷) and its experimentally determined binding sites as compared to a GC-content matched background. ChIP-seq data were available for 59 out of 110 TFs affecting multiple CpGs (at P < 1.2 × 10⁻⁷). For one third of these TFs (N = 20), target CpGs were significantly enriched for co-localization with their respective TF binding sites (FDR < 0.05; Fig. 3b, Additional file 12: Table S12).

Core epigenetic factors

Next, we compared our findings with a manually curated database of core epigenetic factors (EpiFactors) [6]. This database is mainly focused on the core enzymes that directly write, maintain, and/or establish epigenetic marks, but it also includes a few “borderline cases”, such as TFs that interact with epigenetic proteins. We found that 36 of the identified genes overlapped with genes in this database (odds ratio = 1.02, P > 0.05), of which 12 affected more than 1 CpG, which did not constitute a significant enrichment compared to the other genes affecting multiple CpGs (odds ratio = 0.82, P > 0.05). Interestingly, the majority of the 36 genes encode proteins that target histone proteins (27 out of 36, OR = 1.13, P > 0.05). Another 7 genes were also annotated as TFs in the manually curated TF catalog [27]. The core epigenetic factors associated with most target CpGs include transcription factor IKZF1 (positively associated with methylation at 17 target CpGs), histone demethylase KDM5B (positively associated with methylation at 7 target CpGs), and BRD3, which recognizes acetylated lysine residues on histones (positively associated with methylation at 5 target CpGs). The significant core epigenetic factors also included the DNA methyltransferase DNMT3A, which was associated with increased methylation at five target CpGs. Further exploration of potential DNMT3A targets indicated that the test statistics of DNMT3A were skewed towards increased DNA methylation levels, compatible with widespread but small effects (Additional file 4: Fig. S4). Of note, of the other main DNA methylation modifiers (DNMT1, DNMT3B, TET1,2,3), we had a sufficiently predictive GI for DNMT1 only. However, both in the corrected and the uncorrected (for neighboring GIs) analyses, we did not find significant associations for this gene (Additional file 4: Fig. S4), although the statistical power of the uncorrected analysis was very similar to that of DNMT3A (Additional file 1: Table S1).

Other mechanisms underlying regulation of DNA methylation

Finally, the majority of the identified genes (N = 662) did not belong to the two a priori categories (TFs and core epigenetic factors; Fig. 2). A small fraction of these genes encodes proteins with DNA-binding properties (N = 24, OR = 0.91, P > 0.05). BEND3, for example, is a DNA-binding protein that was associated with increased methylation at 15 CpG sites. A previous study showed that BEND3 represses transcription by attracting the MBD3/NuRD complex that initiates histone deacetylation [29].

GO term enrichment analysis did not reveal significant functions underlying these genes. To explore possible biological functions among these genes, we provide case studies below for the five genes from this set that were associated with the most target CpGs: SENP7 (189 target CpGs), CDCA7 (79 target CpGs), NFKBIE (76 target CpGs), CDCA7L (47 target CpGs), and NLRC5 (43 target CpGs).


The NFKBIE gene encodes IκBε, which is an inhibitor of NFκB, a transcription factor that plays a fundamental role in the regulation of the immune response [30, 31]. IκBε binds to components of NFκB and retains it in the cytoplasm, thereby preventing it from activating genes in the nucleus. Consistent with the previous interpretation of a trans-methylation QTL effect [16], increased expression of NFKB1 was associated with genome-wide loss of DNA methylation. In contrast, increased expression of NFKBIE resulted in higher methylation levels at 76 CpG sites across the genome (70 regions). In line with its role as an NFκB inhibitor, a substantial number of its target CpGs (28) overlap with NFκB’s target CpGs and show opposite effects (Fig. 4a). To further characterize the target CpGs, we overlapped the CpGs with trait-associated probes included in EWASdb [32] (results for all genes are included in Additional file 13: Table S13). The target CpGs were enriched for CpGs associated with obesity/BMI, consistent with the role of NFκB in obesity-related inflammation [33].

Fig. 4. a Network for transcription factor NFKB1 and its inhibitor NFKBIE. Gray circles indicate target CpGs, and arrows represent directed associations (i.e., association between GI and DNA methylation levels). Blue lines indicate a positive association between gene expression and DNA methylation levels; red lines indicate a negative association between gene expression levels and DNA methylation levels. b NLRC5 (chromosome 16) was associated with decreased DNA methylation levels at multiple (N = 43) CpG sites in the classical and extended MHC region (chromosome 6). Red lines indicate a negative association between NLRC5 expression levels and DNA methylation levels. The numbers displayed in the lines indicate how many target CpGs the line represents. Gene labels are displayed if one or more target CpGs were associated with the expression of these genes. Blue gene symbols refer to genes negatively correlated with target CpG methylation (implying upregulation by NLRC5), and vice versa for red labels. Asterisks indicate that the GI corresponding to NLRC5 was also (positively) associated with this gene.


Increased expression of NLRC5 was associated with decreased methylation levels at 43 CpG sites (11 regions), which were all located in either the classical or the extended MHC region [34]. NLRC5 is a known activator of MHC class I genes [35], and in line with this, the methylation levels of most target CpGs (N = 36) were negatively associated with the expression levels of one or more neighboring MHC genes (Fig. 4b; Additional file 14: Table S14, Additional file 15: Table S15). Furthermore, the GI corresponding to NLRC5 was positively associated with the expression of 16 of these genes. NLRC5 itself does not contain a DNA-binding domain; instead, it has been shown to affect transcription by cooperating with a multi-protein complex that is assembled on the MHC class I promoter [35]. Interestingly, NLRC5 acts as a platform for enzymes that open chromatin by histone acetylation and/or demethylation of histone H3, indicating that decreased DNA methylation may be a consequence of altered chromatin state. In line with the role of NLRC5 in immune response, the target CpGs of NLRC5 were enriched for CpGs that were previously associated with immune-related traits (including the autoimmune disorders primary Sjögren’s syndrome and mixed connective tissue disease, as well as sTNFR2 levels; Additional file 13: Table S13) [32].


The gene with the largest number of detected target CpGs was SENP7. It was associated with decreased methylation levels at 189 target CpGs (87 regions) and with increased methylation levels at 19 target CpGs (12 regions). The majority (86%) of the target CpGs were located on the q-arm of chromosome 19. For most of these CpGs (92%), the DNA methylation levels were associated with the expression levels of one or more nearby zinc fingers (Additional file 16: Table S16, Additional file 17: Table S17), consistent with a previous gene network analysis [13]. Although SENP7 has no DNA-binding properties, previous research has shown that it exerts its effect through deSUMOylation of the chromatin repressor KAP1 [36]. KAP1 can act as a scaffold for various heterochromatin-inducing factors, and there is emerging evidence that KAP1 is directly involved in regulating DNA methylation [37, 38]. SENP7 could therefore affect DNA methylation through its interaction with KAP1. We further characterized SENP7 target CpGs by overlapping the CpGs with trait-related probes and found an enrichment for Werner syndrome-associated CpGs [32]. Interestingly, the Werner syndrome gene product is modified by SUMO [39, 40] and may therefore be related to SENP7’s function as SUMO protease.


Mutations in CDCA7 have been shown to cause ICF syndrome, a rare primary immunodeficiency characterized by epigenetic abnormalities [41]. Previous research showed that CDCA7-mutated ICF patients show decreased DNA methylation levels at pericentromeric repeats and heterochromatin regions, and similarly, CDCA7 depletion in mouse embryonic fibroblasts leads to decreased DNA methylation at centromeric repeats [41, 42]. In line with this prior work, increased expression of CDCA7 was associated with increased methylation levels at 79 CpG sites (79 regions) that were distributed across chromosomes (Fig. 5a) and were enriched in low-activity regions (e.g., quiescent states; Fig. 5b) [43]. In addition, the target CpGs were enriched in repeat sequences as defined by the UCSC RepeatMasker (odds ratio 2.13, P = 0.006) [44]. A volcano plot showed that the test statistics of CDCA7 were highly skewed towards positive effects, suggesting that CDCA7 has widespread effects on DNA methylation (Additional file 4: Fig. S5a).

Fig. 5. a CDCA7 (located on chromosome 2) and CDCA7L (located on chromosome 7) both affect genome-wide DNA methylation levels. Blue lines indicate a positive association between CDCA7 expression and trans DNA methylation levels. Green lines indicate a positive association between CDCA7L expression levels and trans DNA methylation levels. b, c Over- or underrepresentation of target CpGs in predicted chromatin states for b CDCA7 and c CDCA7L. Blue bars represent enrichment of CpGs that are significant at a genome-wide significance level (P < 1.4 × 10⁻¹¹), and gray bars represent enrichment of CpGs that are significant at a gene-level significance level (P < 1.2 × 10⁻⁷). Abbreviations: BivFlnk, flanking bivalent TSS/enhancer; Enh, enhancer; EnhBiv, bivalent enhancer; EnhG, genic enhancer; Het, heterochromatin; Quies, quiescent; ReprPC, repressed polycomb; ReprPCWk, weak repressed polycomb; TssA, active TSS; TssAFlnk, flanking active TSS; TssBiv, bivalent/poised TSS; Tx, strong transcription; TxFlnk, weak transcription; ZNF/Rpts, ZNF genes and repeats.


CDCA7L is a paralog of CDCA7, and similarly, its increased expression was associated with a genome-wide increase of DNA methylation levels (47 CpG sites, 47 regions; Fig. 5a and Additional file 4: Fig. S5b). CDCA7L’s target CpGs did not overlap with those of CDCA7; however, they did show a similar genomic distribution and were enriched in inactive regions (Fig. 5c), although enrichment at repeat regions as defined in the UCSC RepeatMasker was reduced (OR = 1.59, P > 0.05). Interestingly, previous research has shown that the risk allele of the genetic variant most highly associated with multiple myeloma (rs4487645) was associated with increased CDCA7L expression [45]. Our GI for CDCA7L consisted of 5 SNPs, of which one (rs17361667) was in strong LD (r² = 0.7) with the risk variant rs4487645. If the risk variant indeed exerts its pathogenic effect through an effect on CDCA7L expression, CDCA7L’s effects on DNA methylation might be involved in the pathogenesis of multiple myeloma. Moreover, our multi-SNP GI was a stronger predictor of CDCA7L expression (F = 171) as compared with rs4487645 (F = 60) and may therefore be useful in gaining more insight into the role of CDCA7L in multiple myeloma.

Transcriptional Regulatory Proteins

The isolation of a variety of transcriptional regulatory proteins has been based on their specific binding to promoter or enhancer sequences. Protein binding to these DNA sequences is commonly analyzed by two types of experiments. The first, footprinting, was described earlier in connection with the binding of RNA polymerase to prokaryotic promoters (see Figure 6.3). The second approach is the electrophoretic-mobility shift assay, in which a radiolabeled DNA fragment is incubated with a protein preparation and then subjected to electrophoresis through a nondenaturing gel (Figure 6.24). Protein binding is detected as a decrease in the electrophoretic mobility of the DNA fragment, since its migration through the gel is slowed by the bound protein. The combined use of footprinting and electrophoretic-mobility shift assays has led to the correlation of protein-binding sites with the regulatory elements of enhancers and promoters, indicating that these sequences generally constitute the recognition sites of specific DNA-binding proteins.

Figure 6.24

Electrophoretic-mobility shift assay. A sample containing radiolabeled fragments of DNA is divided into two, and one half of the sample is incubated with a protein that binds to a specific DNA sequence. Samples are then analyzed by electrophoresis in ...

One of the prototypes of eukaryotic transcription factors was initially identified by Robert Tjian and his colleagues during studies of the transcription of SV40 DNA. This factor (called Sp1, for specificity protein 1) was found to stimulate transcription from the SV40 promoter, but not from several other promoters, in cell-free extracts. Stimulation of transcription by Sp1 was then found to depend on the presence of the GC boxes in the SV40 promoter: if these sequences were deleted, stimulation by Sp1 was abolished. Moreover, footprinting experiments established that Sp1 binds specifically to the GC box sequences. Taken together, these results indicate that the GC box represents a specific binding site for a transcriptional activator, Sp1. Similar experiments have established that many other transcriptional regulatory sequences, including the CCAAT sequence and the various sequence elements of the immunoglobulin enhancer, also represent recognition sites for sequence-specific DNA-binding proteins (Table 6.2).

Table 6.2

Examples of Transcription Factors and Their DNA-Binding Sites.

The specific binding of Sp1 to the GC box not only established the action of Sp1 as a sequence-specific transcription factor; it also suggested a general approach to the purification of transcription factors. The isolation of these proteins initially presented a formidable challenge because they are present in very small quantities (e.g., only 0.001% of total cell protein) that are difficult to purify by conventional biochemical techniques. This problem was overcome in the purification of Sp1 by DNA-affinity chromatography (Figure 6.25). Multiple copies of oligonucleotides corresponding to the GC box sequence were bound to a solid support, and cell extracts were passed through the oligonucleotide column. Because Sp1 bound to the GC box with high affinity, it was specifically retained on the column while other proteins were not. Highly purified Sp1 could thus be obtained and used for further studies, including partial determination of its amino acid sequence, which in turn led to cloning of the gene for Sp1.

Figure 6.25

Purification of Sp1 by DNA-affinity chromatography. A double-stranded oligonucleotide containing repeated GC box sequences is bound to agarose beads, which are poured into a column. A mixture of cell proteins containing Sp1 is then applied to the column ...

The general approach of DNA-affinity chromatography, first optimized for the purification of Sp1, has been used successfully to isolate a wide variety of sequence-specific DNA-binding proteins from eukaryotic cells. Protein purification has been followed by gene cloning and nucleotide sequencing, leading to the accumulation of a great deal of information on the structure and function of these critical regulatory proteins.

Cells of the body require nutrients in order to function, and these nutrients are obtained through feeding. In order to manage nutrient intake, storing excess intake and utilizing reserves when necessary, the body uses hormones to moderate energy stores. Insulin is produced by the beta cells of the pancreas, which are stimulated to release insulin as blood glucose levels rise (for example, after a meal is consumed). Insulin lowers blood glucose levels by enhancing the rate of glucose uptake and utilization by target cells, which use glucose for ATP production. It also stimulates the liver to convert glucose to glycogen, which is then stored by cells for later use. Insulin also increases glucose transport into certain cells, such as muscle cells and the liver. This results from an insulin-mediated increase in the number of glucose transporter proteins in cell membranes, which remove glucose from circulation by facilitated diffusion. As insulin binds to its target cell via insulin receptors and signal transduction, it triggers the cell to incorporate glucose transport proteins into its membrane. This allows glucose to enter the cell, where it can be used as an energy source. However, this does not occur in all cells: some cells, including those in the kidneys and brain, can access glucose without the use of insulin. Insulin also stimulates the conversion of glucose to fat in adipocytes and the synthesis of proteins. These actions mediated by insulin cause blood glucose concentrations to fall, called a hypoglycemic “low sugar” effect, which inhibits further insulin release from beta cells through a negative feedback loop.


Impaired insulin function can lead to a condition called diabetes mellitus, the main symptoms of which are illustrated in Figure 18.10. This can be caused by low levels of insulin production by the beta cells of the pancreas, or by reduced sensitivity of tissue cells to insulin. This prevents glucose from being absorbed by cells, causing high levels of blood glucose, or hyperglycemia (high sugar). High blood glucose levels make it difficult for the kidneys to recover all the glucose from nascent urine, resulting in glucose being lost in urine. High glucose levels also result in less water being reabsorbed by the kidneys, causing large amounts of urine to be produced; this may result in dehydration. Over time, high blood glucose levels can cause nerve damage to the eyes and peripheral body tissues, as well as damage to the kidneys and cardiovascular system. Oversecretion of insulin can cause hypoglycemia, or low blood glucose levels. This causes insufficient glucose availability to cells, often leading to muscle weakness, and can sometimes cause unconsciousness or death if left untreated.

Figure 18.10.
The main symptoms of diabetes are shown. (credit: modification of work by Mikael Häggström)

When blood glucose levels decline below normal levels, for example between meals or when glucose is utilized rapidly during exercise, the hormone glucagon is released from the alpha cells of the pancreas. Glucagon raises blood glucose levels, eliciting what is called a hyperglycemic effect, by stimulating the breakdown of glycogen to glucose in skeletal muscle cells and liver cells in a process called glycogenolysis. Glucose can then be utilized as energy by muscle cells and released into circulation by the liver cells. Glucagon also stimulates absorption of amino acids from the blood by the liver, which then converts them to glucose. This process of glucose synthesis is called gluconeogenesis. Glucagon also stimulates adipose cells to release fatty acids into the blood. These actions mediated by glucagon result in an increase in blood glucose levels to normal homeostatic levels. Rising blood glucose levels inhibit further glucagon release by the pancreas via a negative feedback mechanism. In this way, insulin and glucagon work together to maintain homeostatic glucose levels, as shown in Figure 18.11.
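The negative feedback between insulin and glucagon can be illustrated with a toy simulation. The sketch below is only an illustration, not a physiological model: the set point of 90, the gain of 0.2, and the function name `simulate_glucose` are assumptions chosen for the example and do not come from the text.

```python
def simulate_glucose(start, steps=50, set_point=90.0, gain=0.2):
    """Toy negative-feedback loop for blood glucose (arbitrary units).

    Above the set point an 'insulin' response lowers glucose;
    below it a 'glucagon' response raises glucose.
    All numbers are illustrative assumptions, not measured values.
    """
    glucose = float(start)
    history = [glucose]
    for _ in range(steps):
        error = glucose - set_point
        insulin = gain * max(error, 0.0)    # released only when glucose is high
        glucagon = gain * max(-error, 0.0)  # released only when glucose is low
        glucose = glucose - insulin + glucagon
        history.append(glucose)
    return history


after_meal = simulate_glucose(160)  # starts above the set point
fasting = simulate_glucose(60)      # starts below the set point
```

In both runs the value drifts back toward the assumed set point, and each hormone response shrinks as the deviation it corrects gets smaller, which is the essence of a negative feedback loop.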

Pancreatic tumors may cause excess secretion of glucagon. Type I diabetes results from the failure of the pancreas to produce insulin. Which of the following statements about these two conditions is true?

  1. A pancreatic tumor and type I diabetes will have the opposite effects on blood sugar levels.
  2. A pancreatic tumor and type I diabetes will both cause hyperglycemia.
  3. A pancreatic tumor and type I diabetes will both cause hypoglycemia.
  4. Both pancreatic tumors and type I diabetes result in the inability of cells to take up glucose.

Regulation of Gene Expression: Negative and Positive Regulation

The two types of gene expression regulation are (1) negative regulation and (2) positive regulation. Some important terms used in connection with the regulation of gene expression are also discussed.

Most of the genes of an organism produce specific proteins (enzymes), which, in turn produce specific phenotypes. The genes whose mRNA transcripts are translated into protein are known as structural genes. Every cell of an organism possesses all the structural genes normally present in the species, but only a small fraction of them are functional in any cell at a given time.

In prokaryotes, cells generally synthesize only those enzymes which they need in a given environment. For example, E. coli cells grown in the presence of lactose produce abundant (up to 3000 molecules/cell) β-galactosidase, the enzyme that hydrolyses lactose. However, very little of this enzyme (less than 3 molecules/cell) is produced in the absence of lactose.

In eukaryotes, the cells of different organs produce different proteins needed for their function. Red blood cells contain a high concentration of hemoglobin, while leucocytes (white blood cells) have no hemoglobin at all.

Evidently, there is precise control over the kinds of proteins or enzymes produced in a given tissue or cell at a given time. Such control of gene activity, i.e., protein production, which permits the function of only those genes whose products are required in a given cell at a given time, is termed gene regulation.

Synthesis of an enzyme depends mainly on two factors. In a degradative process, the synthesis of the enzyme depends on the availability of the molecule to be degraded: the more of the molecule is present, the more enzyme is synthesized, and vice versa. In a biosynthetic pathway, the synthesis of an enzyme is controlled by the end product: the more end product is present, the less enzyme is synthesized, and vice versa.

There are two types of gene regulation, viz:

(1) Negative regulation, and

(2) Positive regulation.

(1) In negative regulation:

An inhibitor present in the cell or system prevents transcription by inactivating the promoter. This inhibitor is known as a repressor. For initiation of transcription, an inducer is required. The inducer acts as an antagonist of the repressor. In negative regulation, absence of the product increases enzyme synthesis and presence of the product decreases it.

(2) In positive regulation:

An effector molecule (which may be a protein or a molecular complex) activates the promoter for transcription. In a degradative system, either a negative or a positive mechanism may operate (e.g., the lac operon), while in a biosynthetic pathway a negative mechanism operates.
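The two modes can be summarized as simple Boolean logic. The following is a minimal sketch under strong simplifying assumptions: real regulation is quantitative rather than all-or-nothing, and the function names `negative_regulation` and `positive_regulation` are hypothetical labels introduced only for this illustration.

```python
def negative_regulation(repressor_present, inducer_present):
    """Transcription proceeds unless an active repressor blocks the promoter.

    The inducer antagonizes the repressor, so a repressor is only
    active when no inducer is present.
    """
    repressor_active = repressor_present and not inducer_present
    return not repressor_active


def positive_regulation(effector_present):
    """Transcription proceeds only when an effector activates the promoter."""
    return effector_present


# Print the outcome for every combination of repressor and inducer.
for repressor in (False, True):
    for inducer in (False, True):
        state = "ON" if negative_regulation(repressor, inducer) else "OFF"
        print(f"repressor={repressor}, inducer={inducer}: {state}")
```

Under negative regulation the default state is "on" and a repressor is needed to switch the gene off; under positive regulation the default state is "off" and an effector is needed to switch it on.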

The phenomenon of gene expression can be elaborated further such as given below:

1. Gene expression is the mechanism at the molecular level by which a gene is able to express itself in the phenotype of an organism.

2. The mechanism of gene expression involves biochemical genetics. It consists of synthesis of specific RNAs, polypeptides, structural proteins, proteinaceous bio-chemicals or enzymes which control the structure or functioning of specific traits.

3. Gene regulation is the mechanism of switching off and switching on of the genes depending upon the requirement of the cells and the state of development.

4. It is because of this regulation that certain proteins are synthesized in as few as 5-10 molecules while others are formed in more than 100,000 molecules per cell.

5. There are two types of gene regulation: positive and negative.

6. In negative gene regulation the genes continue expressing their effect till their activity is suppressed.

7. This type of gene regulation is also called repressible regulation.

8. The repression is due to a product of regulatory genes.

9. Positive gene regulation is the one in which the genes remain non-expressed unless and until they are induced to do it.

10. It is, therefore called inducible regulation.

11. Here a product removes a biochemical that keeps the genes in a non-expressed state.

12. As the genes express their effect through enzymes, their enzymes are also called inducible enzymes and repressible enzymes.

Gene regulation is exerted at four levels:

1. Transcriptional level when primary transcript is formed.

2. Processing level when splicing and terminal additions are made.

3. Transport of mRNA out of nucleus into cytoplasm.

Important Terms used in Connection with the Regulation of Gene Expression:

1. Repressor:

In an operon, a protein molecule which prevents transcription. The process of inhibition of transcription is called repression.

2. Inducer:

The substance that allows initiation of transcription (e.g., lactose in the lac operon). This process is known as induction.

3. Co-repressor:

A combination of repressor and a metabolite which prevents protein synthesis. This process is known as co-repression.

4. Inducible Enzyme:

An enzyme whose production is enhanced by adding the substrate to the culture medium. Such a system is called an inducible system.

5. Repressible Enzyme:

An enzyme whose production can be inhibited by adding an end product. Such a system is known as a repressible system.

6. Constitutive Enzyme:

An enzyme whose production is constant irrespective of the metabolic state of the cell.

7. Negative Regulation:

Inhibition of transcription by a repressor through inactivation of the promoter, e.g., in the lac operon.

8. Positive Regulation:

Enhancement of transcription by an effector molecule through activation of the promoter.

Gene Regulation in Eukaryotes

Let us make an in-depth study of the gene regulation in eukaryotes. After reading this article you will learn about: 1. Chromatin Modification 2. Control of Transcription by Hormones 3. Regulation of Processing of mRNA 4. Control of Life Span of mRNA 5. Gene Amplification 6. Post Translation Regulation and 7. Post Transcription Gene Silencing.

Introduction to Gene Regulation:

The expression of genes in eukaryotes can be regulated by the same principles as in prokaryotes. But there are many additional mechanisms of control of gene expression in eukaryotes, as the genome is much bigger. The genes are present in the nucleus, where mRNA is synthesized. The mRNA is then exported to the cytoplasm, where translation takes place.

In eukaryotes, the organization is multicellular and specialized into tissues and organs. The cells are differentiated and cells of a tissue generally produce a specific protein involving a particular set of genes. All other genes become permanently shut off and are never transcribed.

Structural features of eukaryotes that influence the gene expression are the presence of nucleosomes in chromatin, heterochromatin and the presence of the split genes in chromosomes.

Compared with prokaryotic genes, eukaryotic genes have many more regulatory binding sites and are controlled by many more regulatory proteins. Regulatory sequences can be present thousands of nucleotides away from the promoter and may lie upstream or downstream of it. These regulatory sequences act from a distance. The intervening DNA loops out, so that the regulatory sequence and the promoter come to lie near each other.

Most gene regulation occurs at the level of transcription initiation. Initiation of translation also influences gene regulation considerably.

Chromatin Modification:

The genome of eukaryotes is wrapped in histone proteins to form nucleosomes. This condition leads to partial concealment of genes and reduces the expression of genes.

The packing of DNA with histone octamers is not permanent. Any portion of DNA can be released from the octamer whenever DNA-binding proteins have to act on it. These DNA-binding proteins or enzymes recognize their binding sites on DNA only when it is released from the histone octamer or when the sites lie on linker DNA. For this, the DNA is unwrapped from the nucleosomes.

This unwrapping of DNA from nucleosomes is performed by nucleosome modifier enzymes or nucleosome remodelling complexes. They act in various ways. They may remodel the structure of the octamer or slide the octamer along the DNA, thus uncovering the DNA binding sites for the action of regulatory proteins. Thus the genes are activated.

Some of these nucleosome modifiers add acetyl groups (acetylation) to the tails of histones, thus loosening the DNA wrapping and in the process exposing the DNA binding sites. All of this leads to the expression of genes. Conversely, deacetylation by deacetylases causes inactivation of DNA.

Nucleosomes are entirely absent in the regions that are active in transcription like rRNA genes.

The dense form of chromatin in eukaryotes is called heterochromatin. It leads to gene inhibition or gene silencing. Heterochromatin is the densely packaged part of chromatin, which does not allow gene expression, because densely packaged chromatin cannot be easily transcribed. Some enzymes make the chromatin more dense. Telomeres and centromeres are in the form of heterochromatin.

In higher animals about 50% of the genome is in the form of heterochromatin. Enzymes are capable of changing the density of chromatin by chemically modifying the tails of histones. This affects transcription.

In this way, both activation and repression of transcription are achieved by converting chromatin between euchromatin and heterochromatin.

Methylation of certain sequences of DNA prevents the transcription of genes in mammals. It has been observed that genes which are heavily methylated are not transcribed and therefore not expressed. DNA methylase enzymes methylate certain DNA sequences, thereby silencing genes.

Control of Transcription by Hormones:

Various intercellular and intracellular signals regulate the gene expression.

Hormones exercise considerable control over transcription. Hormones are extracellular substances synthesized by endocrine glands. They are carried to the distant target cells. Various hormones like insulin, estrogen, progesterone, testosterone etc. often act by “switching on” transcription of DNA.

The hormone on entering a target cell forms a complex with the receptor present in the cytoplasm. This hormone-receptor complex enters the nucleus and binds to a particular chromosome by means of specific proteins. This initiates the transcription. Hormone-receptor complex can enhance or suppress the expression of genes.

It has been observed in chickens that when the hormone estrogen is injected, the oviduct responds by synthesizing the mRNA responsible for the synthesis of albumen. The hormone, bound to its receptor, acts on the DNA as an inducer.

Regulation of Processing of mRNA:

Genes of eukaryotes have non-coding regions (introns) in between coding regions (exons). Such genes are called split genes. The entire gene is transcribed to produce mRNA which is called precursor mRNA or primary transcript (pre-mRNA). Before translation takes place, the introns are spliced out by excision and discarded. This is known as processing of mRNA and the processed mRNA is called mature mRNA. This takes part in protein synthesis. Mature mRNA is considerably smaller than precursor mRNA.

Higher eukaryotes have various mechanisms by which pre-mRNA is processed in alternate or differential ways to produce different mRNAs which encode different proteins. Multiple proteins are produced from one gene by alternate mRNA processing. Many cells take advantage of different splicing pathways to alter the expression of genes and synthesize different polypeptides. Alternate mRNA splicing increases the number of proteins expressed by a single eukaryotic gene.

Alternate processing of pre-mRNA is accomplished by exon skipping, by retaining certain introns etc.

These alternate processing pathways are highly regulated.
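Exon skipping can be sketched as a small enumeration problem. The example below is a simplified illustration, assuming that any subset of the "skippable" exons may be left out while the remaining exons keep their order; the gene, the exon names, and the function `exon_skipping_isoforms` are hypothetical.

```python
from itertools import combinations


def exon_skipping_isoforms(exons, skippable):
    """Return every exon arrangement obtainable by skipping any subset
    of the exons listed in `skippable`. Exon order is always preserved."""
    isoforms = []
    for k in range(len(skippable) + 1):
        for skipped in combinations(skippable, k):
            isoforms.append([e for e in exons if e not in skipped])
    return isoforms


# Hypothetical gene with four exons; E2 and E3 are optional.
isoforms = exon_skipping_isoforms(["E1", "E2", "E3", "E4"], ["E2", "E3"])
for isoform in isoforms:
    print("-".join(isoform))
```

With two optional exons this hypothetical gene yields four distinct mature mRNAs, which shows how a single gene can encode several proteins.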

In Drosophila, the mRNA is processed in four different ways and therefore produces four different kinds of the muscle protein myosin. Different kinds of myosin are produced in the larval, pupal and late embryonic stages.

Control of Life Span of mRNA:

In prokaryotes the life span of an mRNA molecule is very brief, lasting only a minute or less. The mRNA is degraded soon after protein synthesis.

But as the mRNA in eukaryotes is transported to cytoplasm through the nucleopores, this mRNA is repeatedly translated. This repeated translation of mRNA is achieved by increasing the life span of mRNA. In a highly differentiated cell, single mRNA molecule having long life span is able to produce large amount of single protein. Life span of a eukaryotic mRNA varies from a few hours to several days.
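The effect of life span on protein yield can be estimated with simple arithmetic. The sketch below assumes, purely for illustration, that an mRNA is translated at a constant rate for its whole life span; the rate of one translation per minute and the function name `proteins_per_mrna` are assumptions, not values from the text.

```python
def proteins_per_mrna(lifespan_minutes, translations_per_minute=1.0):
    """Rough number of protein molecules made from one mRNA molecule,
    assuming translation at a constant rate over the whole life span."""
    return lifespan_minutes * translations_per_minute


short_lived = proteins_per_mrna(1)            # about one minute
long_lived = proteins_per_mrna(3 * 24 * 60)   # three days, expressed in minutes
ratio = long_lived / short_lived
print(f"A 3-day mRNA yields about {ratio:.0f} times more protein")
```

Under these assumptions the yield scales directly with the life span, so a long-lived mRNA can support the large output of a single protein seen in highly differentiated cells.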

Chicken oviduct cells have a single copy of the ovalbumin gene but produce a large amount of albumen.

The silk gland of the silkworm produces a very long thread made of the protein fibroin, which forms the cocoon. The silk gland is a single polyploid cell. It produces a large number of mRNA molecules, which have a long life span of several days.

Gene Amplification:

A mechanism exists in various organisms whereby the number of copies of a gene is increased many fold without mitotic division. This is called gene amplification.

During amplification DNA repeatedly undergoes replication without mitotic separation into daughter DNA molecules or chromatids. This enables the cell to produce large amount of protein in a short time.
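The growth in copy number can be sketched numerically. This is a minimal illustration that assumes every round of replication doubles all existing copies of the gene, which is an idealization; the function name `gene_copies` is hypothetical.

```python
def gene_copies(rounds, initial_copies=1):
    """Copies of a gene after `rounds` of replication without separation,
    assuming every round doubles all existing copies."""
    return initial_copies * 2 ** rounds


for n in (1, 5, 10):
    print(f"{n} rounds: {gene_copies(n)} copies")
```

Because the increase is exponential under this assumption, a handful of replication rounds is enough to raise the template number, and hence the potential mRNA and protein output, by orders of magnitude.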

Post Translation Regulation:

In prokaryotes, a single polycistronic mRNA molecule codes for many different proteins. In eukaryotes, which have monocistronic mRNA, the synthesis of different proteins is achieved in a different way. A single mRNA yields a large polypeptide called a polyprotein. This polyprotein is then cleaved in alternate ways to produce different proteins. Each protein is regarded as the product of a single gene. In this system, there are many cleavage sites on the polyprotein.

Post Transcription Gene Silencing:

Many small RNAs exist in eukaryotes that play their role in silencing of genes. These small RNAs act on mRNA resulting in disruption of translation. These small RNAs are micro RNAs (miRNAs), small interfering RNAs (siRNAs) and many others.


If you are still a little unsure of how switches work, see the HHMI BioInteractive interactive on the topic. The ability to digest lactose as an adult is a rare phenomenon in mammals. It evolved twice in humans, in Africa and in Europe.


The Role of KV7.3 in Regulating Osteoblast Maturation and Mineralization

KCNQ (KV7) channels are voltage-gated potassium (KV) channels, and the function of KV7 channels in muscles, neurons, and sensory cells is well established. We confirmed that overall blockade of KV channels with tetraethylammonium augmented the mineralization of bone-marrow-derived human mesenchymal stem cells during osteogenic differentiation, and we determined that KV7.3 was expressed in MG-63 and Saos-2 cells at the mRNA and protein levels. In addition, functional KV7 currents were detected in MG-63 cells. Inhibition of KV7.3 by linopirdine or XE991 increased the matrix mineralization during osteoblast differentiation. This was confirmed by alkaline phosphatase, osteocalcin, and osterix in MG-63 cells, whereas the expression of Runx2 showed no significant change. The extracellular glutamate secreted by osteoblasts was also measured to investigate its effect on MG-63 osteoblast differentiation. Blockade of KV7.3 promoted the release of glutamate via the phosphorylation of extracellular signal-regulated kinase 1/2-mediated upregulation of synapsin, and induced the deposition of type 1 collagen. However, activation of KV7.3 by flupirtine did not produce notable changes in matrix mineralization during osteoblast differentiation. These results suggest that KV7.3 could be a novel regulator in osteoblast differentiation.

Keywords: KCNQ channels; differentiation; glutamate; matrix mineralization.


Regulation of human mesenchymal stem cell (hMSC) osteogenic differentiation by tetraethylammonium (TEA). Matrix…

RT-PCR analysis of the KV7 channels in osteoblast-like cells. The PCR…

Changes in KV7 channel expression during osteoblastic differentiation. The relative expression…

Functional characteristics of KV7.3 channel in MG-63 cells during osteoblast differentiation.…

Effect of flupirtine, linopirdine, and XE991 on MG-63 cell viability. The MTT assay…

Regulation of osteoblastic differentiation by KV7 channel in MG-63 and Saos-2…

mRNA expression of osteoblastic differentiation markers in MG-63 cells. The relative mRNA expression…

Regulation of synaptic vesicle-related protein, synapsin, by KV7.3 channel in MG-63…

Alterations of ERK1/2 phosphorylation by the KV7 opener or KV…

Effect of KV7 channel on glutamate release during osteoblastic differentiation in…

Suppressive effect of CNQX (6-cyano-7-nitroquinoxaline-2,3-dione), an AMPA (α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid)/kainate receptor antagonist, on osteoblastic…

Suppressive effect of MK801, an NMDA (N-methyl-D-aspartate) receptor antagonist, on osteoblastic differentiation promoted…

Counter-effect of riluzole, a glutamate release inhibitor, on osteoblastic differentiation promoted by K…

Induction of intracellular type 1 collagen by KV7 channel during osteoblast…
