5: Regionalization and Organizers - Biology

Animal development can be seen as a series of successive fate decisions where cells take internal and external (signals from other cells) information and use it to become progressively more specified. Regionalization refers to subdividing an existing embryo or tissue into smaller parts with unique fates. This can occur at a large scale, for example Bicoid and Nanos broadly regionalizing the embryo into anterior, middle, and posterior, or it can occur at a small scale, for example Shh regionalizing an autopod (hand) into thumb and pinky sides. Regionalization typically occurs in three main steps: First, a morphogen broadly patterns a tissue by forming a gradient. Next, this gradient is read out by a series of transcription and translation factors that turn the gradient into discrete domains of gene expression. Finally (or concurrently) the cells expressing these unique combinations of genes are fated and begin to exhibit different properties. This chapter first broadly examines examples of regionalization and specification and then considers the role of organizers in these processes.
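The three steps above (a morphogen gradient, a threshold readout, discrete domains) can be sketched numerically. The following is a minimal illustration, not a model of any real embryo: the exponential gradient shape, the decay length, and the two thresholds are all assumptions chosen only to make the idea concrete.

```python
import math


def morphogen_concentration(x, c0=1.0, decay_length=0.2):
    """Concentration at position x (0 = source end, 1 = far end).

    An exponential decay is assumed purely for illustration.
    """
    return c0 * math.exp(-x / decay_length)


# Hypothetical thresholds, listed from highest to lowest.
THRESHOLDS = [(0.5, "anterior"), (0.1, "middle")]


def assign_domain(concentration, thresholds=THRESHOLDS):
    """Return the first domain whose minimum concentration is met."""
    for minimum, name in thresholds:
        if concentration >= minimum:
            return name
    return "posterior"  # below every threshold


def regionalize(n_cells=10):
    """Assign a domain to each of n_cells evenly spaced along the axis."""
    positions = [i / (n_cells - 1) for i in range(n_cells)]
    return [assign_domain(morphogen_concentration(x)) for x in positions]


if __name__ == "__main__":
    for index, domain in enumerate(regionalize()):
        print(index, domain)
```

A smooth gradient becomes three contiguous blocks of cells because each cell only asks whether its local concentration exceeds a threshold, which is the sense in which a continuous signal is converted into discrete domains of gene expression.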

  • 5.1: Splitting up the A/P axis: Beginning-Hox Genes, Another Level of Regionalization
    We already took a quick look at Hox genes, both in your readings and in the Genetic Toolkit section. Now we will put them into a broader Evo-Devo context by looking at a specific example of Hox patterning and by seeing evidence that regionalization by Hox genes is conserved across animals. Some cis-regulatory elements driving Hox gene expression patterns are also conserved. However, the earlier developmental events that set up Hox expression domains are not conserved across animals.
  • 5.2: Organizers-Other Organizers
    One of the most interesting things about building animal bodies is the diversity we see across and within bodies. Much of this differentiation is ruled by local organizers and master control genes.
  • 5.S: Regionalization and Organizers (References)

Cloning and analysing of 5′ flanking region of Xenopus organizer gene noggin

The Xenopus organizer-specific gene noggin possesses nearly all the characteristic properties of organizer action in specifying the embryonic body axis. To analyze how maternally inherited factors control its expression pattern, we cloned the 5′ regulatory region of the noggin gene. The 1.5 kb upstream sequence could direct reporter gene expression in vivo, and deletion analysis indicated that a 229 base pair fragment is essential for activating noggin expression. We further demonstrated that the response elements within this regulatory region are indeed under the control of the growth factor activin and Wnt signaling pathway components.


Bacterial type II CRISPR-Cas9 systems can effectively induce RNA-guided DNA double strand breaks (DSBs) [1], making them popular tools for genome editing in bacteria [2], animal cells [3], mammalian systems [4,5,6,7], and plants [8,9,10,11]. The most widely used Streptococcus pyogenes Cas9 (SpCas9) uses 20 nucleotides (nt) of a single guide RNA (sgRNA) to recognize a complementary target DNA site along with an NGG protospacer adjacent motif (PAM) [1, 12]. More recently, type V CRISPR-Cpf1 (CRISPR-Cas12a) was shown to mediate efficient genome editing in human cells [13] and plants [14,15,16]. Cpf1 (Cas12a) uses 23 nt of an RNA guide to target DNA with a TTTV PAM [13]. RNA-guided nucleases (RGNs) such as Cas9 and Cpf1 represent versatile genome editing tools that promise to advance basic science, enable personalized medicine, and accelerate crop breeding. However, Cas9 may cause undesired off-target mutations due to sgRNAs recognizing DNA sequences with one to a few nucleotide mismatches, albeit with reduced nuclease binding and cleavage activity [1, 6, 17, 18]. Although similar rules apply to Cpf1, recent studies in human cells [19, 20] have shown Cpf1 is generally more specific than Cas9.
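The targeting rules described above can be expressed as a short sequence scan. This is a simplified sketch under stated assumptions: it searches only the forward strand, it places the NGG PAM immediately 3′ of the 20-nt SpCas9 site and the TTTV PAM immediately 5′ of the 23-nt Cpf1 site, and the function names are hypothetical rather than taken from any published tool.

```python
import re


def find_cas9_sites(seq, guide_len=20):
    """Forward-strand SpCas9 candidates: guide_len nt followed by NGG."""
    seq = seq.upper()
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


def find_cpf1_sites(seq, guide_len=23):
    """Forward-strand Cpf1 candidates: TTTV (V = A, C, or G) then guide_len nt."""
    seq = seq.upper()
    pattern = re.compile(r"(?=(TTT[ACG])([ACGT]{%d}))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


def count_mismatches(guide, site):
    """Number of positions at which a guide and a candidate site differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for a, b in zip(guide.upper(), site.upper()) if a != b)


if __name__ == "__main__":
    demo = "A" * 20 + "TGG"  # toy sequence, not real genomic data
    print(find_cas9_sites(demo))
    print(count_mismatches("ACGT", "ACGA"))
```

Counting mismatches between a guide and each candidate site is the simplest way to flag sequences that differ by one to a few nucleotides, which is the class of potential off-target site the paragraph describes. A practical search would also scan the reverse complement.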

Understanding the scope of off-target mutations in Cas9- or Cpf1-edited crops is critical for research and regulation. Previously, whole-genome sequencing (WGS) was applied for detecting off-target mutations by Cas9 in Arabidopsis [21], rice [22], and tomato [23]. Unfortunately, these studies either only looked at potential off-target sites predicted by computer programs or fell short of full analysis of all the mutations identified by WGS in edited plants. Without enough of the necessary controls, such WGS studies had limited power for isolating off-target mutations in edited plants because they were unable to fully assess the levels of preexisting mutations, spontaneous mutations, and mutations caused by tissue culture and Agrobacterium-mediated transformation. Genome-wide identification of off-target mutations by Cas9 or Cpf1 will be empowered only if all background mutations can be isolated. Furthermore, WGS-based off-target analysis of Cpf1 has not been reported in any higher organism. In recent years, WGS studies on Cas9-edited mice have generated contrasting results: one study found few off-target mutations [24] while another found many [25]. This controversy raised the urgency for comprehensive and rigorous analyses of off-target mutations using WGS in edited animals and plants. We reasoned that a large-scale and well-designed study is required for comprehensive assessment of off-target effects in crops by Cas9 and Cpf1, two leading CRISPR genome editing systems. Here, we describe a large-scale WGS study to assess off-target effects of Cas9 and Cpf1 in rice, an important food crop. Our results suggest off-target mutations of Cas9 and Cpf1 are largely negligible when compared to spontaneous mutations or mutations caused by tissue culture and Agrobacterium infection in edited plants. The resulting knowledge is likely to serve as an important reference for plant researchers and regulatory agencies.
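The role of controls in this argument can be illustrated with a small set operation. This is a conceptual sketch only, not the study's analysis pipeline: variants are represented as hypothetical (chromosome, position, reference, alternate) tuples, and any variant seen in a control plant is treated as background.

```python
def candidate_off_targets(edited_variants, control_variant_sets):
    """Variants in an edited plant that appear in none of the controls.

    control_variant_sets might hold variants from, for example, wild-type
    plants, tissue culture-only plants, and plants transformed without a
    nuclease (an assumed control design, for illustration only).
    """
    background = set()
    for control in control_variant_sets:
        background |= set(control)
    return set(edited_variants) - background


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    edited = {("chr1", 100, "A", "G"), ("chr2", 50, "C", "T"), ("chr3", 7, "G", "A")}
    controls = [{("chr1", 100, "A", "G")}, {("chr3", 7, "G", "A")}]
    print(candidate_off_targets(edited, controls))
```

With no controls supplied, every variant in the edited plant remains a candidate, which mirrors the limitation the text attributes to earlier WGS studies.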


Date palm (Phoenix dactylifera L.) has long been one of the most important fruit crops in the arid regions of the Arabian Peninsula, North Africa, and the Middle East. During the past three centuries, dates were also introduced to new production areas in Australia, India/Pakistan, Mexico, southern Africa, South America, and the United States. Dates are a main income source and staple food for local populations in many countries in which they are cultivated, and have played significant roles in the economy, society, and environment of those countries.

The date is one of the oldest known fruit crops and has been cultivated in North Africa and the Middle East for at least 5000 years (Zohary and Hopf, 2000). The earliest record from Iraq (Mesopotamia) shows that date culture was probably established as early as 3000 BCE. Because of the long history of date culture and the wide distribution and exchange of date cultivars, the exact origin of the date is unknown, but it most likely originated from the ancient Mesopotamia area (southern Iraq) or western India (Wrigley, 1995). From its center of origin, date cultivation spread throughout the Arabian Peninsula, North Africa, and the Middle East. Date culture had apparently spread into Egypt by the middle of the second millennium BCE. The spread of date cultivation later accompanied the expansion of Islam and reached southern Spain and Pakistan. The Spanish were the first to introduce date palms outside the Arabian Peninsula, North Africa, and the Middle East/South Asia, carrying them to America (Nixon, 1951).

Date cultivation has had a very important influence on the history of the Middle East. Without dates, no large human population could have been supported in the desert regions. The caravan routes existed for centuries mainly for the transportation of dates. Early on, date cultivation became a sacred symbol of fecundity and fertility. Dates had great spiritual and cultural significance to the people of the Middle East. Date palms and culture are depicted in ancient Assyrian and Babylonian tablets, including the famous Code of Hammurabi, which contained laws pertaining to date culture and sales. References relating to date palms are also found in ancient Egyptian, Syrian, Libyan, and Palestinian writings (Nixon, 1951; Popenoe, 1973).

Lysosomal storage disorders: The cellular impact of lysosomal dysfunction

Lysosomal storage diseases (LSDs) are a family of disorders that result from inherited gene mutations that perturb lysosomal homeostasis. LSDs mainly stem from deficiencies in lysosomal enzymes, but also in some non-enzymatic lysosomal proteins, which lead to abnormal storage of macromolecular substrates. Valuable insights into lysosome functions have emerged from research into these diseases. In addition to primary lysosomal dysfunction, cellular pathways associated with other membrane-bound organelles are perturbed in these disorders. Through selective examples, we illustrate why the term “cellular storage disorders” may be a more appropriate description of these diseases and discuss therapies that can alleviate storage and restore normal cellular function.

Lysosomal storage disorders: A brief overview

Inborn errors of metabolism are a common cause of inherited disease (Burton, 1998), of which lysosomal storage diseases (LSDs) are a significant subgroup (Platt and Walkley, 2004; Fuller et al., 2006; Ballabio and Gieselmann, 2009). The combined incidence of LSDs is estimated to be approximately 1:5,000 live births (Fuller et al., 2006), but the true figure is likely greater when undiagnosed or misdiagnosed cases are accounted for. Common to all LSDs is the initial accumulation of specific macromolecules or monomeric compounds inside organelles of the endosomal–autophagic–lysosomal system. Initial biochemical characterization of stored macromolecules in these disorders led to the implication of defective lysosomal enzymes as a common cause of pathogenesis (Hers, 1963; Winchester, 2004). Although most LSDs result from acidic hydrolase deficiencies (Winchester, 2004), a considerable number of these conditions result from defects in lysosomal membrane proteins or non-enzymatic soluble lysosomal proteins (Saftig and Klumperman, 2009). Therefore, LSDs offer a window into the normal functions of both enzymatic and non-enzymatic lysosomal proteins.

Clinical phenotypes of LSDs

The age of clinical onset and the spectrum of symptoms vary among LSDs, depending on the degree of protein function affected by specific mutations, the biochemistry of the stored material, and the cell types where storage occurs. Apart from lysosomal diseases involving substrate storage in bone and cartilage (e.g., the mucopolysaccharidoses; Table 1), most babies born with these conditions appear normal at birth. The classical clinical presentation of an LSD is a neurodegenerative disease of infancy/childhood (Wraith, 2002), but adult-onset variants also occur (Spada et al., 2006; Nixon et al., 2008; Shapiro et al., 2008). A health surveillance program tasked with diagnosing all neurodegenerative disease cases in UK children has so far revealed that lysosomal disorders are amongst the most commonly confirmed diagnoses of neurodegeneration (45% of cases) and will provide a robust frequency of infantile/juvenile onset cases as the study gathers more data over the coming years (Verity et al., 2010). Key molecular and clinical features of the storage diseases mentioned in this review are summarized in Table 1. In addition, detailed medical descriptions of the various disorders are available on the Online Metabolic and Molecular Bases of Inherited Disease (OMMBID) website (Valle et al., 2012).

Table 1.

The causes of lysosomal storage diseases, the organelles affected, and major sites of pathology

Mechanism of lysosomal storage | Disease examples | Lysosomal protein defect (gene symbol) | Substrate(s) stored | Major peripheral organ systems affected | CNS pathology
Lysosomal enzyme deficiencies | Aspartylglucosaminuria | Aspartylglucosaminidase (glycosylasparaginase, AGA) | Aspartylglucosamine (N-acetylglucosaminyl-asparagine) | Skeleton, connective tissue | +
 | Fabry | α-Galactosidase (GLA) | (Lyso-)Globotriaosylceramide | Kidney, heart | 
 | Gaucher types 1, 2, and 3 | β-Glucocerebrosidase (GBA) | Glucosylceramide, glucosylsphingosine | Spleen/liver, bone marrow | + a
 | GM1-gangliosidosis | β-Galactosidase (GLB1) | GM1-ganglioside, oligosaccharides | Skeleton, heart | +
 | Krabbe (globoid cell leukodystrophy) | Galactocerebrosidase (GALC) | Galactosylceramide | Heart | +
 | Metachromatic leukodystrophy | Arylsulfatase A (ARSA) | Sulfogalactosylceramide | | +
 | Mucopolysaccharidoses | Enzymes involved in mucopolysaccharide catabolism | Mucopolysaccharides | Cartilage, bone, heart, lungs | + b
 | Multiple sulfatase deficiency | SUMF1 (formylglycine-generating enzyme needed to activate sulfatases) | Multiple, including sulfated glycosaminoglycans | Spleen/liver, bone, skin | +
 | Pompe | α-Glucosidase (GAA) | Glycogen | Skeletal muscle | 
 | Sandhoff | β-Hexosaminidase A and B (HEXB) | GM2-ganglioside | | +
Trafficking defect of lysosomal enzymes | Mucolipidosis type II (I-cell disease) | N-acetyl glucosamine phosphoryl transferase α/β (GNPTAB) | Carbohydrates, lipids, proteins | Skeleton, heart | +
 | Mucolipidosis type IIIA (pseudo-Hurler polydystrophy) | N-acetyl glucosamine phosphoryl transferase α/β (GNPTAB) | Carbohydrates, lipids, proteins | Skeleton, heart | +/−
Defects in soluble non-enzymatic lysosomal proteins | Niemann-Pick disease type C2 | NPC2 (soluble cholesterol binding protein) | Cholesterol and sphingolipids | Liver | +
Defects in lysosomal membrane proteins | Cystinosis | Cystinosin (cystine transporter, CTNS) | Cystine | Kidney, eye | 
 | Danon disease | Lysosomal-associated membrane protein 2, splicing variant A (LAMP2) | Glycogen and other autophagic components | Cardiac and skeletal muscle | +
 | Free sialic acid storage disorder | Sialin (sialic acid transporter, SLC17A5) | Free sialic acid | Liver/spleen, skeleton | +
 | Mucolipidosis IV | Mucolipin-I (MCOLN1) | Mucopolysaccharides and lipids | Eye | +
 | Niemann-Pick disease type C1 | NPC1 (membrane protein involved in lipid transport) | Cholesterol and sphingolipids | Liver | +
Enigmatic lysosomal disorders | Neuronal ceroid lipofuscinoses (NCLs, including Batten disease) | Disparate group of diseases with genetic defects in apparently unrelated genes, not all of which are associated with the lysosomal system. Not known if these genes cooperate in common cellular pathways. | Autofluorescent lipofuscin is a common feature, with convergent clinical signs, e.g., visual system defects/blindness | | +

Listed are the diseases discussed in the main text. Mucopolysaccharidoses and neuronal ceroid lipofuscinoses refer to collections of related disorders.

Relatively few lysosomal diseases lack pathology in the central nervous system (CNS; Wraith, 2004). In the majority of LSDs, CNS involvement is common and neurodegeneration can occur in multiple brain regions (e.g., thalamus, cortex, hippocampus, and cerebellum). Neuropathology in LSDs involves unique temporal and spatial changes, which often entail early region-specific neurodegeneration and inflammation before global brain regions are affected. The main reasons for this are threefold: (1) specific storage metabolites exert differential effects on neuronal subtypes, (2) varying proportions of macromolecules are synthesized in different neuronal populations, and (3) there is differential neuronal vulnerability to storage (e.g., Purkinje neurons degenerate in many of these diseases, leading to cerebellar ataxia). Activation of the innate immune system is also prevalent in the brain in LSDs and directly contributes to CNS pathology (Vitner et al., 2010). Astrogliosis (activation of astrocytes) is another common feature of LSDs, which damages neurons through an inflammatory process known as glial scarring (Jesionek-Kupnicka et al., 1997; Vitner et al., 2010). The additive detrimental effects that astrogliosis has on neuron function are recapitulated in animal models of lysosomal diseases (Farfel-Becker et al., 2011; Pressey et al., 2012).

A notable non-neuronopathic LSD is Type 1 Gaucher disease (β-glucocerebrosidase deficiency), which is a relatively common LSD, particularly within the Ashkenazi Jewish community. The major cell type affected by glucosylceramide storage in this disease is the macrophage (“Gaucher cells”), whose dysfunction affects the production and turnover of cells belonging to the hematopoietic system. Gaucher cells infiltrate into various organs and affect the immune system, bone strength, spleen, and liver function.

A key question currently challenging this field is how endosomal–lysosomal storage leads to pathogenesis and how expanding this knowledge will improve treatment for patients (Bellettato and Scarpa, 2010; Cox and Cachón-González, 2012). This review aims to delineate regulatory systems and organelles that become disrupted in these disorders, highlighting the complexity of cellular storage, its consequences on pathogenesis, and implications for therapy.

Endosomal–autophagic–lysosomal function and dysfunction in storage diseases

Lysosomes play a central role in processing the clearance of cellular substrates from multiple routes within the endosomal–autophagic–lysosomal system (Fig. 1). Lysosomes are acidic organelles that contain enzymes required for the degradation of macromolecules, and efflux permeases that facilitate the inside-out translocation of small molecules generated through macromolecule catabolism. In comparison to endosomes and autophagosomes, lysosomes are smaller in size, are highly enriched in particular transmembrane proteins and hydrolytic enzymes (including proteases, glycosidases, nucleases, phosphatases, and lipases), have a higher buoyant density, an electron-dense appearance by transmission electron microscopy, and a high proton and Ca2+ content (Luzio et al., 2007; Saftig and Klumperman, 2009; Morgan et al., 2011). Lysosomes differ from endosomes in their degree of acidification and more abundant levels of lysosomal membrane proteins (LMPs) such as LAMP1 and LAMP2. Most nascent lysosomal enzymes bind to mannose-6-phosphate receptors (M6PRs) in the trans-Golgi network (TGN), which traffic the enzymes to early and late endosomes (Ghosh et al., 2003). Lysosomes in turn receive these enzymes when endosomal–lysosomal fusion occurs. Notably, dense lysosomes do not contain M6PRs. Acidotropic reagents such as Lysotracker are useful for labeling lysosomes; however, the mildly acidic interiors of late endosomes and autophagosomes also allow Lysotracker to label these organelles to varying degrees (Bampton et al., 2005).

Figure 1. Lysosomes as catabolic centers of the cell. Lysosomes utilize four distinct pathways for the degradation of cellular material. (A) Macroautophagy begins with the formation of isolation membranes that sequester regions of the cytosol that include denatured proteins, lipids, carbohydrates, and old/damaged organelles into encapsulated vesicles known as autophagosomes. The dynamic kinetics of autophagosome production and clearance by lysosomes is known as autophagic flux. (B) Endosomal degradation by lysosomes predominantly targets late endosomes/multivesicular bodies. Fusion between late endosomes and lysosomes can occur by (i) full fusion/degradation or (ii) kiss-and-run content mixing, where transient endosomal docking occurs. (C) Microautophagy involves the pinocytosis of cytosolic regions surrounding lysosomes. (D) Chaperone-mediated autophagy (CMA) selectively targets proteins with a KFERQ motif for delivery to lysosomes using Hsc-70 as its chaperone and LAMP-2A as its receptor.

The biogenesis and functioning of endosomal and autophagosomal pathways is controlled by transcription factor EB (TFEB), which regulates the expression of 471 genes that constitute the CLEAR (coordinated lysosomal expression and regulation) gene network (Sardiello et al., 2009; Palmieri et al., 2011). Recent work indicates that non-active TFEB is highly phosphorylated and associates with late endosomes/lysosomes (Roczniak-Ferguson et al., 2011). Autophagy-inducing conditions (e.g., deprivation of glucose or amino acids) result in reduced and altered TFEB phosphorylation, leading to its translocation into the nucleus (Peña-Llopis et al., 2011) and transcriptional expression of CLEAR genes (Palmieri et al., 2011).

Degradation of endosomal and autophagosomal material takes place upon exchange of content (via transient “kiss-and-run” contacts) or fusion with lysosomes, forming endolysosomes (Tjelle et al., 1996; Bright et al., 1997, 2005; Mullock et al., 1998) and autolysosomes (Jahreiss et al., 2008; Fader and Colombo, 2009; Orsi et al., 2010), respectively (Fig. 1, A and B). Lysosomes can be regarded as storage compartments for acidic hydrolases that enter cycles of fusion and fission with late endosomes and autophagosomes, while the digestion of endocytosed and autophagic substrates takes place primarily in endolysosomes and autolysosomes (Tjelle et al., 1996; Luzio et al., 2007). Under physiological conditions, endolysosomes and autolysosomes are transient organelles.

Cells deficient in lysosomal hydrolytic enzymes, lysosomal membrane proteins, or non-enzymatic soluble lysosomal proteins accumulate excessive levels of undegraded macromolecules (enzyme deficiency) or monomeric catabolic products (efflux permease deficiency) and contain numerous endo/autolysosomes (Fig. 2). When very high levels of macromolecules/monomers accumulate in endo/autolysosomes, they inhibit catabolic enzymes and permeases that are not genetically deficient, which results in secondary substrate accumulation (Walkley and Vanier, 2009; Lamanna et al., 2011; Prinetti et al., 2011). For example, lysosomal proteolytic capacity is reduced in fibroblasts from various LSDs, such as mucopolysaccharidoses I and VI, and GM1-gangliosidosis, which are themselves not caused by protease deficiency (Kopitz et al., 1993). The accumulation of primary and secondary substrates sets off a cascade of events that impacts not only the endosomal–autophagic–lysosomal system, but also other organelles, including mitochondria, the ER, Golgi, peroxisomes (Fig. 3), and overall cell function (Fig. 4).

Figure 2. Subtypes of storage organelles accumulate in LSDs. In different LSDs, cells display a unique spectrum of dysfunctional organelles depending on the specific lysosomal enzyme or non-enzymatic protein affected. (A) In primary LSDs, deficiencies in degradative enzymes prevent the clearance of autophagic and endocytic substrates, resulting in the accumulation of (i) autolysosomes (LC3-II (+), LAMP-1 (+)), (ii) endolysosomes (CI-MPR (+), LAMP-1 (+)), and (iii), in the case of certain lipase deficiencies, lipid-rich multilamellar bodies (CI-MPR (+), LAMP-1 (+)). (B) In a secondary storage disease such as Niemann-Pick type C1, lysosomal enzyme function remains intact, but impaired heterotypic fusion of autophagic and endocytic organelles with lysosomes results in the accumulation of (iv) autophagosomes (LC3-II (+), LAMP-1 (−)), (v) late endosomes (CI-MPR (+), active cathepsin D (−)), and (vi) endosome-derived multilamellar bodies (lipid-rich, CI-MPR (+), active cathepsin D (−)). Note: many primary storage diseases also accumulate organelles seen in secondary storage diseases (see text).

Figure 3. Summary of organelles affected in LSDs. Also shown are selective examples of LSDs. See Table 1 and main text for details.

Figure 4. Hypothetical cascade of events in LSD pathology. How gene mutations in lysosomal enzymes and non-enzymatic lysosomal proteins could lead to LSDs. Endo/autolysosomal events are confined to the darker shaded background, whereas processes taking place in the cytoplasm that affect autophagosomes, the ER, Golgi, peroxisomes, and mitochondria are on the lighter background. Processes depicted have been observed in a number of LSDs but do not necessarily apply to all LSDs.

Autophagic pathways.

The autophagic (“self-eating”) pathway constitutively targets intracellular cytosolic components for lysosomal degradation, and is essential for maintaining cellular energy and metabolic homeostasis (Kuma and Mizushima, 2010; Singh and Cuervo, 2011). To date, three distinct forms of autophagy have been characterized: macroautophagy, microautophagy, and chaperone-mediated autophagy (Fig. 1, A, C, and D). All three autophagic processes culminate in lysosomal degradation; however, the routes taken by substrates to the lysosome differ between the forms. Macroautophagy involves the bulk sequestration of cytosolic regions into double- or multi-membrane bound autophagosomes, which are trafficked to lysosomes for content digestion (Fig. 1 A). A diverse range of cellular material is degraded via macroautophagy, including lipids, carbohydrates and polyubiquitinated proteins, RNA, mitochondria, and fragments of the ER (Eskelinen and Saftig, 2009). The best characterized protein associated with autophagosomes is the lipidated (phosphatidylethanolamine) form of microtubule-associated protein light chain 3 (MAP-LC3), known as LC3-II, which is generated early in the autophagic process but degraded in the final phase of autophagic digestion.

Autophagic flux (the rate at which autophagic vacuoles are processed by lysosomes) is reduced in most LSDs (Ballabio, 2009; Ballabio and Gieselmann, 2009; Raben et al., 2009). This is evident from the combined elevation of autophagic substrates and autophagosome-associated LC3-II. LSD cells often display increased numbers of LC3(+) organelles, of which only a subgroup carry lysosomal markers, suggesting that both autophagosomes and autolysosomes persist in these conditions. For example, in mouse models of Batten disease (a neuronal ceroid lipofuscinosis [NCL] disorder; Table 1), most LC3-positive compartments are not positive for LAMP1 (Koike et al., 2005), and in multiple sulfatase deficiency and juvenile neuronal ceroid lipofuscinosis, LC3 and LAMP1 are predominantly localized in separate organelles, which is even more pronounced after starvation (Cao et al., 2006; Settembre et al., 2008). Endosome–lysosome and autophagosome–lysosome fusion is also impaired in mucolipidosis type IIIA and multiple sulfatase-deficient mouse embryonic fibroblasts (Fraldi et al., 2010).
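The marker logic used in this paragraph (LC3 alone marks autophagosomes, LC3 together with LAMP1 marks autolysosomes) can be written out as a small classifier. This is an illustrative sketch only: the boolean inputs stand in for thresholded imaging calls, the function names are hypothetical, and real colocalization analysis involves image segmentation and statistics not shown here.

```python
def classify_organelle(lc3_positive, lamp1_positive):
    """Label a compartment from two marker calls (True = marker present)."""
    if lc3_positive and lamp1_positive:
        return "autolysosome"
    if lc3_positive:
        return "autophagosome"
    if lamp1_positive:
        return "lysosome or endolysosome"
    return "unclassified"


def lc3_lamp1_colocalization(puncta):
    """Fraction of LC3(+) puncta that are also LAMP1(+).

    puncta: iterable of (lc3_positive, lamp1_positive) pairs.
    """
    lc3_puncta = [p for p in puncta if p[0]]
    if not lc3_puncta:
        return 0.0
    return sum(1 for p in lc3_puncta if p[1]) / len(lc3_puncta)


if __name__ == "__main__":
    # Toy calls, invented for illustration.
    toy = [(True, True), (True, False), (True, False), (False, True)]
    print([classify_organelle(*p) for p in toy])
    print(lc3_lamp1_colocalization(toy))
```

A low colocalization fraction corresponds to the situation described for the Batten disease models, where most LC3-positive compartments lack LAMP1.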

Microautophagy does not involve de novo synthesis of nascent vacuoles, but rather occurs via the direct pinocytosis of cytosolic material by lysosomes ( Fig. 1 C ). The membrane dynamics regulating microautophagy are similar to those involved in the formation of intra-luminal vesicles (ILVs) found in multivesicular bodies/late endosomes (Sahu et al., 2011). Currently, little is known about the repercussions of lysosomal storage on microautophagy, but this process appears to be impaired in primary myoblasts from patients with the muscle-wasting condition Pompe disease (Takikita et al., 2009).

Chaperone-mediated autophagy (CMA) is a selective form of autophagic proteolysis that targets proteins containing a KFERQ motif for degradation (Dice et al., 1990; Cuervo and Dice, 2000). The eponymous chaperone that recognizes and binds to proteins destined for CMA is the heat shock cognate protein of 70 kD (Hsc70). Substrate-bound Hsc70 docks on lysosomes via contact with lysosomal-associated membrane protein 2A (LAMP-2A), allowing entry of proteins into lysosomes (Fig. 1 D). Mutations in LAMP-2A cause Danon disease, and specifically affect CMA (Eskelinen et al., 2003; Fidziańska et al., 2007). CMA is also known to be impaired in mucolipidosis IV, where mutations in transient receptor potential mucolipin-1 (MCOLN1) lead to reduced amounts of LAMP-2A and substrate uptake into lysosomes (Venugopal et al., 2009).

Lysosome reformation.

Both endolysosomes and autolysosomes extend tubular structures where lysosomal hydrolases and LMPs concentrate (Tjelle et al., 1996; Bright et al., 1997, 2005; Pryor et al., 2000; Yu et al., 2010). At the ends of these tubules, [LC3(−), LAMP1(+)] vesicles bud off and acidify, maturing into dense lysosomes, a fission process referred to as lysosome reformation. This event completes each cycle of endocytic and autophagic degradation, yielding dense lysosomes that are available to fuse with newly generated endosomes and autophagosomes.

Efficient processing of endo/autolysosomal substrates is essential for lysosome reformation. This is well illustrated in a study that monitored exogenous sucrose metabolism in rat kidney fibroblasts (Bright et al., 1997). Sucrose is a disaccharide composed of the monosaccharides glucose and fructose, and is itself indigestible by cells. In this study, sucrose-filled endosomes fused with lysosomes and formed large endolysosomes, which accumulated in the cytosol. A depletion of dense-core lysosomes was seen under these conditions; however, dissolution of the accumulated sucrose by uptake of exogenous invertase resulted in the reappearance of dense-core lysosomes. This study and a more recent one from Yu et al. (2010) indicate that lysosome biogenesis does not occur de novo, but rather arises by reformation/budding from endolysosomes. Lysosome reformation appears to be defective in sialic acid storage disease, as skin fibroblasts from diseased individuals lack dense lysosomes, while lysosomal enzymes persist in intermediate or light organelles (Schmid et al., 1999).

Interestingly, impairment of lysosome reformation appears to be the primary cellular defect in Niemann-Pick type C2 (NPC2)-deficient cells, indicating that the NPC2 protein has a crucial role in this process (Goldman and Krise, 2010). Considering that NPC1 and NPC2 deficiencies have the same pathological consequences (Niemann-Pick type C disease; Table 1), this suggests that lysosome reformation is as essential as endosome/autophagosome–lysosome fusion, which is impaired in NPC1-deficient cells.

Recent reports have provided a mechanistic link between the failure of endo/autolysosomal clearance and the deficit of lysosome reformation. Central to this pathway is mTOR, a serine/threonine kinase that has an overarching role in coordinating cellular metabolism with nutritional status (Laplante and Sabatini, 2012). During the course of the autophagic process, mTOR goes through a cycle of phosphorylation-dependent inactivation and reactivation, with the latter being required for autophagic lysosome reformation (Yu et al., 2010). In turn, mTOR reactivation depends on the completion of autolysosomal substrate digestion, and sufficient levels of luminal amino acids (Zoncu et al., 2011). Limited information is currently available on the extent of lysosome reformation and mTOR reactivation in LSDs. However, inadequate autolysosomal degradation may preclude mTOR reactivation and, hence, also impede lysosome reformation, leaving affected cells deprived of dense lysosomes. Consequently, in addition to stalled autolysosomes, autophagosomes may persist due to a deficiency of dense lysosomes, explaining the low level of colocalization of autophagosomal and lysosomal markers. mTOR activity is reduced in the brain of a mouse model of juvenile neuronal ceroid lipofuscinosis (Cao et al., 2006), in fibroblasts from mucopolysaccharidosis type I S, Fabry disease and aspartylglucosaminuria subjected to starvation-induced autophagy (Yu et al., 2010), in NPC1- and NPC2-knockdown human umbilical vein endothelial cells (Xu et al., 2010), and in MCOLN1-deficient Drosophila pupae (Wong et al., 2012), but not in brain samples from Sandhoff, GM1-gangliosidosis, and NPC1 mice (Boland et al., 2010). Considering the myriad of cellular signaling pathways that mTOR is involved in (Laplante and Sabatini, 2012), it may be necessary to differentiate mTOR activity in affected cell populations of different brain regions. In addition, electron microscopy remains a powerful tool for the ultrastructural classification of autophagosomes and autolysosomes in LSD cells, and could also be used to monitor the extent of lysosome reformation.

Mitochondrial dysfunction and cytoplasmic protein aggregation.

In LSDs, a reduction of autophagic flux has a major impact on mitochondrial function and on cytoplasmic proteostasis. Constitutive macroautophagy maintains mitochondrial quality by selectively degrading dysfunctional mitochondria via a process known as mitophagy (Kim et al., 2007). Mitochondrial proteins are consistently found in the proteomes of highly purified autolysosomes, especially subunits of the mitochondrial ATPase (Schröder et al., 2010). Reduced autophagic flux in LSDs leads to the persistence of dysfunctional mitochondria, which is highly pronounced in Batten’s disease neurons (Ezaki et al., 1996). Several LSDs (mucolipidosis types IV, IIIA [pseudo-Hurler polydystrophy], and II [I-cell disease], late infantile neuronal ceroid lipofuscinosis [CLN2], mucopolysaccharidosis VI, and GM1 gangliosidosis) display mitochondrial abnormalities, including replacement of the extended filamentous mitochondrial network with high numbers of relatively short mitochondria, and loss of mitochondrial calcium-buffering capacity and membrane potential (Jennings et al., 2006; Settembre et al., 2008; Takamura et al., 2008; Tessitore et al., 2009). Studies into aging and autophagosome formation have shown that mitochondria are involved in signaling pathways regulating apoptosis and innate immunity, and that reduced autophagic flux and subsequent accumulation of dysfunctional, reactive oxygen species–generating mitochondria render cells more sensitive to apoptotic and inflammatory stimuli (Terman et al., 2010; Green et al., 2011; Nakahira et al., 2011; Zhou et al., 2011). Therefore, the aberrant functioning of mitochondria may be responsible for apoptosis and inflammation in the CNS of multiple LSDs.

In addition, a lack of autophagy completion in LSDs leads to the persistence of ubiquitinated and aggregate-prone polypeptides in the cytoplasm, including p62/SQSTM1, α-synuclein, and Huntingtin protein (Ravikumar et al., 2002; Suzuki et al., 2007; Settembre et al., 2008; Tessitore et al., 2009). Alpha-synuclein itself contributes to neurodegeneration by reducing the efficiency of autophagosome formation (Winslow et al., 2010), and is also a main component of Lewy bodies that are notably elevated in Parkinson’s disease and other forms of dementia. Diminished quality control of cytosolic proteins may thus also contribute to LSD pathology.

Impairment of autophagy and escalation of cytoplasmic protein aggregation are shared between neurodegenerative LSDs and more common neurodegenerative disorders, such as Alzheimer’s, Parkinson’s, Huntington’s disease, and amyotrophic lateral sclerosis (ALS; García-Arencibia et al., 2010; Wong and Cuervo, 2010). Mutations in presenilin-1, which cause a familial form of Alzheimer’s disease, are known to impair lysosomal clearance of autophagosomes (Esselens et al., 2004; Wilson et al., 2004; J.H. Lee et al., 2010). Different mechanisms have been proposed to explain how the partial loss of presenilin function impairs autophagic flux. Reports from J.H. Lee et al. (2010) indicate that presenilin 1 is needed for the glycosylation and subsequent delivery of V0a1 protein to lysosomes, where it forms a subunit of lysosomal v-ATPase. Loss of presenilin 1 function is in turn thought to impair lysosomal proteolysis by raising lysosomal pH above the optimal acidity of pH 4–5. Alternatively, another recent report has indicated that mutations in presenilin 1 lead to a loss of lysosomal calcium regulation, which in turn affects fusion and clearance of autophagosomes (Coen et al., 2012). However, considering both groups confirmed that presenilin 1 mutations affect autophagic flux, Alzheimer’s disease is beginning to emerge as a neurodegenerative disorder that may share similarities in terms of underlying pathogenic mechanisms with lysosomal storage disorders.

Efflux of molecules from endo/autolysosomes.

Some storage molecules in LSDs (glycoconjugates, amino acids, or insoluble lipids) escape from cells and can be detected in blood and/or urine, which can be utilized for diagnostic purposes (Meikle et al., 2004). While glycoconjugates derived from storage cells in multiple tissues could escape as solutes in blood and urine, lipids extracted from urine are believed to be membrane associated and predominantly exosomal (Pisitkun et al., 2004).

At the cellular level, a big question that remains to be resolved concerns the way in which storage molecules escape the lysosomal system and affect the function of other organelles and cellular systems (Elleder, 2006). Theoretically, lipids can undergo redistribution within cells via membrane trafficking, fusion, or via altered trafficking pathways characteristic of these diseases (Chen et al., 1999). Endolysosomal macromolecules may also be disseminated via membrane contact sites between endolysosomes and the ER (Eden et al., 2010; Toulmay and Prinz, 2011), and by extracellular secretion of endolysosomal content, including exosome release. For example, primary kidney cells from arylsulfatase A–deficient mice secrete the accumulating lipid (sulfogalactosylceramide) into the culture medium (Klein et al., 2005), and NPC1-deficient cells release higher amounts of cholesterol-rich exosomes (Chen et al., 2010; Strauss et al., 2010). Accordingly, the possibility needs to be considered that exosomes containing storage molecules are taken up by recipient cells, and that these macromolecules and lipids affect recipient cell function by distributing to the plasma membrane and other organelles outside the endolysosomal system (Simons and Raposo, 2009).

Due to the extraordinarily high levels of lipids in the endo/autolysosomal system, even a minor redistribution to other cellular membranes could have functional implications. Over the past few years, multiple examples have emerged suggesting that this not only occurs but can actively contribute to the pathogenic cascade (Vitner et al., 2010). A key challenge is to demonstrate experimentally that particular storage macromolecules are indeed ectopically present in the membrane of other organelles. This is technically challenging due to the limitations of conventional cell fractionation techniques. Currently, the presence of storage components in non-lysosomal sites is either inferred indirectly or supported by evidence from immunostaining methods. To date, the best examples come from studying the effects of lipid storage in the ER (Sano et al., 2009; Futerman, 2010).

Lysosomal calcium homeostasis.

Endosomes and lysosomes are regulated calcium stores (Morgan et al., 2011) that release calcium in response to the second messenger nicotinic acid adenine dinucleotide phosphate (NAADP; Churchill et al., 2002). NPC1 disease is unusual in having a profound block in late endosome–lysosome fusion (Kaufmann et al., 2009; Goldman and Krise, 2010), a process known to be calcium dependent (Lloyd-Evans et al., 2008). In NPC1 patient cells and cultured cells deficient in NPC1 protein, calcium levels within acidic organelles are approximately 30% of those in wild-type cells (Lloyd-Evans et al., 2008; H. Lee et al., 2010). NPC1 cells do respond to NAADP, but, due to the reduced luminal calcium levels, release less calcium, thus leading to the fusion deficiency associated with this disorder (Lloyd-Evans et al., 2008). Therefore, NPC1 disease demonstrates that acidic calcium stores play a central role in the regulation of fusion and trafficking within the endocytic system itself (Morgan et al., 2011).

Endoplasmic reticulum defects.

In addition to the endoplasmic reticulum (ER) being the major site of the secretory pathway responsible for protein folding/quality control and N-glycosylation, it is also a regulated calcium store. The lipid and protein content of the ER is tightly regulated to maintain its essential quality-control functions. Surprisingly, very few examples of ER stress (e.g., unfolded protein response) have been reported among LSDs, with GM1 gangliosidosis being the only sphingolipid storage disorder in which this has been demonstrated to date (Tessitore et al., 2004; Sano et al., 2009; Vitner et al., 2010). Instead, the major impact in lipid storage disorders is on ER calcium regulation (Futerman and van Meer, 2004; Futerman, 2010). ER calcium homeostasis is perturbed in the sphingolipid storage disorders Gaucher disease, GM1 and GM2 gangliosidoses, and Niemann-Pick type A (Ginzburg and Futerman, 2005), leading to elevated cytosolic calcium. In these diseases, the characteristic lipids being stored, glucosylceramide, GM1 and GM2 ganglioside, and sphingomyelin, respectively, may hypothetically escape from endolysosomes and affect ER calcium channel function. Interestingly, the mechanisms leading to defective ER calcium homeostasis are specific to each disorder and have recently been reviewed (Vitner et al., 2010). In turn, aberrant ER calcium regulation may impact mitochondria through ER–mitochondria contact sites, resulting in mitochondrial calcium excess and an induction of mitochondria-mediated apoptosis, as seen in GM1 gangliosidosis (Sano et al., 2009).

The Golgi.

Dysfunction of the Golgi is a common feature of many lipid storage disorders, and has traditionally been thought to arise from alterations in sphingolipid trafficking from the Golgi to the lysosome (Pagano et al., 2000). However, recently Golgi involvement has been demonstrated in mucopolysaccharidosis IIIB (Sanfilippo B syndrome; Vitry et al., 2010). Surprisingly, this study did not find any evidence that the endocytic and autophagic pathways were affected in Sanfilippo B syndrome; instead, the authors noticed that large storage bodies were enriched in the Golgi matrix protein, GM130, which is required for vesicle tethering in pre- and cis-Golgi compartments. Furthermore, the morphology of the Golgi apparatus was altered in cells with distended cisternae connected to LAMP1-positive storage bodies. This study therefore suggests that Golgi biogenesis may be affected in this disease, and further studies will shed light on the molecular mechanisms that underpin Golgi involvement in this neurodegenerative disorder.


Peroxisomes.

There are reports of peroxisomal dysfunction occurring in some lipid lysosomal storage diseases, including Krabbe (globoid cell leukodystrophy; Haq et al., 2006) and NPC1 disease (Schedin et al., 1997). In Krabbe disease, the major storage lipid galactosylceramide is converted into its lysosomal metabolite, galactosylsphingosine, which down-regulates the peroxisome proliferator–activated receptor-α (PPAR-α). Loss of PPAR-α and subsequent cell death can be prevented using an inhibitor of secretory phospholipase A2, suggesting a novel therapeutic approach for Krabbe disease (Haq et al., 2006). In the NPC1 disease mouse model, peroxisomes appear normal at the ultrastructural level but have decreased peroxisomal β oxidation of fatty acids and catalase activity, which is an early event in disease pathogenesis (Schedin et al., 1997). In peroxisomal biogenesis disorders such as Zellweger syndrome and infantile Refsum disease, a-series gangliosides (e.g., GM1, GM2) and their precursor GM3 ganglioside are stored. As these gangliosides are common secondary storage metabolites in many LSDs, this raises the possibility that peroxisomal dysfunction underpins secondary ganglioside storage in LSDs, a hypothesis that merits systematic study. How peroxisomal function affects ganglioside metabolism remains unknown but may be part of a broader lipid regulatory network in mammalian cells.

Cellular metabolic stress.

Considering that both endocytic and autophagic pathways are essential for maintaining cellular metabolic homeostasis, the diminished efflux of monomeric products from endo/autolysosomes is likely to induce a state of metabolic insufficiency, where key catabolic intermediates are unavailable to enter a variety of metabolic recycling pathways (Schwarzmann and Sandhoff, 1990; Walkley, 2007). For example, in some cell types, the majority of nascent glycosphingolipids are synthesized from sphingoid bases that derive from endolysosomal ceramide catabolism (Tettamanti, 2004; Kitatani et al., 2008). Multiple endolysosomal exoglycosidases, including glucocerebrosidase, which is deficient in Gaucher disease, are involved in this process (Kitatani et al., 2009). The lack of reutilized sphingolipids/fatty acids that normally result from endolysosomal degradation would place such cells under significant metabolic stress. This may also apply to NPC disease, which is a particularly complex and enigmatic storage disease caused by mutations in either the NPC1 or NPC2 genes, with resulting storage of several lipid species including cholesterol and various sphingolipids (Lloyd-Evans and Platt, 2010). The NPC1 protein is an integral membrane protein of late endosomes that may function to efflux sphingosine (protonated at acidic pH) out of endolysosomes, where it can enter the sphingolipid salvage pathway or undergo phosphorylation to sphingosine-1-phosphate (S1P), raising the possibility that S1P deficiency contributes to NPC1 disease pathogenesis (Lloyd-Evans et al., 2008; Lloyd-Evans and Platt, 2010).

Therapeutic implications

Over the past two decades there has been a remarkable expansion in the number of therapeutic strategies for LSDs that target different cellular organelles (Table 2). The first treatment that led to a licensed commercial product was enzyme replacement therapy (ERT) for type 1 Gaucher disease. The discoveries leading to that seminal therapeutic advance were recently reviewed by Roscoe Brady, who pioneered this approach (Brady, 2010). This therapy “replaces” the defective enzyme in the lysosome by delivering a fully functional wild-type enzyme that is endocytosed into macrophages via the macrophage mannose receptor. Wild-type glucocerebrosidase was initially purified from human placenta (now recombinant products are used) and is typically given to patients every two weeks by intravenous infusion (Charrow, 2009). This strategy leads to a remarkable degree of therapeutic benefit and has transformed the lives of patients with this debilitating peripheral storage disease (Charrow, 2009). This success catalyzed the development of ERT for Fabry disease (Schiffmann and Brady, 2006; Angelini and Semplicini, 2012), Pompe disease (Angelini and Semplicini, 2012), and several of the mucopolysaccharide storage disorders (Kakkis, 2002). However, the clinical limitations of ERT are two-fold. First, product delivery is invasive and time-consuming, and second, lysosomal enzymes do not cross the blood–brain barrier to any significant extent, so cannot effectively treat CNS disease, which is characteristic of most LSDs. To circumvent this problem, bone marrow (BM) transplantation from healthy donors has been evaluated in some of these diseases. Microglia are of BM origin and over time a few donor-derived monocytes enter the CNS and serve as local sites of wild-type enzyme production, which can be taken up via secretion-recapture by neighboring host cells.
On the whole, BM transplantation is only effective if it is performed in early infancy, does not show efficacy in all LSDs, and is not curative (Wraith, 2001). Further complications include the need for human leukocyte antigen (HLA) matched donors, the high rate of mortality associated with recipients, and the lack of standardization amongst different BMT regimens in different clinical centers.

Table 2.

Status of approved treatments and experimental therapies for LSDs with selected bibliography

Therapy | Target organelle | In vitro POC | In vivo POC | Clinical trials | Regulatory approval | References
Enzyme replacement (ERT) | Lysosome | + | + | + | + | Brady, 2006b; Neufeld, 2011
Bone marrow transplantation (BMT) | Lysosome | + | + | + | N/A | Krivit, 2002; Brady, 2006a
Substrate reduction therapy (SRT) | Golgi | + | + | + | + | Platt and Butters, 2004; Platt and Jeyakumar, 2008; Cox, 2010
Enzyme enhancement therapy (EET) | ER/lysosome | + | | In progress | | Okumiya et al., 2007; Fan, 2008
Gene therapy (GT) | Nucleus | + | + | In progress | | Gritti, 2011; Tomanin et al., 2012
Stop codon read-through | Nucleus | + | | | | Brooks et al., 2006
Calcium modulation therapy (CMT) | ER | + | + | | | Lloyd-Evans et al., 2008
Enhanced exocytosis therapy (ExT) | Exosome | + | | | | Strauss et al., 2010; Medina et al., 2011
Chaperone therapy by Hsp70 (CT) | Lysosome | + | | | | Kirkegaard et al., 2010
Proteostasis regulation therapy (PRT) | ER | + | | | | Balch et al., 2008; Mu et al., 2008
Cholesterol removal using cyclodextrin in NPC1 disease | Lysosome | + | + | | | Davidson et al., 2009; Ward et al., 2010; Aqul et al., 2011

Another therapy to be developed and subsequently approved for LSDs was substrate reduction therapy using the oral small molecule imino sugar drug, miglustat (Lachmann, 2006). This has been approved for type 1 Gaucher disease (worldwide) for over a decade, and in 2009 for treating neurological manifestations in Niemann-Pick type C disease (now approved in most countries/regions, except the USA; Patterson et al., 2007). Miglustat targets the Golgi enzyme, glucosylceramide synthase (Platt et al., 1994), and by partially inhibiting glycosphingolipid biosynthesis it reduces the catabolic burden of these molecules on lysosomes that cannot digest them. It has the potential to be used in diseases with glycosphingolipid storage, as miglustat inhibits the first committed step in the biosynthesis of this family of lipids. Also, miglustat crosses the blood–brain barrier, hence its disease-modifying benefit in Niemann-Pick type C disease (Patterson et al., 2007). Like all drugs, this compound has side effects, the primary one being inhibition of disaccharidases, which can lead to gastrointestinal symptoms, particularly in the first 1–2 months of therapy. More recently, eliglustat tartrate (Genz-112638) has entered clinical trials in type 1 Gaucher disease as an oral substrate reduction therapy. As this drug has a different chemistry to miglustat, it also has a different side-effect profile (Cox, 2010).

There are currently several alternative therapeutic strategies that have shown utility in tissue culture models and/or in animal models of these diseases and are summarized in Table 2 . Many of these approaches target non-lysosomal organelles. No doubt as more is known about pathogenic cascades and their impact on cellular organelles, additional creative approaches to treatment will emerge and undergo pre-clinical testing. Due to the severity and complexity of these disorders it is likely that ultimately a combination therapy will be needed to target multiple steps/organelles in the pathogenic cascade.


In conclusion, we have provided some selective examples illustrating the complexity of how lysosomal dysfunction impinges upon multiple aspects of cell biology, often in unanticipated ways (summarized in Fig. 3). Many questions remain unanswered at the present time, and some of these are highlighted in Box 1. However, the study of these rare diseases (Table 1) fills two voids in our knowledge, namely providing fundamental insights into lysosomal biology and leading to novel approaches to generate next-generation therapeutic interventions for treating these truly fascinating yet devastating disorders (Table 2). It is clear that although storage is primarily initiated in the late endosomal–autophagic–lysosomal system, it induces a pathogenic cascade that impacts on multiple cellular systems and organelles, suggesting that conceptually we should view these diseases as cellular storage disorders and use this broader knowledge for the design of therapeutic interventions.

Box 1. Open Questions

• How does storage affect other aspects of lysosomal function, independent of the primary storage metabolite?

• How does storage trigger innate immune activation?

• How does lysosomal storage affect cell signaling?

• How do storage lipids escape the lysosome and affect the function of other organelles?

• What is the hierarchy of the pathogenic cascade in these diseases, and which steps should be targeted for optimal therapy?

• Do the genetic defects in the neuronal ceroid lipofuscinoses (NCL disorders) cause convergent symptoms by chance, or are the disparate genes functioning in common cell biological pathways?

Synaptic capture

Retrograde signaling from the synapse to the nucleus

One of the features that fundamentally distinguishes the storage of long-term memory from short-term cellular changes is the requirement for the activation of gene expression. Given this requirement at the nucleus, one might expect that LTF would have to be cell-wide. However, experiments by Martin et al. using local applications of serotonin in the Aplysia bifurcated sensory neuron-two motor neuron culture preparation[63, 72], as well as parallel experiments by Frey and Morris in the hippocampus[73], demonstrated that synapses could be modified independently in a protein synthesis–dependent manner. Thus, LTF and the associated synaptic changes are synapse-specific, and this synapse specificity also requires CREB-1 and is blocked by an antibody to CREB-1. This implies that there must be not only retrograde signaling from the synapse back to the nucleus[72, 74], but also anterograde signaling from the nucleus to the synapse. Recently, Thompson et al.[75] have found that serotonin stimulation which produces LTF in Aplysia sensory-motor neuron co-cultures triggers the nuclear translocation of importins, proteins involved in carrying cargos through nuclear pore complexes (see also[74]). Similarly, in hippocampal neurons, NMDA activation or LTP induction, but not depolarization, leads to translocation of importin[75]. Although details underlying the translocation of these retrograde signals remain unknown, the effector molecules identified thus far appear to be conserved in both invertebrates and vertebrates. The future identification of the molecular cargoes of importin and its signaling role in the nucleus are likely to increase our understanding of how transcription-dependent memory is regulated.

Following transcriptional activation, newly synthesized gene products, both mRNAs and proteins, have to be delivered specifically to the synapses whose activation originally triggered the wave of gene expression. To explain how this specificity can be achieved in a biologically economical way in spite of the massive number of synapses in a single neuron, Martin et al.[49, 61, 72] and Frey and Morris[73] proposed the synaptic capture hypothesis. This hypothesis, sometimes also referred to as synaptic tagging, proposes that the products of gene expression are delivered throughout the cell, but are only functionally incorporated in those synapses that have been tagged by previous synaptic activity. The “synaptic tag” model has been supported by a number of studies both in the rodent hippocampus[73, 76–78] and Aplysia[63, 72].
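
The logic of the synaptic capture hypothesis can be illustrated with a deliberately simplified sketch. The class and function names below (`Synapse`, `stimulate`, `deliver_gene_products`) are hypothetical and chosen only for illustration; the sketch assumes a tag is a simple on/off mark and ignores timing, decay of the tag, and all molecular detail.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Synapse:
    name: str
    tagged: bool = False        # mark left by local synaptic activity
    strengthened: bool = False  # long-term change after capturing gene products


def stimulate(synapse: Synapse) -> None:
    """Local activity tags one synapse (toy assumption: a binary tag)."""
    synapse.tagged = True


def deliver_gene_products(synapses: List[Synapse]) -> None:
    """Gene products reach every synapse, but only tagged synapses use them."""
    for synapse in synapses:
        if synapse.tagged:
            synapse.strengthened = True


synapses = [Synapse("A"), Synapse("B"), Synapse("C")]
stimulate(synapses[1])            # only synapse B is active
deliver_gene_products(synapses)   # cell-wide delivery
captured = [s.name for s in synapses if s.strengthened]
print(captured)
```

In this sketch delivery is cell-wide, yet the long-term change stays synapse-specific because incorporation depends on the local tag.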

Molecular mechanisms of synaptic capture

Studies of synaptic capture at the synapses between the sensory and motor neurons of the gill-withdrawal reflex in Aplysia have demonstrated that achieving synapse-specific LTF requires more than the production of CRE-driven gene products in the nucleus. One also needs a PKA-mediated covalent signal to mark the stimulated synapses and local protein synthesis to stabilize that mark[63, 72]. Thus, injection into the cell body of phosphorylated CREB-1 gives rise to LTF at all the synapses of the sensory neuron by seeding these synapses with the protein products of CRE-driven genes. However, this facilitation is not maintained beyond 24–48 hours and is not accompanied by synaptic growth unless the synapse is also marked by the short-term process, a single pulse of serotonin[63].

How is a synapse marked? Martin et al.[72] found two distinct components of marking in Aplysia, one that requires PKA and initiates long-term synaptic plasticity and growth, and one that stabilizes long-term functional and structural changes at the synapse and requires (in addition to protein synthesis in the cell body) local protein synthesis at the synapse. Since mRNAs are made in the cell body, the need for the local translation of some mRNAs suggests that these mRNAs are presumably dormant while they are transported from the cell body to the synapses of the neuron and are only activated at appropriate synapses in response to specific signals. If that were true, one way of activating protein synthesis at these specific synapses would be to recruit to these synapses a regulator of translation that is capable of activating dormant mRNA.

Kausik Si began to search for such a regulator of protein synthesis. In Xenopus oocytes, Joel Richter had found that maternal RNA is silent until activated by the cytoplasmic polyadenylation element binding protein (CPEB)[79]. Si searched for a homolog in Aplysia and found, in addition to the developmental isoform studied by Richter, a new isoform of CPEB with novel properties. Blocking this isoform at a marked (active) synapse prevented the maintenance but not the initiation of long-term synaptic facilitation[80, 81]. Indeed, blocking ApCPEB blocks memory days after it is formed. An interesting feature of this isoform of Aplysia CPEB is that its N-terminus resembles the prion domain of yeast prion proteins and confers similar self-sustaining properties on Aplysia CPEB. But unlike other prions, which are pathogenic, ApCPEB appears to be a functional prion. The active self-perpetuating form of the protein does not kill cells but rather has an important physiological function.

The Si lab and the Barry Dickson lab have found, independently, that long-term memory in Drosophila also involves CPEB for a learned courtship behavior in which males are conditioned to suppress their courtship upon prior exposure to unreceptive females. When the prion domain of the Drosophila CPEB is mutated, there is loss of long-term courtship memory[82, 83].

Prion-like proteins represent auto-replicative structures that may serve as a persistent form of information[84]. Si and I have recently proposed a model based on the prion-like properties of Aplysia neuronal cytoplasmic polyadenylation element binding protein (CPEB)[85]. Neuronal CPEB can activate the translation of dormant mRNAs through the elongation of their poly-A tail. Aplysia CPEB has two conformational states: one is inactive or acts as a repressor, while the other is active. In a naive synapse, the basal level of CPEB expression is low and its state is inactive or repressive. According to the model of Si et al., serotonin induces an increase in the amount of neuronal CPEB. If a given threshold is reached, this causes the conversion of CPEB to the prion-like state, which is more active and lacks the inhibitory function of the basal state[85]. Once the prion state is established at an activated synapse, dormant mRNAs, made in the cell body and distributed cell-wide, would be translated but only at the activated synapses. Because the activated CPEB can be self-perpetuating, it could contribute to a self-sustaining synapse-specific long-term molecular change and provide a mechanism for the stabilization of learning-related synaptic growth and the persistence of memory storage.
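
The threshold-and-persistence behavior described in this model can be sketched as a toy simulation. Everything numerical here is an assumption made for illustration: the function name `simulate_cpeb`, the parameters `threshold`, `gain`, and `decay`, and the discrete time steps are not taken from the studies cited above.

```python
def simulate_cpeb(serotonin_pulses, threshold=1.8, gain=1.0, decay=0.5):
    """Toy model of CPEB conversion at one synapse.

    serotonin_pulses: sequence of 0/1 values, one per time step.
    Returns a list of booleans: True once CPEB is in the prion-like state.
    All parameter values are arbitrary illustrative assumptions.
    """
    amount = 0.0          # local CPEB level (arbitrary units)
    prion_state = False   # basal state is inactive or repressive
    history = []
    for pulse in serotonin_pulses:
        # Serotonin raises the CPEB level; without it the level decays.
        amount = amount * decay + gain * pulse
        if amount >= threshold:
            # Conversion is treated as self-perpetuating: once reached,
            # the active state persists even after the level falls again.
            prion_state = True
        history.append(prion_state)
    return history


single_pulse = simulate_cpeb([1, 0, 0, 0, 0])
repeated_pulses = simulate_cpeb([1, 1, 1, 1, 1, 0, 0, 0])
print(single_pulse)
print(repeated_pulses)
```

With these assumed values a single pulse never reaches the threshold, whereas repeated pulses do, and the active state then outlasts the stimulation. Only in that state would dormant mRNAs be translated at the synapse.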

The Spemann Organizer has conserved functions
As gastrulation proceeds, the A/P axis becomes specified, and the blastopore lip can induce only progressively more posterior structures.
Hensen's node (the chick Organizer), at the anterior end of the primitive streak, contributes to notochord and somites and can induce another axis (though this is more difficult than in frogs).
Xenopus and mouse share a number of genes that are expressed in the organizer.
FGF (fibroblast growth factor)
Control of Hox genes is unknown.

Neural plate is induced by mesoderm
Dorsal lip transplantation experiments demonstrate that a nervous system can be induced from ectoderm.
BMP-4, secreted growth factor, inhibits cells from forming neural tissue.
Inhibition of BMP-4 allows neural tissue.
noggin (secreted by the organizer) inhibits BMP-4 and acts to dorsalize the mesoderm.
noggin also induces neural tissue.
chordin, expressed by the future neural plate cells of the organizer.
Both noggin and chordin directly bind BMP-4 and inactivate it to allow the induction of neural tissue.
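
The inhibition logic above can be written as a small truth function. This is only a sketch: the function name `ectoderm_fate` is hypothetical, signals are treated as simply present or absent, and the alternative fate is labeled "epidermis" on the assumption that ectoderm with active BMP-4 signaling becomes epidermal.

```python
def ectoderm_fate(bmp4: bool, noggin: bool = False, chordin: bool = False) -> str:
    """Return the fate of an ectoderm cell in a binary toy model.

    BMP-4 blocks neural fate unless noggin or chordin binds and
    inactivates it (assumption: either inhibitor alone is sufficient).
    """
    bmp4_active = bmp4 and not (noggin or chordin)
    return "epidermis" if bmp4_active else "neural"


print(ectoderm_fate(bmp4=True))
print(ectoderm_fate(bmp4=True, noggin=True))
print(ectoderm_fate(bmp4=True, chordin=True))
```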

Hensen's node is the chick organizer
Hensen's node (chick) can induce neural gene expression in Xenopus ectoderm.
This demonstrates the evolutionary conservation of neural induction signals and confirms the similarity of Hensen's node and Spemann's Organizer.
Early nodes induce anterior structures.
Later nodes induce posterior structures.
Stem cells that arise from the node can specify different A/P positional values over time to produce the spinal column.
The capacity to produce signals that generate anterior structure is lost.

Notochord and somite development in chick
In the chick, mesoderm forms anterior to the regressing node of the primitive streak.
Pre-somitic mesoderm is the region between the last formed somite and the regressing node.
This region will become 4 or 5 somites which form simultaneously as pairs on either side of the notochord.
The position of somites along A/P axis determines fate.
Anterior somites form cervical vertebrae.
Posterior ones form ribbed thoracic vertebrae.
Develop in a temporal and spatial order.
Rearranging pre-somitic mesoderm will not change the pre-established timing.
The pattern is laid down earlier by an A/P axis signal.

Zebrafish spinal cord is distant from organizer
Ectoderm that gives rise to spinal cord is far from the organizer in zebrafish.
This is induced in a two-step manner:
1) FGF from the ventral-vegetal region induces neurectoderm, then
2) BMP promotes posterior neural tissue formation.

Neural plate is induced by signaling mechanism(s)
Nervous system can be patterned by signals from the mesoderm
When mesoderm from a newt neurula is transplanted into younger newt embryos, anterior explants induce head and brain.
Posterior explants induce trunk & spinal cord.
Neural plate explants induce specific neural structures (depending upon position) when transplanted beneath the ectoderm of a gastrula.

Model: The two signal model of neural patterning
Signal 1 from the mesoderm induces ectoderm to become anterior neural tissue. (chordin and noggin are good candidates)
Signal 2 turns part of this into posterior neural tissue in a graded manner (FGF, Wnts & retinoic acid are candidates).
Mouse and chick grafts of the primitive streak (i.e., the node or Hensen's node) can also induce neural tissue.
This model differs from another model which suggests that there may exist a number of region specific inducer molecules.
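
A minimal sketch of the two-signal model, under stated assumptions: signal 2 is treated as a linear gradient that is highest at the posterior end, and a single threshold (`posteriorizing_threshold`) converts anterior neural tissue into posterior neural tissue. The function name, the gradient shape, and the threshold value are all hypothetical.

```python
def neural_pattern(n_cells=10, signal1=True, posteriorizing_threshold=0.5):
    """Assign a fate to each cell along the A/P axis (index 0 = anterior)."""
    fates = []
    for i in range(n_cells):
        if not signal1:
            # Without signal 1 the ectoderm is not neuralized at all.
            fates.append("uninduced ectoderm")
            continue
        # Signal 1 alone gives anterior neural tissue everywhere.
        fate = "anterior neural"
        # Signal 2: assumed linear gradient from 0 (anterior) to 1 (posterior).
        signal2 = i / (n_cells - 1)
        if signal2 >= posteriorizing_threshold:
            fate = "posterior neural"
        fates.append(fate)
    return fates


pattern = neural_pattern()
print(pattern)
```

Adding further thresholds on the same gradient would subdivide the posterior region more finely, which is one way to read "in a graded manner".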

Neural plate signals travel within the neural plate
Mesoderm does not have to lie in contact with ectoderm to induce it.
In newt and Xenopus embryos under high salt, the mesoderm does not enter the embryo but develops outside.
This physically separates the mesoderm from the ectoderm.
This abnormal embryo is called an exogastrula.
N-CAM (a neural cell-cell adhesion protein), neurogenic factors, and other neural-specific proteins can be expressed in the ectoderm of an exogastrula in the correct A/P order, which suggests that the inducing signal can travel a relatively long distance through the tissues.

Neural crest cells
Neural crest cells migrate away from the neural tube to develop into:
1) skull (bone)
2) sensory and autonomic nervous systems
3) pigment cells
Somites are formed after gastrulation along the antero-posterior axis.

Mesoderm and homeobox genes
Homeobox genes are a large family of transcription factor genes.
They share a similar 60 amino acid DNA-binding homeodomain, which is encoded by a 180 base pair homeobox sequence.
Homeobox gene family (transcription factor proteins).
Homeotic transformation is often observed in mutants of genes that have this domain.
Identified first in Drosophila (Bithorax and Antennapedia complexes) as a split cluster.
There are four separate clusters of Hox genes (subset of the homeobox genes) in vertebrates.
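
The 60 amino acid homeodomain and the 180 base pair homeobox are related by the triplet genetic code (three bases per amino acid). A short check, with a hypothetical helper name:

```python
CODON_LENGTH = 3  # bases per amino acid in the triplet genetic code


def coding_length_bp(n_amino_acids: int) -> int:
    """Number of base pairs needed to encode a peptide of the given length."""
    return n_amino_acids * CODON_LENGTH


homeobox_bp = coding_length_bp(60)  # homeodomain length from the notes above
print(homeobox_bp)
```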

Hox gene clusters
Hox genes (organized in Hox gene clusters) are a subset of the homeobox family of transcription factor genes.
Might have arisen by rounds of duplication of an ancestral gene, followed by a quadruplication of the cluster in mammals.
Paralogous groups are composed of the most similar members of each cluster.
Partially overlapping zones of expression which vary in the anterior extent of their expression define distinct regions.
Various genes respond to the combination of gene products expressed.
Most homeobox genes are not Hox genes (e.g. Pax genes).

Hox genes pattern the A/P axis
The differences between vertebrae (e.g. the anterior ones attach to the skull, followed by cervical, thoracic (which have ribs), lumbar, sacral and caudal vertebrae) clearly demonstrate that the identity of somites differs along the A/P axis.
Hox genes are expressed along the A/P axis in mouse.
First, anterior Hox genes are expressed in early gastrulation as mesoderm begins to leave the primitive streak.
More posterior Hox genes turn on as development continues.
Defined patterns of Hox gene expression are seen in:
1) mesoderm (after somite formation) &
2) neural tube (neurulation).

Hox genes pattern the somites
Hox genes show a sharp anterior border and a much less defined posterior border.
There is a lot of overlap, but (almost) every region has a distinct set of expressed Hox genes.
Most anterior somites express Hoxa1 and Hoxb1 only.
Posterior regions express all Hox genes.
The anterior head, forebrain and midbrain do not express Hox genes but express other homeobox genes (Emx & Otx).
For example, the Hoxa complex has members that are expressed very differently.

Hox gene expression is co-linear
Hoxa1 has its most anterior expression in the posterior head.
Hoxa11 has its most anterior expression in the sacral (lower back) region.
Hox gene expression is co-linear: the order of genes on the chromosome (within each cluster) reflects the order of spatial and temporal expression along the A/P axis.
Hox gene expression is conserved between mouse and chick.
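
Co-linearity and the overlapping expression zones can be sketched with a toy cluster. The gene names, the segment indices, and the anterior limits below are hypothetical and chosen only to show the logic: each gene is expressed from its sharp anterior limit backwards, so posterior segments express more genes.

```python
# Hypothetical cluster, listed in chromosomal order.
CLUSTER = ["hox1", "hox2", "hox3", "hox4"]

# Hypothetical anterior expression limits as segment indices (0 = most anterior).
ANTERIOR_LIMIT = {"hox1": 1, "hox2": 2, "hox3": 3, "hox4": 4}


def hox_code(segment: int) -> list:
    """Genes expressed in a segment: those whose anterior limit lies at or in front of it."""
    return [gene for gene in CLUSTER if ANTERIOR_LIMIT[gene] <= segment]


def is_colinear(cluster: list, limits: dict) -> bool:
    """True if chromosomal order matches the order of anterior limits along the A/P axis."""
    boundaries = [limits[gene] for gene in cluster]
    return boundaries == sorted(boundaries)


for segment in range(6):
    print(segment, hox_code(segment))
print(is_colinear(CLUSTER, ANTERIOR_LIMIT))
```

In this toy, segment 0 expresses no Hox gene (like the anterior head), the first Hox-expressing segment expresses a single gene, and the most posterior segments express all of them.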

Mesoderm becomes notochord and somites
The fate of somites depends upon signals from adjacent tissues.
In chick-quail trans-species grafts (with distinctive nuclei), somite fate maps have been constructed.
The dermamyotome is the dorsal & lateral part of the somite (expresses Pax3) and becomes the myotome (forms muscle cells) and the dermatome (an epithelial sheet that forms dermis).
The medial region of the somite (expresses MyoD) forms axial & back muscles.
The lateral region of the somite forms abdominal & limb muscles.
The ventral medial region of the somite contains sclerotome cells (future cartilage, which express Pax1) that migrate to surround the notochord and form the vertebrae.
Notochord induces sclerotome cells.

From notochord transplantation experiments, an additional notochord induces unsegmented pre-somitic mesoderm to produce a greatly increased amount of cartilage.
In the mouse, FGF and retinoic acid gradients help pattern the A/P axis.

Neural tube (ventral side: the floor plate) induces cartilage.
Lateral plate mesoderm and the ectoderm induce the dermamyotome.
Signals that may pattern the somites are secreted signaling proteins that may include:
Sonic hedgehog which may specify the ventral somites.
BMP-4 which may specify the lateral somites.
Wnt family proteins which may specify the dorsal somites.

Regulation of the Pax homeobox genes (transcription factors)
Pax genes are regulated by signals from the notochord and neural tube to control the somitic cell fate.
Pax3 is expressed early in all cells that will form somites.
Pax3 is modulated by BMP-4 and Wnt to confine it to muscle precursors.
Pax3 is further down regulated in back muscle precursors but remains active in future limb muscle cells.
In mice, Splotch (Pax3-minus) mutants lack limb muscles.

Altering Hox gene expression alters axial patterning
In mice, gene knock-out experiments produce mutants.
There is redundancy, where a missing gene can be at least partially compensated for by the expression of related genes.
Paralogous genes from another Hox complex may compensate for gene loss.
Posterior prevalence: a mutation mainly affects the anterior part of the gene's expression domain, where more posterior Hox genes are not expressed.
Homeotic transformations (conversion of one body part to another) result from Hox gene loss.
Loss leads to cells assuming a "more anterior value", e.g. Hoxc8 mutant mice have extra ribs.
Abnormal expression of Hox genes in anterior regions leads to tissues becoming more like posteriorly positioned tissues.
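
Loss of function, ectopic expression, and posterior prevalence can be illustrated with a toy rule in which the most posterior Hox gene expressed in a segment sets its identity. This is a deliberate simplification: the gene names, anterior limits, and the single-gene identity rule are assumptions for the example, and redundancy between paralogs is ignored.

```python
# Hypothetical cluster in chromosomal order, with hypothetical anterior limits.
CLUSTER = ["hox1", "hox2", "hox3", "hox4"]
ANTERIOR_LIMIT = {"hox1": 1, "hox2": 2, "hox3": 3, "hox4": 4}


def identity(segment: int, knocked_out=frozenset(), ectopic=frozenset()) -> str:
    """Identity = most posterior gene expressed in the segment (toy posterior prevalence)."""
    expressed = [
        gene
        for gene in CLUSTER
        if gene not in knocked_out
        and (ANTERIOR_LIMIT[gene] <= segment or gene in ectopic)
    ]
    return expressed[-1] if expressed else "no Hox gene"


print(identity(3))                          # wild type
print(identity(3, knocked_out={"hox3"}))    # takes a more anterior value
print(identity(4, knocked_out={"hox3"}))    # unchanged: hox4 still prevails
print(identity(1, ectopic={"hox3"}))        # anterior segment becomes more posterior
```

In the toy, removing a gene changes only the segment at the anterior end of its domain, because further back a more posterior gene still determines identity.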

Retinoic acid can alter positional value
Retinoic acid is a derivative of vitamin A.
It has a very important signaling role in vertebrate development.
In early development, retinoic acid can cause homeotic transformation of the vertebrae.
It can diffuse across plasma membranes to bind protein receptors and form an active transcription factor.
Retinoic acid interferes with the normal expression of Hox genes.
Later, it can alter positional values during limb development.

Hindbrain rhombomeres restrict cell lineage
Posterior head & hindbrain development requires regionalisation and segmentation of the anterior neural tube.
Segmentation events in the 3 day chick embryo's posterior head include:
1) somite formation from mesoderm on either side of the notochord,
2) the hindbrain (rhombencephalon) is divided into 8 rhombomeres, and
3) the lateral mesoderm forms the branchial arches.
(Note: spinal cord is segmented into dorsal root ganglia and ventral motor nerves by the somites)

Development of posterior head involves interactions
Neural crest cells innervate the face & neck to form the segmental cranial nerves.
Neural crest cells also give rise to peripheral nerves and bones, including the jaw (from the first branchial arch) and the bony parts of the ear (from the second arch).
Eight rhombomeres form by constricting the freshly closed neural tube into eight evenly spaced sections.
Lineage restriction occurs with cells and their descendants remaining in their rhombomere.
Cell movement restriction depends on adhesive properties, which depend upon ephrins and their receptors.
Within each rhombomere, the cells are under the control of the same genes & act as a developmental unit.

Neural crest cells have positional values
Chick neural crest cells can be labeled & their fate mapped.
The cranial neural crest cells migrate out from the rhombomeres of the dorsal region of the hindbrain.
Branchial arch 1 is populated by cells from rhombomere 2,
Branchial arch 2 by rhombomere 4 cells &
Branchial arch 3 by rhombomere 6 cells.
Neural crest cells from rhombomeres 3 & 5 die by apoptosis (programmed cell death).
Transplantation of rhombomere 2 cells to where rhombomere 4 cells should be, results in formation of a second jaw.
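
The fate map and the graft result can be summarized as a lookup in which crest cells keep the positional value of their rhombomere of origin. The function names are hypothetical, and only the rhombomeres and structures mentioned in these notes are included.

```python
ARCH_BY_RHOMBOMERE = {2: 1, 4: 2, 6: 3}   # rhombomere of origin -> branchial arch
APOPTOTIC_RHOMBOMERES = {3, 5}            # crest from these rhombomeres dies
ARCH_STRUCTURE = {1: "jaw", 2: "bony parts of the ear"}


def crest_fate(rhombomere_of_origin: int) -> str:
    """Fate of cranial neural crest cells, keyed by where they came from."""
    if rhombomere_of_origin in APOPTOTIC_RHOMBOMERES:
        return "apoptosis"
    arch = ARCH_BY_RHOMBOMERE.get(rhombomere_of_origin)
    if arch is None:
        return "not covered in these notes"
    return ARCH_STRUCTURE.get(arch, f"branchial arch {arch} derivatives")


def graft_outcome(donor_rhombomere: int, host_position: int) -> str:
    """Grafted crest behaves according to its origin, not its new position."""
    return crest_fate(donor_rhombomere)


print(graft_outcome(donor_rhombomere=2, host_position=4))
```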


Characterization of a set of genes with 5'UTR introns

To investigate the functional properties of human 5UIs, we used NCBI's Reference Sequence (RefSeq) collection. These are curated, full-length sequences with annotated UTR boundaries, and expression data are available for many of them. The lack of a translation reading frame makes the computational prediction of splice sites in 5'UTRs inherently more difficult [37], necessitating the choice of such a validated set. In humans, approximately 8.5k (35%) out of 24.5k RefSeq mRNAs contained at least one intron in their 5'UTR (Additional file 1). Previous estimates of the percentage of genes with 5UIs ranged between 22% and 26% [18] and 38% [19] in humans, suggesting that the RefSeq collection had no major bias in terms of presence or absence of 5UIs compared to other previously used datasets. The distribution of total 5'UTR intronic length for genes in our dataset was also similar to that observed previously (Figure 1a). The inter-quartile range of total length of 5UIs within each gene was approximately 1.3 - 16 kb. Some 5UIs were extremely long -- 16% were longer than 27 kb, the length of the average protein coding gene in the human genome [38], and 5% were longer than 76 kb (Figure 1a). As previously reported [18, 19], most genes had few 5UIs. More than 90% had a single intron, and the percentage of genes with two or more introns decreased exponentially (Figure 1b).
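
As an illustration of how a total 5UI length could be derived from an annotated transcript, the sketch below takes exon coordinates and the position of the first coding base and sums the introns that lie entirely upstream of the start codon. The function names, the half-open coordinate convention, and the restriction to plus-strand transcripts are assumptions of this example, not a description of the pipeline used for the analysis.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) genomic coordinates, plus strand


def utr5_introns(exons: List[Interval], cds_start: int) -> List[Interval]:
    """Introns that end at or before the first coding base (cds_start)."""
    exons = sorted(exons)
    introns = []
    for (_, upstream_end), (downstream_start, _) in zip(exons, exons[1:]):
        # The intron spans [upstream_end, downstream_start); it is a 5'UTR
        # intron if the exon that follows it starts no later than the CDS.
        if downstream_start <= cds_start:
            introns.append((upstream_end, downstream_start))
    return introns


def total_length(intervals: List[Interval]) -> int:
    return sum(end - start for start, end in intervals)


# Hypothetical transcript: three exons, translation starts inside exon 2.
exons = [(100, 200), (1500, 1600), (3000, 3300)]
introns = utr5_introns(exons, cds_start=1550)
print(introns, total_length(introns))
```

A minus-strand transcript would need its coordinates mirrored before applying the same test.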

Characterization of fundamental properties of 5'UTR introns. (a) Histogram of the total 5'UTR intron length. A well annotated set of RefSeq transcript IDs are used in this analysis and this histogram shows the distribution of the log10 of the total number of intronic nucleotides in the 5'UTR. (b) Distribution of the number of introns in the 5'UTR. The log10 of number of transcripts that have a given number of introns in their 5'UTR is shown. The number of transcripts with a given number of 5'UTR introns decreases exponentially. (c) Heat map depicting the relationship between total lengths of 5'UTR introns and 5'UTR exons. (d) Heat map depicting the relationship between total lengths of 5'UTR introns and non-5'UTR introns. In both heatmaps, darker shades of gray indicate more transcripts.

We next considered the relationship between the total lengths of 5'UTR exons and of 5UIs. Even though there was a correlation between the lengths of 5UIs and 5'UTR exons overall, this correlation was slight and was driven by the genes with the longest 5UIs (Figure 1c; Pearson correlation coefficient (PCC) = 0.21, P < 2.2e-16). In fact, when genes with 5UI lengths in the lowest 25th percentile were analyzed, the correlation was no longer significant (Figure 1c; PCC = -0.005, P = 0.84). A statistically significant, albeit slight, correlation was found for genes with 5UI length below the median (Figure 1c; PCC = 0.07, P = 8.4e-05). Among the genes with 5UIs, a similar relationship was evident between the total length of 5UIs and the total length of the remaining introns (Figure 1d). Although these two variables were significantly correlated (Figure 1d; PCC = 0.18, P < 2.2e-16), the relationship was clearly driven by the genes with longer 5UIs. When genes with 5UI lengths either in the lowest 25th or 50th percentile were considered, correlation was negligible (Figure 1d; PCC = -0.02 and 0.04, P = 0.53 and 0.04, respectively).
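
The comparison of an overall correlation with the correlation inside a length-restricted subset can be sketched as follows. The data are made up to show how a few long introns can drive an overall PCC; the helper names and the simple rank-based cutoff are assumptions of this example, and significance testing is not included.

```python
import math
from typing import List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def pearson_below_quantile(xs: List[float], ys: List[float], q: float) -> float:
    """PCC restricted to pairs whose x value is at or below the q-th quantile of x."""
    cutoff = sorted(xs)[max(0, math.ceil(q * len(xs)) - 1)]
    pairs = [(x, y) for x, y in zip(xs, ys) if x <= cutoff]
    sub_x, sub_y = zip(*pairs)
    return pearson(list(sub_x), list(sub_y))


# Made-up lengths: one very long intron paired with a long exon.
intron_len = [1, 2, 3, 4, 100]
exon_len = [2, 1, 2, 1, 50]
print(pearson(intron_len, exon_len))                      # close to 1
print(pearson_below_quantile(intron_len, exon_len, 0.8))  # negative
```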

Thus, genes with long 5UIs tend to have a high total intronic length and longer 5'UTR exons. While this tendency holds in genes with additional introns, several genes with total 5UI lengths greater than 10 kb lack any coding-region or 3'UTR introns (Figure 1d). On the other hand, amongst genes with short 5UIs, the total length of 5UIs is uncorrelated with the lengths of either 5'UTR exons or the remaining introns.

Gene expression analysis

We next examined gene expression-related predictions of the two principal models of intron evolution. Previous studies have suggested that the genes with the highest expression levels are selected to have shorter introns [23]. If a similar selective pressure were acting on 5UIs (in conjunction with neutral evolutionary processes [19]), one would expect a tendency towards reduced gene expression level as a function of increased 5UI length in a subset of genes. We therefore compared gene expression from 79 tissues as a function of the total 5'UTR intronic length. We divided 5UI-containing genes into three categories with respect to the total 5'UTR intronic length (short, 0 to 25%; intermediate, 25 to 75%; long, 75 to 100% in length). The short 5UI-containing genes were highly overrepresented in the top 1% of mean expression level for the genes with 5UIs (Fisher's exact test, P = 3.3e-15) and also in the top 5% (Fisher's exact test, P = 1.7e-14) (Figure 2a). These genes were 12.7 times more likely than all other genes with 5UIs to be in the highest 1% of mean expression and 3 times more likely to be in the highest 5% of mean expression. There was also a global trend for genes with short 5UIs to be expressed at a higher level compared to genes with longer 5UIs (25 to 100 percentile in length; one-sided Wilcoxon rank sum test, P = 2.98e-05; Figure 2a).
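
An overrepresentation test of this kind operates on a 2x2 table, for example short versus other 5UIs against top-expressed versus other genes. The sketch below computes an odds ratio and a one-sided Fisher's exact P value from the hypergeometric distribution. The function names and the example counts are hypothetical, and the text does not state whether the reported tests were one- or two-sided or whether "times more likely" refers to an odds ratio.

```python
from math import comb


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for the table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided P(X >= a) under the hypergeometric null for [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(n, col1)


# Hypothetical counts:        top-expressed   other
# short 5UI                         3            1
# intermediate or long 5UI          1            3
print(odds_ratio(3, 1, 1, 3))
print(fisher_exact_greater(3, 1, 1, 3))
```

For real analyses an established implementation would be preferable to this hand-rolled version, which is meant only to show what the test computes.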

Expression analysis as a function of total 5'UTR intron length. (a) Heat map of the mean expression level versus the total 5'UTR intron length. The shade of gray represents the number of transcripts in each bin with darker shades implying more transcripts. The overrepresentation of short 5'UTR-intron-containing genes among the highest expression levels is apparent. (b) Quantile-quantile plot of total 5'UTR intron length of short 5'UTR intron-containing genes divided into highly expressed (top 5%) and other genes. The most highly expressed genes tend to have shorter 5'UTR introns. (c) Smoothed histogram of the mean expression level with respect to presence/absence of 5'UTR intron and its length. A kernel density estimator was fitted to the expression data and the corresponding probability density is plotted as a function of the mean expression level. The black line corresponds to the probability density for transcripts without any 5'UTR introns. Genes with long 5'UTR introns are represented by the red line while genes with short 5'UTR introns are represented by the blue line. The vertical line represents the top 5% of mean expression level of all genes. (d) Total 5'UTR intron length of genes in different expression level categories. The width of the boxes represents the relative number of data points in each category. Transcripts in the top 1% and top 5% in expression level tend to have shorter 5'UTR introns.

The enrichment for high expression in genes with short 5UIs held even when genes with the longest 25% of 5UIs were removed. In this case, the genes with the highest 1% and 5% expression were, respectively, 9.5 times and 2.5 times more likely to have short 5UIs as opposed to intermediate length 5UIs (25 to 75 percentile in length; Fisher's exact test, P = 1.53e-11 and P = 3.21e-10, respectively).

The most highly expressed 5UI-bearing genes show a striking tendency to harbor short 5UIs. Of all 5UI-containing genes, 26% had a total 5UI length below 1.3 kb. By contrast, the corresponding fractions for genes in the top 5% and 1% by expression were 50% and 83%, respectively. We then separated short 5UI-containing genes into two groups: the most highly expressed genes (top 5% in expression) and the remaining genes. For the most highly expressed genes, the inter-quartile range of total 5UI length was 215 to 734 nucleotides compared with 289 to 870 nucleotides for the remaining genes (Figure 2b). Thus, the most highly expressed genes in humans are very strongly enriched for short 5UIs.

Interestingly, no expression dependence was observed among genes with intermediate or long 5UIs: genes with long 5UIs (top 25th percentile in length) did not tend to be expressed less than those with the intermediate length 5UIs (Wilcoxon rank sum test, P = 0.25). Also, no statistically significant depletion for the long 5UI category was observed in either the top 1% or the top 5% expression group (Fisher's exact test, P = 0.29, odds ratio = 0.25, and P = 0.017, odds ratio = 0.58, respectively). Thus, we did not observe the inverse relationship between expression and total 5UI length that might have been expected under the energetic cost model.

Next, we considered all RefSeq genes and asked whether having an intron in the 5'UTR has an effect on overall expression. We found no differences in 5UI representation in the top 1% or the top 5% of the mean expression groups. Furthermore, no difference was detected in the distribution of mean expression between genes with and without 5UIs (two-sided Wilcoxon rank sum test, P = 0.17). However, genes with short 5UIs were 1.8 times more likely to be in the top 5% and 3.3 times more likely to be in the top 1% in overall expression level than genes with no 5UIs (Fisher's Exact Test, P = 3.15e-08 and P = 7.57e-07, respectively; Figure 2c). Thus, the presence of short 5UIs is correlated with high mean expression.

The observed expression trends could reflect the influence of genomic features other than 5UIs. Yet, short 5UIs do not seem to predict a short total length of either non-5'UTR introns or 5'UTR exons (Figure 1c, d). Furthermore, when genes in the top 5% in mean expression were divided into two groups with respect to 5UI presence or absence, we observed no differences in total non-5'UTR intron length between genes with 5UIs and those that lack these introns (Wilcoxon rank sum test, P = 0.20, data not shown). Therefore, the tendency of highly expressed genes to have short 5UIs is unlikely to be confounded by the effects of 5'UTR exons or the remaining introns.

For genes with the highest expression levels, these results are in contrast to the neutral model of 5UI evolution, which predicts that 5'UTR intronic length should not depend on expression level. These results are also not explained by the energetic cost hypothesis, which would predict that genes with the highest expression levels should be less likely to have 5UIs. In stark contrast to the predictions of each model, we found the most highly expressed genes to be significantly enriched in short 5UIs. Furthermore, the energetic cost hypothesis would also predict a linear decrease in the total 5UI length as a function of increasing gene expression. Yet, we found no overall differences with respect to 5UI length except for the most highly expressed genes. Even though a neutral model of 5UI evolution is plausible for most genes, our results for the most highly expressed genes are inconsistent with both neutral and energetic cost models (Figure 2d).

We next used expression to assess the applicability to 5UIs of the other major hypothesis of intron evolution, the 'genome design model', which predicts that intermediate or long introns should be enriched in tissue-specific genes as a consequence of complex regulation. As originally outlined, the genome design model explicitly disregards 5UIs [27]; however, a direct corollary of this hypothesis is that genes with higher variance in expression across tissues should have intermediate or long introns in their 5'UTRs as well.

We sought to address two potential sources of bias. First, gene expression levels vary greatly and variance is strongly correlated with mean expression. Therefore, we calculated the standard deviation-to-mean ratio (coefficient of variation or CV) [39], a normalized measure of dispersion, for each gene across all tissues. Second, due to technological limitations of expression arrays, precise measurement of expression level is more difficult for genes with low or no expression in a given tissue; therefore, artificially high variance in expression might be observed for genes with low mean expression across all tissues. We therefore calculated a robust measure of dispersion that minimizes this effect:

$$\frac{CV_x - \mu_{1/2}(y_x)}{\mathrm{MAD}(y_x)}$$

where $CV_x$ is the CV of expression of gene x across all tissues, $y_x$ represents the vector of CV values for all 201 genes in a window centered around gene x, while $\mu_{1/2}$ and MAD represent the median and median absolute deviation, respectively. As expected, genes with low expression tended to have much more variability across tissues (Figure 3a). Based on the observed trend line, the genes with the lowest 25% expression were removed from further analysis (Figure 3a). The remaining genes were sorted into three categories with respect to the total intronic 5'UTR length as before (short, 0 to 25%; intermediate, 25 to 75%; long, 75 to 100%). We found no significant differences between these groups with respect to inter-tissue variability as measured by the coefficient of variation (Figure 3b; Kruskal-Wallis rank sum test, df = 2, P = 0.23). We then examined the lengths of the introns as a function of variability in expression (Figure 3c). The genes with the highest 5% variability across tissues did not differ from the other genes with respect to their 5UI lengths (Wilcoxon rank sum test, P = 0.07, 95% confidence interval between -0.008 and 0.25), but the genes with highest 1% across-tissue variability tended to have slightly shorter 5UIs (Wilcoxon rank sum test, P = 0.006, 95% confidence interval between -0.67 and -0.11). Genes with short 5UIs were also overrepresented in the top 1% across-tissue variability category (Fisher's Exact Test, P = 0.005, odds-ratio = 2.7). Our results suggested that length of the 5UI was not a major factor in determining across-tissue variability but there was a preference for shorter 5UIs in the most variable genes.
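
A windowed, median-based standardization of the CV can be sketched as below. Several points are assumptions of this example rather than statements from the text: that the measure has the form (CV of the gene minus the window median) divided by the window MAD, that genes are ordered by mean expression before the window is applied, that the window is truncated at the ends of the list, and that the population standard deviation is used for the CV. All function names are hypothetical.

```python
from statistics import mean, median, pstdev
from typing import List


def coefficient_of_variation(values: List[float]) -> float:
    """Standard deviation-to-mean ratio (population standard deviation assumed)."""
    return pstdev(values) / mean(values)


def mad(values: List[float]) -> float:
    """Median absolute deviation from the median."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


def robust_dispersion(cvs: List[float], i: int, window: int = 201) -> float:
    """Standardize cvs[i] against a window of neighbouring genes.

    `cvs` is assumed to be ordered by mean expression, so neighbours in the
    list are genes with similar expression levels.
    """
    half = window // 2
    neighbours = cvs[max(0, i - half): i + half + 1]
    return (cvs[i] - median(neighbours)) / mad(neighbours)


print(coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9]))
print(robust_dispersion([1.0, 2.0, 3.0, 4.0, 10.0], i=4, window=5))
```

Using the median and MAD instead of the mean and standard deviation keeps a handful of extreme genes in the window from distorting the scale.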

Analysis of variability in expression across tissues as a function of the total 5'UTR intron length. (a) Transcripts with low mean expression have higher normalized expression variability. A standardized measure of the variability in gene expression across tissues was calculated and plotted against the natural logarithm of mean expression level. The black vertical line represents the lowest 25th percentile in mean expression. Since transcripts with low levels of mean expression tend to exhibit an artificially high variability in expression, they are removed from further analysis. (b) Boxplot of the coefficient of variation (standard deviation-to-mean ratio) of genes grouped by the total length of 5'UTR intron. The width of the boxes represents the relative number of data points in each category. There are no apparent differences between the three groups. (c) Boxplot of log10 of total 5'UTR intron length of genes grouped by their across-tissue variability. Genes are divided into six categories depending on their coefficient of variation. Error bars correspond to standard deviation of the mean. No obvious dependence of expression variability on total 5UI length can be observed except for the most highly variable genes, which tend to have slightly shorter 5'UTR introns. (d) Boxplot of log10 of total 5'UTR intron length for gene groups defined by the number of tissues in which expression of each gene was detected. A gene was defined to have detectable expression in a given tissue if its expression was higher than the 25th percentile of mean expression of all genes. We found no differences in total 5'UTR intron length amongst the different gene groups. (e) Histogram of number of genes divided by the presence of 5'UTR introns and by the number of tissues in which expression was detected. The number of tissues in which expression was detected was independent of the presence of 5'UTR introns.

Although our approach reliably captures across-tissue variability in gene expression, it disregards any potential effects of 5UI presence or length on how widely a gene is expressed. To consider the potential impact of such effects, we calculated the number of tissues in which expression was detected for each gene. Based on our analysis presented in Figure 3a, we defined a given gene as 'present' in a given tissue if its expression was greater than the 25th percentile in the distribution of mean expression over all tissues, calculated for all genes. Genes were placed into one of five classes according to the number of tissues in which they were present. No significant difference was detected amongst the corresponding five distributions of total 5UI length (Figure 3d; Kruskal-Wallis rank sum test, df = 4, P = 0.19). Furthermore, the distribution of number of tissues in which each gene was present did not differ between genes containing and lacking 5UIs (Figure 3e). These results clearly contradict predictions of the 'genome design' hypothesis, in that narrowly expressed genes did not show a greater tendency to contain 5UIs nor did they tend to have longer 5UIs. These results strongly suggest that the evolution of 5UIs is not driven primarily by the selective pressures proposed by the 'genome design' hypothesis.
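
The 'present in a tissue' definition can be sketched as a threshold on the 25th percentile of per-gene mean expression, followed by a count per gene. The gene labels and expression values are made up, the function name is hypothetical, and the percentile is computed with the default method of Python's `statistics.quantiles`, which may differ from the method used in the analysis.

```python
from statistics import mean, quantiles
from typing import Dict, List


def tissues_present(expression: Dict[str, List[float]]) -> Dict[str, int]:
    """Count, per gene, the tissues in which expression exceeds the threshold.

    The threshold is the 25th percentile of the per-gene mean expression
    across all tissues, calculated over all genes.
    """
    gene_means = [mean(values) for values in expression.values()]
    threshold = quantiles(gene_means, n=4)[0]
    return {
        gene: sum(value > threshold for value in values)
        for gene, values in expression.items()
    }


# Made-up expression values for five genes in three tissues.
expression = {
    "geneA": [10, 10, 0],
    "geneB": [1, 1, 1],
    "geneC": [5, 0, 5],
    "geneD": [8, 8, 8],
    "geneE": [3, 3, 3],
}
print(tissues_present(expression))
```

Genes could then be binned by this count before comparing the 5UI length distributions of the bins.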

Functional enrichment of Gene Ontology categories

Under the neutral model, genes with 5UIs should be uniformly distributed across functional groups. We used Gene Ontology (GO) function annotations to determine which groups of genes are enriched or depleted in 5UIs, if any. Two popular functional trend analysis tools, FuncAssociate [40] and GoStat [41], were used for this analysis. One key challenge was the translation of the gene identifiers from RefSeq RNA IDs to those used in the GO database. There are different approaches to this problem and the two software packages differ from each other in this respect. FuncAssociate uses the Synergizer [42] software to resolve the problem of synonyms while GoStat uses definitions in the UniGene database as well as the information provided in the GO databases. Both software packages yielded very similar results, suggesting that our general conclusions were independent of the methods of synonym resolution or enrichment calculation.

A significant overrepresentation of genes with 5UIs was found in many regulatory pathways (Table 1). Non-receptor protein tyrosine kinases (NRTKs) formed the most highly overrepresented group, followed by genes involved in the regulation of actin organization, transcriptional regulators, and zinc ion binding proteins (Table 1). NRTKs lack transmembrane domains and therefore do not recognize extracellular ligands, unlike the majority of protein tyrosine kinases. Nevertheless, they play crucial roles in nearly all aspects of biology and are implicated in many cancers (reviewed in [43]). Among NRTKs, genes harboring 5UIs encode key regulatory kinases, such as the proto-oncogene tyrosine kinase SRC, c-src tyrosine kinase (CSK), janus kinases (JAK), spleen tyrosine kinase (SYK), tec protein tyrosine kinase (TEC), and Bruton agammaglobulinemia tyrosine kinase (BTK) among others.

To gain insight into the evolution of NRTK 5UIs, we identified orthologous genes in mouse and rat genomes corresponding to each human NRTK. We collected 5'UTR features for these genes in each genome using RefSeq annotations (Additional file 2). More widely studied organisms tend to have more accurate transcript structures and include many more splice variants in the RefSeq collection. For example, 18 human genes were represented by more than one transcript, while only four mouse and no rat NRTKs had more than one splice variant. The paucity of transcripts in some mammalian species is more likely to have arisen from limited testing rather than biology, given recent studies suggesting that alternative splicing is ubiquitous across several taxa [9].

UTRs are also generally less well defined in less intensively studied organisms. For example, ABL2, BTK, FRK and SRC all lack defined 5'UTR boundaries in the rat RefSeq collection, even though EST evidence suggests that SRC, BTK and ABL2 all have 5'UTR-containing transcripts (data not shown). Another current limitation is ambiguity in identifying the specific branch in which a given deletion or insertion event took place. Despite these shortcomings, a comparison of orthologs already provides insight into the dynamics of the evolution of 5UIs in NRTK genes.

When every ortholog of a given NRTK had at least one annotated 5UI, the lengths of those introns were generally highly correlated (Figure 4a). Given the number of different splice variants for each human gene, we used three different approaches to calculate the 5UI length for each gene. We either used the mean length of splice variants with non-zero 5UI lengths, or picked the variant with the longest 5UIs, or the one whose length was closest to its ortholog in either of the rat or mouse genomes. All three measures resulted in high correlation overall between 5UI lengths across species (PCC ranged between 89 and 91% for human-mouse and between 79 and 89% for human-rat comparisons; P < 0.0001 for all; Figure 4a). As expected from evolutionary distances, the highest correlation in 5UI lengths was observed between rat and mouse orthologs of NRTKs (PCC = 93%, P = 1.4e-07).
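
The three ways of collapsing several human splice variants into one 5UI length per gene can be sketched as a single helper. The labels HS_Mean, HS_Longest and HS_Closest follow the figure legend; the function name, the example lengths, and the choice to pick the closest variant only among those with non-zero 5UI length are assumptions of this example.

```python
from typing import Dict, List, Optional


def summarize_5ui_length(
    human_variant_lengths: List[int], ortholog_length: int
) -> Optional[Dict[str, float]]:
    """Summaries of total 5UI length across the splice variants of one human gene."""
    nonzero = [length for length in human_variant_lengths if length > 0]
    if not nonzero:
        return None  # no variant of this gene has a 5'UTR intron
    return {
        "HS_Mean": sum(nonzero) / len(nonzero),
        "HS_Longest": max(nonzero),
        "HS_Closest": min(nonzero, key=lambda length: abs(length - ortholog_length)),
    }


# Hypothetical gene: three human variants, one without a 5UI; mouse ortholog 1400 nt.
print(summarize_5ui_length([0, 500, 1500], ortholog_length=1400))
```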

Comparative genomics of 5'UTR introns within non-receptor tyrosine kinases. Several human NRTKs have multiple splice isoforms and for these we used three different methods for calculating total 5'UTR intron length: mean of 5'UTR intron length for isoforms with 5'UTR introns (HS_Mean); longest total 5'UTR intron length (HS_Longest); and 5'UTR intron length most similar to its ortholog in the genome of interest (HS_Closest). (a) Heatmap of length correlation (considering genes with non-zero 5'UTR intron lengths) was plotted for the specified comparisons. As expected from the evolutionary distances between the analyzed species, the highest correlation (93%) was observed between mouse and rat NRTKs. (b) For each mouse ortholog of a human NRTK, the heatmap depicts the changes in total 5'UTR intron length (color reflects log10 of total 5'UTR intron length). The histogram above the color scale summarizes the distribution of changes in 5'UTR intron length. A 5'UTR intron may be present in mouse but not in the compared species (light blue) or vice versa (dark blue). Comparisons require an annotated 5'UTR for each ortholog, and were therefore not possible in some cases (white). (c) Same as (b) but substituting 'rat' for 'mouse'. (d) Human genomic region containing the 5'UTR and first few coding exons (UCSC Genome Browser view). '7X Regulatory Potential', for which higher scores indicate a greater potential for harboring regulatory sequence elements, was calculated using alignments of seven mammalian genomes as previously described [44].

Despite a generally strong correlation in 5UI length among orthologs, some sets of orthologs had a widespread distribution of length changes. While the total 5UI length of FES changed by less than five nucleotides in all possible comparisons, rat PTK2 and mouse PTK2 5UIs differed by approximately 63.5 kb (Figure 4b, c). The length conservation observed for the FES 5UI is notably consistent with the high regulatory potential previously calculated for this 5UI [44] (Figure 4d). More broadly, introns containing regulatory regions might be expected to have high length conservation.

When each orthologous group of NRTKs was analyzed, we found variability with respect to presence/absence of 5UIs in some of these groups. For example, STYK1 and WEE1 both had 5UIs in humans, but not in mouse or rat (Figure 4b, c). In the case of human WEE1, two transcripts were identified in the human RefSeq collection - while one variant had a 512-nucleotide 5UI, the other variant lacked 5UIs entirely. This observation suggested the possibility that intron-containing variants might be present in mouse and rat without being represented in the RefSeq transcript collection. Indeed, we found EST evidence that rat WEE1 has a splice variant that includes a 5UI [GenBank:CK603528.1]. On the other hand, mouse FRK (Figure 4b) and rat TXK (Figure 4c) had 5UIs while their orthologs did not. We also observed several NRTKs having 5UIs in two of the species but not in the other one. For example, both human and mouse orthologs of LCK, BTK, CSK, TNK1, and YES1 had annotated 5UIs, while both human and rat orthologs of JAK3 and TEC had annotated 5UIs (Figure 4b, c). Our results suggest that NRTK 5UIs are frequently conserved, a conclusion that would be further strengthened should the apparent gain/loss events be attributable to incomplete transcript annotation.

The appearance of 5UIs in most human NRTKs (Table 1) suggested the potential for a common regulatory mechanism acting via shared motifs. To search for shared and conserved motifs in these introns, human NRTK 5UI sequences were located in human-to-mouse and human-to-rat genome alignments. For 37 out of 42 human NRTKs, more than 10% of the 5UIs could be aligned to both genomes; only these conserved fragments were used for motif finding. Overrepresented RNA and DNA motifs were sought in these aligned sequences using the PhyloGibbs software [45]. In our search for overrepresented RNA elements, we identified two complementary motifs, suggesting that the motif in these 5UIs is more likely to be relevant at the DNA level. A representative DNA motif (Figure 5a) with the highest log-posterior-probability was compared to the TRANSFAC v11.3 database of known transcription factor binding sites and to a list of conserved human predicted motifs [46] using the STAMP website [47] (Figure 5b, c). In both comparisons, the known binding site motif of the MAZ transcription factor was the most likely match. However, this does not rule out the possibility of this motif being the target of another DNA binding protein.

Characterization of an 8-nucleotide DNA motif in the 5'UTR of human NRTKs. (a) Representative motif and its reverse complement. (b) Comparison of the representative motif to the TRANSFAC v11.3 database of known transcription factor binding sites. (c) Comparison of the representative motif to a list of conserved human predicted motifs [46]. The STAMP website was used for the comparisons [47]. The default ungapped Smith-Waterman alignment was used, and the P-value was calculated using the methods of Sandelin and Wasserman [74].

Comparison between 5'UTR and 5'-proximal coding introns

5UIs are, by definition, the most 5'-proximal introns in their transcript. However, not all 5'-proximal introns need lie within the 5'UTR. We sought to understand whether the observed functional properties of 5UIs were shared with 5'-proximal coding region introns (5PCIs). Given that the median position of the first 5UI was approximately 130 nucleotides away from the transcription start site regardless of the number of 5UIs [19], we defined the genes without a 5UI but with a coding region intron within 150 nucleotides of the transcription start site as 5PCI-containing genes. This criterion resulted in 24% of 5UI-lacking genes having a coding region intron that was deemed to be a 5PCI.
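The 5PCI criterion above can be sketched as a simple classification rule. This is a minimal illustration under stated assumptions: the `Transcript` record, its field names, and the convention that intron offsets are counted in exonic nucleotides from the transcription start site are all hypothetical, and an intron sitting exactly at the UTR/coding boundary is treated here as a 5UI. Only the 150-nucleotide threshold comes from the text.

```python
from dataclasses import dataclass, field
from typing import List

MAX_5PCI_DISTANCE = 150  # nucleotides from the transcription start site (from the text)


@dataclass
class Transcript:
    """Hypothetical record; offsets count exonic nucleotides upstream of each intron."""
    name: str
    utr5_length: int
    intron_offsets: List[int] = field(default_factory=list)


def classify_early_intron(t: Transcript) -> str:
    """Label a transcript as '5UI', '5PCI', or 'neither'."""
    # Any intron inside the 5'UTR makes this a 5UI-containing transcript.
    if any(off <= t.utr5_length for off in t.intron_offsets):
        return "5UI"
    # No 5UI: look for a coding-region intron close to the start site.
    if any(off <= MAX_5PCI_DISTANCE for off in t.intron_offsets):
        return "5PCI"
    return "neither"


if __name__ == "__main__":
    examples = [
        Transcript("utr_intron", utr5_length=200, intron_offsets=[130, 400]),
        Transcript("early_coding_intron", utr5_length=50, intron_offsets=[120, 300]),
        Transcript("late_intron_only", utr5_length=50, intron_offsets=[400]),
    ]
    for t in examples:
        print(t.name, classify_early_intron(t))
```

Checking for a 5UI first mirrors the definition in the text, where 5PCI status is assigned only to genes that lack a 5UI.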

We next used GO annotations to compare the functional properties of 5UI-lacking genes with 5PCIs to those without 5PCIs. We observed the strongest enrichment of 5PCIs among genes in the following functional groups: MHC protein complex 1, cytosolic ribosome, hemoglobin complex, glutathione transferase activity, and transmembrane transporters (Additional file 3). This result contrasts with the observed enrichment of 5UIs in regulatory genes. The differences in the enrichment profiles suggest that distinct functional groups of genes prefer early introns in either the 5'UTR or the coding region, but not in both.

To assess the possible effect of 5' proximity on gene expression, we analyzed microarray data from the human gene expression atlas for 5UI-lacking genes. We found that genes with 5PCIs were more highly expressed on average (one-sided Wilcoxon rank sum test, P = 6e-08; Figure 6). We also observed a 2.3- and 3.7-fold enrichment for genes with 5PCIs among the most highly expressed top 5% and 1% of genes, respectively (Fisher's Exact Test, P = 4e-15 and P = 4e-09, respectively; Figure 6). The correlation between high expression and 5PCI presence was evident without any consideration of these introns' lengths. In contrast, no expression difference was observed between genes with or without 5UIs, on average, but short 5UIs were highly enriched among the most highly expressed genes (Figure 2c). These results suggest that early introns (both 5PCIs and 5UIs) are associated with the most highly expressed genes, but that this correlation is limited to short introns for 5UIs.
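The fold-enrichment and Fisher's exact test calculations mentioned above can be sketched with the Python standard library. The function names are hypothetical and the gene counts in the usage example are invented for illustration; they are not the counts behind the reported P-values, and the original analysis may have used different software. The Wilcoxon rank sum test is not covered here.

```python
from math import comb


def fold_enrichment(top_with: int, top_total: int,
                    all_with: int, all_total: int) -> float:
    """Ratio of the feature's frequency in the top set to its overall frequency."""
    return (top_with / top_total) / (all_with / all_total)


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: top-expressed genes / remaining genes.
    Columns: with feature / without feature.
    Returns the hypergeometric probability of seeing at least `a`
    feature-carrying genes in the top set.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, row1)
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Invented counts: 1000 genes, 240 with the feature, 50 in the top set,
    # of which 28 carry the feature.
    print(round(fold_enrichment(28, 50, 240, 1000), 2))  # 2.33
    print(fisher_exact_greater(28, 22, 212, 738))
```

The one-sided form is used because the question is specifically whether the feature is over-represented among the most highly expressed genes.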

The effect of 5'-proximal coding intron presence on gene expression. (a) Smoothed histogram of the mean expression level with respect to presence/absence of 5'-proximal coding region introns (5PCIs). A kernel density estimator was fitted to the expression data and the corresponding probability density is plotted as a function of the mean expression level. The black line corresponds to the probability density for transcripts without any 5'UTR introns or any 5PCIs. The red line represents the probability density for 5'UTR intronless transcripts that have 5PCIs. The vertical line represents the top 5% of mean expression level of all genes without 5'UTR introns.


Food security is threatened by both the growing human population, estimated to reach around 9.3 billion by the year 2050, and the loss of crops due to climate change and soil deterioration [1, 2]. Seeds are central to crop production, human nutrition, and food security [3, 4]; they contain the full genetic complement of the plant, allowing it to survive even prolonged periods of stress [5, 6]. It is therefore important to collect and preserve the germplasm of commercial species as well as their wild relatives, which have survived several climate changes and are valuable sources of genetic information that could be useful in developing crop breeding strategies to address current and future agricultural challenges [1, 3, 4].

Orthodox seeds are able to survive the removal of most of their cellular water and can be stored in the dry state for long periods of time. Desiccation tolerance and maintenance of the quiescent state of seeds are associated with a wide range of systems related to cell protection, detoxification, and repair [6, 7]. The presence of particular proteins such as the late embryogenesis abundant (LEA) proteins, heat shock proteins (HSPs), and seed storage proteins (SSPs) confers desiccation tolerance on seeds, allowing them to survive in the dry state and to preserve their ability to germinate and propagate after long-term storage [8, 9].

LEA proteins are suggested to play an important role in seed desiccation tolerance [10]; they are known to stabilize membranes against the deleterious effects of drying. Furthermore, LEAs are able to prevent protein aggregation during freezing and drying, and to interact with and stabilize liposomes in the dry state [11]. Some LEAs can stabilize sugar glasses [12], suggesting that they play a role in longevity, which is a crucial factor for the conservation of genetic resources and for ensuring proper seedling establishment and crop yield [13]. SSPs, on the other hand, are a major source of dietary protein for human nutrition. Beyond serving as a nutrient reservoir, SSPs may play specific functions during seed formation [7, 14] and could have a key role in seed longevity [15]. SSPs also play a fundamental role in germination and seedling growth [16]. Owing to their abundance and high propensity to oxidation, SSPs are considered a powerful reactive oxygen species (ROS) scavenging system that could protect cellular components that are important for embryo survival [17, 18].

Amaranth is a crop that had great importance for the Aztec, Mayan, and Inca cultures. However, the Spaniards prohibited its cultivation because of its link with pagan ceremonies [19]. During the past two decades, reports on the nutritional and nutraceutical characteristics of amaranth have increased, leading to a new era in the history of amaranth cultivation [20]. The importance of amaranth as a crop for human nutrition is due to the high quality of its proteins. Amaranth seed proteins contain an adequate balance of essential amino acids [21], with values close to human nutritional requirements, and are particularly rich in lysine and methionine, which are deficient in cereals and legumes, respectively [20, 22]. Furthermore, the content of prolamins, the SSP fraction responsible for the manifestation of celiac disease, is negligible or practically null [23]. The genus Amaranthus consists of about 70 species distributed in very diverse habitats in terms of climatic conditions and geographical location [24, 25]. Of these, only three species, A. caudatus, A. cruentus, and A. hypochondriacus, are cultivated as grain amaranths for human consumption, the last two being native to Mexico [26]. The most probable ancestors or wild relatives of these species are A. powellii and A. hybridus, which grow under harsh conditions throughout the Mexican territory. The wide natural variation in amaranth offers the opportunity to identify markers that could be important for the nutrition, protection, and longevity of seeds, which would result in the development of high-productivity cultivars.

The aim of this study was to characterize the morphological and molecular traits of seeds from the wild species A. powellii and A. hybridus and to compare them with those of the cultivated amaranth species A. hypochondriacus and A. cruentus. Phenotypic analysis of the seeds was carried out by microscopy, and molecular characterization was carried out using proteomics tools (1-DE and nLC-MS/MS) as well as in silico analyses.
