Lysozyme amino acid sequence: N-terminal extension

I looked up the amino acid sequence of lysozyme here:

Then I cross-referenced that with the lysozyme sequence on UniProt:

My question is why does the sequence on UniProt start with eighteen other amino acids before getting to the start of the sequence described in the first source?

The sequence on UniProt starts with "MRSLLILVLCFLPLAALG". I know the M comes from the start codon in the gene for the protein, but where does the rest of this come from?

If you scroll down through the UniProt entry, you will come to a section with the heading PTM/Processing.

From this you can see that the first 18 amino acids are a signal peptide.

You can learn more about signal peptides from any good introductory textbook covering cell biology; alternatively, Khan Academy is a good source for learning the basics.
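As a rough illustration of what the signal peptide annotation means for the sequence, the sketch below splits a precursor into its signal peptide and mature chain. The signal peptide string is the one quoted in the question. The mature chain is only a placeholder of 129 `X` residues (an assumption based on the commonly cited length of mature hen egg-white lysozyme, not the real sequence), and the function name `split_precursor` is made up for this example.

```python
SIGNAL_PEPTIDE = "MRSLLILVLCFLPLAALG"  # first 18 residues of the UniProt entry, as quoted in the question
MATURE_PLACEHOLDER = "X" * 129  # assumption: stand-in for the mature chain, NOT the real sequence


def split_precursor(precursor: str, signal_length: int) -> tuple[str, str]:
    """Return (signal peptide, mature chain) for a precursor sequence."""
    if not 0 <= signal_length <= len(precursor):
        raise ValueError("signal_length must lie within the precursor")
    return precursor[:signal_length], precursor[signal_length:]


precursor = SIGNAL_PEPTIDE + MATURE_PLACEHOLDER
signal, mature = split_precursor(precursor, len(SIGNAL_PEPTIDE))
print(len(precursor), len(signal), len(mature))
```

Under these assumptions the precursor is 18 residues longer than the mature protein, which is the discrepancy the question describes.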


Signal sequences play a central role in the initial membrane translocation of secretory proteins. Their functions depend on factors such as hydrophobicity and conformation of the signal sequences themselves. However, some characteristics of mature proteins, especially those of the N-terminal region, might also affect the function of the signal sequences. To examine this possibility, several mutants of human lysozyme modified in the N-terminal region of the mature protein were constructed, and their secretion in yeast as well as in vitro translocation into canine pancreatic microsomes were analyzed using an idealized signal sequence L8 (MR(L)8PLAALG). Our results show the following. (1) Change in the charge at the N-terminal residue of the mature protein does not affect secretion drastically. (2) Substitution of a proline residue at the N terminus prevents cleavage of the signal sequence, although translocation itself is not impaired. (3) Excessive positive charges in the N-terminal region delay translocation of the precursor protein across the membrane. (4) Polar and negatively charged residues introduced into the N-terminal region affect the secretion of the mature protein by preventing its correct folding.
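The shorthand `MR(L)8PLAALG` can be expanded mechanically. The sketch below assumes that `(L)8` means eight consecutive leucine residues, which is the usual reading of this notation but is not spelled out in the abstract; the helper name `expand_repeat_notation` is hypothetical.

```python
import re


def expand_repeat_notation(pattern: str) -> str:
    """Expand '(X)n' groups, e.g. '(L)8' becomes 'LLLLLLLL'."""
    return re.sub(
        r"\(([A-Z]+)\)(\d+)",
        lambda m: m.group(1) * int(m.group(2)),
        pattern,
    )


l8_signal = expand_repeat_notation("MR(L)8PLAALG")
print(l8_signal, len(l8_signal), l8_signal.count("L"))
```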



Norovirus is a single-stranded RNA (+) virus belonging to the family Caliciviridae. It is transmitted orally by infected people or contaminated food. It causes severe vomiting, diarrhea, and fever 24–48 h after infection [1]. The main foods associated with norovirus gastroenteritis are oysters and other bivalves, although recently, numerous outbreaks of norovirus caused by unheated food products, such as salads and ready-to-eat food, have been reported [2, 3]. Every year, norovirus is responsible for 64,000 episodes of diarrhea requiring hospitalization, and up to 200,000 deaths of children < 5 years of age in developing countries [4].

Although many norovirus-inactivating methods have been reported, including thermal treatment, ultraviolet irradiation, high hydrostatic pressure, hypochlorous acid, and the use of food-derived components [1, 5–7], these methods are suboptimal because they affect the taste and color of food products. Consequently, the development of an anti-norovirus disinfectant agent is an important issue for food hygiene.

Lysozyme is a single-chain polypeptide consisting of 129 amino acids [8]. It catalyzes the hydrolysis of the peptidoglycan of Gram-positive bacteria. It is found in secretions such as tears and saliva, and in egg white [8]. In addition, lysozyme is extracted on an industrial scale from chicken egg white and is widely used as a food additive or as a raw material for pharmaceuticals.

We have reported that thermally denatured lysozyme (DL) inactivates norovirus [9]. The particle size of murine norovirus strain 1 (MNV-1), a surrogate for norovirus, shows an average expansion of 16.37 nm after exposure to DL for 1 h, and the N-terminal region of lysozyme possibly contributes to the inactivating effect [9]. Furthermore, DL inactivates norovirus in several foods and is also effective against the hepatitis A virus [10–12]. However, the conditions for optimal DL norovirus-inactivating effects and the underlying mechanism remain unclear.

This study aimed to analyze the norovirus-inactivating conditions and mechanisms of DL. We evaluated the conditions under which DL is highly effective against norovirus, together with the changes in gene expression in host cells infected with DL-treated MNV-1. We also analyzed the involvement of specific lysozyme domains in the antiviral effect. The data suggest that residues 5–39 of lysozyme contribute to the antiviral effect of DL. These observations will inform the use of DL as an anti-norovirus disinfectant for foods.


Evolution of Caenorhabditis lysozymes

Caenorhabditis nematodes are among the organisms with the highest number and the most extreme diversity of lysozyme genes. Their lysozymes fall into three distinct clades, one being part of the invertebrate-type and the other two of the evolutionarily very distant protist-type lysozymes. Moreover, the Cel-lys-9 gene from C. elegans, which undoubtedly belongs to the protist-type lysozymes (Fig. 2), shows only limited similarities to the other nematode genes and it may thus represent a class of its own. To date, it is impossible to say whether the invertebrate-type and the protist-type lysozymes evolved from a common ancestor or not. In the latter case, their general similarity as lysozymes would be a consequence of convergent evolution towards a similar function in defence or digestion. Additional data from more basal nematode as well as metazoan taxa (e.g. cnidarians, poriferans, platyhelminths) are required to distinguish between these alternatives.

Some of the Caenorhabditis lysozyme genes are found in clusters within the genome, as known for about one fifth of the protein-coding genes of C. elegans and apparently characteristic for genes involved in interactions with the environment [26]. Thus, lysozymes may be subject to similar evolutionary dynamics recently described for several of the clustered gene families [27]. These clustered gene families are most likely shaped by concerted molecular evolution. They are characterized by species-specific clades of the gene clusters, the presence of inverted genes that have been proposed to stabilize concerted evolution of clusters over time, and strong purifying selection [27]. However, the inferred evolutionary history of lysozyme clearly contrasts with such patterns. Genes in close genomic proximity do not form species-specific phylogenetic clades. None of the genomic lysozyme clusters contain "stabilizing" genes with inverted orientation in the middle of the cluster. Furthermore, although the majority of genes appears to be subject to purifying selection, we did obtain a strong indication for several episodes of diversifying selection.

We conclude that the lysozymes follow a different evolutionary trajectory. Our analysis reveals three main patterns.

(i) Gene duplication prior to species separation and maintenance of the duplicated genes. This scenario is most evident where lysozyme orthologues are monophyletic and distributed in synteny across genomes in all three taxa, e.g. the protist-type lys-1, lys-2, and lys-3 genes. Other likely cases are the protist-type lys-6, lys-8, lys-10, and the invertebrate-type ilys-4 and ilys-5 genes, for which corresponding orthologues fall into monophyletic clades. In all these cases, the orthologous genes must have an age of at least three million years, which is the minimum time since the most recent common ancestor of the three Caenorhabditis species [28]. Their maintenance across time suggests an important conserved biological role for each group of orthologues. In this case, their original divergence after gene duplication may have been favoured by diversifying selection and thus may be associated with signatures of adaptive sequence evolution. Such a signature is indeed found for the clade 1 protist-type lysozymes (including lys-1 to lys-3, lys-8, and orthologues).

(ii) Recent gene duplication and diversification. Phylogenetic analysis revealed five cases of lineage-specific duplication events (Figs. 4, 7, and 8). One of these cases (Cre-ilys-4.1 and Cre-ilys-4.2) is associated with a significant signature of adaptive sequence evolution, suggesting that diversifying selection favoured lysozyme differentiation upon duplication. The other four cases (Cre-lys-8.1 and Cre-lys-8.2; Cbr-lys-6.1 and Cbr-lys-6.2; Cel-lys-5 and Cel-lys-6; Cel-lys-7 and Cel-lys-8) appear to be subject to purifying selection. This pattern indicates strong selection for maintenance of gene function after the duplication event.

(iii) Gene duplication prior to species separation followed by differential gene loss. This scenario appears to apply to the Cel-ilys-1, Cel-ilys-2, Cel-ilys-3, and Cel-lys-4 genes, which are each present in only one of the species and diverge from internal nodes, some of them along long branches indicative of old evolutionary age. Loss of genes after duplication events in the other Caenorhabditis species may then suggest redundant functions of lysozymes in these taxa. As above under (i), their original diversification may have been driven by diversifying selection. Indeed, two episodes of adaptive sequence evolution were found to associate with these genes (Figs. 4A, 8A).

Phylogenetic inferences can only yield an approximation of the past and thus come with some uncertainty. Considering that the inferred relationships are generally supported by high bootstrap values and that they are based on the maximum likelihood approach, which was shown in the past to be less susceptible to biases (e.g. long-branch attraction) than other tree reconstruction methods [29], our results should provide a realistic image of Caenorhabditis lysozyme evolution. Taken together, their lysozyme repertoire is shaped by both ancestral and recent gene duplications. Sequence evolution is to a large extent determined by purifying selection. Yet, it also includes several episodes of diversifying selection, which associate with ancient as well as recent duplications. To our knowledge, similar evolutionary dynamics have not as yet been inferred for the lysozymes from other taxa.

It is worth noting that we did not find an indication for adaptive sequence evolution between the two main protist-type clades (Fig. 6, Table 4). Two explanations are conceivable. On the one hand, differentiation of the two clades was not subject to diversifying selection. On the other hand, diversifying selection was important but could not be detected due to a lack of power of the analysis, which had to be based on a reduced data set including only the conserved sequence regions that could be reliably aligned across the different genes and taxa. At the same time, this specific result (as well as all other cases of comparatively long branches with dN/dS rate ratios below 1) strongly suggests that our analysis is not compromised by a possible saturation of synonymous substitutions along long branches, which could have led to underestimated dS rates and thus artificially high dN/dS rate ratios. It is also worth noting that only a single alignment site was inferred to be under positive selection in our analyses. This is unusual because in most immunity gene data sets associated with adaptive sequence evolution a larger number of positively selected sites is identified, e.g. in MHC class I receptors [30,31]. A possible reason is that the different evolutionary lineages vary as to the position of the positively selected sites or that only a few lineages are subject to positive selection on specific sites. In both cases, the method employed would hinder detection of these positively selected sites because it assumes the same pattern of selection across all lineages [25]. We did not attempt to perform an analysis in which dN/dS ratios are allowed to vary simultaneously across sites and lineages, because these types of analyses may be liable to higher error rates [32,33]. The single site that we identified to be under positive selection is thus predicted to be of major, albeit currently unknown, functional importance.
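The saturation argument can be made concrete with a small sketch. It uses the conventional reading of the dN/dS rate ratio (below 1 suggests purifying selection, above 1 diversifying selection); the function names and all numbers are hypothetical and are not taken from the study.

```python
def dn_ds_ratio(dn: float, ds: float) -> float:
    """Ratio of nonsynonymous (dN) to synonymous (dS) substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive")
    return dn / ds


def selection_regime(ratio: float) -> str:
    """Conventional interpretation of the dN/dS rate ratio."""
    if ratio > 1:
        return "diversifying (positive) selection"
    if ratio < 1:
        return "purifying selection"
    return "neutral evolution"


# Hypothetical long branch: the true dS is 0.8, but saturation of
# synonymous sites leads to an underestimate of 0.3.
dn = 0.4
true_ratio = dn_ds_ratio(dn, ds=0.8)
saturated_ratio = dn_ds_ratio(dn, ds=0.3)
print(selection_regime(true_ratio))
print(selection_regime(saturated_ratio))
```

With these made-up values, an underestimated dS alone is enough to push the ratio above 1. Long branches that still show ratios below 1 therefore argue against such an artefact.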

Functional diversification

Gene duplications are likely to be one of the main sources of evolutionary innovation [16]. The duplicated genes may acquire new functions (neo-functionalisation) or they may partition the multiple functions of the ancestral gene (sub-functionalisation) [17]. The relevance of either alternative as well as additional scenarios is a topic of intense current debate [34-38]. Importantly, in all cases the genetic diversity of duplicates is predicted to translate into functional diversity. Such a pattern is found in the ruminantia, which possess at least five different lysozyme types: the stomach, tracheal, intestinal, kidney, and milk lysozymes [10]. The first type is involved in digestion, whereas the others may function as antibacterial enzymes in immunity [10]. A similar pattern is observed for the at least eleven different Drosophila lysozymes. Most of them have a digestive role and show specialisation as to their time and site of expression [5,14]. A recent study additionally suggested an anti-fungal immune function for some of the genes (Lys B, C, D, E and CG16756) [39]. A further example includes the nine lysozymes of the mosquito Anopheles gambiae, which vary as to their role in immunity and digestion and also as to their time and location of expression [15].

The Caenorhabditis lysozymes show clear signatures of functional diversification. Pronounced differences between the three main clades and also within each of the clades are observed for molecular characteristics of the genes, their pathogen-induced expression, and also their regulation by the immune system. Based on the current data, it appears that the protist-type clade 1 lysozymes play an important role in immunity: They are all induced upon pathogen exposure. Most of them are under positive control of immunity pathways, including components of the insulin-like signalling cascade (DAF-16) [40-42], the p38 mitogen-activated protein kinase (MAPK) pathway (SEK-1 and PMK-1) [43], the TGF-β pathway (DBL-1, SMA-2) [44], or the GATA transcription factor ELT-2 [45]. Most interestingly, the different genes from this clade vary in their response to pathogens and immunity pathways. This variation may contribute to high immune specificity, as has recently been identified phenomenologically for invertebrates [46-48] and which is consistent with highly specific C. elegans-pathogen interactions [49]. Although the underlying molecular mechanisms are currently unknown, they are likely to be based on the genetic diversification of pathogen recognition receptors and/or immune effectors such as the lysozymes [21,50,51]. They may also include the synergistic interaction between different components of the immune system [51], as generally known for lysozymes and antimicrobial peptides [5,7,52,53]. In C. elegans, the immune function has been tested for two genes of the clade 1 protist-type lysozymes. Overexpression of Cel-lys-1 enhances resistance against S. marcescens [20], whereas silencing of Cel-lys-7 increases susceptibility to M. nematophilum [19]. The importance of lysozyme diversification for immunity in general and also for immune specificity clearly warrants further investigation.

The role of the invertebrate-type and also the clade 2 protist-type lysozymes is as yet unclear. The only exception may be Cel-ilys-3. Its silencing enhances susceptibility to M. nematophilum [19]. In the same study, no effect was observed after Cel-ilys-2 knock-down [19]. In general, both invertebrate-type and clade 2 protist-type lysozymes are less often activated by pathogens than the clade 1 protist-type lysozymes. At the same time, several of the genes are downregulated by pathogens and by known immunity pathways. The latter observation may suggest that their main function somehow interferes with the immune response. A similar finding was made for some of the digestive lysozymes from D. melanogaster, which are also downregulated upon immune challenge [5]. This particular similarity may indicate that the primary function of these nematode lysozymes is also digestion. The information on their molecular characteristics (e.g. isoelectric point) or the localization of gene expression is consistent with a role in both immunity and digestion. Unfortunately, the nematode's intestines are the main location for bacterial digestion and at the same time immune defence against pathogens that are easily taken up during feeding [54]. Therefore, lysozymes are expected to have similar characteristics (e.g. regarding pH optimum) even if they vary in function. Future analyses should thus be performed with either exclusive food bacteria or exclusive pathogens, in order to distinguish between the alternative functions.


Ubiquitin (originally, ubiquitous immunopoietic polypeptide) was first identified in 1975 [1] as an 8.6 kDa protein expressed in all eukaryotic cells. The basic functions of ubiquitin and the components of the ubiquitylation pathway were elucidated in the early 1980s at the Technion by Aaron Ciechanover, Avram Hershko, and Irwin Rose, for which the Nobel Prize in Chemistry was awarded in 2004. [11]

The ubiquitylation system was initially characterised as an ATP-dependent proteolytic system present in cellular extracts. A heat-stable polypeptide present in these extracts, ATP-dependent proteolysis factor 1 (APF-1), was found to become covalently attached to the model protein substrate lysozyme in an ATP- and Mg2+-dependent process. [13] Multiple APF-1 molecules were linked to a single substrate molecule by an isopeptide linkage, and conjugates were found to be rapidly degraded with the release of free APF-1. Soon after APF-1-protein conjugation was characterised, APF-1 was identified as ubiquitin. The carboxyl group of the C-terminal glycine residue of ubiquitin (Gly76) was identified as the moiety conjugated to substrate lysine residues.

Ubiquitin properties (human)
Number of residues: 76
Molecular mass: 8564.8448 Da
Isoelectric point (pI): 6.79
Gene names: RPS27A (UBA80, UBCEP1), UBA52 (UBCEP2), UBB, UBC


Ubiquitin is a small protein that exists in all eukaryotic cells. It performs its myriad functions through conjugation to a large range of target proteins. A variety of different modifications can occur. The ubiquitin protein itself consists of 76 amino acids and has a molecular mass of about 8.6 kDa. Key features include its C-terminal tail and the 7 lysine residues. It is highly conserved throughout eukaryote evolution; human and yeast ubiquitin share 96% sequence identity.

Ubiquitin is encoded in mammals by 4 different genes. UBA52 and RPS27A genes code for a single copy of ubiquitin fused to the ribosomal proteins L40 and S27a, respectively. The UBB and UBC genes code for polyubiquitin precursor proteins. [3]

Ubiquitylation (also known as ubiquitination or ubiquitinylation) is an enzymatic post-translational modification in which a ubiquitin protein is attached to a substrate protein. This process most commonly binds the last amino acid of ubiquitin (glycine 76) to a lysine residue on the substrate. An isopeptide bond is formed between the carboxyl group (COO−) of the ubiquitin's glycine and the epsilon-amino group (ε-NH3+) of the substrate's lysine. [14] Trypsin cleavage of a ubiquitin-conjugated substrate leaves a di-glycine "remnant" that is used to identify the site of ubiquitylation. [15] [16] Ubiquitin can also be bound to other sites in a protein which are electron-rich nucleophiles, termed "non-canonical ubiquitylation". [9] This was first observed with the amine group of a protein's N-terminus being used for ubiquitylation, rather than a lysine residue, in the protein MyoD [17] and has been observed since in 22 other proteins in multiple species, [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] including ubiquitin itself. [37] [38] There is also increasing evidence for nonlysine residues as ubiquitylation targets using non-amine groups, such as the sulfhydryl group on cysteine, [33] [34] [39] [40] [41] [42] [43] [44] [45] [46] and the hydroxyl group on threonine and serine. [33] [34] [39] [45] [46] [47] [48] [49] [50] The end result of this process is the addition of one ubiquitin molecule (monoubiquitylation) or a chain of ubiquitin molecules (polyubiquitination) to the substrate protein. [51]
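The di-glycine remnant follows from where trypsin cuts. The sketch below is a simplification that assumes trypsin cleaves after every lysine or arginine, and that the last six residues of ubiquitin are `LRLRGG` (the commonly reported C-terminal tail, not stated in the text above). The function name is hypothetical.

```python
UBIQUITIN_C_TERMINAL_TAIL = "LRLRGG"  # assumption: residues 71-76 of human ubiquitin
GLY_RESIDUE_MONOISOTOPIC_MASS = 57.02146  # Da, one glycine residue (C2H3NO)


def tryptic_remnant(tail: str) -> str:
    """Residues C-terminal to the last trypsin site (after K or R) in the tail."""
    last_site = max(tail.rfind("K"), tail.rfind("R"))
    return tail[last_site + 1:]


remnant = tryptic_remnant(UBIQUITIN_C_TERMINAL_TAIL)
mass_shift = len(remnant) * GLY_RESIDUE_MONOISOTOPIC_MASS
print(remnant, round(mass_shift, 4))
```

The two glycines that stay attached to the substrate lysine add roughly 114.04 Da to the modified peptide, which is the mass signature typically searched for when mapping ubiquitylation sites.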

Ubiquitination requires three types of enzyme: ubiquitin-activating enzymes, ubiquitin-conjugating enzymes, and ubiquitin ligases, known as E1s, E2s, and E3s, respectively. The process consists of three main steps:

  1. Activation: Ubiquitin is activated in a two-step reaction by an E1 ubiquitin-activating enzyme, which is dependent on ATP. The initial step involves production of a ubiquitin-adenylate intermediate. The E1 binds both ATP and ubiquitin and catalyses the acyl-adenylation of the C-terminus of the ubiquitin molecule. The second step transfers ubiquitin to an active site cysteine residue, with release of AMP. This step results in a thioester linkage between the C-terminal carboxyl group of ubiquitin and the E1 cysteine sulfhydryl group. [14][52] The human genome contains two genes that produce enzymes capable of activating ubiquitin: UBA1 and UBA6. [53]
  2. Conjugation: E2 ubiquitin-conjugating enzymes catalyse the transfer of ubiquitin from E1 to the active site cysteine of the E2 via a trans(thio)esterification reaction. In order to perform this reaction, the E2 binds to both activated ubiquitin and the E1 enzyme. Humans possess 35 different E2 enzymes, whereas other eukaryotic organisms have between 16 and 35. They are characterised by their highly conserved structure, known as the ubiquitin-conjugating catalytic (UBC) fold. [54]
  3. Ligation: E3 ubiquitin ligases catalyse the final step of the cascade, the formation of an isopeptide bond between a lysine of the target protein and the C-terminal glycine of ubiquitin.

In the ubiquitination cascade, E1 can bind with many E2s, which can bind with hundreds of E3s in a hierarchical way. Having levels within the cascade allows tight regulation of the ubiquitination machinery. [7] Other ubiquitin-like proteins (UBLs) are also modified via the E1–E2–E3 cascade, although variations in these systems do exist. [57]

E4 enzymes, or ubiquitin-chain elongation factors, are capable of adding pre-formed polyubiquitin chains to substrate proteins. [58] For example, multiple monoubiquitylation of the tumor suppressor p53 by Mdm2 [59] can be followed by addition of a polyubiquitin chain using p300 and CBP. [60] [61]

Types

Ubiquitination affects cellular processes by regulating the degradation of proteins (via the proteasome and lysosome), coordinating the cellular localization of proteins, activating and inactivating proteins, and modulating protein-protein interactions. [4] [5] [6] These effects are mediated by different types of substrate ubiquitination, for example the addition of a single ubiquitin molecule (monoubiquitination) or different types of ubiquitin chains (polyubiquitination). [62]

Monoubiquitination

Monoubiquitination is the addition of one ubiquitin molecule to one substrate protein residue. Multi-monoubiquitination is the addition of one ubiquitin molecule to each of multiple substrate residues. The monoubiquitination of a protein can have different effects from the polyubiquitination of the same protein. The addition of a single ubiquitin molecule is thought to be required prior to the formation of polyubiquitin chains. [62] Monoubiquitination affects cellular processes such as membrane trafficking, endocytosis and viral budding. [10] [63]

Polyubiquitin chains

Polyubiquitination is the formation of a ubiquitin chain on a single lysine residue on the substrate protein. Following addition of a single ubiquitin moiety to a protein substrate, further ubiquitin molecules can be added to the first, yielding a polyubiquitin chain. [62] These chains are made by linking the glycine residue of a ubiquitin molecule to a lysine of ubiquitin bound to a substrate. Ubiquitin has seven lysine residues and an N-terminal methionine that serve as points of ubiquitination: K6, K11, K27, K29, K33, K48, and K63, and M1, respectively. [8] Lysine 48-linked chains were the first identified and are the best-characterised type of ubiquitin chain. K63 chains have also been well-characterised, whereas the function of other lysine chains, mixed chains, branched chains, M1-linked linear chains, and heterologous chains (mixtures of ubiquitin and other ubiquitin-like proteins) remains less clear. [16] [38] [62] [63] [64]
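The linkage names are simply residue positions in the ubiquitin sequence. As a quick check, the sketch below lists the lysine positions in the commonly reported 76-residue human ubiquitin sequence. The sequence string is an assumption brought in for this example (it is not given in the text), so verify it against a database entry before relying on it.

```python
# Assumption: commonly reported human ubiquitin sequence (76 residues).
UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQ"
    "QRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)

# 1-based positions of every lysine, i.e. the possible chain linkage points.
lysine_positions = [i for i, aa in enumerate(UBIQUITIN, start=1) if aa == "K"]
linkage_names = [f"K{pos}" for pos in lysine_positions] + ["M1"]
print(len(UBIQUITIN), linkage_names)
```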

Lysine 48-linked polyubiquitin chains target proteins for destruction by a process known as proteolysis. Multi-ubiquitin chains at least four ubiquitin molecules long must be attached to a lysine residue on the condemned protein in order for it to be recognised by the 26S proteasome. [65] This is a barrel-shaped structure comprising a central proteolytic core made of four ring structures, flanked by two cylinders that selectively allow entry of ubiquitinated proteins. Once inside, the proteins are rapidly degraded into small peptides (usually 3–25 amino acid residues in length). Ubiquitin molecules are cleaved off the protein immediately prior to destruction and are recycled for further use. [66] Although the majority of protein substrates are ubiquitinated, there are examples of non-ubiquitinated proteins targeted to the proteasome. [67] The polyubiquitin chains are recognised by a subunit of the proteasome: S5a/Rpn10. This is achieved by a ubiquitin interacting motif (UIM) found in a hydrophobic patch in the C-terminal region of the S5a/Rpn10 unit. [4]

Lysine 63-linked chains are not associated with proteasomal degradation of the substrate protein. Instead, they allow the coordination of other processes such as endocytic trafficking, inflammation, translation, and DNA repair. [10] In cells, lysine 63-linked chains are bound by the ESCRT-0 complex, which prevents their binding to the proteasome. This complex contains two proteins, Hrs and STAM1, each of which contains a UIM that allows the complex to bind to lysine 63-linked chains. [68] [69]

Less is understood about atypical (non-lysine 48-linked) ubiquitin chains but research is starting to suggest roles for these chains. [63] There is evidence to suggest that atypical chains linked by lysine 6, 11, 27, 29 and methionine 1 can induce proteasomal degradation. [67] [70]

Branched ubiquitin chains containing multiple linkage types can be formed. [71] The function of these chains is unknown. [8]

Structure

Differently linked chains have specific effects on the protein to which they are attached, caused by differences in the conformations of the protein chains. K29-, K33-, [72] K63- and M1-linked chains have a fairly linear conformation; they are known as open-conformation chains. K6-, K11-, and K48-linked chains form closed conformations. The ubiquitin molecules in open-conformation chains do not interact with each other, except for the covalent isopeptide bonds linking them together. In contrast, the closed conformation chains have interfaces with interacting residues. Altering the chain conformations exposes and conceals different parts of the ubiquitin protein, and the different linkages are recognized by proteins that are specific for the unique topologies that are intrinsic to the linkage. Proteins can specifically bind to ubiquitin via ubiquitin-binding domains (UBDs). The distances between individual ubiquitin units in chains differ between lysine 63- and 48-linked chains. The UBDs exploit this by having small spacers between ubiquitin-interacting motifs that bind lysine 48-linked chains (compact ubiquitin chains) and larger spacers for lysine 63-linked chains. The machinery involved in recognising polyubiquitin chains can also differentiate between K63-linked chains and M1-linked chains, demonstrated by the fact that the latter can induce proteasomal degradation of the substrate. [8] [10] [70]

The ubiquitination system functions in a wide variety of cellular processes, including: [73]

Membrane proteins

Multi-monoubiquitination can mark transmembrane proteins (for example, receptors) for removal from membranes (internalisation) and fulfil several signalling roles within the cell. When cell-surface transmembrane molecules are tagged with ubiquitin, the subcellular localization of the protein is altered, often targeting the protein for destruction in lysosomes. This serves as a negative feedback mechanism, because often the stimulation of receptors by ligands increases their rate of ubiquitination and internalisation. Like monoubiquitination, lysine 63-linked polyubiquitin chains also have a role in the trafficking of some membrane proteins. [10] [62] [65] [75]

Fougaro System

The fougaro system (Greek Fougaro, chimney) is a sub-organelle system in the nucleus that may be a mechanism to recycle or remove molecules from the cell to the external medium. The molecules or peptides are ubiquitinated before being released from the nucleus of the cells. The ubiquitinated molecules are released independently or associated with endosomal proteins such as Beclin. [76]

Genomic maintenance

Proliferating cell nuclear antigen (PCNA) is a protein involved in DNA synthesis. Under normal physiological conditions PCNA is sumoylated (a similar post-translational modification to ubiquitination). When DNA is damaged by ultraviolet radiation or chemicals, the SUMO molecule that is attached to a lysine residue is replaced by ubiquitin. Monoubiquitinated PCNA recruits polymerases that can carry out DNA synthesis with damaged DNA, but this is very error-prone, possibly resulting in the synthesis of mutated DNA. Lysine 63-linked polyubiquitination of PCNA allows it to perform a less error-prone mutation bypass known as the template switching pathway. [6] [77] [78]

Ubiquitination of histone H2AX is involved in DNA damage recognition of DNA double-strand breaks. Lysine 63-linked polyubiquitin chains are formed on H2AX histone by the E2/E3 ligase pair, Ubc13-Mms2/RNF168. [79] [80] This K63 chain appears to recruit RAP80, which contains a UIM, and RAP80 then helps localize BRCA1. This pathway will eventually recruit the necessary proteins for homologous recombination repair. [81]

Transcriptional regulation

Histones can be ubiquitinated and this is usually in the form of monoubiquitination (although polyubiquitinated forms do occur). Histone ubiquitination alters chromatin structure and allows the access of enzymes involved in transcription. Ubiquitin on histones also acts as a binding site for proteins that either activate or inhibit transcription and also can induce further post-translational modifications of the protein. These effects can all modulate the transcription of genes. [82] [83]

Deubiquitinating enzymes (DUBs) oppose the role of ubiquitination by removing ubiquitin from substrate proteins. They are cysteine proteases that cleave the amide bond between the two proteins. They are highly specific, as are the E3 ligases that attach the ubiquitin, with only a few substrates per enzyme. They can cleave both isopeptide (between ubiquitin and lysine) and peptide bonds (between ubiquitin and the N-terminus). In addition to removing ubiquitin from substrate proteins, DUBs have many other roles within the cell. Ubiquitin is either expressed as multiple copies joined in a chain (polyubiquitin) or attached to ribosomal subunits. DUBs cleave these proteins to produce active ubiquitin. They also recycle ubiquitin that has been bound to small nucleophilic molecules during the ubiquitination process. Monoubiquitin is formed by DUBs that cleave ubiquitin from free polyubiquitin chains that have been previously removed from proteins. [84] [85]

Ubiquitin-binding domains (UBDs) are modular protein domains that non-covalently bind to ubiquitin; these motifs control various cellular events. Detailed molecular structures are known for a number of UBDs. Binding specificity determines their mechanism of action and regulation, and how they regulate cellular proteins and processes. [86] [87]

Pathogenesis

The ubiquitin pathway has been implicated in the pathogenesis of a wide range of diseases and disorders including: [88]

Neurodegeneration

Ubiquitin is implicated in neurodegenerative diseases associated with proteostasis dysfunction, including Alzheimer's disease, motor neurone disease, [89] Huntington's disease and Parkinson's disease. [90] Transcript variants encoding different isoforms of ubiquilin-1 are found in lesions associated with Alzheimer's and Parkinson's disease. [91] Higher levels of ubiquilin in the brain have been shown to decrease malformation of amyloid precursor protein (APP), which plays a key role in triggering Alzheimer's disease. [92] Conversely, lower levels of ubiquilin-1 in the brain have been associated with increased malformation of APP. [92] A frameshift mutation in ubiquitin B can result in a truncated peptide missing the C-terminal glycine. This abnormal peptide, known as UBB+1, has been shown to accumulate selectively in Alzheimer's disease and other tauopathies.

Infection and immunity

Ubiquitin and ubiquitin-like molecules extensively regulate immune signal transduction pathways at virtually all stages, including steady-state repression, activation during infection, and attenuation upon clearance. Without this regulation, immune activation against pathogens may be defective, resulting in chronic disease or death. Alternatively, the immune system may become hyperactivated and organs and tissues may be subjected to autoimmune damage.

On the other hand, viruses must block or redirect host cell processes including immunity to effectively replicate, yet many viruses relevant to disease have informationally limited genomes. Because of its very large number of roles in the cell, manipulating the ubiquitin system represents an efficient way for such viruses to block, subvert or redirect critical host cell processes to support their own replication. [93]

The retinoic acid-inducible gene I (RIG-I) protein is a primary immune system sensor for viral and other invasive RNA in human cells. [94] The RIG-I-like receptor (RLR) immune signaling pathway is one of the most extensively studied in terms of the role of ubiquitin in immune regulation. [95]

Genetic Disorders

Angelman syndrome is caused by a disruption of UBE3A, which encodes a ubiquitin ligase (E3) enzyme termed E6-AP.
Von Hippel–Lindau syndrome involves disruption of a ubiquitin E3 ligase termed the VHL tumor suppressor, or VHL gene.
Fanconi anemia: Eight of the thirteen identified genes whose disruption can cause this disease encode proteins that form a large ubiquitin ligase (E3) complex.
3-M syndrome is an autosomal-recessive growth retardation disorder associated with mutations of the Cullin7 E3 ubiquitin ligase. [96]

Diagnostic use

Immunohistochemistry using antibodies to ubiquitin can identify abnormal accumulations of this protein inside cells, indicating a disease process. These protein accumulations are referred to as inclusion bodies (which is a general term for any microscopically visible collection of abnormal material in a cell). Examples include:

Link to cancer

Post-translational modification of proteins is a generally used mechanism in eukaryotic cell signaling. [97] Ubiquitination, or ubiquitin conjugation to proteins, is a crucial process for cell cycle progression and cell proliferation and development. Although ubiquitination usually serves as a signal for protein degradation through the 26S proteasome, it could also serve for other fundamental cellular processes, [97] e.g. in endocytosis, [98] enzymatic activation [99] and DNA repair. [100] Moreover, since ubiquitination functions to tightly regulate the cellular level of cyclins, its misregulation is expected to have severe impacts. First evidence of the importance of the ubiquitin/proteasome pathway in oncogenic processes was observed due to the high antitumor activity of proteasome inhibitors. [101] [102] [103] Various studies have shown that defects or alterations in ubiquitination processes are commonly associated with or present in human carcinoma. [104] [105] [106] [107] [108] [109] [110] [111] Malignancies could be developed through loss of function mutation directly at the tumor suppressor gene, increased activity of ubiquitination, and/or indirect attenuation of ubiquitination due to mutation in related proteins. [112]

Direct loss of function mutation of E3 ubiquitin ligase

Renal cell carcinoma

The VHL (Von Hippel–Lindau) gene encodes a component of an E3 ubiquitin ligase. The VHL complex targets members of the hypoxia-inducible transcription factor family (HIF) for degradation by interacting with the oxygen-dependent destruction domain under normoxic conditions. HIF activates downstream targets such as the vascular endothelial growth factor (VEGF), promoting angiogenesis. Mutations in VHL prevent degradation of HIF and thus lead to the formation of hypervascular lesions and renal tumors. [104] [112]

Breast cancer

The BRCA1 gene is another tumor suppressor gene in humans. It encodes the BRCA1 protein, which is involved in the response to DNA damage. The protein contains a RING motif with E3 ubiquitin ligase activity. BRCA1 can form dimers with other molecules, such as BARD1 and BAP1, for its ubiquitination activity. Mutations that affect the ligase function are often found and are associated with various cancers. [108] [112]

Cyclin E

Because the processes of cell cycle progression are the most fundamental for cellular growth and differentiation, and are the most commonly altered in human carcinomas, cell cycle-regulatory proteins are expected to be under tight regulation. The levels of cyclins, as the name suggests, are high only at certain time points during the cell cycle. This is achieved by continuous control of cyclin/CDK levels through ubiquitination and degradation. When cyclin E is partnered with CDK2 and gets phosphorylated, the SCF-associated F-box protein Fbw7 recognizes the complex and thus targets it for degradation. Mutations in Fbw7 have been found in more than 30% of human tumors, characterizing it as a tumor suppressor protein. [111]

Increased ubiquitination activity

Cervical cancer

Oncogenic types of the human papillomavirus (HPV) are known to hijack the cellular ubiquitin-proteasome pathway for viral infection and replication. The E6 proteins of HPV bind to the N-terminus of the cellular E6-AP E3 ubiquitin ligase, redirecting the complex to bind p53, a well-known tumor suppressor whose inactivation is found in many types of cancer. [106] Thus, p53 undergoes ubiquitination and proteasome-mediated degradation. Meanwhile, E7, another of the early-expressed HPV genes, binds to Rb, also a tumor suppressor, mediating its degradation. [112] The loss of p53 and Rb in cells allows limitless cell proliferation to occur.

P53 regulation

Gene amplification often occurs in tumors, including amplification of MDM2, a gene that encodes a RING E3 ubiquitin ligase responsible for downregulation of p53 activity. MDM2 targets p53 for ubiquitination and proteasomal degradation, thus keeping its level appropriate for normal cell conditions. Overexpression of MDM2 causes loss of p53 activity and therefore allows cells to have limitless replicative potential. [107] [112]

P27

Another gene that is a target of gene amplification is SKP2. SKP2 is an F-box protein that functions in substrate recognition for ubiquitination and degradation. SKP2 targets p27Kip-1, an inhibitor of cyclin-dependent kinases (CDKs). CDK2 and CDK4 partner with cyclins E and D, respectively, a family of cell cycle regulators, to control cell cycle progression through the G1 phase. Low levels of p27Kip-1 protein are often found in various cancers and are due to overactivation of ubiquitin-mediated proteolysis through overexpression of SKP2. [109] [112]

Efp

Efp, or estrogen-inducible RING-finger protein, is an E3 ubiquitin ligase whose overexpression has been shown to be the major cause of estrogen-independent breast cancer. [103] [113] Efp's substrate is the 14-3-3 protein, which negatively regulates the cell cycle.

Evasion of Ubiquitination

Colorectal cancer

The gene associated with colorectal cancer is adenomatous polyposis coli (APC), which is a classic tumor suppressor gene. The APC gene product targets beta-catenin for degradation via ubiquitination at the N-terminus, thus regulating its cellular level. Most colorectal cancer cases are found with mutations in the APC gene. However, in cases where the APC gene is not mutated, mutations are found in the N-terminus of beta-catenin that render it free of ubiquitination and thus increase its activity. [105] [112]

Glioblastoma

Glioblastoma is the most aggressive cancer originating in the brain. Mutations found in patients with glioblastoma are related to the deletion of a part of the extracellular domain of the epidermal growth factor receptor (EGFR). This deletion leaves the CBL E3 ligase unable to bind the receptor for its recycling and degradation via a ubiquitin-lysosomal pathway. Thus, EGFR is constitutively active in the cell membrane and activates its downstream effectors that are involved in cell proliferation and migration. [110]

Phosphorylation-dependent ubiquitination

The interplay between ubiquitination and phosphorylation has been an ongoing research interest, since phosphorylation often serves as a marker where ubiquitination leads to degradation. [97] Moreover, ubiquitination can also act to turn on/off the kinase activity of a protein. [114] The critical role of phosphorylation is largely underscored in the activation and removal of autoinhibition in the Cbl protein. [115] Cbl is an E3 ubiquitin ligase with a RING finger domain that interacts with its tyrosine kinase binding (TKB) domain, preventing interaction of the RING domain with an E2 ubiquitin-conjugating enzyme. This intramolecular interaction is an autoinhibitory regulation that prevents its role as a negative regulator of various growth factors and tyrosine kinase signaling and T-cell activation. [115] Phosphorylation of Y363 relieves the autoinhibition and enhances binding to E2. [115] Mutations that render the Cbl protein dysfunctional due to the loss of its ligase/tumor suppressor function and maintenance of its positive signaling/oncogenic function have been shown to cause development of cancer. [116] [117]

As a drug target

Screening for ubiquitin ligase substrates

Identification of E3 ligase substrates is critical to understanding their implication in human diseases, since deregulation of E3-substrate interactions is often a major cause of disease. To overcome the limitations of the mechanisms used to identify the substrates of E3 ubiquitin ligases, a system called 'Global Protein Stability (GPS) Profiling' was developed in 2008. [118] This high-throughput system made use of reporter proteins fused with thousands of potential substrates independently. When ligase activity is inhibited (by making Cul1 dominant negative, so that ubiquitination does not occur), increased reporter activity shows that the identified substrates are accumulating. This approach added a large number of new substrates to the list of E3 ligase substrates.

Possible therapeutic applications

Blocking of specific substrate recognition by the E3 ligases, e.g. Bortezomib. [113]

Challenge

Finding a specific molecule that selectively inhibits the activity of a certain E3 ligase and/or the protein-protein interactions implicated in the disease remains one of the important and expanding research areas. Moreover, as ubiquitination is a multi-step process with various players and intermediate forms, the complex interactions between components need to be taken heavily into account when designing small molecule inhibitors. [103]

Although ubiquitin is the most-understood post-translational modifier, there is a growing family of ubiquitin-like proteins (UBLs) that modify cellular targets in a pathway that is parallel to, but distinct from, that of ubiquitin. Known UBLs include: small ubiquitin-like modifier (SUMO), ubiquitin cross-reactive protein (UCRP, also known as interferon-stimulated gene-15, ISG15), ubiquitin-related modifier-1 (URM1), neuronal-precursor-cell-expressed developmentally downregulated protein-8 (NEDD8, also called Rub1 in S. cerevisiae), human leukocyte antigen F-associated (FAT10), autophagy-8 (ATG8) and -12 (ATG12), Fau ubiquitin-like protein (FUB1), MUB (membrane-anchored UBL), [119] ubiquitin fold-modifier-1 (UFM1) and ubiquitin-like protein-5 (UBL5, which is also known as homologous to ubiquitin-1 [Hub1] in S. pombe). [120] [121] Whilst these proteins share only modest primary sequence identity with ubiquitin, they are closely related three-dimensionally. For example, SUMO shares only 18% sequence identity with ubiquitin, but the two contain the same structural fold. This fold is called the "ubiquitin fold". FAT10 and UCRP contain two. This compact globular beta-grasp fold is found in ubiquitin, UBLs, and proteins that comprise a ubiquitin-like domain; for example, the S. cerevisiae spindle pole body duplication protein, Dsk2, and the NER protein, Rad23, both contain N-terminal ubiquitin domains.
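The 18% figure quoted above is a percent sequence identity, which is computed from an alignment of the two proteins. The following is a minimal sketch of that calculation, assuming the two sequences have already been aligned (gaps written as "-") and assuming one common convention in which only columns without a gap in either sequence are counted. The function name and the toy sequences are hypothetical and are not real ubiquitin or SUMO sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: other conventions exist (e.g. dividing by the full alignment
    length or by the shorter sequence), and they give different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned pair (hypothetical): 4 identical residues out of 6 ungapped columns.
print(round(percent_identity("MQIF-VK", "MKIFAVR"), 1))
```

Because the denominator convention changes the result, identity values from different sources are only comparable when the same convention and alignment method were used.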

These related molecules have novel functions and influence diverse biological processes. There is also cross-regulation between the various conjugation pathways, since some proteins can become modified by more than one UBL, and sometimes even at the same lysine residue. For instance, SUMO modification often acts antagonistically to ubiquitination and serves to stabilize protein substrates. Proteins conjugated to UBLs are typically not targeted for degradation by the proteasome but rather function in diverse regulatory activities. Attachment of UBLs might alter substrate conformation, affect the affinity for ligands or other interacting molecules, alter substrate localization, and influence protein stability.

UBLs are structurally similar to ubiquitin and are processed, activated, conjugated, and released from conjugates by enzymatic steps that are similar to the corresponding mechanisms for ubiquitin. UBLs are also translated with C-terminal extensions that are processed to expose the invariant C-terminal LRGG. These modifiers have their own specific E1 (activating), E2 (conjugating) and E3 (ligating) enzymes that conjugate the UBLs to intracellular targets. These conjugates can be reversed by UBL-specific isopeptidases that have similar mechanisms to that of the deubiquitinating enzymes. [73]

Within some species, the recognition and destruction of sperm mitochondria through a mechanism involving ubiquitin is responsible for sperm mitochondria's disposal after fertilization occurs. [122]

Prokaryotic origins

Ubiquitin is believed to have descended from bacterial proteins similar to ThiS ( O32583 ) [123] or MoaD ( P30748 ). [124] These prokaryotic proteins, despite having little sequence identity (ThiS has 14% identity to ubiquitin), share the same protein fold. These proteins also share sulfur chemistry with ubiquitin. MoaD, which is involved in molybdopterin biosynthesis, interacts with MoeB, which acts like an E1 ubiquitin-activating enzyme for MoaD, strengthening the link between these prokaryotic proteins and the ubiquitin system. A similar system exists for ThiS, with its E1-like enzyme ThiF. It is also believed that the Saccharomyces cerevisiae protein Urm-1, a ubiquitin-related modifier, is a "molecular fossil" that connects the evolutionary relation with the prokaryotic ubiquitin-like molecules and ubiquitin. [125]

Archaea have a functionally closer homolog of the ubiquitin modification system, where "sampylation" with SAMPs (small archaeal modifier proteins) is performed. The sampylation system only uses E1 to guide proteins to the proteasome. [126] Proteoarchaeota, which are related to the ancestor of eukaryotes, possess all of the E1, E2, and E3 enzymes plus a regulated Rpn11 system. Unlike SAMPs, which are more similar to ThiS or MoaD, Proteoarchaeota ubiquitins are most similar to eukaryotic homologs. [127]

Prokaryotic ubiquitin-like protein (Pup) is a functional analog of ubiquitin which has been found in the gram-positive bacterial phylum Actinobacteria. It serves the same function (targeting proteins for degradation), although the enzymology of ubiquitination and pupylation is different, and the two families share no homology. In contrast to the three-step reaction of ubiquitination, pupylation requires two steps; therefore only two enzymes are involved in pupylation.

In 2017, homologs of Pup were reported in five phyla of gram-negative bacteria, in seven candidate bacterial phyla and in one archaeon. [128] The sequences of the Pup homologs are very different from the sequences of Pup in gram-positive bacteria and were termed Ubiquitin bacterial (UBact), although the distinction has not yet been shown to be phylogenetically supported by a separate evolutionary origin and is without experimental evidence. [128]

The finding of the Pup/UBact-proteasome system in both gram-positive and gram-negative bacteria suggests either that the Pup/UBact-proteasome system evolved in bacteria prior to the split into gram-positive and gram-negative clades over 3000 million years ago, [129] or that these systems were acquired by different bacterial lineages through horizontal gene transfer(s) from a third, yet unknown, organism. In support of the second possibility, two UBact loci were found in the genome of an uncultured anaerobic methanotrophic archaeon (ANME-1; locus CBH38808.1 and locus CBH39258.1).


A c-type lysozyme cDNA was amplified from an important agricultural pest, the beet webworm L. sticticalis. The mature LsLysozyme contains an ORF of 426 bp that encodes a 121-amino acid protein (from amino acids 22 to 141). The multiple sequence alignment of the mature c-type lysozymes of L. sticticalis and other Lepidoptera shows that LsLysozyme is highly conserved among these Lepidoptera c-type lysozymes, including the 8 cysteine residues. The high conservation of the cysteine residues indicates their importance in the formation of disulfide bridges and the three-dimensional structure of the molecule. A typical c-type lysozyme has three features: Glu53 and Asp71 of the catalytic site; 11 residues (Asn52, Glu53, Gly55, Gly62, Gln76, Asn78, Tyr81, Ile116, Arg118, Ala123, and Trp124) that form the catalytic cleft; and the residues Thr102, Lys107, and Ala108, which form the Ca2+ binding site (Fig 2). In particular, Glu53 and Asp71 of the catalytic site are fundamental for the biological activity of lysozymes [26]. One major difference in the active site residues in terms of hydrophobicity between LsLysozyme and the lysozymes of two Orthoptera insects occurs at Ala100 (hydropathy index: 1.8) of LsLysozyme, which is Gly (hydropathy index: -0.4) in the two Orthoptera lysozymes.
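The hydropathy indices quoted above for Ala (1.8) and Gly (-0.4) match the Kyte-Doolittle scale, so the comparison can be sketched as a simple table lookup. The snippet below is an illustration only: it assumes the Kyte-Doolittle scale is the one intended, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids
# (assumed to be the scale behind the values quoted in the text).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_shift(original: str, replacement: str) -> float:
    """Change in hydropathy index when `original` is replaced by `replacement`."""
    return KYTE_DOOLITTLE[replacement] - KYTE_DOOLITTLE[original]


def mean_hydropathy(seq: str) -> float:
    """Average hydropathy of a peptide (one-letter codes)."""
    values = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    return sum(values) / len(values)


# Gly (Orthoptera lysozymes) versus Ala (LsLysozyme) at the position discussed above.
print(f"Gly -> Ala shift: {hydropathy_shift('G', 'A'):+.1f}")
```

A positive shift means the substituted residue is more hydrophobic on this scale; a single-residue difference of this size says nothing by itself about activity.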

The active site of HEWL includes 6 subsites, A to F, which are able to bind 6 sugar residues [27, 28]. These subsites are present in LsLysozyme as Asn31, Gly34, Tyr60, Trp61, Arg97, and Trp103 (Fig 2) and are predicted to interact with sugar rings placed in the subsites E, F, B, C, A and D, respectively. Substitution of the conserved residues with others is thought to modulate lysozyme activity. For example, when Asn37 and Trp62 of HEWL were replaced by Gly33 and Tyr59 of S. gregaria, respectively, the enzyme exhibited about 2–3 fold higher bacteriolytic activity than HEWL [29]. Such substitutions also occur naturally in LsLysozyme: because its sequence contains a Gly in position 34 and a Tyr in position 60, L. sticticalis lysozyme may show enhanced bacteriolytic activity compared with HEWL.

One 7-peptide motif (GI/LF/YQIND/N) and two 3-peptide motifs (N/DGS, Y/FWC) are conserved among C-type lysozymes [5], and they are also found in the lysozyme from L. sticticalis. Together, these sequence motifs as well as the alignment and phylogenetic results confirm that L. sticticalis lysozyme belongs to the C-type lysozyme family of proteins. C-type lysozymes are divided into two major types, non-calcium binding and calcium binding, and L. sticticalis lysozyme C appears to belong to the latter.
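The motif notation above (a slash separating alternative residues at one position) maps directly onto regular-expression character classes. As a rough sketch, assuming that reading of the notation, a sequence can be scanned for the three motifs as follows. The function name and the toy sequence are hypothetical; the toy sequence is not the LsLysozyme sequence.

```python
import re

# Motifs from the text, with "X/Y" read as "X or Y at this position" (an assumption).
MOTIFS = {
    "GI/LF/YQIND/N": re.compile(r"G[IL][FY]QIN[DN]"),
    "N/DGS": re.compile(r"[ND]GS"),
    "Y/FWC": re.compile(r"[YF]WC"),
}


def find_motifs(seq: str) -> dict:
    """Return 1-based start positions of each motif in a protein sequence."""
    seq = seq.upper()
    return {
        name: [m.start() + 1 for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


# Toy sequence constructed to contain each motif once.
toy = "AAGIFQINDKKNGSTTYWC"
for name, positions in find_motifs(toy).items():
    print(name, positions)
```

Short motifs such as the 3-residue ones will often match by chance in unrelated proteins, so a hit is only suggestive when it is combined with alignment and phylogenetic evidence, as in the text.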

Lysozyme was found to be constitutively expressed during the entire life cycle of L. sticticalis, and the expression level increased constantly during the larval stage, which is in accord with the expression pattern in Anopheles dirus [27] [30]. A similar expression profile of lysozyme was also reported for the housefly Musca domestica [5]. Lysozyme expression in L. sticticalis was detected at lower levels in prepupae, medium levels in larvae and higher levels in adults, peaking in pupae. The highest expression level during the pupal stage was also found in Helicoverpa armigera [28] [31]. Such significant changes in lysozyme expression may be partially explained by the complete reorganization of body tissues during the transition from larva to adult. However, this needs to be elucidated by further study.

L. sticticalis lysozyme is expressed in all tissues, with high mRNA levels in the fat bodies and midgut and low mRNA levels in the epidermis and hemolymph. This expression pattern is similar to that of Ostrinia nubilalis lysozymes, which are also expressed in the epidermis, fat body, midgut and hemolymph [32]. Apart from its role in antimicrobial defense, lysozyme has been reported to have a digestive role in some Diptera and Hemiptera [32, 33]. The phylogenetic analysis based on the deduced amino acid sequences showed that lysozyme from L. sticticalis is grouped with those from Lepidoptera, and is far from those of the cyclorrhaphan Diptera. Both minimum evolution and maximum likelihood analyses supported the above results. Lysozymes from Lepidoptera are immune-related, whereas those from cyclorrhaphan Diptera have a digestive function [29, 34]. Therefore, the LsLysozyme gene might play an important role in immunity rather than in digestive functions.

As expression of the L. sticticalis lysozyme gene was up-regulated when larvae were challenged with B. bassiana (Fig 5C), this enzyme may play an important role in L. sticticalis larval defense against this fungus. Prior to infection with B. bassiana, the expression level of the gene in larvae reared at the density of 30 larvae per jar was significantly higher than that in larvae reared at densities of 1 and 10 larvae per jar. The results of our previous study on the antimicrobial activity of lysozyme in L. sticticalis larvae also showed significantly higher activity in larvae at a density of 30 larvae per jar compared to larvae reared individually [22]. This expression pattern may be caused by larval crowding. Prophylaxis in Anticarsia gemmatalis larvae is reported to be triggered by the presence of conspecifics [35], and when the larval density reached 30 larvae per jar, the crowded conditions up-regulated lysozyme expression, increasing prophylaxis.

After larvae were infected by B. bassiana, the expression level of LsLysozyme of larvae reared at a density of 10 larvae per jar was significantly higher than that of larvae reared at densities of 1 and 30 larvae per jar. However, this pattern is different from the expression pattern of non-immunized larvae before infection. The difference may be related to variation in LsLysozyme gene expression in infected versus non-infected larvae, as expression was significantly increased when the larvae were infected with B. bassiana (Fig 5C). Transcripts of lysozyme c-1 and c-2 genes were also significantly increased after immune challenge with Escherichia coli or Micrococcus luteus [36]. Therefore, infection with B. bassiana may have influenced the expression pattern of the lysozyme gene in larvae reared at the three different densities.

We identified significant transcriptional up-regulation of lysozyme expression in the larval hemolymph in response to larval rearing density. Increased phenoloxidase activity, total haemocyte count and lysozyme activity were also found in crowded L. sticticalis [22]. Thus, the Lepidopteran L. sticticalis may adopt a defense strategy of enhancing specific effectors (such as lysozyme) by activating entire immune pathways. However, this defense strategy may differ from that of the Orthopteran Locusta migratoria. Wang et al. (2013) [37] found that gregarious (crowded) migratory locusts exhibited high levels of circulating PRPs but not AMPs, and there were no significant differences in PO activity, encapsulation response, and total hemocyte count between gregaria and solitaria locusts [23]. These results suggested the selection of a tolerance strategy that inhibits pathogen spread and increases the “distance” between infected and susceptible individuals, which together improved the immune defense of gregarious locusts [37].

This is the first report of the isolation and characterization of a c-type lysozyme from L. sticticalis. These results suggest that larval crowding could increase lysozyme gene expression. Therefore, larvae may use crowding as a cue to induce lysozyme expression to protect against possible infection. To better understand the immune function of LsLysozyme among larvae reared at different densities, the activity and expression of the protein in response to infection should be further analyzed. Additional research involving functional characterization of population density-regulated genes would reveal the precise gene pathways and regulatory mechanism of density stress at the molecular level.

Anderson DG, McKay LL (1983) Simple and rapid method for isolating large plasmid DNA from lactic streptococci. Appl Environ Microbiol 46:549–552

Bhunia AK, Johnson MC, Ray B, Belden EL (1990) Antigenic property of pediocin AcH produced byPediococcus acidilactici H. J Appl Bacteriol 69:211–215

Daba H, Pandian S, Gasselin JF, Simard RE, Huang J, Lacroix C (1991) Detection of activity of a bacteriocin produced by aLeuconostoc mesenteroides. Appl Environ Microbiol 57:3450–3455

Daba H, lacroix C, Huang J, Simard RE (1993) Influence of growth conditions on production and activity of mesenterocin 5 by a strain ofLeuconostoc mesenteroides. Appl Microbiol Biotechnol 39:166–173

Daeschel MA (1989) Antimicrobial substances from lactic acid bacteria for use as food preservatives. Food Technol 43:164–167

Delves-Broughton J (1990) Nisin and its uses as a food preservative. Food Technol 44:100–117

Eckner KF (1992) Bacteriocins and food applications. Dairy Food Environ Sanit 12:204–209

Fernandes CF, Shahani KM (1990) Anticarcinogenic and immunological properties of dietary lactobacilli. J Food Protect 53:704–710

Gonzalez CF, Kunka BS (1987) Plasmid associated bacteriocin production and sucrose fermentation inPediococcus acidilactici. Appl Environ Microbiol 53:2534–2538

Harding CD, Shaw BG (1990) Antimicrobial activity ofLeuconostoc gelidum against closely related species andListeria monocytogenes. J Appl Bacteriol 69:648–654

Hastings JW, Stiles ME (1991) Antibiosis ofLeuconostoc gelidum isolated from meat. J Appl Bacteriol 70:127–134

Hastings JW, Sailer M, Johnson K, Roy KL, Vederas JC, Stiles ME (1991) Characterization of leucocin A-UAL187 and cloning of the bacteriocin gene fromLeuconostoc gelidum. J Bacteriol 173:7491–7500

Héchard Y, Derijard B, Letellier F, Cenatiempo Y (1992) Characterization and purification of mesentericin Y105, an anti-Listeria bacteriocin fromLeuconostoc mesenteroides. J Gen Microbiol 138:2725–2731

Holck A, Axelsson L, Birkeland S-E, Aukrust T, Blom H (1992) Purification and amino acid sequence of sakacin A, a bacteriocin fromLactobacillus sake Lb706. J Gen Microbiol 138:2715–2720

Jiménez-Díaz R, Rios-Sanchez RM, Desmazeaud M, Ruiz-Barba JL (1993) Plantaricins S and T, two new bacteriocins produced byLactobacillus plantarum LPC010 isolated from a green olive fermentation. Appl Environ Microbiol 59:1416–1424

Klaenhammer TR (1988) Bacteriocins of lactic acid bacteria. Biochimie 70:337–349

Klaenhammer TR (1993) Genetics of bacteriocins produced by lactic acid bacteria. FEMS Microbiol Rev 12:39–86





Cloning of the gene encoding PB polymerase I large fragment

The gene encoding DNA polymerase I from Psychrobacillus sp. (Additional file 4) was cloned into the vector pET151/D-TOPO® using Gateway® Technology (Thermo Fisher). The template for the polymerase chain reaction was genomic DNA of Psychrobacillus sp., kindly provided by Marcin M. Pierechod. The bacterium was collected from marine biota on a cruise of the research vessel Jan Mayen (Norway) around Lofoten, an archipelago in Northern Norway. The genomic DNA was isolated by the bead-beating method with the MP Biomedicals™ FastPrep-24™ Classic Instrument (Thermo Fisher Scientific). The forward and reverse primers (Table 1) were designed to truncate the gene to the so-called large fragment of DNA polymerase I (Additional file 5), i.e. omitting the 5′-3′ exonuclease domain of the protein.

Evolution library creation

To generate an evolution library of PB pol I LF, a fragment covering amino-acid residues 174 to 580, i.e. omitting the first third of the protein, was submitted to codon optimization and molecular evolution experiments (Gene™ Controlled Randomization technology, Thermo Fisher Scientific) with a default average of 3.5 amino-acid mutations per construct in the pET-11a vector. According to the manufacturer, the amplified library was digested with NheI/BamHI and ligated into the pET-11a vector. Ligation reactions were transformed into E. coli strain DH5α, and the transformation rate was determined by plating a dilution series. The total number of transformants was 1.53 × 10⁵ cfu. The evolution library was received as a glycerol stock preparation, i.e. total cells from the transformation resuspended in 50% glycerol at 1.55 × 10¹⁰ cells/ml.
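The transformant count above comes from plating a dilution series. A minimal sketch of that arithmetic follows. The colony count, dilution factor and volumes are hypothetical illustration values, not figures reported by the manufacturer, and `total_transformants` is a name introduced only for this example.

```python
def total_transformants(colonies, dilution_factor, plated_volume_ml, total_volume_ml):
    """Estimate the total cfu of a transformation from one countable dilution plate.

    colonies         -- colonies counted on the plate
    dilution_factor  -- e.g. 100 for a 1:100 dilution
    plated_volume_ml -- volume of the diluted culture spread on the plate
    total_volume_ml  -- total recovery volume of the transformation
    """
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    cfu_per_ml = colonies * dilution_factor / plated_volume_ml
    return cfu_per_ml * total_volume_ml


# Hypothetical example: 153 colonies on the 1:100 plate, 0.1 ml plated,
# 1 ml total recovery volume.
estimate = total_transformants(153, 100, 0.1, 1.0)
print(f"{estimate:.2e} cfu")
```

In practice, counts from several dilutions would usually be averaged, using only plates with a countable number of colonies.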

Small-scale protein production and semi-purification in 96-well plate format

The evolution library from Thermo Fisher Scientific was received as a glycerol stock preparation consisting of the cloned library in the pET-11a vector in DH5α cells. The glycerol stock was streaked onto LB/Amp plates, single colonies were cultivated overnight in 1.5 ml Luria Bertani (LB)/ampicillin (100 μg/ml), and plasmids were isolated in 96-well format with the PureLink™ Pro Quick96 Plasmid Purification Kit (Thermo Fisher Scientific). Subsequently the isolated plasmids, each representing a single variant of PB pol I LF with one or more mutations, were transformed into in-house produced chemically competent Rosetta 2 (DE3) cells in 48-well format for recombinant protein production. For the overnight culture, 1.5 ml LB/ampicillin (100 μg/ml) was inoculated with 5–6 colonies of each variant. After incubation overnight at 37 °C and 220 rpm, 250 μl was transferred into 3 ml fresh Terrific Broth (TB)/ampicillin (100 μg/ml) medium. Cells were grown at 37 °C until the OD600 nm reached 0.5–1.0. Gene expression was then induced by addition of 0.1 mM IPTG and carried out at 15 °C, 220 rpm for 6–8 h. Cells were harvested by centrifugation with a plate rotor at 500 × g for 10 min. Cell pellets were resuspended in 1 ml 50 mM HEPES pH 7.5 (at 25 °C), 500 mM NaCl, 10 mM imidazole, 5% glycerol, 0.25 mg/ml lysozyme. Cell disruption was performed by sonication with the VCX 750 from Sonics® (pulse 1.0/1.0, 1 min, amplitude 25%). Subsequent semi-purification of the proteins was performed in 96-well plate format with His MultiTrap™ HP (GE Healthcare) according to the instruction manual. Proteins were eluted in 50 μl 50 mM HEPES pH 7.5 (at 25 °C), 500 mM NaCl, 500 mM imidazole, 5% glycerol. Protein concentration was determined with the Bradford assay [20] in 96-well format using 10 μl of semi-purified protein. The wild-type enzyme was used as a control throughout the procedure.

Cloning of genes encoding polymerase I large fragment from Geobacillus stearothermophilus and Ureibacillus thermosphaericus

The codon-optimized genes encoding polymerase I large fragment from Geobacillus stearothermophilus (Gbst pol I LF, NCBI protein database: 3TAN_A) and Ureibacillus thermosphaericus (Ubts pol I LF, NCBI protein database: WP_016837139) were purchased from the Invitrogen GeneArt Gene Synthesis service of Thermo Fisher Scientific. The genes were cloned into the vector pTrc99a (encoding an N-terminal His6-tag) by FastCloning according to Li et al. [21]. The Asp-to-Ala substitution at the position corresponding to 422 in PB pol I LF was introduced using the QuikChange II Site-Directed Mutagenesis Kit (Agilent Technologies) and confirmed by sequencing analysis. Primer sequences for cloning and site-directed mutagenesis are listed in Table 1.

Recombinant protein production and purification of PB pol I LF and its D422A variant

Recombinant protein production of PB pol I LF and its D422A variant was performed in Rosetta 2 (DE3) cells (Novagen®). Cells were grown in TB/ampicillin (100 μg/ml) medium and gene expression was induced at OD600 nm = 1.0 by addition of 0.1 mM IPTG. Protein production was carried out at 15 °C, 180 rpm for 6–8 h. For protein purification the pellet of a 1-l cultivation was resuspended in 50 mM HEPES pH 7.5 (at 25 °C), 500 mM NaCl, 10 mM imidazole, 5% glycerol, 0.15 mg/ml lysozyme, 1 protease inhibitor tablet (cOmplete™, Mini, EDTA-free Protease Inhibitor Cocktail, Roche) and incubated on ice for 30 min. Unless stated otherwise, all steps of the protein purification were performed either on ice or cooled at 4 °C. Cell disruption was performed by French press (1.37 kbar) and subsequently by sonication with the VCX 750 from Sonics® (pulse 1.0/1.0, 5 min, amplitude 25%). In the first step, the soluble part of the His6-tagged protein present after centrifugation (48,384 × g, 45 min, 4 °C) was purified by immobilized Ni²⁺-affinity chromatography. After a wash step with 50 mM HEPES pH 7.5 (at 25 °C), 500 mM NaCl, 50 mM imidazole, 5% glycerol, the protein was eluted at an imidazole concentration of 250 mM and transferred into 50 mM HEPES pH 7.5 (at 25 °C), 500 mM NaCl, 10 mM MgCl2, 5% glycerol by use of a desalting column. The second step was cleavage of the tag by TEV protease, performed overnight at 4 °C in 50 mM Tris pH 8.0 (at 25 °C), 0.5 mM EDTA and 1 mM DTT. In the third step, a second Ni²⁺-affinity chromatography was performed in 50 mM HEPES pH 7.5 (at 25 °C), 500 mM NaCl, 5% glycerol to separate the protein from the His6-tag and the His6-tagged TEV protease. The tag-free protein eluted in the flow-through after the TEV-cleavage reaction was applied to the column. The His6-tag and the His6-tagged TEV protease were eluted from the column with 50 mM HEPES pH 7.5 (at 25 °C), 500 mM NaCl, 500 mM imidazole, 5% glycerol. The final protein solution was concentrated and stored with 50% glycerol at −20 °C for activity assays.

Recombinant protein production and purification of Gbst and Ubts pol I LF

Gbst and Ubts pol I LF and their D422A variants were produced recombinantly in Rosetta 2 (DE3) cells (Novagen®). Cells were cultivated in LB/ampicillin (100 μg/ml) medium at 37 °C. After induction of gene expression at OD600 nm = 0.5 by addition of 0.5 mM IPTG, protein production was carried out at 37 °C for 4 h. Unless stated otherwise, all steps of the subsequent protein purification were performed either on ice or cooled at 4 °C. The pellet of a 0.5-l cultivation was resuspended in 50 mM Tris pH 8.0 (at 25 °C), 300 mM NaCl, 1 mM EDTA, 1 mM DTT, 10 mM imidazole, 0.15 mg/ml lysozyme, 1 protease inhibitor tablet (cOmplete™, Mini, EDTA-free Protease Inhibitor Cocktail, Roche), incubated on ice for 30 min and then subjected to sonication with the VCX 750 from Sonics® (pulse 1.0/1.0, 15 min, amplitude 25%) for cell disruption. The soluble part of the His6-tagged protein present after centrifugation (48,384 × g, 45 min, 4 °C) was purified by immobilized Ni²⁺-affinity chromatography. After a wash step with 50 mM Tris pH 8.0 (at 25 °C), 300 mM NaCl, 1 mM EDTA, 1 mM DTT, 10 mM imidazole, the protein was eluted by gradually increasing the imidazole concentration to 500 mM. Fractions containing the protein were collected, and the buffer was exchanged to 20 mM Tris pH 7.1 (at 25 °C), 100 mM KCl, 2 mM DTT, 0.2 mM EDTA and 0.2% Triton X-100 by desalting. After concentration the final protein solution was stored with 50% glycerol at −20 °C for activity assays.

Single-nucleotide incorporation assay

To determine the optimal temperature for polymerase activity, 10 μl reactions contained 30 nM substrate, i.e. fluorophore-labeled primer annealed to template DNA (Table 2), and 10 μM dATP. For PB pol I LF the reaction further contained 5 mM MgCl2 in 50 mM Tris pH 8.5, 100 mM NaCl, 1 mM DTT, 0.2 mg/ml BSA and 2% glycerol. The pH of the reaction buffer was set at room temperature so that it would be pH 8.5 at the respective incubation temperature. The reactions were initiated by addition of protein solution and incubated for 15 min at various temperatures (0 °C–50 °C). As a negative control, protein dilution buffer (10 mM HEPES pH 7.5 (at 25 °C), 1% glycerol) was used instead of protein solution.
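The pH of Tris buffers falls as temperature rises, which is why the buffer has to be set at room temperature to a value that gives pH 8.5 at the incubation temperature. The sketch below illustrates one way to estimate that setpoint. It assumes a linear temperature dependence with a coefficient of about −0.028 pH units per °C, a commonly quoted approximate value for Tris; the text does not state which coefficient or procedure the authors used, and the function name is hypothetical.

```python
# Assumed approximate temperature coefficient of Tris (pH units per degree C).
TRIS_DPH_PER_DEGC = -0.028


def ph_setpoint_at_reference(target_ph, assay_temp_c, reference_temp_c=25.0,
                             dph_per_degc=TRIS_DPH_PER_DEGC):
    """Return the pH to set at the reference temperature so that the buffer
    has `target_ph` at the assay temperature (linear approximation)."""
    # pH(assay) = pH(reference) + coefficient * (assay - reference)
    return target_ph - dph_per_degc * (assay_temp_c - reference_temp_c)


# Setpoints at 25 degrees C for a target of pH 8.5 across the assayed range.
for temp_c in (0, 25, 37, 50):
    print(temp_c, round(ph_setpoint_at_reference(8.5, temp_c), 2))
```

Under these assumptions a buffer meant for 0 °C is set to a lower pH at 25 °C and a buffer meant for 50 °C to a higher one, because the pH drifts upward on cooling and downward on warming.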

To examine the thermal stability of PB pol I LF, 10 μl reactions contained 50 mM BIS-TRIS propane at pH 8.5 (at 25 °C), 100 mM NaCl, 5 mM MgCl2, 1 mM DTT, 0.2 mg/ml BSA and 2% glycerol. PB pol I LF was added to the reaction buffer, incubated at various temperatures (0 °C–80 °C) for 15 min and afterwards cooled on ice for 5 min. As a negative control, protein dilution buffer (10 mM HEPES pH 7.5 (at 25 °C), 1% glycerol) was used instead of protein solution. The single-nucleotide extension reaction was initiated by addition of 30 nM substrate (Table 2) and 10 μM dATP. The mixture was incubated at 25 °C for 15 min.

Reactions were stopped by addition of 2.5 μl denaturing gel loading buffer (95% formamide, 10 mM EDTA, 0.1% xylene cyanol) and incubation at 95 °C for 5 min. For denaturing polyacrylamide gel electrophoresis (12% polyacrylamide/7 M urea), a sample volume of 6 μl was loaded onto the gel. Gel electrophoresis was performed in 0.5x TBE buffer (44.5 mM Tris, 44.5 mM boric acid, 1 mM EDTA) at 50 W for 1 h 15 min, and the gel was subsequently scanned for FAM with the PharosFX Plus Imager (Bio-Rad).

Enzyme activity was determined by densitometric measurement of the bands representing the extended primer (intensity 1) and the unextended primer (intensity 0). Quantitative data were analyzed using the standard deviation. The relative conversion rate was calculated as follows:

conversion [%] = intensity 1/(intensity 0 + intensity 1)*100.
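As a minimal sketch, the conversion formula and the standard-deviation analysis could be applied to replicate lanes as follows. The band intensities are hypothetical illustration values, and the use of the sample standard deviation is an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def conversion_percent(intensity_0, intensity_1):
    """Relative conversion in percent from the band intensities of the
    unextended primer (intensity_0) and the extended primer (intensity_1)."""
    total = intensity_0 + intensity_1
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return intensity_1 / total * 100.0


# Hypothetical densitometry readings (intensity 0, intensity 1) for three replicates.
replicates = [(7500.0, 2500.0), (7000.0, 3000.0), (8000.0, 2000.0)]
values = [conversion_percent(i0, i1) for i0, i1 in replicates]

print("conversion per replicate [%]:", values)
print("mean [%]:", mean(values))
print("sample standard deviation [%]:", stdev(values))
```

Because the value is a ratio of the two bands in the same lane, it is independent of the total amount of sample loaded.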

Polymerase activity assay

The polymerase activity assay is based on a molecular beacon probe (modified from [22]). Fifty-microliter reactions consisted of 200 nM substrate, i.e. primer annealed to template DNA carrying fluorophore and quencher (Table 2), and 200 μM dNTPs (equimolar amounts of dATP, dGTP, dCTP and dTTP). For PB pol I LF the reaction further contained 5 mM MgCl2 in 50 mM BIS-Tris propane at pH 8.5 (at 25 °C), 100 mM NaCl, 1 mM DTT, 0.2 mg/ml BSA and 2% glycerol. For Gbst and Ubts pol I LF the reaction further contained 20 mM Tris pH 7.9 (at 25 °C), 100 mM KCl, 10 mM (NH4)2SO4, 2 mM MgSO4, 0.1% Triton X-100.

The activity assay was carried out at 25 °C and 37 °C, respectively, in black 96-well fluorescence assay plates (Corning®). The reaction was initiated by addition of protein solution. The increase in fluorescein fluorescence was measured as relative fluorescence units (RFUs) at appropriate time intervals by exciting at 485 nm and recording emission at 518 nm. Measurements were performed in a SpectraMax® Gemini Microplate Reader (Molecular Devices). Quantitative data were analyzed using the standard deviation.
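A time course of RFU readings like the one described above is commonly reduced to an initial rate by fitting a straight line to the early, linear part of the curve. The sketch below shows one such reduction with an ordinary least-squares slope. The choice of fitting the first five points, the function name and the readings are assumptions made for illustration; the text does not describe how the authors derived rates from the fluorescence traces.

```python
def initial_rate(times_s, rfu, n_points=5):
    """Least-squares slope (RFU per second) over the first `n_points` readings."""
    t = list(times_s[:n_points])
    y = list(rfu[:n_points])
    n = len(t)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired time/RFU readings")
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    return sxy / sxx


# Hypothetical readings: linear increase at first, then a plateau.
times_s = [0, 30, 60, 90, 120, 150, 180]
rfu = [100, 160, 220, 280, 340, 360, 365]
print(initial_rate(times_s, rfu))
```

Restricting the fit to early time points keeps the plateau, where substrate is exhausted, from lowering the estimated rate.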

Strand-displacement activity assay

Fifty-microliter reactions consisted of 200 nM substrate, i.e. “cold” primer and reporter strand annealed to template DNA (Table 2), and 200 μM dNTPs (equimolar amounts of dATP, dGTP, dCTP and dTTP). For PB pol I LF and for screening of variants from the evolution library, the reaction further contained 5 mM MgCl2 in 50 mM BIS-Tris propane at pH 8.5 (at 25 °C), 100 mM NaCl, 1 mM DTT, 0.2 mg/ml BSA and 2% glycerol. For Gbst and Ubts pol I LF the reaction further contained 20 mM Tris pH 7.9 (at 25 °C), 100 mM KCl, 10 mM (NH4)2SO4, 2 mM MgSO4, 0.1% Triton X-100.

The activity assay was carried out at 25 °C and 37 °C, respectively, in black 96-well fluorescence assay plates (Corning®). The reaction was initiated by addition of protein solution. The increase in TAMRA fluorescence was measured as RFUs at appropriate time intervals by exciting at 525 nm and recording emission at 598 nm. Measurements were performed in a SpectraMax® M2e Microplate Reader (Molecular Devices). Quantitative data were analyzed using the standard deviation.

Mutagenesis, protein production and semi-purification of PB pol I LF 422 variants

Amino-acid substitutions at position 422 of PB pol I LF were introduced using the QuikChange II Site-Directed Mutagenesis Kit (Agilent Technologies) and confirmed by sequencing analysis. The starting material for the mutagenesis reaction was the gene encoding PB D422A in the vector pET-11a. Recombinant protein production was performed in Rosetta 2 (DE3) cells (Novagen®) in 25 ml TB/ampicillin (100 μg/ml) medium. At OD600 nm = 1.0, gene expression was induced by addition of 0.1 mM IPTG. The incubation temperature was lowered from 37 °C to 15 °C and protein production was carried out at 180 rpm for 6–8 h. Semi-purification was performed with PureProteome™ Nickel Magnetic Beads (Millipore). Cells were lysed by sonication with the VCX 750 from Sonics® (pulse 1.0/1.0, 1 min, amplitude 20%) in 1 ml lysis buffer (50 mM HEPES pH 7.5 (at 25 °C), 500 mM NaCl, 5% glycerol, 150 μg lysozyme) and processed further according to the manufacturer’s instructions (washing buffer: 50 mM HEPES pH 7.5 (at 25 °C), 500 mM NaCl, 5% glycerol). Final elution of the proteins was performed with 50 μl elution buffer (50 mM HEPES pH 7.5 (at 25 °C), 500 mM NaCl, 500 mM imidazole, 5% glycerol). Protein concentrations were determined using the Bradford assay [20]. The strand-displacement (SD) activity of PB pol I wild type (Asp) and its variants containing amino-acid substitutions at position 422 was determined using the time-resolved strand-displacement activity assay.
