Construct a restriction map of a linear fragment of DNA using the following data?

I've attempted to do the single digests, and the double digests, but cannot complete the map… I've attached what I've done so far

DNA                              Sizes of fragments (bp)

uncut DNA                        900
DNA cut with EcoRI               500, 350, 50
DNA cut with HindIII             600, 300
DNA cut with BamHI               400, 300, 200
DNA cut with EcoRI + HindIII     350, 300, 200, 50
DNA cut with EcoRI + BamHI       300, 250, 200, 100, 50
DNA cut with HindIII + BamHI     300, 200, 100


I've been trying for hours with not much luck.

This is my attempt so far.

Hello, I just gave you some hints on that here. In my opinion, you should try to explain with more detail what you need and what you already tried to do, without expecting people to just do your homework. I think the suggestions provided for the other question should guide you to a solution of your problem. Also, to my eyes it is not clear whether your main difficulty lies in the conceptual resolution of the problem or rather in the final rendering of the map.

Hi, I don't expect anyone to just do it as I would not gain anything from that. I've attempted to complete it and the problem lies in the final rendering of the map.

Have you tried using ApE (A plasmid Editor) from the University of Utah, by M. Wayne Davis? I think it would work for what you want to do.

Ok, to produce a graphical map I would proceed as follows. Open a Python shell and run print('N' * 900) (in Python 2, print 'N'*900). This yields a 900 bp placeholder sequence, equal in size to your undigested DNA. Copy this sequence and paste it into SnapGene (or any equivalent program). For each restriction site you previously located, you can either create a misc_feature to annotate it in the map, or replace the Ns at the cutting position with the actual recognition site of the relevant enzyme (e.g. G^GATCC for BamHI, with the ^ at the position you identified). If you then display restriction enzyme sites with the appropriate option, you obtain your restriction map. You can also annotate the restriction fragments for each of the digestion reactions you showed above. Hope this helps!
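
The placeholder-sequence idea can be sketched in Python 3 roughly as follows. This is only an illustration: the function name and the two site positions are made up and should be replaced with the positions you worked out. The recognition sequences (GAATTC, AAGCTT, GGATCC) are the standard ones, and each of these three enzymes cuts after the first base.

```python
# Recognition sequences; each of these enzymes cuts after the first base
# (G^AATTC, A^AGCTT, G^GATCC).
RECOGNITION = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}
CUT_OFFSET = 1  # bases of the recognition sequence lying left of the cut


def placeholder_sequence(length, cut_sites):
    """Return a string of N's with recognition sites written in, so that
    each enzyme cuts exactly `position` bases from the left end.

    cut_sites: list of (enzyme_name, position) pairs. Sites closer together
    than one recognition sequence would overwrite each other.
    """
    seq = ["N"] * length
    for enzyme, position in cut_sites:
        site = RECOGNITION[enzyme]
        start = position - CUT_OFFSET
        if start < 0 or start + len(site) > length:
            raise ValueError(f"{enzyme} site at {position} does not fit")
        seq[start:start + len(site)] = site
    return "".join(seq)


# Hypothetical positions, for illustration only.
example = placeholder_sequence(900, [("BamHI", 120), ("HindIII", 480)])
print(len(example))
print(example[115:130])
```

The resulting string can be pasted into a sequence editor, which should then report the sites at the chosen positions.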


Restriction map

A restriction map is a map of known restriction sites within a sequence of DNA. Restriction mapping requires the use of restriction enzymes. In molecular biology, restriction maps are used as a reference to engineer plasmids or other relatively short pieces of DNA, and sometimes for longer genomic DNA. There are other ways of mapping features on DNA for longer length DNA molecules, such as mapping by transduction. [1]

One approach in constructing a restriction map of a DNA molecule is to sequence the whole molecule and to run the sequence through a computer program that will find the recognition sites that are present for every restriction enzyme known.

Before sequencing was automated, it would have been prohibitively expensive to sequence an entire DNA strand. To find the relative positions of restriction sites on a plasmid, a technique involving single and double restriction digests is used. Based on the sizes of the resultant DNA fragments the positions of the sites can be inferred. Restriction mapping is a very useful technique when used for determining the orientation of an insert in a cloning vector, by mapping the position of an off-center restriction site in the insert. [2]
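
The orientation test comes down to simple arithmetic, sketched below in Python. All numbers are invented for illustration (a 3,000 bp vector, a 1,000 bp insert with a site 200 bp from one end, and a vector site 100 bp from the insertion junction), and the sketch assumes a circular plasmid with exactly one site in the vector and one in the insert.

```python
def orientation_fragments(vector_len, insert_len, vector_site_to_junction,
                          insert_site_offset):
    """Fragment sizes from cutting a circular recombinant plasmid with an
    enzyme that has one site in the vector and one off-center site in the
    insert, for both insert orientations."""
    total = vector_len + insert_len
    result = {}
    for name, offset in (("forward", insert_site_offset),
                         ("reverse", insert_len - insert_site_offset)):
        # Distance between the two sites, measured across the junction.
        across_junction = vector_site_to_junction + offset
        result[name] = sorted((across_junction, total - across_junction),
                              reverse=True)
    return result


# Hypothetical sizes, for illustration only.
frags = orientation_fragments(3000, 1000, 100, 200)
print(frags)
```

Because the insert site is off-center, the two orientations give different fragment pairs, which is what makes them distinguishable on a gel. A site at the exact middle of the insert would give the same pair both ways.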

The experimental procedure first requires an aliquot of purified plasmid DNA (see appendix) for each digest to be run. Each aliquot is then digested with the chosen enzyme or enzymes. The resulting samples are subsequently separated by gel electrophoresis, typically on an agarose gel.

The first step following the completion of electrophoresis is to add up the sizes of the fragments in each lane. The sum of the individual fragments should equal the size of the original fragment, and the fragments from each digest should sum to the same total. If fragment sizes do not add up properly, there are two likely problems. First, some of the smaller fragments may have run off the end of the gel, which frequently occurs if the gel is run too long. Second, the gel may not have been dense enough to resolve fragments close in size, so that two or more fragments appear as a single band. If the fragments from all of the digests add up, one may infer the positions of the REN (restriction endonuclease) sites by placing them at positions on the original DNA fragment that satisfy the fragment sizes produced by every digest.
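
For a small linear fragment, the placement step can be done by brute force. The Python sketch below applies it to the 900 bp digest data from the question at the top of this page. It rests on assumptions that are not stated in the source: all sites lie on a 50 bp grid (every reported size is a multiple of 50), and double-digest bands are compared as sets, because the HindIII + BamHI sizes as listed sum to only 600 bp, which the sketch interprets as two co-migrating 300 bp fragments.

```python
from itertools import combinations

LENGTH = 900
STEP = 50  # assumption: every site lies on a 50 bp grid

SINGLE = {
    "EcoRI": [500, 350, 50],
    "HindIII": [600, 300],
    "BamHI": [400, 300, 200],
}
# Bands are stored as sets: fragments of equal size show up as one band.
DOUBLE = {
    ("EcoRI", "HindIII"): {350, 300, 200, 50},
    ("EcoRI", "BamHI"): {300, 250, 200, 100, 50},
    ("HindIII", "BamHI"): {300, 200, 100},
}


def fragments(sites, length=LENGTH):
    """Sizes of the pieces of a linear molecule cut at the given positions."""
    edges = [0] + sorted(sites) + [length]
    return [b - a for a, b in zip(edges, edges[1:])]


def candidates(sizes):
    """All site placements that reproduce one single digest exactly."""
    grid = range(STEP, LENGTH, STEP)
    return [c for c in combinations(grid, len(sizes) - 1)
            if sorted(fragments(c)) == sorted(sizes)]


def solve():
    """Return every combination of placements consistent with all digests."""
    options = {enz: candidates(sizes) for enz, sizes in SINGLE.items()}
    maps = []
    for e in options["EcoRI"]:
        for h in options["HindIII"]:
            for b in options["BamHI"]:
                placed = {"EcoRI": e, "HindIII": h, "BamHI": b}
                if all(set(fragments(placed[x] + placed[y])) == bands
                       for (x, y), bands in DOUBLE.items()):
                    maps.append(placed)
    return maps


for restriction_map in solve():
    print(restriction_map)
```

Under these assumptions the search finds one map and its mirror image (the same molecule read from the other end): BamHI at 200 and 600, HindIII at 300, and EcoRI at 500 and 850 bp from one end.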

See also restriction enzymes for more detail about the enzymes exploited in this technique.

Rapid Denaturation and Renaturation of a crude DNA preparation by alkaline lysis of the cells and subsequent neutralization

In this technique the cells are lysed in alkaline conditions. The DNA in the mixture is denatured (strands separated) by disrupting the hydrogen bonds between the two strands. The large genomic DNA is subject to tangling and staying denatured when the pH is lowered during the neutralization. In other words, the strands come back together in a disordered fashion, basepairing randomly. The circular supercoiled plasmids' strands will stay relatively closely aligned and will renature correctly. Therefore, the genomic DNA will form an insoluble aggregate and the supercoiled plasmids will be left in solution. This can be followed by phenol extraction to remove proteins and other molecules. Then the DNA can be subjected to ethanol precipitation to concentrate the sample.


There are hundreds of known enzymes called restriction endonucleases that cleave DNA at very specific sites. For example the enzyme BamHI recognizes the sequence GGATCC and cuts the DNA between the two G's. If just one base is changed in the sequence (say GGTTCC) then the enzyme will not cut the DNA.
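
As a small illustration of this specificity, the Python sketch below looks for exact matches to GGATCC in a made-up test sequence. The function name and the sequence are hypothetical. Only one strand is searched, which is sufficient here because GGATCC reads the same on both strands.

```python
def find_sites(sequence, recognition="GGATCC"):
    """Return the 0-based start position of every exact match."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(recognition)
    while start != -1:
        positions.append(start)
        start = sequence.find(recognition, start + 1)
    return positions


# Made-up sequence containing one GGATCC and one GGTTCC.
print(find_sites("ttacGGATCCaaGGTTCCgg"))
```

Only the GGATCC at index 4 is reported; the single-base variant GGTTCC is not a match.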

The 1978 Nobel Prize in Medicine was awarded for research on Restriction Enzymes.

Restriction enzymes can cut the DNA such that they leave a 5' overhang (BamHI), a 3' overhang (PvuI), or a blunt end with no overhang (DpnI).

Restriction Enzymes can also be classified by the numbers of bases in the recognition sequence. The numbers of bases will determine the frequency of that specific sequence in an average DNA sequence. For example,

DpnI recognizes a 4 bp sequence, which would occur on average once every 4^4 = 256 bp.

PvuI recognizes a 6 bp sequence, which would occur on average once every 4^6 = 4,096 bp.

NotI recognizes an 8 bp sequence, which would occur on average once every 4^8 = 65,536 bp.
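
These figures follow from a simple model, sketched here in Python, in which all four bases are equally likely and independent at every position. That is an idealization, since real genomes have biased base composition, and the function name is made up for this example.

```python
def expected_spacing(site_length, alphabet_size=4):
    """Average distance in bp between occurrences of one specific site,
    assuming the four bases are equally likely and independent."""
    return alphabet_size ** site_length


for enzyme, n in (("DpnI", 4), ("PvuI", 6), ("NotI", 8)):
    print(f"{enzyme}: {n} bp site, about once every {expected_spacing(n):,} bp")
```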

A DNA sequence can be run through a program that will identify these sites in the DNA. The results from this program will show all of the known sites in a given DNA sequence that are cut by restriction enzymes. This "restriction map" is very useful in designing cloning strategies, and in developing diagnostic assays.

We can either use the program TACG directly to generate a restriction map, or use the Biology WorkBench.

Using Biology WorkBench to generate a restriction map or translation

Log onto the Biology WorkBench and either create a new session or resume an existing session. Select Nucleic Tools.

Next select the DNA sequence that you would like to analyze. Push TACG.

You will then be given a screen allowing you to decide which enzymes to use based on numbers of bases in sequence, type of overhang, etc. You can type in the names of individual restriction enzymes if you only want to look at a few. The default is all enzymes. The default will also give you a translation in all six reading frames.

Once you have selected the parameters you wish to use, submit your sequence for analysis.

For information on interpreting the results, see Interpretation of Results below.

Using TACG to generate a restriction map

2. In the box below the heading Sequence Entry, paste in your DNA sequence. Be sure there are no blank spaces, returns, or letters other than a, c, g or t.

3. Try your first run with the default settings. Scroll down to the button Submit to WWWtacg.

4. Your results will come back in several formats: as a table of enzymes that cut and do not cut, and as a map showing the location of the cut sites.

5. If you wish to see only where a few specific enzymes cut the DNA, go back to the submission page. Under Restriction Enzyme Selection by..., push the button next to ...Explicit Pick from List. Now scroll up and select the desired enzyme(s).

6. The program will also perform several other calculations on your sequence if you select that feature. Note that you can also tell the computer if you are digesting linear or circular DNA.

      • Summary Table displays the names of all enzymes that do not cut the DNA sequence and the number of times that other enzymes do cut the DNA.
      • GCG-like Ladder Map shows the positions of each restriction site on a graphical map of the DNA.
      • Pseudo Gel Map shows a diagram of the fragments that would be generated if you cut the DNA with the indicated enzyme and ran the DNA on an agarose gel.
      • Table of Cut Sites displays a list of each enzyme and the position(s) at which it will cut the DNA.
      • Table of Fragments gives the sizes of fragments generated by digesting the DNA with each enzyme.

      7. Scroll down to the button Submit to WWWtacg
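
The linear versus circular setting matters because it changes the number and sizes of the fragments. A minimal Python sketch with hypothetical cut positions on a 1,000 bp molecule follows. It is not how TACG itself is implemented, only the underlying arithmetic.

```python
def digest(cut_positions, length, circular=False):
    """Fragment sizes (largest first) for a molecule cut at the given
    positions, measured in bp from an arbitrary origin."""
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [length]
    if circular:
        # n cuts in a circle give n fragments; the last one wraps around.
        sizes = [b - a for a, b in zip(cuts, cuts[1:])]
        sizes.append(length - cuts[-1] + cuts[0])
    else:
        # n cuts in a linear molecule give n + 1 fragments.
        edges = [0] + cuts + [length]
        sizes = [b - a for a, b in zip(edges, edges[1:])]
    return sorted(sizes, reverse=True)


# Hypothetical cut positions, for illustration only.
print(digest([200, 500], 1000))                 # linear
print(digest([200, 500], 1000, circular=True))  # circular
```

With the same two sites, the linear molecule gives three fragments (500, 300 and 200 bp) while the circular one gives two (700 and 300 bp).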

      Interpretation of Results

      GCG-like Ladder Map: The program draws a slash at each site that will be cut by the indicated enzyme.

            • The "/" on BsgI indicates that a 3' overhang is generated.
            • The "\" on BspMI indicates that a 5' overhang is generated.
            • The "!" on BsrBI indicates that a blunt end is generated.

            Table of Fragments: The program gives the predicted sizes of any fragments generated by digestion with each enzyme.

            Table of Cut Sites: The program gives the predicted recognition sites for each enzyme.

            For example, StuI cuts the DNA at positions 25, 88 and 771.

              • The " ' " symbol indicates where the enzyme will cut the DNA on that strand.
              • The " _ " symbol indicates where the enzyme will cut the DNA on the complementary strand.
                • StuI cuts both strands in the same place, generating a blunt end.
                • AflII cuts both strands towards the 5' end of the recognition sequence, leaving a 5' overhang.
                • NsiI cuts both strands towards the 3' end of the recognition sequence, leaving a 3' overhang.
                • Bce83I and Eco57I both cut the DNA at sites 14 bp from the recognition sequence.

                Pseudo Gel Map: Draws a diagram of the DNA sample digested with each enzyme. The dots each correspond to 100 bp. Fragments smaller than 100 bp are not distinguished. The top of the gel is to the right. For example AceIII generates fragments of approximately 150 bp and 750 bp.


                Terms and concepts

                • Linearize
                • Molecular weight marker. What marker will you use on your plasmid digest gel?
                • Restriction enzyme (endonuclease)
                • Restriction fragments
                • Restriction map. Given a restriction map, you should be able to determine what the gel result would be when DNA is cut with particular restriction enzymes.
                • Restriction site
                • Supercoiled DNA

                Review questions

                1. How does a restriction digest help us see the difference between pARO180 and pGLO?
                2. What is the difference between a nucleotide sequence and a DNA map?
                3. On Restriction Analyzer, why does it matter if you check the "linear" or "circular" box?
                4. Suppose you have a linear piece of DNA. You cut it with Hind III, and you get one fragment of 800 bp and one of 200 bp. Draw a restriction map of this DNA.
                5. Is your purified plasmid double-stranded DNA or single-stranded? Relate this to the sequence shown above.
                6. When you are going to perform a restriction digest, how do you determine how much DNA you need to start with in order to see all the bands on your gel? (This is a quantitative question. On a test, I could ask you how much DNA you need to use. What information would you need to answer this question?)
                7. In Bio 6B, you use two different approaches for analyzing your plasmid DNA: the total nucleic acid gel and the restriction digest gel. What are the advantages and disadvantages of each?
                8. Why does it matter if the DNA on a gel is supercoiled?

                Genetic mapping:

                Genetic mapping is entirely different from physical mapping. The genetic mapping method depends on the number of markers used and the size of the study population.

                Genetic mapping shows the positions on a chromosome of genes or DNA sequences related to a phenotype or disease, using techniques such as pedigree analysis or cross-hybridization.

                (In physical mapping, by contrast, the DNA sequence related to the disease or phenotype is examined directly using molecular genetic techniques.)

                Thus a genetic map shows the relative position of a particular phenotype on a chromosome or genome.

                To construct a genetic map, scientists collect blood samples from family members who have a disease trait and from family members who do not.

                DNA is then extracted from the collected samples, and the DNA of family members with the disease trait is compared with that of members without it.

                The results are developed into a marker for that particular trait or disease.

                The genetic map indicates the location on a chromosome of the genotype related to a particular phenotype.

                In the early days, genetic maps were used only for visible or distinguishable characters in the fruit fly, such as eye colour, body colour, height and wing shape.

                Now, genetic mapping is used for visible as well as biochemical characteristics, and a disease-causing gene can be mapped on a chromosome.

                The STR (short tandem repeat) is one of the useful microsatellite markers used in genetic mapping.

                Figure: graphical illustration of the mapping of different traits on a fruit fly chromosome.

                Besides this, several other markers such as SSCP, VNTR, SNP, RFLP and AFLP are also used in genetic or gene mapping.

                STRs (microsatellites) are frequently used in genetic mapping in preference to minisatellites, because minisatellites are concentrated in the telomeric regions of chromosomes and so cannot give an accurate map.

                Construction of a SNP-based high-density genetic map for pummelo using RAD sequencing

                Pummelo (Citrus grandis) is one of the most important gene pools for citrus breeding programmes. A high-density linkage map is a valuable tool for functional genomics and genetic breeding studies. A newly developed genome sequence-based marker technology, restriction site-associated DNA (RAD) sequencing, has proven to be powerful for the rapid discovery and genotyping of genome-wide SNP markers and for high-density genetic map construction. We present the construction of a high-density genetic map of pummelo using RAD sequencing. An F1 population of 124 individuals and its parents (‘Pingshan’ pummelo and ‘Guanxi’ pummelo) were used for the map construction. A total of 1,543 high-quality single nucleotide polymorphism (SNP) markers were developed and mapped to nine linkage groups. In addition, 20 simple sequence repeat (SSR) markers were included and showed general consistency with the SNP markers. These 1,563 markers constituted a total genetic length of 976.58 cM, with an average of 0.62 cM between adjacent loci. The number of markers within each linkage group (LG) ranged from 81 (for LG4) to 285 (for LG2). A comparison of the genetic map with the published sweet orange (Citrus sinensis) genome revealed both conservation and variation. The alignment of the LGs from this map was also compared with a previously published genetic linkage map of pummelo. This study showed that RAD sequencing allowed the rapid discovery of a large number of SNPs in pummelo. The SNP-based high-density genetic map for pummelo was successfully generated using these SNP markers. The completed genetic map is a valuable resource for further pummelo genetic studies and provides useful information for gene positional cloning, MAS breeding and C. grandis genome assembly.



                Prior to the 1970s, the understanding of genetics and molecular biology was severely hampered by an inability to isolate and study individual genes from complex organisms. This changed dramatically with the advent of molecular cloning methods. Microbiologists, seeking to understand the molecular mechanisms through which bacteria restricted the growth of bacteriophage, isolated restriction endonucleases, enzymes that could cleave DNA molecules only when specific DNA sequences were encountered. [6] They showed that restriction enzymes cleaved chromosome-length DNA molecules at specific locations, and that specific sections of the larger molecule could be purified by size fractionation. Using a second enzyme, DNA ligase, fragments generated by restriction enzymes could be joined in new combinations, termed recombinant DNA. By recombining DNA segments of interest with vector DNA, such as bacteriophage or plasmids, which naturally replicate inside bacteria, large quantities of purified recombinant DNA molecules could be produced in bacterial cultures. The first recombinant DNA molecules were generated and studied in 1972. [7] [8]

                Molecular cloning takes advantage of the fact that the chemical structure of DNA is fundamentally the same in all living organisms. Therefore, if any segment of DNA from any organism is inserted into a DNA segment containing the molecular sequences required for DNA replication, and the resulting recombinant DNA is introduced into the organism from which the replication sequences were obtained, then the foreign DNA will be replicated along with the host cell's DNA in the transgenic organism.

                Molecular cloning is similar to polymerase chain reaction (PCR) in that it permits the replication of DNA sequence. The fundamental difference between the two methods is that molecular cloning involves replication of the DNA in a living microorganism, while PCR replicates DNA in an in vitro solution, free of living cells.

                In standard molecular cloning experiments, the cloning of any DNA fragment essentially involves seven steps: (1) Choice of host organism and cloning vector, (2) Preparation of vector DNA, (3) Preparation of DNA to be cloned, (4) Creation of recombinant DNA, (5) Introduction of recombinant DNA into host organism, (6) Selection of organisms containing recombinant DNA, (7) Screening for clones with desired DNA inserts and biological properties.

                Although the detailed planning of the cloning can be done in any text editor, together with online utilities for e.g. PCR primer design, dedicated software exists for the purpose. Examples include ApE [1] (open source), DNAStrider [2] (open source), Serial Cloner [3] (gratis) and Collagene [4] (open source).

                Notably, the growing capacity and fidelity of DNA synthesis platforms allow for increasingly intricate designs in molecular engineering. These projects may include very long strands of novel DNA sequence and/or test entire libraries simultaneously, as opposed to individual sequences. These shifts introduce complexity that requires design to move away from the flat nucleotide-based representation and towards a higher level of abstraction. Examples of such tools are GenoCAD, Teselagen [5] (free for academia) or GeneticConstructor [6] (free for academics).

                Choice of host organism and cloning vector

                Although a very large number of host organisms and molecular cloning vectors are in use, the great majority of molecular cloning experiments begin with a laboratory strain of the bacterium E. coli (Escherichia coli) and a plasmid cloning vector. E. coli and plasmid vectors are in common use because they are technically sophisticated, versatile, widely available, and offer rapid growth of recombinant organisms with minimal equipment. [3] If the DNA to be cloned is exceptionally large (hundreds of thousands to millions of base pairs), then a bacterial artificial chromosome [10] or yeast artificial chromosome vector is often chosen.

                Specialized applications may call for specialized host-vector systems. For example, if the experimentalists wish to harvest a particular protein from the recombinant organism, then an expression vector is chosen that contains appropriate signals for transcription and translation in the desired host organism. Alternatively, if replication of the DNA in different species is desired (for example, transfer of DNA from bacteria to plants), then a multiple host range vector (also termed shuttle vector) may be selected. In practice, however, specialized molecular cloning experiments usually begin with cloning into a bacterial plasmid, followed by subcloning into a specialized vector.

                Whatever combination of host and vector are used, the vector almost always contains four DNA segments that are critically important to its function and experimental utility: [3]

                • a DNA replication origin, which is necessary for the vector (and its linked recombinant sequences) to replicate inside the host organism
                • one or more unique restriction endonuclease recognition sites to serve as sites where foreign DNA may be introduced
                • a selectable genetic marker gene that can be used to enable the survival of cells that have taken up vector sequences
                • a tag gene that can be used to screen for cells containing the foreign DNA

                Preparation of vector DNA

                The cloning vector is treated with a restriction endonuclease to cleave the DNA at the site where foreign DNA will be inserted. The restriction enzyme is chosen to generate a configuration at the cleavage site that is compatible with the ends of the foreign DNA (see DNA end). Typically, this is done by cleaving the vector DNA and foreign DNA with the same restriction enzyme, for example EcoRI. Most modern vectors contain a variety of convenient cleavage sites that are unique within the vector molecule (so that the vector can only be cleaved at a single site) and are located within a gene (frequently beta-galactosidase) whose inactivation can be used to distinguish recombinant from non-recombinant organisms at a later step in the process. To improve the ratio of recombinant to non-recombinant organisms, the cleaved vector may be treated with an enzyme (alkaline phosphatase) that dephosphorylates the vector ends. Vector molecules with dephosphorylated ends cannot be rejoined to themselves by ligation and are therefore unable to replicate; replication can only be restored if foreign DNA is integrated into the cleavage site. [11]

                Preparation of DNA to be cloned

                For cloning of genomic DNA, the DNA to be cloned is extracted from the organism of interest. Virtually any tissue source can be used (even tissues from extinct animals), [12] as long as the DNA is not extensively degraded. The DNA is then purified using simple methods to remove contaminating proteins (extraction with phenol), RNA (ribonuclease) and smaller molecules (precipitation and/or chromatography). Polymerase chain reaction (PCR) methods are often used for amplification of specific DNA or RNA (RT-PCR) sequences prior to molecular cloning.

                DNA for cloning experiments may also be obtained from RNA using reverse transcriptase (complementary DNA or cDNA cloning), or in the form of synthetic DNA (artificial gene synthesis). cDNA cloning is usually used to obtain clones representative of the mRNA population of the cells of interest, while synthetic DNA is used to obtain any precise sequence defined by the designer. Such a designed sequence may be required when moving genes across genetic codes (for example, from the mitochondria to the nucleus) [13] or simply for increasing expression via codon optimization. [14]

                The purified DNA is then treated with a restriction enzyme to generate fragments with ends capable of being linked to those of the vector. If necessary, short double-stranded segments of DNA (linkers) containing desired restriction sites may be added to create end structures that are compatible with the vector. [3] [11]

                Creation of recombinant DNA with DNA ligase

                The creation of recombinant DNA is in many ways the simplest step of the molecular cloning process. DNA prepared from the vector and foreign source are simply mixed together at appropriate concentrations and exposed to an enzyme (DNA ligase) that covalently links the ends together. This joining reaction is often termed ligation. The resulting DNA mixture containing randomly joined ends is then ready for introduction into the host organism.

                DNA ligase only recognizes and acts on the ends of linear DNA molecules, usually resulting in a complex mixture of DNA molecules with randomly joined ends. The desired products (vector DNA covalently linked to foreign DNA) will be present, but other sequences (e.g. foreign DNA linked to itself, vector DNA linked to itself and higher-order combinations of vector and foreign DNA) are also usually present. This complex mixture is sorted out in subsequent steps of the cloning process, after the DNA mixture is introduced into cells. [3] [11]

                Introduction of recombinant DNA into host organism

                The DNA mixture, previously manipulated in vitro, is moved back into a living cell, referred to as the host organism. The methods used to get DNA into cells are varied, and the name applied to this step in the molecular cloning process will often depend upon the experimental method that is chosen (e.g. transformation, transduction, transfection, electroporation). [3] [11]

                When microorganisms are able to take up and replicate DNA from their local environment, the process is termed transformation, and cells that are in a physiological state such that they can take up DNA are said to be competent. [15] In mammalian cell culture, the analogous process of introducing DNA into cells is commonly termed transfection. Both transformation and transfection usually require preparation of the cells through a special growth regime and chemical treatment process that will vary with the specific species and cell types that are used.

                Electroporation uses high voltage electrical pulses to translocate DNA across the cell membrane (and cell wall, if present). [16] In contrast, transduction involves the packaging of DNA into virus-derived particles, and using these virus-like particles to introduce the encapsulated DNA into the cell through a process resembling viral infection. Although electroporation and transduction are highly specialized methods, they may be the most efficient methods to move DNA into cells.

                Selection of organisms containing vector sequences

                Whichever method is used, the introduction of recombinant DNA into the chosen host organism is usually a low-efficiency process; that is, only a small fraction of the cells will actually take up DNA. Experimental scientists deal with this issue through a step of artificial genetic selection, in which cells that have not taken up DNA are selectively killed, and only those cells that can actively replicate DNA containing the selectable marker gene encoded by the vector are able to survive. [3] [11]

                When bacterial cells are used as host organisms, the selectable marker is usually a gene that confers resistance to an antibiotic that would otherwise kill the cells, typically ampicillin. Cells harboring the plasmid will survive when exposed to the antibiotic, while those that have failed to take up plasmid sequences will die. When mammalian cells (e.g. human or mouse cells) are used, a similar strategy is used, except that the marker gene (in this case typically encoded as part of the kanMX cassette) confers resistance to the antibiotic Geneticin.

                Screening for clones with desired DNA inserts and biological properties

                Modern bacterial cloning vectors (e.g. pUC19 and later derivatives including the pGEM vectors) use the blue-white screening system to distinguish colonies (clones) of transgenic cells from those that contain the parental vector (i.e. vector DNA with no recombinant sequence inserted). In these vectors, foreign DNA is inserted into a sequence that encodes an essential part of beta-galactosidase, an enzyme whose activity results in formation of a blue-colored colony on the culture medium that is used for this work. Insertion of the foreign DNA into the beta-galactosidase coding sequence disables the function of the enzyme so that colonies containing transformed DNA remain colorless (white). Therefore, experimentalists are easily able to identify and conduct further studies on transgenic bacterial clones, while ignoring those that do not contain recombinant DNA.

                The total population of individual clones obtained in a molecular cloning experiment is often termed a DNA library. Libraries may be highly complex (as when cloning complete genomic DNA from an organism) or relatively simple (as when moving a previously cloned DNA fragment into a different plasmid), but it is almost always necessary to examine a number of different clones to be sure that the desired DNA construct is obtained. This may be accomplished through a very wide range of experimental methods, including the use of nucleic acid hybridizations, antibody probes, polymerase chain reaction, restriction fragment analysis and/or DNA sequencing. [3] [11]

                Molecular cloning provides scientists with an essentially unlimited quantity of any individual DNA segments derived from any genome. This material can be used for a wide range of purposes, including those in both basic and applied biological science. A few of the more important applications are summarized here.

                Genome organization and gene expression

                Molecular cloning has led directly to the elucidation of the complete DNA sequence of the genomes of a very large number of species and to an exploration of genetic diversity within individual species, work that has been done mostly by determining the DNA sequence of large numbers of randomly cloned fragments of the genome, and assembling the overlapping sequences.

                At the level of individual genes, molecular clones are used to generate probes that are used for examining how genes are expressed, and how that expression is related to other processes in biology, including the metabolic environment, extracellular signals, development, learning, senescence and cell death. Cloned genes can also provide tools to examine the biological function and importance of individual genes, by allowing investigators to inactivate the genes, or make more subtle mutations using regional mutagenesis or site-directed mutagenesis. Genes cloned into expression vectors for functional cloning provide a means to screen for genes on the basis of the expressed protein's function.

                Production of recombinant proteins

                Obtaining the molecular clone of a gene can lead to the development of organisms that produce the protein product of the cloned genes, termed a recombinant protein. In practice, it is frequently more difficult to develop an organism that produces an active form of the recombinant protein in desirable quantities than it is to clone the gene. This is because the molecular signals for gene expression are complex and variable, and because protein folding, stability and transport can be very challenging.

                Many useful proteins are currently available as recombinant products. These include: (1) medically useful proteins whose administration can correct a defective or poorly expressed gene (e.g. recombinant factor VIII, a blood-clotting factor deficient in some forms of hemophilia, [17] and recombinant insulin, used to treat some forms of diabetes [18] ), (2) proteins that can be administered to assist in a life-threatening emergency (e.g. tissue plasminogen activator, used to treat strokes [19] ), (3) recombinant subunit vaccines, in which a purified protein can be used to immunize patients against infectious diseases, without exposing them to the infectious agent itself (e.g. hepatitis B vaccine [20] ), and (4) recombinant proteins as standard material for diagnostic laboratory tests.

                Transgenic organisms

                Once characterized and manipulated to provide signals for appropriate expression, cloned genes may be inserted into organisms, generating transgenic organisms, also termed genetically modified organisms (GMOs). Although most GMOs are generated for purposes of basic biological research (see for example, transgenic mouse), a number of GMOs have been developed for commercial use, ranging from animals and plants that produce pharmaceuticals or other compounds (pharming) to herbicide-resistant crop plants and fluorescent tropical fish (GloFish) for home entertainment. [1]

                Gene therapy

                Gene therapy involves supplying a functional gene to cells lacking that function, with the aim of correcting a genetic disorder or acquired disease. Gene therapy can be broadly divided into two categories. The first is alteration of germ cells, that is, sperm or eggs, which results in a permanent genetic change for the whole organism and subsequent generations. This “germ line gene therapy” is considered by many to be unethical in human beings. [21] The second type of gene therapy, “somatic cell gene therapy”, is analogous to an organ transplant. In this case, one or more specific tissues are targeted by direct treatment or by removal of the tissue, addition of the therapeutic gene or genes in the laboratory, and return of the treated cells to the patient. Clinical trials of somatic cell gene therapy began in the late 1990s, mostly for the treatment of cancers and blood, liver, and lung disorders. [22]

                Despite a great deal of publicity and promises, the history of human gene therapy has been characterized by relatively limited success. [22] The effect of introducing a gene into cells often promotes only partial and/or transient relief from the symptoms of the disease being treated. Some gene therapy trial patients have suffered adverse consequences of the treatment itself, including deaths. In some cases, the adverse effects result from disruption of essential genes within the patient's genome by insertional inactivation. In others, viral vectors used for gene therapy have been contaminated with infectious virus. Nevertheless, gene therapy is still held to be a promising future area of medicine, and is an area where there is a significant level of research and development activity.

                Plasmid Mapping Process

                A plasmid is an extrachromosomal, circular DNA molecule that can replicate independently of the chromosome. Plasmid mapping is a form of DNA mapping: it locates the restriction sites on a plasmid.

                Mapping relies on restriction endonucleases, enzymes found in bacteria, to cut the DNA into fragments. Each enzyme cuts at a specific recognition site in the DNA molecule, and the cut may leave a sticky end, i.e. an overhang. EcoRI, for example, cuts its site 5′-G^AATTC-3′ after the G and leaves a 5′ AATT overhang.
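The cutting rule just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `digest_linear` and the toy sequence are made up for the example, only the top strand of a linear molecule is modelled, and the overhang on the complementary strand is ignored.

```python
def digest_linear(seq, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at every occurrence of `site`.

    cut_offset is the number of bases of the site that stay on the
    left-hand fragment (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # whatever remains after the last cut
    return fragments


# Hypothetical toy sequence containing two EcoRI sites.
toy = "ATATGAATTCGGCCGAATTCTT"
print(digest_linear(toy))  # ['ATATG', 'AATTCGGCCG', 'AATTCTT']
```

Two sites in a linear molecule give three fragments; each fragment to the right of a cut begins with the AATTC left behind by the enzyme.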

                Why plasmid mapping? As an example, suppose a plasmid carries two genes, one for ampicillin resistance and one for tetracycline resistance, and the target DNA is an insulin gene. The plasmid is cut within the ampicillin-resistance gene, and EcoRI cuts on either side of the target gene, leaving overhangs. The target gene is placed in the plasmid, the joins are sealed by DNA ligase, and the recombinant plasmid is introduced into bacteria, which copy it as they divide.

                The bacteria are cultured on plates to determine which ones carry the gene of interest. On agar plus ampicillin, bacteria carrying the recombinant plasmid will not grow, because the insert disrupts the ampicillin-resistance gene. On plain agar everything grows, both the bacteria that picked up the plasmid and those that did not. On agar plus tetracycline, bacteria that picked up the plasmid grow because the plasmid confers tetracycline resistance; a bacterium that picked up the recombinant plasmid has also picked up the inserted gene, for example the one that makes insulin.

                The DNA fragments are then separated by gel electrophoresis (Fig 1), which means running them in an electric field. The molecules move from the negative electrode toward the positive electrode and are separated by size, because the smallest fragments move fastest. DNA fragments must be separated from each other to determine the pattern of cuts. For DNA mapping, more than one restriction enzyme is used, and the distances the bands travel in the gel for the different enzymes are used to construct a map.

                Fig 1:
                Image showing purification of DNA using Gel Electrophoresis

                The experimental data below show the number of base pairs per band for EcoRI alone, for HindIII alone, and for both enzymes used together (a double digest). We want to find where these enzymes cut the plasmid.

                Enzyme            Base pairs per band
                EcoRI             30, 15, 5
                HindIII           50
                HindIII + EcoRI   20, 15, 10, 5

                My plasmid is 50 bp long, so I drew a circle representing 50 bp. The fragments in each digest also sum to 50, the size of the original piece of DNA. I start with the enzyme that cuts the plasmid most often, EcoRI: starting at 12 o’clock on the map, I measure 30, 15 and 5 around the circle and mark a cut at each boundary. To find where HindIII cuts, I looked at the double digest and found which fragment sizes are repeated from the EcoRI digest, i.e. 15 and 5. HindIII leaves these two fragments intact, so that area is highlighted below.

                Now look to the other side: the 30 bp EcoRI fragment is missing from the double digest and is replaced by 20 and 10. Mark 20 and 10 within it; the cut between them, shown in red, is where HindIII cuts the plasmid. Thus you have a plasmid map.

                Fig 2:
                Circular restriction map, read from 12 o’clock, showing where the restriction enzymes cut the plasmid.
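One way to check a proposed map is to recompute the fragment sizes it predicts and compare them with the gel data. The sketch below does this for the 50 bp plasmid; the function name and the coordinate convention (first EcoRI site at position 0, i.e. 12 o'clock) are assumptions made for the example, not part of the original write-up.

```python
def circular_fragments(cut_positions, plasmid_size):
    """Fragment sizes (largest first) produced by cuts on a circular molecule."""
    cuts = sorted(set(p % plasmid_size for p in cut_positions))
    if not cuts:
        return [plasmid_size]  # an uncut circle stays in one piece
    sizes = []
    for i, cut in enumerate(cuts):
        nxt = cuts[(i + 1) % len(cuts)]
        # Distance to the next cut going round the circle; a single cut
        # gives 0 here, which means one linear piece of full length.
        sizes.append((nxt - cut) % plasmid_size or plasmid_size)
    return sorted(sizes, reverse=True)


PLASMID = 50
ecori = [0, 30, 45]   # cuts measured as 30, 15, 5 from 12 o'clock
hindiii = [20]        # assumed placement inside the 30 bp EcoRI fragment

print(circular_fragments(ecori, PLASMID))            # [30, 15, 5]
print(circular_fragments(hindiii, PLASMID))          # [50]
print(circular_fragments(ecori + hindiii, PLASMID))  # [20, 15, 10, 5]
```

Placing HindIII at position 10 instead of 20 predicts the same three band patterns, so these digests alone do not say which end of the 30 bp fragment the HindIII site is nearer to.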

                About Me

                I studied Biochemistry at the University of Lagos, Nigeria. As my first degree as a graduate, I plan to advance my career in the near future in the Biotechnology field. I love creativity and learning new things, and want to be a part of an evolution in life sciences.

                Results and discussion

                Selection of suitable restriction enzymes for RAD sequencing library construction

                In this study, we did not sequence the whole genome of all F1 plants; rather, we sequenced the two ends of the 300- to 400-bp RAD tags to simplify the grape genome and increase sequencing efficiency. Thus, selection of a suitable restriction enzyme for DNA digestion was key. Theoretically, two characteristics are required for an appropriate restriction enzyme: 1) because the NGS technology can only cover 75 to 100 bp of DNA at each end concurrently, the enzyme must be able to digest the genome of interest to a suitable size (e.g. 300–400 bp); 2) the number of digested fragments of the expected size should be sufficient for subsequent manipulation (100,000–150,000 RAD tags). The V. vinifera Pinot noir PN40024 genome sequence was taken as the reference to search for an appropriate restriction enzyme.

                Thirty restriction enzymes showed great differences in their recognition sites (data not shown). One restriction enzyme, MseI, which recognizes a 4-nucleotide site (T/TAA), was predicted to produce 149,921 digested DNA fragments of 300–400 bp from the grape genome, suiting our requirements. The distribution of predicted digestion sites for this restriction enzyme is shown in Figure 1. Based on these results, we selected MseI as the restriction enzyme to construct the DNA sequencing library.

                Distribution of the restriction enzyme MseI's predicted digestion sites. X axis indicates the size of the digested fragments; Y axis indicates the number of fragments.
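The in-silico enzyme selection described above amounts to cutting a reference sequence at every recognition site and counting the fragments that fall in the wanted size range. The following is a minimal sketch of that idea, not the authors' pipeline: the function names are hypothetical, the sequence is a made-up toy rather than the PN40024 genome, and the inclusive 300–400 bp window is an assumption.

```python
def fragment_sizes(seq, site="TTAA", cut_offset=1):
    """Sizes of the fragments left after cutting a linear sequence at every site.

    cut_offset=1 models MseI, which cuts T^TAA.
    """
    sizes, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        sizes.append(cut - start)
        start = cut
        pos = seq.find(site, pos + 1)
    sizes.append(len(seq) - start)  # final fragment after the last cut
    return sizes


def count_in_window(sizes, lo=300, hi=400):
    """Number of fragments whose size lies in [lo, hi] bp."""
    return sum(lo <= s <= hi for s in sizes)


# Toy "genome" with three MseI sites (hypothetical data).
toy = "G" * 10 + "TTAA" + "G" * 346 + "TTAA" + "G" * 50 + "TTAA" + "G" * 5
sizes = fragment_sizes(toy)
print(sizes)                   # [11, 350, 54, 8]
print(count_in_window(sizes))  # 1
```

Running the same count for each candidate enzyme's site on a real reference sequence would give the kind of per-enzyme fragment totals the text compares.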

                SNP markers and their characteristics

                Once the DNA of the F1 individuals and their parents had been treated with MseI, all samples were genotyped by high-throughput sequencing. In total, 16 G of raw sequence data containing 117,084,991 paired-end (PE) reads was obtained, with each read being 70 bp in length. To avoid sequence errors, only reads showing < 5 bases with Q score > 20 were further analyzed. Of these high-quality data, 149 Mb were from one of the parents, Beihong, with 2,136,496 reads, and 148 Mb were from Z180 with 2,126,872 reads. To assign these reads to their corresponding loci, a cluster strategy was used for the two parents' data (described in Materials and Methods). As the grape genome harbors a large number of repeat sequences [30, 31], these might affect the coverage calculation and lead to misidentification of polymorphisms. To overcome this obstacle, clusters containing highly redundant reads were excluded (clusters with > 200 reads), which removed the repeat sequences from the data. Clusters with a low number of reads were also excluded due to little coverage of the loci (clusters with < 5 reads). Finally, 37,871,193 high-quality reads without repeat sequences were retained, and were assigned to 80,709 clusters for the whole F1 population (Table 1). Thus we obtained 80,709 valid loci representing the whole grape genome. This number was less than the expected number of digested fragments (100,000–150,000); however, it excluded the repeat sequences and thus roughly corresponded to the in-silico digestion result. Further calculation indicated that the coverage of these loci was 469-fold at the population level (number of valid reads: 37,871,193 per number of clusters: 80,709). With the aim of screening polymorphisms for these 80,709 clusters, a strict in-silico procedure was carried out for SNP identification (described in Materials and Methods). In total, 21,599 clusters showed more than one genotype according to their sequence diversity in the whole F1 population (Table 1). This indicated an average 26.8% polymorphism rate for the F1 population. A total of 11,144,665 reads were involved in these polymorphic loci and thus the average coverage was 516-fold at the population level. In addition, we calculated the polymorphic loci for each F1 plant and its parents. According to Figure 2, we obtained an average of 12,840 reads involved in the polymorphic loci and thus a 17.0-fold coverage per cluster for each individual. The number of reads involved in the polymorphic loci ranged from 10,912 to 13,649 and the coverage ranged from 7.7- to 41.5-fold (Figure 2).

                Valid read number and coverage for each plant in the F1 population and their parents. The X axis in a and b indicates the plant accession, including the two parents and their average; the Y axis in a indicates read number, and in b, cluster (locus) coverage.
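The population-level figures quoted above follow directly from the read and cluster counts. The short check below reproduces them; the variable names are chosen for the example, and coverage is taken simply as reads divided by clusters, as the text states.

```python
# Counts quoted in the text (Table 1 of the source).
valid_reads = 37_871_193
clusters = 80_709
polymorphic_clusters = 21_599
polymorphic_reads = 11_144_665

# Coverage here means reads per cluster, pooled over the whole population.
population_coverage = valid_reads / clusters
polymorphism_rate = polymorphic_clusters / clusters
polymorphic_coverage = polymorphic_reads / polymorphic_clusters

print(f"coverage of all loci:         {population_coverage:.0f}-fold")   # 469-fold
print(f"polymorphism rate:            {polymorphism_rate:.1%}")          # 26.8%
print(f"coverage of polymorphic loci: {polymorphic_coverage:.0f}-fold")  # 516-fold
```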

                As already noted, the main advantages of NGS technology are low cost and high throughput. However, it also has a very serious disadvantage in its high probability of sequence error [32]. To overcome this problem, high coverage of a specific sequence must be obtained. We digested the DNA and only then sequenced the RAD tags, greatly reducing the size of the genome. Jaillon et al. (2007) reported that the grapevine genome sequence is 470 Mb. During the genotyping of our 102 plants, we only manipulated 80,709 valid clusters and each contained a 70-bp sequence. Thus the grape genome was simplified to 5.65 Mb (80,709 × 70 bp). This amounts to an 83-fold reduction compared with the original 470 Mb reference genome, resulting in the requirement of very little data to achieve high coverage. According to our data, the average coverage for each tag was 17.0-fold in an individual plant. Moreover, because all sequence tags were from the two parents, Beihong and Z180, the number of alleles for each locus was ≤ 4. The total coverage for each tag at the population level was 469-fold, leading us to adjust the SNPs in some loci where their coverage in an individual plant was insufficient. In addition, with these and subsequent strict criteria, we found that the coverage of clusters corresponding to final SNP markers on the genetic map was almost always greater than 7-fold in an individual plant; only 24 showed 5- to 7-fold coverage. Based on the above analyses, we concluded that the applied strategy provides high-throughput and high-quality identification of SNPs.
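The genome-reduction figure can be checked with the same kind of arithmetic. The variable names below are illustrative, and 1 Mb is taken as 1,000,000 bp.

```python
valid_clusters = 80_709
tag_length_bp = 70
reference_genome_mb = 470

# Total sequence actually interrogated by the RAD tags.
reduced_genome_mb = valid_clusters * tag_length_bp / 1_000_000
fold_reduction = reference_genome_mb / reduced_genome_mb

print(f"reduced genome: {reduced_genome_mb:.2f} Mb")  # 5.65 Mb
print(f"reduction:      {fold_reduction:.0f}-fold")   # 83-fold
```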

                There were a number of possible patterns for the polymorphic markers in an F1 population (ab × cd, ef × eg, hk × hk, lm × ll, nn × np and aa × bb). However, the last pattern, aa × bb, could not be applied to the genetic map construction due to its lack of segregation in our F1 population, even though it probably constituted the largest proportion of all marker types. Thus, calculation of the segregating patterns for all loci would be necessary before a linkage map could be constructed. In addition, despite a high average coverage for the predicted RAD tag clusters, there were still a number of RAD tag clusters with low coverage in some F1 plants. To increase the accuracy of our data, only the clusters showing three-fold or greater coverage in > 80% of the F1 plants were used for subsequent development of SNP markers. We screened all 21,599 polymorphic clusters based on the above criteria and obtained 1,814 valid SNP markers with segregating patterns of ab × cd, ef × eg, hk × hk, lm × ll or nn × np (note that if two polymorphic clusters came from the same MseI-digested fragment, they were regarded as one marker). In addition to the coverage of the sequence data, the integrity for each locus among these 100 F1 individuals and their two parents was a key parameter in controlling map quality. We therefore investigated the data on missing rate for these plants, and found full integrity for the two parents, Z180 and Beihong, and 92.3% integrity on average for the 100 F1 plants. For a single SNP marker, the lowest integrity was 85%, meeting the requirement for LG construction. Of these 1,814 SNP markers, 1,545 were homozygous for one parent and heterozygous for the other (960 for lm × ll and 585 for nn × np), constituting 85.2% of all selected SNP markers. However, the other three types of markers that could be mapped on both female and male linkage maps only amounted to 14.8% (ab × cd: 77, ef × eg: 171 and hk × hk: 21). This indicated that at most, 269 SNP markers could be used as shared markers for the integration of the two parents’ maps into one.
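The segregation patterns named above can be told apart from the two parental genotypes alone. The sketch below is an illustration, not the authors' procedure: the function name is hypothetical, genotypes are assumed to be two-letter allele strings with the first parent listed first, and combinations not covered by the listed patterns are returned as "other".

```python
def segregation_type(parent1, parent2):
    """Classify a locus from two parental genotypes such as 'AG' and 'AA'."""
    het1 = parent1[0] != parent1[1]
    het2 = parent2[0] != parent2[1]
    n_alleles = len(set(parent1) | set(parent2))
    if het1 and het2:
        # Both parents heterozygous: 4, 3 or 2 distinct alleles in total.
        return {4: "ab x cd", 3: "ef x eg", 2: "hk x hk"}[n_alleles]
    if het1 or het2:
        if n_alleles != 2:
            return "other"  # not one of the patterns listed in the text
        return "lm x ll" if het1 else "nn x np"
    # Both parents homozygous: no segregation in the F1.
    return "aa x bb" if n_alleles == 2 else "monomorphic"


# Marker counts quoted in the text.
counts = {"lm x ll": 960, "nn x np": 585, "ab x cd": 77, "ef x eg": 171, "hk x hk": 21}
total = sum(counts.values())                                    # 1814
shared = counts["ab x cd"] + counts["ef x eg"] + counts["hk x hk"]  # 269

print(segregation_type("AG", "AA"))  # lm x ll
print(segregation_type("AC", "AG"))  # ef x eg
print(f"{shared} of {total} markers ({shared / total:.1%}) segregate in both parents")
```

Only the three patterns in which both parents are heterozygous can bridge the female and male maps, which is why the 269 shared markers limit the integration step.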

                Because all of the SNP markers in this study were uniquely developed and no LG information was available, we identified a set of anchor markers that would indicate their chromosomal location. As described in Materials and Methods, the chromosome locations of the 269 markers with ab × cd, ef × eg and hk × hk segregation patterns were determined according to their sequence alignment to the grape reference genome. After a series of strict selections and calculations, 212 markers clearly showed their chromosome location (Additional file 1: Table S1). Of these anchor markers, two were located on random chromosomes because the grape genomic sequence has not been completely assembled. The lowest number of anchor markers was on chromosome 15, with only two being usable for map construction (Additional file 1: Table S1). The average number of anchor markers for each chromosome was 11.2 and only one chromosome had < 5 markers. This indicated that these anchor markers were sufficient for LG assignment.

                Genetic maps

                When the data preparation was complete, the 1,814 SNP markers were imported into JoinMap4.0 for map construction. In total, 1,121 markers fell into 19 LGs for Z180 (female), 759 markers for Beihong (male), and 1,646 markers for the integrated map, with a grouping LOD value of 7 to 13 (Figures 3, 4, 5 and 6, Additional file 2: Figure S1 and Additional file 3: Table S2). The difference in the number of markers between Z180 and Beihong might indicate that Z180 is more heterozygous than Beihong, which is consistent with the results of ongoing research in our group on diversity among different Vitis germplasm (unpublished). Among these 19 LGs, Z180 LG08 and Beihong LG14 did not form a single group but were each divided into two short LGs. Of the 212 anchor markers, 19 did not map to either Z180 or Beihong LGs, and 5 markers were specific to Beihong LGs. Thus 188 markers could be mapped on both Z180 and Beihong maps (Table S1). Further analysis of the location of the anchor markers revealed that their assignment to each chromosome by alignment to the reference genome and by LG clustering was identical. This suggested conservation of the genome structure among different species and the accuracy of our genotyping data.

                Integrated linkage groups 1 to 5 for Z180 × Beihong.

                Taking into account the size of all LGs, marker coverage amounted to 1,884.3 cM for Z180 (female), 1,740.5 cM for Beihong (male), and 1,917.3 cM for the integrated map (Table 2). The average intervals between two adjacent mapped markers were 1.68 cM, 2.29 cM and 1.16 cM for the Z180, Beihong and integrated maps, respectively. The total physical size of the grape genome was 470 Mb [30, 31], meaning that each 1,000-kb DNA sequence was equal to an average of 4.0 cM genetic distance in this study. Though we found there was no significant correlation between genetic and physical size in the subsequent analysis, the data still could indicate that the average intervals between two adjacent mapped markers on the genome were 420 kb (1.68/4.0 × 1000) for Z180, 573 kb for Beihong, and 290 kb for the integrated map. In previous reports of Vitis genetic maps, the total marker number on the linkage groups (LGs) is generally < 1,000 [6–16]; therefore, the density of the linkage maps developed for the F1 population of Z180 × Beihong was very high. In addition, the total sizes of grape genetic maps in previous studies ranged up to about 1,700 cM [6–16] and were much smaller than our map. This difference might be attributed to the larger number of markers applied and to the interspecific F1 population used in this study: more markers in the genetic map could detect more recombination, whereas an interspecies cross could produce more recombination. Further analysis revealed that the markers on these 19 LGs were not evenly distributed. The maximum number of markers occurred on LG18, with 95 markers for the female, 74 for the male and 148 for the integrated map. The minimum number of markers occurred on LG15: 15 for Z180, 22 for Beihong and 34 for the integrated map. The size of the LGs also varied widely (Table 2): the longest LGs were LG05 for Z180 (133.2 cM), LG07 for Beihong (122.8 cM) and LG13 for the integrated map (118.5 cM); the shortest were LG15, LG11 and LG11 for Beihong, Z180 and the integrated maps, with 57.4 cM, 76.3 cM and 79.2 cM, respectively. In terms of the physical size of the corresponding chromosomes [31], the longest and shortest were LG18 and LG17, with 34.4 and 17.9 Mb, respectively. The different physical and genetic rankings of the LGs led us to investigate the correlation between the two. Both females and males showed a very weak correlation (r = 0.25) between genetic and physical size among these 19 LGs/chromosomes, which might indicate that different recombination rates exist on the different chromosomes during meiosis.
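The conversion from genetic to physical marker spacing used above can be written out explicitly. This sketch follows the text in deriving the 4.0 cM per Mb figure from the Z180 map length and a 470 Mb genome, and in applying that single genome-wide average to all three maps; the variable names are illustrative.

```python
GENOME_MB = 470
map_length_cm = {"Z180": 1884.3, "Beihong": 1740.5, "integrated": 1917.3}
mean_interval_cm = {"Z180": 1.68, "Beihong": 2.29, "integrated": 1.16}

# Genome-wide average recombination rate, rounded as in the text.
cm_per_mb = round(map_length_cm["Z180"] / GENOME_MB, 1)

# Average physical distance between adjacent markers, in kb.
interval_kb = {name: cm / cm_per_mb * 1000 for name, cm in mean_interval_cm.items()}

print(f"{cm_per_mb} cM per Mb")
for name, kb in interval_kb.items():
    print(f"{name}: about {kb:.1f} kb between adjacent markers")
```

Because the text itself reports only a weak correlation between genetic and physical size, these kilobase spacings are best read as rough averages rather than as distances at any particular locus.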

                A number of future studies can be based on the high-density genetic map developed in this work. First, several excellent traits exist in one of the two parents. Thus, a given trait might be improved by selection of markers which are linked to elite loci or alleles after QTL detection. Moreover, several excellent traits might be combined in one grape plant, thereby producing a new cultivar, through a series of crosses and marker-assisted selection (MAS). Second, compared to other genetic maps for grape, our map has two obvious advantages: high density and complete sequence information for all markers (Additional file 3: Table S2). These advantages could greatly benefit comparative mapping and genome assembly. The markers' combined 60-bp sequences mapped to the LGs could be used as anchors for the genome. Although the genome sequence of grapevine was published several years ago, it still has a number of gaps and random sequences [30, 31]. In this study, a set of markers could be aligned to the random chromosomes of V. vinifera Pinot noir PN40024 (data not shown). Based on their positions on the LGs, it might be possible to place these random sequences on the assembled chromosomes. On the other hand, the published grape genome is only for V. vinifera, and the genome structures of different Vitis species are expected to be more or less different due to the long evolutionary history of the Vitaceae [33]. Thus, comparing the genome characteristics of the different species could give us a better understanding of grape. The 1,646 mapped markers' combined 60-bp sequences could be used as shared anchors to compare genetic and physical maps (Additional file 3: Table S2). These studies might facilitate use of the grape genomic resource.

                Comparison of genetic and physical maps

                To compare the genetic and physical maps, we investigated the locations of all 1,814 SNP markers on the reference genome. The high-quality 30-bp sequences from both ends of each SNP marker were employed for the location search by aligning them to the reference genome. A total of 1,456 SNP markers showed a match between their two ends and the same positions (intervals of 200–500 bp) on the reference genome; 106 markers only showed a match for one end to one position on the reference genome, while the other end had no match; the remaining 252 markers showed no match to the reference genome, showed a conflict in matching positions for the two ends, or were mapped on the random genome. To increase accuracy, only the first type of markers (1,456 SNP markers) was used to compare the genetic and physical maps.

                From Table 3 and Additional file 3: Table S2, 892 common markers were found between the physical and Z180 (female) genetic map; 606 common markers were found between the physical and Beihong (male) genetic map. This indicated that 79.6% (892/1,121) of the markers on the female LGs could be mapped on the reference genome; similarly, 79.8% of the markers on the male LGs could be mapped on the reference genome. Among the 19 chromosomes or LGs, LG18 showed the highest number of common markers between the physical and genetic maps for Z180 and Beihong (75 and 61, respectively); LG15 showed the lowest number of common markers, only 13 for the Z180 map and 15 for the Beihong map. To compare the order of the common markers, a dot-plot diagram (Figure 7) was generated using the physical position of each common marker on the reference genome against its genetic position on the LGs; at the same time, all LGs of the two parental maps were aligned with the reference genome (Additional file 4: Figure S2). According to these two analyses, most of the markers showed good linear agreement between physical and genetic maps on the basic framework. However, there were also chromosomes showing rearrangement of some regions. Among the 19 LGs, Chr01, 03, 04, 05, 06, 08 (two LGs for male), 09, 10, 12, 13, 14, 17, 18, 19 showed high collinear results for both female and male maps. The remaining LGs only showed high collinear results for one map. Because both parents were produced by interspecies crosses (V. monticola × V. riparia and V. vinifera × V. amurensis), some of the regions in the two parent genetic maps might be identical to the reference genome (V. vinifera); nevertheless, most of the regions are expected to come from the other three Vitis species. Therefore, the same order for the two types of map most probably indicates conservation of genomes among the different grape species; the non-collinearity for some chromosome regions might indicate some variations among different grape species during evolution.
