Reference Citation Analysis

Total Articles: 50 (from Reference Citation Analysis)
Article PDFs: 21
Cited by > 0: 47
Searched Name: Thomas L Madden

Number	Citation Analysis
1	ElasticBLAST: accelerating sequence search via cloud computing. BMC Bioinformatics 2023;24:117. [PMID: 36967390 PMCID: PMC10040096 DOI: 10.1186/s12859-023-05245-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 01/04/2023] [Accepted: 03/21/2023] [Indexed: 03/28/2023] Open. Abstract: BACKGROUND: Biomedical researchers use alignments produced by BLAST (Basic Local Alignment Search Tool) to categorize their query sequences. Producing such alignments is an essential bioinformatics task that is well suited for the cloud. The cloud can perform many calculations quickly as well as store and access large volumes of data. Bioinformaticians can also use it to collaborate with other researchers, sharing their results, datasets and even their pipelines on a common platform. RESULTS: We present ElasticBLAST, a cloud native application to perform BLAST alignments in the cloud. ElasticBLAST can handle anywhere from a few to many thousands of queries and run the searches on thousands of virtual CPUs (if desired), deleting resources when it is done. It uses cloud native tools for orchestration and can request discounted instances, lowering cloud costs for users. It is supported on Amazon Web Services and Google Cloud Platform. It can search BLAST databases that are user provided or from the National Center for Biotechnology Information. CONCLUSION: We show that ElasticBLAST is a useful application that can efficiently perform BLAST searches for the user in the cloud, demonstrating that with two examples. At the same time, it hides much of the complexity of working in the cloud, lowering the threshold to move work to the cloud. Key Words: AWS Batch; Alignment; BLAST; Cloud computing; Kubernetes. MeSH Headings: Cloud Computing; Software; Computational Biology/methods; Databases, Factual; Costs and Cost Analysis. Grants: U.S. National Library of Medicine; National Institutes of Health (NIH).
2	ElasticBLAST: Accelerating Sequence Search via Cloud Computing. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.01.04.522777. [PMID: 36789435 PMCID: PMC9928022 DOI: 10.1101/2023.01.04.522777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2023] Abstract: Background: Biomedical researchers use alignments produced by BLAST (Basic Local Alignment Search Tool) to categorize their query sequences. Producing such alignments is an essential bioinformatics task that is well suited for the cloud. The cloud can perform many calculations quickly as well as store and access large volumes of data. Bioinformaticians can also use it to collaborate with other researchers, sharing their results, datasets and even their pipelines on a common platform. Results: We present ElasticBLAST, a cloud native application to perform BLAST alignments in the cloud. ElasticBLAST can handle anywhere from a few to many thousands of queries and run the searches on thousands of virtual CPUs (if desired), deleting resources when it is done. It uses cloud native tools for orchestration and can request discounted instances, lowering cloud costs for users. It is supported on Amazon Web Services and Google Cloud Platform. It can search BLAST databases that are user provided or from the National Center for Biotechnology Information. Conclusion: We show that ElasticBLAST is a useful application that can efficiently perform BLAST searches for the user in the cloud, demonstrating that with two examples. At the same time, it hides much of the complexity of working in the cloud, lowering the threshold to move work to the cloud. Key Words: blast; cloud computing; alignment; kubernetes; aws batch.
3	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2021;49:D10-D17. [PMID: 33095870 PMCID: PMC7778943 DOI: 10.1093/nar/gkaa892] [Citation(s) in RCA: 410] [Impact Index Per Article: 136.7] [Received: 09/15/2020] [Revised: 09/25/2020] [Accepted: 10/08/2020] [Indexed: 11/14/2022] Open. Abstract: The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. The Entrez system provides search and retrieval operations for most of these data from 34 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Custom implementations of the BLAST program provide sequence-based searching of many specialized datasets. New resources released in the past year include a new PubMed interface and NCBI datasets. Additional resources that were updated in the past year include PMC, Bookshelf, Genome Data Viewer, SRA, ClinVar, dbSNP, dbVar, Pathogen Detection, BLAST, Primer-BLAST, IgBLAST, iCn3D and PubChem. All of these resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov. MeSH Headings: Computational Biology/methods; Databases, Chemical; Databases, Genetic; Databases, Nucleic Acid; Databases, Protein; Genomics/methods; Humans; National Library of Medicine (U.S.); PubMed; United States. Grants: NLM NIH HHS; National Institutes of Health.
4	Reply to the paper: Misunderstood parameters of NCBI BLAST impacts the correctness of bioinformatics workflows. Bioinformatics 2020;35:2699-2700. [PMID: 30590429 DOI: 10.1093/bioinformatics/bty1026] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/30/2018] [Accepted: 12/19/2018] [Indexed: 11/14/2022] Open.
5	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2020;48:D9-D16. [PMID: 31602479 DOI: 10.1093/nar/gkz899] [Citation(s) in RCA: 267] [Impact Index Per Article: 66.8] [Received: 09/16/2019] [Accepted: 10/09/2019] [Indexed: 11/14/2022] Open. Abstract: The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts published in life science journals. The Entrez system provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Custom implementations of the BLAST program provide sequence-based searching of many specialized datasets. New resources released in the past year include a new PubMed interface, a sequence database search and a gene orthologs page. Additional resources that were updated in the past year include PMC, Bookshelf, My Bibliography, Assembly, RefSeq, viral genomes, the prokaryotic genome annotation pipeline, Genome Workbench, dbSNP, BLAST, Primer-BLAST, IgBLAST and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
6	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2020;47:D23-D28. [PMID: 30395293 PMCID: PMC6323993 DOI: 10.1093/nar/gky1069] [Citation(s) in RCA: 331] [Impact Index Per Article: 82.8] [Received: 09/19/2018] [Accepted: 10/18/2018] [Indexed: 11/16/2022] Open. Abstract: The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts published in life science journals. The Entrez system provides search and retrieval operations for most of these data from 38 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. New resources released in the past year include PubMed Labs and a new sequence database search. Resources that were updated in the past year include PubMed, PMC, Bookshelf, genome data viewer, Assembly, prokaryotic genomes, Genome, BioProject, dbSNP, dbVar, BLAST databases, igBLAST, iCn3D and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
7	Magic-BLAST, an accurate RNA-seq aligner for long and short reads. BMC Bioinformatics 2019;20:405. [PMID: 31345161 PMCID: PMC6659269 DOI: 10.1186/s12859-019-2996-x] [Citation(s) in RCA: 147] [Impact Index Per Article: 29.4] [Received: 03/21/2019] [Accepted: 07/16/2019] [Indexed: 12/31/2022] Open. Abstract: BACKGROUND: Next-generation sequencing technologies can produce tens of millions of reads, often paired-end, from transcripts or genomes. But few programs can align RNA on the genome and accurately discover introns, especially with long reads. We introduce Magic-BLAST, a new aligner based on ideas from the Magic pipeline. RESULTS: Magic-BLAST uses innovative techniques that include the optimization of a spliced alignment score and selective masking during seed selection. We evaluate the performance of Magic-BLAST to accurately map short or long sequences and its ability to discover introns on real RNA-seq data sets from PacBio, Roche and Illumina runs, and on six benchmarks, and compare it to other popular aligners. Additionally, we look at alignments of human idealized RefSeq mRNA sequences perfectly matching the genome. CONCLUSIONS: We show that Magic-BLAST is the best at intron discovery over a wide range of conditions and the best at mapping reads longer than 250 bases, from any platform. It is versatile and robust to high levels of mismatches or extreme base composition, and reasonably fast. It can align reads to a BLAST database or a FASTA file. It can accept a FASTQ file as input or automatically retrieve an accession from the SRA repository at the NCBI. Key Words: Alignment; BLAST; RNA-seq.
8	IgBLAST: an immunoglobulin variable domain sequence analysis tool. Nucleic Acids Res 2013;41:W34-40. [PMID: 23671333 PMCID: PMC3692102 DOI: 10.1093/nar/gkt382] [Citation(s) in RCA: 748] [Impact Index Per Article: 68.0] [Indexed: 01/26/2023] Open. Abstract: The variable domain of an immunoglobulin (IG) sequence is encoded by multiple genes, including the variable (V) gene, the diversity (D) gene and the joining (J) gene. Analysis of IG sequences typically requires identification of each gene, as well as a comparison of sequence variations in the context of defined regions. General purpose tools, such as the BLAST program, have only limited use for such tasks, as the rearranged nature of an IG sequence and the variable length of each gene requires multiple rounds of BLAST searches for a single IG sequence. Additionally, manual assembly of different genes is difficult and error-prone. To address these issues and to facilitate other common tasks in analysing IG sequences, we have developed the sequence analysis tool IgBLAST (http://www.ncbi.nlm.nih.gov/igblast/). With this tool, users can view the matches to the germline V, D and J genes, details at rearrangement junctions, the delineation of IG V domain framework regions and complementarity determining regions. IgBLAST has the capability to analyse nucleotide and protein sequences and can process sequences in batches. Furthermore, IgBLAST allows searches against the germline gene databases and other sequence databases simultaneously to minimize the chance of missing possibly the best matching germline V gene.
9	BLAST: a more efficient report with usability improvements. Nucleic Acids Res 2013;41:W29-33. [PMID: 23609542 PMCID: PMC3692093 DOI: 10.1093/nar/gkt282] [Citation(s) in RCA: 735] [Impact Index Per Article: 66.8] [Indexed: 11/14/2022] Open. Abstract: The Basic Local Alignment Search Tool (BLAST) website at the National Center for Biotechnology Information (NCBI) is an important resource for searching and aligning sequences. A new BLAST report allows faster loading of alignments, adds navigation aids, allows easy downloading of subject sequences and reports and has improved usability. Here, we describe these improvements to the BLAST report, discuss design decisions, describe other improvements to the search page and database documentation and outline plans for future development. The NCBI BLAST URL is http://blast.ncbi.nlm.nih.gov.
10	Primer-BLAST: a tool to design target-specific primers for polymerase chain reaction. BMC Bioinformatics 2012;13:134. [PMID: 22708584 PMCID: PMC3412702 DOI: 10.1186/1471-2105-13-134] [Citation(s) in RCA: 3443] [Impact Index Per Article: 286.9] [Received: 01/10/2012] [Accepted: 06/18/2012] [Indexed: 02/07/2023] Open. Abstract: BACKGROUND: Choosing appropriate primers is probably the single most important factor affecting the polymerase chain reaction (PCR). Specific amplification of the intended target requires that primers do not have matches to other targets in certain orientations and within certain distances that allow undesired amplification. The process of designing specific primers typically involves two stages. First, the primers flanking regions of interest are generated either manually or using software tools; then they are searched against an appropriate nucleotide sequence database using tools such as BLAST to examine the potential targets. However, the latter is not an easy process as one needs to examine many details between primers and targets, such as the number and the positions of matched bases, the primer orientations and distance between forward and reverse primers. The complexity of such analysis usually makes this a time-consuming and very difficult task for users, especially when the primers have a large number of hits. Furthermore, although the BLAST program has been widely used for primer target detection, it is in fact not an ideal tool for this purpose as BLAST is a local alignment algorithm and does not necessarily return complete match information over the entire primer range. RESULTS: We present a new software tool called Primer-BLAST to alleviate the difficulty in designing target-specific primers. This tool combines BLAST with a global alignment algorithm to ensure a full primer-target alignment and is sensitive enough to detect targets that have a significant number of mismatches to primers. Primer-BLAST allows users to design new target-specific primers in one step as well as to check the specificity of pre-existing primers. Primer-BLAST also supports placing primers based on exon/intron locations and excluding single nucleotide polymorphism (SNP) sites in primers. CONCLUSIONS: We describe a robust and fully implemented general purpose primer design tool that designs target-specific PCR primers. Primer-BLAST offers flexible options to adjust the specificity threshold and other primer properties. This tool is publicly available at http://www.ncbi.nlm.nih.gov/tools/primer-blast.
11	New finite-size correction for local alignment score distributions. BMC Res Notes 2012;5:286. [PMID: 22691307 PMCID: PMC3483159 DOI: 10.1186/1756-0500-5-286] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 03/30/2012] [Accepted: 05/16/2012] [Indexed: 11/10/2022] Open. Abstract: BACKGROUND: Local alignment programs often calculate the probability that a match occurred by chance. The calculation of this probability may require a "finite-size" correction to the lengths of the sequences, as an alignment that starts near the end of either sequence may run out of sequence before achieving a significant score. FINDINGS: We present an improved finite-size correction that considers the distribution of sequence lengths rather than simply the corresponding means. This approach improves sensitivity and avoids substituting an ad hoc length for short sequences that can underestimate the significance of a match. We use a test set derived from ASTRAL to show improved ROC scores, especially for shorter sequences. CONCLUSIONS: The new finite-size correction improves the calculation of probabilities for a local alignment. It is now used in the BLAST+ package and at the NCBI BLAST web site (http://blast.ncbi.nlm.nih.gov).
12	Domain enhanced lookup time accelerated BLAST. Biol Direct 2012;7:12. [PMID: 22510480 PMCID: PMC3438057 DOI: 10.1186/1745-6150-7-12] [Citation(s) in RCA: 535] [Impact Index Per Article: 44.6] [Received: 12/16/2011] [Accepted: 04/17/2012] [Indexed: 11/10/2022] Open. Abstract: Background: BLAST is a commonly-used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-specific-iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construct a position-specific score matrix (PSSM) for searching the database in round i + 1. Biegert and Söding developed Context-sensitive BLAST (CS-BLAST), which combines information from searching the sequence database with information derived from a library of short protein profiles to achieve better homology detection than PSI-BLAST, which builds its PSSMs from scratch. Results: We describe a new method, called domain enhanced lookup time accelerated BLAST (DELTA-BLAST), which searches a database of pre-constructed PSSMs before searching a protein-sequence database, to yield better homology detection. For its PSSMs, DELTA-BLAST employs a subset of NCBI’s Conserved Domain Database (CDD). On a test set derived from ASTRAL, with one round of searching, DELTA-BLAST achieves a ROC₅₀₀₀ of 0.270 vs. 0.116 for CS-BLAST. The performance advantage diminishes in iterated searches, but DELTA-BLAST continues to achieve better ROC scores than CS-BLAST. Conclusions: DELTA-BLAST is a useful program for the detection of remote protein homologs. It is available under the “Protein BLAST” link at http://blast.ncbi.nlm.nih.gov. Reviewers: This article was reviewed by Arcady Mushegian, Nick V. Grishin, and Frank Eisenhaber.
13	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2011;40:D13-25. [PMID: 22140104 PMCID: PMC3245031 DOI: 10.1093/nar/gkr1184] [Citation(s) in RCA: 466] [Impact Index Per Article: 35.8] [Indexed: 01/30/2023] Open. Abstract: In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
14	Exclusion of Spherical Particles from the Nematic Phase of Reversibly Assembled Rod-Like Particles. ACTA ACUST UNITED AC 2011. [DOI: 10.1557/proc-248-95] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022] Abstract: The open-ended aggregation of amphiphilic molecules in aqueous solution generates a broadly polydisperse population of elongated particles that form a variety of partially ordered phases. Herzfeld and coworkers have shown that the phase behavior of these binary systems is well described by self-consistently combining scaled-particle theory for the effects of excluded volume in fluid dimensions, a simple cell model for the effects of excluded volume in positionally ordered dimensions, a mean-field treatment of soft-interactions, and a phenomenological model of aggregate formation. We have now extended this model to ternary systems. We find that the addition of spherical particles to a solution of rod-forming particles induces a very wide isotropic-nematic coexistence region in which a relatively dilute isotropic solution with little aggregation separates from a rather concentrated nematic solution that almost completely excludes the spherical solutes. The magnitude of this effect depends on the relative diameters of the two solutes.
15	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2011;39:D38-51. [PMID: 21097890 PMCID: PMC3013733 DOI: 10.1093/nar/gkq1172] [Citation(s) in RCA: 475] [Impact Index Per Article: 36.5] [Received: 09/16/2010] [Revised: 10/29/2010] [Accepted: 11/01/2010] [Indexed: 12/03/2022] Open. Abstract: In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Electronic PCR, OrfFinder, Splign, ProSplign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), IBIS, Biosystems, Peptidome, OMSSA, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. MeSH Headings: Databases, Genetic; Databases, Protein; Gene Expression; Genomics; National Library of Medicine (U.S.); Protein Structure, Tertiary; PubMed; Sequence Alignment; Sequence Analysis, DNA; Sequence Analysis, RNA; Software; Systems Integration; United States. Grants: Intramural NIH HHS.
16	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2010;38:D5-16. [PMID: 19910364 PMCID: PMC2808881 DOI: 10.1093/nar/gkp967] [Citation(s) in RCA: 374] [Impact Index Per Article: 26.7] [Received: 09/15/2009] [Revised: 10/06/2009] [Accepted: 10/13/2009] [Indexed: 12/23/2022] Open. Abstract: In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, Reference Sequence, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Peptidome, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. MeSH Headings: Algorithms; Animals; Computational Biology/methods; Computational Biology/trends; Databases, Genetic; Databases, Nucleic Acid; Databases, Protein; Genome, Bacterial; Genome, Viral; Humans; Information Storage and Retrieval/methods; Internet; National Institutes of Health (U.S.); National Library of Medicine (U.S.); Software; United States. Grants: Intramural NIH HHS.
17	BLAST+: architecture and applications. BMC Bioinformatics 2009. [PMID: 20003500 DOI: 10.1186/1471-2105-10-421] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open. Abstract: BACKGROUND: Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings in the user-interface of the current command-line applications. RESULTS: We describe features and improvements of rewritten BLAST software and introduce new command-line applications. Long query sequences are broken into chunks for processing, in some cases leading to dramatically shorter run times. For long database sequences, it is possible to retrieve only the relevant parts of the sequence, reducing CPU time and memory usage for searches of short queries against databases of contigs or chromosomes. The program can now retrieve masking information for database sequences from the BLAST databases. A new modular software library can now access subject sequence data from arbitrary data sources. We introduce several new features, including strategy files that allow a user to save and reuse their favorite set of options. The strategy files can be uploaded to and downloaded from the NCBI BLAST web site. CONCLUSION: The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome length database sequences. We have also improved the user interface of the command-line applications.
18	BLAST+: architecture and applications. BMC Bioinformatics 2009;10:421. [PMID: 20003500 PMCID: PMC2803857 DOI: 10.1186/1471-2105-10-421] [Citation(s) in RCA: 10825] [Impact Index Per Article: 721.7] [Received: 07/28/2009] [Accepted: 12/15/2009] [Indexed: 01/13/2023] Open. Abstract: Background: Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings in the user-interface of the current command-line applications. Results: We describe features and improvements of rewritten BLAST software and introduce new command-line applications. Long query sequences are broken into chunks for processing, in some cases leading to dramatically shorter run times. For long database sequences, it is possible to retrieve only the relevant parts of the sequence, reducing CPU time and memory usage for searches of short queries against databases of contigs or chromosomes. The program can now retrieve masking information for database sequences from the BLAST databases. A new modular software library can now access subject sequence data from arbitrary data sources. We introduce several new features, including strategy files that allow a user to save and reuse their favorite set of options. The strategy files can be uploaded to and downloaded from the NCBI BLAST web site. Conclusion: The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome length database sequences. We have also improved the user interface of the command-line applications.
19	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2009;37:D5-15. [PMID: 18940862 PMCID: PMC2686545 DOI: 10.1093/nar/gkn741] [Citation(s) in RCA: 654] [Impact Index Per Article: 43.6] [Received: 09/16/2008] [Revised: 10/01/2008] [Accepted: 10/02/2008] [Indexed: 11/13/2022] Abstract In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. MESH Headings: Databases, Genetic; Gene Expression; Genes; Genomics; Genotype; National Library of Medicine (U.S.); Phenotype; Protein Structure, Tertiary; Proteomics; PubMed; Sequence Homology; Systems Integration; United States
20	Database indexing for production MegaBLAST searches. Bioinformatics 2008;24:1757-64. [PMID: 18567917 PMCID: PMC2696921 DOI: 10.1093/bioinformatics/btn322] [Citation(s) in RCA: 683] [Impact Index Per Article: 42.7] [Indexed: 11/14/2022] Abstract Motivation: The BLAST software package for sequence comparison speeds up homology search by preprocessing a query sequence into a lookup table. Numerous research studies have suggested that preprocessing the database instead would give better performance. However, production usage of sequence comparison methods that preprocess the database has been limited to programs such as BLAT and SSAHA that are designed to find matches when query and database subsequences are highly similar. Results: We developed a new version of the MegaBLAST module of BLAST that does the initial phase of finding short seeds for matches by searching a database index. We also developed a program makembindex that preprocesses the database into a data structure for rapid seed searching. We show that the new ‘indexed MegaBLAST’ is faster than the ‘non-indexed’ version for most practical uses. We show that indexed MegaBLAST is faster than miBLAST, another implementation of BLAST nucleotide searching with a preprocessed database, for most of the 200 queries we tested. To deploy indexed MegaBLAST as part of NCBI's Web BLAST service, the storage of databases and the queueing mechanism were modified, so that some machines are now dedicated to serving queries for a specific database. The response time for such Web queries is now faster than it was when each computer handled queries for multiple databases. Availability: The code for indexed MegaBLAST is part of the blastn program in the NCBI C++ toolkit. The preprocessor program makembindex is also in the toolkit. Indexed MegaBLAST has been used in production on NCBI's Web BLAST service to search one version of the human and mouse genomes since October 2007. The Linux command-line executables for blastn and makembindex, documentation, and some query sets used to carry out the tests described below are available in the directory: ftp://ftp.ncbi.nlm.nih.gov/pub/agarwala/indexed_megablast Contact: schaffer@helix.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online.
21	NCBI BLAST: a better web interface. Nucleic Acids Res 2008;36:W5-9. [PMID: 18440982 PMCID: PMC2447716 DOI: 10.1093/nar/gkn201] [Citation(s) in RCA: 2383] [Impact Index Per Article: 148.9] [Indexed: 11/12/2022] Abstract Basic Local Alignment Search Tool (BLAST) is a sequence similarity search program. The public interface of BLAST, http://www.ncbi.nlm.nih.gov/blast, at the NCBI website has recently been reengineered to improve usability and performance. Key new features include simplified search forms, improved navigation, a list of recent BLAST results, saved search strategies and a documentation directory. Here, we describe the BLAST web application's new features, explain design decisions and outline plans for future improvement.
22	Sequence similarity searching using the BLAST family of programs. Curr Protoc Mol Biol 2008;Chapter 19:Unit 19.3. [PMID: 18265177 DOI: 10.1002/0471142727.mb1903s46] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/06/2022] Abstract Database sequence similarity searching is carried out thousands of times each day by researchers worldwide and has become a very valuable tool. Over the years, a number of algorithms have been implemented to facilitate database searching. The BLAST (Basic Local Alignment Search Tool) family of sequence similarity search programs allows searches to be done quickly and easily, but with sensitive, yet rigorous statistical expectations. In this unit, which is a completely new version of its predecessor of the same title, the user learns how to access the databases, determine the correct searching strategies, and apply examples of BLAST searches to his or her own data.
23	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2008;36:D13-21. [PMID: 18045790 PMCID: PMC2238880 DOI: 10.1093/nar/gkm1000] [Citation(s) in RCA: 608] [Impact Index Per Article: 38.0] [Received: 09/18/2007] [Revised: 10/19/2007] [Accepted: 10/22/2007] [Indexed: 12/21/2022] Abstract In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data available through NCBI's web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace, Assembly, and Short Read Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Database of Genotype and Phenotype, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting the web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. MESH Headings: Animals; Databases, Genetic; Databases, Nucleic Acid; Gene Expression; Genomics; Genotype; Humans; Internet; Models, Molecular; National Library of Medicine (U.S.); Phenotype; Proteomics; Sequence Alignment; United States. Grants: Z99 LM999999 Intramural NIH HHS
24	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2007;35:D5-12. [PMID: 17170002 PMCID: PMC1781113 DOI: 10.1093/nar/gkl1031] [Citation(s) in RCA: 626] [Impact Index Per Article: 36.8] [Received: 09/15/2006] [Revised: 10/16/2006] [Accepted: 10/17/2006] [Indexed: 01/12/2023] Abstract In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace and Assembly Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Viral Genotyping Tools, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. MESH Headings: Animals; Databases, Genetic; Databases, Nucleic Acid; Databases, Protein; Gene Expression; Genomics; Humans; Internet; National Library of Medicine (U.S.); Phenotype; Proteomics; PubMed; Sequence Alignment; Software; United States
25	BLAST: improvements for better sequence analysis. Nucleic Acids Res 2006;34:W6-9. [PMID: 16845079 PMCID: PMC1538791 DOI: 10.1093/nar/gkl164] [Citation(s) in RCA: 336] [Impact Index Per Article: 18.7] [Indexed: 11/14/2022] Abstract Basic local alignment search tool (BLAST) is a sequence similarity search program. The National Center for Biotechnology Information (NCBI) maintains a BLAST server with a home page at . We report here on recent enhancements to the results produced by the BLAST server at the NCBI. These include features to highlight mismatches between similar sequences, show where the query was masked for low-complexity sequence, and integrate information about the database sequences from the NCBI Entrez system into the BLAST display. Changes to how the database sequences are fetched have also improved the speed of the report generator. MESH Headings: Computer Graphics; Databases, Nucleic Acid; Databases, Protein; Humans; Internet; Sequence Alignment/methods; Sequence Analysis/methods; Software; Systems Integration; User-Computer Interface. Grants: Intramural NIH HHS
26	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2006;34:D173-80. [PMID: 16381840 PMCID: PMC1347520 DOI: 10.1093/nar/gkj158] [Citation(s) in RCA: 396] [Impact Index Per Article: 22.0] [Received: 09/15/2005] [Revised: 10/03/2005] [Accepted: 10/31/2005] [Indexed: 12/31/2022] Abstract In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, SAGEmap, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov. MESH Headings: Databases, Genetic; Databases, Nucleic Acid; Databases, Protein; Gene Expression Regulation; Genes; Genomics; Humans; Internet; National Library of Medicine (U.S.); PubMed; Sequence Alignment; Sequence Analysis, DNA; Software; United States
27	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2005;33:D39-45. [PMID: 15608222 PMCID: PMC540016 DOI: 10.1093/nar/gki062] [Citation(s) in RCA: 338] [Impact Index Per Article: 17.8] [Indexed: 11/13/2022] Abstract In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data retrieval systems and computational resources for the analysis of data in GenBank and other biological data made available through NCBI's website. NCBI resources include Entrez, Entrez Programming Utilities, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD) and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of the resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. MESH Headings: Amino Acid Sequence; Animals; Computational Biology; Conserved Sequence; Databases, Factual; Databases, Genetic; Gene Expression Profiling; Genes; Genomics; Humans; Models, Molecular; National Library of Medicine (U.S.); Phenotype; Protein Interaction Mapping; Protein Structure, Tertiary; Sequence Alignment; Software; United States
28	BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res 2004;32:W20-5. [PMID: 15215342 PMCID: PMC441573 DOI: 10.1093/nar/gkh435] [Citation(s) in RCA: 1150] [Impact Index Per Article: 57.5] [Indexed: 11/14/2022] Abstract Basic Local Alignment Search Tool (BLAST) is one of the most heavily used sequence analysis tools available in the public domain. There is now a wide choice of BLAST algorithms that can be used to search many different sequence databases via the BLAST web pages (http://www.ncbi.nlm.nih.gov/BLAST/). All the algorithm-database combinations can be executed with default parameters or with customized settings, and the results can be viewed in a variety of ways. A new online resource, the BLAST Program Selection Guide, has been created to assist in the definition of search strategies. This article discusses optimal search strategies and highlights some BLAST features that can make your searches more powerful. MESH Headings: Databases, Genetic; Genome; Internet; Nucleotides/chemistry; Sequence Alignment; Sequence Analysis; Sequence Analysis, Protein; Software; User-Computer Interface
29	Database resources of the National Center for Biotechnology Information: update. Nucleic Acids Res 2004;32:D35-40. [PMID: 14681353 PMCID: PMC308807 DOI: 10.1093/nar/gkh073] [Citation(s) in RCA: 284] [Impact Index Per Article: 14.2] [Indexed: 11/27/2022] Abstract In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI’s website. NCBI resources include Entrez, PubMed, PubMed Central, LocusLink, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SARS Coronavirus Resource, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD) and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov. MESH Headings: Animals; Classification; Computational Biology; Databases, Factual; Gene Expression Profiling; Genes; Genome; Genomics; Humans; Information Storage and Retrieval; National Institutes of Health (U.S.); Open Reading Frames; Polymorphism, Genetic; PubMed; Software; United States
30	Database resources of the National Center for Biotechnology. Nucleic Acids Res 2003;31:28-33. [PMID: 12519941 PMCID: PMC165480 DOI: 10.1093/nar/gkg033] [Citation(s) in RCA: 655] [Impact Index Per Article: 31.2] [Indexed: 11/14/2022] Abstract In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, PubMed, PubMed Central (PMC), LocusLink, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR (e-PCR), Open Reading Frame (ORF) Finder, Reference Sequence (RefSeq), UniGene, HomoloGene, ProtEST, Database of Single Nucleotide Polymorphisms (dbSNP), Human/Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker (MM), Evidence Viewer (EV), Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov. MESH Headings: Animals; Biotechnology; Chromosome Mapping; Databases, Genetic; Gene Expression Profiling; Genes; Genome; Humans; Information Storage and Retrieval; Mice; Models, Molecular; Phenotype; Protein Structure, Tertiary; Sequence Alignment/methods; Sequence Homology; United States
31	Database resources of the National Center for Biotechnology Information: 2002 update. Nucleic Acids Res 2002;30:13-6. [PMID: 11752242 PMCID: PMC99094 DOI: 10.1093/nar/30.1.13] [Citation(s) in RCA: 154] [Impact Index Per Article: 7.0] [Indexed: 11/12/2022] Abstract In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI's web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, HomoloGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing, Human MapViewer, Human-Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB) and the Conserved Domain Database (CDD). Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. MESH Headings: Amino Acid Sequence; Animals; Base Sequence; Biotechnology; Chromosome Aberrations; Chromosomes; Conserved Sequence; Databases, Genetic; Gene Expression Profiling; Genome; Genome, Human; Humans; Information Storage and Retrieval; National Library of Medicine (U.S.); Polymorphism, Single Nucleotide; Protein Structure, Tertiary; RNA, Messenger/genetics; Sequence Homology; United States
32	Development of a simplified, sensitive high-performance liquid chromatographic method using fluorescence detection to determine the concentration of UCN-01 in human plasma. JOURNAL OF CHROMATOGRAPHY. B, BIOMEDICAL SCIENCES AND APPLICATIONS 2001;760:247-53. [PMID: 11530983 DOI: 10.1016/s0378-4347(01)00276-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022] Abstract UCN-01 is a naturally derived anticancer agent isolated in the culture broth of actinomyces streptomyces. We have developed a sensitive high-performance liquid chromatographic method for the determination of UCN-01 in human plasma. UCN-01 was isolated from human plasma after intravenous administration, by using 100% ice-cold acetonitrile liquid-liquid phase extraction. Liquid chromatographic separation was achieved by isocratic elution on a phenyl analytical column. The mobile phase consisted of acetonitrile-0.5 M ammonium acetate (45:55) with 0.2% triethylamine added as a modifier. The UCN-01 peak was identified from other peaks using fluorescence excitation energy and emission energy wavelengths of 310 and 410 nm, respectively. Retention time for UCN-01 was 4.2 +/- 0.5 min. The UCN-01 peak was baseline resolved, with nearest peak at 2.6 min distance. No interfering peaks were observed at the retention time of UCN-01. Peak area amounts from extracted samples were proportional over the dynamic concentration range used: 0.2 to 30 microg/ml. Mean recoveries of UCN-01 at concentrations of 0.5 and 25 microg/ml were 89 and 90.2%, respectively. Relative standard deviations for UCN-01 calibration standards ranged from 1.89 to 2.31%, with relative errors ranging from 0.3 to 11.6%. Assay precision for UCN-01 based on quality control samples of 0.50 microg/ml was +/- 4.86% with an accuracy of +/-5.7%. For drug extracted from plasma the lowest limit of detection was 0.1 microg/ml, with the lowest limit of quantitation being 0.2 microg/ml. This method is suitable for routine analysis of UCN-01 in human plasma at concentration from 0.2 to 30 microg/ml. MESH Headings: Alkaloids/blood; Antineoplastic Agents/blood; Calibration; Chromatography, High Pressure Liquid/methods; Humans; Reference Standards; Sensitivity and Specificity; Spectrometry, Fluorescence; Staurosporine/analogs & derivatives
33	Development of a high-performance liquid chromatographic method to determine the concentration of karenitecin, a novel highly lipophilic camptothecin derivative, in human plasma and urine. JOURNAL OF CHROMATOGRAPHY. B, BIOMEDICAL SCIENCES AND APPLICATIONS 2001;759:117-24. [PMID: 11499615 DOI: 10.1016/s0378-4347(01)00206-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022] Abstract Karenitecin is a novel, highly lipophilic camptothecin derivative with potent anticancer potential. We have developed a sensitive high-performance liquid chromatographic method for the determination of karenitecin concentration in human plasma and urine. Karenitecin was isolated from human plasma and urine using solid-phase extraction. Separation was achieved by gradient elution, using a water and acetonitrile mobile phase, on an ODS analytical column. Karenitecin was detected using fluorescence detection at excitation and emission wavelengths of 370 and 490 nm, respectively. Retention time for karenitecin was 16.2 +/- 0.5 min and 8.0 +/- 0.2 min for camptothecin, the internal standard. The karenitecin peak was baseline resolved, with the nearest peak at 3.1 min distance. Using normal volunteer plasma and urine from multiple individuals, as well as samples from the 50 patients analyzed to date, no interfering peaks were detected. Inter- and intra-day coefficients of variance were <4.4 and 7.1% for plasma and <4.9 and 11.6% for urine. Assay precision, based on an extracted karenitecin standard plasma sample of 2.5 ng/ml, was +4.46% with a mean accuracy of 92.4%. For extracted karenitecin standard urine samples of 2.5 ng/ml assay precision was +2.35% with a mean accuracy of 99.5%. The mean recovery of karenitecin, at plasma concentrations of 1.0 and 50 ng/ml, was 81.9 and 87.8% respectively. In urine, at concentrations of 1.5 and 50 ng/ml, the mean recoveries were 90.3 and 78.4% respectively. The lower limit of detection (LLD) for karenitecin was 0.5 ng/ml in plasma and 1.0 ng/ml in urine. The lower limit of quantification (LLQ) for karenitecin was 1 ng/ml and 1.5 ng/ml for plasma and urine, respectively. Stability studies indicate that when frozen at -70 degrees C, karenitecin is stable in human plasma for up to 3 months and in human urine for up to 1 month. This method is useful for the quantification of karenitecin in plasma and urine samples for clinical pharmacology studies in patients receiving this agent in clinical trials. MESH Headings: Antineoplastic Agents/blood; Camptothecin/analogs & derivatives; Camptothecin/blood; Chromatography, High Pressure Liquid/methods; Enzyme Inhibitors/blood; Reproducibility of Results; Sensitivity and Specificity; Spectrometry, Fluorescence
34	Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res 2001;29:2994-3005. [PMID: 11452024 PMCID: PMC55814 DOI: 10.1093/nar/29.14.2994] [Citation(s) in RCA: 939] [Impact Index Per Article: 40.8] [Received: 04/23/2001] [Revised: 05/30/2001] [Accepted: 05/30/2001] [Indexed: 11/13/2022] Abstract PSI-BLAST is an iterative program to search a database for proteins with distant similarity to a query sequence. We investigated over a dozen modifications to the methods used in PSI-BLAST, with the goal of improving accuracy in finding true positive matches. To evaluate performance we used a set of 103 queries for which the true positives in yeast had been annotated by human experts, and a popular measure of retrieval accuracy (ROC) that can be normalized to take on values between 0 (worst) and 1 (best). The modifications we consider novel improve the ROC score from 0.758 +/- 0.005 to 0.895 +/- 0.003. This does not include the benefits from four modifications we included in the 'baseline' version, even though they were not implemented in PSI-BLAST version 2.0. The improvement in accuracy was confirmed on a small second test set. This test involved analyzing three protein families with curated lists of true positives from the non-redundant protein database. The modification that accounts for the majority of the improvement is the use, for each database sequence, of a position-specific scoring system tuned to that sequence's amino acid composition. The use of composition-based statistics is particularly beneficial for large-scale automated applications of PSI-BLAST. MESH Headings: Algorithms; Amino Acids/genetics; Animals; Computational Biology/methods; Computational Biology/statistics & numerical data; Databases, Factual; Humans; Information Storage and Retrieval; Proteins/genetics; Reproducibility of Results; Sensitivity and Specificity; Sequence Alignment/methods; Software
35	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2001;29:11-6. [PMID: 11125038 PMCID: PMC29800 DOI: 10.1093/nar/29.1.11] [Citation(s) in RCA: 196] [Impact Index Per Article: 8.5] [Received: 10/03/2000] [Accepted: 10/04/2000] [Indexed: 11/14/2022] Abstract In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI's Web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, HomoloGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing, Human MapViewer, GeneMap'99, Human-Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, Cancer Genome Anatomy Project (CGAP), SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB) and the Conserved Domain Database (CDD). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov. MESH Headings: Animals; Biotechnology; Databases, Factual; Gene Expression Profiling; Genome; Genome, Human; Humans; Information Services; Information Storage and Retrieval; Internet; Molecular Biology; National Institutes of Health (U.S.); National Library of Medicine (U.S.); Phenotype; Sequence Alignment; United States
36	Arglabin-DMA, a plant derived sesquiterpene, inhibits farnesyltransferase. Oncol Rep 2001;8:173-9. [PMID: 11115593 DOI: 10.3892/or.8.1.173] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/06/2022] Abstract: Arglabin [1(R),10(S)-epoxy-5(S),5(S),7(S)-guaia-3(4),11(13)-dien-6, 12-olide], a sesquiterpene gamma-lactone, is isolated from Artemisia glabella, a species of wormwood endemic to the Karaganda region of Kazakstan. The compound has been modified to render it water-soluble through addition of a dimethylaminohydrochloride group to the C(13) carbohydride moiety to yield Arglabin-DMA. Arglabin-DMA is a registered antitumor substance in the Republic of Kazakstan. Previously, we have shown that this compound prevents protein farnesylation without altering geranylgeranylation. We now report that Arglabin-DMA inhibits the incorporation of [(3)H]farnesylpyrophosphate into human H-ras protein by FTase with an IC(50) of no greater than 25 microM. Kinetic studies show that the phosphorylated form of this compound competitively inhibits the binding of farnesyl diphosphate to FTase. This mechanism of action is different from other reported peptidomimetic FTIs which lower the affinity of ras protein to FTase. Our in vitro studies confirm that Arglabin-DMA inhibits post-translational modification of ras protein in cells. Arglabin-DMA inhibits anchorage-dependent proliferation of NB cells (IC(50) = 10 microg/ml) and inhibits anchorage-independent growth of NB and KNRK cells with about the same IC(50). Soft-agar colony formation assay of H-ras and K-ras transformed cells show IC(50)s to be 2 and 5 microg/ml, respectively. In primary cultures of human tumor cells, Arglabin-DMA inhibits cell proliferation of a variety of tumor types with IC(90)s in the range of 0.85 to 5.0 microg/ml. Because of these pharmacologic properties, we propose that Arglabin-DMA is suitable for the treatment of ras related malignancies. MeSH Headings: 3T3 Cells/drug effects; Alkyl and Aryl Transferases/antagonists & inhibitors; Animals; Antineoplastic Agents, Phytogenic/chemistry; Antineoplastic Agents, Phytogenic/pharmacology; Artemisia/chemistry; Binding, Competitive; Cell Division/drug effects; Cell Line, Transformed/drug effects; Cell Transformation, Neoplastic; Enzyme Inhibitors/chemistry; Enzyme Inhibitors/pharmacology; Farnesyltranstransferase; Mice; Molecular Structure; Neoplasm Proteins/antagonists & inhibitors; Neoplasm Proteins/metabolism; Neoplasms/pathology; Neuroblastoma/pathology; Plants, Medicinal; Polyisoprenyl Phosphates/metabolism; Protein Prenylation/drug effects; Protein Processing, Post-Translational/drug effects; Proto-Oncogene Proteins p21(ras)/metabolism; Sesquiterpenes/pharmacology; Sesquiterpenes, Guaiane; Solubility; Tumor Cells, Cultured/drug effects; Tumor Stem Cell Assay
37	Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2000;28:10-4. [PMID: 10592169 PMCID: PMC102437 DOI: 10.1093/nar/28.1.10] [Citation(s) in RCA: 297] [Impact Index Per Article: 12.4] [Received: 09/03/1999] [Revised: 09/14/1999] [Accepted: 10/08/1999] [Indexed: 11/14/2022] Abstract: In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI's Web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing pages, GeneMap'99, Davis Human-Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP) pages, Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, Cancer Genome Anatomy Project (CGAP) pages, SAGEmap, Online Mendelian Inheritance in Man (OMIM) and the Molecular Modeling Database (MMDB). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov. MeSH Headings: Animals; Biology; Databases, Factual; Gene Expression; Genome, Human; Humans; Information Storage and Retrieval; Mice; Models, Molecular; National Library of Medicine (U.S.); Neoplasms/genetics; Phenotype; United States
38	Erratum to “BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences” [FEMS Microbiol. 174 (1999) 247–250]. FEMS Microbiol Lett 1999. [DOI: 10.1111/j.1574-6968.1999.tb13730.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
39	BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. FEMS Microbiol Lett 1999;174:247-50. [PMID: 10339815 DOI: 10.1111/j.1574-6968.1999.tb13575.x] [Citation(s) in RCA: 1261] [Impact Index Per Article: 50.4] [Indexed: 12/01/2022] Abstract: 'BLAST 2 Sequences', a new BLAST-based tool for aligning two protein or nucleotide sequences, is described. While the standard BLAST program is widely used to search for homologous sequences in nucleotide and protein databases, one often needs to compare only two sequences that are already known to be homologous, coming from related species or, e.g. different isolates of the same virus. In such cases searching the entire database would be unnecessarily time-consuming. 'BLAST 2 Sequences' utilizes the BLAST algorithm for pairwise DNA-DNA or protein-protein sequence comparison. A World Wide Web version of the program can be used interactively at the NCBI WWW site (http://www.ncbi.nlm.nih.gov/gorf/bl2.html). The resulting alignments are presented in both graphical and text form. The variants of the program for PC (Windows), Mac and several UNIX-based platforms can be downloaded from the NCBI FTP site (ftp://ncbi.nlm.nih.gov). MeSH Headings: Algorithms; Amino Acid Sequence; Base Sequence; Sequence Alignment/methods; Software
40	Sequence Similarity Searching Using the BLAST Family of Programs. Curr Protoc Protein Sci 1999;Chapter 2:Unit2.5. [DOI: 10.1002/0471140864.ps0205s15] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/06/2022]
41	Pharmaceutical properties of related calanolide compounds with activity against human immunodeficiency virus. J Pharm Sci 1998;87:1077-80. [PMID: 9724557 DOI: 10.1021/js980122d] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.7] [Indexed: 02/08/2023] Abstract: The present studies were undertaken to compare the relative pharmacokinetic parameters and bioavailability of two chemically related natural products which are nonnucleoside inhibitors of reverse transcriptase. Both (+)-calanolide A (Cal A; NSC 675451) and (+)-dihydrocalanolide A (DHCal A; NSC 678323) are currently under development for the treatment of HIV infections. HPLC-based analytical assays were developed for both compounds using modifications of a previously published procedure. The assays were used to compare the intravenous pharmacokinetics of the dihydro analogue relative to the parent compound, Cal A, and to determine the relative oral bioavailability of each drug in CD2F1 mice. Although the pharmacokinetic parameters of each drug were similar (Cal A, 25 mg/kg: AUC: 9.4 [microg/mL].hr, t1/2beta: 0.25 h, t1/2gamma: 1.8 h, clearance: 2.7 L/h/kg versus DHCal A, 25 mg/kg: AUC: 6.9 [microg/mL].hr, t1/2beta: 0.22 h, t1/2gamma: 2.3 h, clearance: 3.6 L/h/kg), the oral bioavailability of DHCal A (F = 46.8%) was markedly better than that obtained for Cal A (F = 13.2%). The relative ability of Cal A and DHCal A to change to their inactive epimer forms, (+)-calanolide B and (+)-dihydrocalanolide B, respectively, was also determined. While conversion of active to inactive forms of the drugs was noted to occur in vitro especially under acidic conditions, no epimer forms of either compound were noted in plasma of mice after administration of either Cal A or DHCal A. Considered together with preliminary toxicology findings, the pharmacokinetic data obtained in the present series of experiments suggest that selection of the dihydro derivative of (+)-calanolide A may be a reasonable choice for further preclinical development and possible Phase I clinical evaluation. MeSH Headings: Administration, Oral; Animals; Anti-HIV Agents/chemistry; Anti-HIV Agents/pharmacokinetics; Area Under Curve; Biological Availability; Coumarins/chemistry; Coumarins/pharmacokinetics; Drug Stability; Half-Life; Male; Mice; Pyranocoumarins; Stereoisomerism. Grants: N01-CM-57203 NCI NIH HHS
42	Protein sequence similarity searches using patterns as seeds. Nucleic Acids Res 1998;26:3986-90. [PMID: 9705509 PMCID: PMC147803 DOI: 10.1093/nar/26.17.3986] [Citation(s) in RCA: 217] [Impact Index Per Article: 8.3] [Indexed: 11/13/2022] Abstract: Protein families often are characterized by conserved sequence patterns or motifs. A researcher frequently wishes to evaluate the significance of a specific pattern within a protein, or to exploit knowledge of known motifs to aid the recognition of greatly diverged but homologous family members. To assist in these efforts, the pattern-hit initiated BLAST (PHI-BLAST) program described here takes as input both a protein sequence and a pattern of interest that it contains. PHI-BLAST searches a protein database for other instances of the input pattern, and uses those found as seeds for the construction of local alignments to the query sequence. The random distribution of PHI-BLAST alignment scores is studied analytically and empirically. In many instances, the program is able to detect statistically significant similarity between homologous proteins that are not recognizably related using traditional single-pass database search methods. PHI-BLAST is applied to the analysis of CED4-like cell death regulators, HS90-type ATPase domains, archaeal tRNA nucleotidyltransferases and archaeal homologs of DnaG-type DNA primases. MeSH Headings: Adenosine Triphosphatases; Algorithms; Amino Acid Sequence; Archaeal Proteins; Caenorhabditis elegans Proteins; Calcium-Binding Proteins; DNA Primase; Databases, Factual; HSP90 Heat-Shock Proteins; Helminth Proteins; Pattern Recognition, Automated; RNA Nucleotidyltransferases; Software
43	Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997;25:3389-402. [PMID: 9254694 PMCID: PMC146917 DOI: 10.1093/nar/25.17.3389] [Citation(s) in RCA: 50875] [Impact Index Per Article: 1884.3] [Indexed: 02/05/2023] Abstract: The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily. MeSH Headings: Algorithms; Amino Acid Sequence; Animals; DNA/chemistry; Databases, Factual; Humans; Molecular Sequence Data; Proteins/chemistry; Sequence Alignment; Software. Grants: LM05110 NLM NIH HHS
44	Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997. [PMID: 9254694 DOI: 10.1093/naar/25.17.3389] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 05/02/2023]
45	A phase II and pharmacokinetic study of enloplatin in patients with platinum refractory advanced ovarian carcinoma. Anticancer Drugs 1997;8:649-56. [PMID: 9311439 DOI: 10.1097/00001813-199708000-00001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 02/05/2023] Abstract: This was a study of enloplatin in 18 evaluable patients with platinum refractory ovarian cancer. They received an i.v. infusion of enloplatin over 1.5 h without prehydration every 21 days. One patient had a partial response (6%; 95% CI 0-26%) lasting 2.8 months. The median survival was 9.4 months (95% CI 5.1-19.7). Neutropenia was the dose-limiting toxicity. Nephrotoxicity was manageable. Enloplatin is the major form of the free drug in plasma. However, 13.5 h after initiation of treatment, 85% of the drug in plasma is protein bound. Elimination of the drug is mainly renal. Enloplatin pharmacokinetics is similar to that of carboplatin. Thus, the plasma pharmacokinetics of enloplatin is dictated by the cyclobutanedicarboxylato (CBDCA) ligand and not the novel amino ligand. MeSH Headings: Adult; Aged; Antineoplastic Agents/adverse effects; Antineoplastic Agents/pharmacokinetics; Antineoplastic Agents/therapeutic use; Carboplatin/adverse effects; Carboplatin/analogs & derivatives; Carboplatin/pharmacokinetics; Carboplatin/therapeutic use; Disease-Free Survival; Drug Resistance, Neoplasm; Female; Glomerular Filtration Rate; Humans; Metabolic Clearance Rate; Middle Aged; Ovarian Neoplasms/drug therapy; Ovarian Neoplasms/mortality; Ovarian Neoplasms/pathology; Patient Selection; Survival Rate
46	PowerBLAST: a new network BLAST application for interactive or automated sequence analysis and annotation. Genome Res 1997;7:649-56. [PMID: 9199938 PMCID: PMC310664 DOI: 10.1101/gr.7.6.649] [Citation(s) in RCA: 209] [Impact Index Per Article: 7.7] [Indexed: 02/04/2023] Abstract: As the rate of DNA sequencing increases, analysis by sequence similarity search will need to become much more efficient in terms of sensitivity, specificity, automation potential, and consistency in annotation. PowerBLAST was developed, in part, to address these problems. PowerBLAST includes a number of options for masking repetitive elements and low complexity subsequences. It also has the capacity to restrict the search to any level of NCBI's taxonomy index, thus supporting "comparative genomics" applications. Postprocessing of the BLAST output using the SIM series of algorithms produces optimal, gapped alignments, and multiple alignments when a region of the query sequence matches multiple database sequences. PowerBLAST is capable of processing sequences of any length because it divides long query sequences into overlapping fragments and then merges the results after searching. The results may be viewed graphically, as a textual representation, or as an HTML page with links to GenBank and Entrez. For matching database sequences, annotated features are superimposed on the aligned query sequence in the output, thus greatly increasing the ease of interpretation. Such features may be used for automated annotation of new sequence because PowerBLAST output in ASN.1 form may be "dragged and dropped" into NCBI's Sequin program for sequence annotation and submission. PowerBLAST is capable of analyzing and annotating a 100-kb query in 60 min on NCBI's BLAST server. MeSH Headings: Amino Acid Sequence; Base Sequence; Classification; Computer Communication Networks; Information Systems; Molecular Sequence Data; Repetitive Sequences, Nucleic Acid; Sequence Alignment; Sequence Analysis, DNA/methods; Software
47	Applications of network BLAST server. Methods Enzymol 1996;266:131-41. [PMID: 8743682 DOI: 10.1016/s0076-6879(96)66011-x] [Citation(s) in RCA: 264] [Impact Index Per Article: 9.4] [Indexed: 02/01/2023] Abstract: The sequence databases continue to grow at an extraordinary rate. Contributions come from both small laboratories and large-scale projects, such as the Merck EST project. This growth has placed new demands on computational sequence comparison tools such as BLAST. Even now it is no longer practical to evaluate some BLAST reports manually; it is necessary to filter the output by, for example, organism, source, or degree of annotation. The new network BLAST service makes such tools possible. It is also possible to present BLAST output in different formats, such as BLANCE. Perhaps most important of all, it becomes simple to call BLAST from another application, making it one step within an integrated system. This makes the automated preparation of sequence evaluations that include BLAST runs possible. In the near future we expect to see a number of applications that use the network BLAST interface to help molecular biologists search against a database that is growing not only in size but in biological richness. MeSH Headings: Algorithms; Amino Acid Sequence; Animals; Base Composition; Base Sequence; Databases, Factual; Escherichia coli; Humans; Molecular Sequence Data; Proteins/chemistry; Repetitive Sequences, Nucleic Acid; Saccharomyces cerevisiae; Software
48	Crowding-induced organization of cytoskeletal elements: II. Dissolution of spontaneously formed filament bundles by capping proteins. J Cell Biol 1994;126:169-74. [PMID: 8027175 PMCID: PMC2120095 DOI: 10.1083/jcb.126.1.169] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.4] [Indexed: 01/28/2023] Abstract: Through calculations of molecular packing constraints in crowded solutions, we have previously shown that dispersions of filament forming proteins and soluble proteins can be unstable at physiological concentrations, such that tight bundles of filaments are formed spontaneously, in the absence of any accessory binding proteins. Here we consider the modulation of this phenomenon by capping proteins. The theory predicts that, by shortening the average filament length, capping alleviates the packing problem. As a result, the dispersed isotropic solution is stable over an expanded range of compositions. MeSH Headings: Actin Depolymerizing Factors; Actins; Cytoplasm; Cytoskeleton/physiology; Destrin; Microfilament Proteins; Models, Biological; Morphogenesis. Grants: HL-36546 NHLBI NIH HHS; HL08472 NHLBI NIH HHS
49	Crowding-induced organization of cytoskeletal elements: I. Spontaneous demixing of cytosolic proteins and model filaments to form filament bundles. Biophys J 1993;65:1147-54. [PMID: 8241394 PMCID: PMC1225832 DOI: 10.1016/s0006-3495(93)81144-5] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.1] [Indexed: 01/29/2023] Abstract: The theory for the effects of crowding on the behavior of reversibly self-assembling solutes is extended to mixtures containing nonassembling solutes. The theory predicts that excluded volume will cause dramatic demixing into domains of long, tightly packed, highly aligned fibers coexisting with an isotropic solution of unaggregated species. It suggests that the bundling of fibers in cells is entropically driven and that accessory binding proteins in the cytoplasm serve to modulate the process rather than create it. MeSH Headings: Biophysical Phenomena; Biophysics; Cytoskeleton/metabolism; Cytoskeleton/ultrastructure; Cytosol/metabolism; Macromolecular Substances; Models, Biological; Proteins/chemistry; Proteins/metabolism; Solutions; Thermodynamics. Grants: HL-36546 NHLBI NIH HHS; HL08472 NHLBI NIH HHS
50	Theoretical studies of DNA during orthogonal field alternating gel electrophoresis. J Chem Phys 1991. [DOI: 10.1063/1.459963] [Citation(s) in RCA: 28] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022]