September 3, 2009

WebArrayDB: A Platform for Microarray Data Analysis

Filed under: Bioinformatics,Microarray — Biointelligence: Education,Training & Consultancy Services @ 3:02 pm


 Microarray Data Analysis

Cross-platform microarray analysis is an increasingly important research tool, but researchers still lack open source tools for storing, integrating, and analyzing large amounts of microarray data obtained from different array platforms.

WebArrayDB, an open source integrated microarray database and analysis suite, has been developed to address this need. It features convenient uploading of data for storage in a MIAME (Minimum Information About a Microarray Experiment) compliant fashion, and allows data to be mined with a large variety of R-based tools, including data analysis across multiple platforms. Different methods for probe alignment, normalization and statistical analysis are included to account for systematic bias. Student’s t-test, moderated t-tests, non-parametric tests, and analysis of variance or covariance (ANOVA/ANCOVA) are among the choices of algorithms for differential analysis of data. Users also have the flexibility to define new factors and create new analysis models to fit complex experimental designs. All data can be queried or browsed through a web browser. The computations can be performed in parallel on symmetric multiprocessing (SMP) systems or Linux clusters.
The software package is available for use on a public web server or can be downloaded.
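
To make the simplest of the differential tests listed above concrete, here is a minimal sketch of a pooled-variance Student's t-test for one probe measured in two groups of arrays. It is written in plain Python for illustration only and is not WebArrayDB's own R-based code; the function name `student_t` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two measurements")
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    # Standard error of the difference between the two group means.
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical log2 intensities for one probe in control and treated arrays.
control = [7.9, 8.1, 8.0, 8.2]
treated = [9.0, 9.3, 8.8, 9.1]
t_stat, dof = student_t(control, treated)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

The sketch fails with a division by zero if both groups have zero variance. Avoiding that kind of instability with few replicates is one reason moderated t-tests are commonly offered alongside the classical test.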

Check out WebArray at:

August 9, 2009

Bioinformatics Tools: NCBI Tools for Data Mining – Part I

Filed under: Bioinformatics,Computational Biology — Biointelligence: Education,Training & Consultancy Services @ 11:41 am

Here is a list of tools hosted by NCBI for data mining:

Tools for Nucleotide Sequence Analysis


BLAST:

The Basic Local Alignment Search Tool (BLAST) compares gene and protein sequences against others in public databases. It now comes in several types, including PSI-BLAST, PHI-BLAST, and BLAST 2 Sequences. Specialized BLASTs are also available for human, microbial, malaria, and other genomes, as well as for vector contamination, immunoglobulins, and tentative human consensus sequences.

Electronic PCR:

Electronic PCR allows you to search your DNA sequence for sequence tagged sites (STSs) that have been used as landmarks in various types of genomic maps. It compares the query sequence against data in NCBI’s UniSTS, a unified, non-redundant view of STSs from a wide range of sources.
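
The idea behind an STS search can be sketched in a few lines of Python: an STS is defined by a primer pair and an expected product size, so a hit is a place where the forward primer and the reverse complement of the reverse primer occur at a suitable distance. This sketch assumes exact primer matches on the forward strand only and is not NCBI's implementation; the function names and sequences are made up for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sts(sequence, forward_primer, reverse_primer, min_size, max_size):
    """Return (start, end) positions of candidate amplicons in the sequence.

    A candidate starts at an exact forward-primer match and ends at an exact
    match of the reverse primer's reverse complement, with a total length
    between min_size and max_size (inclusive).
    """
    sequence = sequence.upper()
    fwd = forward_primer.upper()
    rev_site = reverse_complement(reverse_primer.upper())
    hits = []
    start = sequence.find(fwd)
    while start != -1:
        site = sequence.find(rev_site, start + len(fwd))
        while site != -1:
            end = site + len(rev_site)
            if min_size <= end - start <= max_size:
                hits.append((start, end))
            site = sequence.find(rev_site, site + 1)
        start = sequence.find(fwd, start + 1)
    return hits


# Hypothetical query and primer pair, far shorter than real primers.
query = "TTACGTACGGGGGGGGTTCAGGAA"
print(find_sts(query, "ACGTAC", "CCTGAA", min_size=15, max_size=25))
```

A production tool would also tolerate mismatches and gaps in the primer sites and check both strands, which this sketch leaves out.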

Entrez Gene:

Each Entrez Gene record encapsulates a wide range of information for a given gene and organism. When possible, the information includes results of analyses that have been done on the sequence data. The amount and type of information presented depend on what is available for a particular gene and organism and can include: (1) graphic summary of the genomic context, intron/exon structure, and flanking genes, (2) link to a graphic view of the mRNA sequence, which in turn shows biological features such as CDS, SNPs, etc., (3) links to gene ontology and phenotypic information, (4) links to corresponding protein sequence data and conserved domains, (5) links to related resources, such as mutation databases. Entrez Gene is a successor to LocusLink.

Model Maker:

Model Maker allows you to view the evidence (mRNAs, ESTs, and gene predictions) that was aligned to assembled genomic sequence to build a gene model and to edit the model by selecting or removing putative exons. You can then view the mRNA sequence and potential ORFs for the edited model and save the mRNA sequence data for use in other programs. Model Maker is accessible from sequence maps that were analyzed at NCBI and displayed in Map Viewer.

ORF Finder:

ORF Finder identifies all possible ORFs in a DNA sequence by locating the standard and alternative stop and start codons. The deduced amino acid sequences can then be used to BLAST against GenBank. ORF Finder is also packaged in the sequence submission software Sequin.
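
The scanning idea can be illustrated with a small Python sketch that walks all six reading frames and reports every stretch running from a start codon to the next in-frame stop codon. It assumes the standard genetic code with ATG as the only start codon, so the alternative codons mentioned above are not handled, and it is not the ORF Finder code itself. All names are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code only
START_CODON = "ATG"                  # alternative start codons are ignored


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def orfs_in_strand(seq, min_codons):
    """Yield (frame, start, end, orf) for the three frames of one strand."""
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == START_CODON:
                start = pos  # open an ORF at the first ATG seen
            elif start is not None and codon in STOP_CODONS:
                # Count codons before the stop codon, including the ATG.
                if (pos - start) // 3 >= min_codons:
                    yield frame, start, pos + 3, seq[start:pos + 3]
                start = None


def find_orfs(dna, min_codons=1):
    """Return (strand, frame, start, end, orf) tuples for all six frames.

    Coordinates on the "-" strand refer to the reverse-complemented sequence.
    """
    dna = dna.upper()
    results = [("+",) + orf for orf in orfs_in_strand(dna, min_codons)]
    results += [("-",) + orf
                for orf in orfs_in_strand(reverse_complement(dna), min_codons)]
    return results


for orf in find_orfs("CCATGAAATAGCC"):
    print(orf)
```

ORFs that run off the end of the sequence without reaching a stop codon are dropped here. Raising `min_codons` filters out the many short ORFs that occur by chance.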


SAGEmap:

SAGEmap is a tool for performing statistical tests designed specifically for differential-type analyses of SAGE (Serial Analysis of Gene Expression) data. The data include SAGE libraries generated by individual labs as well as those generated by the Cancer Genome Anatomy Project (CGAP), which have been submitted to Gene Expression Omnibus (GEO). Gene expression profiles that compare the expression in different SAGE libraries are also available on the Entrez GEO Profiles pages. It is possible to enter a query sequence in the SAGEmap resource to determine what SAGE tags are in the sequence, then map to associated SAGE tag records and view the expression of those tags in different CGAP SAGE libraries.
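
As a rough illustration of what finding SAGE tags in a query sequence involves, the Python sketch below lists every candidate tag on the forward strand. It assumes the common SAGE protocol in which NlaIII (recognition site CATG) is the anchoring enzyme and a tag is the 10 bases immediately downstream of the site. These parameters and the function name are assumptions for illustration, not details taken from the SAGEmap resource.

```python
ANCHOR_SITE = "CATG"  # assumed anchoring enzyme site (NlaIII)
TAG_LENGTH = 10       # assumed tag length for classic SAGE


def candidate_sage_tags(sequence, tag_length=TAG_LENGTH):
    """Return (site_position, tag) for every anchoring site in the sequence.

    Sites too close to the end of the sequence to yield a full-length tag
    are skipped.
    """
    sequence = sequence.upper()
    tags = []
    pos = sequence.find(ANCHOR_SITE)
    while pos != -1:
        tag_start = pos + len(ANCHOR_SITE)
        tag = sequence[tag_start:tag_start + tag_length]
        if len(tag) == tag_length:
            tags.append((pos, tag))
        pos = sequence.find(ANCHOR_SITE, pos + 1)
    return tags


# Hypothetical query sequence with a single anchoring site.
print(candidate_sage_tags("AACATGACGTACGTACTT"))
```

In an actual SAGE experiment only the tag next to the 3'-most anchoring site of a transcript is normally sequenced, so the last entry in the list is usually the one of interest for an mRNA query.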


Spidey:

Spidey aligns one or more mRNA sequences to a single genomic sequence. It tries to determine the exon/intron structure, returning one or more models of the genomic structure, including the genomic/mRNA alignments for each exon.


VecScreen:

VecScreen is a tool for identifying segments of a nucleic acid sequence that may be of vector, linker, or adapter origin prior to sequence analysis or submission. It was developed to combat the problem of vector contamination in public sequence databases.

Part II of NCBI Tools follows in the next post. Keep visiting!