Biointelligence

April 17, 2010

A meta-analysis of two-dimensional electrophoresis pattern of the Parkinson’s disease-related protein DJ-1

Filed under: Bioinformatics — Biointelligence: Education,Training & Consultancy Services @ 9:00 am

The two-dimensional electrophoresis (2-DE) pattern of proteins is thought to be specifically related to the physiological or pathological condition at the moment of sample preparation. On this basis, most proteomic studies aim to identify specific hallmarks for a number of different conditions. However, the information arising from these investigations is often incomplete because of inherent limitations of the technique, extensive post-translational modification of proteins and, sometimes, the paucity of available samples.

The meta-analysis of proteomic data can provide valuable information pertinent to various biological processes that otherwise remains hidden.

Results: Here, we show a meta-analysis of the Parkinson's disease (PD) protein DJ-1 in heterogeneous 2-DE experiments. The protein was shown to segregate into specific clusters associated with defined conditions.

Interestingly, the DJ-1 pool from neural tissues displayed a specific and characteristic molecular weight and isoelectric point pattern. Moreover, changes in this pattern have been related to neurodegenerative processes and aging. These results were experimentally validated on human brain specimens from control subjects and PD patients.
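The segregation of spots into clusters described above can be illustrated with a small sketch. The abstract does not say which clustering procedure the authors ran (their analysis relied on ImageJ and XLSTAT), so plain k-means is used here only as a stand-in. The spot coordinates are invented for illustration and are not taken from the study; the names `spots` and `kmeans` are hypothetical.

```python
from math import dist

def kmeans(points, centroids, iterations=20):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        labels = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Invented (isoelectric point, molecular weight in kDa) pairs, NOT real DJ-1 data.
spots = [(5.6, 20.1), (5.7, 20.3), (5.5, 19.9),
         (6.4, 23.0), (6.5, 22.8), (6.3, 23.2)]

# Seed the two clusters with one spot from each apparent group.
labels, centroids = kmeans(spots, centroids=[spots[0], spots[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In a real meta-analysis the two axes would first need to be brought to comparable scales across gels, which this sketch skips.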

Availability: ImageJ is a public domain image processing program developed by the National Institutes of Health and is freely available at http://rsbweb.nih.gov/ij. All the ImageJ macros used in this study are available as supplementary material and upon request at info@biodigitalvalley.com. XLSTAT can be purchased online at http://www.xlstat.com/en/home/ at a current cost of 300 EUR.

April 7, 2010

EBImage—an R package for image processing with applications to cellular phenotypes

Filed under: Bioinformatics — Biointelligence: Education,Training & Consultancy Services @ 9:00 am

EBImage provides general purpose functionality for reading, writing, processing and analysis of images. Furthermore, in the context of microscopy-based cellular assays, EBImage offers tools to segment cells and extract quantitative cellular descriptors. This allows the automation of such tasks using the R programming language and use of existing tools in the R environment for signal processing, statistical modeling, machine learning and data visualization.
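To make "segment cells and extract quantitative cellular descriptors" concrete, here is a minimal sketch of the idea in plain Python. It is not EBImage's API (EBImage is an R package); it only shows one simple way such a task can be approached, assuming bright objects on a dark background: threshold the image, label 4-connected regions, then compute area and mean intensity per object. The toy image and the function names `segment` and `describe` are hypothetical.

```python
from collections import deque

def segment(image, threshold):
    """Label 4-connected foreground regions (pixel > threshold); 0 means background."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    n_objects = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or labels[r][c]:
                continue
            n_objects += 1                      # found an unlabeled foreground pixel
            labels[r][c] = n_objects
            queue = deque([(r, c)])
            while queue:                        # breadth-first flood fill
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] > threshold and not labels[ny][nx]):
                        labels[ny][nx] = n_objects
                        queue.append((ny, nx))
    return labels, n_objects

def describe(image, labels, n_objects):
    """Per-object descriptors: area in pixels and mean intensity."""
    stats = {k: {"area": 0, "total": 0} for k in range(1, n_objects + 1)}
    for image_row, label_row in zip(image, labels):
        for value, lab in zip(image_row, label_row):
            if lab:
                stats[lab]["area"] += 1
                stats[lab]["total"] += value
    return {k: {"area": s["area"], "mean_intensity": s["total"] / s["area"]}
            for k, s in stats.items()}

# Toy 5x6 grayscale image with two bright "cells" (made-up values).
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0, 0],
    [0, 7, 9, 0, 6, 0],
    [0, 0, 0, 0, 7, 0],
    [0, 0, 0, 0, 0, 0],
]
labels, n_objects = segment(image, threshold=5)
descriptors = describe(image, labels, n_objects)
print(descriptors)
# {1: {'area': 4, 'mean_intensity': 8.25}, 2: {'area': 2, 'mean_intensity': 6.5}}
```

Real microscopy images usually need smoothing, adaptive thresholds and separation of touching cells, none of which is attempted here.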

Availability: EBImage is free and open source, released under the LGPL license and available from the Bioconductor project (http://www.bioconductor.org/packages/release/bioc/html/EBImage.html).

December 1, 2009

Machine Learning in Bioinformatics: A Review

Filed under: Bioinformatics,Computational Biology,Systems Biology — Biointelligence: Education,Training & Consultancy Services @ 12:12 pm

Ongoing research keeps increasing the amount of biological data available. This exponential growth raises two problems:

1. Efficient storage and management of the information.

2. Extraction of useful information from these data, which requires the development of tools and methods capable of transforming all these heterogeneous data into biological knowledge about the underlying mechanisms.

Machine learning techniques are applied to extract knowledge from data in many biological domains. The figure below shows the main areas of biology where computational methods are being applied, such as genomics, proteomics, microarrays, evolution and text mining.

 

In addition to the applications above, computational techniques are used to solve other problems, such as efficient primer design for PCR, biological image analysis and backtranslation of proteins (which, given the degeneracy of the genetic code, is a complex combinatorial problem). Machine learning consists of programming computers to optimize a performance criterion using example data or past experience. The optimized criterion can be the accuracy of a predictive model (in a modelling problem) or the value of a fitness or evaluation function (in an optimization problem). Machine learning uses statistical theory when building computational models, since the objective is to make inferences from a sample. The two main steps in this process are:

1. To induce the model by processing the huge amount of data.

2. To represent the model and make inferences efficiently.
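The two steps above can be sketched with one of the simplest possible models, a nearest-centroid classifier. This is only an illustration and not a method singled out by the review. The feature values and the class names "control" and "disease" are made up, and accuracy on a small held-out set plays the role of the performance criterion.

```python
from math import dist

def induce(samples, labels):
    """Step 1: induce the model, here one mean vector (centroid) per class."""
    grouped = {}
    for x, y in zip(samples, labels):
        grouped.setdefault(y, []).append(x)
    return {y: tuple(sum(col) / len(xs) for col in zip(*xs))
            for y, xs in grouped.items()}

def predict(model, x):
    """Step 2: infer the class of a new sample from the compact model."""
    return min(model, key=lambda y: dist(x, model[y]))

def accuracy(model, samples, labels):
    """Performance criterion: fraction of samples classified correctly."""
    hits = sum(predict(model, x) == y for x, y in zip(samples, labels))
    return hits / len(samples)

# Made-up two-feature measurements for two hypothetical classes.
train_x = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (3.0, 0.5), (3.2, 0.7), (2.8, 0.3)]
train_y = ["control"] * 3 + ["disease"] * 3
test_x = [(1.1, 1.9), (2.9, 0.6)]
test_y = ["control", "disease"]

model = induce(train_x, train_y)
print(accuracy(model, test_x, test_y))  # 1.0
```

The model is just two centroids regardless of how many training samples were processed, which is what makes the inference step cheap.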

The process of transforming data into knowledge is both iterative and interactive. The iterative phase consists of several steps. In the first step, we integrate and merge the different sources of information into a single format; data warehouse techniques are used to detect and resolve outliers and inconsistencies. In the second step, we select, clean and transform the data: incorrect data are eliminated or corrected, and a strategy for imputing missing data is chosen. This step also selects the relevant and non-redundant variables; the same kind of selection can be applied to the instances. In the third step, called data mining, we take the objectives of the study into account to choose the most appropriate analysis for the data. Here the paradigm for supervised or unsupervised classification is selected and the model is induced from the data. Once the model is obtained, it should be evaluated and interpreted from both statistical and biological points of view and, if necessary, we return to the previous steps for a new iteration. This includes resolving conflicts with current knowledge in the domain. The satisfactorily checked model, and the new knowledge discovered, are then used to solve the problem.
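As a rough illustration of the second step described above, the sketch below imputes missing values with the column mean and then drops variables that are almost perfectly correlated with one already kept. Both choices are assumptions made for the example; the text does not prescribe a particular imputation strategy or redundancy criterion. The variable names (`gene_a`, `gene_b`, `gene_c`), the values and the 0.95 cut-off are hypothetical, and columns are assumed not to be constant.

```python
from math import sqrt

def impute_mean(column):
    """Replace missing values (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant columns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def drop_redundant(table, limit=0.95):
    """Keep a variable only if it is not strongly correlated with one already kept."""
    kept = {}
    for name, column in table.items():
        if all(abs(pearson(column, other)) < limit for other in kept.values()):
            kept[name] = column
    return kept

# Made-up measurements; gene_b nearly duplicates gene_a, gene_c has a gap.
raw = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.1, 3.9, 6.0, 8.1],
    "gene_c": [5.0, None, 1.0, 3.0],
}
clean = {name: impute_mean(column) for name, column in raw.items()}
selected = drop_redundant(clean)
print(list(selected))  # ['gene_a', 'gene_c']
```

The cleaned, reduced table would then feed the data mining step, and the whole sequence can be rerun if the evaluation of the model suggests it.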

An article published in the journal ‘Briefings in Bioinformatics’ gives an overview of the machine learning techniques used in bioinformatics. It covers major techniques such as Bayesian classifiers, logistic regression, discriminant analysis, classification trees, nearest neighbour, neural networks, support vector machines, clustering and hidden Markov models, among others.

 The article can be found here: http://bib.oxfordjournals.org/cgi/content/full/7/1/86?maxtoshow=&HITS=&hits=&RESULTFORMAT=&fulltext=bioinformatics&andorexactfulltext=and&searchid=1&FIRSTINDEX=0&resourcetype=HWCIT