September 9, 2009

R Programming for Computational Biology

Filed under: Bioinformatics,Computational Biology — Biointelligence: Education,Training & Consultancy Services @ 3:26 am

The R Programming Language: An Introduction

R is a language and environment for statistical computing and graphics. It provides a wide variety of statistical techniques (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, etc.) as well as graphical techniques, and it is highly extensible. It includes:

1. An effective data handling and storage facility.
2. A suite of operators for calculations on arrays, in particular matrices.
3. A large, coherent, integrated collection of intermediate tools for data analysis.
4. Graphical facilities for data analysis and display either on-screen or on hardcopy.
5. A well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.
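To make the list above a little more concrete, here is a minimal sketch in base R touching on matrix operators, a conditional, a recursive user-defined function, a loop and simple output. The matrix values and the names `expr` and `factorial_rec` are made up purely for illustration.

```r
# A small numeric matrix: 2 rows (genes) by 3 columns (samples).
# The values are invented for illustration only.
expr <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE,
               dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))

# Array operators work on whole matrices at once.
scaled   <- expr * 2          # element-wise multiplication
cross    <- expr %*% t(expr)  # matrix product, giving a 2 x 2 matrix
row_mean <- rowMeans(expr)    # per-row mean across the columns

# A user-defined recursive function with a conditional.
factorial_rec <- function(n) {
  if (n <= 1) 1 else n * factorial_rec(n - 1)
}

# A loop with simple output to the console.
for (g in rownames(expr)) {
  cat(g, "mean:", row_mean[[g]], "\n")
}
```

Pasting this into an R session and then inspecting `scaled`, `cross` or `factorial_rec(5)` is an easy way to get a feel for how the pieces fit together.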

Why use R in Computational Biology?

One of the main reasons that computational biologists use R is the Bioconductor project, which is a set of packages for R to analyse genomic data. In many cases these packages have been provided by researchers to complement descriptions of algorithms in journal articles. Many computational biologists regard R and Bioconductor as fundamental tools for their research. R is a modern, functional programming language that allows for rapid development of ideas, together with object-oriented features for rigorous software development. Its rich set of built-in functions makes it well suited to high-volume analysis or statistical simulations, and its packaging system means that code provided by others can easily be shared. Finally, it generates high-quality graphical output, so that all stages of a study, from modelling and analysis to publication, can be undertaken within R.
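As a rough illustration of the functional style, built-in functions and graphics mentioned above, the following sketch simulates the GC content of random DNA sequences using base R only. The helper `gc_content`, the sequence counts and the equal base probabilities are assumptions chosen for the example; it does not use Bioconductor.

```r
set.seed(1)  # make the random draws reproducible

# Hypothetical helper: fraction of G and C bases in a character vector of bases.
gc_content <- function(bases) {
  mean(bases %in% c("G", "C"))
}

# Simulate 1000 random DNA sequences of 100 bases each, with equal base
# probabilities (an assumption made only for illustration).
n_seqs     <- 1000
seq_len_bp <- 100
gc <- replicate(
  n_seqs,
  gc_content(sample(c("A", "C", "G", "T"), seq_len_bp, replace = TRUE))
)

# Summarise and plot the simulated distribution.
print(summary(gc))
hist(gc, main = "Simulated GC content", xlab = "GC fraction")
```

The simulation, the summary and the figure all come from a handful of built-in functions, which is the kind of workflow the paragraph above describes.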

For further reading on R programming, refer to this paper: