Today, the bioinformatics community is facing a deluge of data. Several experimental technologies have improved to the point where obtaining data is easy; the challenge is now to analyze these data with the relevant applications. For example, sequencing a whole genome has become routine with the new technologies known as Next Generation Sequencing (NGS). Many projects are sequencing the genomes of different organisms, and so continuously provide new sequences for analysis. Algorithms such as BLAST, FastA and ClustalW are used intensively for that analysis and are usually classified as data-intensive: they process gigabytes of data stored in flat-file databases such as UNIPROT, EMBL or PDBseq on a shared filesystem. To address this challenge, StratusLab has built two virtual appliances, “Biological databases repository” and “Bioinformatics compute node”. The repository appliance maintains up-to-date copies of the international reference databases and makes them available through a shared filesystem to the bioinformatics cloud nodes, which come with pre-installed bioinformatics software.
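As an illustration of the arrangement described above, the following minimal sketch shows how a job on a compute node might address a reference database exposed by the repository appliance. It is only a sketch under stated assumptions: the mount point `/mnt/biodb`, the database path `uniprot/uniprot_sprot`, the file names and the helper `blast_command` are hypothetical and are not taken from the StratusLab appliances. It also assumes that the BLAST+ `blastp` program is installed on the node and that the database has already been formatted for BLAST (for example with `makeblastdb`). The code only builds the command line; it does not run it.

```python
from pathlib import PurePosixPath

# Hypothetical mount point of the shared filesystem exported by the
# repository appliance (an assumption, not the appliance's real path).
SHARED_DB_ROOT = PurePosixPath("/mnt/biodb")


def blast_command(query_fasta, db_name, out_path, db_root=SHARED_DB_ROOT):
    """Build a BLAST+ blastp command line for a database on the shared mount.

    The command is returned as an argument list and is not executed here.
    """
    db_path = PurePosixPath(db_root) / db_name
    return [
        "blastp",
        "-query", str(query_fasta),   # input protein sequences (FASTA)
        "-db", str(db_path),          # database prefix on the shared filesystem
        "-out", str(out_path),        # where to write the hits
        "-outfmt", "6",               # tabular output
    ]


if __name__ == "__main__":
    # Hypothetical file and database names, for illustration only.
    cmd = blast_command("query.fasta", "uniprot/uniprot_sprot", "hits.tsv")
    print(" ".join(cmd))
```

Because the database is read from the shared mount rather than copied into each virtual machine, every compute node sees the same up-to-date release that the repository appliance maintains. On a real node the returned list could be handed to `subprocess.run`.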
The use of clouds for bioinformatics has to be connected with public bioinformatics infrastructures such as the French Bioinformatics Network RENABI (www.renabi.fr) and especially its grid infrastructure GRISBI (www.grisbio.fr). The adoption of clouds for bioinformatics applications will depend strongly on the ability of cloud infrastructures to provide ease of use and access to reference biological data and applications. To that end, StratusLab is collaborating with RENABI to help meet the requirements of the bioinformatics community.
Flexibility to use bioinformatics applications with specific software requirements. Clouds for bioinformatics have to be connected with public bioinformatics infrastructures, especially the biological databases.
PaaS
IaaS