A bioinformatics library for the Go language










Installation

$ go get code.google.com/p/biogo/...

Documentation

Core packages

See sub-packages below.

Mailing list

Overview

bíogo is a bioinformatics library for the Go language.

The Purpose of bíogo

bíogo stems from the need to address the size and structure of modern genomic and metagenomic data sets. These properties impose requirements on the libraries and languages used for analysis:
  • speed - data sets are large
  • concurrency - problems are often embarrassingly parallelisable
In addition to the computational burden of massive data sets in modern genomics, there is an increasing need for complex pipelines to resolve questions in a tightening problem space, and a growing need to develop new algorithms that allow novel approaches to interesting questions. These issues suggest the need for simplicity in syntax to facilitate:
  • ease of coding
  • checking for correctness in development and particularly in peer review
These ideas are more fully discussed in this paper.
Related to the second issue is the reluctance of some researchers to release code because of quality concerns ("Publish your computer code: it is good enough", Nature, 2010).
The issue of code release is the first of the principles formalised in the Science Code Manifesto.

A language with a simple, yet expressive, syntax should facilitate development of higher-quality code and thus help reduce this barrier to research code release. The Go language design satisfies these requirements.
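As a rough illustration of the requirements listed above (speed, concurrency and simple syntax), the following is a minimal sketch of an embarrassingly parallel task written with only the Go standard library. It does not use bíogo's API; the names `gcContent` and `gcAll` are hypothetical and exist only for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// gcContent returns the fraction of G and C bases in seq.
// Hypothetical helper for illustration; not part of bíogo.
func gcContent(seq string) float64 {
	if len(seq) == 0 {
		return 0
	}
	n := 0
	for i := 0; i < len(seq); i++ {
		switch seq[i] {
		case 'G', 'C', 'g', 'c':
			n++
		}
	}
	return float64(n) / float64(len(seq))
}

// gcAll computes gcContent for every sequence concurrently,
// one goroutine per sequence, and preserves the input order.
func gcAll(seqs []string) []float64 {
	out := make([]float64, len(seqs))
	var wg sync.WaitGroup
	for i, s := range seqs {
		wg.Add(1)
		go func(i int, s string) {
			defer wg.Done()
			out[i] = gcContent(s) // each goroutine writes only its own slot
		}(i, s)
	}
	wg.Wait()
	return out
}

func main() {
	seqs := []string{"ACGT", "GGCC", "ATAT"}
	fmt.Println(gcAll(seqs)) // prints [0.5 1 0]
}
```

Because each sequence is processed independently, no locking is needed beyond the `sync.WaitGroup` that waits for all goroutines to finish. For very large inputs a bounded worker pool would likely be preferable to one goroutine per sequence.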

If you use bíogo for work that you subsequently publish, please include a note in the paper linking to this site, and let us know.

Yet Another Bioinformatics Library

It seems that nearly every language has its own bioinformatics library, some of which are very mature, for example BioPerl and BioPython. Why add another one?
The different libraries excel in different fields: acting as scripting glue for applications in a pipeline (much of [1], [2] and [3]) and interacting with external hosts [1], [2]; wrapping lower-level, high-performance languages with more user-friendly syntax [1], [2], [3]; or providing bioinformatics functions for high-performance languages.
The intended niche for bíogo lies somewhere between the scripting libraries and the high-performance language libraries: it should be easy to use for both small and large projects while offering reasonable performance on computationally intensive tasks.
The intent is to reduce the level of investment required to develop new research software for computationally intensive tasks.
  1. BioPerl
  2. BioPython
  3. BioRuby
  4. PyCogent
  5. BioJava
  6. SeqAn