Software

AMOS Assembler project

AMOS is a set of tools, libraries, and freestanding genome assemblers, all open source. AMOS is also an open consortium that includes TIGR, the University of Maryland, the Karolinska Institutet, and the Marine Biological Laboratory.

AMOScmp

AMOScmp is a comparative genome assembler that uses one genome as a reference on which to assemble another, closely related species. See the journal paper here.

ARDB

(New in early 2009) Antibiotic Resistance Genes Database

AutoEditor

A tool for correcting sequencing and basecaller errors using sequence assembly and chromatogram data. On average AutoEditor corrects 80% of erroneous base calls, with an accuracy of 99.99%.

BAMBUS

The first publicly available, standalone genome sequence scaffolding program. It orders and orients contigs into scaffolds based on various types of linking information.

Bambus2

Bambus 2.0 is the second-generation Bambus scaffolder, available as an open source package. While most other scaffolders are closely tied to a specific assembly program, Bambus accepts the output from most current assemblers and gives the user great flexibility in choosing the scaffolding parameters. In particular, Bambus can accept contig linking data other than that specified by mate pairs. Such sources of information include alignment to a reference genome (Bambus can directly use the output of MUMmer), physical mapping data, or information about gene synteny.

Bowtie

An ultrafast, memory-efficient short read aligner that aligns short DNA sequences to the human genome at a rate of about 25 million reads per hour on a typical workstation with 2 GB of memory. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: 1.1 GB for the human genome.
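The Burrows-Wheeler transform underlying such an index can be illustrated with a toy example. The sketch below is only an illustration of the transform itself, not Bowtie's actual index or code; the function names `bwt` and `inverse_bwt` are hypothetical, and the sorted-rotations approach is far too slow for a real genome.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations (illustration only)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    # Sort all cyclic rotations and read off the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Recover the original text by repeatedly prepending and re-sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The row that ends with the sentinel is the original text plus sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


transformed = bwt("GATTACA")
restored = inverse_bwt(transformed)
```

The transform is reversible, and it tends to group identical characters together, which is what makes compact indexes of this kind possible.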

Steven Salzberg has been nominated for the 2013 Benjamin Franklin Award in the Life Sciences. This is a humanitarian/bioethics award presented to an individual who has, in his or her practice, promoted free and open access to the materials and methods used in the life sciences. More information on the award can be found at http://www.bioinformatics.org/franklin/.

BRCA gene testing

A computational screening test that takes the raw DNA sequence data from a whole-genome sequence of an individual human and tests for each of 68 known mutations in the BRCA1 and BRCA2 genes.

Celera Assembler

A whole genome assembler originally developed at Celera Genomics for the assembly of the human genome. Celera Assembler is now an open-source project at SourceForge. The code is actively maintained by researchers at CBCB and the Venter Institute (formerly known as TIGR, The Institute for Genomic Research).

CloudBurst

(New in Nov 2008) Highly Sensitive Short Read mapping with MapReduce. CloudBurst uses Hadoop, an open source version of Google's parallel computing software MapReduce, to efficiently parallelize the short read mapping problem across dozens or hundreds of computers. This enables CloudBurst to execute highly sensitive read mappings with any number of mutations or indels.

Crossbow

(New in Nov 2009) Crossbow is a scalable software pipeline for whole genome resequencing analysis. It combines Bowtie, an ultrafast and memory efficient short read aligner, and SoapSNP, an accurate genotyper, within Hadoop to distribute and accelerate the computation with many nodes. The pipeline can accurately analyze over 35x coverage of a human genome in one day on a 10-node local cluster, or in 3 hours for about $100 using a 40-node, 320-core cluster rented from Amazon's EC2 utility computing service.

Cufflinks

(New in September 2009) A transcript assembler and abundance estimator for RNA-Seq

DNACLUST

(New in July 2010) DNACLUST is a tool for clustering millions of short DNA sequences. DNACLUST is free software.

ELPH

A motif finder based on Gibbs sampling that can find ribosome binding sites, exon splicing enhancers, or regulatory sites.

ExAlt

A Phylogenetic Generalized Hidden Markov Model for finding alternatively spliced exons.

Figaro

A vector trimmer capable of accurately trimming vector from shotgun reads without prior knowledge of the vector sequence. Figaro statistically models short oligonucleotide frequencies in order to infer which oligos are associated with vector sequence.

FLASH

A fast, accurate tool that increases the length of reads by overlapping and merging mate pairs from fragments shorter than twice the read length.
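When a fragment is shorter than twice the read length, the two mates overlap at their ends, so they can be joined into one longer read. The sketch below is a simplified illustration of that idea, not FLASH's actual algorithm or scoring; `merge_pair`, `min_overlap`, and `max_mismatch_ratio` are hypothetical names and defaults chosen for the example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def merge_pair(read1, read2, min_overlap=10, max_mismatch_ratio=0.1):
    """Merge a read pair if the end of read1 overlaps the start of the
    reverse-complemented read2; return the merged read, or None."""
    mate = reverse_complement(read2)
    best = None  # (mismatch ratio, overlap length)
    # Try overlaps from longest to shortest; ties favor the longer overlap.
    for overlap in range(min(len(read1), len(mate)), min_overlap - 1, -1):
        mismatches = sum(a != b for a, b in zip(read1[-overlap:], mate[:overlap]))
        ratio = mismatches / overlap
        if ratio <= max_mismatch_ratio and (best is None or ratio < best[0]):
            best = (ratio, overlap)
    if best is None:
        return None
    # Keep read1's bases in the overlap, then append the rest of the mate.
    return read1 + mate[best[1]:]


# A 30 bp fragment sequenced from both ends with 20 bp reads.
fragment = "ACGTTGCAAGGCTTAACCGGATCGATTACG"
read1 = fragment[:20]
read2 = reverse_complement(fragment[-20:])
merged = merge_pair(read1, read2)
```

In this example the two 20 bp reads share a 10 bp overlap, so the merged read spans the whole 30 bp fragment.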

GeneMerge

A program for analysis of microarray data, including rank scores for over-representation of particular functions and categories.
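One common way to score over-representation of a category in a gene set is a hypergeometric tail probability. The sketch below assumes that approach purely for illustration; it is not taken from GeneMerge's code, and the function name and parameters are hypothetical.

```python
from math import comb


def overrepresentation_pvalue(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes when drawing
    `sample` genes without replacement from `population` genes, of which
    `annotated` carry the category of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    favorable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return favorable / total


# 10 genes in total, 3 annotated; a study set of 3 genes contains all 3.
p = overrepresentation_pvalue(population=10, annotated=3, sample=3, hits=3)
```

A small value suggests the category appears in the study set more often than chance would predict; in practice a multiple-testing correction would also be needed when many categories are scored.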

GeneSplicer

A fast system for detecting splice sites in genomic DNA of various eukaryotes.

GeneZilla

A generalized HMM for eukaryotic gene finding, with a design similar to Genscan. Written and maintained by Bill Majoros, now at Duke University.

GiRaF

GiRaF is a computational tool for identification of reassortments in influenza viruses from sequence databases of isolates.

Glimmer

A system that uses interpolated Markov models to find genes in microbial DNA. Used to annotate hundreds (possibly thousands) of bacterial, archaeal, and viral genomes. The current version is 3.02.

GlimmerHMM

A Generalized Hidden Markov Model gene-finder that makes use of the techniques implemented previously by GlimmerM.

Hawkeye

A visual analytics tool for genome assembly analysis and validation, designed to aid in identifying and correcting assembly errors. All levels of the assembly data hierarchy are made accessible to users, along with summary statistics and common assembly metrics. A ranking component guides investigation towards likely mis-assemblies or interesting features to support the task at hand. Can be used to interactively analyze assemblies from many popular assemblers on your desktop computer. See the journal paper here.

Insignia

A comprehensive system for finding unique DNA sequences that can be used to identify any bacterial or viral species or strain. Currently has over 13,000 species and strains in its database.

JELLYFISH

A fast, multithreaded k-mer counter.
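For readers unfamiliar with the task, k-mer counting means tallying every length-k substring in a set of sequences. The sketch below is a minimal single-threaded illustration of the task only, not Jellyfish's implementation; `count_kmers` is a hypothetical helper.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every length-k substring across the given DNA sequences."""
    if k <= 0:
        raise ValueError("k must be positive")
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k along the sequence.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


counts = count_kmers(["ACGTACGT", "CGTA"], k=3)
most_common_kmer, occurrences = counts.most_common(1)[0]
```

A plain dictionary like this works for small inputs; the memory and speed demands of large sequencing datasets are what motivate dedicated tools.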

JIGSAW

(Previously called Combiner) A program that predicts gene models using the output from other annotation software. It uses a statistical algorithm to identify patterns of evidence corresponding to gene models.

metagenomeSeq

R package to estimate differential abundance of marker gene survey data and visualize results.

metAMOS

Metagenomic datasets prove challenging to assemble using traditional assembly pipelines designed for individual genomes. Using AMOS as a foundation, we have created a robust and easy-to-use metagenomic assembly pipeline that takes reads (FASTA, FASTQ, SFF) and assembles them into unitigs (CABOG, NEWBLER, Minimus, SOAPdenovo), contigs and scaffolds (Bambus2), and ORFs (Glimmer MG, MetaGeneMark), and annotates results using MetaPhyler and a graph-based propagation method. metAMOS was designed with efficiency in mind and can run through tens of millions of reads in a few hours on a multi-core workstation with ample RAM.

MetaPath

(New in 2010) MetaPath can identify differentially abundant pathways in metagenomic data-sets, relying on a combination of metagenomic sequence data and prior metabolic pathway knowledge.

MetaPhyler

(New in 2010) Taxonomic Profiling for Metagenomic Sequences.

MINIMUS

A small, lightweight assembler for small jobs such as assembling a viral genome, assembling a set of reads that match a single gene, or other tasks that don't require the complex infrastructure of a large-genome assembler.

MUMmer

A system for aligning whole genomes, chromosomes, and other very long DNA sequences. New (May 2008): see how to use MUMmer to align Solexa reads to the human genome.

MUMmerGPU

High-throughput sequence alignment using Graphics Processing Units (GPUs). Uses a technique called general-purpose GPU programming (GPGPU programming) to harness the extreme parallelism of GPUs for non-graphics tasks. In this application, hundreds of query sequences are simultaneously aligned to a reference sequence, creating an order-of-magnitude speedup over the same alignment on the CPU.

OperonDB

Software and a database of operons covering a large number of prokaryotic genomes. Described in M. Pertea et al., Nucl. Acids Res 37 (2009), D479-D482.

PanArray

PanArray is an oligonucleotide probe selection algorithm for tiling multiple genome sequences using a minimal number of probes. It is capable of fully tiling all genomes of a species on a single microarray chip. These unique pan-genome tiling arrays provide maximum flexibility for the analysis of both known and uncharacterized strains.
