Genome/Transcriptome-wide tag scanner

Tagger is a tool for searching fixed-sequence tags (or sequence branches) against entire genomes or mRNA reference sequence databases.

Tagger maps short sequences rapidly against genomic (or genomic-like, e.g. mRNA) databases. Because it searches only for exact matches within the database sequences, it is much faster than other existing mapping solutions. This speed is essential for high-throughput genomics/transcriptomics data analysis.

The software package comprises two main tools. The tagger program is a memory-based tool that hashes the query sequences in memory and then scans the database for word matches. The fetchGWI program relies on pre-computed genome indices and is best used when queries must be mapped very rapidly and efficiently. For maximal search speed, fetchGWI searches only within the index files that represent the genome sequences. The index holds one entry for each nucleotide in the genome, so no exact match can be missed.
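
The two strategies can be illustrated with a minimal Python sketch. It is not the Tagger source code: the function names are hypothetical, the tags are assumed to share one fixed length, and a naive sorted list of positions stands in for the fetchGWI index files, whose actual format is not described here.

```python
def scan_with_query_hash(queries, genome):
    """Tagger-style idea: hash the queries, then scan the database once.

    Assumption: all queries have the same fixed length.
    Returns {query: [0-based match positions]}.
    """
    if not queries:
        return {}
    k = len(queries[0])
    if any(len(q) != k for q in queries):
        raise ValueError("all tags must have the same length")
    hits = {q: [] for q in queries}           # in-memory hash of the queries
    for pos in range(len(genome) - k + 1):    # slide a k-wide window
        word = genome[pos:pos + k]
        if word in hits:                      # constant-time hash lookup
            hits[word].append(pos)
    return hits


def build_position_index(genome):
    """fetchGWI-style idea: one index entry per nucleotide position.

    Assumption: the index is a list of all start positions, sorted by the
    sequence that begins at each position (a naive suffix array).
    """
    return sorted(range(len(genome)), key=lambda i: genome[i:])


def lookup(index, genome, query):
    """Binary-search the pre-computed index for all exact matches of query."""
    k = len(query)
    lo, hi = 0, len(index)
    while lo < hi:  # find the first entry whose k-letter prefix is >= query
        mid = (lo + hi) // 2
        if genome[index[mid]:index[mid] + k] < query:
            lo = mid + 1
        else:
            hi = mid
    matches = []
    # Entries with an identical prefix are adjacent in the sorted index.
    while lo < len(index) and genome[index[lo]:index[lo] + k] == query:
        matches.append(index[lo])
        lo += 1
    return sorted(matches)
```

In this sketch the first function pays the cost of one full pass over the database for every batch of queries, while the second pays once to build the index and then answers each query with a binary search. Because every position has an entry, an exact match cannot be skipped.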

Based on Tagger, we provide two Web-based applications for the following purposes:

  • TagScan : an online tool for finding short exact or near-exact matches in large sequence databases
  • ZFN-Site : a genome-wide tag scanner for nuclease off-sites
Both applications rely on the fetchGWI software for matching large collections of short sequences to genome-size databases.

The source code for the Tagger software is available on SourceForge.
Last update 15 June 2012