Tagger
Genome/Transcriptome-wide Tag Scanner

Introduction

The Tagger project provides tools for finding short exact matches in large sequence databases.
Two tools are provided: Tagger, a memory-based tool that hashes the query sequences in memory and then scans the database for word matches, and fetchGWI, which indexes the words of the database on disk and then looks up matches to the query.
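The memory-based approach can be illustrated with a minimal Python sketch. This is not the Tagger source code; it only shows the general idea of storing the query tags in a hash-based lookup structure and sliding a window over the database sequence. The function name scan_for_tags is hypothetical, and the sketch handles exact matches on a single strand only.

```python
from collections import defaultdict


def scan_for_tags(tags, database):
    """Return {tag: [0-based positions]} for exact tag occurrences.

    Illustrative only: the query tags are hashed in memory (Python sets),
    then the database sequence is scanned once per distinct tag length.
    """
    # Group the query tags by length so each length needs one sliding window.
    by_length = defaultdict(set)
    for tag in tags:
        by_length[len(tag)].add(tag.upper())

    database = database.upper()
    hits = defaultdict(list)
    for k, words in by_length.items():
        # Slide a window of width k over the database and test membership
        # in the hashed set of query words.
        for pos in range(len(database) - k + 1):
            window = database[pos:pos + k]
            if window in words:
                hits[window].append(pos)
    return dict(hits)


if __name__ == "__main__":
    print(scan_for_tags(["ACGT", "GG"], "ttacgtggacgt"))
```

The disk-based alternative reverses the roles: the words of the database are indexed once, and each query is then looked up in that index.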

Project Description

The Tagger and fetchGWI tools were originally developed by Christian Iseli at the Ludwig Institute for Cancer Research (LICR). They were further extended, in particular the Tagger program and the Web server, by Giovanna Ambrosini at the Ecole Polytechnique Federale de Lausanne (EPFL).

Tagger and, in particular, fetchGWI have been shown to be versatile tools for rapidly searching multiple genomes thanks to a simple and efficient indexing strategy. These tools may prove helpful to users who need fast matching of very large probe collections to one or more genomes and, more generally, for high-throughput genomics/transcriptomics data analysis.


References
PMID:17593978
Indexing Strategies for Rapid Searches of Short Words in Genome Sequences
Iseli C, Ambrosini G, Bucher P, and Jongeneel CV.
PLoS ONE. 2007; 2(6): e579.

Web Server

We provide a web page for mapping short oligo sequences to one or more genomes and transcriptomes.

Direct hyperlink from text documents

If you want to link a sequence tag within a text document to one or more genomes, for instance "CCACTCTCTCTTTCCGG" (hyperlinked), use the following inline URL syntax to call fetchGWI or Tagger:

    http://ccg.vital-it.ch/cgi-bin/tagger/tagscan?dbtype=dna&dbname=HS MM AME PTR&mode=1&tag=CCACTCTCTCTTTCCGG
Where:
  • dbtype specifies whether searched databases are genomes (dna) or transcriptomes (rna)
  • dbname is the list of sequence databases to be searched:
    • HS --> Homo sapiens
    • MM --> Mus musculus
    • AME --> Apis mellifera
    • BT --> Bos taurus
    • CFA --> Canis familiaris
    • DM --> Drosophila melanogaster
    • PTR --> Pan troglodytes
    • RN --> Rattus norvegicus
    • SCE --> Saccharomyces cerevisiae
    • TCA --> Tribolium castaneum
    • ALL --> All genomes/transcriptomes

  • mode indicates whether the search is exact (0) or tolerates one or more mismatches (1, 2, 3 ...)
  • tag is a short (10-40 nucleotides) sequence tag
In the given example, the hyperlinked tag CCACTCTCTCTTTCCGG is mapped to the human (HS), mouse (MM), bee (AME) and chimpanzee (PTR) genomes (dbtype=dna), including single-nucleotide mismatches (mode=1).
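Such links can also be generated programmatically. The following Python sketch assembles the URL from the four parameters described above. The function name build_tagscan_url and the constant names are hypothetical, and percent-encoding the spaces in dbname as %20 is an assumption about what the server accepts, not something stated on this page.

```python
from urllib.parse import quote, urlencode

# Base address taken from the example URL above.
BASE_URL = "http://ccg.vital-it.ch/cgi-bin/tagger/tagscan"

# Database codes listed above.
VALID_DBNAMES = {"HS", "MM", "AME", "BT", "CFA", "DM",
                 "PTR", "RN", "SCE", "TCA", "ALL"}


def build_tagscan_url(tag, dbnames, dbtype="dna", mode=0):
    """Build a tagscan link for one tag (hypothetical helper)."""
    if dbtype not in ("dna", "rna"):
        raise ValueError("dbtype must be 'dna' or 'rna'")
    if not 10 <= len(tag) <= 40:
        raise ValueError("tag must be 10-40 nucleotides long")
    if mode < 0:
        raise ValueError("mode must be 0 (exact) or a positive mismatch count")
    unknown = [name for name in dbnames if name not in VALID_DBNAMES]
    if unknown:
        raise ValueError("unknown database code(s): " + ", ".join(unknown))

    params = {
        "dbtype": dbtype,
        "dbname": " ".join(dbnames),  # space-separated list, as in the example
        "mode": mode,
        "tag": tag,
    }
    # quote_via=quote encodes spaces as %20 instead of '+' (assumption).
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(build_tagscan_url("CCACTCTCTCTTTCCGG",
                            ["HS", "MM", "AME", "PTR"], mode=1))
```

The helper rejects tags outside the 10-40 nucleotide range and database codes not in the list above, so malformed links fail early rather than at the server.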

The source code is available on

Last update 13 Feb. 2010