SNP2TFBS WEB-SERVER OUTPUT FORMAT DESCRIPTION

####################################################################################################
SNPViewer - Viewing variants that affect TF-binding
####################################################################################################

Two tables

Table1:
    - SNP identifier (with dbSNP link)
    - Chromosome (Feb 2009 GRCh37/hg19)
    - SNP position
    - Number of TFs affected by the variant

Table2:
    - TF name
    - PWM score on ref assembly
    - PWM score on alt assembly
    - Score difference
    - Low score (threshold used to identify PWM sites)
    - High score (threshold for selecting variants with significant regulatory effect)

####################################################################################################
SNPSelect - Annotate selected SNPs with TFBS matches
####################################################################################################

----------------------------------------------------------------------------------------------------
SNP Set analysis (to check overlap of variants from SNP2TFBS in a user defined rsID list or vcf file)
----------------------------------------------------------------------------------------------------

rsID based search
-----------------

Output plots
    - PLOT1: Variants affecting factors - number of user uploaded variants affecting one or more TFs.
    - PLOT2: TF Enrichment - enriched TFs whose sites overlap user defined variants. TFs are sorted based on their enrichment.
    - PLOT3: Variants Annotation - matching variants are annotated based on RefSeq annotation.

Output data files
    - OUTPUT FILE: Variants-TF matches
        - Chromosome
        - SNP pos
        - SNP pos
        - ref allele
        - alt allele
        - Information: MATCH=(number of TFs affected);TF=(TF name);ScoreDiff=(score difference sorted on absolute difference among TFs)
        - rsID
    - OUTPUT FILE: TF Enrichment
        - TF name
        - Total TF sites identified with our pipeline
        - TFs identified in user defined list
        - Ratio of TFs identified / total TFs
    - OUTPUT FILE: Annotated Variants
        - Chromosome number
        - SNP pos
        - ref allele
        - alt allele
        - Functional annotation of gene
        - Gene name
        - Gene details (. if absent)
        - ExonicFunc.refGeneAAChange (. if absent)
        - rsID
        - TFname

vcf based search
----------------

Output plots
    - PLOT1: Variants affecting factors - number of user uploaded variants affecting one or more TFs.
    - PLOT2: TF Enrichment - enriched TFs whose sites overlap user defined variants. TFs are sorted based on their enrichment.
    - PLOT3: Variants Annotation - matching variants are annotated based on RefSeq annotation.

Output data files
    - OUTPUT FILE: Variants-TF matches - same as above
    - OUTPUT FILE: TF Enrichment - same as above
    - OUTPUT FILE: Filtered Variants (VCF) - same format as the user defined vcf, but containing only variants that have a match in SNP2TFBS
    - OUTPUT FILE: Annotated Variants - same as above

----------------------------------------------------------------------------------------------------
SNPIntersect - Intersect genomic regions with variants affecting TF-binding
----------------------------------------------------------------------------------------------------

Output plots
    - PLOT1: Intersection counts - number of user defined regions having zero / one / more than one variant matches from SNP2TFBS
    - PLOT2: TF Enrichment - TFs enriched in user defined regions. TFs are sorted based on their enrichment.
    - PLOT3: Variants Annotation - matching variants are annotated based on RefSeq annotation.

Output data files
    - OUTPUT FILE: Variants-TF matches
        - Chromosome
        - SNP pos
        - SNP pos
        - ref allele
        - alt allele
        - Information: MATCH=(number of TFs affected);TF=(TF name);ScoreDiff=(score difference sorted on absolute difference among TFs)
        - rsID
    - OUTPUT FILE: Region/SNP Overlap Count
        - Chromosome
        - SNP pos
        - SNP pos
        - TF name
        - Counts - Number of variants identified in user defined region file
    - OUTPUT FILE: TF Enrichment
        - TF name
        - Total TF sites identified with our pipeline
        - TFs identified in user defined list
        - Ratio of TFs identified / total TFs
    - OUTPUT FILE: Annotated Variants
        - Chromosome number
        - SNP pos
        - ref allele
        - alt allele
        - Functional annotation of gene
        - Gene name
        - Gene details (. if absent)
        - ExonicFunc.refGeneAAChange (. if absent)
        - rsID
        - TFname

----------------------------------------------------------------------------------------------------
PWM from library (JASPAR CORE 2014 vertebrates)
----------------------------------------------------------------------------------------------------

Output plots
    - PLOT1: Selected Variants - Venn diagram showing the intersection of PWM specific variants identified in the reference and alternate genomes.
    - PLOT2: Variants Annotation - number of variants in genomic regions as defined by RefSeq annotation.

Output data files
    - OUTPUT FILE: Custom file
        - rsID
        - Chromosome
        - SNP pos (reference)
        - SNP pos (alternate)
        - ref-allele
        - alt-allele
        - ref PWM match start position (. if absent)
        - ref PWM match end position (. if absent)
        - ref PWM match seq (. if absent)
        - ref PWM match score (. if absent)
        - alt PWM match start position (. if absent)
        - alt PWM match end position (. if absent)
        - alt PWM match seq (. if absent)
        - alt PWM match score (. if absent)
        - strand
        - score difference alt-ref (if absent use low score threshold)
        - flag 1/0 (interesting SNP - 1 means one score >= high score threshold)
    - OUTPUT FILE: BED file
        - Chromosome
        - SNP position
        - SNP position
        - rsID
        - Allele change (ref>alt)
    - OUTPUT FILE: Annotated File
        - Chromosome
        - SNP position
        - ref allele
        - alt allele
        - Functional annotation of gene
        - Gene name
        - Gene details (. if absent)
        - ExonicFunc.refGeneAAChange (. if absent)
        - rsID
        - TFname
    - OUTPUT FILE: Alt genome SNPs - same format as "custom file"; ref PWM specific fields (columns 7-10) are replaced by "."
    - OUTPUT FILE: Ref genome SNPs - same format as "custom file"; alt PWM specific fields (columns 11-14) are replaced by "."
    - OUTPUT FILE: Both genomes SNPs - same format as "custom file"

----------------------------------------------------------------------------------------------------
Indicate a gene of interest
----------------------------------------------------------------------------------------------------

Output plots
    - PLOT1: TF Enrichment plot - TFs enriched near user defined gene. TFs are sorted based on their enrichment.
    - PLOT2: Variant annotation plot - number of variants in genomic regions as defined by RefSeq annotation.

Output data files
    - OUTPUT FILE: TF Enrichment
        - TF name
        - Total TF sites identified with our pipeline
        - TFs identified in user defined list
        - Ratio of TFs identified / total TFs
    - OUTPUT FILE: Annotated Variants
        - Chromosome
        - SNP position
        - ref allele
        - alt allele
        - Functional annotation of gene
        - Gene name
        - ExonicFunc.refGeneAAChange (. if absent)
        - Gene details (. if absent)
        - rsID
        - TFname
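
Example: reading a "Variants-TF matches" line

The "Variants-TF matches" file described in the SNPSelect and SNPIntersect sections packs its TF data into a single Information column. The Python sketch below shows one way such a line might be unpacked. It rests on assumptions that this description does not state: columns are taken to be tab-separated, and when several TFs are affected the TF= and ScoreDiff= values are taken to be comma-separated lists in matching order. The names VariantTFMatch and parse_match_line are hypothetical, and the sample line is made up for illustration; it is not real SNP2TFBS output.

```python
from typing import Dict, NamedTuple


class VariantTFMatch(NamedTuple):
    """One row of a Variants-TF matches file (hypothetical container)."""
    chrom: str
    pos1: int          # first "SNP pos" column
    pos2: int          # second "SNP pos" column
    ref: str
    alt: str
    n_tfs: int         # MATCH= value
    score_diffs: Dict[str, float]  # TF name -> score difference
    rsid: str


def parse_match_line(line: str) -> VariantTFMatch:
    # ASSUMPTION: seven tab-separated columns in the documented order.
    chrom, pos1, pos2, ref, alt, info, rsid = line.rstrip("\n").split("\t")

    # Information column: KEY=VALUE pairs separated by ';'.
    fields = dict(item.split("=", 1) for item in info.split(";"))

    # ASSUMPTION: multiple TFs and score differences are comma-separated
    # lists in the same order.
    tfs = fields["TF"].split(",")
    diffs = [float(x) for x in fields["ScoreDiff"].split(",")]
    if len(tfs) != len(diffs):
        raise ValueError("TF and ScoreDiff lists differ in length")

    return VariantTFMatch(
        chrom=chrom,
        pos1=int(pos1),
        pos2=int(pos2),
        ref=ref,
        alt=alt,
        n_tfs=int(fields["MATCH"]),
        score_diffs=dict(zip(tfs, diffs)),
        rsid=rsid,
    )


# Made-up sample line, for illustration only.
sample = "chr1\t1000\t1001\tA\tG\tMATCH=2;TF=CTCF,SP1;ScoreDiff=-250,120\trs0000001"
record = parse_match_line(sample)
print(record.rsid, record.n_tfs, record.score_diffs)
```

If the actual files use a different delimiter or list separator, only the two split() calls marked as assumptions would need to change.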