Software Applications

snowhite

Category: Bioinformatics

Program on: zcluster

Version: 1.1.4

Author / Distributor

Citing SnoWhite

For SnoWhite:

Dlugosch KM, Rieseberg LH. SnoWhite: A pipeline for aggressive cleaning of next-generation sequence reads. In prep.

If you use the TagDust option, you should ALSO cite:

Lassmann T, Hayashizaki Y, Daub CO. 2009. TagDust - A program to eliminate artifacts from next generation sequencing data. Bioinformatics 25: 2839-2840.

Description

SnoWhite is a cleaning pipeline for next-generation cDNA sequences. More details are available at snowhite.

Running Program

See also: submit jobs to queues.

/usr/local/snowhite/latest/ points to the most recently installed version.

Version 1.1.4 is at /usr/local/snowhite/1.1.4
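To check which release the latest path currently resolves to, something like the following may help. This is only a sketch: it assumes that latest is a symbolic link to a versioned directory and that a readlink supporting -f is available, neither of which this page confirms. The function name resolve_install is made up for illustration.

```shell
#!/bin/bash
# Assumption: /usr/local/snowhite/latest is a symbolic link to a versioned
# directory such as /usr/local/snowhite/1.1.4.

LATEST=/usr/local/snowhite/latest

resolve_install() {
    # Print the real directory behind a path, following symbolic links.
    readlink -f "$1"
}

if [ -e "$LATEST" ]; then
    echo "latest resolves to: $(resolve_install "$LATEST")"
else
    echo "$LATEST not found on this machine"
fi
```

If the printed directory is not 1.1.4, the script name under latest may differ from the one used in the examples on this page.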

Example of running this through the queue, using a shell script named sub.sh:

 

#!/bin/bash

cd working_directory

time perl /usr/local/snowhite/latest/snowhite_1.1.4.pl   [options]

Then submit the script to a queue:

qsub -q queueName ./sub.sh
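As an illustration of what [options] might look like in practice, here is a sketch of a fuller sub.sh. The input names reads.fasta and reads.qual and the output prefix reads_clean are hypothetical placeholders; the flags themselves are taken from the option list in the Documentation section. The sketch only prints the command when the installed script or the input file is missing, so it can be tried safely outside the cluster.

```shell
#!/bin/bash
# Sketch of a fuller sub.sh. The file names and output prefix below are
# hypothetical placeholders; replace them with your own data.

SNOWHITE=/usr/local/snowhite/latest/snowhite_1.1.4.pl
FASTA=reads.fasta      # -f: fasta sequences
QUAL=reads.qual        # -q: quality file (optional)
PREFIX=reads_clean     # -o: name for the output folder and file prefixes

# -m 50 and -p 1 restate the documented defaults explicitly.
OPTIONS="-f $FASTA -q $QUAL -o $PREFIX -m 50 -p 1"

# cd working_directory   # set this to the directory that holds the inputs

if [ -f "$SNOWHITE" ] && [ -f "$FASTA" ]; then
    # $OPTIONS is left unquoted on purpose so it splits into separate
    # arguments; this assumes the paths contain no spaces.
    time perl "$SNOWHITE" $OPTIONS
else
    echo "Would run: perl $SNOWHITE $OPTIONS"
fi
```

Keeping the options in one variable makes it easy to record the exact settings used for a run alongside the job output.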

Documentation

perl /usr/local/snowhite/latest/snowhite_1.1.4.pl -help

        Input <f>
        Usage Error:
        Run: perl snowhite_1.1.4.pl [OPTIONS]

        OPTIONS =
        Files:
        -f: <FILENAME> fasta sequences (specify path if needed)
        -q: <FILENAME> quality file (optional)
        -v: <FILENAME> vector/primer/adapter file (optional)
        -o: <FILENAME> name for new output folder and file prefixes (default = sequence input filename)

        Adapter clipping:
        -c: <integer> number of bases to clip off the front of all sequences (default = 0)
        -C: <3/5/B/FILENAME/> clip at 3', 5', Both, or according to sequences in FILENAME (default = 5)

        SeqClean step:
        -m: <integer> minimum sequence length for cleaned reads (default = 50bp, applies to all steps)
        -x: <T/F> discard reads with internal vector/primer contaminants? (default = F)
        -p: <integer> processor number (optional, default = 1)

        Terminal poly trimming (e.g. 3'AAAAAAAAAACGATTAG...):
        -l: <integer> minimum length of terminal A/T repeat (min = 1, default = 6)
        -a: <3/5/B> poly A at 3', 5', or Both ends (default = 3)
        -t: <3/5/B> poly T at 3', 5', or Both ends (default = 5)

        Terminal poly trimming inside of cap (e.g. 3'CGAAAAAAAAAAAACGATTAG...):
        -b: <integer> number of terminal bases to look beyond for start of terminal poly A/T (default = 0)
        -r: <integer> minimum length of A/T repeat inside of -b to consider as poly A/T (min = 2, default = 10)

        Internal poly trimming (e.g. 3'...CCGTATAGGAAAAAAAAAAAAAAAAAAAACGATTAGGG...5'):
        -i: <integer> minimum length of internal poly A/T repeat to consider as poly A/T (default = 100bp, extreme case)
        -k: <T/F> keep the longer end of sequence broken by a single internal polyA/T (default = F)

        General poly trimming settings:
        -n: <T/F> interpret Ns within A/T repeats as As or Ts (default = T)
        -s: <T/F> ignore single alternative bases within A/T repeats (default = T)

        TagDust step:
        -e: <T/F> execute TagDust, assuming primer/adapter (-v) file is provided (default = F)
        -d: <decimal> false discovery rate (default = 0.01)
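Because several options accept only a small set of values (3/5/B or T/F), it can be worth checking them before a job is queued. The following is a minimal sketch of such a check; the helper functions is_end_choice and is_true_false are hypothetical and not part of SnoWhite, and reads.fasta and adapters.fasta are placeholder file names. It combines poly A/T trimming at both ends with the TagDust step, which needs a -v file.

```shell
#!/bin/bash
# Hypothetical pre-submission check; these helpers are not part of SnoWhite.

is_end_choice() {
    # Accepts 3, 5, or B, the values listed for -a and -t.
    case "$1" in 3|5|B) return 0 ;; *) return 1 ;; esac
}

is_true_false() {
    # Accepts T or F, the values listed for -x, -k, -n, -s, and -e.
    case "$1" in T|F) return 0 ;; *) return 1 ;; esac
}

POLY_A_END=B              # -a: trim poly A at both ends
POLY_T_END=B              # -t: trim poly T at both ends
RUN_TAGDUST=T             # -e: execute TagDust (requires a -v file)
ADAPTERS=adapters.fasta   # -v: placeholder vector/primer/adapter file

if is_end_choice "$POLY_A_END" && is_end_choice "$POLY_T_END" \
        && is_true_false "$RUN_TAGDUST"; then
    # -l 6 and -d 0.01 restate the documented defaults.
    OPTIONS="-f reads.fasta -v $ADAPTERS -a $POLY_A_END -t $POLY_T_END -l 6 -e $RUN_TAGDUST -d 0.01"
    echo "perl /usr/local/snowhite/latest/snowhite_1.1.4.pl $OPTIONS"
else
    echo "Invalid option value; check the OPTIONS list." >&2
fi
```

The printed command line can be pasted into sub.sh once the values pass the check.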

Installation

Installed from source code available at snowhite.

System(s)

Unix