RCC: Research Computing Center

PHYML


Category(ies): Bioinformatics

Version: V3.0, V2.4; last updated on 07/09/2008

Author / Distributor:
Stéphane Guindon, Olivier Gascuel (LIRMM, Montpellier, France)

Please cite:
"A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood."
Guindon S., Gascuel O., Systematic Biology, 52(5):696-704, 2003.

Description:

PHYML is software that implements a method for building phylogenies from DNA and protein sequences using maximum likelihood.
Please refer to the PHYML website for more details.

altix: Not available on altix

pcluster: Not available on pcluster

rcluster,IOB:

Version 2.4 is at /usr/local/phyml/phyml_2.4
Version 3.0 is at /usr/local/phyml/phyml_3.0

Running Program: see also the instructions for submitting jobs to queues on rcluster,IOB.

Please refer to PHYML Commands for the detailed command options.

e.g., DNA interleaved sequence file, default parameters:

/usr/local/phyml/phyml_3.0 -i sequences

AA interleaved sequence file, default parameters :

/usr/local/phyml/phyml_3.0 -i sequences -d aa

AA sequential sequence file, with customization :


/usr/local/phyml/phyml_3.0 -i sequences -q -d aa -m JTT -c 4 -a e

Submit to the queue
First, create a file named phyml.sub.sh with the contents shown below, replacing working_directory with the path to your working directory (e.g., /home/labname/username/subdir or /scratch/username/subdir). You can add other parameters after the phyml_3.0 command.
Please refer to running jobs in the queue for details.

#!/bin/csh
cd working_directory
time /usr/local/phyml/phyml_3.0 -i inputfile -d nt > phyml.out

Make the script executable:

chmod u+x phyml.sub.sh

Then submit it to a queue:

bsub -q queueName -o filename.%J.out -e filename.%J.err ./phyml.sub.sh

Use the following command to check whether your jobs are done:

bjobs -u my-user-name
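
The submission steps above can be collected into one small helper. This is only a sketch: the working directory, input file name and queue name are the same placeholders used in the text and must be replaced with real values, and the bsub command is printed rather than executed so that it can be reviewed first.

```shell
#!/bin/sh
# Placeholders taken from the text above; replace them with real values.
workdir=/scratch/username/subdir
inputfile=inputfile
queue=queueName

# Write the csh job script exactly as described above.
cat > phyml.sub.sh <<EOF
#!/bin/csh
cd $workdir
time /usr/local/phyml/phyml_3.0 -i $inputfile -d nt > phyml.out
EOF
chmod u+x phyml.sub.sh

# Print the submission command for review; run it by hand once it looks right.
echo "bsub -q $queue -o phyml.%J.out -e phyml.%J.err ./phyml.sub.sh"
```

After submitting, bjobs -u my-user-name shows whether the job is still pending or running.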

- PhyML v3.0 -

phyml [command args]

Command options:

-i (or --input) seq_file_name
seq_file_name is the name of the nucleotide or amino-acid sequence file in PHYLIP format.
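
As a rough illustration of the expected input, the sketch below writes a tiny nucleotide alignment in PHYLIP layout (a header line giving the number of sequences and the number of sites, then one name and one sequence per row) and checks that the header agrees with the rows. The file name sequences and the taxon names are made up for illustration, and names are padded to ten characters as in classic PHYLIP files. With rows this short, the interleaved and sequential layouts look the same.

```shell
# Write a toy alignment: 4 sequences, 12 sites each.
cat > sequences <<'EOF'
4 12
taxon_A   ACGTACGTACGT
taxon_B   ACGTACGAACGT
taxon_C   ACGAACGTACGT
taxon_D   ACTTACGTACGA
EOF

# Compare the header with what the rows actually contain.
awk 'NR == 1 { n = $1; len = $2; next }
     NF == 2 { rows++; if (length($2) != len) bad++ }
     END {
         if (rows == n && bad == 0) print "header matches alignment"
         else { print "header does not match alignment"; exit 1 }
     }' sequences
```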

-d (or --datatype) data_type
data_type is 'nt' for nucleotide (default) and 'aa' for amino-acid sequences.

-q (or --sequential)
Changes interleaved format (default) to sequential format.

-n (or --multiple) nb_data_sets
nb_data_sets is an integer corresponding to the number of data sets to analyse.

-b (or --bootstrap) int
int > 0 : int is the number of bootstrap replicates.
int = 0 : neither approximate likelihood ratio test nor bootstrap values are computed.
int = -1 : approximate likelihood ratio test returning aLRT statistics.
int = -2 : approximate likelihood ratio test returning Chi2-based parametric branch supports.
int = -3 : minimum of Chi2-based parametric and SH-like branch supports.
int = -4 : SH-like branch supports alone.
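
Because the negative values of -b are easy to mix up, a small lookup can make command lines easier to read. The sketch below maps made-up labels (none, alrt, chi2, min, sh) to the values listed above and prints the resulting commands instead of running them. The function name support_flag and the labels are illustrative only and are not part of PhyML.

```shell
PHYML=/usr/local/phyml/phyml_3.0

# Translate a readable label, or a replicate count, into the -b value.
support_flag() {
    case $1 in
        none) printf '%s\n' 0 ;;
        alrt) printf '%s\n' -1 ;;    # aLRT statistics
        chi2) printf '%s\n' -2 ;;    # Chi2-based parametric supports
        min)  printf '%s\n' -3 ;;    # minimum of Chi2-based and SH-like
        sh)   printf '%s\n' -4 ;;    # SH-like supports alone
        ''|*[!0-9]*) echo "unknown support type: $1" >&2; return 1 ;;
        *)    printf '%s\n' "$1" ;;  # digits only: number of bootstrap replicates
    esac
}

# Print (not run) one command per support type.
for kind in 100 alrt sh; do
    echo "$PHYML -i sequences -b $(support_flag "$kind")"
done
```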

-m (or --model) model
model : substitution model name.
- Nucleotide-based models : HKY85 (default) | JC69 | K80 | F81 | F84 | TN93 | GTR | custom (*)
(*) : for the custom option, a string of six digits identifies the model. For instance, 000000
corresponds to F81 (or JC69 provided the distribution of nucleotide frequencies is uniform).
012345 corresponds to GTR. This option can be used for encoding any model that is nested within GTR.

- Amino-acid based models : WAG (default) | JTT | MtREV | Dayhoff | DCMut | RtREV | CpREV | VT |
Blosum62 | MtMam | MtArt | HIVw | HIVb | custom
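
A six-digit custom model string can be sanity-checked before it is passed to -m. The sketch below assumes, consistent with the two examples given above, that positions sharing a digit share a substitution rate, so the number of distinct digits is the number of rate classes (one for 000000, six for 012345). The function name count_rate_classes is made up for illustration and is not part of PhyML.

```shell
# Print the number of distinct rate classes in a six-digit model string.
count_rate_classes() {
    code=$1
    case $code in
        [0-9][0-9][0-9][0-9][0-9][0-9]) ;;   # exactly six digits
        *) echo "error: expected six digits, got '$code'" >&2; return 1 ;;
    esac
    # One digit per line, keep the unique ones, count them.
    printf '%s\n' "$code" | fold -w 1 | sort -u | wc -l | tr -d ' '
}

count_rate_classes 000000   # F81 / JC69 example from the text
count_rate_classes 012345   # GTR example from the text
```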

-f e, d, or "fA fC fG fT"
e : the character frequencies are determined as follows :
- Nucleotide sequences: the equilibrium base frequencies are estimated using maximum likelihood
- Amino-acid sequences: the equilibrium amino-acid frequencies are estimated by counting the
occurrence of the different amino-acids in the data.

d : the character frequencies are determined as follows :
- Nucleotide sequences: the equilibrium base frequencies are estimated by counting the occurrence
of the different bases in the alignment.
- Amino-acid sequences: the equilibrium amino-acid frequencies are estimated using the frequencies
defined by the substitution model.

"fA fC fG fT" : only valid for nucleotide-based models. fA, fC, fG and fT are floating numbers that
correspond to the frequencies of A, C, G and T respectively.

-t (or --ts/tv) ts/tv_ratio
ts/tv_ratio : transition/transversion ratio. DNA sequences only.
Can be a fixed positive value (ex:4.0) or e to get the maximum likelihood estimate.

-v (or --pinv) prop_invar
prop_invar : proportion of invariable sites.
Can be a fixed value in the [0,1] range or e to get the maximum likelihood estimate.

-c (or --nclasses) nb_subst_cat
nb_subst_cat : number of relative substitution rate categories. Default : nb_subst_cat=1.
Must be a positive integer.

-a (or --alpha) gamma
gamma : value of the gamma distribution shape parameter.
Can be a fixed positive value or e to get the maximum likelihood estimate.

-s (or --search) move
Tree topology search operation option.
Can be NNI (default, fast), SPR (a bit slower than NNI) or BEST (best of the NNI and SPR searches).

-u (or --inputtree) user_tree_file
user_tree_file : starting tree filename. The tree must be in Newick format.

-o params
This option focuses on specific parameter optimisation.
params=tlr : tree topology (t), branch length (l) and rate parameters (r) are optimised.
params=tl : tree topology and branch length are optimised.
params=lr : branch length and rate parameters are optimised.
params=l : branch lengths are optimised.
params=r : rate parameters are optimised.
params=n : no parameter is optimised.
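
One possible use of -o together with -u is to keep a user-supplied topology fixed while branch lengths and rate parameters are re-estimated (params=lr). The sketch below validates the -o value against the list above and prints the command rather than running it. The tree file name user_tree.nwk and the function name check_params are hypothetical.

```shell
PHYML=/usr/local/phyml/phyml_3.0

# Accept only the -o values listed above.
check_params() {
    case $1 in
        tlr|tl|lr|l|r|n) return 0 ;;
        *) echo "unsupported -o value: $1" >&2; return 1 ;;
    esac
}

params=lr   # optimise branch lengths and rate parameters, not the topology
if check_params "$params"; then
    echo "$PHYML -i sequences -u user_tree.nwk -o $params"
fi
```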

--rand_start
This option sets the initial tree to random.
It is only valid if SPR searches are to be performed.

--n_rand_starts num
num is the number of initial random trees to be used.
It is only valid if SPR searches are to be performed.
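
Since the random-start options only apply to SPR searches, a wrapper can add them conditionally. This is a sketch with a made-up function name (build_search_cmd). It treats only -s SPR as qualifying, which is a conservative reading of the note above, and it prints the command instead of running it.

```shell
PHYML=/usr/local/phyml/phyml_3.0

# Build a command line; random starts are added only for SPR searches.
build_search_cmd() {
    search=$1           # NNI, SPR or BEST
    n_starts=${2:-0}    # number of random starting trees
    cmd="$PHYML -i sequences -s $search"
    if [ "$search" = SPR ] && [ "$n_starts" -gt 0 ]; then
        cmd="$cmd --rand_start --n_rand_starts $n_starts"
    fi
    echo "$cmd"
}

build_search_cmd SPR 5   # includes --rand_start --n_rand_starts 5
build_search_cmd NNI 5   # random-start options are dropped
```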

--r_seed num
num is the seed used to initiate the random number generator.
Must be an integer.

--print_site_lnl
Print the likelihood for each site in file *_phyml_lk.txt.

--print_trace
Print each phylogeny explored during the tree search process
in file *_phyml_trace.txt.

PHYLIP-LIKE INTERFACE

You can also run PhyML with no arguments; in this case, change the value of
a parameter by typing its corresponding character as shown on screen.

Documentation: Online documentation is available at the PHYML website.

Installation: binary downloaded from the PHYML AMD-64 website.

System(s): Unix
