Before installation, please make sure that the required programs are properly installed. Links for installing the requirements are listed on the download page.
tree module, containing path information of dependent binaries.
The profile module extracts the core gene profile from a FASTA-formatted genome assembly.
Either a single assembly or a directory containing multiple assemblies is accepted as input.
You can provide metadata for your input genome assemblies by converting it into the proper format. The UFCG pipeline receives seven entries that together represent the taxonomic label of each genome.
|Entry|Description|Example|
|---|---|---|
|Filename|Name of the file||
|Label|Full label of the genome||
|Accession|Accession code of the assembly (NCBI)||
|Taxon name|Name of the species||
|NCBI name|Name of the assembly provided in NCBI||
|Strain name|Name of the strain||
|Taxonomy|Full taxonomy of the species|Fungi;Ascomycota;Saccharomycetes;Saccharomycetales;Saccharomycetaceae;Saccharomyces;Saccharomyces cerevisiae|
To provide the metadata to the pipeline, create a TSV (tab-separated values) file containing the data and specify it as an argument. The TSV file must include a header line naming the entries, followed by the metadata of each genome in the same order, one genome per line.
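As a minimal sketch of the layout described above, the following Python snippet builds a TSV with one header line and one line per genome. The header spellings in `HEADER` are an assumption (one column per entry listed above); check the exact names the pipeline expects before using the file. The example row is hypothetical apart from the taxonomy string.

```python
import csv
import io

# Assumed header names, one per metadata entry. The exact spelling expected
# by the pipeline is not stated here, so treat these as placeholders.
HEADER = ["filename", "label", "accession", "taxon_name",
          "ncbi_name", "strain_name", "taxonomy"]


def metadata_tsv(rows):
    """Return TSV text: a header line, then one tab-separated line per genome."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    for row in rows:
        if len(row) != len(HEADER):
            raise ValueError(f"expected {len(HEADER)} fields, got {len(row)}")
        writer.writerow(row)
    return buf.getvalue()


# Hypothetical genome: every value except the taxonomy string is made up.
example = [[
    "genome1.fasta",
    "Saccharomyces cerevisiae strainX",
    "ACC_0001",
    "Saccharomyces cerevisiae",
    "asm_name",
    "strainX",
    "Fungi;Ascomycota;Saccharomycetes;Saccharomycetales;"
    "Saccharomycetaceae;Saccharomyces;Saccharomyces cerevisiae",
]]

print(metadata_tsv(example), end="")
```

Writing the returned text to a file gives a metadata table that can be inspected in any spreadsheet tool before it is passed to the pipeline.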
Run the following command in your terminal to run the UFCG pipeline interactively. Interactive mode guides you through the options that the pipeline requires and automatically builds the command to run it.
The following options are not mandatory, but may be useful for configuring your run.
To see all available options, run the pipeline with the -h option.
The pipeline extracts the core gene profiles of the given genome assemblies and stores them as .ucg files.
With 32 CPU threads, the profile module requires about 55 seconds to extract the UFCG marker genes from one fungal whole-genome assembly.
Run the following command in your terminal to align the genes and infer a phylogenetic tree with the UFCG pipeline.
To see all available options, run the module with the -h option.
The tree module produces the following result files. You can analyze the trees further with any phylogenetic tool that handles Newick files (MEGA, ETE, ape, etc.).
|UFCG tree inferred from concatenated alignment|
|UFCG tree with GSI values computed with [N] genes, replacing bootstrap values|
|Gene tree inferred from each single gene|
|Concatenated sequences of entire core gene alignments|
|Aligned sequences of each single gene|
|Log file containing information about this run|
|JSON file containing entire trees and their metadata|
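Because the trees are plain Newick text, a quick sanity check needs no dedicated library. The sketch below lists the leaf labels of a Newick string with a regular expression. It is only an illustration: it assumes unquoted labels and is not a full Newick parser, so use one of the tools named above for real analysis.

```python
import re

# An unquoted leaf label directly follows "(" or "," and runs until the
# next Newick delimiter. Internal node labels follow ")" and are skipped.
_LEAF = re.compile(r"(?<=[(,])\s*([^():,;\s][^():,;]*)")


def leaf_names(newick):
    """Return the leaf labels of a Newick string, in order of appearance."""
    return [m.group(1).strip() for m in _LEAF.finditer(newick)]


# Toy tree with branch lengths and one internal support value.
toy = "((A:0.1,B:0.2)0.9:0.3,C:0.4);"
print(leaf_names(toy))
```

To apply it to a result file, read the file contents into a string first and pass that string to `leaf_names`.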
With 32 CPU threads, the tree module requires about 413 seconds to produce the trees from 30 UFCG profiles.
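For rough planning, the two timings quoted in this document can be combined into a back-of-the-envelope estimate for a 30-genome run. This sketch assumes that assemblies are profiled one after another and that profiling time scales linearly with the number of genomes; neither assumption is confirmed by the benchmarks, and actual times depend on hardware and genome size.

```python
# Timings quoted in this document, both measured with 32 CPU threads.
PROFILE_SECONDS_PER_GENOME = 55   # profile module, one assembly
TREE_SECONDS_30_PROFILES = 413    # tree module, 30 profiles

N_GENOMES = 30

# Assumption: sequential profiling with linear scaling.
profile_total = N_GENOMES * PROFILE_SECONDS_PER_GENOME
run_total = profile_total + TREE_SECONDS_30_PROFILES

print(f"profiling: {profile_total} s")
print(f"profiling + tree: {run_total} s (about {run_total / 60:.1f} min)")
```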
Run the following command to relabel the leaves of a tree using a different format.
To relabel the UFCG tree from the run myRun, using taxon names and strain names as leaves, execute:
To relabel the RPB2 gene tree from the run Lorem_ipsum, using accessions and taxonomic relationships as leaves, execute:
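To make clear what relabeling means, the sketch below swaps leaf labels in a Newick string using a lookup table. It illustrates the idea only and is not the pipeline's own implementation; the function name `relabel` and the example labels are hypothetical, and quoted Newick labels are not handled.

```python
import re

# An unquoted leaf label directly follows "(" or "," (optionally after
# whitespace) and runs until the next Newick delimiter.
_LEAF = re.compile(r"(?<=[(,])(\s*)([^():,;\s][^():,;]*)")


def relabel(newick, mapping):
    """Replace each leaf label found in `mapping`; leave the others as-is."""
    def swap(match):
        old = match.group(2).strip()
        return match.group(1) + mapping.get(old, old)
    return _LEAF.sub(swap, newick)


# Hypothetical labels: accession-like names mapped to taxon plus strain.
tree = "((acc1:1,acc2:2):3,acc3:4);"
names = {"acc1": "Taxon_one_X1", "acc3": "Taxon_two_X2"}
print(relabel(tree, names))
```

Branch lengths and the tree topology are untouched; only the text of the matching leaves changes.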
You have to define the environment variable $AUGUSTUS_CONFIG_PATH to run AUGUSTUS properly. Run the following code to define the variable.
If you are using bash, run this to semi-permanently add the variable on your system.
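However the variable is set, it can help to confirm that it is visible to new processes and points to an existing directory. The following Python sketch performs that check; the function name `augustus_config_status` is hypothetical, and the check does not verify that the directory actually contains a valid AUGUSTUS configuration.

```python
import os


def augustus_config_status(env=None):
    """Report whether AUGUSTUS_CONFIG_PATH is set and points to a directory."""
    env = os.environ if env is None else env
    path = env.get("AUGUSTUS_CONFIG_PATH")
    if not path:
        return "unset"
    if not os.path.isdir(path):
        return "not a directory"
    return "ok"


print("AUGUSTUS_CONFIG_PATH:", augustus_config_status())
```

Run it from a fresh terminal session to confirm that a semi-permanent setting survives a new login.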
© All rights reserved to Steinegger Lab, Seoul National University.