UFCG (Universal Fungal Core Genes) is a combined suite of a fungal marker gene database and a bioinformatic pipeline, developed to provide accessible and credible software for the phylogenetic study of fungi.
UFCG project features:
The UFCG gene database is a combination of canonical genes and core genes of fungi.
The canonical genes, which have been frequently used and accepted by fungal taxonomists, were first defined and included through a literature search.
In addition, a novel set of genes was included using the concept of the core gene, the most widely used approach for genome-based phylogenetic tree reconstruction. Core genes can be defined as:
We prepared an organized list of the genes, including their annotations, visualized MSAs, and downloadable resources.
The UFCG species database contains a list of 1,587 reference species that were used to define the novel markers.
You can navigate through the list by sorting or searching by taxonomic name.
For each species, you have access to:
In addition, we provide downloadable archives of the resources from 10,984 assemblies, encompassing both taxonomically representative and redundant entries.
We designed the pipeline for users who wish to analyze hundreds of whole-genome assemblies. The core modules of the pipeline are summarized here to help you understand the pipeline and make the most of it.
The profile module extracts previously described core genes, using the corresponding HMM profiles.
The tree module carries out phylogenetic analysis using a set of .ucg files from different species. Specifically, the tree module features:
Module | Input | Output | Description |
---|---|---|---|
profile | .fa | .ucg | Extracts UFCG profile from fungal whole-genome sequences |
profile-rna | .fq | .ucg | Extracts UFCG profile from fungal RNA-seq reads |
train | .fa | .fa .hmm | Trains and generates sequence model of fungal markers |
align | .ucg | .fa | Aligns genes and provides multiple sequence alignments from UFCG profiles |
tree | .ucg | .nwk .trm | Reconstructs the phylogenetic relationship with UFCG profiles |
prune | .trm | .nwk | Fixes UFCG tree labels or gets a single gene tree |
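The module table above implies which modules can be chained, since one module's output format is another's input. The following is a minimal sketch of that relationship in Python; the `MODULES` dictionary and the `downstream_modules` helper are illustrative names that are not part of UFCG, and the matching is done on file format only.

```python
# Module inputs and outputs, transcribed from the module table above.
# MODULES and downstream_modules are hypothetical names for illustration.
MODULES = {
    "profile": {"inputs": {".fa"}, "outputs": {".ucg"}},
    "profile-rna": {"inputs": {".fq"}, "outputs": {".ucg"}},
    "train": {"inputs": {".fa"}, "outputs": {".fa", ".hmm"}},
    "align": {"inputs": {".ucg"}, "outputs": {".fa"}},
    "tree": {"inputs": {".ucg"}, "outputs": {".nwk", ".trm"}},
    "prune": {"inputs": {".trm"}, "outputs": {".nwk"}},
}


def downstream_modules(module):
    """Return the modules that accept at least one output format of `module`."""
    outputs = MODULES[module]["outputs"]
    return sorted(
        name for name, io in MODULES.items() if io["inputs"] & outputs
    )


if __name__ == "__main__":
    for name in ("profile", "profile-rna", "tree"):
        print(name, "->", downstream_modules(name))
```

Under these assumptions, `profile` and `profile-rna` both feed `align` and `tree`, and `tree` feeds `prune`. Because the check compares extensions only, it cannot tell a genome FASTA from an MSA FASTA, so a format match does not by itself mean the chaining is meaningful.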
Format | Input to | Output of | Description |
---|---|---|---|
.fa .fna .fasta | profile, train | train, align | Standard FASTA file format for genome sequences or MSAs. |
.fq .fastq | profile-rna | - | Standard FASTQ file format for sequence reads. |
.ucg | align, tree | profile, profile-rna | JSON-formatted profile containing the extracted sequences of core genes, along with the metadata of the genome. These files can also be read and edited with any text editor. |
.nwk | - | tree, prune | Standard Newick format file for phylogenetic trees. |
.trm | prune | tree | JSON-formatted file containing Newick-formatted trees and the metadata of the individual gene trees and the concatenated UFCG tree. |
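Since the .ucg and .trm files are described as JSON-formatted, they can be inspected with any JSON parser. The sketch below uses Python's standard `json` module to list the top-level structure of such a file. It makes no assumption about the field names inside a .ucg or .trm file, because the schema is not described here; `summarize_json_file` and the file name in the usage note are hypothetical.

```python
import json


def summarize_json_file(path):
    """Load a JSON file and map each top-level key to the type of its value.

    No particular .ucg or .trm schema is assumed; the function only reports
    whatever keys happen to be present.
    """
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    # The root may also be a list or scalar; report its type instead.
    return {"<root>": type(data).__name__}
```

For example, `summarize_json_file("sample.ucg")` (with `sample.ucg` standing in for a file of your own) returns a dictionary of key names and value types, which is a quick way to see what a profile contains before editing it by hand.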
1 Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea
2 School of Biological Sciences, Seoul National University, Seoul, Korea
3 Institute of Molecular Biology and Genetics, Seoul National University, Seoul, Korea
4 Artificial Intelligence Institute, Seoul National University, Seoul, Korea
Corresponding authors