Bio Resources

Last updated February 13, 2023

Reference genomes, protein and nucleotide sequence databases, and other bio resources are now available on Discovery and Endeavour. If you need a specific release that is not currently included in the pages below, please submit a help ticket and we will try to make it available to you.

1 GenBank

An open-access, annotated sequence database of all publicly available nucleotide sequences and their protein translations.

2 Genome Taxonomy Database (GTDB)

An initiative to establish a standardized microbial taxonomy based on genome phylogeny, primarily funded by an Australian Research Council Laureate Fellowship.

3 Genomes

A set of ready-to-use reference sequences and annotations for commonly analyzed organisms in a directory accessible from the Discovery and Endeavour clusters.

4 Pfam Database

A large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs).

5 TIGRFAMs

A resource consisting of curated multiple sequence alignments, hidden Markov models (HMMs) for protein sequence classification, and associated information designed to support automated annotation of (mostly prokaryotic) proteins.

6 UniProt

A resource comprising three databases (UniProtKB, UniRef, and UniParc), each optimized for different uses.