CATH / Gene3D

26 million protein domains classified into 2,738 superfamilies

What is CATH?

CATH is a classification of protein structures downloaded from the Protein Data Bank. We group protein domains into superfamilies when there is sufficient evidence they have diverged from a common ancestor.
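
As a rough illustration of the grouping described above, the sketch below collects domains under the superfamily each one is assigned to. The identifiers and the `group_by_superfamily` helper are hypothetical placeholders, not CATH data or part of any CATH tool. The sketch also assumes the assignments have already been made; it does not show how evidence of common ancestry is assessed.

```python
from collections import defaultdict

# Hypothetical (domain_id, superfamily_id) pairs, made up for illustration;
# they are not taken from any CATH release.
assignments = [
    ("domain_a", "superfamily_1"),
    ("domain_b", "superfamily_1"),
    ("domain_c", "superfamily_2"),
]


def group_by_superfamily(pairs):
    """Collect domain identifiers under the superfamily they are assigned to."""
    groups = defaultdict(list)
    for domain_id, superfamily_id in pairs:
        groups[superfamily_id].append(domain_id)
    return dict(groups)


superfamilies = group_by_superfamily(assignments)

# Print each superfamily with the number of domains it contains.
for superfamily_id, domains in sorted(superfamilies.items()):
    print(superfamily_id, len(domains), domains)
```

Counting the keys of the resulting mapping gives the number of superfamilies, and summing the list lengths gives the number of classified domains. These are the two kinds of totals reported in the release statistics.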

Latest Release Statistics

CATH v4.0 based on PDB dated March 26, 2013
235,858 CATH Domains
2,738 CATH Superfamilies
69,058 Annotated PDBs

Gene3D v12 released March 18, 2012
6,131 Cellular Genomes
21,662,155 Protein Sequences
25,615,754 CATH Domain Predictions

Citing CATH

If you find this resource useful, please consider citing the reference that describes this work:

CATH: comprehensive structural and functional annotations for genome sequences.
Sillitoe I, Lewis TE, Cuff AL, Das S, Ashford P, Dawson NL, Furnham N, Laskowski RA, Lee D, Lees J, Lehtinen S, Studer R, Thornton JM, Orengo CA
Nucleic Acids Res. 2015 Jan