CATH / Gene3D

26 million protein domains classified into 2,738 superfamilies

CATH is a classification of protein structures downloaded from the Protein Data Bank. We group protein domains into superfamilies when there is sufficient evidence that they have diverged from a common ancestor.

Latest Release Statistics

CATH v4.0 based on PDB dated March 26, 2013
235,858 CATH Domains
2,738 CATH Superfamilies
69,058 Annotated PDBs

Gene3D v12 released March 18, 2012
6,131 Cellular Genomes
21,662,155 Protein Sequences
25,615,754 CATH Domain Predictions

Citing CATH

If you find this resource useful, please consider citing the reference that describes this work:

New functional families (FunFams) in CATH to improve the mapping of conserved functional sites to 3D structures.
Sillitoe I, Cuff AL, Dessailly BH, Dawson NL, Furnham N, Lee D, Lees JG, Lewis TE, Studer RA, Rentzsch R, Yeats C, Thornton JM, Orengo CA
Nucleic Acids Res. 2013 Jan