
data_curation:superfamily_naming_tutorial:index

Previous revision: [2023/09/28 16:21] vwaman
Current revision: [2023/09/28 16:24] vwaman
=== Superfamily Naming exercise (Last updated in Sept 2023) ===

**Useful websites:**
  * https://www.cathdb.info/
  * http://sfam.cathdb.info/
  
**Part I: Steps followed for naming a superfamily**
  * Look through representative domains as: ‘domain only’ to understand common secondary structures; as ‘domain in chain’ to observe the location of the domain in the chain; as ‘domain in PDB’ to understand the domain’s function and location in the protein.
  * Check through FunFams/SwissProt/Keywords and base the name on the most abundant name found there.
  * Check through enzymes (EC number if available), GO terms and species to get a rough idea of domain function.
  * Refer to Pfam and InterPro entries for a general idea of protein domain function and/or structure.
  * Check through the papers associated with the PDB entry for a better understanding of the structure and/or function of the protein and its domains.
  * In the ‘Description’ section, provide an overview of structure and function. In larger superfamilies, you may have to refer to specific PDB IDs.
  * Check that references are correct: [InterPro:] [Pfam:] [PMID:] [DOI:]
  * Check other names in the database, either to avoid duplicate names or to identify potential cross-hits.
  * Check the names of other domains in the same chain to keep the name similar.
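
Two of the steps above (picking the most abundant name and checking the reference tags) can be sketched in Python. This is only an illustration under stated assumptions, not part of the CATH curation tools: the function names are hypothetical, the candidate names are assumed to have been copied by hand from the FunFam/SwissProt listings, and the identifier patterns (''IPR'' plus six digits, ''PF'' plus five digits, a numeric PMID, a DOI starting with ''10.'') are this sketch's guess at what follows the colon in each tag.

```python
import re
from collections import Counter

# Tag names come from the tutorial; the identifier shapes are assumptions
# made for this sketch.
TAG_PATTERNS = {
    "InterPro": re.compile(r"IPR\d{6}"),
    "Pfam": re.compile(r"PF\d{5}"),
    "PMID": re.compile(r"\d+"),
    "DOI": re.compile(r"10\.\d{4,9}/\S+"),
}
# Matches anything of the form [Tag:value] inside a description.
TAG_RE = re.compile(r"\[(\w+):\s*([^\]]*)\]")


def most_abundant_name(names):
    """Return the most frequent name, ignoring case and surrounding spaces."""
    counts = Counter(n.strip().lower() for n in names if n.strip())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def malformed_references(description):
    """Return the reference tags in a description that look malformed."""
    problems = []
    for match in TAG_RE.finditer(description):
        tag, value = match.group(1), match.group(2).strip()
        pattern = TAG_PATTERNS.get(tag)
        # Unknown tag names and empty or badly shaped identifiers are flagged.
        if pattern is None or not pattern.fullmatch(value):
            problems.append(match.group(0))
    return problems


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    print(most_abundant_name(["Kinase domain", "kinase domain ", "SH2 domain"]))
    print(malformed_references("Binds ATP [PMID:12345678] [Pfam:PF0069] [DOI:]"))
```

A check like this only catches badly shaped tags; whether a reference actually supports the description still has to be verified by reading it.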
  
**Part II: General observations and tips**

Dos
  * Check other names in CATH to avoid duplicates (i.e. make sure the assigned name is unique)
  * Make superfamily names consistent with those of other domains of the same protein
  * Start with smaller superfamilies until you get the hang of it
  * For larger superfamilies, it is a good idea to check the FunFams
  * When looking at a protein on InterPro, see if there are other domains on the same protein that don’t have a name yet - those will be easy to name as well
  * Work in groups for larger superfamilies
  * Choose superfamily entries that have associated FunFams, Pfam or InterPro entries
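
The uniqueness check in the first "Do" can be approximated with a small Python helper. This is a minimal sketch, assuming the existing superfamily names have already been exported to a plain list; the function names and the normalisation rule (lower-casing, treating hyphens as spaces, collapsing whitespace) are hypothetical choices rather than anything CATH prescribes.

```python
def normalise(name):
    """Lower-case a name, treat hyphens as spaces and collapse whitespace."""
    return " ".join(name.lower().replace("-", " ").split())


def find_clashes(candidate, existing_names):
    """Return existing names that match the candidate after normalisation."""
    key = normalise(candidate)
    return [n for n in existing_names if normalise(n) == key]


if __name__ == "__main__":
    # Placeholder names for illustration only.
    existing = ["SH3 like  domain", "Rossmann fold"]
    print(find_clashes("SH3-like domain", existing))
```

An exact match after normalisation points to a duplicate; near matches may instead indicate potential cross-hits and are worth inspecting by hand.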
  
Don’ts
  * Write a description without citing references
  * Write a description without really understanding it
  * Spend 3 hours on a very small superfamily
  * Look at every single PDB entry for big superfamilies
  * Put too much confidence in InterPro/Pfam for smaller representative domains - it may be better to look at the PDB paper for the specific domain
  * Assume it is exactly the same domain just because it maps well to Pfam
  * Choose a superfamily entry with no annotations or too many annotations

(Last updated in September 2023. Written by summer interns from 2020 to 2023 (Barbara, Oliver, Natalie, Charling, Ruiqi, Lorna, Katie, Charlotte, Hazuki) and CATH curators (Vaishali Waman, Ian Sillitoe).)
  
data_curation/superfamily_naming_tutorial/index.txt · Last modified: by vwaman
