SecStrAnnotator:OneToMany
This page describes the procedure for annotating SSEs in a whole protein family.
A protein family is understood as a set of structurally similar protein domains. A protein domain can be either a whole protein chain or only a part of it (in multidomain proteins).
Dependencies
Python3
Preparing structural data
A list of PDB structures corresponding to a protein family can be obtained from the PDBe REST API using domains_from_pdbeapi.py. The protein family can be identified by a CATH code, such as 1.10.630.10, or a Pfam accession, such as PF00067:
python3 domains_from_pdbeapi.py 1.10.630.10 > family_from_cath.json
or
python3 domains_from_pdbeapi.py PF00067 > family_from_pfam.json
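When several families are processed, the two commands above can be wrapped in a small helper. The following is a minimal sketch, not part of SecStrAnnotator: the function names and the identifier patterns are assumptions derived only from the two example identifiers (four dot-separated numbers for a CATH code, PF followed by five digits for a Pfam accession). It assumes domains_from_pdbeapi.py is in the current directory and writes its result to standard output, as the redirection in the commands above suggests.

```python
import re
import subprocess
import sys

# Assumed identifier shapes, inferred from the examples 1.10.630.10 and PF00067.
CATH_PATTERN = re.compile(r"\d+\.\d+\.\d+\.\d+")
PFAM_PATTERN = re.compile(r"PF\d{5}")


def family_id_kind(family_id: str) -> str:
    """Return 'cath' or 'pfam'; raise ValueError for any other string."""
    if CATH_PATTERN.fullmatch(family_id):
        return "cath"
    if PFAM_PATTERN.fullmatch(family_id):
        return "pfam"
    raise ValueError(f"Unrecognized family identifier: {family_id!r}")


def output_filename(family_id: str) -> str:
    """Build the output file name used in the commands above."""
    return f"family_from_{family_id_kind(family_id)}.json"


def download_family(family_id: str, script: str = "domains_from_pdbeapi.py") -> str:
    """Run the script and save its standard output, like the shell '>' redirect."""
    out_path = output_filename(family_id)
    with open(out_path, "w") as out_file:
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run([sys.executable, script, family_id], stdout=out_file, check=True)
    return out_path
```

Calling download_family("1.10.630.10") would then be equivalent to the first command above. Validating the identifier first avoids sending a mistyped code to the API.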
The main stages of the procedure are:
- preparing the structural data for the family
- selecting a template domain from the family and obtaining its annotation
- running the annotation algorithm (SecStrAnnotator) on each member of the family