SecStrAnnotator:OneToOne
SecStrAnnotator finds annotation for a query protein Q, based on the template protein T. Thus, the input consists of the structure of T, structure of Q, and annotation of T.
Sometimes a single protein consists of several domains. In such cases, T and Q do not refer to the whole protein but only to one domain.
The annotation algorithm consists of three major steps. The first step is structural alignment and superimposition of the query protein with the template protein, so the corresponding parts of the two proteins are located close to each other. In the second step, secondary structure assignment (SSA) is performed – SSEs are detected in the query protein Q. The third step is called matching – the algorithm will match the template SSEs to the query SSEs and for each annotated SSE in T it will select the corresponding SSE in Q.
Dependencies
PyMOL
PyMOL is used by SecStrAnnotator for structural alignment and visualization. It can be downloaded from the PyMOL website. In Ubuntu Linux it can also be installed by running sudo apt install pymol.
Mono
On Windows, SecStrAnnotator.exe can be executed directly; however, on other operating systems it must be run using Mono. In Ubuntu Linux it can be installed by running sudo apt install mono-devel.
DSSP
DSSP is necessary only when the DSSP secondary structure assignment method is selected (with --ssa dssp).
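Before the first run it can be useful to check that the external programs are reachable. The following is a minimal Python sketch, not part of SecStrAnnotator; the function name missing_tools is a hypothetical helper, and the default names pymol and mono are assumptions taken from the commands mentioned above. DSSP is not checked here because its location is read from the configuration file.

```python
import shutil


def missing_tools(names=("pymol", "mono")):
    """Return the executables from `names` that cannot be found on PATH."""
    # shutil.which returns None when no matching executable is found.
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All checked tools were found.")
```

On Windows, Mono is not needed, so only pymol would be passed in names.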
Execution
SecStrAnnotator is executed from the command line.
Windows:
SecStrAnnotator.exe [OPTIONS] DIRECTORY TEMPLATE QUERY
Unix:
mono SecStrAnnotator.exe [OPTIONS] DIRECTORY TEMPLATE QUERY
Example of a call:
mono SecStrAnnotator.exe --align cealign --ssa geom-hbond --matching mom --session my_data_directory 1og2,A,30:491 1tqn,A,28:499
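When SecStrAnnotator is called from a script, the difference between Windows and other systems can be handled in one place. The following is a minimal Python sketch, not part of SecStrAnnotator; the names build_command and run, and their parameters, are illustrative assumptions. It only prepends mono on non-Windows systems and passes all arguments through unchanged.

```python
import platform
import subprocess


def build_command(directory, template, query, options=(),
                  exe="SecStrAnnotator.exe", system=None):
    """Return the argument list for one SecStrAnnotator run.

    `system` defaults to platform.system(); it is exposed as a parameter
    only so that the function can be tested on any machine.
    """
    system = system or platform.system()
    # On Windows the .exe runs directly; elsewhere it is run through Mono.
    prefix = [] if system == "Windows" else ["mono"]
    return prefix + [exe, *options, directory, template, query]


def run(directory, template, query, options=()):
    """Run SecStrAnnotator and raise CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(directory, template, query, options),
                          check=True)


if __name__ == "__main__":
    cmd = build_command(
        "my_data_directory", "1og2,A,30:491", "1tqn,A,28:499",
        options=["--align", "cealign", "--ssa", "geom-hbond",
                 "--matching", "mom", "--session"],
    )
    print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting issues with the commas and colons in the domain descriptions.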
Arguments
DIRECTORY is the directory containing all the input files. The output files will also be saved to this directory.
TEMPLATE describes the template protein domain in one of the following formats: PDB or PDB,CHAIN or PDB,CHAIN,RANGES. The whole argument must be written without spaces. Examples:
1og2 (structure 1og2, chain A by default)
1og2,B (chain B)
1og2,B,100:400 (residues 100–400 of chain B)
1og2,B,:400 (residues up to 400 of chain B)
1h9r,A,123:183,252:261 (residues 123–183 and 252–261 of chain A)
QUERY describes the query protein domain and uses the same format as TEMPLATE.
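The domain format can be illustrated with a small parser. This is a hedged Python sketch that only mirrors the examples listed above; it is not the parser used by SecStrAnnotator, the function name parse_domain is hypothetical, and the handling of an omitted end bound (for example 100:) is an assumption.

```python
def parse_domain(spec):
    """Split 'PDB', 'PDB,CHAIN' or 'PDB,CHAIN,RANGES' into its parts.

    Returns (pdb, chain, ranges), where ranges is a list of (start, end)
    tuples and None marks an omitted bound.
    """
    if " " in spec:
        raise ValueError("the argument must be written without spaces")
    parts = spec.split(",")
    pdb = parts[0]
    chain = parts[1] if len(parts) > 1 else "A"  # chain A is the default
    ranges = []
    for item in parts[2:]:
        start, sep, end = item.partition(":")
        if not sep:
            raise ValueError(f"range {item!r} must have the form START:END")
        ranges.append((int(start) if start else None,
                       int(end) if end else None))
    return pdb, chain, ranges


if __name__ == "__main__":
    for example in ["1og2", "1og2,B", "1og2,B,100:400",
                    "1og2,B,:400", "1h9r,A,123:183,252:261"]:
        print(example, "->", parse_domain(example))
```

An empty ranges list stands for the whole chain.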
Options
There is a range of options which can be used to modify the behaviour of SecStrAnnotator. The most important option is:
--help
Prints the help message, which includes the description of all the other options.
Input files
DIRECTORY/TEMPLATEPDB.pdb – structure of the template protein
DIRECTORY/TEMPLATEPDB-template.sses.json – annotation of the template domain
DIRECTORY/QUERYPDB.pdb – structure of the query protein
Output files
DIRECTORY/QUERYPDB-aligned.pdb – structure of the query protein after superimposition on the template protein
DIRECTORY/QUERYPDB-detected.sses.json – secondary structure assignment of the query protein, i.e. all detected SSEs
DIRECTORY/QUERYPDB-annotated.sses.json – annotated SSEs in the query protein
DIRECTORY/QUERYPDB-annotated.pse – PyMOL session with the visualization of the resulting annotation (only when executed with the --session option)
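The file naming scheme for inputs and outputs can be written down as a short helper, for example to check that all inputs are present before a run. This is a Python sketch under the assumption that TEMPLATEPDB and QUERYPDB are the PDB parts of the TEMPLATE and QUERY arguments; the function names expected_files and missing_inputs are hypothetical.

```python
from pathlib import Path


def expected_files(directory, template_pdb, query_pdb, session=False):
    """Return (inputs, outputs) as lists of paths for one run."""
    base = Path(directory)
    inputs = [
        base / f"{template_pdb}.pdb",                 # template structure
        base / f"{template_pdb}-template.sses.json",  # template annotation
        base / f"{query_pdb}.pdb",                    # query structure
    ]
    outputs = [
        base / f"{query_pdb}-aligned.pdb",
        base / f"{query_pdb}-detected.sses.json",
        base / f"{query_pdb}-annotated.sses.json",
    ]
    if session:
        # The PyMOL session is written only when --session is used.
        outputs.append(base / f"{query_pdb}-annotated.pse")
    return inputs, outputs


def missing_inputs(directory, template_pdb, query_pdb):
    """Return the input files that do not exist yet."""
    inputs, _ = expected_files(directory, template_pdb, query_pdb)
    return [path for path in inputs if not path.exists()]


if __name__ == "__main__":
    for path in missing_inputs("my_data_directory", "1og2", "1tqn"):
        print("missing input:", path)
```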
Auxiliary files and programs
SecStrAnnotator depends on other programs (PyMOL, optionally DSSP) and scripts. These auxiliary files need to be available in the system, and their location must be specified in the configuration file SecStrAnnotator_config.json. The configuration file itself must be in the same directory as SecStrAnnotator.exe. The default content of the configuration file is:
{
    "PymolExecutable": "pymol",
    "DsspExecutable": "./dssp",
    "PymolScriptAlign": "./script_align.py",
    "PymolScriptSession": "./script_session.py"
}
which assumes that pymol is installed and can be run directly (from $PATH) and that the other files are present in the same directory as SecStrAnnotator.exe.
On Windows, the location of the PyMOL executable must be manually inserted into the configuration file (it is usually C:\Program Files\PyMOL\PyMOL\PyMOL.exe, but can be different; it should be PyMOL.exe, not PymolWin.exe):
{
    "PymolExecutable": "C:\\Program Files\\PyMOL\\PyMOL\\PyMOL.exe",
    "DsspExecutable": "./dssp",
    "PymolScriptAlign": "./script_align.py",
    "PymolScriptSession": "./script_session.py"
}
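Backslashes in Windows paths must be doubled in JSON, which is easy to get wrong when editing by hand. As one possible approach, the configuration file can be generated with Python's json module, which escapes them automatically. The function names render_config and write_config are hypothetical; the keys and default values are the ones shown above.

```python
import json
from pathlib import Path

DEFAULT_CONFIG = {
    "PymolExecutable": "pymol",
    "DsspExecutable": "./dssp",
    "PymolScriptAlign": "./script_align.py",
    "PymolScriptSession": "./script_session.py",
}


def render_config(pymol_executable=None):
    """Return the configuration as JSON text, optionally overriding PyMOL."""
    config = dict(DEFAULT_CONFIG)
    if pymol_executable is not None:
        config["PymolExecutable"] = pymol_executable
    # json.dumps escapes backslashes, so a raw Windows path is safe here.
    return json.dumps(config, indent=4) + "\n"


def write_config(exe_directory, pymol_executable=None):
    """Write SecStrAnnotator_config.json next to SecStrAnnotator.exe."""
    path = Path(exe_directory) / "SecStrAnnotator_config.json"
    path.write_text(render_config(pymol_executable), encoding="utf-8")
    return path


if __name__ == "__main__":
    print(render_config(r"C:\Program Files\PyMOL\PyMOL\PyMOL.exe"))
```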
Annotation file format
All files with extension .sses.json are in SecStrAnnotator annotation format. A short example of this format:
{
    "1og2": {
        "comment": "This is a demonstration of the annotation format.",
        "secondary_structure_elements": [
            { "label": "A", "chain_id": "A", "start": 50, "end": 61, "type": "H" },
            { "label": "1.1", "chain_id": "A", "start": 65, "end": 69, "type": "E" },
            { "label": "1.2", "chain_id": "A", "start": 72, "end": 77, "type": "E" },
            { "label": "B", "chain_id": "A", "start": 80, "end": 90, "type": "H" },
            { "label": "1.3", "chain_id": "A", "start": 386, "end": 389, "type": "E" }
        ],
        "beta_connectivity": [
            [ "1.1", "1.2", -1 ],
            [ "1.2", "1.3", 1 ]
        ]
    }
}
The example describes two helices, named A and B, and a β-sheet consisting of three strands, named 1.1, 1.2, and 1.3. Strands 1.1 and 1.2 are connected by an anti-parallel β-ladder, strands 1.2 and 1.3 by a parallel β-ladder. All the SSEs are located on chain A of structure 1og2.
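Since the annotation files are plain JSON, they can be read with any JSON library. The following Python sketch reads the example and reproduces the description given here. It assumes, based only on this example, that type H marks a helix, type E marks a strand, and that the third value in beta_connectivity is -1 for an anti-parallel ladder and 1 for a parallel one; other type codes are passed through unchanged. The function name summarize is hypothetical.

```python
import json

EXAMPLE = """
{
    "1og2": {
        "comment": "This is a demonstration of the annotation format.",
        "secondary_structure_elements": [
            { "label": "A", "chain_id": "A", "start": 50, "end": 61, "type": "H" },
            { "label": "1.1", "chain_id": "A", "start": 65, "end": 69, "type": "E" },
            { "label": "1.2", "chain_id": "A", "start": 72, "end": 77, "type": "E" },
            { "label": "B", "chain_id": "A", "start": 80, "end": 90, "type": "H" },
            { "label": "1.3", "chain_id": "A", "start": 386, "end": 389, "type": "E" }
        ],
        "beta_connectivity": [
            [ "1.1", "1.2", -1 ],
            [ "1.2", "1.3", 1 ]
        ]
    }
}
"""

KINDS = {"H": "helix", "E": "strand"}            # inferred from the example
ORIENTATIONS = {-1: "anti-parallel", 1: "parallel"}


def summarize(annotation_text):
    """Return (sses, ladders) extracted from annotation JSON text."""
    sses, ladders = [], []
    for pdb, entry in json.loads(annotation_text).items():
        for sse in entry["secondary_structure_elements"]:
            kind = KINDS.get(sse["type"], sse["type"])
            sses.append((sse["label"], kind, sse["chain_id"],
                         sse["start"], sse["end"]))
        for first, second, orientation in entry.get("beta_connectivity", []):
            ladders.append((first, second,
                            ORIENTATIONS.get(orientation, orientation)))
    return sses, ladders


if __name__ == "__main__":
    sses, ladders = summarize(EXAMPLE)
    for label, kind, chain, start, end in sses:
        print(f"{kind} {label}: chain {chain}, residues {start}-{end}")
    for first, second, orientation in ladders:
        print(f"{first} and {second}: {orientation} ladder")
```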