Structure Calculation With RDC's Using CYANA

From NESG Wiki
-- PaoloRossi - 14 Dec 2009

Latest revision as of 21:56, 6 January 2010

Introduction

IMPORTANT DISCLAIMER: A number of NESG NMR groups are currently beta-testing sites for CYANA-3.0. The information on this page is intended for use by licensed members of the NESG consortium and other beta testers, and is to be used in accordance with the program licensing agreement.


This page describes the setup and analysis of an automated structure determination starting from NOE peak lists and residual dipolar coupling (RDC) constraints in the framework of CYANA 3.0.

CYANA version 3.0 incorporates many new features including:

a) inclusion of residual dipolar coupling in structure calculation (RDC)

b) inclusion of pseudocontact shifts from paramagnetic centers in structure calculation (PCS)

c) improved function for symmetric dimers and annotation of intermolecular NOE contacts in the peaklist using the xeasy color code notation.  

Good agreement between global orientational constraints from RDC and distance information from NOEs is very important to achieve better structures by NMR.  Results should always be accompanied by energy refinement in CNS or NIH-XPLOR.  Further information about the program and publication references can be found in the CYANA WIKI page.


Automated NOE and RDC and Structure Calculation Setup

The structure calculation with automated NOESY assignments and RDC restraints follows the canonical CYANA recipe. Simple annealing runs starting from a set of constraints that includes RDCs are easily derived by simplifying the scripts below and following the demo scripts. The program requires a sequence file (name.seq), a proton assignment list (name.prot), a set of NOESY peak lists (name.peaks), an RDC list (name.rdc), a CALC.cya script and an init.cya script.

Sequence file

The sequence file now includes the RDC tensor origin separated by dummy linker residues (LL5):

MET      1
THR      2
SER      3
THR      4
PHE      5
ASP      6
ARG      7
VAL      8
ALA      9
THR     10

PL     350
LL5    351
LL5    352
LL5    353
LL5    354
LL5    355
ORI    360


RDC constraint file

The RDC list supports multiple interatomic vectors in multiple media. RDCs with distinct scaling factors and distinct ORI residue numbers are listed in a single file. The program supports the Da (magnitude) and R (rhombicity) notation typical of programs such as PALES and REDCAT. Below is a sample RDC file that includes N-H, N-CA (intraresidue), and N-C' (sequential) vectors in one medium, with appropriate errors and weight factors. Here the N-H error was set to ~10% of the RDC spread, and the N-CA and N-C' errors were determined by analysis of the J-modulated experiments used to measure those RDCs:

# Orientation  Magnitude  Rhombicity  ORI residue number
       1     5.39535        0.63125       360
#  First atom      Second atom                    RDC      Error  Weight  Orientation 
     7    ARG      H       7    ARG     N        5.936     2.000   1.000  1
     8    VAL      H       8    VAL     N        3.827     2.000   1.000  1
     9    ALA      H       9    ALA     N       -2.822     2.000   1.000  1
    10    THR      H      10    THR     N       -0.674     2.000   1.000  1
    11    ILE      H      11    ILE     N        4.945     2.000   1.000  1
    12    ILE      H      12    ILE     N        1.709     2.000   1.000  1
    13    ALA      H      13    ALA     N       -1.336     2.000   1.000  1
#
    4     THR      N       3    SER     C        1.267     0.095   8.330  1
    5     PHE      N       4    THR     C       -0.246     0.207   8.330  1   
    7     ARG      N       6    ASP     C        0.161     0.052   8.330  1   
    8     VAL      N       7    ARG     C        0.439     0.034   8.330  1   
    9     ALA      N       8    VAL     C       -0.076     0.040   8.330  1   
   10     THR      N       9    ALA     C       -0.957     0.048   8.330  1   
   11     ILE      N      10    THR     C        1.123     0.022   8.330  1   
   12     ILE      N      11    ILE     C       -0.440     0.037   8.330  1   
   13     ALA      N      12    ILE     C        0.065     0.026   8.330  1   
#
    3     SER      N      3     SER    CA        0.251     0.199   8.330  1
    4     THR      N      4     THR    CA       -0.265     0.258   8.330  1
    5     PHE      N      5     PHE    CA       -0.499     0.281   8.330  1
    7     ARG      N      7     ARG    CA       -0.457     0.200   8.330  1
    8     VAL      N      8     VAL    CA       -0.481     0.154   8.330  1
    9     ALA      N      9     ALA    CA        0.349     0.083   8.330  1
   10     THR      N     10     THR    CA        0.548     0.121   8.330  1
   11     ILE      N     11     ILE    CA       -0.091     0.111   8.330  1
   12     ILE      N     12     ILE    CA       -0.678     0.078   8.330  1
   13     ALA      N     13     ALA    CA        1.014     0.108   8.330  1
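The layout of the RDC file above (orientation rows with four columns, restraint rows with ten columns, and comment lines starting with #) can be checked before a run with a small script. The following Python sketch is not part of CYANA; the class names, the function names and the column-count heuristic are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Orientation:
    index: int
    magnitude: float    # Da, in Hz
    rhombicity: float   # R
    ori_residue: int    # residue number of the ORI pseudo-residue


@dataclass
class Rdc:
    res1: int
    resname1: str
    atom1: str
    res2: int
    resname2: str
    atom2: str
    value: float        # measured RDC
    error: float
    weight: float
    orientation: int


def parse_rdc_text(text):
    """Split RDC-file text into orientation rows and restraint rows.

    Assumption: rows with 4 columns are orientations and rows with
    10 columns are restraints, as in the sample file shown above.
    """
    orientations, restraints = [], []
    for line in text.splitlines():
        f = line.split()
        if not f or f[0].startswith("#"):
            continue  # blank line or comment
        if len(f) == 4:
            orientations.append(
                Orientation(int(f[0]), float(f[1]), float(f[2]), int(f[3])))
        elif len(f) == 10:
            restraints.append(
                Rdc(int(f[0]), f[1], f[2], int(f[3]), f[4], f[5],
                    float(f[6]), float(f[7]), float(f[8]), int(f[9])))
        else:
            raise ValueError(f"unexpected number of columns: {line!r}")
    return orientations, restraints


def unknown_orientations(orientations, restraints):
    """Return restraints whose orientation index is missing from the header."""
    known = {o.index for o in orientations}
    return [r for r in restraints if r.orientation not in known]
```

Reading the file with open(...).read() and passing the text to parse_rdc_text gives two lists; an empty result from unknown_orientations means every restraint points to a declared orientation.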

Multiple media (i.e. multiple orientations) should be listed as follows:

# Orientation  Magnitude  Rhombicity  ORI residue number
       1     5.39535        0.63125       360
       2     7.55656        0.58200       370

and the sequence should be modified to include further links and ORI:

ALA      9
THR     10

PL     350
LL5    351
LL5    352
LL5    353
LL5    354
LL5    355
ORI    360
LL5    361
LL5    362
LL5    363
LL5    364
LL5    365
ORI    370
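For more than two media, the linker and ORI lines can be generated rather than typed by hand. The Python sketch below simply reproduces the numbering pattern of the example above (PL at 350, five LL5 residues, then ORI on the next multiple of ten). The function names and the assumption that this pattern continues for further orientations are illustrative and are not taken from the CYANA documentation.

```python
def tensor_tail(n_orientations, first_number=350, n_linkers=5):
    """Return (name, number) rows: one PL, then LL5 linkers and an ORI
    pseudo-residue for each orientation, numbered as in the example."""
    rows = [("PL", first_number)]
    number = first_number
    for _ in range(n_orientations):
        for _ in range(n_linkers):
            number += 1
            rows.append(("LL5", number))
        # Place ORI on the next multiple of ten (360, 370, ...).
        number = (number // 10 + 1) * 10
        rows.append(("ORI", number))
    return rows


def format_rows(rows):
    """Format rows in the column layout used by the sequence file above."""
    return [f"{name:<4}{number:>6}" for name, number in rows]


if __name__ == "__main__":
    print("\n".join(format_rows(tensor_tail(2))))
```

The ORI numbers produced this way are the ones that must appear in the "ORI residue number" column of the RDC file header.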


Initial model-free determination of Da and R from assigned RDCs

If the values of Da and R are not known, they can be determined using any desired software or using the FindTensor.cya script below. The program yields results that are equivalent to PALES, assuming the same fitting method is employed.

## 8DEMOS: FindTensor - Determine alignment tensor
##
## Determine magnitude and rhombicity of the alignment tensor
## from input RDCs


# determine tensor from histogram, no structure needed

read rdc phage_all_mono.rdc
print "    Input alignment tensor:"
do i 1 orientations
  print "    Orientation $i: magnitude = $magnitude(i) Hz, rhombicity = $rhombicity(i)."
end do

rdc fittensor method=simplex       # (can take several minutes)
#rdc fittensor method=gridsearch   # systematic search (very slow)


# alternatively, determine tensor from given structure by SVD

read rdc phage_all_mono.rdc
read pdb final.pdb
overview

The dummy values in the RDC list are read initially and can be updated after running the FindTensor.cya routine. If no structural model (e.g. final.pdb) is present, the program terminates with a warning.
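As a quick sanity check on the fitted values, Da and R can also be estimated by hand from the extreme values of the RDC distribution. The Python sketch below assumes the PALES-style convention D = Da[(3cos²θ - 1) + 1.5 R sin²θ cos2φ] and a set of RDCs of a single vector type (e.g. N-H only). It illustrates the general histogram idea only; the source does not state what the rdc fittensor command does internally, and the function names are hypothetical.

```python
import math


def predicted_rdc(da, r, theta, phi):
    """RDC for a vector with polar angles (theta, phi) in the tensor frame,
    assuming the PALES-style Da/R convention (angles in radians)."""
    return da * ((3.0 * math.cos(theta) ** 2 - 1.0)
                 + 1.5 * r * math.sin(theta) ** 2 * math.cos(2.0 * phi))


def estimate_tensor(rdcs):
    """Estimate (Da, R) from the two extremes of a well-sampled RDC set.

    The extreme of largest magnitude is taken as Dzz = 2*Da and the
    opposite extreme as Dyy = -Da*(1 + 1.5*R).
    """
    d_max, d_min = max(rdcs), min(rdcs)
    if abs(d_max) >= abs(d_min):
        d_zz, d_yy = d_max, d_min
    else:
        d_zz, d_yy = d_min, d_max
    da = d_zz / 2.0
    r = (2.0 / 3.0) * (-d_yy / da - 1.0)
    return da, r
```

With few RDCs the extremes are poorly sampled and the estimate can be badly off (R may even come out negative), so this is only a plausibility check on the values written into the RDC file header.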


The CALC.cya file

Once the working directory contains all the files necessary to start the calculation, the CALC.cya script, modified for the presence of RDC constraints, is used:

peaks       := ali5.peaks,aro5.peaks,n3.peaks  # names of NOESY peak lists
prot        := RpR324.prot               # names of chemical shift lists
restraints  := ssa.cya,phage_all_mono.rdc # additional (non-NOE) constraints
tolerance   := 0.04,0.025,0.3            # chemical shift tolerances: HX2-HX1-X1
calibration :=                           # NOE calibration parameters
structures  := 100,20                    # number of initial, final structures
steps       := 10000                     # number of torsion angle dynamics steps
rmsdrange   := 10..80                    # residue range for RMSD calculation
randomseed  := 56231       # random number generator seed

weight_rdc   = 0.02               # weight for RDC restraints
cut_rdc      = 0.2                # cutoff for RDC violation output

ssa
noeassign peaks=$peaks prot=$prot autoaco

Notice that the 'restraints' row contains the forced stereospecific assignments of methyls and sidechain NH's (ssa.cya) and the .rdc file. The weight of the RDC restraints relative to the NOEs is set by weight_rdc, and cut_rdc sets the cutoff for reporting RDC violations. The remaining instructions are identical to the CYANA-2.1 file.


The init.cya file

Further parameters are specified in the init.cya file below:

name:=RpR324
rmsdrange:=10-80
cyanalib
nproc:=8
read seq $name.seq
rdcdistances

The above script is intended to run on a single dual quad-core machine (nproc=8).

rdcdistances.cya file

Note that the rdcdistances.cya macro is called by the init.cya setup file. This file, located in the cyana-3.0/macro directory, contains the supported RDC vectors. More vectors could potentially be added, such as Trp Nε1-Hε1, which may be useful in deuterated samples to orient the large hydrophobic sidechain.

# Copyright (c) 2002-08 Peter Guntert. All rights reserved.
## 7MACROS: rdcdistances - CYANA macro
##
## Parameters: (none)
##
# dipole definition format: atom1_name atom2_name atom1_index atom2_index
# if indexes are missing, zeros are assumed

var info echo

syntax

info:=none; echo:=off
rdc distance "N  H"    distance=1.041
rdc distance "CA HA"   distance=1.117
rdc distance "C  CA"   distance=1.525
rdc distance "C  N"    distance=2.461
rdc distance "C  N -1" distance=1.329
rdc distance "CA N"    distance=1.458
rdc distance "CA N -1" distance=2.425
rdc distance "CA H"    distance=2.117
rdc distance "CA H -1" distance=2.533
rdc distance "C  H -1" distance=2.000
rdc distance "C  HA"   distance=2.144
rdc distance "CB HB"   distance=1.080
rdc distance "CA CB"   distance=1.532
unset info
print "    Standard RDC distances defined."
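The distances in the macro matter because the strength of a dipolar coupling scales as the product of the two gyromagnetic ratios divided by the cube of the internuclear distance. The Python sketch below uses that general relation to compare vector types with N-H, taking the distances from the macro above and standard literature gyromagnetic ratios. It is an illustration of the physics, not a description of how CYANA scales or weights RDCs internally, and the function names are hypothetical.

```python
# Gyromagnetic ratios in 1e6 rad s^-1 T^-1 (standard literature values).
GAMMA = {"H": 267.522, "C": 67.283, "N": -27.116}


def relative_strength(elem_a, elem_b, distance):
    """Dipolar coupling strength up to a constant: |gamma_a*gamma_b| / r^3."""
    return abs(GAMMA[elem_a] * GAMMA[elem_b]) / distance ** 3


def scale_to_nh(elem_a, elem_b, distance, nh_distance=1.041):
    """Factor by which an RDC of the given vector type would be multiplied
    to put it on the same scale as a one-bond N-H RDC."""
    return (relative_strength("N", "H", nh_distance)
            / relative_strength(elem_a, elem_b, distance))


if __name__ == "__main__":
    # Distances taken from the rdcdistances macro listing.
    print("N-C' (sequential):", round(scale_to_nh("N", "C", 1.329), 2))
    print("N-CA (intra):     ", round(scale_to_nh("N", "C", 1.458), 2))
```

Any vector added to the macro (for example the Trp sidechain N-H mentioned above) would need a distance consistent with the geometry used elsewhere in the calculation.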


Notes on run execution

Command line execution on a single machine, or on a cluster using the MPI implementation (highly recommended), is carried out as usual:

/cyana-3.0/cyana CALC >& log &

The MPI run is launched using a script called, for example, submit_cyana:

#!/bin/bash
#PBS -S /bin/bash
#PBS -N cyana
#PBS -lnodes=6:ppn=8
lamboot ~/bhost.def
cd /farm/users/prossi/RpR324_structure/cyana_new_mono2
/opt/openmpi/tcp-gnu/bin/mpirun /farm/software/cyana-3.0-mpi/cyana CALC.cya
lamhalt


with the command:

qsub -q @master3 submit_cyana

The starting scripts are highly system specific. They are almost guaranteed NOT to work on your system and are given here for general information only.


Output analysis

The output analysis is carried out in the usual manner. Note that the specified values of Da and R are kept fixed during the calculation. Following the final cycle, a new model-based estimate of Da and R is calculated and used to compute the RDC violations, their contribution to the target function, and the quality factor (Q). The resulting target function is increased by the number and extent of RDC violations, in addition to other violations from dihedral, vdW, and NOE restraints.
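The quality factor is reported in the RDC statistics of the overview file as Q = rms(Dcalc-Dobs)/rms(Dobs). A minimal Python sketch of that definition is given here for checking values by hand; the function names are illustrative, and the "Q normalized by tensor" variant that CYANA also reports is not reproduced because its exact normalization is not given on this page.

```python
import math


def rms(values):
    """Root mean square of a non-empty sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def q_factor(d_obs, d_calc):
    """Q = rms(Dcalc - Dobs) / rms(Dobs) for paired observed and
    back-calculated RDCs (same order, same units)."""
    if len(d_obs) != len(d_calc) or not d_obs:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [calc - obs for obs, calc in zip(d_obs, d_calc)]
    return rms(diffs) / rms(d_obs)
```

Multiplying the result by 100 gives the percentage form used in the output. Mixing RDCs of different vector types without rescaling lets the largest couplings dominate Q, which should be kept in mind when comparing values between data sets.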

A partial output file is given below (final.ovw):

 
    Structural statistics:
 
    str   target     upper limits    van der Waals             RDCs
        function   #    rms   max   #    sum   max   #    rms   max
      1     8.98   5 0.0089  0.40  12   18.5  0.37  13 0.2508  2.27
      2     8.96   7 0.0083  0.31  15   17.4  0.38  19 0.2656  2.27
      3     9.69  11 0.0122  0.54  15   18.6  0.38  18 0.2595  2.28
      4     9.63   6 0.0068  0.20  16   18.9  0.38  18 0.2604  2.27
      5     9.41  14 0.0089  0.22  17   20.4  0.37  14 0.2521  2.26
      6     9.80  10 0.0085  0.24  17   19.7  0.38  15 0.2563  2.27
      7    10.52  12 0.0158  0.78  15   19.1  0.38  13 0.2554  2.29
      8    10.05  10 0.0084  0.18  18   20.3  0.42  18 0.2587  2.28
      9    10.57   7 0.0079  0.29  19   18.3  0.53  16 0.3169  2.43
     10    10.38  13 0.0088  0.21  22   20.5  0.37  21 0.2636  2.28
     11    10.33   7 0.0067  0.16  17   19.6  0.64  15 0.2663  2.28
     12     9.93   9 0.0090  0.29  19   20.6  0.38  20 0.2541  2.27
     13    10.12  12 0.0098  0.23  21   19.9  0.47  13 0.2681  2.30
     14    10.53   7 0.0077  0.19  23   19.3  0.38  11 0.3145  2.40
     15    10.96   8 0.0089  0.31  21   22.6  0.38  12 0.2639  2.28
     16    10.50  12 0.0100  0.29  23   20.1  0.37  15 0.2696  2.27
     17    10.56  14 0.0119  0.30  23   21.1  0.37  16 0.2736  2.33
     18    10.75  17 0.0146  0.60  21   20.0  0.37  12 0.2547  2.27
     19    10.88  18 0.0125  0.38  19   23.3  0.38  10 0.2506  2.30
     20    10.88   8 0.0083  0.26  23   21.4  0.42  15 0.2644  2.28
 
    Ave    10.17  10 0.0097  0.32  19   20.0  0.41  15 0.2660  2.29
    +/-     0.59   4 0.0024  0.15   3    1.4  0.07   3 0.0177  0.04
    Min     8.96   5 0.0067  0.16  12   17.4  0.37  10 0.2506  2.26
    Max    10.96  18 0.0158  0.78  23   23.3  0.64  21 0.3169  2.43
    Cut                      0.10             0.20             0.20
 
    Constraints violated in 6 or more structures:
                                                   #   mean   max.  1   5   10   15   20
    Upper HA    PRO   19 - HB3   ARG   20   5.50  15   0.11   0.21  +++++++  ++++   ++*+  peak 970
    Upper HA    ILE   23 - QB    SER   27   5.34   6   0.09   0.16         + ++   +  +*   peak 276
    Upper HA    ILE   30 - HB2   LEU   33   4.95  10   0.10   0.15      ++++  ++  ++ +*   peak 313
    VdW   CB    ALA   69 - H     THR   70   2.55  18   0.24   0.28  ++++++++ ++++ *+++++
    VdW   O     THR   71 - N     PHE   75   2.75  12   0.20   0.32   ++* +++  +   ++ +++
    VdW   O     PHE   75 - C     VAL   76   2.80  13   0.21   0.31      ++++ +  *+++++++
    VdW   CG1   VAL   76 - HG2   LYS   78   2.60   9   0.17   0.25   ++*   +++++   +
    VdW   HG3   LYS   78 - C     LYS   78   2.50  14   0.22   0.36  +++*++ +++++ + +  +
    Ori 1 N     ALA   69 - CA    ALA   69  -2.84  20   1.95   2.04  +++++++++++++*++++++
    Ori 1 H     ALA   69 - N     ALA   69  -3.32  20   2.01   2.33  ++++++++++++++++*+++
    Ori 1 N     THR   70 - C     ALA   69   0.25  16   0.31   1.06  ++++++++*+++ + +  ++
    Ori 1 H     ASN   79 - N     ASN   79  -4.71  10   0.17   0.32  + *+ +  ++++   +  +
    Ori 1 H     GLY   92 - N     GLY   92  -5.54   8   0.15   0.51     ++++*       +  ++
    Ori 1 H     LEU   94 - N     LEU   94   0.42   6   0.14   0.54   +*+     +++
    3 violated distance restraints.
    5 violated van der Waals restraints.
    6 violated residual dipolar coupling restraints.
 
 
    RDC statistics:
    Correlation coefficient      :    0.906 +/-  0.003    (0.899..0.909, best in conformer 4)
    Q = rms(Dcalc-Dobs)/rms(Dobs):   42.709 +/-  0.583 %  (42.073..44.194)
    Q normalized by tensor       :   32.943 +/-  0.638 %  (32.231..34.776)
    Alignment tensor magnitude   :    5.881 +/-  0.045 Hz (5.760..5.950, best 5.898; input 5.898)
    Alignment tensor rhombicity  :    0.537 +/-  0.007    (0.524..0.556, best 0.539; input 0.539)