RDC Refinement with XPLOR-NIH

Latest revision as of 19:28, 5 April 2012

Brief Description

The angular dependence of residual dipolar couplings (RDCs) and residual chemical shift anisotropies (RCSAs) provides valuable structural information that complements NOE distance restraints. RDCs and RCSAs can be used to:

- Validate protein structures
- Refine protein structures (current topic)
- Provide constraints as a part of an initial structure determination

Selection of RDC and RCSA Orientational Restraints

While RDCs and RCSAs can be very useful orientational restraints for structural refinement, they can also introduce errors into calculations if not used properly. Local motion can average RDCs, which makes them inappropriate for use in the search for a rigid structural model. The current recommendation is to use RDCs and RCSAs from residues in well-ordered regions of the protein. For initial structure refinement, use RDCs and RCSAs from residues that are part of well-ordered alpha-helical or beta-strand regions (PSVS identifies these regions). RDCs from other well-ordered regions can be added at the user's discretion. Ideally, residue-specific correlation times (TauC) from T1/T2 or cross-correlation measurements can be used to exclude RDCs from mobile regions.
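
The selection step described above can be sketched in plain Python. This is only an illustration, not part of the Xplor-NIH distribution: the function names, the assumption that the tensor pseudo-residue is numbered 500, and the assumption that every restraint starts with the keyword "assign" are all choices made for this example. The set of well-ordered residues would come from an external analysis such as PSVS.

```python
import re

def split_restraints(table_text):
    """Split an XPLOR-style table into blocks, one per 'assign' statement.

    Lines starting with '#' and blank lines are dropped (assumption: the
    table uses the same layout as the examples on this page).
    """
    blocks, current = [], []
    for line in table_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("assign"):
            if current:
                blocks.append("\n".join(current))
            current = [line]
        elif current and stripped and not stripped.startswith("#"):
            current.append(line)
    if current:
        blocks.append("\n".join(current))
    return blocks

def filter_restraints(table_text, ordered_residues, tensor_resid=500):
    """Keep only restraints whose protein atoms all lie in ordered_residues."""
    kept = []
    for block in split_restraints(table_text):
        resids = {int(r) for r in re.findall(r"resid\s+(\d+)", block)}
        resids.discard(tensor_resid)  # ignore the tensor axis pseudo-residue
        if resids and resids <= set(ordered_residues):
            kept.append(block)
    return kept
```

Calling filter_restraints(open("prot_rdc.tbl").read(), {3, 4, 5}) would return only the restraints that involve residues 3 to 5; joining the result with blank lines gives a reduced table for the initial refinement.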

Parameter Selection for Use of RDC and RCSA Orientational Restraints in Refinement

- Weighting of the RDCs and RCSAs
- Error correction of the NOE, dihedral angle, RDC, and RCSA restraints


It is easy to implement the RDC and RCSA restraints using the example simulated annealing protocol included in the Xplor-NIH distribution package. The Python version of the refinement script was taken from the example dataset (xplor-nih-2.28/eginput/gb1_rdc/refine.py) provided with the XPLOR-NIH package (http://nmr.cit.nih.gov/xplor-nih/). The features of particular relevance to RDC refinement are as follows:

- Variable tensor tools for floating the principal alignment parameters, Da and R, during refinement
- Database potentials of mean force to refine against:
  - Multidimensional torsion angles
  - Backbone hydrogen bonding database (optional)


Getting Started

The refinement uses the following input files in XPLOR format:

- prot_noe.tbl: NOE restraint table
- prot_dihe.tbl: Dihedral angle restraint table
- prot_rdc.tbl: RDC restraint table
- prot_rcsa.tbl: RCSA restraint table
- prot.psf and prot.pdb: Startup psf and pdb files, generated using the lowest energy structure from CYANA

NOE Restraint Table

An example of the NOE restraint table in XPLOR format is shown below (converted from a CYANA upl file using a CYANA to XPLOR conversion script):

assign ( resid    2 and name HA   )   ( resid    2 and name HD*  )   4.00  2.20  1.00
assign ( resid    2 and name HA   )   ( resid    2 and name HG1  )   4.00  2.20  1.00
assign ( resid    2 and name HA   )   ( resid    2 and name HE*  )   4.00  2.20  1.00

Dihedral Angle Table

An example of the dihedral angle restraint table in XPLOR format is shown below (use CYANA for format conversion). Psi of residue i is defined by the atoms N(i), CA(i), C(i), N(i+1), and phi of residue i by C(i-1), N(i), CA(i), C(i):

assign ( resid    7 and name N    )   ( resid    7 and name CA   )
       ( resid    7 and name C    )   ( resid    8 and name N    )  1  -34.00   20.00 2 # For Psi Angle of residue 7
assign ( resid    7 and name C    )   ( resid    8 and name N    )
       ( resid    8 and name CA   )   ( resid    8 and name C    )  1  -71.00   34.00 2 # For Phi Angle of residue 8
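
As an illustration of the layout above, the following Python sketch writes phi and psi restraints from a residue number, a target angle, and a range. The helper names are hypothetical and are not part of CYANA or Xplor-NIH; the atom definitions follow the standard backbone torsion convention (phi: C(i-1), N(i), CA(i), C(i); psi: N(i), CA(i), C(i), N(i+1)).

```python
def phi_atoms(i):
    """Atoms defining the phi angle of residue i."""
    return [(i - 1, "C"), (i, "N"), (i, "CA"), (i, "C")]

def psi_atoms(i):
    """Atoms defining the psi angle of residue i."""
    return [(i, "N"), (i, "CA"), (i, "C"), (i + 1, "N")]

def dihedral_restraint(atoms, angle, half_range, force=1.0, exponent=2):
    """Format one dihedral restraint in the two-line layout used on this page.

    The trailing numbers are: force constant, target angle, range, exponent.
    """
    sel = ["( resid %d and name %s )" % atom for atom in atoms]
    return ("assign %s  %s\n       %s  %s  %g  %.2f  %.2f %d"
            % (sel[0], sel[1], sel[2], sel[3],
               force, angle, half_range, exponent))

# Example: psi of residue 7 and phi of residue 8
print(dihedral_restraint(psi_atoms(7), -34.0, 20.0))
print(dihedral_restraint(phi_atoms(8), -71.0, 34.0))
```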

RDC Table

An example of the RDC table in XPLOR format is shown below:

- ( resid 500 and name OO ): resid 500 is the residue number for the tensor axis system
- ( resid # and name N ): defines the first atom of the pair
- ( resid # and name HN ) 2.586 1.5: defines the second atom of the pair, followed by the RDC value and the error value

Note: The error value is not utilized with the square potential of this script.

# For NH Coupling
assign ( resid 500  and name OO  )
       ( resid 500  and name Z   )
       ( resid 500  and name X   )
       ( resid 500  and name Y   )
       ( resid 3    and name N   )
       ( resid 3    and name HN   )  2.586  1.5

assign ( resid 500  and name OO  )
       ( resid 500  and name Z   )
       ( resid 500  and name X   )
       ( resid 500  and name Y   )
       ( resid 4    and name N   )
       ( resid 4    and name HN   )  7.785  1.5

# For NC Coupling, normalized to NH magnitude
assign ( resid 500  and name OO  )
       ( resid 500  and name Z   )
       ( resid 500  and name X   )
       ( resid 500  and name Y   )
       ( resid 2    and name C   )
       ( resid 3    and name N   )  -9.26475  1.5

assign ( resid 500  and name OO  )
       ( resid 500  and name Z   )
       ( resid 500  and name X   )
       ( resid 500  and name Y   )
       ( resid 3    and name C   )
       ( resid 4    and name N   )  -3.9435  1.5
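
A table in this layout can be generated from a list of measured couplings. The Python sketch below is an illustration only; the function name and the default tensor residue number of 500 are assumptions made for this example, chosen to match the table above.

```python
def rdc_restraint(atom1, atom2, value, error, tensor_resid=500):
    """Format one RDC restraint.

    atom1 and atom2 are (residue number, atom name) tuples; value and
    error are the measured coupling and its estimated error.
    """
    lines = ["assign ( resid %d and name OO )" % tensor_resid]
    for axis in ("Z", "X", "Y"):
        lines.append("       ( resid %d and name %s )" % (tensor_resid, axis))
    lines.append("       ( resid %d and name %s )" % atom1)
    lines.append("       ( resid %d and name %s )  %.3f  %.1f"
                 % (atom2 + (value, error)))
    return "\n".join(lines)

# Example: the two NH couplings listed above
nh_couplings = [(3, 2.586), (4, 7.785)]
table = "\n\n".join(rdc_restraint((resid, "N"), (resid, "HN"), value, 1.5)
                    for resid, value in nh_couplings)
print(table)
```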

RCSA Restraint Table

RCSAs can also be used to validate protein structures using REDCAT.

Unlike RDCs, which are defined by a single vector, RCSAs require a full three-axis system. This is usually defined by identifying a set of three bonded atoms.

An example of the RCSA table in XPLOR format is shown below:

- (resid 2 and name C) (resid 3 and name N) (resid 3 and name HN) 29.040 24.750: 29.040 is the N RCSA shift in ppb and 24.750 is a constant
- (resid 4 and name C) (resid 4 and name O) (resid 5 and name N) -24.400 13.333: -24.400 is the C RCSA shift in ppb and 13.333 is a constant

# For N RCSA
assign (resid 500 and name OO) (resid 500 and name Z) (resid 500 and name X) (resid 500 and name Y)
       (resid 2 and name C) (resid 3 and name N) (resid 3 and name HN) 29.040  24.750
assign (resid 500 and name OO) (resid 500 and name Z) (resid 500 and name X) (resid 500 and name Y)
       (resid 3 and name C) (resid 4 and name N) (resid 4 and name HN) 62.205  24.750
assign (resid 500 and name OO) (resid 500 and name Z) (resid 500 and name X) (resid 500 and name Y)
       (resid 4 and name C) (resid 5 and name N) (resid 5 and name HN) 55.110  24.750

# For C RCSA
assign (resid 500 and name OO) (resid 500 and name Z) (resid 500 and name X) (resid 500 and name Y)
       (resid 4 and name C) (resid 4 and name O) (resid 5 and name N) -24.400  13.333
assign (resid 500 and name OO) (resid 500 and name Z) (resid 500 and name X) (resid 500 and name Y)
       (resid 7 and name C) (resid 7 and name O) (resid 8 and name N) 36.533   13.333
assign (resid 500 and name OO) (resid 500 and name Z) (resid 500 and name X) (resid 500 and name Y)
       (resid 8 and name C) (resid 8 and name O) (resid 9 and name N) -8.400   13.333

Protocol for Inclusion of RDCs in Structure Determination

It is important to run more than one simulated annealing cycle (sometimes called an annealing and refinement cycle). It is also important to recognize that RDC error functions have multiple minima and are not very useful for finding a suitable structure when starting far from the target, so RDCs are typically used in conjunction with NOE data. In simulated annealing, the weight of the RDC data is low compared with the weight of the NOE data early in the annealing protocol. The weights for both RDCs and NOEs are ramped up as the temperature decreases, but the RDC weight is ramped up more rapidly. In the scripts below, suggested starting values for the weighting factors are given. Note that in the case of multiple RDC sets, additional relative weighting factors are given in proportion to the importance (size) of the various RDCs. Ideally, weights are adjusted to produce Q-factors for back-calculated sets in the 0.1 to 0.2 range. Values below 0.1 are not normally justified by the quality of the data and may distort local bond geometry.
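
For checking whether a set of weights lands in the suggested Q-factor range, a Q-factor can be computed from observed and back-calculated couplings. The Python sketch below uses one common definition, the rms deviation between observed and calculated values divided by the rms of the observed values. Other normalizations exist, and this is not taken from the Xplor-NIH source, so values may differ from those reported by a particular program.

```python
import math

def q_factor(observed, calculated):
    """Q = rms(observed - calculated) / rms(observed) for one RDC set."""
    if len(observed) != len(calculated) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    deviation = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    magnitude = sum(o ** 2 for o in observed)
    return math.sqrt(deviation / magnitude)

# Example with made-up numbers (Hz); not real data
observed = [2.6, 7.8, -9.3, -3.9]
calculated = [2.9, 7.1, -8.8, -4.4]
print("Q = %.3f" % q_factor(observed, calculated))
```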


First, obtain a good estimate of the magnitude of Da and R from the alignment tensors using either the REDCAT or the PALES program, and use this as a starting point for the refinement. Then edit the following portion of the refine.py script. Note: text on the same line following a "#" sign is not read by the XPLOR program.

('t', -6.5, 0.62): this entry sets Da (-6.5) and the rhombicity (0.62) for alignment medium t

from varTensorTools import create_VarTensor
media = {}   # maps each medium name to its alignment tensor object
#                        medium  Da   rhombicity
for (medium,Da,Rh) in [ ('t',   -6.5, 0.62),
                        ('b',   -9.9, 0.23) ]:
    oTensor = create_VarTensor(medium)
    oTensor.setDa(Da)
    oTensor.setRh(Rh)
    media[medium] = oTensor
    pass


The example below contains NH, NCO, and HNC RDCs from two different alignment media. A separate scale factor is given for each data set because the magnitudes of the non-NH RDCs were not normalized to the magnitude of the NH RDCs.

from rdcPotTools import create_RDCPot, scale_toNH
rdcs = PotList('rdc')
for (medium,expt,file,                 scale) in \
    [('t','NH' ,'tmv107_nh.tbl'       ,1),
     ('t','NCO','tmv107_nc.tbl'       ,.05),
     ('t','HNC','tmv107_hnc.tbl'      ,.108),
     ('b','NH' ,'bicelles_new_nh.tbl' ,1),
     ('b','NCO','bicelles_new_nc.tbl' ,.05),
     ('b','HNC','bicelles_new_hnc.tbl',.108)
     ]:
    rdc = create_RDCPot("%s_%s"%(medium,expt),file,media[medium])

    #1) scale prefactor relative to NH
    #   see python/rdcPotTools.py for exact calculation
    # scale_toNH(rdc) - not needed for these datasets -
    #                        but non-NH reported rmsd values will be wrong.

    #3) Da rescaling factor (separate multiplicative factor)
    # scale *= ( 1. / rdc.oTensor.Da(0) )**2
    rdc.setScale(scale)
    rdc.setShowAllRestraints(1) #all restraints are printed during analysis
    rdc.setThreshold(1.5)       # in Hz
    rdcs.append(rdc)
    pass
potList.append(rdcs)
rampedParams.append( MultRamp(0.05,0.25, "rdcs.setScale( VALUE )") )
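
To get a feel for how the overall RDC scale changes between the two values passed to MultRamp, the sketch below computes a geometric ramp in plain Python. It assumes that a multiplicative ramp multiplies the value by a constant factor at each step; it is an illustration only, not the Xplor-NIH implementation, and the function name and step count are arbitrary choices for this example.

```python
def multiplicative_ramp(start, end, n_steps):
    """Return n_steps values from start to end with a constant ratio."""
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    factor = (end / start) ** (1.0 / (n_steps - 1))
    return [start * factor ** i for i in range(n_steps)]

# Example: the RDC scale range used above, sampled at 5 steps
for value in multiplicative_ramp(0.05, 0.25, 5):
    print("%.4f" % value)
```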


Allow Da and R to float by using the setFreedom method associated with the medium object. To keep the peptide planes rigid, the IVM_groupRigidBackbone tool was used (the first two lines and the last line).

from selectTools import IVM_groupRigidBackbone
IVM_groupRigidBackbone(dyn)

for m in media.values():
#    m.setFreedom("fixDa, fixRh")        #fix tensor Rh, Da, vary orientation
    m.setFreedom("varyDa, varyRh")      #vary tensor Rh, Da, vary orientation
protocol.torsionTopology(dyn,oTensors=media.values())

# minc used for final cartesian minimization
#
minc = IVM()
protocol.initMinimize(minc)
IVM_groupRigidBackbone(minc)

Validation of Structures Using RDCs and RCSAs