RDC-Assisted Dimer Structure Determination

From NESG Wiki
== '''Introduction''' ==


Protein homo-oligomers occur frequently in nature, and oligomerization can significantly influence biological processes. According to Levy et al., a high percentage of proteins are capable of forming homo-oligomers [1]. The prevalence of homo-oligomers has made elucidation of their structures a major imperative for the structural genomics community. However, solving homo-oligomer structures has always been difficult using traditional NOE-based techniques in solution NMR, primarily because obtaining unambiguous intermolecular distance constraints from NOEs is limited by both sample preparation and NMR methodology. The traditional approach uses isotope filtering to distinguish intermolecular from intramolecular NOEs. This requires samples that contain a mixture of isotopically labeled and unlabeled protein. Owing to the statistics of oligomer association, only half of the protein forms oligomers containing both labeled and unlabeled components, which immediately results in a 50% loss in signal intensity. The isotope-filtering/editing experiments used to obtain the NOEs also suffer from low sensitivity due to the presence of many filtering elements. Furthermore, the nature of a homo-oligomer implies that the same set of residues is involved in all intermolecular NOEs. This increases the likelihood of intermolecular NOEs between sequential or identical residues, lowering the number of unambiguous intermolecular NOEs in many situations.<br><br>These disadvantages of the NOE approach have motivated the development of a complementary NMR approach for solving homo-oligomer structures. The approach taken in this study uses residual dipolar couplings (RDCs) to obtain the orientation of the symmetry axis relative to the monomer structure. This information is then used to model the dimer structure with a simple grid search algorithm that finds a suitable binding interface.<br>


== '''Theoretical Background''' ==


Dipolar interaction between spins is one of a few dominant interactions in NMR. In solution, the fast reorientation of the molecules usually averages the interaction to zero. However, if the molecules are forced to favor one particular orientation, as is the case for molecules immersed in liquid crystalline media or attached to a paramagnetic ion, the small imbalance in average orientation allows a small portion of the dipolar interaction to remain, adding an extra component to the observed splitting. This remaining component is referred to as the residual dipolar coupling, or RDC. As the dipolar interaction is orientation dependent, the size of the RDC is related to the average orientation of the inter-nuclear vector relative to the magnetic field, and offers information complementary to that of the NOE.<br><br>For each alignment medium, five parameters are required to determine the RDC for each atom pair. This information is usually expressed as a three-by-three symmetric and traceless matrix, known as the alignment tensor. A frame of reference can be chosen such that the matrix is diagonal in that frame. This frame is usually referred to as the principal axis frame of the alignment tensor. The axes of this frame are unique in that the RDC values of inter-nuclear vectors related by a 180 degree rotation around any of the axes are identical. This symmetry property is exploited in our modeling of dimer structures: since a 180 degree rotation around the dimer symmetry axis leaves the complex unchanged, the RDC values of each molecule must also be invariant. This implies that the symmetry axis of a dimeric molecule must be parallel to one of the axes of the principal alignment frame. This is in fact true of all homo-oligomers with cyclic symmetry. Al-Hashimi et al. have shown that homo-oligomers with cyclic symmetry of order three or higher always align with the symmetry axis parallel to the non-degenerate axis of an axially symmetric alignment tensor [2]. In the case of a dimer, any of the three principal axes may be parallel to the symmetry axis. This ambiguity can only be resolved experimentally by aligning the protein in two non-degenerate alignment media. As the symmetry axis adopts the same orientation relative to the monomers irrespective of the alignment medium, two non-degenerate alignment tensors must share a common axis, which identifies the orientation of the symmetry axis.<br><br>Although RDCs have been used in structure determination and validation for some time, the symmetry information contained in RDCs has rarely been exploited. This is especially unfortunate for homo-oligomer structure elucidation, where global orientation information can help greatly in the interpretation of intermolecular NOEs, if not determine the structure outright. Our protocol first determines the orientation of the symmetry axis of the oligomer using the relationship between the alignment tensor axes and the oligomer symmetry axis outlined above. This information greatly restricts the possible positions of the dimer interface and allows it to be determined using only a simple two-dimensional grid search.<br><br>It is prudent to mention at this point the prerequisites for successfully executing the protocol. Since the alignment tensor must be known to place the symmetry axis, a correct monomer structure must be available. If a monomer structure is not available because the NOE data are contaminated with intermolecular NOEs, fragments of the structure whose conformation is certain can be used in isolation to calculate the orientation of the alignment tensor. In the case of a dimer, two sets of RDCs from non-degenerate alignment media are required to determine the symmetry axis unambiguously.<br>
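
The shared-axis argument for dimers can be illustrated with a short Python sketch. It assumes that the principal axes of both alignment tensors are already available as unit vectors in the same molecular frame; the function name and the example tensors are hypothetical and are not part of REDCAT or of the scripts used in this protocol.

```python
import math

def common_axis(frame_a, frame_b):
    """Return (i, j, angle_deg) for the pair of principal axes closest to parallel.

    frame_a, frame_b: three unit vectors each (the principal axes of one
    alignment tensor), all expressed in the same molecular frame.
    """
    best = None
    for i, a in enumerate(frame_a):
        for j, b in enumerate(frame_b):
            # A principal axis has no sign, so compare the absolute dot product.
            dot = min(1.0, abs(sum(x * y for x, y in zip(a, b))))
            angle = math.degrees(math.acos(dot))
            if best is None or angle < best[2]:
                best = (i, j, angle)
    return best

# Hypothetical example: tensor 2 is tensor 1 rotated by 40 degrees about z,
# so z is the axis shared by both frames.
c, s = math.cos(math.radians(40.0)), math.sin(math.radians(40.0))
tensor_1 = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tensor_2 = [(c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0)]
i, j, angle = common_axis(tensor_1, tensor_2)
print(i, j, round(angle, 3))
```

With experimental tensors the smallest angle will not be exactly zero; how large a deviation is acceptable depends on the uncertainty of the fitted tensors.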


==== XPLOR-NIH and dimer structure calculation  ====


XPLOR-NIH has extensive support for RDC-assisted structure refinement. However, the RDC potential implementation in XPLOR-NIH does not utilize the dimer symmetry axis orientation information embedded in the RDCs. Therefore, refining structures with the RDC in XPLOR-NIH will not result in correct oligomer structures.
==== Applicability of the algorithm to weak dimers  ====
Although the scheme was first envisioned for strong dimers, there is increasing evidence that the protocol may be even more valuable for solving weak dimer structures. It is usually very difficult to obtain intermolecular NOEs for weaker dimers through X-filtered experiments and, as a consequence, the probability of obtaining an erroneous monomer structure is also reduced. However, because the solution usually contains a mixture of monomer and dimer species, the measured RDC contains contributions from both monomer and dimer. To obtain an accurate measurement of only the dimer component of the RDC, the dissociation constant of the dimerization process needs to be known, and RDC measurements have to be made on multiple samples at different concentrations. The dimer RDC can then be extrapolated based on the linear relationship between the fraction of dimer in solution and the RDC value. Although the procedure is tedious, it can provide information that is otherwise not obtainable.
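
The extrapolation described above can be sketched in Python as follows. This is a minimal illustration, not the procedure used by the authors: it assumes a simple 2M ⇌ D equilibrium with Kd = [M]²/[D], assumes that the observed RDC is the population-weighted average of the monomer and dimer values, and uses made-up concentrations and couplings. All function names are hypothetical.

```python
import math

def dimer_fraction(total_conc, kd):
    """Fraction of protein chains in the dimer for 2M <-> D, Kd = [M]^2 / [D].

    total_conc and kd must use the same concentration unit.
    """
    # Mass balance: total = [M] + 2[D] = [M] + 2[M]^2 / Kd, solved for [M].
    monomer = kd * (math.sqrt(1.0 + 8.0 * total_conc / kd) - 1.0) / 4.0
    return 1.0 - monomer / total_conc

def extrapolate_dimer_rdc(fractions, rdcs):
    """Fit a least-squares line to (fraction, RDC) and evaluate it at fraction = 1."""
    n = len(fractions)
    mean_f = sum(fractions) / n
    mean_d = sum(rdcs) / n
    sxx = sum((f - mean_f) ** 2 for f in fractions)
    sxy = sum((f - mean_f) * (d - mean_d) for f, d in zip(fractions, rdcs))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_f
    return intercept + slope  # value of the line at fraction = 1

# Hypothetical data for one inter-nuclear vector (concentrations in mM, RDCs in Hz).
kd = 0.5
concentrations = [0.1, 0.25, 0.5, 1.0]
fractions = [dimer_fraction(c, kd) for c in concentrations]
monomer_rdc, dimer_rdc = 2.0, 10.0  # synthetic "true" values
observed = [(1.0 - f) * monomer_rdc + f * dimer_rdc for f in fractions]
estimate = extrapolate_dimer_rdc(fractions, observed)
print(round(estimate, 3))
```

The fit is repeated independently for each measured coupling. The sketch ignores any change in the degree of alignment between samples, which would have to be controlled or corrected for in practice.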
==== Implementation of the grid search  ====
Once the symmetry axis of the oligomer has been identified, dimer models can be constructed via a simple grid search. Computational modeling of dimer structures can be implemented with several approaches. In the absence of the symmetry axis information, the search is a three-dimensional search that contains a large number of possibilities. With the symmetry information, the search is restricted to a simple two-dimensional search in the plane orthogonal to the symmetry axis. This is because movements parallel to the axis break the symmetry and therefore cannot be allowed.<br>The grid search can be implemented with a variety of programs. The implementation used in this protocol uses the program VMD, which contains an extensive set of functions for convenient manipulation of PDB coordinates and is capable of calling outside programs to perform molecular dynamics and simulate RDCs. During the grid search, the PDB coordinates of the monomer in the alignment tensor frame are first imported and placed at the origin. An identical copy of the molecule is then made and rotated by 180 degrees around the symmetry axis. The copy is then positioned at regularly spaced points in the plane perpendicular to the symmetry axis. At each grid point, a preliminary analysis of the relationship between the two molecules is carried out to judge whether the model is a realistic dimer. The analysis mostly involves calculating the distance between the two molecules to ensure that there are no egregious clashes, yet that the molecules are close enough to form an intermolecular interaction surface. Models judged to be plausible undergo a round of molecular dynamics involving only residues at the interface. This improves the interfacial contact if the initial side chain conformation of the monomer is not optimal. All suitable models produced by the search are then evaluated based on the suitability of the intermolecular interface and, if PEG RDCs are available, the agreement between the RDCs predicted from the dimer structure and the experimental RDCs.
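
The geometric part of the search described above can be sketched in Python. This is only an illustration of the idea and not the VMD/Tcl implementation used in this protocol: it assumes the symmetry axis is the z axis of the alignment frame, omits the molecular dynamics and RDC prediction steps, and uses arbitrary clash and contact cutoffs. The function names, the cutoffs, and the toy coordinates are all hypothetical.

```python
import math

def rotate_180_about_z(coords):
    """Rotate coordinates by 180 degrees about z: (x, y, z) -> (-x, -y, z)."""
    return [(-x, -y, z) for x, y, z in coords]

def min_distance(coords_a, coords_b):
    """Smallest interatomic distance between two coordinate sets."""
    return min(math.dist(a, b) for a in coords_a for b in coords_b)

def grid_search(monomer, half_width, spacing=1.0, clash=3.0, contact=6.0):
    """Return (dx, dy, distance) for grid points that give a plausible dimer.

    The rotated copy is translated only within the xy plane, so every model
    keeps a two-fold axis parallel to z.
    """
    partner = rotate_180_about_z(monomer)
    steps = int(half_width / spacing)
    accepted = []
    for ix in range(-steps, steps + 1):
        for iy in range(-steps, steps + 1):
            dx, dy = ix * spacing, iy * spacing
            moved = [(x + dx, y + dy, z) for x, y, z in partner]
            d = min_distance(monomer, moved)
            # Keep models that do not clash but are still in contact.
            if clash <= d <= contact:
                accepted.append((dx, dy, d))
    return accepted

# Toy three-atom "monomer" (angstroms), just to exercise the functions.
toy_monomer = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
models = grid_search(toy_monomer, half_width=8.0)
print(len(models))
```

For a real protein the all-pairs distance check is slow; a neighbor search or a coarse pre-filter on the centers of mass would normally be used first.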


== '''Experimental Procedure'''  ==
=== <br>Part 1: Obtaining the symmetry axis orientation  ===


Information regarding the orientation of the symmetry axis relative to the monomers is embedded in the RDCs. To obtain this information, the structure of the monomer is required. Using the structure, the alignment tensor orientation can be calculated from the RDCs. The most convenient way of calculating the alignment tensor is to use the program REDCAT [3]. Please refer to the “RDC Screening and Structure Refinement” section of the TWiki for details on how to prepare input files for and run REDCAT. For dimers, RDCs from two non-degenerate media are required.<br><br>Once the alignment tensors have been calculated, use the plot feature of REDCAT to plot the Sanson-Flamsteed projection of the tensor orientation (Tools-&gt;Plot-&gt;2D SF Plot). Trimers and higher-order oligomers should have an axially symmetric alignment tensor with the non-degenerate axis being the symmetry axis. For dimers, plot the two alignment tensors on the same graph. The two tensor orientations should be related to each other by a rotation around a stationary axis. The stationary axis is therefore the symmetry axis of the dimer.
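
For readers unfamiliar with the Sanson-Flamsteed (sinusoidal) projection used in the plot above, the following Python sketch shows how an axis direction maps to a point on such a plot. It uses the standard sinusoidal formula and is not taken from REDCAT; REDCAT's own conventions for units, axis ranges, and which half of the sphere is shown may differ. The function name is hypothetical.

```python
import math

def sanson_flamsteed(axis):
    """Project a direction vector onto sinusoidal (Sanson-Flamsteed) coordinates.

    Returns (x, y) in radians, with x = longitude * cos(latitude), y = latitude.
    """
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    lat = math.asin(z / norm)  # angle above the xy plane
    lon = math.atan2(y, x)     # angle around the z axis
    return (lon * math.cos(lat), lat)

# The three frame axes themselves, as a quick check of the mapping.
print(sanson_flamsteed((1.0, 0.0, 0.0)))
print(sanson_flamsteed((0.0, 1.0, 0.0)))
print(sanson_flamsteed((0.0, 0.0, 1.0)))
```

Because a principal axis has no sign, both the vector and its negative describe the same axis, so each axis appears at two antipodal points. An axis that is stationary between the two media appears at the same position for both tensors.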


Once the symmetry axis has been identified, the PDB coordinates of the structure in the alignment tensor frame can be obtained by using the rotate-PDB function in REDCAT, keeping in mind that the angles supplied by REDCAT apply only to the structure from which the alignment tensor was calculated.


=== <br>Part 2: Grid search ===


The grid search can be implemented using virtually any program. The current approach uses VMD, although an XPLOR-NIH-based approach is probably more efficient and convenient. VMD is usually considered only a visualization program. However, it includes two embedded scripting language interfaces (Tcl and Python), through which graphical elements as well as data such as PDB coordinates can be manipulated very easily. The current grid search is implemented using the Tcl interface, which means that the scripts that execute the grid search consist of a series of Tcl commands. More information on the Tcl commands provided by VMD can be found at the VMD website (www.ks.uiuc.edu/Research/vmd).
 
To set up the grid search, several input files need to be prepared:<br>1. PDB of the monomer in the alignment tensor frame.<br>2. Steric alignment RDCs of the monomers in PALES format.<br>3. The dimensions of the monomer in angstroms.<br>4. Configuration files for NAMD.<br>5. The oma.tcl and search.tcl scripts.<br><br>The first two files can be prepared fairly easily. Please consult the PALES documentation for the format of its input files. The dimensions of the molecule can be found in VMD using the “measure minmax” function. A number of other programs are also capable of giving the same information. The oma.tcl and search.tcl scripts are attached to the end of this entry.<br>Two external programs are also called during the grid search. NAMD is used for the binding surface MD/energy minimization. PALES is used to predict steric-alignment RDCs. NAMD requires a configuration file as well as the parameter file for the force field used in the MD calculation. Furthermore, to create the correct dimer PDB, the VMD plugin psfgen must be installed and working correctly. Please consult the VMD documentation for instructions on how to install psfgen.<br>The best practice is to gather all needed files and binaries for NAMD and PALES in one directory and run each grid search in its own directory so that output files are kept separate.<br>In the case of dimers, any of the three principal axes may be the symmetry axis. However, the search.tcl script is axis-specific. To accommodate the possibility that any of the three axes can be the symmetry axis, three versions of the search.tcl file are included. They are named xsearch.tcl, ysearch.tcl and zsearch.tcl. The first letter in each name specifies the axis that is the symmetry axis. Use the file that is appropriate for the situation.<br><br>Before launching the grid search, the size of the search plane needs to be set in the search script, because the grid search covers a square area with grid points that are one angstrom apart in both dimensions. This needs to be done in two places: as an argument to the setup function and as an argument to the gridsearch function. Note that there are axis-specific forms of both the setup and gridsearch functions. The arguments of the setup function consist of the molecule ID of the monomer structure and a list of two numbers. The first number specifies the dimension of the search area, which should be set to twice the length of the longest side of the protein. The second number specifies the place on the grid to start the search. If n is specified as the size of the grid, then the starting point must be between 0 and n-1. The gridsearch function also takes two arguments. Its second argument is a list of three numbers. The first number is the size of the grid, the second number is the starting point and the third number is the stopping point. The first two numbers must be identical to the two numbers passed to the setup function. The name of the PDB file containing the monomer structure also needs to be specified. This is done as an argument to the mol command (line two of the script). Once all the parameters have been set up correctly, the script can be launched without starting the graphical interface using the command:<br>
<pre>vmd -disp none -e search.tcl &gt;&amp; vmdlog
</pre>
<br>The grid search uses only a single processor. However, as the search can be started in any place on the grid (albeit this is available for only one dimension), several runs covering non-overlapping areas of the grid can be started simultaneously.  
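
The grid size and the start and stop points for several simultaneous runs can be worked out with a small helper such as the Python sketch below. It is not part of the search scripts. It assumes that the minimum and maximum coordinates come from VMD's “measure minmax”, that the grid size is rounded up to a whole number of angstroms, and that the stopping point is inclusive, which should be checked against the search script itself. The function names and coordinates are hypothetical.

```python
import math

def grid_size(min_xyz, max_xyz):
    """Grid dimension: twice the longest side of the monomer, in whole angstroms."""
    longest = max(hi - lo for lo, hi in zip(min_xyz, max_xyz))
    return 2 * math.ceil(longest)

def split_range(n, jobs):
    """Split start points 0..n-1 into contiguous, non-overlapping (start, stop) pairs.

    The stop value is treated as inclusive here (an assumption).
    """
    base, extra = divmod(n, jobs)
    ranges, start = [], 0
    for k in range(jobs):
        length = base + (1 if k < extra else 0)
        if length == 0:
            continue  # more jobs than grid lines
        ranges.append((start, start + length - 1))
        start += length
    return ranges

# Hypothetical bounding box of a monomer, in angstroms.
n = grid_size((-12.3, -8.0, -15.1), (14.2, 9.5, 16.4))
for start, stop in split_range(n, 4):
    print(n, start, stop)
```

Each printed triple corresponds to the list of three numbers given to the gridsearch function in one copy of the script, with the first two numbers repeated in the call to the setup function.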


After the search has been completed, a file containing statistics on each model generated will be produced. Some of the statistics will be used to evaluate the model later on.  


=== <br>Part 3: Evaluating the models ===


The use of symmetry information from RDCs greatly reduces the number of possible interfaces and therefore the number of possible dimer models. However, the two-dimensional grid search can still generate ~1000 geometrically plausible models. An efficient and reliable way of evaluating them is still needed. For this purpose, we have chosen several important criteria to evaluate each model generated by the search.  


As the RDCs are the only experimental data used in the search, good agreement between calculated RDCs based on the predicted alignment of the model and experimental RDCs is essential. To predict RDCs based on steric alignment, the steric PALES mode of the PALES program has been used. PALES can predict the magnitude and orientation of the alignment given the concentration of the steric alignment media and the structure. As part of the grid search, predicted alignment of each model generated will be done automatically. The correlation between the predicted and experimental RDCs is outputted for every model in the “results.dat” file. However, as alignment is based specifically on shape of the model, multiple models may have similar predicted alignments. To further narrow down the number of possible models, residue pairing score for each model is also tabulated.  
As the RDCs are the only experimental data used in the search, good agreement between calculated RDCs based on the predicted alignment of the model and experimental RDCs is essential. To predict RDCs based on steric alignment, the steric PALES mode of the PALES program has been used. PALES can predict the magnitude and orientation of the alignment given the concentration of the steric alignment media and the structure. As part of the grid search, predicted alignment of each model generated will be done automatically. The correlation between the predicted and experimental RDCs is outputted for every model in the “results.dat” file. However, as alignment is based specifically on shape of the model, multiple models may have similar predicted alignments. To further narrow down the number of possible models, residue pairing score for each model is also tabulated.  
Line 34: Line 46:
Residue pairing score is a simple but reliable way of evaluating the validity of the proposed interaction surface between two proteins. The concept was propose by Moont et al. [4] and utilizes empirical statistics of the probability of a pair of amino acids being close to each other in the interface. It is extremely helpful in removing models that have non-probable interaction surface, which can not be recognized by RDC prediction or geometric filters. However, before running the program, the PDB models generated by the grid search needs to be processed. First, the two monomers should be placed in separate PDB files. Then each monomer PDB will then be preprocessed with the script “preprocess-pdb.per”. The parsed PDB file is now suitable for use as input to the program rpdock. Higher residue pairing score usually indicates the interface has high agreement with experimental norm.  
Residue pairing score is a simple but reliable way of evaluating the validity of the proposed interaction surface between two proteins. The concept was propose by Moont et al. [4] and utilizes empirical statistics of the probability of a pair of amino acids being close to each other in the interface. It is extremely helpful in removing models that have non-probable interaction surface, which can not be recognized by RDC prediction or geometric filters. However, before running the program, the PDB models generated by the grid search needs to be processed. First, the two monomers should be placed in separate PDB files. Then each monomer PDB will then be preprocessed with the script “preprocess-pdb.per”. The parsed PDB file is now suitable for use as input to the program rpdock. Higher residue pairing score usually indicates the interface has high agreement with experimental norm.  


Besides predicted RDC and residue pairing scores, other criteria such as van der waal energy of the model and the size of the binding surface can also be used as selection criteria. However, in our experience, predicted RDC and residue pairing score are the most selective. If NOE data is available, they can also serve as highly selective evaluators. Shape data from small angle X-ray scattering (SAXS) experiments can also be used as a replacement for back predicted RDCs. In fact, SAXS data may be more reliable than PALES-predicted RDCs since the shapes are directly measured by SAXS.
Besides predicted RDC and residue pairing scores, other criteria such as van der waal energy of the model and the size of the binding surface can also be used as selection criteria. However, in our experience, predicted RDC and residue pairing score are the most selective. If NOE data is available, they can also serve as highly selective evaluators. Shape data from small angle X-ray scattering (SAXS) experiments can also be used as a replacement for back predicted RDCs. In fact, SAXS data may be more reliable than PALES-predicted RDCs since the shapes are directly measured by SAXS.  
 
-- Main.XuWang - 18 Jul 2009

Latest revision as of 21:02, 6 January 2010

Introduction

Protein homo-oligomers occur frequently in nature, and oligomerization can significantly influence biological processes. According to Levy et al., a high percentage of proteins are capable of forming homo-oligomers [1]. This prevalence has made elucidation of homo-oligomer structures a major imperative for the structural genomics community. However, solving homo-oligomer structures has always been difficult with traditional NOE-based techniques in solution NMR, primarily because obtaining unambiguous intermolecular distance constraints from NOEs is limited by both sample preparation and NMR methodology. The traditional approach uses isotope filtering to distinguish intermolecular from intramolecular NOEs. This requires samples that contain a mixture of isotopically labeled and unlabeled protein. Owing to the statistics of oligomer association, only half of the protein forms oligomers containing both labeled and unlabeled components, which immediately results in a 50% loss in signal intensity. The isotope-filtering/editing experiments used to obtain the NOEs also suffer from low sensitivity because of the many filtering elements they contain. Furthermore, the nature of a homo-oligomer implies that the same set of residues is involved in all intermolecular NOEs. This increases the likelihood of intermolecular NOEs between sequential or identical residues, which in many situations lowers the number of unambiguous intermolecular NOEs.

The disadvantages of the NOE approach have motivated the development of a complementary NMR approach for solving homo-oligomer structures. The approach described here uses residual dipolar couplings (RDCs) to obtain the orientation of the symmetry axis relative to the monomer structure. This information is then used to model the dimer structure with a simple grid search algorithm that finds a suitable binding interface.

Theoretical Background

Dipolar interaction between spins is one of a few dominant interactions in NMR. In solution, the fast reorientation of the molecules usually averages the interaction to zero. However, if the molecules are forced to favor one particular orientation, as is the case for molecules immersed in liquid crystalline media or attached to a paramagnetic ion, the small imbalance in average orientation allows a small portion of the dipolar interaction to remain, adding an extra component to the observed splitting. This remaining component is referred to as the residual dipolar coupling, or RDC. Because the dipolar interaction is orientation dependent, the size of an RDC is related to the average orientation of the inter-nuclear vector relative to the magnetic field, and it offers information complementary to that of NOEs.

For each alignment medium, five parameters are required to determine the RDC of each atom pair. This information is usually expressed as a three-by-three symmetric and traceless matrix known as the alignment tensor. A frame of reference can be chosen in which the matrix is diagonal; this frame is usually referred to as the principal axis frame of the alignment tensor. The axes of this frame are unique in that the RDC values of inter-nuclear vectors related by a 180 degree rotation around any of the axes are identical. This symmetry property is exploited in our modeling of dimer structures: since a 180 degree rotation around the dimer symmetry axis leaves the complex unchanged, the RDC values of each molecule should also be invariant. This implies that the symmetry axis of a dimeric molecule must be parallel to one of the axes of the principal alignment frame. This is in fact true of all homo-oligomers with cyclic symmetry. Al-Hashimi et al. have shown that homo-oligomers with cyclic symmetry of order three or higher always align with the symmetry axis parallel to the non-degenerate axis of an axially symmetric alignment tensor [2]. In the case of a dimer, any of the three principal axes can be parallel to the symmetry axis. This ambiguity can only be resolved experimentally by aligning the protein in two non-degenerate alignment media. Because the symmetry axis adopts the same orientation relative to the monomers irrespective of the alignment medium, two non-degenerate alignment tensors should share a common axis, which identifies the orientation of the symmetry axis.
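The invariance described above can be checked numerically. The following is a minimal sketch, assuming the RDC of a unit inter-nuclear vector v is proportional to v^T S v for an alignment tensor S; the tensor values and the function names `rdc` and `rotate_180` are hypothetical and chosen only for illustration.

```python
import math

def rdc(v, s):
    """RDC (arbitrary units) of unit vector v under alignment tensor s: v^T S v."""
    return sum(v[i] * s[i][j] * v[j] for i in range(3) for j in range(3))

def rotate_180(v, axis):
    """Rotate v by 180 degrees about a coordinate axis (0 = x, 1 = y, 2 = z)."""
    return tuple(c if i == axis else -c for i, c in enumerate(v))

# Hypothetical traceless tensor written in its principal axis frame (diagonal).
S_PAF = ((-1.0e-4, 0.0, 0.0), (0.0, -3.0e-4, 0.0), (0.0, 0.0, 4.0e-4))

# Hypothetical tensor with an xy element: x and y are no longer principal axes.
S_TILTED = ((-1.0e-4, 2.0e-4, 0.0), (2.0e-4, -3.0e-4, 0.0), (0.0, 0.0, 4.0e-4))

norm = math.sqrt(1 + 4 + 9)
v = (1 / norm, 2 / norm, 3 / norm)  # arbitrary unit inter-nuclear vector

for axis in range(3):
    # In the principal axis frame the RDC is unchanged by every 180 degree rotation.
    print(axis, rdc(v, S_PAF), rdc(rotate_180(v, axis), S_PAF))

# Rotating about x, which is not a principal axis of S_TILTED, changes the RDC.
print(rdc(v, S_TILTED), rdc(rotate_180(v, 0), S_TILTED))
```

The second comparison shows why the dimer symmetry axis has to coincide with a principal axis: a two-fold rotation about any other direction would give the two subunits different RDCs.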

Although RDCs have been used in structure determination and validation for some time, the symmetry information contained in RDCs has rarely been exploited. This is especially unfortunate for homo-oligomer structure elucidation, where global orientation information can help greatly in the interpretation of intermolecular NOEs, if not determine the structure outright. Our protocol first determines the orientation of the symmetry axis of the oligomer using the relationship between the alignment tensor axes and the oligomer symmetry axis outlined above. This information greatly restricts the possible positions of the dimer interface and allows it to be determined using only a simple two-dimensional grid search.

It is prudent to mention at this point the prerequisites for successfully executing the protocol. Since the alignment tensor must be known to place the symmetry axis, a correct monomer structure must be available. If a monomer structure is not available because the NOE data are contaminated with intermolecular NOEs, fragments of the structure whose conformation is certain can be used in isolation to calculate the orientation of the alignment tensor. In the case of a dimer, two sets of RDCs from non-degenerate alignment media are required to determine the symmetry axis unambiguously.

XPLOR-NIH and dimer structure calculation

XPLOR-NIH has extensive support for RDC-assisted structure refinement. However, the RDC potential implemented in XPLOR-NIH does not use the dimer symmetry axis orientation information embedded in the RDCs. Therefore, refining structures with RDCs in XPLOR-NIH will not result in correct oligomer structures.

Applicability of the algorithm to weak dimers

Although the scheme was first envisioned for strong dimers, there is increasing evidence that the protocol may be even more valuable for solving weak dimer structures. It is usually very difficult to obtain intermolecular NOEs for weak dimers through X-filtered experiments and, as a consequence, the probability of obtaining an erroneous monomer structure is also reduced. However, because the solution usually contains a mixture of monomer and dimer species, the measured RDCs contain contributions from both monomer and dimer. To obtain an accurate measurement of only the dimer component of the RDC, the dissociation constant of the dimerization process needs to be known, and RDC measurements have to be made on multiple samples with different concentrations. This allows the value of the dimer RDC to be extrapolated based on the linear relationship between the fraction of dimer in solution and the RDC value. Although the procedure is tedious, it can provide information that is otherwise not obtainable.
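The extrapolation described above can be sketched as follows. This is only an illustration under stated assumptions: a simple two-state equilibrium 2M <-> D with Kd = [M]^2/[D], fast exchange so that the observed RDC is the population-weighted average, and concentrations given in the same units as Kd. The function names are hypothetical and are not part of the protocol's scripts.

```python
import math

def dimer_fraction(total_conc, kd):
    """Fraction of protein chains in the dimer for 2M <-> D with Kd = [M]^2/[D].

    total_conc is the total monomer-equivalent concentration, [M] + 2[D].
    """
    # Solve 2[M]^2 + Kd[M] - Kd * total_conc = 0 for the free monomer [M].
    monomer = (-kd + math.sqrt(kd * kd + 8.0 * kd * total_conc)) / 4.0
    return 1.0 - monomer / total_conc

def extrapolate_dimer_rdc(fractions, rdcs):
    """Fit a least-squares line to (dimer fraction, RDC) and evaluate it at fraction 1."""
    n = len(fractions)
    mean_f = sum(fractions) / n
    mean_d = sum(rdcs) / n
    sxx = sum((f - mean_f) ** 2 for f in fractions)
    sxy = sum((f - mean_f) * (d - mean_d) for f, d in zip(fractions, rdcs))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_f
    return slope + intercept  # value of the fitted line at fraction = 1

# Hypothetical example: one coupling measured at three protein concentrations.
kd = 0.5  # mM, assumed known
concentrations = [0.2, 0.5, 1.5]  # mM
fractions = [dimer_fraction(c, kd) for c in concentrations]
observed = [3.1, 4.4, 6.0]  # Hz, made-up values
print(extrapolate_dimer_rdc(fractions, observed))
```

The fit has to be repeated for every coupling, and at least two concentrations with clearly different dimer fractions are needed for the line to be defined.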

Implementation of the grid search

Once the symmetry axis of the oligomer has been identified, the dimer models can be constructed with a simple grid search. Computational modeling of dimer structures can be implemented with several approaches. In the absence of the symmetry axis information, the search is three-dimensional and contains a large number of possibilities. With the symmetry information, the search is restricted to a simple two-dimensional search on the plane orthogonal to the symmetry axis. This is because movements parallel to the axis would break the symmetry and therefore cannot be allowed if the orientation of the symmetry axis is to be preserved.
The grid search can be implemented with a variety of programs. The implementation used in this protocol uses the program VMD, which contains an extensive set of functions for conveniently manipulating PDB coordinates and is capable of calling outside programs to perform molecular dynamics and simulate RDCs. During the grid search, the PDB coordinates of the monomer in the alignment tensor frame are first imported and placed at the origin. An identical copy of the molecule is then made and rotated by 180 degrees around the symmetry axis. The copy is then positioned at regularly spaced points in the plane perpendicular to the symmetry axis. At each grid point, a preliminary analysis of the relation between the two molecules is carried out to judge whether the model is a realistic dimer. The analysis consists mostly of calculating the distance between the two molecules to ensure that there are no egregious clashes between them, yet they are close enough to form an intermolecular interaction surface. Models judged to be plausible undergo a round of molecular dynamics involving only residues at the interface. This improves the interfacial contact if the initial side chain conformation of the monomer is not optimal. All suitable models produced by the search are then evaluated based on the suitability of the intermolecular interface and, if PEG RDCs are available, the agreement between the RDCs predicted from the dimer structure and the experimental RDCs.
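The geometric core of the search can be sketched in a few lines. This is not the VMD implementation used in the protocol; it is a minimal illustration that assumes the symmetry axis is the z axis of the alignment frame, represents the monomer as a plain list of (x, y, z) coordinates, and uses hypothetical distance thresholds (`clash`, `contact`) in place of the protocol's own plausibility checks. The interface molecular dynamics step is omitted.

```python
import math

def rotate_180_about_z(coords):
    """Rotate coordinates by 180 degrees about the z axis."""
    return [(-x, -y, z) for x, y, z in coords]

def min_distance(a, b):
    """Shortest distance between any point in a and any point in b."""
    return min(math.dist(p, q) for p in a for q in b)

def grid_search(monomer, size, spacing=1.0, clash=3.0, contact=5.0):
    """Return (dx, dy) offsets for which the rotated copy neither clashes nor drifts away.

    size is the edge length of the square search area; clash and contact are
    hypothetical thresholds in the same units as the coordinates.
    """
    partner = rotate_180_about_z(monomer)
    half = size / 2.0
    steps = int(size / spacing) + 1
    hits = []
    for i in range(steps):
        for j in range(steps):
            dx = -half + i * spacing
            dy = -half + j * spacing
            # Translate only within the xy plane so the two-fold axis stays parallel to z.
            moved = [(x + dx, y + dy, z) for x, y, z in partner]
            if clash <= min_distance(monomer, moved) <= contact:
                hits.append((dx, dy))
    return hits

# Toy "monomer" of three points, for illustration only.
toy = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 1.0)]
print(len(grid_search(toy, size=20.0)))
```

A 180 degree rotation about z followed by an in-plane translation (dx, dy) is itself a two-fold rotation about an axis parallel to z passing through (dx/2, dy/2), which is why every grid point yields a symmetric dimer.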

Experimental Procedure


Part 1: Obtaining the symmetry axis orientation

Information regarding the orientation of the symmetry axis relative to the monomers is embedded in the RDCs. To obtain this information, the structure of the monomer is required. Using the structure, the orientation of the alignment tensor can be calculated from the RDCs. The most convenient way of calculating the alignment tensor is to use the program REDCAT [3]. Please refer to the “RDC Screening and Structure Refinement” section of the TWiki for details on how to prepare input files for REDCAT and how to run it. For dimers, RDCs from two non-degenerate media are required.

Once the alignment tensors have been calculated, use the plot feature of REDCAT to plot the Sanson-Flamsteed projection of the tensor orientation (Tools->Plot->2D SF Plot). Trimers and higher-order oligomers should have an axially symmetric alignment tensor, with the non-degenerate axis being the symmetry axis. For dimers, plot the two alignment tensors on the same graph. The two tensor orientations should be related to each other by a rotation around a stationary axis. This stationary axis is the symmetry axis of the dimer.
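As a numerical complement to the visual comparison, the shared axis can be located by comparing the principal axes of the two tensors directly. The sketch below assumes each tensor's principal axes are available as three vectors expressed in the same molecular frame; how these are exported from REDCAT is not covered here, and the function names are hypothetical.

```python
import math

def angle_between_axes(a, b):
    """Angle in degrees between two axes, ignoring their direction (sign)."""
    dot = abs(sum(x * y for x, y in zip(a, b)))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return math.degrees(math.acos(min(1.0, dot / norm)))

def shared_axis(frame1, frame2):
    """Return (i, j, angle) for the closest pair of principal axes from two frames."""
    return min(
        ((i, j, angle_between_axes(a, b))
         for i, a in enumerate(frame1)
         for j, b in enumerate(frame2)),
        key=lambda item: item[2],
    )

# Hypothetical frames: the second is the first rotated by 40 degrees about z.
t = math.radians(40.0)
frame_a = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
frame_b = [(math.cos(t), math.sin(t), 0.0), (-math.sin(t), math.cos(t), 0.0), (0.0, 0.0, -1.0)]
print(shared_axis(frame_a, frame_b))
```

With experimental data the smallest angle will not be exactly zero; how small it must be to count as a shared axis depends on the uncertainty of the tensor orientations.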

Once the symmetry axis has been identified, the PDB coordinates of the structure in the alignment tensor frame can be obtained with the rotate-PDB function in REDCAT, keeping in mind that the angles supplied by REDCAT apply only to the structure from which the alignment tensor was calculated.


Part 2: Grid search

The grid search can be implemented using virtually any program. The current approach uses VMD, although an XPLOR-NIH-based approach is probably more efficient and convenient. VMD is usually considered only a visualization program. However, it includes two embedded scripting language interfaces (Tcl and Python), through which graphical elements as well as data such as PDB coordinates can be manipulated very easily. The current grid search is implemented using the Tcl interface, which means the script that executes the grid search consists of a series of Tcl commands. More information on the Tcl commands provided by VMD can be found at the VMD website (www.ks.uiuc.edu/Research/vmd).

To set up the grid search, several inputs need to be prepared:
1. PDB of the monomer in the alignment tensor frame.
2. Steric alignment RDCs of the monomers in PALES format.
3. The dimensions of the monomer in angstroms.
4. Configuration files for NAMD.
5. The oma.tcl and search.tcl scripts.

The first two files can be prepared fairly easily. Please consult the PALES documentation for the format of its input files. The dimensions of the molecule can be found in VMD using the “measure minmax” function. A number of other programs are also capable of giving the same information. The oma.tcl and search.tcl scripts are attached to the end of this entry.
Two external programs are also called during the grid search. NAMD is used for the binding surface MD/energy minimization. PALES is used to predict steric-alignment RDCs. NAMD also requires a configuration file as well as the parameter file for the force field used in the MD calculation. Furthermore, to create the correct dimer PDB, the VMD plugin psfgen must be installed and working correctly. Please consult the VMD documentation for instructions on how to install psfgen.
The best practice is to gather all needed files and binaries for NAMD and PALES in one directory and run each grid search in its own directory so that output files are kept separate.
In the case of dimers, any of the three principal axes may be the symmetry axis. However, the search.tcl script is axis-specific. To accommodate the possibility that any of the three axes can be the symmetry axis, three versions of the search.tcl file are included. They are named xsearch.tcl, ysearch.tcl and zsearch.tcl. The first letter in each name specifies the axis that is the symmetry axis. Use the file that is appropriate for the situation.

Before launching the grid search, one parameter in the search script needs to be adjusted. Because the grid search covers a square area with grid points that are one angstrom apart in both dimensions, the size of the search plane needs to be specified in the search script. This needs to be done in two places: as an argument to the setup function and as an argument to the gridsearch function. Note that there are axis-specific forms of both the setup and gridsearch functions. The arguments of the setup function consist of the molecule ID of the monomer structure and a list of two numbers. The first number specifies the dimension of the search area, which should be set to twice the length of the longest side of the protein. The second number specifies the place on the grid at which to start the search. If n is specified as the size of the grid, then the starting point must be between 0 and n-1. The gridsearch function also takes two arguments. Its second argument is a list of three numbers: the first is the size of the grid, the second is the starting point and the third is the stopping point. The first two numbers must be identical to the two numbers passed to the setup function. The name of the PDB file containing the monomer structure also needs to be specified. This is done as an argument to the mol command (line two of the script). Once all the parameters have been set up correctly, the script can be launched without starting the graphical interface using the command:

vmd -disp none -e search.tcl >& vmdlog


The grid search uses only a single processor. However, because the search can be started at any place on the grid (although the starting point can be set along only one dimension), several runs covering non-overlapping areas of the grid can be started simultaneously.
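The bookkeeping for the grid size and for dividing the grid among simultaneous runs can be sketched as below. This is an illustration only: the helper names are hypothetical, the coordinates stand in for the output of VMD's “measure minmax”, and treating the stopping point as exclusive is an assumption that should be checked against the gridsearch function in the actual script.

```python
import math

def grid_size(coords):
    """Grid dimension: twice the longest side of the bounding box, rounded up."""
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    longest = max(hi - lo for lo, hi in zip(mins, maxs))
    return math.ceil(2.0 * longest)

def split_rows(n, runs):
    """Split grid positions 0..n-1 into contiguous, non-overlapping (start, stop) ranges.

    stop is exclusive here (an assumption about the script's convention).
    """
    base, extra = divmod(n, runs)
    ranges = []
    start = 0
    for r in range(runs):
        stop = start + base + (1 if r < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

# Hypothetical monomer extents in angstroms and four simultaneous runs.
corners = [(-12.0, -9.5, -15.0), (14.0, 10.5, 17.5)]
n = grid_size(corners)
for start, stop in split_rows(n, 4):
    print(n, start, stop)
```

Each (size, start, stop) triple would then be entered into its own copy of the search script, with every copy run in a separate directory so that the output files do not collide.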

After the search has been completed, a file containing statistics on each generated model is produced. Some of these statistics are used later to evaluate the models.


Part 3: Evaluating the models

The use of symmetry information from RDCs greatly reduces the number of possible interfaces and therefore the number of possible dimer models. However, the two-dimensional grid search can still generate ~1000 geometrically plausible models. An efficient and reliable way of evaluating them is still needed. For this purpose, we have chosen several important criteria to evaluate each model generated by the search.

As the RDCs are the only experimental data used in the search, good agreement between the experimental RDCs and the RDCs calculated from the predicted alignment of the model is essential. To predict RDCs based on steric alignment, the steric alignment mode of the PALES program is used. PALES can predict the magnitude and orientation of the alignment given the structure and the concentration of the steric alignment medium. As part of the grid search, the alignment of each generated model is predicted automatically. The correlation between the predicted and experimental RDCs is written for every model to the “results.dat” file. However, because the alignment is based specifically on the shape of the model, multiple models may have similar predicted alignments. To further narrow down the number of possible models, the residue pairing score of each model is also tabulated.
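For readers who want to reproduce the agreement measure outside the search script, a correlation between predicted and experimental RDCs can be computed as in the sketch below. It assumes a Pearson correlation coefficient; the text does not state which correlation measure is written to “results.dat”, so this is an illustration and not a description of the script's output.

```python
import math

def pearson_r(predicted, experimental):
    """Pearson correlation coefficient between two equally long lists of RDCs."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    return cov / math.sqrt(var_p * var_e)

# Made-up RDC values in Hz for five residues.
predicted = [4.2, -7.9, 12.5, 0.3, -3.1]
experimental = [3.8, -6.5, 11.0, 1.2, -4.0]
print(pearson_r(predicted, experimental))
```

A correlation coefficient is insensitive to an overall scaling of the predicted couplings, which is convenient when the magnitude of the predicted alignment is less reliable than its orientation.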

The residue pairing score is a simple but reliable way of evaluating the validity of a proposed interaction surface between two proteins. The concept was proposed by Moont et al. [4] and uses empirical statistics on the probability of a pair of amino acids being close to each other in an interface. It is extremely helpful in removing models with improbable interaction surfaces, which cannot be recognized by RDC prediction or geometric filters. However, before running the program, the PDB models generated by the grid search need to be processed. First, the two monomers should be placed in separate PDB files. Each monomer PDB is then preprocessed with the script “preprocess-pdb.per”. The parsed PDB file is now suitable for use as input to the program rpdock. A higher residue pairing score usually indicates that the interface agrees well with empirical norms.
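The first processing step, placing the two monomers in separate PDB files, can be sketched as follows. The sketch assumes the two monomers carry different chain identifiers in column 22 of the ATOM records; if the dimer PDB written by the search distinguishes them only by segment name, the column used for grouping has to be changed accordingly. The function name and output file names are hypothetical.

```python
def split_by_chain(pdb_lines):
    """Group ATOM/HETATM records by the chain identifier in column 22."""
    chains = {}
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chains.setdefault(line[21], []).append(line)
    return chains

def write_monomers(dimer_path, prefix="monomer"):
    """Write one PDB file per chain and return the file names written."""
    with open(dimer_path) as handle:
        chains = split_by_chain(handle.readlines())
    names = []
    for chain_id, records in sorted(chains.items()):
        name = f"{prefix}_{chain_id.strip() or 'blank'}.pdb"
        with open(name, "w") as out:
            out.writelines(records)
            out.write("END\n")
        names.append(name)
    return names
```

Each file written this way would then be passed through the preprocessing script before being given to rpdock.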

Besides predicted RDCs and residue pairing scores, other criteria such as the van der Waals energy of the model and the size of the binding surface can also be used for selection. However, in our experience, predicted RDCs and the residue pairing score are the most selective. If NOE data are available, they can also serve as highly selective evaluators. Shape data from small-angle X-ray scattering (SAXS) experiments can also be used as a replacement for back-predicted RDCs. In fact, SAXS data may be more reliable than PALES-predicted RDCs, since the shapes are directly measured by SAXS.


References

1. Levy, E.D., Pereira-Leal, J.B., Chothia, C., and Teichmann, S.A. 3D Complex: A structural classification of protein complexes. PLoS Computational Biology, 2006. 2(11): p. 1395-1406. PMID: 17112313.
2. Al-Hashimi, H.M., Bolon, P.J., and Prestegard, J.H. Molecular symmetry as an aid to geometry determination in ligand protein complexes. Journal of Magnetic Resonance, 2000. 142(1): p. 153-158. PMID: 10617446.
3. Valafar, H. and Prestegard, J.H. REDCAT: a residual dipolar coupling analysis tool. Journal of Magnetic Resonance, 2004. 167(2): p. 228-241. PMID: 15040978.
4. Moont, G., Gabb, H.A., and Sternberg, M.J.E. Use of pair potentials across protein interfaces in screening predicted docked complexes. Proteins: Structure, Function, and Genetics, 1999. 35(3): p. 364-373. PMID: 10328272.