Structure Calculation Using CS-Rosetta

From NESG Wiki

Revision as of 17:29, 6 November 2009

Introduction

The CS-ROSETTA approach (Ref. 1,2) combines the Monte Carlo based structure assembly program ROSETTA with empirical structural information obtained from backbone and 13Cβ chemical shift data. The CS-ROSETTA protocol can successfully predict 3D structures of proteins up to 15 kDa in size (Ref. 1). A complete description of the program, along with downloads, is available from the Bax laboratory web site:

http://spin.niddk.nih.gov/bax/software/CSROSETTA/index.html


Random Coil Index Prediction

Perform flexible region prediction on the RCI Web Page (http://wishart.biology.ualberta.ca/rci/cgi-bin/rci_cgi_1_e.py).

RCI accepts a BMRB file in the old format, as produced by CYANA 1.0 (the new BMRB format has an extra "chain" column). Unlike in an AutoStructure input file, the sequence field should be left in place.

Use an init.cya file:

    name:=XXXX            # Replace XXXX with NESG ID
    cyanalib              # Read the standard library
    pseudo=2              # Allows HB, HD, etc. pseudoatom names, use with CARA
    read seq $name        # Initialize

If you are using a proton list from CARA, first convert it to "dyana" format with CYANA 2.1:

    read prot XXXX.prot
    pseudo=0
    translate dyana
    write prot XXXX_dyana

Use CYANA 1.0.5 to prepare a BMRB file:

    read prot XXXX_dyana.prot
    bmrblist XXXX.bmrb

Make sure you change the _Chem_shift_ambiguity_type tag to _Chem_shift_ambiguity_code. Yes, it is stupid, but RCI will report an error if you don't do it.
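The tag rename can be scripted instead of edited by hand. The following is a minimal sketch, assuming the BMRB file is plain text and the tag name appears only where it should be replaced; the function names are illustrative and not part of CYANA or RCI.

```python
from pathlib import Path

OLD_TAG = "_Chem_shift_ambiguity_type"
NEW_TAG = "_Chem_shift_ambiguity_code"


def fix_ambiguity_tag(text: str) -> str:
    """Return the BMRB text with the ambiguity tag renamed for RCI."""
    return text.replace(OLD_TAG, NEW_TAG)


def fix_bmrb_file(path: str) -> bool:
    """Rewrite the file in place; return True if anything was changed."""
    bmrb = Path(path)
    original = bmrb.read_text()
    fixed = fix_ambiguity_tag(original)
    if fixed != original:
        bmrb.write_text(fixed)
        return True
    return False
```

For example, fix_bmrb_file("XXXX.bmrb") would update the file written by bmrblist above.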

Flexible N- and C-terminal tails should be removed before the CS-ROSETTA calculation to reduce CPU time. Flexible loop regions will later be excluded from the calculation of the all-atom energy.
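Choosing where to cut the tails can be sketched as follows. This is an illustration only: it assumes the RCI results have already been collected by hand into a list of (residue number, order parameter) pairs, and the 0.7 cutoff is an arbitrary placeholder rather than a recommended value. The function name is hypothetical.

```python
def ordered_core(order_params, threshold=0.7):
    """Return (first, last) residue numbers of the ordered core.

    order_params: list of (residue_number, order_parameter) pairs,
    sorted by residue number. The threshold is an assumed cutoff;
    residues below it at either end are treated as flexible tails.
    Flexible loops inside the core are kept.
    """
    ordered = [res for res, s2 in order_params if s2 >= threshold]
    if not ordered:
        raise ValueError("no residue reaches the threshold")
    return ordered[0], ordered[-1]
```

Only the terminal residues outside the returned range would be trimmed; internal flexible loops stay in the sequence, since they are handled later when the all-atom energy is calculated.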

Generating MFR fragments on U2 cluster at SUNY Buffalo

Copy the runCSRjob.com file into the working directory and change the number of fragments to be generated.

Type qsub runCSRjob.pbs to submit the job to the queue. This calculation takes ~2 hours for 1000 fragments for a small protein, so it must not be run on a master node.

Running CS-Rosetta on U2 cluster at SUNY Buffalo

Go to the rosetta subdirectory. Figure out how many parallel Rosetta jobs you will need to run. Things to consider are:

  • The total number of fragments to be calculated
  • The maximum wall-time for a single job is 72 h
  • It takes ~10 min to calculate a single structure of a small protein on a single CPU


Type ./runRosetta.csh N, where N is the number of parallel Rosetta jobs.
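The considerations listed above give a lower bound on N. The sketch below assumes the total refers to the number of structures to be generated, and uses the ~10 min per structure and 72 h wall-time figures as defaults; both should be adjusted for larger proteins or different queue limits. The function name is illustrative.

```python
import math


def parallel_jobs_needed(total_structures,
                         minutes_per_structure=10.0,
                         max_walltime_hours=72.0):
    """Smallest number of parallel jobs that fits within the wall-time limit."""
    if total_structures <= 0:
        raise ValueError("total_structures must be positive")
    # How many structures a single job can finish before it is killed.
    per_job = math.floor(max_walltime_hours * 60 / minutes_per_structure)
    if per_job < 1:
        raise ValueError("one structure does not fit in the wall-time limit")
    return math.ceil(total_structures / per_job)
```

With the default figures a single job can complete 432 structures, so parallel_jobs_needed(10000) gives 24. This is a minimum; requesting more jobs shortens the overall run and leaves a margin for slower nodes.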



-- AlexEletski - 17 Apr 2008

-- Updated by JimAramini - Nov 2009