HarvestDB

From NESG Wiki

Introduction

HarvestDB is an integrated web-based tool that NESG users use to deposit and archive NMR and X-Ray protein structures. All NESG structures must be processed and archived through HarvestDB.

The front end of HarvestDB consists of Java-based servlet pages, and the back end is the SPiNE database.

HarvestDB has the following major functions:

      A. Archive files
      B. Version tracking
      C. PSVS analysis
      D. Deposit to BMRB/PDB
      E. Update SPiNE and Structure Gallery


PDB and BMRB deposition via HarvestDB

Log in to HarvestDB and select the option to deposit an NMR project.

The BMRB file is the .bmrb file from CYANA. The chemical shift file is the same one used for AutoStructure.

User Guideline for NMR Structures (Aug. 2007)

    1.  Submit NMR Protein Structure Information and Files to HarvestDB to Create Protein Record

             a. Open web form -> Deposit NMR
             b. Complete web form: NESG target id, Protein id, version id, Swissprot id, total number of structures, NMR comments.
             c. Complete web form: Coordinates, constraint lists, chemical shift, NOESY peak lists
             d. After the web form is completed, HarvestDB emails the user a link to the new structure record
             e. HarvestDB generates protein pictures: Small (80 by 80), Big static (300 by 300), Big dynamic (300 by 300)
             f. HarvestDB pulls author list from SPiNE
             g. HarvestDB sets up the NMR id, construct id, and batch id from the input PST id
             h. Users can update structure information and NMR files through HarvestDB

     2.  Run PSVS, RPF Analysis through HarvestDB

             a. http://www-nmr.cabm.rutgers.edu/PSVS/
             b. HarvestDB sends Target id, Protein id, Coordinates, Constraint Lists to PSVS
             c. HarvestDB receives a zipped PSVS report from PSVS, parses the zipped HTML file to get the Z-scores, and emails the user a notification
             d. For optimal structures all Z-scores should be > -5
             e. For optimal structures DPF should be > 0.7
             f. HarvestDB compares Z-scores with previous NESG structure quality by scatter plots
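
The thresholds in steps d and e can be checked mechanically once the scores are available. The following is a minimal sketch, not HarvestDB's actual code: the function name, the dictionary input, and the score labels are assumptions made for illustration. Only the two cut-offs (Z-score > -5, DPF > 0.7) come from the guideline.

```python
Z_SCORE_MIN = -5.0  # step d: every Z-score should be greater than this
DPF_MIN = 0.7       # step e: DPF should be greater than this


def check_nmr_quality(z_scores, dpf):
    """Return a list of problems; an empty list means both thresholds pass.

    z_scores: dict mapping a score label (hypothetical naming) to its Z-score.
    dpf: the DPF value reported by the RPF analysis.
    """
    problems = []
    for label, z in z_scores.items():
        if not z > Z_SCORE_MIN:
            problems.append(f"{label} Z-score {z} is not > {Z_SCORE_MIN}")
    if not dpf > DPF_MIN:
        problems.append(f"DPF {dpf} is not > {DPF_MIN}")
    return problems
```

An empty result means the structure meets both criteria; otherwise each entry names the score that needs attention before deposition.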

    3.  NMR Structure File Version Tracking

             a. HarvestDB duplicates current files and information to create newer version
             b. After refinement, users update the files; HarvestDB tracks the date and notes for each version
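
The version-tracking behaviour described above (duplicate the current record, then update the copy) can be sketched as follows. This is an illustration under assumptions: the `StructureVersion` class, its fields, and the `new_version` helper are hypothetical names, not part of HarvestDB.

```python
import copy
from dataclasses import dataclass, field
from datetime import date


@dataclass
class StructureVersion:
    version: int
    created: date
    notes: str
    files: dict = field(default_factory=dict)  # file role -> file name (hypothetical)


def new_version(history, notes, today=None):
    """Duplicate the latest version into a new one with the next version number."""
    latest = history[-1]
    duplicate = StructureVersion(
        version=latest.version + 1,
        created=today or date.today(),
        notes=notes,
        files=copy.deepcopy(latest.files),  # copy so older versions stay unchanged
    )
    history.append(duplicate)
    return duplicate
```

Because the file table is deep-copied, replacing a file in the new version leaves every earlier version intact, which is the point of keeping a history.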

    4.  Prepare NMRStar File and Coordinate file (mmCIF) through HarvestDB

             a. HarvestDB pulls information from SPiNE and Swissprot site
             b. HarvestDB collects information about molecular entity sequence, contact authors, title, citation, molecule, synthetic, sample conditions, spectrometer, experiment
             c. HarvestDB generates NMRStar file and Coordinate file (by using pdb_extract)

     5.  HarvestDB Runs through BMRB to Initiate or Update Auto-deposition

             a. Send info: Submitter info, PI info
             b. Send files: Coordinates, Constraint Lists and NMRStar file
             c. HarvestDB receives BMRB and PDB id, deposition date, deposition status from BMRB
             d. For successful deposition: HarvestDB updates SPiNE to create the NMR record and sends a notification email to the user and PI
             e. For error deposition: HarvestDB asks user to modify and re-deposit
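
Steps c to e amount to a two-way branch on the deposition status. The sketch below shows that branch only; the `DepositionReply` shape and the status strings "ok" and "error" are assumptions, since the guideline does not specify how BMRB reports the status.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DepositionReply:
    status: str                            # hypothetical values: "ok" or "error"
    bmrb_id: Optional[str] = None
    pdb_id: Optional[str] = None
    deposition_date: Optional[str] = None
    message: str = ""


def next_actions(reply):
    """Map a deposition reply to the follow-up actions listed in steps d and e."""
    if reply.status == "ok":
        return ["update SPiNE NMR record", "notify user", "notify PI"]
    return ["ask user to modify and re-deposit"]
```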

     6.  HarvestDB Updates Structure Gallery

             a. http://nmr.cabm.rutgers.edu:9090/gallery/jsp/Gallery.jsp
             b. Fix Header: HarvestDB asks the user to fix the Title, Protein Name (no "Hypothetical") and Author list (Last Author / PI name) of the coordinate file
             c. Fix Protein Pictures
             d. Send info: BMRB and PDB id
             e. Send files: Three Pictures, Coordinates, Constraints, NMRStar file, Zipped PSVS report
             f. Structure Gallery returns structure link to HarvestDB
             g. HarvestDB sends a notification email to the user and PI
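
The header rules in step b can be expressed as a small check. This sketch reads "Last Author / PI name" as "the last author should be the PI", which is an interpretation of the guideline, and the function and parameter names are hypothetical.

```python
def header_problems(protein_name, authors, pi_name):
    """Return a list of header problems; an empty list means the header passes."""
    problems = []
    # Rule from step b: the protein name must not say "Hypothetical".
    if "hypothetical" in protein_name.lower():
        problems.append("protein name must not contain 'Hypothetical'")
    # Assumed reading of "Last Author / PI name": the PI is listed last.
    if not authors or authors[-1] != pi_name:
        problems.append("last author should be the PI")
    return problems
```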

 

User Guideline for X-Ray Structures (Aug. 2007)

    1.  Submit X-Ray Protein Structure Information and Files to HarvestDB to Create Protein Record

             a. Open web form -> Deposit X-Ray
             b. Complete web form: NESG target id, Protein id, version id, Swissprot id, X-Ray comments.
             c. Complete web form: Coordinates, X-Ray Files (Reflection data, Protein Phasing, Scaling Data, Molecular Replacement Data, Density Modification Data)
             d. After the web form is completed, HarvestDB emails the user a link to the new structure record
             e. HarvestDB generates protein pictures: Small (80 by 80), Big static (300 by 300), Big dynamic (300 by 300)
             f. HarvestDB pulls author list from SPiNE
             g. HarvestDB sets up the X-Ray id, construct id, and batch id from the input PST id
             h. Users can update structure information and X-Ray files through HarvestDB

    2.  Run PSVS Analysis through HarvestDB

             a. http://www-nmr.cabm.rutgers.edu/PSVS/
             b. HarvestDB sends Target id, Protein id, Coordinates to PSVS
             c. HarvestDB receives a zipped PSVS report from PSVS, parses the zipped HTML file to get the Z-scores, and emails the user a notification
             d. For optimal structures all Z-scores should be > -5
             e. HarvestDB compares Z-scores with previous NESG structure quality by scatter plots
             f. HarvestDB compares Resolution, R-factor with previous NESG structure quality by scatter plots
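
Step c involves reading Z-scores out of a zipped HTML report. The sketch below shows one way to do this with the Python standard library. The report layout is an assumption: it expects text of the form "SomeTool Z-score: -1.23" inside an HTML member of the archive, which may not match the real PSVS report, and the function name is hypothetical.

```python
import io
import re
import zipfile

# Hypothetical layout: "<label> Z-score: <number>" somewhere in the HTML text.
Z_LINE = re.compile(r"([A-Za-z0-9_ ]+?)\s+Z-score\s*:\s*(-?\d+(?:\.\d+)?)")


def z_scores_from_zip(zip_bytes):
    """Collect {label: z_score} from every HTML member of a zipped report."""
    scores = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith((".html", ".htm")):
                continue  # skip images and other non-HTML members
            html = archive.read(name).decode("utf-8", errors="replace")
            text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping
            for label, value in Z_LINE.findall(text):
                scores[label.strip()] = float(value)
    return scores
```

The resulting dictionary can then be compared against the > -5 criterion in step d. A real parser would need to follow the actual report structure rather than a regular expression.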

    3.  X-Ray Structure File Version Tracking

             a. HarvestDB duplicates current files and information to create newer version
             b. After refinement, users update the files; HarvestDB tracks the date and notes for each version

    4.  Prepare Coordinate and Structure Factor file (mmCIF) through HarvestDB

             a. HarvestDB pulls information from SPiNE and Swissprot site
             b. HarvestDB collects information about molecular entity sequence, contact authors, title, citation, molecule, synthetic, sample conditions, spectrometer, experiment
             c. HarvestDB generates Coordinate and Structure Factor files (by using pdb_extract) in mmCIF format
             d. http://pdb-extract.rcsb.org/

    5.  HarvestDB Runs through PDB to Initiate or Update Auto-deposition

             a. Send info: Submitter info, PI info
             b. Send files: Coordinates and Structure Factor file
             c. HarvestDB receives PDB id, deposition date, deposition status from PDB
             d. For successful deposition: HarvestDB updates SPiNE to create the X-Ray record and sends a notification email to the user and PI
             e. For error deposition: HarvestDB asks user to modify and re-deposit

    6.  HarvestDB Updates Structure Gallery

             a. http://nmr.cabm.rutgers.edu:9090/gallery/jsp/Gallery.jsp
             b. Fix Header: HarvestDB asks the user to fix the Title, Protein Name (no "Hypothetical") and Author list (Last Author / PI name) of the coordinate file
             c. Fix Protein Pictures
             d. Send info: PDB id
             e. Send files: Three Pictures, Coordinates, Structure Factor, Zipped PSVS report
             f. Structure Gallery returns structure link to HarvestDB
             g. HarvestDB sends a notification email to the user and PI