Wednesday, March 3, 2021

Second online meeting on the NMRlipids databank

The second online meeting on the NMRlipids Databank was held on 26 February 2021. The slides presented at the beginning of the meeting are here. The main questions discussed were what information should be indexed in the databank and which properties should be analyzed automatically from the simulations.

Although the beta version of the databank still exists in the NMRlipids IV project repository, a dedicated repository for the NMRlipids Databank is now available. Further development of the databank will be done in this repository, while the beta version will be used for applications related to the NMRlipids IV project.

A publication introducing the databank with some highlight applications will be prepared (discussion opened in this issue). The original NMRlipids authorship rules will apply to the first publication of the databank, with two exceptions: Samuli Ollila will be the last author and Anne Kiirikki will be the first. The authorship rules for future publications on updates to the databank are yet to be discussed and decided.

The outcomes of the discussion on 26 February 2021 were:

  1. It was decided to include the following optional inputs from contributors (in addition to the current indexing information listed in the slides): PUBLICATION INFORMATION, SOFTWARE VERSION, AUTHOR INFORMATION AND CONTACT, CPT FILE, LOG FILE, and TOP FILE. In addition, the date on which the AddData.py script is run will be saved automatically in the index files. DONE
  2. The input file format for AddData.py will be changed to yaml. DONE
  3. Inclusion of the information on lipid isomers into the databank was discussed, but practical solutions are not yet clear. Further discussions will be in this issue.
  4. Automatic methods to determine how equilibrated the contributed trajectories are (based, e.g., on area per lipid and PCA analysis) will be investigated. Further discussion will be in this issue.
  5. Consistency of the indexed data will be checked by comparing the total number of atoms between the databank and the actual simulation. Some other potential sanity checks were also discussed. Further discussion will be in this issue.
  6. Properly functioning code for the form factors is needed in the near future. Further discussion will be in this issue.
  7. The databank will also be made functional with at least OpenMM and Amber data before the first publication. Further discussion will be in this issue.
In addition, Anne Kiirikki is working on a databank containing experimental data in a form that can be automatically linked with the MD simulation databank.
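To make point 4 of the list above more concrete, here is a minimal sketch of one possible equilibration heuristic based on the area per lipid. It is only an illustration, not the method chosen for the databank: the function name `equilibrated_fraction`, the block-averaging approach, and the default values of `n_blocks` and `rel_tol` are all assumptions made for this example. The idea is to split the area-per-lipid time series into equal blocks and count, going backwards from the end, how many consecutive blocks have a mean close to that of the final block.

```python
from statistics import mean


def equilibrated_fraction(apl, n_blocks=5, rel_tol=0.01):
    """Estimate which trailing fraction of a trajectory looks equilibrated.

    apl      -- area per lipid for each frame, in time order
    n_blocks -- number of equal-sized blocks to split the series into (assumed value)
    rel_tol  -- allowed relative deviation from the last block mean (assumed value)
    """
    if n_blocks < 1 or len(apl) < n_blocks:
        raise ValueError("need at least one frame per block")

    size = len(apl) // n_blocks
    # Drop leading frames that do not fill a whole block, so that the
    # blocks are aligned with the end of the trajectory.
    trimmed = apl[len(apl) - size * n_blocks:]
    block_means = [mean(trimmed[i * size:(i + 1) * size]) for i in range(n_blocks)]

    reference = block_means[-1]
    stable = 0
    # Walk backwards from the last block and stop at the first block
    # whose mean deviates too much from the reference.
    for block_mean in reversed(block_means):
        if abs(block_mean - reference) <= rel_tol * abs(reference):
            stable += 1
        else:
            break
    return stable / n_blocks


# Synthetic example: the area per lipid relaxes during the first 20 frames
# and stays constant afterwards.
series = [0.70 - 0.004 * i for i in range(20)] + [0.62] * 80
fraction = equilibrated_fraction(series)
```

A real implementation would probably need a more careful statistical criterion, and the PCA-based analysis mentioned in the meeting is not covered by this sketch.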

As the next step, we plan to work on points 1 and 2 from the above list. Then we will probably add all the currently available data into the databank and continue to work on points 3-7.
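The consistency check described in point 5 could be sketched roughly as follows. This is an assumption-laden illustration rather than the databank's actual code: the function `check_atom_count`, the `molecule_counts` key, and the `atoms_per_molecule` lookup table are hypothetical names invented for this example. The expected number of atoms is computed from the indexed composition and compared with the atom count read from the simulation itself.

```python
def check_atom_count(entry, atoms_per_molecule, atoms_in_simulation):
    """Compare the atom count implied by an index entry with the simulation.

    entry               -- dict with a hypothetical "molecule_counts" mapping
                           of molecule name to number of molecules
    atoms_per_molecule  -- hypothetical lookup of atoms in one molecule
    atoms_in_simulation -- atom count read from the actual simulation files

    Returns (is_consistent, expected_atoms).
    """
    expected = 0
    for name, count in entry["molecule_counts"].items():
        if name not in atoms_per_molecule:
            raise ValueError(f"no atom count known for molecule {name!r}")
        expected += count * atoms_per_molecule[name]
    return expected == atoms_in_simulation, expected


# Example with made-up numbers: 128 all-atom POPC lipids and 5120 waters.
entry = {"molecule_counts": {"POPC": 128, "SOL": 5120}}
atoms_per_molecule = {"POPC": 134, "SOL": 3}
ok, expected = check_atom_count(entry, atoms_per_molecule, atoms_in_simulation=32512)
```

Returning the expected count together with the boolean makes it easier to report by how much a mismatching entry is off.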

2 comments:

  1. The first two points in the above list are now done. The first instructions for contributing to the databank are also now available on GitHub: https://github.com/NMRLipids/Databank

    It would be very good if people would test these and give feedback.

  2. In addition to the information mentioned above, I have also added the size of the trajectory file in bytes and the number of atoms to the dictionary parameters.
