The NMRlipids Project: 2021

Friday, September 24, 2021

NMRlipids20 workshop outcomes

Thirteen contributors participated in the second NMRlipids workshop, held on 6th-9th of September in Prague, Czech Republic.

Three talks on topics related to the NMRlipids project were given on the first day: Pavel Jungwirth presented results on how charge scaling can improve MD simulation quality without additional computational cost, Hanne Antila presented results from an automatic force field optimization algorithm, and Ricky Nencini presented an evaluation of drug molecule binding affinities in MD simulations against NMR data.

Batuhan Kav presented the current status of the NMRlipids VI project on polarizable force fields. The main conclusion from the talk and the subsequent discussion was that the headgroup conformational ensemble and ion binding in the Drude model are significantly worse than in the original CHARMM36, most likely because forking and signs of headgroup order parameters are not taken into account during parameter optimization. The results from AMOEBA simulations are not yet available due to practical issues, but the consensus was that those would be highly relevant.

The second day was opened by Samuli Ollila with a presentation on the current status of the NMRlipids databank. This was followed by work in three groups:

Quality Evaluation. Goal: Define the quality measures for order parameters, find robust code for form factor calculation, and include this in the quality measure. Outcomes: A quality measure for order parameters, S=-log(P), where P is the probability mass within the experimental error for a normal distribution around the simulation result (mean = order parameter from simulation, standard deviation = error of order parameter from simulation). An implementation of this was preliminarily tested and committed to QualityEvaluation.py. A new form factor code in Python is now implemented, preliminarily tested, and committed to the MATCH repository.
Extension of the databank to programs other than Gromacs. Goal: Define the files required to add simulations run with programs other than Gromacs to the databank. Outcomes: The possibility to incorporate OpenMM data was implemented. The required files are the trajectory (e.g., in dcd format), the structure (e.g., in pdb format), and either an xml file or an inp file. Force field / topology information can optionally be given as a psf file. This is implemented in Anne Kiirikki's branch, but not yet merged into the main branch.
Analysis of the data. Goal: Test and improve the codes for the analysis of the databank. Outcomes: A new class that makes the analysis of the databank significantly easier was introduced. An example code was implemented and committed. In addition, a new way to organize the information about molecular composition in the databank README.yaml files was proposed to ease the writing of analysis codes. The implementation of this is currently in progress by Anne Kiirikki.
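
The order parameter quality measure from the first group, S=-log(P), can be illustrated with a minimal sketch. This is not the code committed to QualityEvaluation.py; the function names are hypothetical, the natural logarithm is assumed (the base is not specified above), and the experimental error is assumed to be a symmetric interval around the experimental value that the caller passes in.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def order_parameter_quality(sim_value, sim_error, exp_value, exp_error):
    """Return S = -log(P) for one order parameter (hypothetical helper).

    P is the probability mass that a normal distribution centred on the
    simulated value (standard deviation = simulation error) places inside
    the interval [exp_value - exp_error, exp_value + exp_error].
    Assumption: natural logarithm; smaller S means better agreement.
    """
    if sim_error <= 0.0 or exp_error <= 0.0:
        raise ValueError("errors must be positive")
    upper = normal_cdf(exp_value + exp_error, sim_value, sim_error)
    lower = normal_cdf(exp_value - exp_error, sim_value, sim_error)
    p = upper - lower
    if p <= 0.0:
        # The probability mass underflowed to zero: no measurable overlap.
        return math.inf
    return -math.log(p)


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the databank.
    print(order_parameter_quality(-0.05, 0.02, -0.05, 0.02))
    print(order_parameter_quality(-0.12, 0.01, -0.05, 0.02))
```

With this definition, a simulated value whose error distribution lies almost entirely inside the experimental interval gives S close to zero, and S grows as the simulated value moves away from the experiment.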

In addition to these, several other topics were touched on during discussions. These include, for example, practical ways to handle united atom simulations, stereospecific information in mapping files, sanity checks for the data, the NMRlipids III project, and potential data repositories other than Zenodo. However, no practical steps were taken to advance these topics.

In conclusion, the workshop was again highly useful, at least for the NMRlipids project, and hopefully also for the participants. Thank you to all the participants!

Wednesday, March 24, 2021

NMRlipids IVb: Toward submission of the manuscript with PE and PG results (2)

The manuscript is now substantially updated based on comments on the previous post and discussions in the online meeting held on 12th of January 2021. All force field evaluations are moved to the supplementary information, the analysis of lipid structures from PDB is improved, an illustration of similar structures of different lipids bound to different proteins is added to figure 4, and the dihedral distributions in figures 2 and 3 are changed to relative energies in kT units.

I believe that the manuscript is close to the submission stage, and I would like to submit it within one month from now, i.e., before 24th of April 2021. Therefore, the last opportunities to comment on the manuscript are approaching.

Before the submission, figures 1 and S1-S3 and the experimental and simulation details need to be polished.

We also need a good title for the article. The title is discussed in this issue.

In addition to these issues, I would be happy to have any comments on the manuscript as soon as possible, particularly if you have major comments on the content of the manuscript.

Wednesday, March 3, 2021

Second online meeting on the NMRlipids databank

The second online meeting on the NMRlipids Databank was held on 26th of February 2021. The slides presented at the beginning of the meeting are here. The main questions discussed in the meeting were related to the information that should be indexed in the databank and the properties that should be automatically analyzed from simulations.

Although the beta version of the databank still exists in the NMRlipids IV project repository, a specific repository for the NMRlipids Databank is now available. Further development of the databank will be done in this repository, while the beta version will be used for applications related to the NMRlipids IV project.

A publication introducing the databank with some highlight applications will be prepared (discussion opened in this issue). Original NMRlipids authorship rules will be applied in the first publication of the databank with two exceptions: Samuli Ollila will be the last author and Anne Kiirikki will be the first. The authorship rules for the future publications regarding the updates in the databank are yet to be carefully discussed and decided on.

Outcomes of the discussion on 26th of February 2021 were:

1. It was decided to include the following optional inputs by the contributors (in addition to the current indexing information listed in the slides): PUBLICATION INFORMATION, SOFTWARE VERSION, AUTHOR INFORMATION AND CONTACT, CPT FILE, LOG FILE, and TOP FILE. In addition to these, the date of running the AddData.py script will be automatically saved to the index files. DONE
2. The input file format for AddData.py will be changed to yaml. DONE
3. Inclusion of the information on lipid isomers into the databank was discussed, but practical solutions are not yet clear. Further discussion will be in this issue.
4. Automatic methods to determine how equilibrated the contributed trajectories are (based, e.g., on area per lipid and PCA analysis) will be investigated. Further discussion will be in this issue.
5. Consistency of the indexed data will be checked by comparing the total number of atoms between the databank and the actual simulation. Some other potential sanity checks were also discussed. Further discussion will be in this issue.
6. Properly functioning code for the form factors is needed in the near future. Further discussion will be in this issue.
7. The databank will also be made functional with at least OpenMM and Amber data before the first publication. Further discussion will be in this issue.
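
The atom-count consistency check mentioned in the list could look roughly like the sketch below. It is only an illustration under stated assumptions, not code from the databank repository: the function names, the composition dictionary layout, and the per-molecule atom counts are all hypothetical, and the number of atoms in the actual simulation is assumed to have been read elsewhere (e.g., from the structure file).

```python
def expected_atom_count(composition, atoms_per_molecule):
    """Total atom count implied by the indexed composition.

    composition: {molecule name: number of molecules} (hypothetical layout).
    atoms_per_molecule: {molecule name: atoms in one molecule}, which in
    practice would depend on the force field (all-atom vs. united atom).
    """
    total = 0
    for name, count in composition.items():
        if name not in atoms_per_molecule:
            raise KeyError(f"no atom count known for molecule '{name}'")
        total += count * atoms_per_molecule[name]
    return total


def atom_count_is_consistent(composition, atoms_per_molecule, n_atoms_in_simulation):
    """True if the indexed composition matches the simulated system size."""
    return expected_atom_count(composition, atoms_per_molecule) == n_atoms_in_simulation


if __name__ == "__main__":
    # Illustrative system: all-atom POPC (134 atoms) with 3-site water.
    composition = {"POPC": 128, "SOL": 5120}
    atoms_per_molecule = {"POPC": 134, "SOL": 3}
    print(expected_atom_count(composition, atoms_per_molecule))
    print(atom_count_is_consistent(composition, atoms_per_molecule, 32512))
```

A mismatch would flag an entry whose indexed composition does not describe the contributed trajectory, for example when ions or a second lipid type were left out of the index.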

In addition, Anne Kiirikki is working on a databank containing experimental data in a form that can be automatically linked with the MD simulation databank.

As the next step, we plan to work out points 1 and 2 from the list above. Then we will probably add all the currently available data into the databank, and continue to work on points 3-7.
