Friday, July 3, 2020

Online meeting about the NMRlipids databank

The first NMRlipids online meeting, focused on the development of the NMRlipids databank, was held at 16.00 CET on Monday, 29 June 2020. The meeting started with a presentation by Samuli Ollila on the current status of the databank (slides available here). The schematic structure of the new databank is shown in Fig. 1.

Figure 1: Schematic structure of the new databank. Beta versions of the Databank, Databank builder and Databank analyzer codes are available.


The presentation was followed by a highly useful discussion, thanks to more than 20 participants. The discussion focused mainly on urgent issues brought up in the presentation, complemented by additional points raised by the participants. The outcomes of the meeting and some decisions based on the discussions are listed here.
  1. What information will be stored in the dictionary files composing the databank? The current plan is to include information requested from the contributor that cannot be read afterwards (force field information, trajectory length, etc.), as well as information necessary for using (file names and sources) and searching (number of molecules and temperature) the data. Note, however, that the tpr (or corresponding) and trajectory files are accessible through the databank. Thereby all the information about each simulation remains available even though not everything is written directly into the dictionary. For detailed discussion, see the GitHub issue.
  2. How will molecules be named? When writing and searching the data from the databank, we need unique machine-readable names for molecules. There will be a list of molecule names (for example, POPC, POT, TIP3P, etc.) that will be used by default. If the uploaded simulation uses different names, the user has to specify them. For detailed discussion, see the GitHub issue.
  3. Unique convention for the atoms within the molecules. For now, we will use the idea of mapping files, extended with a third column that gives the residue name for each atom. This should be useful in situations where parts of one lipid are named with different residue names, such as in the current Amber force field convention. For detailed discussion, see the GitHub issue.
  4. File format for the dictionary. If practically feasible, we will consider saving the dictionary in YAML format instead of JSON. For detailed discussion, see the GitHub issue.
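To make points 1 and 2 more concrete, here is a minimal sketch of what one dictionary entry could look like and how it could be searched. All key names, file names and values below are illustrative assumptions, not the agreed databank format; only the molecule names POPC, POT and TIP3P come from the list above. The sketch uses JSON from the Python standard library; a YAML version would look much the same but needs an external library.

```python
import json

# Default machine-readable molecule names (the examples mentioned in the post).
DEFAULT_MOLECULE_NAMES = {"POPC", "POT", "TIP3P"}


def make_example_entry():
    """Build one hypothetical dictionary entry; every key name is an assumption."""
    return {
        # Information requested from the contributor
        "FORCEFIELD": "ExampleFF",
        "TRAJECTORY_LENGTH_NS": 200,
        # Information needed for using the data (placeholder file names)
        "FILES": {"TPR": "run.tpr", "TRAJECTORY": "run.xtc"},
        # Information needed for searching the data
        "TEMPERATURE_K": 298.0,
        "MOLECULES": {
            # default name -> name used in the uploaded simulation, and count
            "POPC": {"NAME_IN_SIMULATION": "POPC", "COUNT": 128},
            "POT": {"NAME_IN_SIMULATION": "K", "COUNT": 10},
            "TIP3P": {"NAME_IN_SIMULATION": "SOL", "COUNT": 5120},
        },
    }


def validate_molecule_names(entry):
    """Raise ValueError if the entry uses a name outside the default list."""
    unknown = sorted(set(entry["MOLECULES"]) - DEFAULT_MOLECULE_NAMES)
    if unknown:
        raise ValueError(f"Unknown molecule names: {unknown}")


def find_simulations(entries, molecule, temperature, tolerance=1.0):
    """Return entries containing `molecule` within `tolerance` K of `temperature`."""
    return [
        e for e in entries
        if molecule in e["MOLECULES"]
        and abs(e["TEMPERATURE_K"] - temperature) <= tolerance
    ]


if __name__ == "__main__":
    entry = make_example_entry()
    validate_molecule_names(entry)
    print(json.dumps(entry, indent=2))
    print(len(find_simulations([entry], "POPC", 298.0)))
```

Storing the simulation-specific name next to the default name keeps the search side simple: queries always use the default names, and only the analysis code needs the translation.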
As a first step, we will build a prototype databank containing simulations from the NMRlipids IVb manuscript and use it to analyze, for example, the P-N vector and dihedral angles required for the manuscript. Therefore, all the related issues are now in the GitHub repository of NMRlipids IVb.
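An analysis such as the P-N vector needs to find the same atoms in simulations that name them differently, which is where the mapping files with the third column come in. The sketch below parses such a file under some assumptions: whitespace-separated columns (universal name, atom name, optional residue name) and `#` for comments. The universal names, atom names and residue names in the example text are illustrative only and should not be read as an actual force field mapping.

```python
# Illustrative mapping text; names are examples, not a real force field mapping.
EXAMPLE_MAPPING = """\
# universal_name  atom_name  residue_name
M_G3P2_M  P31  PC
M_G3N6_M  N31  PC
M_G1C3_M  C12  PA
"""


def parse_mapping(text):
    """Return {universal_name: (atom_name, residue_name or None)}.

    Two-column lines (no residue name) are accepted, so that files
    without the third column can still be read.
    """
    mapping = {}
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) not in (2, 3):
            raise ValueError(f"line {line_number}: expected 2 or 3 columns")
        residue = fields[2] if len(fields) == 3 else None
        mapping[fields[0]] = (fields[1], residue)
    return mapping


def pn_atoms(mapping, p_name="M_G3P2_M", n_name="M_G3N6_M"):
    """Look up the (atom, residue) pairs that define the P-N vector.

    The default universal names are assumptions made for this sketch.
    """
    return mapping[p_name], mapping[n_name]


if __name__ == "__main__":
    mapping = parse_mapping(EXAMPLE_MAPPING)
    print(pn_atoms(mapping))
```

With the residue name available, a selection for each atom can be built from both the atom name and the residue name, which matters when one lipid is split over several residues.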

1 comment:

  1. Hi,

    One issue that I think would be good to discuss is where and how to host the databank. Shall I create a specific issue for this on GitHub?
