Figure 1: Schematic structure of the new databank. Beta versions of the Databank, Databank builder and Databank analyzer codes are available.
The presentation was followed by a very useful discussion, thanks to more than 20 participants. The discussion focused mainly on urgent issues brought up in the presentation, complemented by additional points raised by the participants. The outcomes of the meeting and some decisions based on the discussion are listed here.
- What information will be stored in the dictionary files composing the databank? The current plan is to include information requested from the contributor that cannot be read afterwards (force field information, trajectory length, etc.), and information necessary for using (file names and sources) and searching (number of molecules and temperature) the data. Note, however, that the tpr (or corresponding) and trajectory files are accessible through the databank. All the information about each simulation is therefore available even though not everything is written directly into the dictionary. For detailed discussion, see the GitHub issue.
- How will molecules be named? When writing data to and searching data from the databank, we need unique machine-readable names for molecules. There will be a list of molecule names (for example, POPC, POT, TIP3P, etc.) that will be used by default. If the uploaded simulation uses different names, the user has to specify them. For detailed discussion, see the GitHub issue.
- Unique convention for the atoms within the molecules. For now, we will use the idea of mapping files, extended with a third column that gives the residue name for each atom. This should be useful in situations where parts of one lipid are named with different residue names, such as in the current Amber force field convention. For detailed discussion, see the GitHub issue.
- File format for the dictionary. If practically feasible, we will consider saving the dictionary in YAML format instead of JSON. For detailed discussion, see the GitHub issue.
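To make the points above more concrete, here is a minimal Python sketch of what one dictionary entry and one mapping-file line could look like. It is only an illustration under stated assumptions, not the actual databank code: the key names, the helper functions `build_entry` and `parse_mapping_line`, the example atom and residue names, and all values are hypothetical placeholders. The sketch serializes to JSON with the standard library; the same dictionary could be written as YAML with a third-party YAML library if that format is chosen.

```python
import json

# Assumed default molecule names; the post gives POPC, POT and TIP3P as examples.
DEFAULT_MOLECULE_NAMES = {"POPC", "POT", "TIP3P"}


def parse_mapping_line(line):
    """Split one mapping-file line into its columns.

    Assumed layout (hypothetical): generic atom name, atom name in the
    simulation, and an optional third column with the residue name.
    """
    fields = line.split()
    if len(fields) not in (2, 3):
        raise ValueError(f"expected 2 or 3 columns, got {len(fields)}: {line!r}")
    residue = fields[2] if len(fields) == 3 else None
    return {"generic_name": fields[0], "atom_name": fields[1], "residue_name": residue}


def build_entry(contributed, files, molecule_names, search):
    """Collect the three kinds of information discussed above into one dict.

    molecule_names maps a default databank name to the name used in the
    uploaded simulation; names outside the default list are rejected.
    """
    unknown = set(molecule_names) - DEFAULT_MOLECULE_NAMES
    if unknown:
        raise ValueError(f"not in the default name list: {sorted(unknown)}")
    return {
        "contributed": contributed,        # cannot be read from the files afterwards
        "files": files,                    # needed for using the data
        "molecule_names": molecule_names,  # default name -> name in this simulation
        "search": search,                  # needed for searching the data
    }


# All values below are made-up placeholders, not data from a real simulation.
entry = build_entry(
    contributed={"force_field": "example-ff", "trajectory_length_ns": 200},
    files={"tpr": "run.tpr", "trajectory": "run.xtc"},
    molecule_names={"POPC": "POPC", "POT": "K", "TIP3P": "SOL"},
    search={"temperature_K": 298, "counts": {"POPC": 128, "POT": 10, "TIP3P": 5120}},
)

# Serialize the entry; sort_keys keeps the output stable between runs.
serialized = json.dumps(entry, indent=2, sort_keys=True)

# A three-column mapping line: the third column carries the residue name.
mapping_row = parse_mapping_line("M_G1_M C1 PC")
```

Keeping the name translation (`molecule_names`) separate from the searchable fields means that a search can always use the default names, whatever names the contributor used in the simulation files.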
Hi,
One issue that I think would be good to discuss is where and how to host the databank. Shall I create a specific issue for this on GitHub?
The next online meeting on the databank will take place on 26.2.2021 on Zoom. If you have not received the invitation by email but would like to join, please send me an email.
The schedule of the meeting is the following (CET time zone).
Presentations before noon:
9.00-10.00 Samuli Ollila: Welcome and current status of the NMRlipids databank
10.00-11.00 Batuhan Kav: Current status of the NMRlipids VI project on polarizable force fields
11.00-12.00 Break
12.00-13.00 Discussion on NMRlipids databank
13.00-14.00 Discussion on NMRlipids VI project
14.00-14.30 General discussion and closing of the meeting.