Defining a quantitative quality measure for lipid bilayer simulations has been one of the goals of the NMRlipids project since the beginning. Such a measure is highly useful when selecting the best force field for a specific application and when improving force field parameters, particularly with automated procedures. Based on a literature review and the results of the NMRlipids Project, summarized in the NMRlipids V publication, we have concluded that the C-H bond order parameters from NMR can be used to evaluate the conformational ensembles of individual lipids, and that x-ray scattering form factors can be used to evaluate the lipid bilayer dimensions. Building on the work done in the NMRlipids workshops in Berlin (2019) and Prague (2021), we have now written code that evaluates the quality of simulations in the NMRlipids Databank. The key ideas and results of the quality evaluation are described in this post. More details and results can be found in the NMRlipids Databank manuscript and on GitHub.
Results. The order parameter quality of 58 simulations and the form factor quality of 99 simulations have so far been evaluated in the NMRlipids Databank. Figure 1a shows the results for the 13 best simulations according to the overall order parameter quality. Figure 1b compares simulations with experiments for the best simulations in terms of the overall order parameters (left column), the headgroup order parameters (middle column), and the form factor (right column). Results for all ranked simulations, ordered in various ways, are available on GitHub.
Conformational ensembles evaluated against C-H bond order parameters from NMR. After the workshop in Prague, our idea was to define the poorness Š of each order parameter as Š=-log(P), where P is the probability mass within the experimental error for a normal distribution whose mean is the order parameter from the simulation and whose standard deviation is the standard error of the mean from the simulation. However, when we tested this definition of Š on the simulations in the NMRlipids Databank, the probability of a simulated order parameter lying within the experimental error was often smaller than floating-point numbers can represent. To avoid such numerical instability, we decided to use the first order Student's t-distribution instead, and to calculate the probability from the equation\begin{equation} P = f \left( \frac{S_{\rm CH} - (S_{\rm exp}+\Delta S_{\rm exp})}{s/\sqrt{n}} \right) - f \left( \frac{S_{\rm CH} - (S_{\rm exp}-\Delta S_{\rm exp})}{s/\sqrt{n}} \right),
\end{equation}where f(t) is the upper-tail probability (one minus the cumulative distribution function) of the first order Student's t-distribution, so that P is positive, S_CH is the order parameter from the simulation, S_exp and ΔS_exp are the experimental order parameter and its error, s is the standard deviation of S_CH calculated over individual lipids, and n is the number of lipids in the simulation, so that s/√n is the standard error of the mean. Because Student's t-distribution has heavier tails than the normal distribution, even order parameters far from experiments have distinguishable non-zero probabilities. Therefore, the logarithm used to define the poorness Š is not needed, and we report the qualities directly as probabilities. Note, however, that using the first order Student's t-distribution instead of the normal distribution slightly underestimates the statistical accuracy of order parameters calculated from simulations.
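The difference between the two distributions can be illustrated with a minimal Python sketch. It is not the Databank code: the function and argument names are hypothetical, and the numbers are made up for illustration. It relies on the fact that the Student's t-distribution with one degree of freedom is the Cauchy distribution, whose cumulative distribution function is 1/2 + arctan(t)/π. The probability mass inside the experimental error interval is written here with cumulative distribution functions, which by symmetry of the distribution gives the same value as the equation above.

```python
import math


def t1_cdf(t):
    """CDF of Student's t with one degree of freedom (the Cauchy distribution)."""
    return 0.5 + math.atan(t) / math.pi


def normal_cdf(t):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def probability_within_error(s_sim, s_exp, delta_exp, std_over_lipids,
                             n_lipids, cdf=t1_cdf):
    """Probability mass inside [s_exp - delta_exp, s_exp + delta_exp] for a
    distribution centred at s_sim and scaled by the standard error of the mean.
    All argument names are illustrative assumptions."""
    sem = std_over_lipids / math.sqrt(n_lipids)
    upper = (s_exp + delta_exp - s_sim) / sem
    lower = (s_exp - delta_exp - s_sim) / sem
    return cdf(upper) - cdf(lower)


# Made-up order parameter that lies far from the experimental value.
far = dict(s_sim=-0.20, s_exp=-0.05, delta_exp=0.02,
           std_over_lipids=0.05, n_lipids=100)

p_t = probability_within_error(**far)                       # small but non-zero
p_normal = probability_within_error(**far, cdf=normal_cdf)  # rounds to zero

# Made-up order parameter that coincides with the experimental value.
p_match = probability_within_error(-0.05, -0.05, 0.02, 0.05, 100)
```

With these inputs the normal distribution gives exactly zero in double precision, so its logarithm would be undefined, whereas the heavy-tailed distribution still returns a small positive probability that can be compared between simulations.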
To rank simulations based on headgroup, acyl chain, or individual lipid qualities, the probabilities can be averaged over lipid fragments and lipid types. For more details, see the NMRlipids Databank manuscript.
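As a rough illustration of such averaging, the sketch below groups per-bond probabilities by fragment and takes a plain arithmetic mean within each group. The data layout, the bond and fragment labels, the use of an unweighted mean, and all numbers are assumptions made for this example; the exact procedure used in the Databank is described in the manuscript.

```python
from collections import defaultdict
from statistics import mean


def fragment_qualities(bond_probabilities, fragment_of):
    """Average per-bond probabilities within each fragment.

    bond_probabilities: {bond label: probability P}
    fragment_of:        {bond label: fragment name}
    Both mappings are hypothetical structures used only for illustration.
    """
    grouped = defaultdict(list)
    for bond, probability in bond_probabilities.items():
        grouped[fragment_of[bond]].append(probability)
    return {fragment: mean(values) for fragment, values in grouped.items()}


# Made-up probabilities, not results from the NMRlipids Databank.
probs = {"beta": 0.60, "alpha": 0.40, "C2_sn1": 0.10, "C3_sn1": 0.30}
frags = {"beta": "headgroup", "alpha": "headgroup",
         "C2_sn1": "sn-1", "C3_sn1": "sn-1"}

quality = fragment_qualities(probs, frags)
```

The same function could be applied a second time, with lipid types in place of fragments, to obtain one number per lipid in a mixture.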
Lipid bilayer dimensions evaluated against x-ray scattering form factor. The quality of a simulated form factor is evaluated as in the SIMtoEXP program: \begin{equation}
\chi^2 = \frac{\sqrt{\sum_{i=1}^{N_q}(|F_s(q_i)|-k_e|F_e(q_i)|)^2/(\Delta F_e(q_i))^2}}{\sqrt{N_q-1}},
\end{equation}
where F_s is the form factor from simulation, F_e is the form factor from experiment and ΔF_e its uncertainty, the summation runs over the N_q experimentally available points, and the scaling factor k_e for the experimental data is \begin{equation}
k_e = \frac{\sum_{i=1}^{N_q}
\frac{|F_s(q_i)||F_e(q_i)|}{(\Delta F_e(q_i))^2}}{\sum_{i=1}^{N_q}
\frac{|F_e(q_i)|^2}{(\Delta F_e(q_i))^2}}.
\end{equation}Note that this evaluation does not account for the uncertainty of the simulated form factor in any way.
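The two equations above translate almost directly into code. The following is a minimal Python sketch under stated assumptions, not the SIMtoEXP or Databank implementation: the function names are hypothetical, and the simulated form factor is assumed to have already been evaluated at the same q points as the experiment, so that the three input sequences have equal length.

```python
import math


def scaling_factor(f_sim, f_exp, err_exp):
    """Scaling factor k_e, following the second equation above."""
    numerator = sum(abs(fs) * abs(fe) / de ** 2
                    for fs, fe, de in zip(f_sim, f_exp, err_exp))
    denominator = sum(abs(fe) ** 2 / de ** 2
                      for fe, de in zip(f_exp, err_exp))
    return numerator / denominator


def form_factor_quality(f_sim, f_exp, err_exp):
    """Form factor quality, following the first equation above as written.

    Inputs are assumed to share the same q grid (an assumption of this sketch).
    """
    n_q = len(f_exp)
    if n_q < 2 or not (len(f_sim) == len(err_exp) == n_q):
        raise ValueError("need at least two points and equal-length inputs")
    k_e = scaling_factor(f_sim, f_exp, err_exp)
    total = sum(((abs(fs) - k_e * abs(fe)) / de) ** 2
                for fs, fe, de in zip(f_sim, f_exp, err_exp))
    return math.sqrt(total) / math.sqrt(n_q - 1)


# Made-up two-point example, small enough to check by hand.
k_e = scaling_factor([1.0, 3.0], [1.0, 2.0], [1.0, 1.0])
quality = form_factor_quality([1.0, 3.0], [1.0, 2.0], [1.0, 1.0])

# A simulated form factor proportional to the experimental one is a perfect match.
perfect = form_factor_quality([2.0, 4.0, 6.0], [1.0, 2.0, 3.0], [0.1, 0.1, 0.1])
```

Because k_e rescales the experimental data before the comparison, only the shape of the form factor matters, not its absolute scale; lower values indicate better agreement.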