## Friday, March 14, 2014

### The lipid forcefield comparison

[Updated Friday, April 11, 2014]

Our open research collaboration has tested the abilities of 10 12 current force fields to correctly sample the headgroup and glycerol conformations in phospholipids. The C-H order parameters calculated from simulations were compared to the corresponding ones measured in accurate NMR experiments. We found that none of the force fields is perfect, but promising results were obtained with the CHARMM36 and the unpublished Maciejewski–Rog force fields.

 Figure 1. The Lipid forcefield comparison. The absolute values of PC-lipid headgroup and glycerol C-H order parameters compared to values measured in NMR experiments. Pure systems, full hydration. The following 10 12 force fields were tested: Berger—the united atom (UA) force field published in Berger 1997, Kukol—UA in Kukol 2009, Chiu—UA in Chiu 2009, Poger—UA in Poger 2010, UlmUlm—UA in Ulmschneider 2009, Slipids—the all-atom (AA) force field published in Jämbeck 2012, CHARMM36—AA in Klauda 2010, GAFFlipid—AA in Dickson 2012,  Lipid14—AA in Dickson 2014, HoNiLy—AA in Högberg 2008, OPLS-AA—AA in Róg 2009, and MacRog—AA by Arek Maciejewski and Tomasz Róg (in press) . The computational values have been posted to the nmrlipids.blogspot.fi-blog by the project participants, except for the GAFFlipid and one of the CHARMM36 data sets, which were taken from the original publications [Dickson 2012, Klauda 2010]. The experimental values were taken from the following publications: DMPC 303K from Gross 1997, DMPC 314K from Dvinskikh 2005, DPPC 310K from Gally 1981, DPPC 322K from Seelig 1977, DPPC 323K from Akutsu 1981, and POPC 298K from Ferreira 2013. The vertical bars shown for some of the computational values are not error bars, but demonstrate that for these systems we had two data sets; the ends of the bars mark the values from the two sets, and the dot marks their measurement-time-weighted average.

Our results are summarised in Fig. 1; in the following we discuss the key points.

Experimental data fall close to one another, irrespective of the lipid tails, temperature, salt (NaCl) concentration, or the experimental technique used. This excellent repeatability has already been discussed at length in Samuli's post on accuracy. Motivated by the high experimental repeatability, we have highlighted in Fig. 1 subjective sweet spots (light blue areas), within which we expect the calculated order parameters of a well-performing force field to fall.

Forking of the order parameters was observed in some cases. (Note that to avoid confusion with the established NMR term splitting—used e.g. to refer to the dipolar splitting in NMR spectra—we use here the word forking to describe the situation where the order parameters of different hydrogens in a single carbon are not equal.)

If the corresponding experimental value is not forked (as is the case for the $$\alpha$$- and $$\beta$$-carbons), forking in the calculated value signals that the simulation is sampling an unrealistic conformation, for which the C-H bond distributions for the hydrogens disagree; this must be considered as a serious shortcoming of the force field. Note, however, that for the g3- and g1-carbons the experimental values do fork (although the forking of the g3 carbon is small and some experiments only report the larger or the average value).

None of the forcefields was perfect—three all-atom force fields, however, came close (Fig. 2 focuses on just these three to help comparison):

1) CHARMM36 was the only force field that clearly had no forking in the $$\alpha$$- and $$\beta$$-carbons, but it suffered from slightly too ordered $$\beta$$-, g3- and g1-carbons.

2) Maciejewski–Rog had almost all the values within the sweet spots, but suffered from forking in $$\beta$$ (and possibly in $$\alpha$$) as well as a too wide split in g3.

3) GAFFlipid appears generally promising. It suffers, however, from slight forking in $$\alpha$$ and $$\beta$$, and from too ordered g2 and g1. In addition, we have not been able to reproduce these order parameters so far (see the note on technical issues in the end of this post), and thus show here only the originally reported values [Dickson 2012].

 Figure 2. The same data as in Fig. 1, but only for the three best-performing force fields.

We note that also the all-atom HoNiLy force field performed quite well in our test, although it did suffer from forking in the $$\alpha$$-carbon, too ordered $$\alpha$$- and g3-carbons, and a too wide forking in g3 (Fig. 1). However, because HoNiLy is a modified CHARMM force field and, in fact, an early development version of the Slipids force field, it appears not urgent to study its further properties.

We also note that by far the best order parameters for the fully hydrated case were reported on March 9th 2014 by Antti Lamberg, who used simple brute force heuristics to optimise the dihedral parameters (of a united atom force field) to accurately reproduce the experimental values. As this approach is very fresh, and currently under discussion, we have not yet included these results into the comparison.

Conclusion: Next we need the responses. There clearly is still room for improvement in the current force fields, even in the standard system of a pure lipid bilayer at full hydration. However, the quality of a force field (in terms of the goal we set for this project, that is, correctly reproducing the headgroup conformations) depends ultimately on its ability to correctly respond to changing environmental conditions.

Based on the here presented analysis at full hydration, it is now justified to focus mostly on testing the responses—to hydration levels, ion concentrations, and cholesterol content—in the top three force fields: CHARMM36, MacRog, and GAFFlipid. (In fact, some response data have already been reported in the blog. An overview of these will be presented in a separate post.) The responses of the other seven nine force fields do not need to be studied, unless of course new data is posted suggesting that further discussion might be justified.

Should any of the three promising all-atom force fields pass the response tests, we might try to use it as a basis on which to build a well-performing united atom model. If successful, this approach would fulfil the goal of our project.

Things still To Do for full hydration:
• Calculate the order parameters in the new AmberLipid14: Work already in progress by Samuli.
• Calculate the order parameters in the Ulmschneider force field: No work yet in progress.
• Calculate the order parameters in the CHARMM36-UA force field: No work yet in progress.

Notes on some technical issues that came up while the results presented here were produced. These notes are probably not crucial for the goals of the nmrlipids-project, but are presented here as they might be of general interest to the project participants or to the lipid simulation community at large.

We were not able to completely reproduce the results for GAFFlipids (area per molecule, order parameters) reported in the original publication [Dickson 2012]. We tested various approaches (different lipid types, different bond constraints, different thermostats, different hydrogen masses), but the issue remains so far unsolved. It is possibly related to differences between Gromacs and Amber simulation packages. The issue is currently on hold, however, as we are testing the new AmberLipid14 force field. Should it behave reasonably, we probably will simply use it to replace the GAFFlipids results in Figs. 1 and 2. Update on 11 Apr 2014: The headgroups (in particular g1 and g3 carbons) seem to be better described in GAFFlipids than in Lipid14 (see Fig. 1 and this plot of Samuli's simulation results). Therefore, let us try and solve the reproducibility problems with GAFFlipids and then check their response to varying conditions.

Three problems were encountered while reproducing the results of Poger et al. 2012:
1. We could not reproduce the area per molecule reported in the publication.
2. Contrary to what was reported in the publication, results (area per molecule and order parameters) were affected by the choice of method for calculating the electrostatic interactions: reaction field vs. PME.
3. The order parameters for the headgroup and glycerol carbons of different models (Poger, Kukol, Berger) reported by Poger et al (in their Table 2) differ from the results that have been obtained in the blog.
Possible partial explanations for the observations:
1. It turned out that Gromacs versions 4.0 and 4.5 give different results for reaction field simulations. For more details see the Redmine Bug Report.
2. While changing from reaction field to PME, Poger et al. also decreased their Lennard-Jones cut-off from 1.4 nm to 1.0 nm. This could possibly compensate for the changes we observed when switching from reaction field to PME and keeping the rest of our parameters fixed.
3. This was not investigated further, as none of the reported order parameter sets (neither in the blog nor in the publication) were in good agreement with the experimental values. However, it is not clear to us if the Gromacs tool g_order was maybe used by Poger et al. for calculating the headgroup order parameters—this can be problematic should the hydrogens of a single carbon experience a split, i.e., give different order parameters.

1. Now the "Things still To Do for full hydration" mentioned above are both done:

Lipid14 was done by myself with the help of Marius Retegan: http://nmrlipids.blogspot.com/2013/10/welcome-if-you-are-new-here-reading.html?showComment=1395744941307#c7520665937666074397

https://www.dropbox.com/s/no09z01v5dicpmp/compTOgaffEXP.png

Now also the numbers are here:
1 0.007067
1 0.000448662
2 0.072488
2 0.0709187
3 0.158456
3 0.261809
4 0.245347
5 0.263484
5 0.0184436

Ulmschneider force field calculation was done by Matti Javanainen: http://nmrlipids.blogspot.com/2013/10/welcome-if-you-are-new-here-reading.html?showComment=1395838721832#c8520038250973168506

I did plot those with experimental values here:
https://www.dropbox.com/s/eqx07taaaojxwmy/OPulmschneiderCOMPtoEXP.png

It looks like that the glycerol order parameter are quite large.

Markus will update the text and the figure as soon as he has time.

1. Hi Samuli, could you report also the signs for these Lipid14 order parameters?

2. https://github.com/NMRLipids/nmrlipids.blogspot.fi/blob/master/POPClipid14/OrderParameters.dat

2. I am currently writing a post about the signs of the order parameters. I think that we do not have the results with signs for the MacRog. Could it be possible to post those here? POPC in full hydration would be the best system.

1. POPC, full hydration, no ions:

beta: +,+
alpha: +,+
g1: -,-
g2: -
g1: -,+ (plus with the smaller value)

3. I have now some preliminary results on the response of the CHARMM model to ions. The system is POPC and temperature 303K. I call the results preliminary because the simulations are only 30ns. For the system without ions nothing is not really changing and the last 20ns is analyzed. The properties for the simulation with NaCl are still possibly drifting and only the last 10ns was analyzed. However, I think that the qualitative effect of NaCl is already clear.

The results are added to the figure with other ion response results:
https://www.dropbox.com/s/ugnkq37c1mkwa8t/OrderParameterIONSnacl2.pdf

and also to the figure with zoomed scale:
https://www.dropbox.com/s/oub4nx3absvsqbv/OrderParameterIONSexpVSsim.pdf

For me it seems that the order parameters are qualitatively changing as in the experiments with CaCl_2 but more than in the experiments with Na. Thus for me it seems that the partition of Na is too strong also in CHARMM model, but the response to the penetrating ion is qualitatively correct. By looking the density profiles for POPC and Na from the CHARMM simulation:
it seems to me that the penetration is something between Orange and MacRog, as also seen in the order parameter changes. From the CHARMM density plot it is also clear that there is not enough statistics yet.

Despite of the lack of statistics, I have shared the results since they are very promising regarding the CHARMM model and gives a strong motivation to further investigate the CHARMM, even though it is significantly slower to simulate compared to the other models I have tried.

I will continue running these simulations, but the results from other people from CHARMM simulations with ions would be still very useful.

4. I update here the values I first reported for Poger DPPC in my earlier comments (see http://nmrlipids.blogspot.fr/2013/10/welcome-if-you-are-new-here-reading.html?showComment=1382718128134#c3341186425139768422 and http://nmrlipids.blogspot.com/2013/10/welcome-if-you-are-new-here-reading.html?showComment=1385663290230#c2033156291937576514). These were calculated with the new script (for some reason I got the absolute values right in my comments), but in addition I here report the sign as well as the carbon configuration :

PME
beta -0.00755345
beta 0.0274534
alpha 0.0602613
alpha 0.0668811
g3R -0.0685195
g3S -0.240482
g2 -0.0839291
g1R 0.119438
g1S -0.0706316

RF
beta -0.000290485
beta 0.0147462
alpha 0.0860009
alpha 0.121084
g3R -0.117486
g3S -0.234245
g2 -0.12262
g1R 0.0588791
g1S -0.0451602

PME2
beta -0.0103783
beta 0.0214663
alpha 0.0835412
alpha 0.0985141
g3R -0.10375
g3S -0.238471
g2 -0.111633
g1R 0.11745
g1S -0.0737633

RF2
beta -0.00709458
beta 0.0278493
alpha 0.0895108
alpha 0.0855304
g3R -0.0849479
g3S -0.229221
g2 -0.0947689
g1R 0.117584
g1S -0.103398

5. As my earlier results with the Kukol model were simulated with 150 mM NaCl, I also ran the system now without the ions. The parameters were the same as before and 512 lipids were simulated for 50 ns and the last 30 ns was employed for the analysis:

1) -0.0595 0.0087
2) -0.1202 -0.1357
3) 0.2792 0.1116
4) 0.1497
5) -0.0161 -0.2257

1. Just to make sure: is this for POPC?

6. Yes, this is for POPC.