The NMRlipids Project: NMRlipids IVb: Assembling the PE & PG results

Tuesday, April 23, 2019

NMRlipids IVb: Assembling the PE & PG results

As discussed in the previous post, the NMRlipids IV project concerning PE, PG and PS lipids was divided into two parts. The first part, about the results from PS lipids, is approaching submission, while the second part, about PE and PG lipids, is at a more preliminary stage.

I have now started to assemble the contributed data for PE and PG lipids into a manuscript in a new GitHub repository. From now on, I will call this part of the project NMRlipids IVb. The current status of the manuscript and the most important things to do are summarized in this post.

Headgroup and glycerol backbone order parameters from lipids with different headgroups in experiments

Experimental order parameters with sign information, delivered by Tiago Ferreira, revealed that the 𝛽-carbon order parameter of PG lipids is positive, while other lipids have a negative order parameter for this carbon (Fig. 1). Therefore, the structure of the PG lipid seems to be distinct from PC and PE, in contrast to the conclusions from the order parameter data without sign information in the previous post.

Figure 1: Order parameters of the headgroup and glycerol backbone region C-H bonds from experiments. For literature references, see the manuscript draft.
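
For reference, the C-H bond order parameter is conventionally defined as S_CH = <3cos²θ - 1>/2, where θ is the angle between the C-H bond and the membrane normal, which is why its sign carries structural information. A minimal Python sketch of this definition follows, assuming the membrane normal is a unit vector along z; the function name and the coordinates are illustrative and do not come from any of the project scripts.

```python
import math

def order_parameter(c_h_pairs, normal=(0.0, 0.0, 1.0)):
    """Average S_CH = <3 cos^2(theta) - 1> / 2 over (carbon, hydrogen) pairs.

    `normal` is assumed to be a unit vector (here: the z axis).
    """
    total = 0.0
    for carbon, hydrogen in c_h_pairs:
        bond = [h - c for c, h in zip(carbon, hydrogen)]
        length = math.sqrt(sum(x * x for x in bond))
        cos_theta = sum(b * n for b, n in zip(bond, normal)) / length
        total += 0.5 * (3.0 * cos_theta ** 2 - 1.0)
    return total / len(c_h_pairs)

# Made-up coordinates, only to show the two limiting cases.
parallel = [((0, 0, 0), (0, 0, 1.0))]       # bond along the normal
perpendicular = [((0, 0, 0), (1.0, 0, 0))]  # bond in the membrane plane
print(order_parameter(parallel))        # 1.0
print(order_parameter(perpendicular))   # -0.5
```

A bond that stays close to the membrane plane gives a negative value, while a bond that mostly points along the normal gives a positive one.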

Headgroup and glycerol backbone order parameters of PE and PG lipids in simulations

The experimental information about order parameter signs enables a similar comparison between different simulation models and experiments for PE and PG lipids as was done previously for PC and PS lipids in the NMRlipids I and NMRlipids IV projects. This comparison (Fig. 2) and the results in the NMRlipids I and NMRlipids IV projects suggest that the differences between PC, PE, PG, and PS headgroups are well captured in CHARMM36 simulations, although not all C-H bond order parameters are within the experimental error for any of the lipids. Therefore, the structural differences between lipid headgroups could be analyzed from the CHARMM36 simulations, perhaps in a similar way as was done for the PS headgroup in the NMRlipids IV project. The contributed data for PE and PG lipids from united atom simulations has not yet been analyzed, partly because of the lack of suitable programs for the automatic analysis of the data.

Figure 2: Headgroup and glycerol backbone order parameters from PE (left) and PG (right) lipid bilayers from experiments and different simulation models. For literature references, see the manuscript draft.

Cation binding to lipid bilayers containing PE and PG lipids

In simulations, sodium binding to PE lipid bilayers seems to be slightly weaker than in the corresponding simulations of PC lipids, but experimental data is not available. For calcium binding to PE lipids, we have neither experimental nor simulation data yet.

As discussed in the NMRlipids IV project, counterion binding to negatively charged lipid bilayers is complicated to assess against experimental data. However, the results from PC:PG mixtures from CHARMM36 and Slipids simulations indicate that sodium ions could be slightly overbinding to PG lipids, similarly to PS lipids in the NMRlipids IV project. For calcium binding to PG-containing lipid bilayers, we have data only from CHARMM36 simulations and the conclusions are not yet fully clear.

ToDo:

Analyze united atom simulations of PE lipids (GROMOS, CHARMM36ua, OPLS-UA, Berger, and GROMOS-CKP) listed in Table I in the manuscript with or without the new code for united atom data.
Analyze structural differences between POPC, POPE, POPG and POPS headgroups from CHARMM36 simulations. The most reasonable first step would probably be to follow an analysis similar to the one done for PS lipids in the NMRlipids IV project.
Simulation data for PG lipids with different force fields would be useful. Currently, we have only CHARMM36 and Slipids.
Simulation data from POPC:POPG (1:1) mixtures with sodium counterions and different CaCl₂ concentrations (0-1M) from different force fields would be useful. Currently, we have results only from CHARMM36.

37 comments:

Ángel Piñeiro, May 28, 2019 at 3:24 PM
I have just uploaded a Python script to determine the OP of UA/AA force fields to the scratch folder. It is also possible to ignore the explicit H atoms in AA force fields by using the option "-a ignH". The minimum input is a pdb file and the itp file of the molecule for which you want to get the OP (your lipid molecule). Of course, you can also specify an xtc trajectory in case you want the average OP over it with the standard deviations. The script does not require any specific input file with the list of C atoms but reads them directly from the itp file.
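
To illustrate the idea of reading the carbon list directly from the itp file, here is a minimal sketch that is not taken from the script mentioned above. It assumes the usual GROMACS `[ atoms ]` column order (nr, type, resnr, residue, atom, ...) and uses a hypothetical heuristic that carbons are the atoms whose name starts with "C". The sample topology is made up.

```python
SAMPLE_ITP = """\
[ moleculetype ]
; name  nrexcl
POPC    3

[ atoms ]
; nr  type  resnr  residue  atom  cgnr  charge  mass
1     CH3   1      POPC     C1    1     0.40    15.035
2     NL    1      POPC     N4    1    -0.50    14.0067
3     CH2   1      POPC     C5    1     0.30    14.027

[ bonds ]
1  2
"""

def carbon_names_from_itp(text):
    """Return atom names starting with 'C' from the [ atoms ] section."""
    names, section = [], None
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop comments
        if not line:
            continue
        if line.startswith("["):
            section = line.strip("[] ").lower()
            continue
        if section == "atoms":
            atom_name = line.split()[4]      # fifth column: atom name
            if atom_name.upper().startswith("C"):
                names.append(atom_name)
    return names

print(carbon_names_from_itp(SAMPLE_ITP))  # ['C1', 'C5']
```

The name-based heuristic would need refinement for topologies in which non-carbon atom names also start with "C".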
Ángel Piñeiro, June 15, 2019 at 2:50 PM
Thank you for the comment, Josef. We used your script to validate ours, and the results are identical for the OP values of explicit H atoms. The only difference is in the determination of the STEM values. As you say, you do the calculation over lipids and then over frames, while we get the STEM of all lipids over all frames together. This is why I said that our calculation of the STEM is a bit more conservative: our STEM values are a bit larger than yours. For a single frame our results are identical for OP, STD and STEM.

Regarding the OP of non-explicit H atoms, we have not compared our results to those obtained from the awk scripts mentioned by Samuli. As I explained above, we validated the reconstruction of H atoms and, once we had the coordinates, we validated the determination of OP for explicit atoms at a quantitative level using your script.
Patrick Fuchs, June 17, 2019 at 8:47 PM
Hello all, sorry for my late answer, I have little time now. I'll try to give a first answer to the different questions. Since my message is too big, I cut it in two.

(Message 1/2)
@Samuli
1) OP of the Berger POPC simulation: I was unsure what to analyze since there are 2 trr files. I have tried with the first one (0-25 ns). After a quick look (no real quantification), I generally see differences in the 3rd digit (more rarely in the 2nd digit). For the two beta OPs my script finds 0.03634 and 0.06587. Could you please confirm which time window I should analyze? Then I can make a more quantitative estimate of the difference.

4) My program needs a file (for now let's call it dic_lipids.py) containing this:
POPC = { "atom_1": ("typeofH2build", "helper1", "helper2"),
"atom_2": ("typeofH2build", "helper1", "helper2"),
...
"atom_n": ("typeofH2build", "helper1", "helper2", "helper3"),
...}
Where "typeofH2build" is in ("CH3", "CH2", "CH", "CHdoublebond"), and "helper1" to "helper3" are the names of helper atoms that are needed to build the H(s). For example, on a saturated chain, if we have C10, C11 and C12 that follow each other, and if C11 is a CH2, we would have: "C11": ("CH2", "C10", "C12"). I also use (for now) the same type of file as the one needed by Josef's script (found in https://github.com/NMRLipids/MATCH/blob/master/scripts/orderParm_defs/) so that I can have the same output as Josef, which makes comparison easy. Writing such files is tedious, and we have to do it for each type of lipid *and* force field. However, as I said above, I'm developing another script that should be able to build dic_lipids.py automatically for any force field.
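
To make the role of the helper atoms concrete, here is a minimal sketch of how the two hydrogens of a CH2 could be placed from the carbon and its two helpers. It is a generic geometric construction and not necessarily the one used in the program described here. The C-H bond length (1.09 Å), the H-C-H angle (109.47°), the function names and the coordinates are all assumptions for illustration.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(v, s):
    return tuple(x * s for x in v)

def unit(v):
    length = math.sqrt(sum(x * x for x in v))
    return scale(v, 1.0 / length)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def build_ch2(carbon, helper1, helper2, bond=1.09, hch_deg=109.47):
    """Return the two H positions of a CH2 bonded to helper1 and helper2."""
    u1 = unit(sub(helper1, carbon))
    u2 = unit(sub(helper2, carbon))
    bisector = unit(scale(add(u1, u2), -1.0))  # points away from both helpers
    normal = unit(cross(u1, u2))               # perpendicular to the helper plane
    half = math.radians(hch_deg) / 2.0
    in_plane = scale(bisector, bond * math.cos(half))
    out_of_plane = scale(normal, bond * math.sin(half))
    return (add(carbon, add(in_plane, out_of_plane)),
            add(carbon, sub(in_plane, out_of_plane)))

# Entry in the format described above; coordinates (in Å) are made up.
POPC = {"C11": ("CH2", "C10", "C12")}
coords = {"C10": (0.0, 0.0, 0.0), "C11": (1.25, 0.0, 0.88), "C12": (2.5, 0.0, 0.0)}

for name, (kind, *helpers) in POPC.items():
    if kind == "CH2":
        h1, h2 = build_ch2(coords[name], *(coords[h] for h in helpers))
        print(name, h1, h2)
```

The construction fails if the carbon and its two helpers are collinear, so a real implementation would need to guard against that case. CH3, CH and double-bond hydrogens need their own rules.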

The script is not ready yet, but I can explain the idea. First I build a graph of one lipid molecule, say a POPC, with the nodes having the names you use in your mapping files (e.g. M_G1_M, M_G1H1_M, etc.). Then I use a pdb with a single POPC (structure chosen from a given force field, with PDB atom names as indicated in itp or psf) and build another graph with the nodes having the name of each atom found in that pdb. Then I match the two graphs and get the correspondence between the generic names and the names in the pdb (as a Python dictionary).
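
The matching step can be sketched with a small backtracking search over two labelled graphs. This is only an illustration under stated assumptions and not the script being developed: the graph representation (`{name: (element, neighbours)}`), the function name, the four-atom fragment, the generic name M_G1O1_M and the PDB-style names on the right-hand side are all hypothetical.

```python
def match_graphs(generic, target):
    """Return {generic_name: target_name} for isomorphic graphs, else None.

    Each graph is {node: (element, set_of_neighbour_names)}.
    """
    if len(generic) != len(target):
        return None
    # Visit highly connected nodes first to prune the search early.
    order = sorted(generic, key=lambda n: -len(generic[n][1]))

    def extend(mapping, used):
        if len(mapping) == len(generic):
            return dict(mapping)
        node = next(n for n in order if n not in mapping)
        element, neighbours = generic[node]
        for cand, (c_element, c_neighbours) in target.items():
            if cand in used or c_element != element:
                continue
            if len(c_neighbours) != len(neighbours):
                continue
            # Already-mapped neighbours must stay neighbours in the target.
            if all(mapping[nb] in c_neighbours
                   for nb in neighbours if nb in mapping):
                mapping[node] = cand
                used.add(cand)
                result = extend(mapping, used)
                if result is not None:
                    return result
                del mapping[node]
                used.discard(cand)
        return None

    return extend({}, set())

# Toy fragment; all names below are illustrative only.
generic = {
    "M_G1_M": ("C", {"M_G1H1_M", "M_G2_M", "M_G1O1_M"}),
    "M_G1H1_M": ("H", {"M_G1_M"}),
    "M_G2_M": ("C", {"M_G1_M"}),
    "M_G1O1_M": ("O", {"M_G1_M"}),
}
target = {
    "C3": ("C", {"HX", "C2", "O31"}),
    "HX": ("H", {"C3"}),
    "C2": ("C", {"C3"}),
    "O31": ("O", {"C3"}),
}
print(match_graphs(generic, target))
```

Chemically equivalent atoms, such as the two hydrogens of a CH2, make the mapping non-unique. A search like this returns one valid assignment, so a real tool would need an extra rule to tell them apart.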

So what we need in the end is only a valid pdb (or gro file) with one lipid. With that script we should be able to build any of those files:
- generic to pdb atom names (such as https://github.com/NMRLipids/MATCH/tree/master/MAPPING)
- the one needed by my program (dic_lipids.py)
- the one needed by the script of Josef (https://github.com/NMRLipids/MATCH/tree/master/scripts/orderParm_defs)
One could argue that, at the beginning, we will need to build a file with the connectivity of a lipid using generic names. However, we have to do it only once for a given lipid (e.g. POPC), and building a new lipid will be quite trivial if the structure is not so different (e.g. building POPS or POPE from POPC). In the end, if we already know the structure (connectivity) of a lipid, e.g. POPC, we will only need a PDB file (no itp, tpr, psf, etc., needed).
Patrick Fuchs, June 17, 2019 at 8:47 PM
(Message 2/2)
@Angel & Josef:
So far my strategy for validation was the following:
i) I calculate the OP with my code from a UA trajectory.
ii) Using my code, I generate from a UA trajectory a new xtc traj *with* hydrogens. Then using this new traj, I run Josef's script and compare the output of both programs.
Based on this, I find very similar results (main absolute difference of 0.001). The tiny difference is due to the fact that when my code builds an H, it stores many digits for its coordinates. But when I write an xtc it's rounded to 3rd digit. If you think I should validate it some other way, please let me know.
Concerning the STEM, I'll implement it like Josef, so comparison will be easier.
Lastly, I also validated my H reconstruction against g_protonate. The RMSD of the Hs newly built with my program against those from g_protonate is below 0.01 Å.
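
The effect of the coordinate rounding mentioned above can be illustrated for a single C-H bond. This is a minimal sketch assuming coordinates in nm rounded to three decimals and a membrane normal along z; the coordinates and the function name are made up.

```python
import math

def s_ch(carbon, hydrogen):
    """S_CH = (3 cos^2(theta) - 1) / 2 for one bond, normal assumed along z."""
    bond = [h - c for c, h in zip(carbon, hydrogen)]
    length = math.sqrt(sum(x * x for x in bond))
    cos_theta = bond[2] / length
    return 0.5 * (3.0 * cos_theta ** 2 - 1.0)

# Made-up positions in nm; the bond is 0.109 nm long.
carbon = (1.234567, 2.345678, 3.456789)
hydrogen = (1.273807, 2.397998, 3.543989)

full = s_ch(carbon, hydrogen)
rounded = s_ch(tuple(round(x, 3) for x in carbon),
               tuple(round(x, 3) for x in hydrogen))
print(full, rounded, abs(full - rounded))
```

For one bond in one frame the rounding shifts the value in the third decimal here. Averaging over many lipids and frames is expected to reduce the effect further.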

Lastly, about making my two codes available:
- For the script calculating OP from a UA traj, I will do my best to upload a first version this week. Unfortunately, I have little time since I'm attending a conference in Berlin.
- For the other script I hope to upload something at the beginning of July.
Ángel Piñeiro, June 25, 2019 at 7:39 PM
As far as I can see, the three codes (Josef's, Patrick's and mine) provide exactly the same OP values once the hydrogens are rebuilt. There is a difference in the STEM values (my calculation is a bit more conservative than yours) but this can be corrected easily, if necessary. At the moment I do not see a reason to change my calculation. The only possible source of errors that I can imagine is in the rebuilding of hydrogens. This is trickier than the determination of OP. Since my code can ignore explicit H atoms (using the "-a ignH" option), I checked this for AA trajectories with explicit and rebuilt hydrogens. The results were not identical (as expected) but very similar. As explained before, I validated my rebuilding of H atoms by comparing the explicit hydrogens in different AA trajectories with those rebuilt by my code.

@Patrick, in case of doubts perhaps it is useful to compare the output of your code for UA with that coming from my code... although I understand that your code has already been well validated and it is robust. In the end I guess your code will be the best one since it will not need any input other than the trajectory.
Rebeca, June 26, 2019 at 4:19 PM
Hi,
here are the links to the MD trajectories we have uploaded to the repository, indicating the membrane composition of each one and the force field used. All of them were carried out using a concentration of 150 mM NaCl (we are now carrying out analogous MD simulations but using just the minimum amount of ions needed to neutralize the systems):

POPC-POPG 7:3

Gromos-CKP

https://zenodo.org/record/2582721#.XIjZg4XgrLA

SLIPID

https://zenodo.org/record/2581186#.XIjZhIXgrLA

CHARMM36

https://zenodo.org/record/2580902#.XIjZhYXgrLA

LIPID17

https://zenodo.org/record/2585523#.XIjZhoXgrLA

POPG-POPE 3:1

Gromos-CKP

https://zenodo.org/record/2580158#.XIjZh4XgrLA

SLIPID

https://zenodo.org/record/2579675#.XIjZiYXgrLA

CHARMM36

https://zenodo.org/record/2580153#.XIjZiIXgrLA

LIPID17

https://zenodo.org/record/2579344#.XIjZioXgrLA

POPG-POPE 1:3

Gromos-CKP

https://zenodo.org/record/2579063#.XIjZjoXgrLA

SLIPID

https://zenodo.org/record/2579224#.XIjZi4XgrLA

CHARMM36

https://zenodo.org/record/2579108#.XIjZjYXgrLA

LIPID17

https://zenodo.org/record/2579061#.XIjZj4XgrLA

POPE

Gromos-CKP

https://zenodo.org/record/2574491#.XIjZk4XgrLA

SLIPID

https://zenodo.org/record/2578069#.XIjZkYXgrLA

CHARMM36

https://zenodo.org/record/2577454#.XIjZlIXgrLA

LIPID17

https://zenodo.org/record/2577305#.XIjZlYXgrLA

POPG

Gromos-CKP

https://zenodo.org/record/2572897#.XIjZnYXgrLA

SLIPID

https://zenodo.org/record/2562853#.XIjZnIXgrLA

CHARMM36

https://zenodo.org/record/2573531#.XIjZm4XgrLA

LIPID17

https://zenodo.org/record/2573905#.XIjZmoXgrLA

POPC

Gromos-CKP

https://zenodo.org/record/2574691#.XIjZloXgrLA

SLIPID

https://zenodo.org/record/2574689#.XIjZkoXgrLA

CHARMM36

https://zenodo.org/record/2575587#.XIjZlYXgrLA

LIPID17

https://zenodo.org/record/2574959#.XIjZl4XgrLA

Ángel Piñeiro, July 8, 2019 at 7:30 PM
Sorry for taking so much time. I have done a comparison of the OP using the explicit and rebuilt H atoms for Charmm36 POPG and for slipids POPC using my code. I took two of the trajectories recently uploaded by Antonio Peón and Rebeca García (see the list just above). In order to directly see the impact of the H-rebuilding I determined the OP for a single lipid, for a single frame and also the average OP over 20 ns. You can see the results here: http://smmb.usc.es/docs/OPcomparisonAAvsUA.pdf. I have also generated a pdb file of POPC with the rebuilt H atoms of the lipid head overlaid on the original explicit H atoms (http://smmb.usc.es/docs/POPC_rebuiltHs.pdb). For the rebuilt ones I used X as the atom name in the pdb, to facilitate the selection and comparison with a molecular viewer (note that the bond distance for my rebuilt H atoms is exactly 1 Å, since this is irrelevant for the OP calculation). I am commenting here just on the OP for the lipid heads since the results for the tails were almost indistinguishable.

Some comments:
As you can see in the first column of the plots (http://smmb.usc.es/docs/OPcomparisonAAvsUA.pdf) the impact of my rebuilt H atoms in the OP is significant for some positions and almost negligible for others. G1-G3 seem to be more sensitive than alpha and beta.
If the differences are just random noise, they are expected to be diluted when averaging over many lipids. Actually this is visible in the second and third columns (average over a single frame with 500 lipid molecules and average over 20 ns for the same system).
The results for charmm-POPG are reasonably good for all the atoms, but for slipids-POPC the difference between explicit and rebuilt H atoms is significant for G3S and for G2R. The reason for the differences is not the method employed for the calculation, since I used the same code. I do not know if the rest of the methods behave similarly. I will try to repeat the same analysis for other trajectories (lipids and force fields).
Antonio PEÓN, July 17, 2019 at 2:03 PM
Hello, I have compared the head order parameters for the different all-atom force fields (CHARMM36, SLIPID) using explicit and rebuilt H atoms. The analysis was performed over the last 100 ns of the trajectories (500 ns long with 500 lipids) with POPE, POPC and POPG bilayers in 150 mM NaCl. The OPs for explicit Hs (dots) and rebuilt Hs (lines, calculated using Ángel's code) are very similar (within the statistical uncertainties). The biggest difference is for the g2 hydrogens, specifically in SLIPID. Link: https://drive.google.com/open?id=10tT0YLT4OHajEbsLNDP1XBLm71sKXtoO
Patrick Fuchs, August 14, 2019 at 9:19 PM
Hello all, I'm very sorry for my late answer but it took me longer than expected. Especially because I wanted to make a thorough comparison / validation of my program for rebuilding hydrogens / calculating OPs vs the programs from Josef and Angel. And also because I wanted to add docstrings and an understandable documentation.

So my program is called "buildH" and the repository is on GitHub: https://github.com/patrickfuchs/buildH. I didn't upload any script to the nmrlipids GitHub repo for now; since more than one file is required, I thought it would be easier like that. If you use it, please do a regular "git pull", since we will work on it in August / September. There are still many little things to do, but the core of the program is working.
Importantly, I made a thorough validation of buildH, which is also on GitHub (https://github.com/patrickfuchs/buildH/tree/master/CHARMM36_POPC_validation, see report_buildH.ipynb or report_buildH.pdf). For easier comparison, buildH outputs the OPs in the format of the program from Josef *and* that from Angel. For the validation, I used the same strategy as in the paper from Thomas Piggot: starting from an all-atom (POPC CHARMM36) traj, I remove the Hs, reconstruct them, calculate the OPs and compare them to those of the initial traj with Hs (OPs calculated with Josef's prog). You will see in the report that both Angel's program and buildH agree very well (except for a few details). I also put there some links to output files with numbers in case they are useful. Don't hesitate if you have questions / comments.

For the other script building automatically the required files for buildH, I didn't have time to complete it.

In the near future, we plan to add some improvements to buildH (such as multithreading, unit tests, automatic topology detection using the other script, etc).
Rebeca, August 17, 2019 at 6:08 PM
Hi,
In my opinion, the difference in the order parameters calculated with the script from Ángel Piñeiro for some of the hydrogens on carbons g2 and g3 in CHARMM and SLIPIDs is noticeable (see the plots uploaded by Antonio Peón: https://drive.google.com/open?id=10tT0YLT4OHajEbsLNDP1XBLm71sKXtoO). It does not seem to be a problem with the reconstruction of implicit hydrogens done by the script, since the script's reconstruction matches the explicit hydrogens much better in other force fields (CHARMM, in particular). I think it is a more general problem that originates in the different torsion angles considered for the dihedrals involving g2 and g3 in the two force fields (CHARMM and SLIPIDS). This difference was also observed in your previous manuscript (Figure 2 in https://pubs.acs.org/doi/pdf/10.1021/acs.jpcb.5b04878?rand=p3onqvgq). There you can see that Slipids is again the force field that agrees worst with the experimental order parameters for DPPC and POPC for these carbons.
Curiously, if you go to the original paper of Slipids (https://pubs.acs.org/doi/pdf/10.1021/jp212503e?rand=mo3qdgjs), they say they take the parameters for all covalent bonds and angles, LJ and torsional parameters for the lipid head group from the original CHARMM36 (C36) FF (https://pubs.acs.org/doi/pdf/10.1021/jp101759q?rand=19u4avp3), BUT they also say they derived new parameters “because of the known difficulties of reproducing the vdW dispersion interaction by ab initio methods, experimental heats of vaporization and densities were used during the fitting of the LJ parameters. First, the new charges were used with the original parameters from C36 (LJ and torsional) and the LJ parameters were then altered to agree with experiments experimental results. After this, the torsional potentials were fitted from ab initio computations for the model compound. After one round in the parametrization scheme, it was necessary to refit the LJ parameters again and the torsional potentials until self-consistency was obtained.” So, it seems that people from Slipids “manually” changed some torsion parameters, affecting the order parameters for atoms g2 and g3, as our results (and previous results) are showing.
As a conclusion, I would say that the inconsistencies between OPs coming from explicit and rebuilt H atoms are due to poor geometric optimization of the explicit H atoms in the target force field.
Antonio PEÓN, August 23, 2019 at 12:52 PM
I ran a similar order parameter analysis for the tails as for the heads, for the different all-atom force fields (CHARMM36, SLIPID), using explicit and rebuilt H atoms (implicit, Ángel’s script). For both hydrogens of each tail carbon, the results from the explicit hydrogens are very similar to those from the implicit hydrogens.
https://drive.google.com/file/d/1viLnB9UAIN8Kpo7LKvOxWbiBG512TfjP/view?usp=sharing
Ángel Piñeiro, September 26, 2019 at 10:43 PM
Dear Samuli, Patrick, Josef and any other interested or involved in the calculation of order parameters. I would like to know your opinion, and to reach an agreement, on a couple of points:

It seems that there are some differences in the order parameter values depending on the calculation method and on the force field. Rebeca found that the force fields for which the OP values determined from explicit and from rebuilt H atoms (based on optimized geometry) differ significantly are the same ones for which the simulated OP values deviate most from the experimental values. The comparison between OP values obtained from explicit and rebuilt H atoms is useful for validating the scripts that determine OP values, but I think we should accept the relatively small differences observed between both methods, since they seem to come from the force field rather than from errors in the scripts. Do you agree?

On the uncertainties for the OP values obtained for a trajectory: in my script I take the OP values for a given C atom over frames and lipids all together; then the average value, the standard deviation of the sample (STD), and finally the standard deviation of the mean (STD/sqrt(len(OP))) are determined. As far as I understand, the uncertainty of the final OP value with a confidence level of ~68% should be directly 2*STDmean. This value is typically very small (ridiculously small) for a reasonable trajectory (several thousand frames and hundreds of lipids). In contrast, the STD of the sample, determined as I did, is quite large. The other two scripts do the calculation in a different way: the average of the OP for each C atom over all the lipids within each frame is determined first, and then the average of these per-frame values over all the frames and the corresponding STD are provided. A much smaller value is obtained for the standard deviation of the sample in this way, but I do not know how this is justified. I guess I am missing something basic…
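
The two conventions described above can be compared on synthetic data. This is a minimal sketch and not the code of any of the three scripts: the function names, the `op[frame][lipid]` layout and the Gaussian test data are assumptions for illustration.

```python
import math
import random
import statistics

def pooled_stats(op):
    """op[frame][lipid] -> (mean, STD, STEM), pooling all frames and lipids."""
    values = [v for frame in op for v in frame]
    std = statistics.stdev(values)
    return statistics.fmean(values), std, std / math.sqrt(len(values))

def framewise_stats(op):
    """Average over lipids in each frame, then take statistics of frame means."""
    frame_means = [statistics.fmean(frame) for frame in op]
    std = statistics.stdev(frame_means)
    return statistics.fmean(frame_means), std, std / math.sqrt(len(frame_means))

# Synthetic, uncorrelated data: 50 frames x 100 lipids (illustration only).
random.seed(0)
op = [[random.gauss(-0.2, 0.3) for _ in range(100)] for _ in range(50)]

print("pooled   :", pooled_stats(op))
print("framewise:", framewise_stats(op))
```

With the same number of lipids in every frame the two means coincide. The frame-wise STD is smaller by roughly the square root of the number of lipids, because it measures the spread of per-frame averages rather than of individual C-H bonds. For independent, normally distributed samples, ±1 standard error of the mean corresponds to roughly 68% confidence and ±2 to roughly 95%. Successive frames of a real trajectory are correlated, so both estimates should be treated with care.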