{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n# Quantum heat capacity of water\n\n:Authors:\n Filippo Bigi [@frostedoyster](https://github.com/frostedoyster/);\n Michele Ceriotti [@ceriottm](https://github.com/ceriottm/)\n\nThis example shows how to estimate the heat capacity of liquid water\nfrom a path integral molecular dynamics simulation. The dynamics are\nrun with [i-PI](http://ipi-code.org), and\n[LAMMPS](http://lammps.org) is used\nas the driver to simulate the [q-TIP4P/f water\nmodel](http://doi.org/10.1063/1.3167790).\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import os\nimport subprocess\nimport time\nimport xml.etree.ElementTree as ET\n\nimport ipi\nimport matplotlib.pyplot as plt\nimport numpy as np"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## A non-trivial energy estimator\n\nAs introduced in the\n[path-integrals example](http://lab-cosmo.github.io/atomistic-cookbook/latest/examples/path-integrals),\npath-integral estimators\nfor observables that depend on momenta are generally not trivial to compute.\n\nIn this example, we will focus on the constant-volume heat capacity,\n$c_V$, which is one such\nobservable, and we will calculate it for liquid water at room temperature.\nBecause of the presence of high-frequency vibrations, many of the nuclear\ndegrees of freedom are trapped in the vibrational ground state, which reduces\nsubstantially the heat capacity from the value that would be obtained\nin the classical limit. See [this review](http://doi.org/10.1021/acs.chemrev.5b00674)\nfor an overview of the impact of quantum nuclei on the properties of water.\nFrom a computational perspective, this means that it is necessary to use\nspecialized simulations and estimators to evaluate the correct value.\n\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Running the PIMD calculation\n\nThis follows the same steps as the ``path-integrals`` example. One important\ndifference is that we will request the ``scaledcoords`` output to the relevant\nsection of the ``i-PI`` input XML file, which\ncontains estimators that can be used to calculate the total energy and\nheat capacity as following\n[Yamamoto, J. Chem. Phys. (2005)](https://arxiv.org/abs/physics/0505109).\n\nThe input file is shown below. It should be noted that ``scaledcoords``\nis given a finite differences displacement as a parameter. This is necessary\nas the estimators require higher order derivatives of the potential energy,\nwhich are calculated using finite differences. This also means that\nevaluating the estimator adds substantial overhead (so it is wise to only\ncompute it every few simulation steps, to eliminate correlations between\nsnapshots) and that one should be careful to use well-converged simulation\nparameters to avoid discontinuities and noise (for instance, we increase\nthe accuracy of the particle-mesh electrostatic calculation, and use a\nshifted Lennard-Jones potential to avoid a discontinuity at the cutoff).\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Open and show the relevant part of the input\nxmlroot = ET.parse(\"data/input.xml\").getroot()\nprint(\" \" + ET.tostring(xmlroot.find(\".//properties\"), encoding=\"unicode\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We launch the i-PI and LAMMPS processes, exactly as in the\n``path-integrals`` example.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# don't rerun if the outputs already exist\nipi_process = None\nif not os.path.exists(\"water-cv.out\"):\n ipi_process = subprocess.Popen([\"i-pi\", \"data/input.xml\"])\n time.sleep(2) # wait for i-PI to start\n lmp_process = [subprocess.Popen([\"lmp\", \"-in\", \"data/in.lmp\"]) for i in range(2)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Skip this cell if you want to run in the background\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"if ipi_process is not None:\n ipi_process.wait()\n lmp_process[0].wait()\n lmp_process[1].wait()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Analyzing the results\nLet's plot the potential and conserved energy as a function of time,\njust to check that the simulation ran sensibly.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"output_data, output_desc = ipi.read_output(\"water-cv.out\")\nfix, ax = plt.subplots(1, 1, figsize=(4, 3), constrained_layout=True)\nax.plot(\n output_data[\"time\"],\n output_data[\"potential\"] - output_data[\"potential\"][0],\n \"b-\",\n label=\"Potential, $V$\",\n)\nax.plot(\n output_data[\"time\"],\n output_data[\"conserved\"] - output_data[\"conserved\"][0],\n \"r-\",\n label=\"Conserved, $H$\",\n)\nax.set_xlabel(r\"$t$ / ps\")\nax.set_ylabel(r\"energy / a.u.\")\nax.legend()\nplt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As described in the [i-PI documentation](https://ipi-code.org/i-pi/output-tags.html),\nthe two quantities returned by the ``scaledcoords`` output are ``eps_v``\nand ``eps_v'``, defined in the aforementioned\n[paper](https://arxiv.org/abs/physics/0505109).\n\nThese estimators ($\\epsilon_v$ and $\\epsilon_v'$) are derived in the\n\"scaled coordinates\" formalism, which is a useful trick to avoid the\ngrowth of the error in the instantaneous values of the estimators with\nthe number of beads used in the path integral simulation.\n\nThe same paper contains the formulas to calculate the total energy and\nheat capacity from these estimators:\n\n\\begin{align}E = \\langle \\epsilon_v \\rangle \\quad\n C_V = k_B \\beta^2 \\left( \\langle \\epsilon_v^2 \\rangle - \\langle\n \\epsilon_v \\rangle^2 - \\langle \\epsilon_v' \\rangle \\right)\\end{align}\n\nFirst the energy, whose estimator will be compared to the total energy\ncalculated as the sum of the potential and kinetic energy estimators.\nSince the kinetic energy is itself calculated from a scaled-coordinates\nestimator (the \"centroid virial\" estimator), the two total energies are\nthe same.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"eps_v = output_data[\"scaledcoords(fd_delta=5e-3)\"][:, 0]\neps_v_prime = output_data[\"scaledcoords(fd_delta=5e-3)\"][:, 1]\n\nenergy_estimator = eps_v # first formula\n\nfix, ax = plt.subplots(1, 1, figsize=(4, 3), constrained_layout=True)\nax.plot(\n output_data[\"time\"],\n energy_estimator - energy_estimator[0],\n \"b-\",\n label=\"scaled coordinates estimator\",\n)\nax.plot(\n output_data[\"time\"][:],\n (\n output_data[\"potential\"]\n - output_data[\"potential\"][0]\n + output_data[\"kinetic_cv\"]\n - output_data[\"kinetic_cv\"][0]\n ),\n \"r.\",\n label=\"potential + virial kinetic\",\n)\nax.set_xlabel(r\"$t$ / ps\")\nax.set_ylabel(r\"total energy / a.u.\")\nax.legend()\nplt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And, finally, the heat capacity. Note that normally the simulation\nrequires a few ps for equilibration. Here we discard a few dozen steps\nto eliminate the initial jump, which is due to the relaxation of the\nring polymers starting from a single atomic configuration.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# i-PI scaledcoords outputs are in atomic units (see docs)\nkB = 3.16681e-6 # Boltzmann constant in atomic units\nT = 298.0 # temperature in K, as defined in the input file\nbeta = 1.0 / (kB * T)\n\nskip = 20\nheat_capacity = ( # second formula\n kB\n * (beta**2)\n * (\n np.mean(eps_v[skip:] ** 2)\n - np.mean(eps_v[skip:]) ** 2\n - np.mean(eps_v_prime[skip:])\n )\n)\nheat_capacity_per_molecule = heat_capacity / 32 # 32 molecules in the simulation\nprint(f\"Heat capacity (per water molecule): {(heat_capacity_per_molecule/kB):.2f} kB\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You may recognize that the first part of the estimator is reminiscent\nof the classical estimator for the heat capacity as the fluctuations of the\n(quantum) total energy, which in this case however requires a correction given\nby the mean of the second part of the scaled-coordinates estimator.\n\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Estimating the statistical error\n\nEspecially with such an underconverged simulation, it is important to\nestimate the statistical error in the heat capacity.\n\nGenerally, errors on measurements are computed\nas \"standard errors\", i.e. the standard deviation of a series of data points\ndivided by the square root of the number of data points. In our case,\nhowever, this is made more complicated by the correlation between\nclose steps in the molecular dynamics trajectory, which would lead to an\noverestimation of the number of independent samples. To fix this, we can\ncalculate the autocorrelation time of the estimators whose errors we\nwant to estimate, and apply a correction factor to the number of samples.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def autocorrelate(x):\n n = len(x)\n xo = x - x.mean() # remove mean\n acov = (np.correlate(xo, xo, \"full\"))[n - 1 :]\n return acov[: len(acov) // 2]\n\n\ndef autocorrelation_time(x):\n acov = autocorrelate(x)\n return 1.0 + np.sum(acov) / acov[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using these helper functions, we can now calculate the error on the various\nparts of the heat capacity estimator. Note also the autocorrelation times, that\nare just a little larger than one, indicating that the stride used to print out\nthe estimators is appropriate (as there is little correlation between the samples).\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Autocorrelation times (i.e. number of steps needed to have independent samples)\nautocorr_time_error_delta_eps_v = autocorrelation_time(\n (eps_v[skip:] - eps_v[skip:].mean()) ** 2\n)\nautocorr_time_error_eps_v_prime = autocorrelation_time(eps_v_prime[skip:])\n\nprint(\n f\"\"\"\nAutocorrelation times (in number of samples):\n(eps-)^2: {autocorr_time_error_delta_eps_v:.2f}\neps': {autocorr_time_error_eps_v_prime:.2f}\n\"\"\"\n)\n\n# Effective number of samples\neffective_samples_delta_eps_v = len(eps_v[skip:]) / autocorr_time_error_delta_eps_v\neffective_samples_eps_v_prime = len(eps_v[skip:]) / autocorr_time_error_eps_v_prime\n\n# Standard errors using the effective number of samples\nerror_delta_eps_v = np.std((eps_v[skip:] - eps_v[skip:].mean()) ** 2) / np.sqrt(\n effective_samples_delta_eps_v\n)\nerror_eps_v_prime = np.std(eps_v_prime[skip:]) / np.sqrt(effective_samples_eps_v_prime)\n\n# Error on the heat capacity (assuming quadrature sum)\nerror_heat_capacity = (\n kB * (beta**2) * np.sqrt(error_delta_eps_v**2 + error_eps_v_prime**2)\n)\n\nerror_heat_capacity_per_molecule = (\n error_heat_capacity / 32\n) # 32 molecules in the simulation\n\nprint(\n \"Error on the heat capacity (per water molecule): \"\n f\"{(error_heat_capacity_per_molecule/kB):.2f} kB\"\n)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The obtained heat capacity is consistent with the values from the literature\n(see e.g. [Ceriotti et al., J. Chem. Phys. (2011)](http://doi.org/10.1063/1.3556661)\nwhere the convergence of the heat capacity with number of beads is shown for the same\nwater model used in this example).\nHowever, the error is quite large, which is expected given the short simulation time.\nTo reduce the error, one would need to run a longer simulation. Other important error\nsources, which are not accounted for in the error estimate, are the finite size of the\nsystem and number of beads. Both of these are too small in this example to give\nreliable results.\n\nIn a realistic simulation, up to a few 100s of picoseconds might be needed to reduce\nthe sampling error to a small value (1-2% of the heat capacity). For water at room\ntemperature, you will need 32 beads at the very least (8 were used in this example).\nIt is more difficult to give a general rule for the system size: (quantum) energy\nfluctuations are usually localized, but to guarantee accurate sampling of the\nliquid structure, a few hundred water molecules would be a reasonable guess\n(32 were used in this example).\n\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.10"
}
},
"nbformat": 4,
"nbformat_minor": 0
}