Getting Started With Molecular Dynamics Simulation


Molecular Dynamics (MD) simulation is one of the most effective computational tools, with numerous applications in physics, chemistry, biochemistry, and materials science. However, employing MD properly without understanding its theoretical background is arduous, if not impossible. This article explains some of the most essential molecular dynamics simulation concepts in a nutshell and without mind-boggling mathematical equations. If you want to begin learning molecular dynamics simulation or improve your knowledge of its underlying concepts, then this article is for you.


What Is Molecular Dynamics Simulation?

I would concisely define molecular dynamics simulation as the approximate, numerical, step-by-step solution of the Newtonian equations of motion for a many-body molecular system over a specified time span.

Now, let me explain it more clearly. Imagine a system of many particles—a particle can be an atom, a group of atoms, or a molecule—with known initial positions. According to Newton’s second law of motion, if we have a way to calculate the forces acting on each particle, then we can predict the particles’ positions after a given time interval, called a time step. We can then recalculate the forces at the new positions and advance to the next step, and so forth. Every time we calculate the forces and predict a new set of positions, we complete one step of the simulation, and we can carry on this step-by-step procedure for a pre-specified time. Because molecular systems typically consist of numerous particles, it is impossible to solve the problem analytically. Therefore, the MD method tackles the problem numerically.

Each molecular dynamics simulation step has two parts: calculating the forces acting on the particles and integrating (solving) the equation of motion. The coordinates of particles calculated in each step alongside their velocities and forces specify a point in phase space. At the end of the simulation, we have a sequence of points in the phase space called a trajectory. Molecular dynamics simulations are approximate in two senses. First, they calculate the forces approximately. Second, their integration algorithms solve the equations of motion approximately.
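The two-part step described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular MD package's API: `force_fn` stands in for whatever force calculation the force field provides, and the integrator shown (velocity Verlet, discussed later in this article) is one common choice among several.

```python
import numpy as np

def run_md(pos, vel, masses, force_fn, dt, n_steps):
    """Minimal MD loop: each step computes forces, integrates Newton's
    equations (velocity-Verlet form), and records a phase-space point."""
    trajectory = []
    forces = force_fn(pos)
    for _ in range(n_steps):
        vel = vel + 0.5 * dt * forces / masses[:, None]   # half kick
        pos = pos + dt * vel                              # drift
        forces = force_fn(pos)                            # recompute forces at new positions
        vel = vel + 0.5 * dt * forces / masses[:, None]   # half kick
        trajectory.append((pos.copy(), vel.copy()))       # one point of the trajectory
    return trajectory
```

The returned list of (positions, velocities) pairs is exactly the trajectory mentioned above: a sequence of points in phase space.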

Why do we use molecular dynamics simulation?

Imagine a small, rigid molecule; such a molecule has a potential energy surface (PES) with a few deep wells. The system will likely stay at the bottom of one of those wells. The geometry of the bottom of the well properly represents the molecule’s geometry and can be found by a relatively simple geometry optimisation. We also have analytical results from statistical mechanics to estimate the thermodynamic properties of such systems from the optimised structure’s electronic energy and vibrational frequencies. Now imagine a large, floppy molecule or a system consisting of many molecules. This system has a vast PES with numerous shallow wells and doesn’t stay in any of them for long. Now, how can we determine the state of the system? What geometry represents the geometry of the system? How can we estimate its thermodynamic properties?

In situations like these, none of the many shallow wells can accurately represent the system’s state, and the actual state of the system fluctuates between them. The most effective approach to dealing with such systems is to take an average over a reasonable region of phase space. For this very reason, we use molecular dynamics simulation. Strictly speaking, molecular dynamics simulation takes a time average rather than an ensemble (phase-space) average. Still, according to ergodic theory, the two averages are almost equal for an ergodic system.

Shallow vs. deep potential energy surface
Figure 1: deep vs shallow potential energy surface (PES). The image depicts a two-dimensional PES for simplicity; the actual PES lives in an n-dimensional space, where n is the number of internal coordinates plus one.

Another significant functionality of molecular dynamics simulations is studying the time evolution of an atomistic system. In many cases, we are not interested in finding the system’s equilibrium properties; instead, the path to reaching equilibrium is important to us. Sometimes, we need to determine if a particular molecular process is feasible. For example, consider a situation where we study the permeation of a molecule through a membrane. Here, we can devise a system where the membrane is exposed to a specific set of molecules and then conduct a molecular dynamics simulation. If the simulation time is sufficiently long, the system’s time evolution can determine the membrane’s permeability.

In summary, we use molecular dynamics simulation to determine the equilibrium properties of systems with extensive, shallow potential energy surfaces, such as protein-ligand complexes, and to study the time evolution of molecular systems.

What are force fields in molecular dynamics simulation?

As I mentioned earlier, molecular dynamics simulation basically solves the Newtonian equations of motion. According to Newtonian physics, we can calculate the force on each particle from the derivative of the potential energy, $\vec{F}_i = -\nabla_i V$. The energy of an atomistic system is a function of its electronic structure and is called the electronic energy. Thus, we need to determine the electronic energy first. In classical molecular dynamics simulation, the electronic energy is treated as a parametric function of the nuclear coordinates rather than the result of solving the laborious Schrödinger equation. The parameters can be fit to match experimental data or high-level ab initio pre-computed data.

A force field is a combination of certain parametric functions and their parameters, defining the interactions between the system’s particles. The force field functions consist of bonded, non-bonded, and cross terms, each describing the energy required to distort the molecular system in a specific way (figure 2).
The bonded terms are the stretch, bending, and torsional terms, and the non-bonded terms are the van der Waals and electrostatic terms. Finally, the cross terms describe the coupling between the bonded terms. Different force fields utilize similar functional forms for each term; for example, many force fields use the Morse potential as the stretch function or the Lennard-Jones potential as the van der Waals function, but each has a unique set of parameters.
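As an illustration of a single force field term, here is the Lennard-Jones van der Waals function mentioned above, evaluated together with the force it implies (force = minus the derivative of the potential). This is a sketch in reduced units; the parameter defaults are illustrative, not taken from any specific force field.

```python
import numpy as np

def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy V(r) = 4*eps*[(sigma/r)^12 - (sigma/r)^6]
    and the corresponding radial force F(r) = -dV/dr."""
    sr6 = (sigma / r) ** 6
    energy = 4.0 * epsilon * (sr6**2 - sr6)
    force = 24.0 * epsilon * (2.0 * sr6**2 - sr6) / r   # -dV/dr
    return energy, force
```

At the well minimum, r = 2^(1/6)·σ, the force vanishes and the energy equals −ε, which is a handy sanity check for any implementation.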

bonded and non-bonded interactions in molecular dynamics force field.
Figure 2: bonded and non-bonded interactions

Classification of Molecular Dynamics Force Fields

Many different force fields have been designed for various purposes. Force fields can be categorized into five main groups: fixed-charge atomistic force fields, polarizable force fields, coarse-grained force fields, reactive force fields, and machine-learning force fields.

Fixed-Charge Atomistic Force Fields

Most of the well-known force fields like AMBER 1, CHARMM 2, GROMOS 3, and OPLS 4 fit in this category. “The empirical functional form of classical fixed-charge force fields dates back to 1969 and remains essentially unchanged. In a fixed-charge force field, the polarization is not modelled explicitly, i.e. the effective partial charges do not change depending on conformation and environment.” 5 This simplification causes a dramatic reduction in computational cost compared to polarizable force fields. Nonetheless, the simulations employing carefully parametrized fixed-charge force fields can provide useful insights into biological and chemical questions.

Polarizable force fields

“Standard force fields describe electrostatic interactions in terms of fixed, usually atom centred, charges. However, real physical systems polarize substantially when placed in a high-dielectric medium such as water — or even when a strongly charged system approaches a neutral body in the gas phase. Such polarization strongly affects the geometry and energetics of molecular recognition.” 6 Polarisable force fields solve this problem by allowing the charge distribution to change with the dielectric environment. GEM 7, ORIENT 8, PFF 9, and PIPF 10 are some examples of polarizable force fields; there are also polarizable versions of the AMBER 11 and CHARMM 12 force fields.

Coarse-grained force fields

Although molecular dynamics simulation with atomistic force fields is a highly efficient tool for studying chemical and biological processes, its computational cost is still prohibitive for many essential phenomena like protein folding, ion channel gating, and membrane remodelling 13. Coarse-grained force fields reduce this cost by lowering the model system’s complexity: groups of atoms are lumped together into virtual particles.

Reducing the number of particles reduces the number of interactions to calculate, allowing us to simulate larger systems for longer times at the price of sacrificing chemical detail. MARTINI, a modern coarse-grained force field, can deliver a speed-up of 2–3 orders of magnitude compared to atomistic simulations. On the other hand, the range of applicability of coarse-grained force fields is more limited than that of atomistic force fields: a coarse-grained force field parameterized for specific molecules and phenomena is more likely to fail for other molecules and phenomena than its atomistic counterparts 13.

Some of the most popular Coarse-grained force fields are MARTINI 14, SIRAH 15, VAMM 16, and DPD 17.

Reactive force fields

Classical atomistic force fields are well suited for nonreactive interactions such as angle strain, dispersion, and Coulombic interactions. However, they are incapable of modelling changes in atom connectivity (i.e., chemical reactions, in which bonds break and form) 18. Reactive force fields address this issue by including connectivity-dependent terms in the force field. There are two main formalisms for making force fields reactive: EVB (empirical valence bond), invented by Warshel and coworkers 19, and ReaxFF, developed by Adri van Duin, William Goddard, and coworkers 20.

ReaxFF force fields utilize a bond-order formalism to describe reactive events, where the bond order is calculated empirically from interatomic distances. Electronic interactions are treated implicitly, allowing us to simulate chemical reactions without an explicit, computationally demanding quantum mechanical treatment.

Machine learning force fields

Forget everything I said above about what a force field is. Machine learning force fields are a whole different story. In contrast to conventional force fields, which have analytical functional forms based on physical laws, ML force fields are purely mathematical functions that map an atomic configuration to a potential energy or forces, without any direct physical meaning. Different types of machine learning force fields have been invented based on different branches of machine learning. The most promising recipe, devised by Behler and Parrinello, is based on artificial neural networks 21. An ML force field can be as accurate as the first-principles data set it was trained on, but several orders of magnitude faster. Artificial neural network force fields can describe labyrinthine potential energy surfaces that are hard to capture with physical force fields 22. Still, they have disadvantages; the most critical drawback of ML force fields is that they are limited to systems containing only a few different chemical elements 23.

Integration Schemes in Molecular dynamics

Having the forces acting on each particle, you can solve Newton’s equations of motion. However, solving the equations for numerous particles simultaneously is not as easy as the problems in classical physics textbooks; therefore, we need an efficient way to deal with the equations. An integration scheme is basically an approximate approach to solving Newton’s equations of motion for a many-particle system. Various integration techniques have been devised. One of the best yet simplest, the Verlet algorithm, is simply a Taylor expansion of the particles’ coordinates around the current time 24:

$$\vec{r}(t+\Delta t) = 2\,\vec{r}(t) - \vec{r}(t-\Delta t) + \frac{\vec{F}(t)}{m}\,\Delta t^{2} + \mathcal{O}(\Delta t^{4})$$

The Verlet algorithm is a very efficient way of solving the equations of motion; however, its estimate of the trajectory carries a global error of the order of $\Delta t^{2}$. A closely related formulation, the velocity-Verlet algorithm, propagates the particles’ velocities explicitly alongside their positions, which improves numerical stability and yields positions and velocities at the same time points:

$$\vec{r}(t+\Delta t) = \vec{r}(t) + \vec{v}(t)\,\Delta t + \tfrac{1}{2}\,\vec{a}(t)\,\Delta t^{2}$$
$$\vec{v}(t+\Delta t) = \vec{v}(t) + \tfrac{1}{2}\left[\vec{a}(t) + \vec{a}(t+\Delta t)\right]\Delta t$$

More accurate integration schemes have also been invented. Still, all of them are approximate methods, and you should bear in mind that the estimation error always grows rapidly with the time step.
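To see the Verlet scheme in action, here is a small sketch integrating a one-dimensional harmonic oscillator with m = k = 1, whose exact solution is cos t. The choice of system and the step count are illustrative.

```python
import numpy as np

def verlet_oscillator(dt=0.01, n_steps=1000):
    """Position-Verlet integration of x'' = -x with x(0) = 1, v(0) = 0."""
    x_prev = np.cos(-dt)   # seed the two required starting positions from the exact solution
    x = 1.0
    for _ in range(n_steps):
        x_next = 2.0 * x - x_prev + (-x) * dt**2   # Verlet update with a(t) = -x(t)
        x_prev, x = x, x_next
    return x   # position at t = n_steps * dt
```

With dt = 0.01 and 1000 steps, the result agrees with the exact value cos(10) to well within 1e-3; doubling the time step roughly quadruples the error, consistent with the $\Delta t^{2}$ behaviour noted above.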

Molecular Dynamics Simulation in various Ensembles

An ensemble is a concept borrowed from statistical thermodynamics, introduced by J. W. Gibbs in 1902. In statistical thermodynamics, each system can be described by a set of macroscopically observable variables resulting from different microscopic constraints. Five ensembles are commonly used; in each of them, one group of the observable variables is kept macroscopically static:

Ensemble (fixed macroscopic variables):
  • Microcanonical (NVE): number of particles, volume, and total energy
  • Canonical (NVT): number of particles, volume, and temperature
  • Grand canonical (μVT): chemical potential, volume, and temperature
  • Gibbs or isobaric-isothermal (NPT): number of particles, pressure, and temperature
  • Enthalpy or isoenthalpic-isobaric (NPH): number of particles, pressure, and enthalpy

If we don’t impose any extra constraints by additional algorithms in a molecular dynamics simulation, the number of particles, volume, and total energy remain static; therefore, the microcanonical ensemble is the natural ensemble of the molecular dynamics technique. However, ensembles other than the microcanonical one could be generated in a molecular dynamics simulation to mimic the experimental conditions better; for example, it is more realistic to simulate biological processes in the NPT ensemble.
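Before any algorithm can regulate temperature, the simulation must measure it from the microscopic state, which follows from the equipartition theorem. A minimal sketch in reduced units (kB = 1); a production code would also subtract constrained degrees of freedom and removed centre-of-mass motion.

```python
import numpy as np

def instantaneous_temperature(vel, masses, kB=1.0):
    """T = 2*KE / (Ndof * kB), from the equipartition theorem."""
    kinetic = 0.5 * np.sum(masses[:, None] * vel**2)
    ndof = vel.size   # 3N here; real codes subtract constraints and COM motion
    return 2.0 * kinetic / (ndof * kB)
```

This instantaneous temperature is the quantity the thermostats below compare against the target value.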

Various algorithms have been invented to convert molecular dynamics ensembles into one another. We can categorize these algorithms by the variables they keep static: the algorithms that keep the temperature steady are called thermostats, and those that keep the pressure steady are called barostats. Some popular thermostats and barostats are:

Andersen Thermostat

Approach

The Andersen thermostat couples the velocity of some randomly selected particles to a heat bath. It draws the new velocity of the particles from a Maxwell-Boltzmann distribution corresponding to the desired T. 25 

Pros
  • Produces a canonical distribution
  • Good results for static properties
Cons
  • The dynamics it generates are unphysical
  • Unreliable results for dynamical properties
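A sketch of a single Andersen collision step, in reduced units (kB = 1) with a collision frequency `nu`; the function name and signature are illustrative, not from any MD package.

```python
import numpy as np

def andersen_step(vel, masses, T, nu, dt, kB=1.0, rng=None):
    """Each particle collides with the heat bath with probability nu*dt;
    colliding particles get fresh Maxwell-Boltzmann velocities at temperature T."""
    if rng is None:
        rng = np.random.default_rng()
    hit = rng.random(len(masses)) < nu * dt          # randomly selected particles
    new_vel = vel.copy()
    sigma = np.sqrt(kB * T / masses[hit])[:, None]   # per-component standard deviation
    new_vel[hit] = rng.normal(size=(hit.sum(), vel.shape[1])) * sigma
    return new_vel
```

Because each collision erases a particle’s velocity history, the sampled distribution is canonical, but correlated quantities like diffusion coefficients are disturbed — exactly the pros and cons listed above.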

Nosé-Hoover Thermostat

Approach

The Nosé-Hoover thermostat adds artificial coordinates and velocities to the Lagrangian equations of motion. 26

Pros
  • Deterministic,
  • The most reliable method for simulation at equilibrium and for predicting thermodynamic properties.
Cons
  • Slow,
  • Not suitable for pre-equilibrium states
  • Not appropriate for NVT equilibration

Berendsen Thermostat

Approach

The Berendsen thermostat couples the system temperature to an external bath with constant pressure and uses adjustable time constants for coupling. 27 

Pros
  • Fast and smooth
  • The coupling can be made as weak as desired to minimize the disturbance of the system
  • Suitable for non-equilibrium molecular dynamics
Cons
  • Does not sample a true canonical ensemble; temperature fluctuations are suppressed
Canonical Sampling Velocity Rescale Thermostat

Approach

The velocity rescale thermostat rescales the velocities of all the particles by an adequately chosen random factor. It is similar to the Berendsen thermostat, but the stochastic term improves the sampling of the canonical ensemble. 28

Pros
  • Fast equilibration
  • Once equilibrium is reached, the proper canonical ensemble is sampled
  • Fewer artifacts than the Berendsen thermostat
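The deterministic core of these weak-coupling schemes is just a scaling factor applied to all velocities each step. Here is a sketch of the Berendsen-style factor (the stochastic correction that makes CSVR canonical is omitted); symbols follow the usual textbook form, with tau the coupling time constant.

```python
import numpy as np

def berendsen_scale(T_inst, T_target, dt, tau):
    """Berendsen weak-coupling factor: lambda = sqrt(1 + (dt/tau)*(T0/T - 1)).
    All velocities are multiplied by lambda; larger tau means weaker coupling."""
    return np.sqrt(1.0 + (dt / tau) * (T_target / T_inst - 1.0))
```

When the instantaneous temperature already equals the target, the factor is exactly 1 and the dynamics are untouched; the further the system is from the target, the stronger the rescaling.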

Berendsen Barostat

Approach

The Berendsen Barostat couples the system pressure to an external bath with constant pressure and adjustable time constants for the coupling. 27

Pros
  • Fast and smooth
  • The coupling can be made as weak as desired to minimize the disturbance of the system.
  • Suitable for non-equilibrium molecular dynamics
Cons
  • Not suitable for simulation at equilibrium states

Parrinello-Rahman Barostat

Approach

The Parrinello-Rahman barostat extends the Lagrangian equations of motion so that the shape and size of the MD cell can change according to dynamical equations. 29

Pros
  • Deterministic
  • The most reliable method for simulation at equilibrium
  • The most reliable method for predicting thermodynamic properties
Cons
  • Slow
  • Not suitable for pre-equilibrium states

Molecular Dynamics simulations vs Monte Carlo simulations

Monte Carlo and molecular dynamics are allied methods with both similarities and differences. Both probe phase space, using the same force fields, to collect enough samples for further analysis. Unlike molecular dynamics, which solves the deterministic Newtonian equations of motion, Monte Carlo probes phase space stochastically. The Monte Carlo sampling algorithm, the Metropolis procedure, produces a sequence of points in phase space (a trajectory) from an initial geometry by adding a series of small random disturbances to the system’s coordinates. After each disturbance, the algorithm accepts or rejects the move according to the energy difference it creates. If the energy decreases, the new coordinates are accepted immediately. If the energy increases, the new coordinates are accepted with a probability given by the Boltzmann distribution:

$$P_{\text{acc}} = \min\!\left(1,\; e^{-\Delta E / k_{B}T}\right)$$
(ref 31)

Performing the Metropolis procedure is simpler than solving the equation of motion. Therefore, compared with molecular dynamics simulations, Monte Carlo simulations demand less computational cost. Another advantage of Monte Carlo methods is the ability to reach the high energy regions of phase space, which are hard to find by the MD approach. On the other hand, the MC scheme has some disadvantages. The most notable one is the lack of the time dimension and atomic velocities, which means we can not study transport properties like diffusion coefficients or viscosity.
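The Metropolis acceptance rule above takes only a few lines of Python; this sketch uses reduced units (kB = 1) and illustrative names.

```python
import numpy as np

def metropolis_accept(delta_E, T, kB=1.0, rng=None):
    """Accept a trial move: always if it lowers the energy,
    otherwise with Boltzmann probability exp(-dE / (kB*T))."""
    if delta_E <= 0.0:
        return True
    if rng is None:
        rng = np.random.default_rng()
    return rng.random() < np.exp(-delta_E / (kB * T))
```

Note that only an energy difference is needed, not forces or velocities — which is precisely why MC is cheaper per move than an MD step, and why it cannot yield time-dependent properties.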

As a final note, the Monte Carlo simulations’ natural ensemble is the canonical ensemble (NVT), which differs from the natural ensemble of molecular dynamics, the micro-canonical ensemble (NVE).

Classical molecular dynamics vs ab initio molecular dynamics

Despite the outstanding success of classical molecular dynamics methods, two fundamental obstacles limit them: first, the need for predefined potentials, which restricts their scope of applicability; second, the inability to change bonding patterns during the simulation, which rules out studying phenomena involving chemical bond breaking or formation. Ab initio molecular dynamics (AIMD) methods have been devised to tackle these issues. The basis of every ab initio molecular dynamics scheme is to compute the forces acting on the atoms from electronic structure calculations in the course of the simulation, instead of using fixed, precalculated potentials 30. AIMD simulations are prevalent in materials science and chemistry, where the difficulties mentioned above are severe. This method may appear in the literature under other names, such as quantum molecular dynamics, first-principles molecular dynamics, or on-the-fly molecular dynamics.

There are three primary schemes of ab initio molecular dynamics:

Ehrenfest molecular dynamics

Born-Oppenheimer molecular dynamics

Car-Parrinello molecular dynamics

Finally, one must bear in mind that the computational cost of ab initio molecular dynamics methods is far higher than that of classical molecular dynamics. Therefore, they are restricted to systems of up to a few hundred atoms and simulations of a few hundred picoseconds.

Periodic Boundary Conditions (PBCs)

In molecular dynamics simulations, we simulate the movement and interactions of a collection of particles, such as atoms or molecules. However, simulating a sufficiently large number of particles requires impractical computational resources; even a microscopic bit of material comprises trillions of particles (recall Avogadro’s number). In a practical molecular dynamics simulation, we simulate only a minuscule portion of the material, called a simulation box. However, the nanometric dimensions of simulation boxes create problems of their own. In such a box, the ratio of edge particles to inner particles is unusually high, and the particles near the edges behave unnaturally since there is nothing beyond them. This is different from how actual materials work, because real systems are much larger and continuous. To fix this, we use periodic boundary conditions.

Periodic boundary conditions allow a simulation box to behave as if it is part of an infinite system. You can think of the simulation box as a tile in a grid, with copies of the box surrounding it. When a particle exits the box on one side, it immediately re-enters from the opposite side. Likewise, when particles interact near the edges, they can see particles in the neighbouring copies of the box. This setup produces the illusion of a continuous, infinite material without really simulating infinite particles.

To visualize periodic boundary conditions, think of the arcade video game Pac-Man. In Pac-Man, if the character exits the screen on the right, he reappears on the left. The game world feels continuous, even though it’s just a tiny screen. Similarly, the simulation box in molecular dynamics is like that continuous screen but with three dimensions instead of two, and the particles re-enter the simulation box when they leave the edges. This allows us to simulate behaviour closer to how particles behave in an actual bulk material.
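In code, the Pac-Man wrap and the corresponding "minimum-image" rule for distances reduce to two one-liners. This sketch assumes an orthorhombic box with one edge length per dimension; the function names are illustrative.

```python
import numpy as np

def wrap(positions, box):
    """Fold coordinates back into the primary box [0, box)."""
    return positions % box

def minimum_image(displacement, box):
    """Replace a displacement vector by its nearest periodic image,
    so particle pairs interact through the closest boundary."""
    return displacement - box * np.round(displacement / box)
```

For example, a particle at x = 5.5 in a box of length 5 reappears at 0.5, and two particles 4.0 apart in that box are really only 1.0 apart through the boundary.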

periodic boundary conditions
Figure 3. A 2d illustration of a simulation box and its neighbouring boxes. When a water molecule leaves the central box from the left side, it re-enters the box from the right side.

Periodic boundary conditions are essential because they reduce artificial edge effects. Without them, particles at the edges of the box behave differently from those in the centre because they don’t feel neighbour particles. By applying PBCs, every particle has a similar environment, no matter where it is in the box. PBCs help us model bulk properties, like temperature or pressure, as they appear in a larger system. It’s a clever trick to balance computational efficiency with physical realism.

While periodic boundary conditions are very useful, they have some limitations. For example, if the simulation box is too small, particles might interact with their own “images” in the neighbouring boxes, which isn’t realistic and can lead to errors in the results. To minimize these effects, we ensure the box is large enough that particles don’t interact with their own images. Additionally, standard PBCs are unsuitable for systems with surfaces, since those naturally involve edges; workarounds exist, such as applying periodicity in only some directions. Despite these limitations, periodic boundary conditions are one of the most widely used techniques in molecular dynamics to simulate realistic systems efficiently.


References

  1. J. W. Ponder, D. A. Case, "Force Fields for Protein Simulations", Advances in Protein Chemistry, 66, 27-85.
  2. A. D. MacKerell et al., "All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins", The Journal of Physical Chemistry B, 102, 3586-3616.
  3. L. D. Schuler, X. Daura, W. F. van Gunsteren, "An improved GROMOS96 force field for aliphatic hydrocarbons in the condensed phase", J. Comput. Chem., 22, 1205-1218.
  4. W. L. Jorgensen, D. S. Maxwell, J. Tirado-Rives, "Development and Testing of the OPLS All-Atom Force Field on Conformational Energetics and Properties of Organic Liquids". J. Am. Chem. Soc. 118, 11225–11236.
  5. S. Riniker, "Fixed-Charge Atomistic Force Fields for Molecular Dynamics Simulations in the Condensed Phase: An Overview", Journal of Chemical Information and Modeling, 58 , 565-578.
  6. T. A. Halgren, W. Damm, "Polarizable force fields", Current Opinion in Structural Biology, 11, 236-242.
  7. N. Gresh et al., "Anisotropic, Polarizable Molecular Mechanics Studies of Inter- and Intramolecular Interactions and Ligand-Macromolecule Complexes. A Bottom-Up Strategy", Journal of Chemical Theory and Computation, 3, 1960–1986.
  8. "Anthony Stone: Computer programs". www-stone.ch.cam.ac.uk
  9. J. R. Maple et al., "A Polarizable Force Field and Continuum Solvation Methodology for Modeling of Protein-Ligand Interactions", Journal of Chemical Theory and Computation, 1, 694–715.
  10. J. Gao, D. Habibollazadeh, L. Shao, "A polarizable intermolecular potential function for simulation of liquid alcohols". The Journal of Physical Chemistry, 99, 16460–7.
  11. L. Yang et al., "New-generation amber united-atom force field", The Journal of Physical Chemistry B, 110, 13166–76.
  12. S. Patel, C. L. Brooks, "CHARMM fluctuating charge force field for proteins: I parameterization and application to bulk organic liquid simulations", Journal of Computational Chemistry, 25, 1–15.
  13. J. Barnoud, L. Monticelli, (2015) "Coarse-Grained Force Fields for Molecular Simulations. In: Kukol A. (eds) Molecular Modeling of Proteins. Methods in Molecular Biology (Methods and Protocols)", vol 1215. Humana Press, New York, NY.
  14. S. J. Marrink et al., "The MARTINI force field: coarse grained model for biomolecular simulations". The Journal of Physical Chemistry B., 111, 7812–24.
  15. D. Leonardo et al.,"SIRAH: A Structurally Unbiased Coarse-Grained Force Field for Proteins with Aqueous Solvation and Long-Range Electrostatics", Journal of Chemical Theory and Computation, 11, 723-739.
  16. A. Korkut, W. A. Hendrickson, "A force field for virtual atom molecular mechanics of proteins", Proceedings of the National Academy of Sciences of the United States of America, 106, 15667–72.
  17. P. Español, P. Warren, "Statistical Mechanics of Dissipative Particle Dynamics", Europhysics Letters, 30, 191–196.
  18. T. P. Senftle et al., "The ReaxFF reactive force-field: development, applications and future directions", npj Comput. Mater., 2, 15011.
  19. A. Warshel, R. M. Weiss, "An empirical valence bond approach for comparing reactions in solutions and in enzymes", Journal of the American Chemical Society, 102, 6218-6226.
  20. A. C. T. van Duin, S. Dasgupta, F. Lorant, W. A. Goddard, "ReaxFF: A Reactive Force Field for Hydrocarbons", The Journal of Physical Chemistry A, 105, 9396-9409.
  21. J. Behler, M. Parrinello, "Generalized Neural-Network Representation of High-Dimensional Potential-Energy Surfaces", Phys. Rev. Lett., 98, 146401.
  22. V. Botu et al., "Machine Learning Force Fields: Construction, Validation, and Outlook", The Journal of Physical Chemistry C, 121, 511-522.
  23. J. Behler et al., "Perspective: Machine learning potentials for atomistic simulations", J. Chem. Phys., 145, 170901.
  24. D. Rapaport, "The Art of Molecular Dynamics Simulation" (2nd ed.), Cambridge: Cambridge University Press.
  25. H. C. Andersen, "Molecular dynamics simulations at constant pressure and/or temperature", J. Chem. Phys. 72, 2384.
  26. W. G. Hoover, "Canonical dynamics: Equilibrium phase-space distributions", Phys. Rev. A, 31, 1695.
  27. H. J. C. Berendsen et al., "Molecular dynamics with coupling to an external bath", J. Chem. Phys. 81, 3684-90.
  28. G. Bussia, D. Donadio, M. Parrinello, "Canonical sampling through velocity rescaling", J. Chem. Phys. 126, 014101.
  29. M. Parrinello, A. Rahman, "Polymorphic transitions in single crystals: A new molecular dynamics method", Journal of Applied Physics, 52, 7182.
  30. D. Marx, J. Hutter, "Ab initio molecular dynamics: Theory and implementation", in Modern Methods and Algorithms of Quantum Chemistry, NIC Series, Vol. 1, John von Neumann Institute for Computing (2000).
  31. J. Chen, 2018 IOP Conf. Ser.: Earth Environ. Sci., 128, 012110.
