Getting Started With Molecular Dynamics Simulation

Molecular Dynamics (MD) simulation is one of the most effective computational tools and has numerous applications in physics, chemistry, biochemistry, and materials science. However, proper employing of MD without understanding its theoretical backgrounds is arduous, if not impossible. This article explains some of the most essential molecular dynamics simulation concepts in a nutshell and without mind-boggling mathematical equations. If you want to begin learning MD simulation or improve your knowledge of its underlying concepts, then this article is for you.

What is molecular dynamics simulation?

I would define molecular dynamics simulation concisely as the approximatenumerical, and step-by-step solving of the Newtonian equation of motion for a many-body system in a specified time.

Now let me explain it more clearly. Imagine a system of many particles—a particle can be an atom, a group of atoms, or a molecule—which have determined initial positions. According to Newton’s second law of motion, if we have a way to calculate the forces acting on each particle, then we can anticipate particles’ position after a given time interval, which is called a time step. And then, we can recalculate the forces acting on particles at their new positions and advance to the next step, and so forth. Every time we calculate the forces and anticipate a new set of positions, we do one step of the simulation, and we can carry on this step-by-step procedure throughout a pre-specified time. Because molecular systems typically consist of numerous particles, it is impossible to solve the problem analytically. Therefore, the MD method tackles the problem numerically.

Each MD simulation step has two parts: calculating the forces acting on the particles and integrating (solving) the equation of motion. The coordinates of particles calculated in each step alongside their velocities and forces specify a point in phase-space. At the end of the simulation, we have a sequence of points in the phase-space called a trajectory. MD simulations are approximate in two senses. First, they calculate the forces approximately. Second, their integration algorithms solve the equations of motion approximately.

The aim of molecular dynamics simulation: Why do we perform MD simulation?

Imagine a small and rigid molecule; such a molecule has a potential energy surface (PES) with a few deep wells. The chances are that the system stays at the bottom of one of those wells. The geometry of the bottom of the well properly represents the molecule’s geometry and can be found by a relatively simple geometry optimization. We also have some analytical solutions from statistical mechanics to estimate the thermodynamic properties of such systems from the electronic energy and vibrational frequencies of the optimized structure. Now imagine a macro and floppy molecule or a system consisting of so many molecules. This system has a vast PES with numerous shallow wells and doesn’t stay in any of the wells for a long time. Now, how can we determine the state of the system? What geometry represents the geometry of the system? How can we estimate the thermodynamic properties of this system?

In such situations, none of that many shallow wells can represent the system’s state, and the actual state of the system is fluctuating between them. The best way to deal with such systems is to take an average over a reasonable amount of phase space. For this very reason, we use molecular dynamics simulation. However, MD simulation takes time average instead of phase average, but according to ergodic theory, the time average is almost equal to the space average for an ergodic transformation.

Another significant functionality of MD simulations is studying the time evolution of an atomistic system. In many studies, we do not explore the system’s equilibrium properties, but the path to reaching the equilibrium state is important for us. Moreover, sometimes we want to know whether a specific result is possible. For example, consider a situation where we study permeation of a molecule through a membrane. In this case, we can devise a system in which the membrane is prone to a set of molecules and perform an MD simulation. If the simulation time is long enough, the system’s time evolution can assess the membrane’s permeability.

What is a force field in molecular dynamics simulation?

As I mentioned earlier, MD simulation is basically solving the Newtonian equation of motion. According to Newtonian physics, we can calculate force from the derivative of potential energy $\displaystyle F=-\frac{{\partial u}}{{\partial t}}$ . The energy of an atomistic system is the function of its electronic structure and is called electronic energy. Thus, we need to determine electronic energy first. In classical molecular dynamics simulation, electronic energy is considered a parametric function of the nuclear coordinates rather than the result of solving the laborious Schrodinger equation. The MD parameters can be fit to experimental or high-level ab-initio pre-computed data.

A force field is basically a combination of some parametric functions and their parameter sets and defines the interaction between the system’s particles. The force field function consists of bonded, non-bonded, and cross terms, each describing the energy required to distort the atomistic system in a specific way (figure 3). The bonded terms are the stretch, bending and torsional terms, and non-bonded terms are the van der Waals and electrostatic terms, and finally, the cross-terms describe the coupling between the bonded terms. Different force fields utilize similar functional forms for each term, for example, many force fields use Morse potential as the stretch function or Lennard-Jones potential as the van der Waals function, but each has a unique set of parameters.

Many different force fields are designed for various purposes, so far. We can categorize force fields into five main groups:

Fixed-Charge Atomistic Force Fields

Most of the well-known force fields like AMBER 1, CHARMM 2, GROMOS 3, and OPLS 4 fit in this category. “The empirical functional form of classical fixed-charge force fields dates back to 1969 and remains essentially unchanged. In a fixed-charge force field, the polarization is not modelled explicitly, i.e. the effective partial charges do not change depending on conformation and environment.” 5 This simplification causes a dramatic reduction in computational cost compared to polarizable force fields. Nonetheless, the simulations employing carefully parametrized fixed-charge force fields can provide useful insights into biological and chemical questions.

Polarizable force fields

“Standard force fields describe electrostatic interactions in terms of fixed, usually atom centred, charges. However, real physical systems polarize substantially when placed in a high-dielectric medium such as water — or even when a strongly charged system approaches a neutral body in the gas phase. Such polarization strongly affects the geometry and energetics of molecular recognition.” 6 . Polarisable force fields solve this problem by allowing the charge distribution to alter with the dielectric environment. The GEM 7, ORIENT 8, PFF 9, and PIPF 10 are some examples of polarizable force fields; also there are polarizable versions of AMBER 11 and CHARMM 12 force fields.

Coarse-grained force fields

Although Molecular dynamics simulation with atomistic force fields is a highly efficient tool to study chemical and biological processes, its computational cost still is too expensive for studying many essential phenomena like protein folding, ion channel gating, and membrane remodelling 13. Coarse-grained force fields reduce the computational cost of calculations by reducing the model system’s complexity by grouping atoms together into virtual particles.

Reducing the number of particles reduces the number of interactions to calculate, allowing us to simulate larger systems for more extended times by sacrificing chemical details. MARTINI, a modern coarse-grained force field can lead to a speed-up of 2–3 orders of magnitude compared to atomistic simulations. On the other hand, the range of applicability of coarse-grained force fields is more limited than those of atomistic force fields, which means that coarse-grained force fields parameterized for specific molecules and phenomena are more likely to fail for other molecules and phenomena than their atomistic counterparts 14.

Some of the most popular Coarse-grained force fields are MARTINI 15, SIRAH 16, VAMM 17, and DPD 18.

Reactive force fields

Classical atomistic force fields are well suited for nonreactive interactions such as angle-strain, dispersion, and Coulombic interactions. However, they are incapable of modelling changes in atom connectivity (i.e., for modelling chemical reactions as bonds break and form) 19. Reactive force fields have been devised to address this issue by including some connection-dependent terms in the force field. There are two main formulas for making force fields reactive: EVB (Empirical valence bond) invited by  Warshel and coworkers 20; and ReaxFF developed by Adri van Duin, William Goddard and coworkers 21.

The Reax force fields utilize a bond-order formalism to describe reactive events, where bond order is empirically calculated from interatomic distances. Electronic interactions are treated implicitly in this method, allowing us to simulate chemical reactions without cumbersome, explicit, quantum mechanical consideration.

Machine learning force fields

Forget everything I said above about what is a force field. Machine learning force fields are a whole different story. In contrast to conventional force fields, which have analytical functional forms based on physical laws, ML force fields are just mathematical functions to elicit potential-energy or forces from atomic configuration without any direct physical meaning. Different types of machine learning force fields have been invented based on different machine learning branches. The most promising recipe, devised by Behler and Parrinello, are based on artificial neural networks 22. An ML force field can be as accurate as the first-principles data set they trained to, but several orders of magnitude faster than that. Artificial neural network force fields can describe labyrinthine potential energy surfaces, which are hard to define by physical force fields 23. Still, they have disadvantages; the most critical drawback of ML force fields is that they are limited to systems containing only a few different chemical elements 24

Integration Schemes in Molecular dynamics

Having forces acting on each particle, you can solve Newton’s equation of motion. However, solving the equations for numerous particles simultaneously is not as easy as the problems of classical physics textbooks; therefore, we need an efficient way to deal with the equations. An Integration scheme is basically an approximate approach to solve Newton’s equations of motion for a poly-particle system. Various Integration techniques have been devised. One of the best yet simplest strategies, the Verlet algorithm is simply the Taylor expansion of particles’ coordinates around time 25.

$\displaystyle r\left( {t+\Delta t} \right)\approx 2r\left( t \right)-r\left( {t-\Delta t} \right)+\frac{{f\left( t \right)}}{m}\Delta {{t}^{2}}$

The Verlet algorithm is a very efficient way of solving the equation of motion; however, its estimation of the new position contains an error of the order of $\displaystyle {{t}^{4}}$. A more complicated algorithm, the velocity-Verlet algorithm, incorporates particles’ velocity into the scheme to reduce the error to the order of $\displaystyle {{t}^{2}}$.

$\displaystyle r\left( {t+\Delta t} \right)\approx r\left( {t-\Delta t} \right)+2v\left( t \right)\Delta t+a\left( t \right)\Delta {{t}^{2}}$

A few more accurate integration schemes have been invented, Still, all of them are approximate methods, and you should bear in mind that the estimation error always increases rapidly with the time step.

Molecular Dynamics Simulation in various Ensembles

An ensemble is a concept borrowed from statistical thermodynamics, which J. W. Gibbs introduced in 1902. In statistical thermodynamics, each system can be described by a set of macroscopically observable variables resulting from different microscopic constraints. There are five ensembles in each of them, one group of the observable variables are kept macroscopically static:

If we don’t impose any extra constraints by additional algorithms in a molecular dynamics simulation, the number of particles, volume, and total energy remain static; therefore, the microcanonical ensemble is the natural ensemble of the molecular dynamics technique. However, ensembles other than the microcanonical one could be generated in a molecular dynamics simulation to better mimic the experimental conditions; for example, it is more realistic to simulate biological processes in the NPT ensemble.

Various algorithms have been invited to convert MD ensembles into each other. We can categorize these algorithms based on the variables they keep static. The algorithms which keep temperature steady are called thermostats, and those that keep pressure stable are called barostats. Table 1 lists some of the popular thermostats and barostats alongside their pros and cons.

Molecular Dynamics simulations vs Monte Carlo simulations

Monte Carlo and Molecular Dynamics are allied methods and have similarities and differences. Both methods probe phase space using the same force fields to collect enough samples for further analysis. On the other hand, unlike MD simulations, which solve the deterministic Newton’s equation of motion, Monte Carlo simulations use a stochastic manner to probe phase-space. The Monte Carlo sampling algorithm, the Metropolis procedure, produces a sequence of points in phase space (trajectory) from an initial geometry by adding a series of small random disturbances to the system’s coordinates. After each disturbance, the algorithm accepts or rejects the move according to the energy difference created by it. If the energy is reduced, then the new coordinate is accepted immediately. If the energy increases, then the odds of accepting the new coordinate is based on the Boltzmann distribution function.

Performing the Metropolis procedure is simpler than solving the equation of motion. Therefore, compared with MD simulations, Monte Carlo simulations demand less computational cost. Another advantage of Monte Carlo methods is the ability to reach the high energy regions of phase space, which are hard to find by the MD approach. On the other hand, the MC scheme has some disadvantages. The most notable one is the lack of the time dimension and atomic velocities, which means we can not study transport properties like diffusion coefficients or viscosity

As a final note, the Monte Carlo simulations’ natural ensemble is the canonical ensemble (NVT), which is different from the natural ensemble of molecular dynamics, the microcanonical ensemble (NVE).

Classical molecular dynamics vs ab initio molecular dynamics

Despite the outstanding success of classical molecular dynamics methods, there are two fundamental obstacles to utilizing these methods: first, the need for predefined potentials that restrict its practicality scope. Second, the lack of ability to change bonding patterns during the simulation means the lack of ability to study phenomena involving chemical bond breaking or formation. The ab initio molecular dynamics (AIMD) methods have been invited to tackle these issues. The basis of every ab initio molecular dynamics scheme is to compute forces acting on the atoms from electronic structure calculations in the course of simulations, instead of using fixed, precalculated potentials 33. AIMD simulations are prevalent in materials science and chemistry, where the difficulties mentioned above are severe. This method may be called by other names in literature like quantum molecular dynamics, first-principle molecular dynamics, or on-the-fly molecular dynamics.

There are three primary schemes of ab initio molecular dynamics:

Ehrenfest molecular dynamics

Born-Oppenheimer molecular dynamics

Car-Parrinello molecular dynamics

Finally, one must bear in mind that the computational cost of ab initio molecular dynamics methods is a lot more than classical molecular dynamics ones. Therefore, they are restricted to systems of up to a few hundred atoms and simulations of a few hundred pico-seconds.

Scroll to Top