Next: Optimization in QM/MM surfaces Up: main Previous: Some short conclusions and Contents

Optimization of systems constituted by thousands of atoms

State of the art in the study of the reaction paths:
These last years theoretical chemistry has focused its efforts on a deeper understanding of reactive systems constituted by thousands of atoms. Condensed phase reactions, enzymatic reactions and solid state catalysis are some of the most successful areas. When so many degrees of freedom are taken into account, in addition to the eternal problem of having an adequate potential, an appropriate exploration of the configurational space is crucial to calculate thermodynamic, kinetic magnitudes and to understand the mechanism of the considered process.

Some work has been done on the acceleration of this exploration by molecular dynamics for example[169,240]. Another important issue has to do with the need of building up a path connecting reactant to products prior to any dynamics calculation. That is, no matters how fast is our molecular dynamics if we explore a wrong region that is far away from any of the reaction pathways.

As we stated in the introduction section, some strategies have been proposed to elucidate the reaction pathway. Going from the classical minimum energy path (MEP)[141,138], to consider the effect of the temperature on this MEP[241,145], or even taking into account dynamical effects on an ensemble of transition paths[137].

Location of stationary points:
The procedures mentioned above need of the availability of a very cheap potential energy. However, the state of the art in enzymatic catalysis makes usage of the QM/MM methods [72]. Although QM/MM potentials were already designed to obtain a fast energy evaluation in big systems, its computational cost is still too big to afford the full exploration of the configurational space or even to locate a temperature-depending path. So the location of one of the local saddle points of a certain reaction step in the PES is still a good tool to verify the reliability of the reaction pathway. Moreover, when a QM/MM potential can be so expensive that a posterior dynamics is prohibitive, mainly in QM(ab initio)/MM, the exploration of the stationary points in a local valley is the only way out to obtain reliable information about the reactivity of our chemical system.

The location of stationary points in chemical systems involving up to few tens of atoms is a very well established field. The most effective algorithms are based on the Newton-Raphson equation that employs second derivatives of the energy. However, when hundreds of atoms need to be moved some problems emerge due to the manipulation of these very big matrices, mainly its computation, diagonalization and storage.

Coming from the reduced dimension of systems usually studied in ab initio quantum chemistry, the problem is not only to overcome the computational problems but it is also a conceptual problem. There will be many available parallel reaction paths at a given finite temperature. Actually in systems such as enzymes the potential energy surface becomes so flat that special care must be taken to perform always the search in the same local valley. So we need efficient and tight algorithms. Even more, when expensive potentials such as the QM/MM ones are used the process needs to spend as few steps as possible.

The available software is not adequate:
Most of program packages designed to study biological systems have several standard minimizers. These algorithms have been conceived for cases in which potential energy, usually molecular mechanics, is computationally cheap and the main computer demand is the storage of enormous vector arrays and the slow matrix diagonalizations. An example of these methods have already been outlined, they are steepest descent, conjugate gradient, or the more accurate Adopted Basis Newton Raphson (ABNR)[53,144], Truncated Newton[156,157] and Limited Memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS)[153] (see reference [242] for a recent comparison between some of these methods). Most of times they are applied to minimize an experimental Protein Data Bank (PDB) structure before running a molecular dynamics or a normal mode analysis. However none of those methods were originally designed to search for TS structures. Then we need efficient methods to locate directly TS structures in enzymatic catalysis using the information provided by the second derivatives. In addition these methods have to be developed to use the quite expensive QM/MM potentials, but just demanding a reasonable computer cost. This is why we found the necessity to implement a method capable to find TS with QM/MM potentials in big systems. Although there are some packages such as GRACE [159,160] specially designed to perform geometry optimizations in enzymes that may be downloaded from the web free of charge ^4.1, an accurate design and test of the algorithms involved in the optimization process is still needed.

Overview of the chapter:
In the first section 3.1 we propose a methodology to locate stationary points on a QM/MM potential energy surface. This algorithm is based on a suitable approximation to an initial full Hessian matrix, either a modified Broyden-Fletcher-Goldfarg-Shanno (m-BFGS) developed here or a Powell update formula for the location of, respectively, a minimum or a transition state, and the so-called Rational Function Optimization. Although this second order method is tested on systems of small and medium size, it is an essential stage before we move to the study of big systems. The work of this section has already been published as indicated in reference [2].

In section 3.2 a systematic analysis of the micro-iterative method described in section 1.3.7.4 (page ) is presented. The accurate RFO method developed and tested in section 3.1 is coupled to a cheap minimizer such as L-BFGS. We decided that the micro-iterative procedure is the most suitable strategy to locate stationary points in our systems. So we formulated, implemented and tested several options of this method. Features such as the two regions (core/environment) size, the interaction between these two regions and the alternating frequency between the two corresponding optimization processes are analyzed. The methods are tested on two different and representative steps of Mandelate Racemase mechanism. This work coincides with our recent publication [3].

In section 3.3 we will investigate how important the accurate location of transition states is. Sometimes, when the appropriate reaction coordinate is more complicated than just a unique distance or an angle, the direct location of the stationary point making use of second derivatives is a good strategy. This will be the case to prevent from wrong conclusions when both MEP analysis or posterior free energy calculation along the reaction path are made. In particular, all the TS found in section 2.3 have been refined and the differences have been evaluated. The reference [4] is the published work that contains the data described in this section.

The last section 3.4 is a description of a method that we designed to locate stationary points on very big systems without splitting the system in different parts like micro-iterative method does. This objective has been a challenge for many researchers, that is, the search with a second-order algorithm with a strict control of positive eigenvalues without the storage and diagonalization problem. We propose a limited-memory update combined with an iterative diagonalization method similar to those used in multiconfigurational electronic problems[243,244]. We advance that this method did not work as we desired and we will describe its advantages and its failures.

Subsections

Next: Optimization in QM/MM surfaces Up: main Previous: Some short conclusions and Contents

Xavier Prat Resina 2004-09-09