
Initial Hessians, sparsity and storage

Before we explain the general scheme for the optimization procedure, we briefly discuss the amount of memory needed for a successful optimization, focusing mainly on the characteristics of the initial Hessian.

The initial Hessian matrix needs special consideration. The matrix shape employed in section 3.1 (displayed on page [*]), with a small square Hessian for the core and a diagonal vector for the environment, was suitable for a very localized reaction. But when many atoms have to rearrange during the optimization, a larger square Hessian is needed.

The sparsity of the Hessian matrix:
It is interesting to see what a Hessian looks like. Figure 3.7 shows the Hessian elements larger than a given threshold. The Hessian matrix corresponds to the 148 atoms in the active site of the Mandelate Racemase enzyme, and it is calculated numerically in the Cartesian coordinate representation. We can see that the largest elements lie on the diagonal, but there are also elements larger than 10 kcal/(mol$ \cdot $Å$ ^2$) quite far away from the diagonal. This is probably due to the coupling between Cartesian coordinates and the long-range interactions in the enzymatic system. We will have to recall this fact in the next section; in other words, this kind of Hessian is not like an MC-SCF Hamiltonian, which is rather sparse, with the significant elements concentrated in the diagonal or near-diagonal region [243].

Figure 3.7: Elements of a (444$ \times $444) Hessian matrix larger than a threshold: a) 10$ ^2$, b) 10$ ^1$, c) 10$ ^0$, d) 10$ ^{-1}$, e) 10$ ^{-2}$, f) 10$ ^{-3}$, in kcal/(mol$ \cdot $Å$ ^2$). This Hessian corresponds to the numerical calculation in Cartesian coordinates of the active site of Mandelate Racemase.
a) \includegraphics[width=0.4\textwidth]{Figures/Matrix/abs_2.ps} \includegraphics[width=0.4\textwidth]{Figures/Matrix/abs_1.ps} b)
c) \includegraphics[width=0.4\textwidth]{Figures/Matrix/abs0.ps} \includegraphics[width=0.4\textwidth]{Figures/Matrix/abs1.ps} d)
e) \includegraphics[width=0.4\textwidth]{Figures/Matrix/abs2.ps} \includegraphics[width=0.4\textwidth]{Figures/Matrix/abs3.ps} f)

When there are many significant off-diagonal elements, an initial Hessian represented as a small matrix plus a diagonal vector is not the most suitable choice. A sparse Hessian, where only some elements are stored, has been suggested for Newton-like optimizations [157].
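As an illustration of this kind of screening, the following Python sketch (not taken from the original implementation; the function name, the random stand-in Hessian and the threshold value are illustrative) builds a thresholded sparse copy of a dense Hessian and reports the fraction of elements actually stored, as in the last column of table 3.17.

import numpy as np
from scipy import sparse

def threshold_hessian(H, threshold):
    """Keep only Hessian elements with |H_ij| >= threshold.

    Returns the screened matrix in sparse (CSR) storage together with
    the fraction of elements actually kept.
    """
    mask = np.abs(H) >= threshold                    # pattern of "significant" elements
    H_sparse = sparse.csr_matrix(np.where(mask, H, 0.0))
    fraction_stored = mask.sum() / H.size
    return H_sparse, fraction_stored

# Illustrative use with a random symmetric matrix standing in for a
# numerical Hessian in kcal/(mol*Angstrom^2).
rng = np.random.default_rng(0)
A = rng.normal(size=(444, 444))
H = 0.5 * (A + A.T)
H_sparse, frac = threshold_hessian(H, threshold=1.0)
print(f"stored elements: {100 * frac:.1f} %")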

Optimization with a sparse Hessian:
We have tested how important the exclusion of some Hessian elements is. The antamanide polypeptide studied in section 3.1 is used here to carry out several TS optimizations with the RFO method. The tests consist in taking as initial Hessian a sparse matrix where only the elements larger than a threshold are stored.

Procedure: In order to test only the influence of a sparse Hessian, we work with the same LAPACK diagonalization subroutine [246], which needs the whole matrix in memory. In the next section we implement a real memory-saving strategy with the iterative Lanczos diagonalization.
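The memory-saving idea reserved for the next section can be sketched as follows: an iterative, Lanczos-type eigensolver only needs matrix-vector products, so a screened Hessian can be kept in sparse storage and never expanded to a dense array. This is only an illustrative sketch using SciPy's eigsh (it is not the routine implemented in the next section), and the matrix below is a random stand-in for a screened Hessian.

import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import eigsh

# A sparse symmetric matrix standing in for a thresholded Hessian.
n = 444
S = sparse_random(n, n, density=0.05, random_state=1)
H = (S + S.T) * 0.5

# Lanczos-type iterative diagonalization: only matrix-vector products
# are needed, so the Hessian never has to be stored as a dense array.
lowest_vals, lowest_vecs = eigsh(H, k=4, which='SA')
print(lowest_vals)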

We calculate the Hessian numerically. When a calculated matrix element is smaller than the given threshold, it is set to zero. This matrix element remains zero throughout the optimization process, and the Powell Hessian update is not permitted to change it. The RFO equations are solved and the optimization process is performed as explained in section 3.1. The convergence criterion is 10$ ^{-4}$ kcal/(mol$ \cdot $Å) on the gradient norm.
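A minimal sketch of the procedure just described, assuming the Hessian and gradients are held as NumPy arrays: a boolean mask records which elements survived the threshold, the Powell (PSB) update is applied only to those elements, and the step is obtained from the lowest eigenpair of the RFO augmented Hessian. The function names are illustrative, and the RFO step shown is the basic minimum-search form rather than the partitioned variant used for the TS searches of section 3.1.

import numpy as np

def powell_update_masked(H, dx, dg, mask):
    """Powell (PSB) Hessian update restricted to the stored pattern.

    H    : current Hessian approximation, shape (n, n)
    dx   : step taken, x_new - x_old
    dg   : gradient change, g_new - g_old
    mask : boolean array, True where elements are stored; screened-out
           elements stay frozen at zero, as described in the text.
    """
    xi = dg - H @ dx
    sts = dx @ dx
    dH = (np.outer(xi, dx) + np.outer(dx, xi)) / sts \
         - (xi @ dx) * np.outer(dx, dx) / sts**2
    return H + np.where(mask, dH, 0.0)

def rfo_step(H, g):
    """Basic RFO step from the lowest eigenpair of the augmented Hessian.

    (The TS searches in the text use the partitioned RFO variant of
    section 3.1; this is only the simplest form of the equations.)
    """
    n = len(g)
    aug = np.zeros((n + 1, n + 1))
    aug[:n, :n] = H
    aug[:n, n] = g
    aug[n, :n] = g
    vals, vecs = np.linalg.eigh(aug)
    v = vecs[:, 0]                 # eigenvector of the lowest eigenvalue
    return v[:n] / v[n]            # step in the original coordinates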

Results: In table 3.17 we show the results of the optimization of two different structures: the number of steps required to converge as a function of the initial sparse Hessian.

Table 3.17: Steps required to converge two different structures, taking as initial Hessian a sparse matrix where only the elements larger than a threshold (first column) are stored
Threshold$ ^a$   $ \vert g_0\vert=8.1$   $ \vert g_0\vert=0.1$   approx. % stored$ ^b$
10               not converged           not converged           6.2
1                142                     25                      10.5
10$ ^{-1}$       165                     24                      22.1
10$ ^{-2}$       193                     6                       41.1
10$ ^{-3}$       195                     9                       65.3
10$ ^{-4}$       147                     9                       85.7
10$ ^{-5}$       158                     9                       96.3
10$ ^{-6}$       149                     9                       99.4
$ ^a$ In kcal/(mol$ \cdot $Å$ ^2$)
$ ^b$ The percentage depends on the initial structure, but in both cases it is very similar


From table 3.17 we can extract some conclusions. When the initial structure is too far from the stationary point ($ \vert g_0\vert=8.1$), the initial Hessian has little influence on the number of iterations: after a hundred steps the Hessian has been updated a hundred times and there is no trace of the initial matrix shape. On the contrary, when the initial gradient is smaller, that is, the structure is closer to the stationary point, the optimization obviously takes fewer steps to converge. It can be seen that storing only the elements larger than 10$ ^{-2}$ kcal/(mol$ \cdot $Å$ ^2$), that is, only 41.1 % of the Hessian, gives the same results as storing the whole Hessian. In addition, we obtain reasonable results storing only 22.1 % and even 10.5 %. The optimization of both structures fails when the threshold is too large (10 kcal/(mol$ \cdot $Å$ ^2$)), which means that essential matrix elements are not being stored.

From the above results, a quick conclusion is that a considerable number of Hessian elements can be discarded from storage, and consequently a memory-saving scheme is a worthwhile goal.

In the next section we will test this kind of Hessian, with the general shape displayed in figure 3.8.

Figure 3.8: General scheme for an initial Hessian. In black, a core where all elements are computed; in grey, a sparse region where only the elements larger than a threshold are stored. The rest is represented by a diagonal vector.
\includegraphics[width=0.3\textwidth]{Figures/Matrix/sparsescheme.eps}

From the input we decide which zone is represented by a full Hessian (black), which by a sparse Hessian with a threshold (grey), and which by a diagonal vector.
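A sketch of how such a composite initial Hessian could be assembled, assuming a numerically computed Hessian is available for the core and sparse zones; the function name, the index conventions and the unit diagonal guess for the environment are illustrative choices, not the actual input format.

import numpy as np

def build_initial_hessian(H_full, n_core, n_sparse, threshold, diag_guess=1.0):
    """Assemble the initial Hessian sketched in figure 3.8.

    Coordinates 0 .. n_core-1                : full block, all elements kept
    n_core .. n_core+n_sparse-1              : sparse block, |H_ij| >= threshold
    remaining coordinates                    : diagonal guess only
    H_full is assumed computed over the whole dimension here; in practice
    only the core and sparse zones would need to be calculated.
    """
    n = n_core + n_sparse
    H0 = np.diag(np.full(H_full.shape[0], diag_guess))
    # Core + sparse region, screened by the threshold outside the core.
    block = np.where(np.abs(H_full[:n, :n]) >= threshold, H_full[:n, :n], 0.0)
    block[:n_core, :n_core] = H_full[:n_core, :n_core]   # keep the core intact
    H0[:n, :n] = block
    return H0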