IBM Skip to main content
  Home     Products & services     Support & downloads     My account  
  Select a country  
Journals Home  
  Systems Journal  
Journal of Research
and Development
  ·  Current Issue  
  ·  Recent Issues  
  ·  Papers in Progress  
  ·  Search/Index  
  ·  Orders  
  ·  Description  
  ·  Patents  
  ·  Recent publications  
  ·  Author's Guide  
  Staff  
  Contact Us  
  Related link:  
     IBM Life Sciences  
IBM Journal of Research and Development  
Volume 45, Numbers 3/4, 2001
Deep computing for the life sciences
 Table of contents: arrowHTML arrowPDF arrowASCII   This article: HTML arrowPDF arrowASCII   DOI: 10.1147/rd.453.0427 arrowCopyright info
   

The adaptive multilevel finite element solution of the Poisson–Boltzmann equation on massively parallel computers

by N. A. Baker, D. Sept, M. J. Holst, and J. A.McCammon
By using new methods for the parallel solution of elliptic partial differential equations, the teraflops computing power of massively parallel computers can be leveraged to perform electrostatic calculations on large biological systems. This paper describes the adaptive multilevel finite element solution of the Poisson–Boltzmann equation for a microtubule on the NPACI Blue Horizon—a massively parallel IBM RS/6000® SP with eight POWER3 SMP nodes. The microtubule system is 40 nm in length and 24 nm in diameter, consists of roughly 600000 atoms, and has a net charge of -1800 e. Poisson–Boltzmann calculations are performed for several processor configurations, and the algorithm used shows excellent parallel scaling.

1. Introduction

Electrostatics plays a vital role in determining the specificity, rate, and strength of interactions in a variety of biomolecular processes [1, 2]. Accurate modeling of the contributions of solvent, counter ions, and protein charges to the electrostatic field can often be very difficult and typically acts as the rate-limiting step for a variety of numerical simulations. Rather than explicitly treating the solvent and counter-ion effects in atomic detail, continuum methods such as those based on the Poisson–Boltzmann equation (PBE) are often used to represent the effects of solvation on the electrostatic properties of a biomolecule. Despite this simplification, current methods for the calculation of electrostatic properties from the PBE still require significant computational effort and typically do not scale well with increasing problem size [3].

This paper describes the investigation of a large biomolecular system using the Adaptive Poisson– Boltzmann Solver (APBS) software [4]. APBS is a new Poisson–Boltzmann solver which uses adaptive multilevel finite element techniques [3, 5, 6] to efficiently treat the numerically difficult aspects of the PBE. APBS is built on the Finite Element toolkit (FEtk) adaptive multilevel finite element package [3, 5], which is used for most of the numerically intensive aspects of the solution of the PBE.

Section 2 provides some of the background of the Poisson–Boltzmann equation and its relation to biomolecules. A brief overview of adaptive multilevel finite element techniques and their parallelization is presented in Section 3. Discussion of the implementation of these methods in APBS and FEtk is presented in Section 4, and the results of the solution of the PBE for the electrostatic potential around a microtubule structure are given in Section 5. Finally, conclusions and future work are discussed in Section 6.

2. The Poisson–Boltzmann equation

The Poisson–Boltzmann equation is a second-order elliptic partial differential equation which describes the electrostatic potential around a fixed charge distribution in an ionic solution. For more thorough reviews of this equation and its role in biological electrostatics calculations, see Davis and McCammon [1] and Sharp and Honig [7]. There are three components of the solvated biomolecular system that we must consider in order to accurately model the electrostatic potential: the solute molecule, the solvent, and the solvated ions. The solute molecule is modeled as a dielectric continuum of low polarizability embedded in a dielectric medium of high polarizability which represents the solvent. Reflecting this difference in polarizabilities, the interior of the molecule is typically assigned a relative permittivity (or dielectric constant) between 2 and 20, while the solvent is given a much larger dielectric constant, generally near 80. In most cases, the atomic charge distribution inside the biomolecule is represented by a collection of delta functions. Finally, the solvated ions surrounding the biomolecule are also modeled as a continuum, distributed according to a Boltzmann distribution. Combining Poisson's equation, used to describe the electrostatic behavior of the point charges in the dielectric continuum, with the Boltzmann charge distribution for the solvated ions gives the nonlinear Poisson–Boltzmann equation (NPBE),

nabla•[epsilon(x)nablau(x)] + kappa bar2(x) sinh [u(x)]=f(x) u(infinity)=0, (1)

or, after linearization of the hyperbolic sine term, the linearized Poisson–Boltzmann equation (LPBE),

nabla•[epsilon(x)nablau(x)] + kappa bar2(x)u(x)=f(x) u(infinity)=0, (2)

where the source term is a sum of delta functions,

f(x)= 4piec2
kT
Nm
summation
i
=1
zidelta(xxi).
(3)

In Equations (1) and (2), the variable u(x) = ecphi(x)/kT represents a dimensionless electrostatic potential; epsilon(x) is the dielectric coefficient; kappa bar2 is the Debye–Hückel screening parameter, which describes ion concentration and accessibility; kT is the thermal energy; ec is the electron charge; Nm is the number of protein charges; zi is the partial charge of each protein atom; and xi is the position of each atom. Figure 1 shows a schematic of a solute (here taken to be a protein), ions, and solvent system modeled by the Poisson–Boltzmann equation. The dielectric coefficient epsilon changes by nearly two orders of magnitude across the “interior” protein–solvent boundary (solid line in Figure 1), and the screening parameter jumps from zero to a positive value across the “exterior” boundary (dashed line in Figure 1).

Figure 1Figure 1

The accurate pointwise evaluation of the dielectric and screening parameter coefficients for a typical biomolecule is a nontrivial task. APBS evaluates the dielectric coefficient epsilon by using the Lee and Richards [8] definition of solvent accessibility. In short, the algorithm considers the volume OmegaSA defined by the union of the (infinite) set of spheres with centers at all y member Omega such that parallely – xiparallel > ri + sigma for all atoms i and positions xi. Given some point y, the coefficient epsilon(y) is assigned the solvent dielectric constant if it is inside OmegaSA; otherwise, it is assigned the solute dielectric value. This definition of OmegaSA is shown in more detail in Figure 2 for a simplistic model of a protein molecule (white circles). The molecular surface (black line) is defined by the accessibility of the solvent probe molecules (blue circles); points outside this surface are assigned the solvent dielectric value (typically around 80), while points inside are assigned the solute dielectric (generally 2–20). The assignment of the screening parameter values kappa bar2 is much simpler; points outside a distance ri + sigmaion from all atoms i are assigned the bulk screening parameter value, while points closer than ri + sigmaion to any atom i are assigned a value of 0.

Figure 2Figure 2

As described here, the PBE equation contains three sources of discontinuities. Both the dielectric coefficient epsilon and the screening parameter kappa bar2 have jump discontinuities (analogous to Heaviside step distribution functions) near the protein–solvent interface. Additionally, the source term f(x), which models the point charges at the protein atoms, is represented by a sum of delta functions. Although these jump and delta function discontinuities of coefficients in the PBE can pose serious numerical difficulties for traditional uniform or nonadaptive mesh partial differential equation solvers, these features can be efficiently described using the adaptive finite element techniques described in Section 3 [3–5].

3. Parallel multilevel adaptive finite element methods

This section briefly describes the theory behind the parallel multilevel adaptive finite element scheme used to solve the PBE for the electrostatic potential around biomolecules. The first subsection describes basic finite element techniques, and the second discusses the incorporation of adaptivity into these methods. A very short description of multilevel techniques is presented in the third subsection, and the theory behind parallelization of these methods is described in the fourth subsection.

Finite element discretization
To solve the PBE on a finite computational platform, it is necessary to truncate and discretize the infinitely large problem domain implicit in Equations (1) and (2). Specifically, we solve the PBE equation inside a polygonal domain Omega subset R3 subject to some Dirichlet boundary condition

u(x)=g(x) on deltaOmega,

where deltaOmega denotes the boundary of Omega. To discretize the problem, we subdivide Omega by tessellation with tetrahedral simplices. The resulting tetrahedral mesh forms the structure over which we define Vh = span{vi}, as the space spanned by the piecewise-polynomial basis functions {vi}. APBS currently uses the piecewise-linear finite element support provided by FEtk [5]. A representative basis function is depicted (on a two-dimensional triangular mesh) in Figure 3. The solution to the PBE is approximated by a function uh member ubarh + Vh constructed by a linear combination of the basis functions

uh(x)= N
summation
i
alphaivi(x).
(4)

The trace function u barh is not explicitly constructed, but is assumed to satisfy the Dirichlet boundary conditions.

Figure 3Figure 3

In order for the construction of uh from piecewise-linear functions to be successful, we must restate the PBE equations in their “weak” form. Clearly, the second derivative [as required by Equations (1) and (2)] of a piecewise-linear function is not well defined. This difficulty can be overcome by integrating the PBE with a test function nu tilde,

integralOmega [–nablaepsilonnablau+kappa bar2 sinh(u)]nu tilde dx=integralOmega nu tilde dx,
(5)

and applying integration by parts to the second-order differential term to give

integralOmega [epsilonnablau•nabla nu tilde +kappa bar2 sinh(u) nu tilde ] dx=integralOmega f nu tilde dx.
(6)

Equation (6) can also be written as

<F(u), nu tilde>L2(Omega) =integralOmega [epsilonnabla unablanu tilde+kappa bar2 sinh(u)nu tildefnu tilde] dx=0,
(7)

where < • , • >L2(Omega) denotes the L2(Omega) inner product and F(u) is the weak form of the residual. This allows us to restate the PBE in weak form:

Find uhmember ubar+Vh such that <F(uh), vi>=0 for all vi member Vh. (8)

This form of the PBE requires only one order of differentiation under an integral (with integration in the Lesbegue sense) and is therefore a “weaker” formulation of the PBE than the original second-order differential equations [(1) and (2)]. Although the above discussion pertained to the use of the NPBE, similar manipulations can be performed for the LPBE to produce an expression for the residual F(u) which is linear in u.

Given uh as a linear combination of the finite element basis functions and the above weak form of the PBE [Equation (8)], we have a discretization of the partial differential equation suitable for numerical solution. In the case in which F(u) is linear (LPBE), Equation (8) explicitly defines a sparse-matrix equation that can be solved using standard linear algebra methods or the multilevel methods described in the subsection on multilevel solution. However, when F(u) is nonlinear (NPBE), we employ a damped inexact Newton iteration as implemented by FEtk [5] to find uh [39–13]. In brief, this method is used to solve the linear matrix equations in an iterative fashion to determine improvements w to the solution. When these improvements become sufficiently small, the Newton iteration stops, and the resulting uh is used as the solution. The linear systems used to find the solution improvements are defined by the functional (Gateaux) derivatives of the nonlinear residual,

<DF(u)w, v>= integralOmega [epsilonnablaw•nablav+kappa bar2 cosh(u)wvfv] dx, (9)

and the resulting linear equations for the improvements w are as follows:

Find w such that <DF(u)w, vi>=–<F(u), vi>+r   for all vimemberVh, (10)

where r is a residual that allows for enhanced efficiency by accounting for the possibly inexact solution of Equation (10). The updated solution is then obtained by addition of the improvement times a damping factor lambda that stabilizes the algorithm,

uleft arrowu+lambdaw. (11)

For more detailed discussion of the finite element method applied to the PBE, see Holst, Baker, and Wang [3] and Baker, Holst, and Wang [4]. The texts by Axelsson and Barker [14] and Braess [15] are good sources for more general reviews of the finite element method.

Error estimation and mesh refinement
While the methods of the previous section can be used to determine the solution on a given finite element mesh, they do not provide information about the accuracy with which the numerical solution uh represents the true solution or indicate whether it can be improved. The answers to these two questions lie within the domain of error estimation and adaptive refinement techniques. Again, we present only a cursory overview of this topic, briefly discussing the a posteriori error estimation and adaptive refinement techniques that are applied to the PBE in the present work. For more detailed information about the implementation of these methods in the solution of the PBE, see Holst, Baker, and Wang [3]. A posteriori error estimation has been the subject of several publications1 [5, 16–19] which provide much more information about the theory and implementation of these methods.

Adaptive refinement methods typically employ error-estimation techniques to approximate the distance between the numerical and true solution parallelu – uhparallelX (using some norm parallelparallelX) and determine the regions of the problem domain where the error is above a certain tolerance. The mesh is then refined in these regions of excess error and the PBE equation is re-solved to provide a more accurate finite element representation of the solution. The error-based refinement of the mesh can also be interpreted as the local enrichment of the finite element basis set in regions where the true solution is not adequately represented. In general, an a posteriori error estimator is used to determine the error in each simplex. Simpler a priori or geometry-based error estimators can also be used, but the reduction of error in the solution with each level of refinement is typically less efficient.

APBS employs the residual-based a posteriori error estimation framework provided by FEtk which generates a per-simplex error estimate etas in simplex s by using the residual defined by the strong form [Equations (1) and (2)] of the PBE [3, 18],

etas2=hs2parallelnablaepsilonnablauh+kappa bar2 sinh(uh)–fparallel 2
L2(s)
+ 1
2
summation
f member s
hfparallelleft floornfepsilonnablauhright floorfparallel 2
L2(f)
(12)

where hs denotes the size of the simplex, f member s denotes a face of simplex s, hf is the size of the face f, left floornu tilderight floorf denotes the jump across the face f of some function nu tilde, nfepsilonnablauh is the component of epsilonnablauh normal to simplex face f, and the L2 norm over a simplex or face is given by

parallelvparallel 2
L2(s or f)
=integrals or f |v|2 dx.
(13)

Since each error estimate is defined over a simplex or simplex face, the solution is linear over the entire domain of the norm [Equation (12)], and the contribution from the second-order term
nablaepsilonnablauh is zero. In general, the second (jump) term of the error estimator typically dominates etas; therefore, the first term is not implemented in APBS. An estimate of the global error over the problem domain is obtained as the root mean square of the per-simplex estimates

etaglobal= 1
N
open parenthesis summation
i
eta 2
s
close parenthesis ½ .
(14)

Although this etaglobal provides only an upper bound (within a constant) of the true error in the solution, it offers a practical measure for the reduction of error during solution of the PBE.

Given a per-simplex error estimate, those simplices with errors above a certain tolerance etatol are marked for subdivision. APBS employs the longest edge simplex subdivision algorithm in FEtk [5] for adaptivity. This subdivision method, along with other examples, is shown in Figure 4. Subdivision of only the marked simplices typically results in a nonconforming mesh, i.e., a mesh in which the faces of some simplices intersect the vertices of other simplices. This situation is depicted in Figure 4 where, without the subdivision depicted by the dotted line in the left-hand scheme, the triangular mesh would be nonconforming. Since nonconforming finite element meshes pose a variety of numerical difficulties, adaptive mesh refinement is carried out in an iterative fashion, via a “queue-swapping” algorithm [3, 5]. This algorithm, as implemented by the FEtk libraries [5], creates two empty queues (Q1, Q2) and fills one (Q1) with the list of simplices marked for refinement by the error estimator. The simplices in Q1 are refined, and the resulting nonconforming simplices (if any) are placed in Q2. After all of the simplices in Q1 have been refined, the roles of the queues are swapped (Q1 = Q2, Q2 = empty set) and the algorithm is repeated. This loop continues until the entire mesh is conforming, whereupon both queues are empty.

Figure 4Figure 4

Multilevel solution
The time required to solve the linear algebra equations, either within the Newton steps for NPBE or explicitly defined by the LPBE, generally dominates the solution of the PBE. Therefore, it is important to make these steps as efficient as possible. Multilevel methods are well-established techniques for efficiently solving such equations through algebraic hierarchies [9, 20–25]. Such methods have been shown to give optimal (for uniformly refined meshes [26]) or nearly optimal (for adaptively refined meshes [6]) time and memory complexity for the solution of the linear matrix equations.

APBS employs the multilevel finite element solver technology in FEtk [5] to form an algebraic hierarchy of problems based on the refinement of the mesh [3, 5, 27]. Specifically, a prolongation operator Pk is constructed which relates basis functions on refinement levels k and k + 1 of the finite element mesh. Given operator Ak on level k of the mesh, the prolongation operator Pk can be used to reconstruct the problems Ak+1 from coarser levels of the mesh by applying Pk to the current problem Ak via Ak+1 = PTkAkPk. By using this prolongation-based reconstruction, the problem can then be solved in a multilevel fashion, employing a direct solver for the problem on the coarsest level.

Parallel finite element methods
By using the parallel refinement techniques of Bank and Holst [28], the methods described in the previous sections can be performed in a parallel fashion. In the parallel implementation, each of the P = 2p processors is given the same initial mesh. Using the techniques described earlier in the subsection on error estimation and mesh refinement, each processor solves the problem on this coarse mesh and generates an a posteriori error estimate for every simplex in the mesh. This error estimate is then used to weight a spectral bisection method which partitions the mesh into P pieces of approximately equal error. Finally, each of these mesh partitions Mi is assigned to a different processor i. After partitioning, the usual solution and adaptive refinement methods of the previous sections are performed with only a small modification: When the per-simplex error estimates are calculated on processor i, only simplices within the local partition Mi and a boundary region of size sigma surrounding Mi are given nonzero error estimates. The simplices with errors greater than a specified tolerance are marked for refinement, and these marked simplices (which are located only in Mi and the overlap) are subdivided. The queue-swapping algorithm then proceeds as usual, with simplices from any partition being refined to ensure fully conforming mesh. The initial error-based partitioning step acts as the load-balancing mechanism for this algorithm; the number of simplices in an error-based adaptive refinement is related to the total error in the mesh region it is refining. By partitioning the mesh such that all processors have roughly the same amount of error, this algorithm provides a reasonable amount of a priori load balancing.

The overlap region surrounding each mesh partition is implemented by APBS in a simple fashion. Let xi be the center of geometry of partition Mi, and let Ri be the radius of the sphere circumscribing Mi. The parameter sigma greater or = to 1 is the desired relative size of the overlap region with respect to Ri. APBS then enforces parallel refinement with partition overlap by allowing only error-based simplex marking (on processor i) of simplices within a distance sigmaRi of the center xi of partition Mi.

A two-dimensional example of this method applied to a four-processor system is shown in Figure 5. In this example, all simplices within sigma = 1.2 times the radius of the green partition were assigned the same error (which was chosen to be greater than the error-based marking tolerance). The resulting refinement over the partition of the green processor and the overlap region is evident, as is the additional refinement outside the radius sigmaRi required for conformity.

Figure 5Figure 5

As noted by Bank and Holst [28], this error-decoupling parallel algorithm essentially trades computation for communication. Whereas the algorithm requires little or no communication between processors, it compensates by duplicating the computational effort spent in some portions of the solution algorithm. Specifically, partitioning steps of the mesh and computations on the solution in overlap regions are duplicated across processors. Although the overlap region can be neglected for some problems [28], a nonzero overlap region proportional to the size of Mi must generally be implemented to satisfy the requirements underlying the decoupled error estimates [28, 29].

4. Implementation

The APBS program provides parallel and sequential implementations of the multilevel adaptive finite element techniques described in Section 3 to solve the linear and nonlinear versions of the PBE around biomolecules in ionic solutions. APBS makes extensive use of FEtk [5] for a variety of tasks, including the implementation of finite element mesh structures, refinement algorithms and data structures, assembly and solution of the linear and nonlinear equations, spectral bisection mesh partitioning, and residual-based error estimation. Because of the underlying hardware abstraction design of FEtk, the APBS code is designed for portability and can be used, with no modifications, on both single-processor workstations and massively parallel supercomputers. Parallel communication is currently provided via vendor implementations of the MPI 1.1 standard; however, future plans include support for OpenMP protocols to better leverage the capabilities of shared memory platforms. APBS is currently in beta testing phase, but is scheduled for release in open-source form pending the addition of OpenMP support and other features. Information about obtaining FEtk [5] and related tools is available at http://www.scicomp.ucsd.edu/~mholst.

The sequence of operations followed by APBS during a typical solution of the PBE is outlined in Algorithm 1.

Algorithm 1: APBS program execution

  1. Initialize P processors; all subsequent steps are carried out by each processor.
  2. Read in the very coarse initial mesh (pre-mesh) and molecule systems.
  3. For each molecule (or molecular system):
    1. Assign atomic, dielectric, and ionic strength data.
    2. Map atomic charges to mesh simplices.
    3. Construct nonlinear and/or linear algebra structures for each molecular system.
    End for.
  4. Uniformly refine the mesh to contain no less than cP simplices (where c is usually 10–100).
  5. Solve the PBE and estimate the error for a reference molecular system.
  6. Partition the mesh into P pieces and assign each piece to a processor.
  7. While the size of simplices on the molecule–solvent boundaries is too big:
    1. Mark those simplices in the local partition and overlap region (see the subsection on parallel finite element methods) which contain a point charge or lie on the boundary between the solvent and the interior of any molecule.
    2. Refine marked simplices; refine the mesh to conformity.
    End while.
  8. While the global error estimate is too big:
    1. Solve the weak form of the PBE for each molecular system.
    2. Estimate the per-simplex error using the residual-based error estimator.
    3. Mark simplices on the local partition or overlap region with errors greater than some tolerance.
    4. Refine marked simplices; refine the mesh to conformity.
    End while.
The details of implementation for the particular steps of this algorithm have been mostly covered in previous sections. In summary, after initialization of the problem, partitioning of the mesh, and initial a priori refinement, APBS carries out the adaptive solve–estimate–refine procedure outlined in the subsection on error estimation and mesh refinement until a target accuracy has been reached.

Appropriate a priori refinement of the initial mesh can provide acceleration in convergence of the expensive solve–estimate–refine steps (Step 8 in Algorithm 1). The first step in the procedure is the reading of a very coarse “pre-mesh” (Step 2 in Algorithm 1) which completely encapsulates the desired problem domain. In general, this coarse mesh can be of any polyhedral shape; in practice it is typically a simple rectangular prism comprising six tetrahedra. To allow for the accurate representation of boundary conditions by an analytical Green's function model, the outer boundary of the pre-mesh should be at least twice the radius of the sphere which circumscribes the molecular complex. After the pre-mesh is read, it must be uniformly refined to a desired number (cP) of simplices for partitioning (Step 4, Algorithm 1). In general, a suitable number of simplices for partitioning is roughly 10–100 times the target number of partitions. This number of initial simplices not only provides a reasonable error estimate for error-weighted partitioning of the mesh, but also provides adequate flexibility for the spectral bisection partitioning algorithm. Although uniform refinement is not a requirement, the errors on the very coarse pre-mesh are typically so large that error-based refinement schemes lead to uniform marking and subdivision. Following error-weighted partitioning of the mesh, geometry-based refinement near point charges and molecular surfaces is carried out on the local partition and overlap region until the specified mesh resolution is reached (Step 7 in Algorithm 1). This process is essentially an a priori estimate of the problematic features of the system that will be refined by subsequent a posteriori estimate–refine steps. By subdividing simplices (on the local partition and overlap region of each processor) which lie across the dielectric or ionic strength boundaries or contain charges, the a priori refinement scheme is attempting to resolve some of the overall structure of the discontinuous problem coefficients prior to the more costly solve–estimate–refine loop.

5. Application to biomolecular systems

One of the advantages of using adaptive methods to solve the PBE is the ability to study large biomolecular systems that are not manageable with uniform mesh techniques. One such system of interest is the cytoskeleton, the complex array of filaments and proteins within every eukaryotic cell. The largest cytoskeletal component, the microtubule, is a hollow cylindrical filament (see Figure 6) assembled from the long protofilaments composed of tubulin subunits [30, 31]. The microtubule cylinders are 25 nm in diameter and, depending on function, can have lengths from nanometers to several millimeters. While microtubules are the most rigid structures in the cell and play an important structural role, they are also involved in a variety of other functions, including cellular transport, motility, and division. Many of these more dynamic functions involve interactions with other proteins or filaments in the cell, often through electrostatic interactions. For this reason, the ability to calculate the electrostatic properties of a microtubule can provide important insight into many cellular processes. It is the large size of microtubules that poses tremendous computational challenges; for example, the atomically detailed solution of the PBE for a 1-µm-long microtubule requires more than 21 million delta functions in the source term of the PBE to model the charge distribution to full atomic detail.

Figure 6Figure 6

APBS was used to solve the LPBE for a 40-nm-long microtubule consisting of 605205 atoms with a net charge of –1800 e. The microtubule structure was assembled by one of us (D. Sept) using microtubule structures derived from the work of Nogales, Whittaker, Milligan, and Downing [32]. The biomolecule was assigned an internal dielectric constant of 2 and surrounded by a solvent of dielectric 78.54 and ionic strength of 150 mM. The molecular volume was defined with 0.14-nm-radius solvent probes, and the ion accessibility was calculated using 0.20-nm probes. The pre-mesh was a six-tetrahedron cubic box with 90-nm sides. For each P-processor calculation, the pre-mesh was uniformly refined to over 100P simplices and partitioned by error-weighted spectral bisection. In order to ensure the best possible load balancing, no a priori adaptive refinement was performed. Instead, each partition in the mesh was subjected to the solve–estimate–refine adaptive refinement loop using the residual-based a posteriori error estimator until each processor had the target number of vertices (40000). These calculations were performed on one, two, four, eight, sixteen, and thirty-two processors of the NPACI2 Blue Horizon supercomputer—a massively parallel IBM RS/6000* SP with eight POWER3 SMP nodes. Because of the low communication costs, each job was submitted to the IP space queues. Jobs requiring less than 32 processors were run with all eight processors per node, giving each processor roughly 400 MB of heap memory. However, to provide adequate memory for the initial mesh refinement and partitioning steps, the 32-processor job was run with four processors per node, allowing approximately 800 MB of memory per processor.

The parallel scaling results for these calculations are shown in Figure 7. Although it was anticipated that each calculation would take roughly the same amount of execution time with the various processor configurations, the actual runs showed a decrease in the execution time with an increasing number of processors. The execution times fit the polynomial t1P + t0 with a correlation coefficient r2 = 0.91, a slope of t1 = (–140 ± 20) s/processor, and an intercept of t0 = (14800 ± 300) s. The solid curve in Figure 7 shows the global number of simplices in the mesh
L(P) (the sum of the number of simplices from each partition) as a function of the number of processors. This function was fit to a straight line L(P) = l1P + l0 with correlation coefficient r2 = 0.999, slope l1 = (2.05 ± 0.03) × 105 simplices per processor, and intercept l0 = (1.7 ± 0.6) × 105 simplices. Finally, the parallel efficiency was defined as

E(P)= L(P)
PL(1)
(15)

and plotted as the dotted curve on Figure 7. The efficiency was also fit to a linear polynomial E(P) = e1P + e0 with correlation coefficient r2 = 0.61, slope e1 = (7 ± 3) × 10–3 per processor, and intercept e0 = (1.08 ± 0.05). The mean efficiency of the six runs was E bar = 1.0 ± 0.1.

Figure 7Figure 7

As shown in Figure 7, APBS exhibits excellent scaling behavior for up to 32 processors. The parallel efficiency is very high for all processor configurations and shows only a slight decrease for larger (16 and 32) calculations. The “superlinear” scaling [E(2) = 1.9 and E(4) = 1.08] of the two- and four-processor configurations can be considered an artifact of the parallel efficiency definition. Since it is very difficult to refine mesh partitions to an exact number of simplices, the solve–estimate–refine loop can only be constrained to refine the partition to contain more than a specified number of simplices, therefore not providing an exact cutoff for each partition and processor configuration. The resulting scatter in L(1) and L(P) leads to parallel efficiencies that can deviate, both positively and negatively, from their “ideal” values.

Because of the large size of the resulting electrostatic potential datasets, it was not possible to visualize the results of the parallel calculations shown in Figure 7. However, a much lower-resolution calculation was performed on a slightly larger (60 nm long, 901083 atoms, –3000 e charge) microtubule to generate the electrostatic potential contours shown in Figure 8. As expected, the highly charged microtubule shows mostly negative electrostatic potential near the molecular surface (Figure 8, red contour). However, several regions of positive potential are visible, especially near the ends of the microtubule. Such localized variations in electrostatic potential often play important roles in molecular recognition and binding and suggest interesting modes of microtubule assembly and stability.

Figure 8Figure 8

6. Conclusions

By using new methods for the parallel solution of elliptic partial differential equations, it is possible to leverage the teraflops computing power of massively parallel computers to perform electrostatic calculations on biological systems at scales approaching the cellular level. Using the APBS and FEtk software on the NPACI Blue Horizon supercomputer, it was possible to solve the Poisson–Boltzmann equation for the electrostatic potential around a microtubule containing more than 600000 atoms. The code showed excellent parallel scaling, providing incentive to attempt further calculations to probe the structure and function of even larger macromolecules.

Future work is aimed at the inclusion of OpenMP extensions into APBS to take better advantage of shared-memory machines and reduce the memory overhead associated with the duplicated storage of biomolecular atomic information. Work is also in progress to utilize more of the Blue Horizon's parallel capabilities and investigate much larger biomolecular systems involving calculations on millions of atoms. By enabling such research through the use of parallel computing, theoreticians should be able to move from computational chemistry at molecular scales to computational biology at the cellular level.

Acknowledgments

N. Baker's work was supported by predoctoral fellowships from the Howard Hughes Medical Institute and the Burroughs–Wellcome La Jolla Interfaces in Science program. M. Holst's work was supported in part by a UCSD Hellman Fellowship, and in part by NSF CAREER Award No. 9875856. J. A. McCammon's work was supported in part by grants from NIH and NSF. N. Baker is indebted to Amit Majumdar, Giridhar Chukkapalli, and Greg Johnson at the San Diego Supercomputer Center for help with technical issues on the NPACI Blue Horizon supercomputer and visualization of the large datasets. N. Baker would also like to thank Dr. Adrian Elcock (University of California at San Diego, Department of Chemistry and Biochemistry) for helpful explanations of the various solvent- and ion-accessible volume definitions used in PBE solvers.

*Trademark or registered trademark of International Business Machines Corporation.

References

Footnotes

1M. Holst and D. Bernstein, "Adaptive Finite Element Solution of the Initial-Value Problem in General Relativity: Theory and Algorithms", Commun. Math. Phys., submitted.
2National Partnership for Advanced Computational Infrastructure.

Received August 2, 2000; accepted for publication October 5, 2000