3D Kirchhoff depth migration algorithm: A new scalable approach for parallelization on multicore CPU based cluster
Introduction
With fast and continuous improvement in computer hardware architectures, enhancing the computation and performance of seismic data processing and imaging applications has become a fundamental requirement that attracts the attention of many researchers and High Performance Computing (HPC) scientists (Linda, 2015). Seismic imaging methods are required for accurate estimation of the image of subsurface geology, including the properties of the rocks beneath, using acoustic measurements recorded on the surface of the earth. These methods are among the oldest candidates for HPC technologies, since they are mathematically complex and must be solved for very large data volumes (Almasi et al., 1992). Our current work addresses the computational aspects of the 3D Kirchhoff Depth Migration (KDM) method, which is one of the oldest seismic migration methods (Hagedoorn, 1954). From an algorithmic perspective, 3D KDM is highly compute intensive, and its resource requirements grow with increasing input data size. Therefore, enhancing its computational performance on various computer architectures is still an open research problem.
3D KDM consists of two crucial compute-intensive operations: traveltime computation and migration summation of seismic data (Schneider, 1978). The quality of the 3D KDM outcome depends upon the accuracy of the traveltimes (Uwe et al., 1996). Traveltimes can be computed by several methods, such as finite-difference eikonal solvers and ray-tracing based methods, with increasing levels of computational complexity (Coman, 2003). Irrespective of the method adopted for traveltime computation, the storage of the computed values and their supply to the migration process is the prime factor determining the computational speed of the algorithm on any platform (Alkhalifah, 2011). The second compute-intensive operation is the summation of diffraction amplitudes computed from the seismic data, which needs to process all the traces in the data, guided by an aperture function, in order to image a single grid point in the 3D subsurface model.
In the last few years, researchers and HPC scientists have been actively involved in optimizing the performance of this algorithm on state-of-the-art computer architectures. Panetta et al. (2007) described the computational characteristics of KDM on a quad-core IBM Blue Gene using MPI and OpenMP. Li et al. (2009) proposed an MPI-based partitioning strategy for the 3D prestack KDM algorithm that handles its large memory requirement by dividing the imaging space on a multicore CPU based system. This strategy improved the memory efficiency of the migration for a limited number of processors, but for large processor counts the I/O and communication overhead increased significantly. Teixeira et al. (2013) tested 3D prestack KDM on GPU-based clusters and found a significant gain in efficiency compared to the CPU-only version of the algorithm. They used a ray-tracing algorithm for traveltime computation, which was not ported to the GPU due to its memory limitations. Wang et al. (2014) described various methods of porting KDM to GPUs and showed that an 8–15X speedup can be obtained.
In our previous work, an efficient parallel poststack and prestack 3D Kirchhoff depth migration algorithm was successfully demonstrated on the current class of multicore systems (Rastogi et al., 2015). We introduced the concept of flexi-depth iterations while depth migrating data in a parallel imaging space. The parallelization approach achieved effective utilization of the available node memory for traveltime computations without the need for interpolation at runtime. The storage, I/O and communication requirements of the algorithm were successfully minimized; however, an in-depth performance analysis shows that the scalability of the application degrades beyond a certain number of nodes. To further optimize the parallel performance of the previously developed algorithm, the parallelization approach for 3D KDM has been re-designed in order to improve the overall scalability as well as to reduce the compute time of the application.
The theoretical foundation of both the previous and the new implementation of 3D KDM is the same, and is described in detail in Section 2. The major focus of the current article is on the parallelization strategy that has driven the acceleration in performance of the algorithm over the previous approach. The results are demonstrated using the 3D Overthrust data on the CPU based multicore cluster PARAM Yuva II, one of the PARAM series of supercomputers. A comparative study of the previous and current approaches is performed through computational experiments, and conclusions are drawn based on the computing time. A study of the performance metrics of the resulting parallel application with respect to increasing numbers of nodes shows promising results and demonstrates the effectiveness of the re-designed parallelization approach.
Theory and methodology
The theory of Kirchhoff migration is very well established. It is an integral solution to the scalar wave equation based on diffraction summation, which is governed by Huygens' principle (Yilmaz, 2001). Fig. 1 depicts the theoretical aspects of 3D KDM. The discrete form of the integral solution can be written as shown in Eq. (1):

M(x) = Σ_{x_r} Σ_{y_r} W(x, x_r, y_r) · u(x_r, y_r, t_s + t_r)    (1)

where M(x) is the migration outcome at imaging location x in 3D space, x_r and y_r are the inline and crossline receiver
Parallelization methodology
Theoretically, the 3D KDM algorithm exhibits inherent parallelism, since the computation of the diffraction-surface amplitude for an imaging location is independent of the other locations. After these computations, the summation of amplitudes can be staged for final imaging. This theoretical advantage is exploited for parallelization in the current implementation. The algorithm's compute time is largely governed by the computation and storage of traveltimes and the mechanism for feeding them to the
Implementation details
Major features of the current parallel implementation of 3D KDM are described below:
The system
The application is developed and tested on PARAM Yuva II, a supercomputer of the PARAM series. It is a 225-node Linux cluster with 64 GB of memory per node. Each node has two sockets, each with an octa-core Intel Xeon E5-2670 series processor running at a 2.6 GHz core frequency. The cluster's primary System Area Network (SAN) is FDR InfiniBand™.
Numerical experimentation data
Performance evaluation of the current 3D KDM algorithm is done using the synthetic 3D Overthrust data developed by the SEG/EAGE modeling committee (Aminzadeh et al., 1996), in dip
Conclusions
A new scalable parallelization approach for the 3D Kirchhoff depth migration application has been presented, which is robust and can efficiently migrate both prestack and poststack data on a state-of-the-art multicore CPU based cluster. Traveltime computations are efficiently managed on the actual grid size during runtime within the node memory using Flexi-Trace iterations. The retain/replace policy for the traveltime computations balances the migration requirements of the algorithm with minimal
Acknowledgement
This work was supported by the “Development and Adaptation of Applications, System Software and Hardware Technologies for Hybrid Architecture Based HPC Systems” project of the Department of Electronics and Information Technology (DeitY), Government of India. The authors are thankful to the Centre for Development of Advanced Computing (CDAC), Pune, for providing the PARAM Yuva II computing facility along with permission to publish this work, and to the team of the National PARAM Supercomputing Facility for their support.
References (18)
- Almasi et al., 1992. Parallel distributed seismic migration. Future Gener. Comput. Syst.
- Rastogi et al., 2015. An efficient parallel algorithm: poststack and prestack Kirchhoff 3D depth migration using flexi-depth iterations. Comput. Geosci.
- Alkhalifah, 2011. Efficient traveltime compression for 3D prestack Kirchhoff migration. Geophys. Prospect.
- Aminzadeh et al., 1996. Three dimensional SEG/EAGE models – an update. Lead. Edge.
- Cohen, J.K., Stockwell, J.J.W., 2010. CWP/SU: Seismic Un*x, Release no. 42: An open source software package for...
- Coman, R., 2003. Computation of Multivalued Traveltimes in Three Dimensional Heterogeneous Media. Thesis...
- Hagedoorn, 1954. A process of seismic reflection interpretation. Geophys. Prospect.
- Hwang, K., 1992. Advanced Computer Architecture: Parallelism, Scalability, Programmability, 1st Edition. McGraw-Hill...
- Jean-Pierre, G., Jeff, L., Michael, Y., 2000. Metacomputing and the Master-Worker Paradigm. Tech. rep., In Preprint...