A GPU Implementation and Optimization for Parallel Double-difference Seismic Tomography
  Pei-Cheng Liao     李政其(Li, Cheng-chi)     Yu-Chi Lai     Ping-Yu Chang     Haijiang Zhang     Clifford Thurber  
Liao, P.-C., Li, C.-C., Lai, Y.-C., Chang, P.-Y., Zhang, H., and Thurber, C., “A Graphics Processing Unit Implementation and Optimization for Parallel Double-Difference Seismic Tomography”, Bulletin of the Seismological Society of America, 2014, 104, 2, 953-961

Double-difference seismic tomography can estimate the velocity structure and the event locations with higher precision, but its high computational cost and large memory usage make processing very large datasets on a personal computer impractical and slow. This work proposes GPU-based acceleration schemes that let the algorithm process very large datasets more efficiently on a personal computer. The algorithm can be divided into five major steps: input, ray tracing, matrix construction, inversion, and output. This work focuses on accelerating the ray tracing and inversion steps, which together take almost two-thirds of the computation time. Before ray tracing, our algorithm preprocesses the data by sorting all recorded event-station paths by length, so that the path estimation jobs assigned to GPU cores suit the GPU architecture. Our work also minimizes the use of global and local memory to reduce the GPU computing time needed to handle a very large dataset. In addition to parallelizing the inversion computation, our work proposes a GPU-based elimination method that removes redundant computation in the inversion for further acceleration. In our tests, the proposed acceleration schemes achieve maximum speed-up factors of 31.17 for ray tracing and 35.46 for inversion. Overall, the GPU-based implementation is up to 5.98 times faster than the CPU-based implementation.
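
The path-sorting preprocessing step can be illustrated with a small host-side sketch. It is not the authors' implementation: it assumes that path length can be approximated by the straight-line event-station distance, that paths are handed to the GPU in consecutive groups of 32 (the warp size on NVIDIA hardware), and that a smaller spread of lengths within a group means more uniform work per thread. All function names are hypothetical.

```python
import math

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def path_length(event, station):
    """Straight-line distance, used here as an assumed proxy for ray-path length."""
    return math.dist(event, station)


def sort_paths_by_length(events, stations, pairs):
    """Return the (event index, station index) pairs ordered by path length."""
    return sorted(pairs, key=lambda p: path_length(events[p[0]], stations[p[1]]))


def warp_batches(pairs, warp_size=WARP_SIZE):
    """Split the pairs into consecutive groups, one group per warp."""
    return [pairs[i:i + warp_size] for i in range(0, len(pairs), warp_size)]


def max_length_spread(events, stations, batches):
    """Largest (longest - shortest) path length found within any single batch."""
    spread = 0.0
    for batch in batches:
        lengths = [path_length(events[e], stations[s]) for e, s in batch]
        spread = max(spread, max(lengths) - min(lengths))
    return spread
```

Comparing `max_length_spread` for `warp_batches(pairs)` and `warp_batches(sort_paths_by_length(events, stations, pairs))` shows how sorting groups paths of similar length together, which is the property the preprocessing step relies on.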