1 Introduction

The 2004 Sumatra–Andaman earthquake (3.295° N, 95.982° E, depth = 30.0 km, M = 9.1 at 00:58:53 UTC on 26 December 2004 according to the United States Geological Survey [USGS]) is the largest event to have occurred so far in the twenty-first century, generating a devastating tsunami and causing severe damage and loss of human life (Satake, 2014). The 2004 tsunamis propagated through the Indian Ocean, spread to the Atlantic Ocean and the Pacific Ocean (Titov et al., 2005), and were recorded at numerous tide gauges (TG), ocean bottom pressure gauges (OBPG), and DART [Deep-ocean Assessment and Reporting of Tsunamis] stations (Rabinovich & Thomson, 2007; Rabinovich et al., 2011a, 2017). The 2004 earthquake was the first tsunami captured by satellite altimetry (SA) measurements (Gower, 2005; Smith et al., 2005). Using these instrumentally recorded tsunami data, previous studies estimated the slip distributions or rupture propagation on the faults of the earthquake (Arcas & Titov, 2006; Fujii & Satake, 2007; Gopinathan et al., 2017; Hirata et al., 2006; Lorito et al., 2010). Among them, our previous study (Fujii & Satake, 2007) carried out joint inversion of TG and SA data and concluded that the tsunami source was about 900 km long, assuming rupture velocity (Vr) of 1.0 km/s. While the inversion of only SA data indicated that the tsunami source extended to the Andaman Islands with a total length of 1400 km, such a long fault model with Vr of 1.0 km/s produced much larger tsunami waveforms than observed at Indian tide gauge stations.

Observed far-field tsunamis are known to delay and have a reversed polarity of the first tsunami wave relative to computed ones based on the long-wave theory (Rabinovich et al., 2011c; Watada et al., 2014). These are due to the effects of elasticity of solid earth, compressibility of seawater and gravitational potential variation associated with the motion of mass during the tsunami propagation, that are not modeled in long-wave computations. To account for these effects, a phase correction method to tsunami waveforms calculated as conventional linear long waves was proposed by Watada et al. (2014). It has been applied to inversions of tsunami waveforms recorded at far-field from the 2010 Chile earthquake (Yoshimoto et al., 2016), the 2011 Tohoku earthquake (Ho et al., 2017), and the 1960 Chile earthquake (Ho et al., 2019), and successfully reconstructed the fault slips. Recently, Fujii et al. (2020) applied the method to the tsunami waveform inversion of the 2005 Nias earthquake (Mw 8.6) and demonstrated that far-field tsunami data have the potential to reveal the slip distribution, comparable to local geodetic data, when the phase-corrected Green’s functions are used.

In this study, we perform tsunami data inversions to estimate the slip distribution on the fault of the 2004 Sumatra–Andaman earthquake using the phase-corrected Green’s functions. We first invert the sea surface heights (SSHs) from SA measurements, in which the tsunami signals were extracted after reducing background ocean noise (Hayashi, 2008), next the tsunami waveforms at TGs and OBPGs, and then the joint data of the SSHs and tsunami waveforms. Further, we evaluate the tsunami source models by comparing the observed tsunami waveforms recorded at far-field OBPGs and DARTs with the synthetic ones calculated from the fault model.

2 Data

2.1 Tsunami Waveforms at Tide Gauges

Tsunamis generated from the 2004 Sumatra–Andaman earthquake propagated throughout the Indian Ocean and spread globally across the Atlantic Ocean to the west and the Pacific Ocean to the east (Rabinovich et al., 2011a, 2017; Titov et al., 2005). The tsunamis were recorded at TGs and OBPGs in the Indian Ocean (Rabinovich & Thomson, 2007) (Fig. 1). In the previous study (Fujii & Satake, 2007), we used the TG data from Indonesia, Thailand, Sri Lanka, India, Maldives and Australia. Most of the TG data were available from an open website of the University of Hawaii Sea Level Center (UHSLC) with a digital sampling rate of 2 or 4 min. The Cocos TG data (1-min sampling) were provided by the National Tide Center, Australia. The Indian TG data were provided by the Survey of India (SOI) (Nagarajan et al., 2006), and some TG data in Thailand were digitized from picture images in the tsunami field survey paper (Tsuji et al., 2006). The detailed information of these data, including other stations such as Port Blair, are listed in Table 1 of Fujii and Satake (2007).

Fig. 1
figure 1

Tide gauge (TG) stations (yellow triangles), ocean bottom pressure gauge (OBPG) stations (magenta squares) and tracking data points (red: Jason-1, dark green: TOPEX/Poseidon, blue: Envisat, yellow: GFO p208, light blue: GFO p210) of sea surface heights (SSHs) from satellite altimetry (SA) in and around the Indian Ocean. The epicenter of the 2004 Sumatra–Andaman earthquake is also shown by the red star. The black rectangle indicates the source area shown in Figs. 4, 9 and 12. The tsunami waveforms recorded at TG and OBPG stations are shown in Figs. S7, S8, 10, 11, 14 and 15. The TG data whose station names are shown by narrow italic in white with the black border are not collected in this study (e.g. the observed tsunami waveforms in South Africa are shown in the last figure of Fig. S7 but not shown for Kerguelen, Rodriguez, and La Reunion.). The observed TG and OBPG data whose station names are shown by regular font in black with the white border are used for tsunami waveform inversions, and those by regular font in white with the black border are not used for the inversions

Table 1 Subfault parameters for the 2004 earthquake and the slip amounts from the inversions using only SA data, only TG data, and both SA data and TG data with rupture velocity of 1.3 km/s

In this study, we additionally use other TG data compiled by Rabinovich and Thomson (2007) to compare with the calculated tsunami waveforms. We preprocessed the observed tsunami waveforms by removing ocean tide signals which have longer periods than tsunami signals. The tsunami waveforms of some stations used for inversion are resampled with a time interval of 1 min. We show not only the observed tsunami waveforms of the collected TG data but also the calculated waveforms even for the stations without the observed data to see the effects of the phase correction.

2.2 Tsunami Waveforms at Ocean Bottom Pressure Gauges

The 2004 Sumatra–Andaman earthquake tsunamis were also recorded at OBPG stations (Fig. 1) in Antarctica (Nawa et al., 2007), at far-field OBPG stations in the Drake Passage (Rabinovich et al., 2011c) and at DART stations off the Pacific coast of the United States (Rabinovich et al., 2011b) and off northern Chile (Rabinovich et al., 2017) (Fig. 2). Two stations (BPG1, BPG2) are located near and offshore the Syowa Base in Antarctica (Nawa et al., 2007) (Fig. 1). The digital data of BPG1 were downloaded from the Japan Coast Guard website, and the BPG2 data were provided by the National Institute of Polar Research. The sampling intervals are 30 s and 1 min for BPG1 and BPG2 data, respectively. Tsunami waveforms for OBPG stations (DPN and DPS) at the Drake Passage were digitized from Rabinovich et al. (2011c). The digital data of DART NeMO and DART 46505 were downloaded from the National Oceanic and Atmospheric Administration (NOAA) website. The sampling intervals are 15 s. The tsunami waveform for DART 32401 was digitized from Rabinovich et al. (2017). The maximum amplitudes are about 1 cm at DARTs 32401, NeMO, and 46505; a few cm at DPN, DPS; about 5 cm at BPG2; and about 40 cm at BPG1. In the previous studies (Rabinovich et al., 2011c, 2017), they modeled the observed tsunami waveforms of DPN, DPS and DART 32401 and found that the time shifts (delay) of 15, 10 and 20 min are needed for conventional tsunami calculations to fit the observed peaks, respectively.

Fig. 2
figure 2

Far-field stations (magenta squares) of three DARTs and two OBPGs (DPN and DPS), which recorded tsunami waveforms. Tsunami ray paths are estimated by the method of (Ho et al., 2017, 2020). (top) Tsunami ray paths leaving from the western side of the epicenter (red star) to the stations are shown by dark red lines. (bottom) Tsunami ray paths leaving from the eastern side to the stations are shown by blue lines. The observed tsunami waveforms are shown in Figs. 8, S10 and S12

2.3 Sea Surface Heights from Satellite Altimetry

The 2004 tsunamis were captured as SSHs by SA measurements. In the previous study (Fujii & Satake, 2007), we used SA data of three satellites, Jason-1, TOPEX/Poseidon and EnviSat for tsunami data inversions (white circles in Fig. 3). To extract tsunami signals embedded in the large spatial variations, the SSHs were processed by subtracting the SA data before the earthquake from those after the earthquake. In this study, we use the tsunami signals in SA data (gray circles in Fig. 3), which were processed by Hayashi (2008). He eliminated the effects of oceanographical phenomena except the 2004 tsunamis and detected the tsunami signals along the five track lines of four satellites [one track from each satellite: Jason-1, TOPEX/Poseidon, EnviSat, and two tracks (passes 208 and 210) from satellite Geosat Follow On (GFO)]; thus his data have a better quality as the tsunami height data than those used in Fujii and Satake (2007). Jason-1 and TOPEX/Poseidon recorded the tsunami profile flying against the tsunami propagation at about 5° S, while EnviSat, GFO passes 208 and 210 caught up with the wavefront from behind at about 18° S, 46° S and 49° S, respectively (Figs. 3 and S1 for the animation). The maximum SSHs are about 60 cm at 4° S, 50 cm at 4° S, 30 cm at 16° S, 30 cm at 37° S and 20 cm at 47° S for Jason-1, TOPEX/Poseidon, EnviSat, GFO passes 208 and 210, respectively. The number of data points are 4,426 in total; Jason-1: 717, TOPEX/Poseidon: 404, EnviSat: 763, GFO pass 208: 1,044 and GFO pass 210: 1,498. The time intervals of data are about 1 s for all the SA data.

Fig. 3
figure 3

Sea surface heights (SSHs) (gray circles) extracted from satellite altimetry (SA) measurements processed by Hayashi (2008) for Jason-1, TOPEX/Poseidon, EnviSat, GFO passes 208 and 210 from top to bottom. White arrows indicate the directions of satellite movements along latitude. Right-top three figures show observed SSHs (white circles) used for inversions in Fujii and Satake (2007)

3 Method

3.1 Subfault Models and Tsunami Computations

We use the same subfault configurations of Fujii and Satake (2007) for the 2004 Sumatra–Andaman earthquake, that is a 22-subfault model along the trench (Table 1; Fig. 4). The fault area includes almost all the aftershocks within 1 day after the mainshock (Fig. 4a). Each subfault has a length of 100 km and a width of 100 km. From off the northern Sumatra Island through the Nicobar Islands, subfaults 1–16 are shallow subfaults with the top depth of 3 km for odd numbers and deep subfaults with the top depth of 20 km for even numbers. The subfaults 17–22 are only the shallow subfaults from the northern Nicobar Islands through the Andaman Islands.

Fig. 4
figure 4

Inversion results using SA data with assumed rupture velocities (Vr) of a 0.9, b 1.3, c 2.0 and d 2.5 km/s. The blue star indicates the epicenter of the 2004 Sumatra–Andaman earthquake. Red circles show the aftershocks within 1 day after the mainshock. Locations of epicenters are determined by USGS

We calculated the tsunami propagation from each subfault to the TG, OBPG and DART stations and also to the measurement points of SA data (Figs. 1 and 2). For the TG stations in and around the Indian Ocean and two OBPG stations in Antarctica, the tsunami computation area ranges 10° W–160° E and 71° S–31° N as shown in Fig. 1. The 30 arc-sec bathymetry grid data from GEBCO 2014 (Weatherall et al., 2015) were resampled with a 24 arc-sec grid interval; thus the numbers of computation grids are 25,500 and 15,300 in the longitude and latitude directions, respectively. We numerically solved the linear long wave equations (Satake, 1995) in the spherical coordinate system for 20-h tsunami propagation with a time interval of 1 s to satisfy the stability condition. The computation time was about 2 h and 50 min with GPGPU (TESLA V100 32 GB and CUDA 9.0) as used in Satake et al. (2017) and Fujii et al. (2020). For the SA data points in the Indian Ocean, the bathymetry data resampled with a 2 arc-min grid interval were used for the same tsunami computation area (Fig. 1). For the far-field OBPG and DART stations, two bathymetry grids were prepared for tsunami propagations toward west: 170° W–130° E and 79° S–61° N (Fig. 2, top), and toward east: 30° E–340° E and 79° S–61° N (Fig. 2, bottom). The grid intervals are 2 arc-min, thus the grid sizes are 9000 × 4200 and 9300 × 4200 for the west and east computation grids, respectively. The 48-h tsunami propagations were simulated for both the grids with a time interval of 3 s to satisfy the stability condition. The computation times were about 41 min for both the grid cases with the same GPGPU environment.

To prepare the initial SSH from each subfault with a unit slip of 1 m, we calculated the seafloor displacement from a rectangular fault model (Okada, 1985) using coarser grid data of 2 arc-min. We also included the effects of horizontal displacement on a steep bathymetric slope (Tanioka & Satake, 1996). For the tsunami propagations in the Indian Ocean (Fig. 1), we resampled the 2 arc-min initial SSH distribution into the 24 arc-sec grid data for the initial condition of the tsunami computations. In all the tsunami simulations mentioned above, we assumed a rise time of 60 s for all the subfaults. The simulated tsunami waveform at a station from a subfault is called a Green’s function which can be used for inversions or synthesizing tsunami waveforms. Before applying the phase correction to the Green’s functions as described in Sect. 3.2, the calculated tsunami waveforms at stations near subfault models, i.e., Port Blair and Sibolga (see Figs. 4a, 9a or 12a), were shifted vertically to remove the offset due to the crustal deformation. Although the effect of this correction is negligible for the other distant stations, we systematically applied the precorrections to all the Green’s functions from the 24 arc-sec grid computations.

3.2 Phase Correction of Green’s Functions

The tsunami waveforms (Green’s functions) are calculated based on the method of Watada et al. (2014) that reproduces the tsunami arrival time delay and the initial-phase reversal by taking into account the elastic and gravitational coupling effect between the solid earth and the ocean, as well as seawater compressibility effect. These effects reduce the conventional phase velocity of long-waves depending on the wave period and the local ocean depth, and appear as a change in the phase spectra of the tsunami waveform. The time- and space-dependent changes in the gravity field due to the mass movement of seawater and seafloor deformation caused by seawater loading during tsunami propagation should be understood separately from the static geographic variation in the Earth's gravity field along the tsunami path. The effect of static gravity field variation on tsunami travel time is an order of magnitude smaller than the effect of dynamic gravity field changes (Watada et al., 2014). The total effects of the phase change from the source to the station can be expressed as the sum of the phase difference in the phase spectra between the long-wave Green’s function and calculated Green’s function along the ray path. The wave theory for tsunamis developed by Watada et al. (2014) predicts that, regardless of the local ocean depth, the amount of frequency-dependent phase difference depends mostly on the propagation distance from the source to the receiver, therefore the computation of the phase-corrected Green’s function is greatly simplified as follows: to correct the phase of a Green's function, first, we converted the time-domain long-wave Green’s function to the Fourier spectra in the frequency domain by using fast Fourier transform (FFT). Next, we corrected the phase differences corresponding to the epicentral distance (great circle length) or the tsunami ray path from the subfault location to the station. Then, the spectra are transformed back to the time domain. We used the phase-difference table of Ho et al. (2017) to further include the effect of stratified ocean layers.

For the TG and OBPG stations (Fig. 1) that are used for tsunami waveform inversions, the phase-corrected Green’s functions are resampled as 1-min data, the same sampling rate as the observed tsunami waveforms. Since SA data are spatial and temporal data, we also applied the phase correction to all the Green’s functions at 4426 data points of the SA measurements. We used the epicentral distances from subfaults locations for the phase-corrections. The phase-corrected Green’s functions are prepared with a sampling interval of 1 s, the same interval as the SA data, by interpolating the calculated tsunami waveforms.

For the far-field OBPG and DART stations (Fig. 2), the phase-correction effect would be large because of the long distances of tsunami propagation. Here, we used the tsunami propagation distance along the eastward or westward tsunami ray path (thick lines in Fig. 2) instead of the epicentral distances (great circle length). This is because the far-field ray paths in Fig. 2 are much longer than the great circle lengths, whereas the distances along the tsunami ray path and great circle path are almost the same within the area in Fig. 1. Figure S2 shows an example of point-to-point tsunami ray path estimation using the method proposed by Ho et al. (2017) and Ho et al. (2020). We estimated the westward tsunami propagation distances as DPN: 14,000 km, DPS: 13,400 km, DART 32401: 18,700 km, DART 46405: 27,000 km, DART NeMO: 27,300 km (Fig. 2, top) and the eastward ones as DPN: 15,900 km, DPS: 15,600 km, DART 32401: 18,800 km, DART 46405: 25,200 km, DART NeMO: 25,400 km (Fig. 2, bottom), respectively, then used them for the phase corrections.

3.3 Tsunami Data Inversion

The same method with Fujii and Satake (2007) was applied for the inversions of tsunami data, i.e., SSHs for SA data and tsunami waveforms for TG data. We estimated the slip amount for each subfault and its error by using the non-negative least square method (Lawson & Hanson, 1974) and the delete-half jack knife method (Tichelaar & Ruff, 1989), respectively. We adopted a rupture propagation model from south to north subfaults assuming constant or average Vrs from 0.5 to 3.0 km/s with an interval of 0.1 km/s. The rupture propagation is implemented to the subfault model as the rupture delay time from the epicenter to the edges of the subfaults.

Weight factors are often used in data inversions depending on different data sets such as SA data, TG data, OBPG data, geodetic data, or their data qualities. In the previous studies (Fujii & Satake, 2007; Fujii et al., 2020), we have learned that the inverted slip amounts could be controlled by the weight factors; on the other hand, different weights for some data sets or specific stations were needed to reproduce the observed data. To better explain the observed large tsunami waves at near field and the OBPGs in Antarctica, we weighted these stations separately from other stations in our inversion.

For the SA data inversions, the spatial window range for each satellite track was determined so that the initial waveforms of tsunami signals in SSHs from all the Green’s functions are included in that spatial range even for the case of latest arrivals (Vr = 0.5 km/s in this study). We did not set any weights for different satellite track data.

For the TG data inversions, we selected a different station set from our previous study (Fujii & Satake, 2007), namely Sibolga, Belawan, Port Blair, Cocos, Paradip, Vishakhapatnam, Chennai, Colombo, Hanimaadhoo, Male, Gan, Diego Garcia, Pointe La Rue, Salalah, Lamu, Zanzibar, Richards Bay, East London, Port Elizabeth, Mossel Bay, and two OBPGs, BPG1 and BPG2, at the Syowa Base in Antarctica (Fig. 1). Hereafter, we refer to this data set simply as TG data. We set three times larger weights to Colombo and Port Blair data, and 30 times larger weights to BPG1 and BPG2 because we found that if we use the same weights, slips on faults are suppressed and they cause much underestimation of the observed tsunami waveforms.

For the joint inversions of SA data and TG data (hereafter, we refer to this data set as SA+TG data), we set five times larger weights to the SA data than the TG data in order to make both data norms equal following the method by Satake (1993).

Before applying the inversions described above to the actual observed data, we performed inversion tests for different (SA, TG, SA+TG) data sets synthesized from a checkerboard slip distribution (Figs. S3, S4, S5 and S6). The assumed Vr was set to 0.9 km/s for generating the target synthetic data. We tested three cases of synthetic data without noise, with 20% and 60% of random-noise amplitudes which were generated following a normal distribution with the same standard deviation for each data set, assuming three Vrs of 0.8, 0.9 and 1.0 km/s for each case. We confirmed that SA and TG data had good resolution for revealing the target slip distribution regardless of the noise levels, if the actual target Vr (0.9 km/s) was assumed (see the center columns of Figs. S4, S5 and S6). For the SA data inversions (Fig. S4), the slips on regions off Sumatra Island to the Nicobar Islands are robustly resolved even in cases of assuming Vrs different from the target velocity, while the estimated slips on northern faults off the Andaman Islands varied depending on the assumed Vr and noise level. For the TG data inversions (Fig. S5), the resolution became worse as the noise levels increased when the wrong Vr was assumed. From this comparison, we could conclude that the SA data have more robust resolution than the TG data for the slip inversions.

4 Results and Discussion

4.1 Source Models from SA Data Inversions

Inversion results using only the SA data show that the consistent slip distributions were estimated (Fig. 4) regardless of the assumed Vr: there are slight differences in the estimated slip amounts that depend on the Vrs, however, the slip distribution patterns are persistent within the assumed Vr ranges. This feature is the same as in the checkerboard tests (Figs. S3 and S4), in which we assumed a slightly different Vr from the target model. To select the best Vr, we calculated the variance reductions between the observed and synthetic data for the inversion of different Vrs. Then we found that there are two local maxima of variance reductions at Vrs of 0.9 km/s and 1.3 km/s, and Vrs faster than 1.6 km/s do not reduce much variance for the SA data inversions (Fig. 5). In Fig. 4, the slips estimated with Vrs of 2.0 and 2.5 km/s are also shown for comparison. As we saw in the checkerboard test (Fig. S4), northern slips near the Andaman Islands could vary depending on the assumed Vrs. To select a preferable Vr between 0.9 and 1.3 km/s, we compared the synthetic tsunami waveforms at Port Blair near the northern slip area, and those at Paradip, Vishakhapatnam and Chennai where the calculated waveforms are sensitive to the northern slips (Fig. 6). Then, we found that the slip distribution with the Vr of 1.3 km/s reproduces the observed tsunami waveforms better than the one with the Vr of 0.9 km/s. In the case of Vr = 1.3 km/s (Fig. 4b), the slip distribution inverted from the SA data shows the large slips of 17–22 m off Sumatra Island, the moderate slips of 6–8 m off the Nicobar Islands, and small slips less than 4 m in the Andaman Islands, and indicates the total source length of about 1400 km. This SA data inversion yields a total seismic moment of 7.74 × 1022 Nm (Mw = 9.2) assuming the rigidity of 5 × 1010 N/m2 which is the same rigidity value used in Fujii and Satake (2007). Slip amounts and errors of subfaults, seismic moments and moment magnitudes from the SA data inversions are listed in Table 1 for the Vr of 1.3 km/s and Table S1 for the other Vrs of 0.9, 2.0 and 2.5 km/s.

Fig. 5
figure 5

Variance reductions versus assumed rupture velocities for different data set inversions. White circles, filled triangles and gray diamonds are for SA data, TG data and SA+TG data inversions, respectively

Fig. 6
figure 6

Comparison of observed (gray line) and synthetic (red lines) tsunami waveforms at selected tide gauges. The red lines from top to bottom show tsunami waveforms synthesized from slip distributions inverted from SA data with assumed rupture velocities of 1.3 and 0.9 km/s, TG data with assumed rupture velocity of 1.3 km/s, and SA+TG data with assumed rupture velocities of 1.3 and 0.9 km/s. The vertical thin lines in gray show the arrival times of initial tsunami waves

The SSHs and tsunami waveforms calculated from the inverted slip models effectively reproduce the observed SSHs (Fig. 7) and tsunami waveforms at TGs and OBPGs, respectively, in and around the Indian Ocean (Figs. S7 and S8). Figure 7 also shows the calculated SSHs from the estimated slips (Fig. 4b) using the Green’s functions without the phase correction. We can see that the differences between the synthetic SSHs with and without the phase correction are not so large, but the phase-corrected ones better fit with the observations. We also performed another SA data inversion using the Green’s functions without the phase correction (Fig. S9), and confirmed that a similar slip distribution to Fig. 4b was obtained. At some TGs the first arrivals of the calculated tsunami waves with phase corrections do not fit with the observed ones, though the entire shapes and amplitudes of waveforms including the later phases are well reproduced (Fig. S7). The observed tsunami waveforms are well reproduced at two OBPGs, BPG1 and BPG2, near the Syowa Base in Antarctica, supporting that the slip model is reasonably good (Fig. S8). The negative waves with a reversed polarity before the first positive waves are clearly reproduced by the phase correction. While the phase-correction effects are not so significant for TG stations within the Indian Ocean, the differences between the calculated tsunami waveforms with and without the phase correction become larger for the distant stations.

Fig. 7
figure 7

Comparison of SSHs for the inversion using SA data with the assumed rupture velocity of 1.3 km/s. Black circles are observed SSHs. Red and blue circles are calculated SSHs with and without the phase correction, respectively. Gray bars correspond to the data windows used for inversions. Right top figure shows the estimated slip distribution

4.2 Far-Field Tsunami Waveforms

In order to evaluate our best source model from SA data with the assumed Vr of 1.3 km/s, we computed the far-field tsunami waveforms from the estimated slip distribution and compared them with the observations (Fig. 8). Despite the fact that the far-field stations are located more distant, 13,000 to 27,000 km from the source, the observed tsunami waveforms are quite well reproduced (Fig. 8). We separately calculated the tsunami propagations toward west and east as described in Sect. 3.1, and noticed that the western propagations are dominant for the stations DPN and DPS near the Drake Passage, while the eastern propagations are dominant for DARTs 32401, 46405, and NeMO. In the comparisons at DARTs 46405, and NeMO, the same tsunami phase appears at 2160 min and 2190 min, which is not so clear in the observed waveforms, but can be clearly traced in the synthetic waveforms.

Fig. 8
figure 8

Comparison of far-field tsunami waveforms at OBPGs in the Drake Passage and DARTs off northern Chile and off the Pacific coast of USA, which were calculated from the inversion result using the SA data with the assumed rupture velocity of 1.3 km/s. Black lines are observed waveforms. Red and blue lines are calculated waveforms with and without the phase correction, respectively. Green and dark yellow lines indicate the calculated tsunami waveforms with the phase correction toward west and east from the epicenter, respectively

4.3 Source Models from TG Data Inversions

Inversions using only the TG data including two OBPGs at the Syowa Base in Antarctica also show that the consistent slip distributions were estimated (Fig. 9) regardless of Vrs, as shown by the previous SA data inversion. The variance reductions with different Vrs indicate a broad peak at Vr of 1.2 km/s, but the peak value is almost same as that for Vr of 1.3 km/s (Fig. 5). We compared the synthetic tsunami waveforms between the cases with Vrs of 1.2 and 1.3 km/s and found that there is no significant difference between the two cases based on the waveform similarities at Port Blair, Paradip and Chennai (Fig. 6). Thus, here we picked up the results in the case with Vr of 1.3 km/s (Fig. 9b) to compare with the other data set inversions (SA data or SA+TG data as described in Sect. 4.4 later) which prefer the Vr of 1.3 km/s. The slip distribution inverted from the TG data shows the largest slip of 27 m off Sumatra Island, almost zero to small slips between the Nicobar Islands and Sumatra Island, large slips of 7–15 m around the Nicobar Islands, and moderate slips of 4–5 m in the Andaman Islands (Table 1), indicating that the fault length is 1400 km. The TG data inversion yields a total seismic moment of 7.80 × 1022 Nm (Mw = 9.2) assuming the rigidity of 5 × 1010 N/m2. Slip distributions of other TG data inversions with the Vrs of 0.9, 2.0 and 2.5 km/s are shown in Fig. 9 and listed in Table S1. The slip distribution estimated from slower Vrs (0.9 km/s) shows shorter total length, ~ 1200 km (Fig. 9a). The tsunami waveforms computed from the slip model generally reproduce the observed tsunami waveforms at TGs and OBPGs in and around the Indian Ocean but slightly overestimates or underestimates at some stations (Figs. 10 and 11). We also compared the far-field tsunami waveforms (Fig. S10) and the SSHs (Fig. S11) calculated from the slip model. The reproducibility of the observed data became slightly worse than that from the SA data inversion (see Figs. 7 and 8). However, the observed SSHs and far-field tsunami waveforms are reasonably reproduced also by the TG data inversion.

Fig. 9
figure 9

Inversion results using the TG data with the assumed rupture velocities (Vr) of a 0.9, b 1.3, c 2.0 and d 2.5 km/s. The blue star indicates the epicenter of the 2004 Sumatra–Andaman earthquake. Red circles show the aftershocks during 1 day after the mainshock. Locations of epicenters are determined by USGS

Fig. 10
figure 10figure 10

Comparison of tsunami waveforms at TGs from the inversion result using the TG data with the assumed rupture velocity of 1.3 km/s. Black lines are observed waveforms. Red lines are calculated waveforms with the phase correction. The thick black lines and the gray bars under the waveforms show the time windows for the inversion. Blue lines are calculated waveforms without the phase correction. In the stations shown by narrow italic, only the synthetic tsunami waveforms are shown

Fig. 11
figure 11

Same as Fig. 10, but for tsunami waveforms at OBPGs near the Syowa Base and Australian TGs in Antarctica

4.4 Source Models from Joint Inversions

Joint inversions of the SA data and the TG data including two OBPGs at the Syowa Base in Antarctica also show that the consistent slip distributions were estimated (Fig. 12) regardless of the assumed Vrs as well as the previous SA data and TG data inversions, that have an intermediate slip amount almost within the range between the SA data and TG data inversions (Figs. 4 and 9) on each subfault (Table 1). Similar to the SA data inversions, the variance reductions indicate two local maximums of variance reductions at Vrs of 0.9 km/s and 1.3 km/s (Fig. 5). Although the case with Vr of 1.3 km/s produces a variance reduction slightly smaller than that with Vr of 0.9 km/s, again, we compared the synthetic tsunami waveforms (Fig. 6) and selected the preferable Vr of 1.3 km/s based on the waveform similarities at Port Blair, Paradip and Chennai. In this case (Fig. 12b), the slip distribution inverted from the SA+TG data shows the largest slip of 25 m off Sumatra Island, small to large slips of 2–11 m off the Nicobar Islands, and small to moderate slips of 2–6 m in the Andaman Islands (Table 1), indicating that the fault length is 1400 km. The SA+TG data inversion yields a total seismic moment of 8.03 × 1022 Nm (Mw = 9.2) assuming the rigidity of 5 × 1010 N/m2. The total source length and the seismic moment are longer and larger than those of our previous estimates (Fujii & Satake, 2007); about 900 km and Mw 9.1 based on the TG records. Slip distributions of other SA+TG data inversions with the Vrs of 0.9, 2.0 and 2.5 km/s are shown in Fig. 12 and listed in Table S1. In common with the TG data inversions, the slip distribution estimated from slower Vrs (0.9 km/s) shows a shorter total length of 1200 km (Fig. 12a). The tsunamis computed from the slip model reproduce the observed SSHs well (Fig. 13) and generally explain the observed tsunami waveforms at most of the TGs and OBPGs in and around the Indian Ocean (Figs. 14 and 15), and also match well with those observed at far-field stations (Fig. S12), although the resolution and quality of the digital bathymetry may not be adequate for the model to resolve all details of the wave dynamics in the vicinity of the gauges.

Fig. 12
figure 12

Inversion results using the SA+TG data with the assumed rupture velocities (Vr) of a 0.9, b 1.3, c 2.0 and d 2.5 km/s. The blue star indicates the epicenter of the 2004 Sumatra–Andaman earthquake. Red circles show the aftershocks during 1 day after the mainshock. Locations of epicenters are determined by USGS

Fig. 13
figure 13

Comparison of SSHs for the inversion using the SA+TG data with the assumed rupture velocity of 1.3 km/s. Black circles are observed SSHs. Red and blue circles are calculated SSHs with and without the phase correction, respectively. Right top figure shows the estimated slip distribution

Fig. 14
figure 14figure 14

Comparison of tsunami waveforms at TGs from the inversion result using the SA+TG data with the assumed rupture velocity of 1.3 km/s. Black lines are observed waveforms. Red lines are calculated waveforms with the phase correction. The thick black lines and the gray bars under the waveforms show the time windows for the inversion. Blue lines are calculated waveforms without the phase correction. In the stations shown by narrow italic, only the synthetic tsunami waveforms are shown

Fig. 15
figure 15

Same as Fig. 14, but for tsunami waveforms at OBPGs near the Syowa Base and Australian TGs in Antarctica

4.5 Comparison of Slip Distributions

To evaluate the slip distributions obtained in this study, we compare them with the source models from other studies: SA data inversions (Gopinathan et al., 2017; Hirata et al., 2006; Seno & Hirata, 2007) including our previous study (Fujii & Satake, 2007), a seismic data inversion (Yoshimoto & Yamanaka, 2014), and a geodetic data inversion (Chlieh et al., 2007). In the following discussions, note again that we adopted the rupture propagation model with the slip time delays on subfaults assuming constant or average Vrs in this study, although the previous studies (Lay et al., 2005; Lorito et al., 2010; Seno & Hirata, 2007) suggested fast ruptures at the southern segment and slow ruptures at the northern segment of the source region.

From the comparison of SA data inversions (Fig. S13), the slip distribution with Vr of 1.3 km/s in this study is similar to that from the SA data inversion with Vr of 1.0 km/s of Fujii and Satake (2007). Although the new slip amounts are reduced at most of subfaults compared to the old ones, the slips around the epicenter (subfaults 2–4) are significantly increased. These differences must be mainly caused by the differences in processing of observed SSHs in addition to the increased numbers of SA tracks used for the inversions; tsunami signals (Hayashi, 2008) along the five SA tracks were well-separated for this study, while the simple differences of the SSHs before and after the earthquake along the three SA tracks were considered as tsunami signals in Fujii and Satake (2007). The slip differences between the SA data inversions with and without the phase correction of Green’s functions are relatively small (Fig. S14) compared to those from the different SA data set used for inversions (Fig. S13). However, they are not negligible to improve the reproducibility of SSHs as described in Sect. 4.1. Moreover, we found that the variance reductions from the SA data inversions without the phase correction of Green’s functions represent the maximum peak at Vr of 0.7 km/s [consistent with Hirata et al. (2006) and Seno and Hirata (2007)] and the second peak at Vr of 1.2 km/s [close to Fujii and Satake (2007)]; they were 1.3 km/s and 0.9 km/s, respectively, in this study by considering the phase corrections. This suggests that the phase corrections of Green's functions have important roles to estimate not only slip amounts but also Vrs, which could be affected by the Green's functions used for inversions.

From the comparison of TG data or SA+TG data inversions (Figs. S15 and S16), the slip distribution with Vr of 1.3 km/s in this study is much different from the TG data or SA+TG data inversion result with Vr of 1.0 km/s in Fujii and Satake (2007); the location of maximum slip off Sumatra Island is close but this study shows smaller slips between the Nicobar Islands and Sumatra Island and moderate slips near the Andaman Islands. The differences are mainly caused by the difference in TG data sets, i.e., selection of stations and the time windows of tsunami waveform inversions, indicating the sensitivity of TG data inversions to the selections of stations. In this study, we newly used the near-field tsunami waveforms at Port Blair, where the clock error was reported by Neetu et al. (2005) and the time correction was applied to the observed tsunami waveforms by Singh et al. (2006). We also used observed tsunami waveforms at several TGs around the Indian Ocean and also those at two OBPGs near the Syowa Base in Antarctica with phase corrections, which were not able to be applied in Fujii and Satake (2007) at that time.

In this study, we also rearranged the coastal output points for calculated tsunami waveforms, especially for the TGs along the eastern coast of India, reproducing the actual coastal shapes such as breakwaters at bay mouth (Chennai) or intricate piers (Vishakhapatnam). If we re-synthesize the tsunami waveforms at the selected TGs, adopting the previous source models of Fujii and Satake (2007) using the Green’s functions of this study, the old SA data model reasonably reproduces the Indian TG records rather than the old TG data or SA+TG data models (Fig. S17), although the calculated amplitudes are slightly overestimated because of the previously used SSHs higher than those of Hayashi’s (2008) data (see Fig. 3).

The slip distributions from the other studies using various kinds of data (Chlieh et al., 2007; Gopinathan et al., 2017; Yoshimoto & Yamanaka, 2014) are similar to those from the SA data and TG data inversions of this study; maximum slips off Sumatra Island and moderate slips near the Nicobar Islands and small slip near the Andaman Islands, although they prefer the faster Vrs more than 2.5 km/s to the revealed Vr of 1.3 km/s from this study. The northern slips are the key to define the total source length, which were revealed in both inversions of the SA data and the TG data. As we described in the previous sections, the slip model inverted from SA data reproduces the tsunami waveforms well at TGs and OBPGs in and around Indian Ocean and at far-field OBPGs, and the slip model from TG data also reasonably reproduce the those tsunami waveforms. We also recognized that the inversion results by using TG data could be significantly variable depending on the selections of TG stations or the weight factors. Moreover, the long-wave assumption to calculate Green’s functions could be more validated for open-ocean SA data than for coastal TG data which can be affected by local bathymetry or coastal shape variations. These may suggest that the quality-controlled SA data with a good spatial coverage are more appropriate than the sparsely located TG data to invert the slip distribution for the case of the 2004 Sumatra–Andaman earthquake.

5 Conclusions

We estimated the slip distributions of the 2004 Sumatra–Andaman earthquake using the phase-corrected Green’s functions (linear long wave). Phase-correction effects are not so significant for stations or data points within the Indian Ocean, however, the reproducibility of SSHs by SA measurements and tsunami waveforms at TGs and OBPGs around the Indian Ocean are significantly improved. We obtained the best slip distribution (Mw = 9.2) inferred from SA and TG joint data, which reproduces the observed SSHs and tsunami waveforms well and also explains the OBPG and DART tsunami waveforms at far-field more than 13,000–25,000 km, while the slip distribution (Mw = 9.2) inferred from TG data also explains the SSHs and far-field tsunami waveforms well. The best slip model from the joint inversions shows a large variance reduction in the case of an average Vr of 1.3 km/s with the estimated large slips of 16–25 m off Sumatra Island, the large slips of 2–11 m off the Nicobar Islands, and the moderate slips of 2–6 m in the Andaman Islands. The significant slips from the epicenter to the northern area yield total source length of about 1400 km along the trench.