Skip to content

Fixed a output seismograph file bug in GPU mode#1278

Merged
danielpeter merged 2 commits into
SPECFEM:develfrom
MengYuanzhuo:devel
May 12, 2026
Merged

Fixed a output seismograph file bug in GPU mode#1278
danielpeter merged 2 commits into
SPECFEM:develfrom
MengYuanzhuo:devel

Conversation

@MengYuanzhuo

Copy link
Copy Markdown
  • Fixed a bug in GPU mode where, when NTSTEP_BETWEEN_OUTPUT_SAMPLE was not 1, the output seismograph file appended the previous data at the end, thereby overwriting the original data.

  • Before repair, single-channel data when NTSTEP_BETWEEN_OUTPUT_SAMPLE = 1.

Ux_file_single_d_1 su-heatmap
  • Before repair, single-channel data when NTSTEP_BETWEEN_OUTPUT_SAMPLE = 10.
Ux_file_single_d_10 su-heatmap
  • (There is some repetition of the data above at the end of the data, which overwrites the original data.)

  • After repair, single-channel data when NTSTEP_BETWEEN_OUTPUT_SAMPLE = 10.

Ux_file_single_d su-heatmap

…not 1, the output seismograph file appended the previous data at the end, thereby overwriting the original data.
Copilot AI review requested due to automatic review settings May 10, 2026 03:13

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes incorrect seismogram output in GPU mode when seismograms are written in chunks (e.g., when NTSTEP_BETWEEN_OUTPUT_SAMPLE is not 1) by ensuring device-side seismogram buffers are flushed to host before file output.

Changes:

  • Flush GPU seismogram buffers to host immediately before write_seismograms_to_file() runs.
  • Adjust CUDA seismogram transfer logic to copy only the “valid” portion of the buffer.
  • Add a CPU-only stub for the new CUDA flush entry point for non-CUDA builds.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
src/specfem2D/write_seismograms.F90 Calls a new GPU flush routine before writing seismograms to disk.
src/gpu/write_seismograms_cuda.cu Implements flush_seismograms_cuda() and changes device→host transfer strategy.
src/gpu/specfem2D_gpu_cuda_method_stubs.c Adds a no-op stub for flush_seismograms_cuda() when CUDA is not enabled.
Comments suppressed due to low confidence (1)

src/gpu/write_seismograms_cuda.cu:276

  • The subsequent host->Fortran copy loops over j=0..nlength_seismogram-1 for every receiver even when only valid_length samples were copied from the device. This does extra work and also makes the earlier host-buffer zeroing necessary. Consider copying/converting only the first valid_length samples (or seismo_current samples in flush), leaving the unused tail untouched.
    // copies values into host array
    for (int irec=0; irec < mp->nrec_local; irec++){
      for (int j=0; j < nlength_seismogram; j++){
        sisux[j + nlength_seismogram * irec] = (double) h_seismo[j + nlength_seismogram * irec];
        sisuz[j + nlength_seismogram * irec] = (double) h_seismo[j + nlength_seismogram * irec + size];
      }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +250 to +269
// clears host buffers first, then copies only the samples that were actually written.
// This prevents stale tail values from a previous chunk from leaking into the final trace.
for (int i = 0; i < 2 * size; i++) {
h_seismo[i] = 0.0f;
}

// copies only the valid part from GPU to CPU (note: could use async mem copy in future...)
for (int irec = 0; irec < mp->nrec_local; irec++) {
int device_offset = irec * nlength_seismogram;
int host_offset = irec * nlength_seismogram;

print_CUDA_error_if_any(cudaMemcpy(h_seismo + host_offset,
d_seismo + device_offset,
sizeof(realw) * valid_length,
cudaMemcpyDeviceToHost),72001);
print_CUDA_error_if_any(cudaMemcpy(h_seismo + size + host_offset,
d_seismo + size + device_offset,
sizeof(realw) * valid_length,
cudaMemcpyDeviceToHost),72002);
}
Comment on lines +293 to +311
Mesh* mp = (Mesh*)(*Mesh_pointer_f);

synchronize_cuda();

if (mp->nrec_local == 0) return;

int i_sig = *i_sigf - 1;
int seismo_current = *seismo_currentf;
int nlength_seismogram = *nlength_seismogramf;

if (seismo_current <= 0) return;

if (seismo_current > nlength_seismogram) seismo_current = nlength_seismogram;

realw* h_seismo = mp->h_seismograms[i_sig];
realw* d_seismo = mp->d_seismograms[i_sig];

cudaStreamSynchronize(mp->compute_stream);

Comment thread src/specfem2D/write_seismograms.F90 Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@danielpeter danielpeter merged commit 8e9c2c3 into SPECFEM:devel May 12, 2026
31 checks passed
@danielpeter

Copy link
Copy Markdown
Member

thanks for the contribution! 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants