Add kokkos and its simtbx/LS49 tests to Azure build #728

Merged
bkpoon merged 16 commits into master from kokkos_azure on Mar 30, 2022
Conversation

@phyy-nx

@phyy-nx phyy-nx commented Jan 25, 2022

Contributor

Co-authored-by: Felix Wittwer <fwittwer@lbl.gov>
Co-authored-by: Billy K. Poon <bkpoon@lbl.gov>
Co-authored-by: Nicholas Sauter <nksauter@lbl.gov>

@nksauter

Contributor

Actually the Kokkos build would only be expected to work with std c++ >= 11 and probably not with Python 2.7. Is it possible to readjust for that?

@bkpoon

bkpoon commented Jan 26, 2022

Member

The XFEL CI tests are only in Python 3 and have C++11 enabled.

But the flag is for configure.py, not bootstrap.py. The change should be

--config-flags="--enable_kokkos"

I just updated the commit and adjusted the formatting.

@bkpoon

bkpoon commented Jan 26, 2022

Member

On macOS, it looks like the same error that @mewall had. The gpu extension is not built since --enable_cuda is not provided, but something needs it?

On linux, the KOKKOS_CXXFLAGS has echo as the first item. It should not be there.

@bkpoon

bkpoon commented Jan 26, 2022

Member

Wait, kokkos is not enabled for macOS.

# only build kokkos on linux and if kokkos is enabled
if sys.platform.startswith('linux') and env_etc.enable_kokkos:
  env_simtbx.SConscript("kokkos/SConscript",exports={ 'env' : env_simtbx })
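
The gating condition quoted above can be isolated into a small predicate, which makes it easy to see why the macOS build never enters the kokkos SConscript. This is only an illustrative sketch: the function name `should_build_kokkos` and its parameters are hypothetical and are not part of simtbx.

```python
import sys


def should_build_kokkos(enable_kokkos, platform=None):
    """Mirror the SConscript condition: Linux only, and only when enabled.

    `platform` defaults to sys.platform; it is a parameter here purely so
    the behaviour on other platforms can be checked without switching OS.
    """
    platform = sys.platform if platform is None else platform
    return platform.startswith("linux") and bool(enable_kokkos)


if __name__ == "__main__":
    # On macOS, sys.platform is "darwin", so the kokkos build is skipped
    # even when kokkos is enabled at configure time.
    print(should_build_kokkos(True, "darwin"))
    print(should_build_kokkos(True, "linux"))
```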

@bkpoon

bkpoon commented Jan 26, 2022

Member

@JBlaschke, would #675 help with getting rid of echo?

@phyy-nx

phyy-nx commented Jan 26, 2022

Contributor Author

The echo is coming from here:

kokkos_cxxflags = subprocess.check_output(
    ['make', '-f', 'Makefile.kokkos', 'print-cxx-flags'],
    cwd=os.environ['KOKKOS_PATH'])

I looked into Makefile.kokkos and there are lots of calls to echo, wrapped in shell commands. Maybe those aren't working on Azure? That's why I tried to print os.environ['SHELL'], but it appears that's not set on Azure.
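
When a variable such as SHELL may be absent, indexing the environment directly raises KeyError. A minimal sketch of a tolerant lookup, using only the standard library; the helper name `describe_shell` and the placeholder string are hypothetical:

```python
import os


def describe_shell(environ=None):
    """Return the value of SHELL, or a placeholder if it is not set."""
    env = os.environ if environ is None else environ
    # .get() avoids the KeyError that env['SHELL'] raises when unset
    return env.get("SHELL", "<SHELL not set>")


if __name__ == "__main__":
    print("SHELL =", describe_shell())
```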

@phyy-nx

phyy-nx commented Jan 26, 2022

Contributor Author

Contents of kokkos_cxxflags right before it is used:

['echo', '"-std=c++14', '-march=core-avx2', '-mtune=core-avx2', '-fopenmp', '-I./', '-I/__w/1/modules/kokkos/core/src', '-I/__w/1/modules/kokkos/containers/src', '-I/__w/1/modules/kokkos/algorithms/src"']
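
One way to make the consumer tolerant of this output is to normalize it before use. The following is a minimal sketch and not necessarily the fix adopted in this PR. The helper name `clean_cxxflags` is hypothetical, and it assumes the raw make output has the form implied by the list above: a literal `echo` followed by a double-quoted flag string.

```python
import shlex


def clean_cxxflags(raw):
    """Turn raw `make print-cxx-flags` output into a list of compiler flags.

    Assumption: on the affected system the output is the unexecuted command,
    e.g.  echo "-std=c++14 -fopenmp -I./"  instead of the flags themselves.
    """
    if isinstance(raw, bytes):
        # subprocess.check_output returns bytes on Python 3 by default
        raw = raw.decode()
    tokens = shlex.split(raw)  # honours the double quotes
    if tokens and tokens[0] == "echo":
        tokens = tokens[1:]  # drop the stray command name
    flags = []
    for token in tokens:
        # a quoted flag string arrives as one token; split it on whitespace
        flags.extend(token.split())
    return flags


if __name__ == "__main__":
    print(clean_cxxflags(b'echo "-std=c++14 -fopenmp -I./"\n'))
```

Output that is already clean (just the flags, without `echo` or quotes) passes through unchanged, so the same helper can be used on systems where the Makefile behaves as expected.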

libtbx.run_tests_parallel module=uc_metrics module=simtbx module=xfel_regression module=LS49 nproc=4
echo "DEBUG"
cat mp4k/rank_0*.err
echo "DEBUG2"

@bkpoon bkpoon Jan 27, 2022

Member

These debugging lines will cause this step to always pass since the last command will always run correctly.
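
The underlying issue is that a script step reports the exit status of its last command. A minimal sketch of one way to keep debug output without masking a test failure, written in Python for illustration; the helper name `run_step` is hypothetical, and the actual pipeline step is a shell script:

```python
import subprocess
import sys


def run_step(test_cmd, debug_cmds):
    """Run test_cmd, then the debug commands, and return test_cmd's exit code."""
    result = subprocess.run(test_cmd)
    for cmd in debug_cmds:
        # debug output only; the exit status of these is deliberately ignored
        subprocess.run(cmd)
    return result.returncode


if __name__ == "__main__":
    # Stand-in commands: a "test" that fails with status 3, then a debug print.
    failing_test = [sys.executable, "-c", "import sys; sys.exit(3)"]
    debug = [[sys.executable, "-c", "print('DEBUG')"]]
    sys.exit(run_step(failing_test, debug))
```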

Member

We were going to set OMP_NUM_THREADS to 2 to get rid of the OpenMP warning, and I'm testing a change that enables kokkos on macOS.
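
For reference, setting OMP_NUM_THREADS for a child process can be sketched as follows. The helper name `run_with_omp_threads` is hypothetical; in the pipeline itself the variable would more likely be set in the job's environment configuration.

```python
import os
import subprocess
import sys


def run_with_omp_threads(cmd, num_threads=2):
    """Run cmd with OMP_NUM_THREADS set, leaving the parent environment untouched."""
    env = dict(os.environ, OMP_NUM_THREADS=str(num_threads))
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in child process that just reports the value it sees.
    child = [sys.executable, "-c", "import os; print(os.environ['OMP_NUM_THREADS'])"]
    print(run_with_omp_threads(child).stdout.strip())
```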

@bkpoon

bkpoon commented Jan 27, 2022

Member

Now there are 4 remaining test failures:

2022-01-27T18:59:52.3610355Z libtbx.python "/__w/1/modules/LS49/adse13_187/cyto_batch.py" N_total=1 test_pixel_congruency=True mosaic_method=double_random mosaic_spread_samples=50 write_output=False test_without_mpi=True log.outdir=mp1k nxmx_local_data=/global/cfs/cdirs/m3562/der/master_files/run_000795.JF07T32V01_master.h5 context=kokkos_gpu [FAIL] 2.2s
2022-01-27T18:59:52.3611179Z   Time:  2.22
2022-01-27T18:59:52.3611567Z   Return code: 1
2022-01-27T18:59:52.3611860Z   OKs: 0
2022-01-27T18:59:52.3612564Z libtbx.python "/__w/1/modules/LS49/adse13_187/tst_multipanel_argchk.py" N_total=1 mosaic_spread_samples=50 test_without_mpi=True log.outdir=mp2k nxmx_local_data=/global/cfs/cdirs/m3562/der/master_files/run_000795.JF07T32V01_master.h5 context=kokkos_gpu [FAIL] 1.9s
2022-01-27T18:59:52.3613376Z   Time:  1.87
2022-01-27T18:59:52.3613671Z   Return code: 1
2022-01-27T18:59:52.3614293Z   OKs: 0
2022-01-27T18:59:52.3615077Z libtbx.python "/__w/1/modules/LS49/adse13_187/tst_write_file_action.py" N_total=1 mosaic_method=double_random mosaic_spread_samples=50 test_without_mpi=True log.outdir=mp3k write_output=False nxmx_local_data=/global/cfs/cdirs/m3562/der/master_files/run_000795.JF07T32V01_master.h5 context=kokkos_gpu [FAIL] 1.8s
2022-01-27T18:59:52.3615871Z   Time:  1.84
2022-01-27T18:59:52.3616154Z   Return code: 1
2022-01-27T18:59:52.3616549Z   OKs: 0
2022-01-27T18:59:52.3618254Z libtbx.python "/__w/1/modules/LS49/adse13_187/tst_write_file_action.py" N_total=1 write_output=True write_experimental_data=True mosaic_spread_samples=62 test_without_mpi=True log.outdir=mp4k nxmx_local_data=/global/cfs/cdirs/m3562/der/master_files/run_000795.JF07T32V01_master.h5 mask_file=/global/cfs/cdirs/m3562/nks/adse13_187/13_221/event_648.mask context=kokkos_gpu [FAIL] 1.9s
2022-01-27T18:59:52.3619147Z   Time:  1.91
2022-01-27T18:59:52.3619446Z   Return code: 1
2022-01-27T18:59:52.3619829Z   OKs: 0

The change for using kokkos on macOS has not been committed yet.

@bkpoon

bkpoon commented Mar 14, 2022

Member

/azp run

@azure-pipelines

Azure Pipelines successfully started running 3 pipeline(s).

@bkpoon

bkpoon commented Mar 15, 2022

Member

/azp run "XFEL CI"

@azure-pipelines

No pipelines are associated with this pull request.

@bkpoon

bkpoon commented Mar 15, 2022

Member

/azp run XFEL CI

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@bkpoon

bkpoon commented Mar 17, 2022

Member

/azp run XFEL CI

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@bkpoon

bkpoon commented Mar 17, 2022

Member

/azp run XFEL CI

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

phyy-nx and others added 6 commits March 30, 2022 06:46
Co-authored-by: Felix Wittwer <fwittwer@lbl.gov>
Co-authored-by: Billy K. Poon <bkpoon@lbl.gov>
Co-authored-by: Nicholas Sauter <nksauter@lbl.gov>
Seems a bug in Makefile.kokkos print-cxx-flags on Azure
@bkpoon bkpoon merged commit 03272eb into master Mar 30, 2022
@bkpoon bkpoon deleted the kokkos_azure branch March 30, 2022 20:50
3 participants