Raise open file descriptors limit to avoid problem with pystack #40108

Merged: SilkeSchomann merged 1 commit into release-next from raise_open_file_descriptors_limit on Oct 14, 2025
@jhaigh0 jhaigh0 commented Oct 14, 2025

Description of work

The recent problems with running pystack in the error reporter have been traced to pystack trying to open more file descriptors than the limit set on most Linux systems. A PR has been opened against pystack to improve its error messages (bloomberg/pystack#260).
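To see how close a process is to this limit, something like the following can be used. This is only a diagnostic sketch and not part of this PR: the helper name `fd_usage` is made up for illustration, and counting entries in `/proc/self/fd` assumes a Linux system.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit, hard_limit) for the current process.

    open_fds is None if /proc/self/fd is not available (non-Linux systems).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Each entry in /proc/self/fd is one open file descriptor.
        open_fds = len(os.listdir("/proc/self/fd"))
    except OSError:
        open_fds = None
    return open_fds, soft, hard


if __name__ == "__main__":
    open_fds, soft, hard = fd_usage()
    print(f"open: {open_fds}, soft limit: {soft}, hard limit: {hard}")
```

The soft limit is the one that `ulimit -n` reports by default and the one a process runs into when opening files.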

When the limit is hit, you get a lot of output like this:

ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../.././libMantidJson.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../.././libMantidJson.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../.././libMantidJson.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../.././libMantidJson.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidTypes.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidTypes.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidTypes.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidTypes.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidPythonInterfaceCore.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidPythonInterfaceCore.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidPythonInterfaceCore.so
ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../../libMantidPythonInterfaceCore.

The issue started after the first new instrument view PR went in (commit 8733e98#diff-4a6d6571d2c59705d8a673667672de2d7dcc3c756c61a3f0195acdefac52fe34). We're not sure why, but it could be due to the addition of new libraries such as pyvista.

This PR raises the limit (equivalent to ulimit -n <value>) for the outer mantid process, which may eventually launch the error reporter; child processes inherit the raised limit.
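As a rough illustration of what raising the limit from inside a Python process can look like, here is a minimal sketch using the standard `resource` module (Unix only). It is not the code from this PR: the function name `raise_open_files_limit` and the target value of 4096 are assumptions chosen for the example.

```python
import resource


def raise_open_files_limit(target=4096):
    """Try to raise the soft RLIMIT_NOFILE to `target`, capped at the hard limit.

    Returns the soft limit in effect afterwards. `target` is an illustrative
    value, not the one used in this PR.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        # An unprivileged process cannot raise the soft limit above the hard limit.
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or soft >= target:
        # Already high enough; never lower an existing limit.
        return soft
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        # Leave the limit unchanged if the platform rejects the request.
        return soft
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft limit now:", raise_open_files_limit())
```

Because resource limits are inherited by child processes, calling something like this early in the outer process means any process it launches later starts with the higher soft limit.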

To test:

The package to test can be installed with mamba install jhaigh0/label/raise_open_files_limit::mantidworkbench

It would be good to test on IDAaaS, where the core dump location is already set up.
Run Segfault and check that the pystack output has been captured in the stacktrace field.

Reviewer

Your comments will be used as part of the gatekeeper process. Comment clearly on what you have checked and tested during your review. Provide an audit trail for any changes requested.

As per the review guidelines:

  • Is the code of an acceptable quality? (Code standards/GUI standards)
  • Has a thorough functional test been performed? Do the changes handle unexpected input/situations?
  • Are appropriately scoped unit and/or system tests provided?
  • Do the release notes conform to the guidelines and describe the changes appropriately?
  • Has the relevant (user and developer) documentation been added/updated?

Gatekeeper

As per the gatekeeping guidelines:

  • Has a thorough first line review been conducted, including functional testing?
  • At a high-level, is the code quality sufficient?
  • Are the base, milestone and labels correct?

@jhaigh0 jhaigh0 added this to the Release 6.14 milestone Oct 14, 2025
@jhaigh0 jhaigh0 added the Bug Issues and pull requests that are regressions or would be considered a bug by users (e.g. crashing) label Oct 14, 2025
@jhaigh0 jhaigh0 marked this pull request as ready for review October 14, 2025 12:36
@jhaigh0 jhaigh0 added the High Priority An issue or pull request that if not addressed is severe enough to postpone a release. label Oct 14, 2025
@jhaigh0 jhaigh0 requested a review from MialLewis October 14, 2025 12:55

@MialLewis MialLewis left a comment

Looks good.

I've tested the change locally using the package produced by the CI, and C++ stack traces are now working again.

@SilkeSchomann SilkeSchomann self-assigned this Oct 14, 2025
@SilkeSchomann SilkeSchomann enabled auto-merge (squash) October 14, 2025 13:50

@SilkeSchomann SilkeSchomann left a comment

I've tested these changes on IDAaaS and can confirm that there is a normal stacktrace again instead of ERROR(process_core): Cannot open ELF file /home/bya67386/mambaforge/envs/test_htlp_links/lib/python3.11/site-packages/mantid/kernel/../../../.././libMantidJson.so output.

@SilkeSchomann SilkeSchomann merged commit d55cd15 into release-next Oct 14, 2025
11 checks passed
@SilkeSchomann SilkeSchomann deleted the raise_open_file_descriptors_limit branch October 14, 2025 14:44
@github-project-automation github-project-automation Bot moved this from In review to Done in ISIS core workstream v6.14.0 Oct 14, 2025
peterfpeterson pushed a commit to peterfpeterson/mantid that referenced this pull request Oct 14, 2025
…idproject#40108)

peterfpeterson added a commit that referenced this pull request Oct 14, 2025
This is a version of #40108 into `ornl-next`

Co-authored-by: Jonathan Haigh <35813666+jhaigh0@users.noreply.github.com>
peterfpeterson pushed a commit to peterfpeterson/mantid that referenced this pull request Oct 15, 2025
…idproject#40108)

Labels

Bug: Issues and pull requests that are regressions or would be considered a bug by users (e.g. crashing)
High Priority: An issue or pull request that if not addressed is severe enough to postpone a release.

Projects

Status: Done

3 participants