Skip to content

refactor: rest client#217

Open
NguyenHoangSon96 wants to merge 4 commits into
mainfrom
refactor/rest-client
Open

refactor: rest client#217
NguyenHoangSon96 wants to merge 4 commits into
mainfrom
refactor/rest-client

Conversation

@NguyenHoangSon96

@NguyenHoangSon96 NguyenHoangSon96 commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Closes #216

Proposed Changes

  • I'm trying to make python3 similar to other v3 clients as possible, so I removed some functionalities that currently don't exist in other clients.

Summarize

  • RestClient will be created in InfluxDBClient3 and injected to WriteApi; WriteApi will have all functions to write to Influxdb, so there relations are something like this InfluxDBClient3 -> WriteApi -> RestClient.

Removed features:

  • Debug.
  • org_id - only allow org name.
  • _return_http_data_only param.
  • zap-trace-span param.
  • _preload_content

Checklist

  • CHANGELOG.md updated
  • Rebased/mergeable
  • A test has been added if appropriate
  • Tests pass
  • Commit messages are conventional
  • Sign CLA (if not already signed)

@NguyenHoangSon96 NguyenHoangSon96 self-assigned this Jun 12, 2026
Copilot AI review requested due to automatic review settings June 12, 2026 15:16

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the write path to use a new synchronous RestClient (urllib3-based) instead of the previous generated client/write service wiring, and updates InfluxDBClient3 construction accordingly.

Changes:

  • Refactors WriteApi to issue HTTP requests via a new RestClient, adds gzip decision/compression utilities, and adds endpoint/exception translation logic.
  • Introduces influxdb_client_3/write_client/_sync/rest_client.py to encapsulate urllib3 request/response handling.
  • Updates InfluxDBClient3 initialization to pass write connection details (base_url/auth/gzip) directly into WriteApi.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 15 comments.

File Description
influxdb_client_3/write_client/client/write_api.py Major refactor of write implementation to use a new REST client and custom request building/serialization paths.
influxdb_client_3/write_client/_sync/rest_client.py Adds a new urllib3-based synchronous REST client abstraction used by WriteApi.
influxdb_client_3/__init__.py Wires new write API constructor parameters (auth/gzip/base_url) and updates public kwargs documentation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread influxdb_client_3/write_client/client/write_api.py Outdated
Comment thread influxdb_client_3/write_client/client/write_api.py Outdated
Comment thread influxdb_client_3/write_client/client/write_api.py Outdated
Comment thread influxdb_client_3/write_client/client/write_api.py Outdated
Comment thread influxdb_client_3/write_client/client/write_api.py Outdated
Comment thread influxdb_client_3/write_client/_sync/rest_client.py Outdated
Comment thread influxdb_client_3/__init__.py Outdated
Comment thread influxdb_client_3/write_client/client/write_api.py Outdated
Comment thread influxdb_client_3/write_client/client/write_api.py Outdated
Comment thread influxdb_client_3/write_client/client/write_api.py Outdated

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 12 comments.

Comments suppressed due to low confidence (2)

tests/test_write_api.py:251

  • WriteApi.write() takes parameters in (bucket, org, record, ...) order, but this test passes (org, bucket, record). This makes the test less representative of real usage and can mask issues in how query parameters are constructed.
    tests/test_write_api.py:293
  • This call passes (org, bucket, record) instead of (bucket, org, record), which makes the timeout test less representative of real usage and inconsistent with other tests in this file.

Comment thread influxdb_client_3/write_client/client/write_api.py
Comment thread influxdb_client_3/write_client/client/write_api.py Outdated
Comment thread influxdb_client_3/write_client/client/write_api.py
Comment thread influxdb_client_3/write_client/_sync/rest_client.py Outdated
Comment thread influxdb_client_3/__init__.py Outdated
Comment thread .circleci/config.yml
Comment thread influxdb_client_3/write_client/_sync/rest_client.py Outdated
Comment thread influxdb_client_3/write_client/_sync/rest_client.py Outdated
Comment thread influxdb_client_3/write_client/_sync/rest_client.py Outdated
Comment thread influxdb_client_3/write_client/_sync/rest_client.py

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (4)

influxdb_client_3/write_client/client/write_api.py:1171

  • __getstate__ deletes _write_service, but WriteApi no longer defines that attribute. This will raise KeyError during pickling (and del state['_subject'] / del state['_disposable'] can also KeyError depending on instance state). Use state.pop(..., None) and remove the stale _write_service reference.
    def __getstate__(self):
        """Return a dict of attributes that you want to pickle."""
        state = self.__dict__.copy()
        # Remove rx
        del state['_subject']
        del state['_disposable']
        del state['_write_service']
        return state

influxdb_client_3/write_client/client/write_api.py:1181

  • __setstate__ calls self.__init__(self._write_options, self._point_settings, ...), but WriteApi.__init__ requires token, bucket, and org as the first arguments. As written, unpickling will raise TypeError. Recreate the Rx batching pipeline based on the restored _write_options instead of re-calling __init__ with the wrong signature.
    def __setstate__(self, state):
        """Set your object with the provided dict."""
        self.__dict__.update(state)
        # Init Rx
        self.__init__(self._write_options,
                      self._point_settings,
                      success_callback=self._success_callback,
                      error_callback=self._error_callback,
                      retry_callback=self._retry_callback)

tests/test_write_api.py:252

  • WriteApi.write() expects positional arguments in the order (bucket, org, record). This call passes org first, which swaps the query parameters and makes the test less representative of real usage. Use keyword arguments (or correct positional order) to avoid accidental swaps.
    tests/test_write_api.py:293
  • WriteApi.write() positional argument order is (bucket, org, record). This test passes org then bucket, which is easy to get wrong and can hide routing/query-param bugs. Prefer keyword arguments to make the intent explicit (and keep _request_timeout behavior under test).

Comment thread influxdb_client_3/__init__.py
Comment thread influxdb_client_3/__init__.py

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (5)

tests/test_write_api.py:29

  • mock_urllib3_timeout_request is used as a side_effect for urllib3._request_methods.RequestMethods.request, which passes the RequestMethods instance as the first positional arg. The helper currently omits that parameter, so method/url are shifted and the test can behave incorrectly.
    tests/test_write_api.py:252
  • WriteApi.write() takes bucket as the first positional argument and org as the second. This call passes them in the opposite order, so the request routing/query params will be wrong.
    tests/test_write_api.py:293
  • WriteApi.write() takes bucket as the first positional argument and org as the second. This call passes them in the opposite order, so it won't exercise the intended timeout behavior for the right endpoint params.
    influxdb_client_3/write_client/client/write_api.py:1171
  • __getstate__ deletes state['_write_service'], but this class no longer defines _write_service. Pickling will raise KeyError here.
        state = self.__dict__.copy()
        # Remove rx
        del state['_subject']
        del state['_disposable']
        del state['_write_service']
        return state

influxdb_client_3/write_client/client/write_api.py:1181

  • __setstate__ calls self.__init__(self._write_options, self._point_settings, ...), but __init__ now requires token, bucket, org, and other constructor args. This will break unpickling; it should just recreate the Rx batching pipeline based on the restored _write_options.
    def __setstate__(self, state):
        """Set your object with the provided dict."""
        self.__dict__.update(state)
        # Init Rx
        self.__init__(self._write_options,
                      self._point_settings,
                      success_callback=self._success_callback,
                      error_callback=self._error_callback,
                      retry_callback=self._retry_callback)

Comment thread influxdb_client_3/write_client/client/write_api.py
Comment thread influxdb_client_3/__init__.py

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (3)

tests/test_write_api.py:252

  • WriteApi.write() takes positional arguments in the order (bucket, org, record). This call passes org first and bucket second, which swaps routing parameters and can hide bugs in header/query handling. Prefer keyword arguments (or swap the positionals) to make the intent unambiguous.
    tests/test_write_api.py:293
  • Same positional-argument issue as above: this passes org first and bucket second to WriteApi.write(bucket, org, record), which inverts query parameters. Using keywords also makes the _request_timeout intent clearer.
    influxdb_client_3/write_client/client/write_api.py:1181
  • __getstate__ / __setstate__ are broken after the refactor: _write_service is no longer an attribute, so del state['_write_service'] will raise KeyError during pickling. Additionally, __setstate__ calls __init__ with the wrong arguments for the new constructor signature (it no longer accepts just write_options and point_settings). This makes pickling/unpickling unusable.
    def __getstate__(self):
        """Return a dict of attributes that you want to pickle."""
        state = self.__dict__.copy()
        # Remove rx
        del state['_subject']
        del state['_disposable']
        del state['_write_service']
        return state

    def __setstate__(self, state):
        """Set your object with the provided dict."""
        self.__dict__.update(state)
        # Init Rx
        self.__init__(self._write_options,
                      self._point_settings,
                      success_callback=self._success_callback,
                      error_callback=self._error_callback,
                      retry_callback=self._retry_callback)

Comment thread influxdb_client_3/write_client/client/write_api.py
Comment thread influxdb_client_3/__init__.py Outdated

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (3)

influxdb_client_3/write_client/client/write_api.py:1181

  • __getstate__ deletes state['_write_service'], but WriteApi no longer defines _write_service, so pickling will raise KeyError. Additionally, __setstate__ calls __init__ with the wrong signature (it now requires token, bucket, org, etc.), so unpickling will fail even if __getstate__ succeeds. Safer approach: pop() optional fields and recreate the Rx pipeline based on _write_options without re-calling __init__.
    def __getstate__(self):
        """Return a dict of attributes that you want to pickle."""
        state = self.__dict__.copy()
        # Remove rx
        del state['_subject']
        del state['_disposable']
        del state['_write_service']
        return state

    def __setstate__(self, state):
        """Set your object with the provided dict."""
        self.__dict__.update(state)
        # Init Rx
        self.__init__(self._write_options,
                      self._point_settings,
                      success_callback=self._success_callback,
                      error_callback=self._error_callback,
                      retry_callback=self._retry_callback)

tests/test_write_api.py:252

  • WriteApi.write() expects positional arguments as (bucket, org, record, ...), but this test passes them as (org, bucket, record). That swaps query params and makes the test exercise the wrong behavior. Use keyword args (or correct positional order) to match the write() signature.
    tests/test_write_api.py:294
  • This write() call also swaps (org, bucket) positional order. Since the assertion is about timeout propagation rather than parameter order, using keyword args avoids accidentally testing the wrong thing.

Comment thread influxdb_client_3/write_client/client/write_api.py
Comment thread influxdb_client_3/write_client/_sync/rest_client.py

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

tests/test_write_api.py:252

  • In this test, WriteApi.write() expects positional args in the order (bucket, org, record). Passing org first swaps the routing parameters and can hide bugs in request construction/error handling.
    tests/test_write_api.py:293
  • WriteApi.write() positional parameters are (bucket, org, record). This call passes org first, which makes the test exercise the wrong request routing.

Comment thread influxdb_client_3/write_client/client/write_api.py
Comment thread influxdb_client_3/write_client/client/write_api.py
Comment thread influxdb_client_3/__init__.py

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (3)

influxdb_client_3/write_client/client/write_api.py:1179

  • __getstate__() deletes _write_service, but WriteApi no longer sets this attribute, so pickling will raise KeyError. Additionally, __setstate__() calls __init__() with the wrong signature (it now requires token/bucket/org/...).
    def __getstate__(self):
        """Return a dict of attributes that you want to pickle."""
        state = self.__dict__.copy()
        # Remove rx
        del state['_subject']
        del state['_disposable']
        del state['_write_service']
        return state

    def __setstate__(self, state):
        """Set your object with the provided dict."""
        self.__dict__.update(state)
        # Init Rx
        self.__init__(self._write_options,
                      self._point_settings,
                      success_callback=self._success_callback,
                      error_callback=self._error_callback,
                      retry_callback=self._retry_callback)

tests/test_write_api.py:252

  • Argument order is swapped in this write() call: WriteApi.write(bucket, org, record, ...) expects bucket first, org second, but this passes org then bucket. This makes the test less meaningful and can hide real regressions.
    tests/test_write_api.py:293
  • Argument order is swapped in this write() call: WriteApi.write(bucket, org, record, ...) expects bucket first, org second. As written, the test exercises the wrong parameter mapping (and still "works" only because values are strings).

Comment thread influxdb_client_3/write_client/client/write_api.py
Comment thread influxdb_client_3/write_client/__init__.py
Comment thread influxdb_client_3/write_client/client/write_api.py

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (2)

tests/test_write_api.py:252

  • WriteApi.write() takes arguments in (bucket, org, record) order. This test currently passes (org, bucket, record), which swaps query params and can hide regressions in request construction.
    tests/test_write_api.py:294
  • Same issue as above: the positional args to write() are reversed here; the test should pass bucket first and org second (or use keyword args) so it actually exercises the correct URL/query-param behavior.

Comment thread influxdb_client_3/__init__.py
Comment thread influxdb_client_3/__init__.py
Comment thread influxdb_client_3/write_client/client/write_api.py Outdated
Comment thread tests/test_influxdb_client_3_integration.py

@bednar bednar left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the refactor. I found a few runtime regressions in the new write REST path that should be fixed before this lands.

If the method is called asynchronously,
returns the request thread.
""" # noqa: E501
local_var_params, path, path_params, query_params, header_params, body_params = \

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call_api() is still synchronous: when async_req is false it returns a RESTResponse, and when it is true it returns a thread-like ApplyResult. Awaiting it here makes the real success path fail with TypeError: object RESTResponse can't be used in 'await' expression; the current test hides this by replacing call_api with an AsyncMock. This method needs either a real async implementation or should stop being declared/used as an async coroutine.

Comment thread influxdb_client_3/write_client/_sync/rest_client.py
Comment thread influxdb_client_3/write_client/_sync/rest_client.py
Comment thread influxdb_client_3/write_client/client/write_api.py
Comment thread influxdb_client_3/write_client/client/write_api.py Outdated
Comment thread tests/test_write_api.py
Comment thread tests/test_write_api.py
@karel-rehor

Copy link
Copy Markdown
Contributor

@bednar @NguyenHoangSon96

When running the integration test test_multiprocessing_helper in Ubuntu 24.0.4 with python 3.12.7, I'm getting a segmentation fault whenever the underlying Apache Arrow Flight Client is closed or nulled out.

e.g.

Current thread 0x00007b84b5d97b80 (most recent call first):
  File "/home/karl/bonitoo/prjs/github.com/influxCommunity/influxdb3-python/influxdb_client_3/query/query_api.py", line 298 in close
  File "/home/karl/bonitoo/prjs/github.com/influxCommunity/influxdb3-python/influxdb_client_3/write_client/client/util/multiprocessing_helper.py", line 164 in run
  File "/home/karl/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/process.py", line 314 in _bootstrap
  File "/home/karl/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/popen_fork.py", line 71 in _launch
...

A couple of observations:

  1. It seems to me a bit wasteful in the run() implementation of multiprocessing_helper to initialize an entirely new InfluxDBClient3, which initializes the entire query_api, just to then discard the query_api and the underlying Flight client without even making a connection. I suspect this may have something to do with the segmentation fault. Apache Arrow best practices recommends pooling and then reusing clients and to avoid needlessly creating more than might be needed.
  2. I have not yet seen where a customer has reported this segmentation fault issue, so I'm not sure that this feature is being widely used.
  3. Since one of the goals of this refactoring is to simplify the core of the client library so that it is like other Influxdb3-X libraries, it seems to me that the multiprocessing_helper is not part of the core. We are already removing some minor features in this PR, so that maybe multiprocessing_helper could also be removed, because it leads in some configurations to a segmentation fault and is not core. It could be added as an advanced example in the examples directory.
  4. Just why using Apache Arrow Flight client with multiprocessing leads to a segmentation fault in some environments and configurations requires further investigation.

@bednar do you have an opinion regarding this?

@karel-rehor karel-rehor left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see Jakub is reporting some of the same problems. Review to continue.

Comment thread influxdb_client_3/write_client/_sync/rest_client.py

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.

Comment thread influxdb_client_3/write_client/_sync/rest_client.py
Comment thread influxdb_client_3/__init__.py Outdated
Comment thread influxdb_client_3/write_client/client/write_api.py
Comment thread influxdb_client_3/write_client/client/write_api.py
@NguyenHoangSon96 NguyenHoangSon96 force-pushed the refactor/rest-client branch 3 times, most recently from 5548816 to 2825235 Compare July 2, 2026 15:19
@NguyenHoangSon96 NguyenHoangSon96 requested a review from Copilot July 2, 2026 15:48

@karel-rehor karel-rehor left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've still a few files to look at. I've been investigating the segmentation fault issue further but with no fresh information. There are some requests in the files I've looked at so far.

list_results = reader.to_pylist()
self.assertEqual(data_size, len(list_results))

@pytest.mark.skipif(running_on_circleci, reason="Skipping this test on CircleCI")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When running this test locally, I'm getting segmentation faults.

Fatal Python error: Segmentation fault
...
Current thread 0x00007b84b5d97b80 (most recent call first):
  File "/home/karl/bonitoo/prjs/github.com/influxCommunity/influxdb3-python/influxdb_client_3/query/query_api.py", line 298 in close
  File "/home/karl/bonitoo/prjs/github.com/influxCommunity/influxdb3-python/influxdb_client_3/write_client/client/util/multiprocessing_helper.py", line 164 in run
  File "/home/karl/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/process.py", line 314 in _bootstrap
  File "/home/karl/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/popen_fork.py", line 71 in _launch

Skipping a test specifically in CircleCI without a clear reason looks suspicious. It is possible that the test can run successfully locally in one environment, but then fail in another. When it fails the underlying cause needs to be addressed.

I have a further comment on this in the discussion part of this PR.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I skipped it for CI only because I thought that the reason it failed in CI was something related to how multi-processors work in the CI environment.

Comment thread influxdb_client_3/__init__.py Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that one co-pilot recommendation in this file was applied, but it remains unresolved. Can this be marked as resolved?

Comment thread influxdb_client_3/write_client/_sync/rest_client.py

# Close and set _query_api to None because query_api is not needed in this process.
# We only need write_api.
self.client._query_api.close()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the second to last call in the stack, when I'm getting a Segmentation Fault. It appears to be coming from the internal arrow flight client.

e.g.

Fatal Python error: Segmentation fault
...
Current thread 0x00007b9fa357eb80 (most recent call first):
  File "/home/karl/bonitoo/prjs/github.com/influxCommunity/influxdb3-python/influxdb_client_3/query/query_api.py", line 298 in close
  File "/home/karl/bonitoo/prjs/github.com/influxCommunity/influxdb3-python/influxdb_client_3/write_client/client/util/multiprocessing_helper.py", line 164 in run
  File "/home/karl/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/process.py", line 314 in _bootstrap
  File "/home/karl/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/popen_fork.py", line 71 in _launch

requires further investigation

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.

Comment thread influxdb_client_3/write_client/_sync/rest_client.py
Comment thread influxdb_client_3/write_client/_sync/rest_client.py
Comment on lines +467 to +478
try:
return await self.call_api(
resource_path=path,
method='POST',
query_params=query_params,
header_params=header_params,
body=body,
async_req=local_var_params.get('async_req'),
_request_timeout=local_var_params.get('_request_timeout'),
urlopen_kw=kwargs.get('urlopen_kw', None))
except ApiException as e:
raise self._translate_write_exception(e, use_v2_api)
@NguyenHoangSon96 NguyenHoangSon96 force-pushed the refactor/rest-client branch 2 times, most recently from ed7c50a to a7b4d07 Compare July 2, 2026 16:33
@alespour

alespour commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

+AI+ says (about SIGSEGV):

• Cause is almost certainly native gRPC/Arrow Flight state being used after fork().

Exact flow in this PR:

  1. Test setUp() creates self.client = InfluxDBClient3(...).
  2. InfluxDBClient3.init creates QueryApi.
  3. QueryApi.init creates pyarrow.flight.FlightClient.
  4. writer.start() forks the process on Linux.
  5. Child runs MultiprocessingWriter.run().
  6. Child creates another InfluxDBClient3, then immediately calls:
  self.client._query_api.close()
  1. That calls:
  self._flight_client.close()

and segfaults in native code.

Why SIGSEGV instead of Python error: pyarrow.flight.FlightClient wraps native C++/gRPC resources. gRPC Python documents that it wraps multithreaded native core and has known problems with fork() when gRPC objects exist before the fork. After fork, only the calling thread survives, but native locks/threads/file descriptors/internal state may be copied in an inconsistent state. Closing the Flight client can then hit invalid native state and crash the interpreter.

So: not “bad line protocol”, not the REST write path directly. The trigger is closing an unnecessary Flight client in the child. The deeper cause is mixing Linux fork() multiprocessing with Arrow Flight/gRPC native client lifecycle.

Relevant docs:


Q: Would not creating Query API in "write-only" client help?
A: Yes. That is the most direct technical mitigation.

If the write-only path never creates QueryApi, it never creates pyarrow.flight.FlightClient, so the child process has no unnecessary Arrow Flight/gRPC native object to close. That removes the exact crash trigger Karel reported.

It also improves design: a writer should not initialize the query stack. In this PR, MultiprocessingWriter only needs REST write support, but InfluxDBClient3 eagerly creates both write and query clients. That is why the helper has this suspicious cleanup:

  self.client._query_api.close()
  self.client._query_api = None

A better fix is either:

  InfluxDBClient3(..., query_api_enabled=False)

or, preferably, lazy query creation so normal users do not need a flag:

  self._query_api = None
  # create QueryApi only inside query()/query_async()

Then MultiprocessingWriter can create InfluxDBClient3 and use _write_api without ever touching Arrow Flight. That would likely avoid this SIGSEGV path.

--- BUT ---

Q: Is this helper any good, anyway??
A: Not really. It looks marginal at best and probably not worth keeping as core API. Main problems:

  1. It is a write-only convenience wrapper around multiprocessing.Process, but it leaks too many lifecycle details: manual __del__(), poison pills, join(), queue management, batching callbacks.
  2. It currently uses multiprocessing.Manager().Queue(), which is slower and more fragile than a plain multiprocessing.Queue() for this use case.
  3. It subclasses Process, but exposes an API shaped like a writer object. That makes shutdown semantics awkward: terminate() is redefined to mean cleanup, while actual process termination semantics are not what users expect.
  4. It depends on passing callbacks/options through process boundaries. On spawn platforms, or if users pass bound methods/closures that are not picklable, this can fail.
  5. It now conflicts with InfluxDBClient3’s unified query/write construction because write-only usage still pays for query/Flight setup unless the client becomes lazy.
  6. It is hard to test reliably. Multiprocessing plus network writes plus batching plus native Arrow/gRPC state is a poor combination for a client-library core test matrix.

My recommendation: remove it from core public API in this refactor, or deprecate it and move the pattern to examples/advanced/multiprocessing_writer.py.

If you want to keep functionality, a better design is not this helper. Use a simple documented pattern:

  • child process creates its own InfluxDBClient3 after process start
  • child only writes
  • parent sends records and a sentinel
  • child closes client on exit
  • no inherited client object, no inherited Flight client, no del API

That aligns with @karel-rehor ’s concern: removing it is reasonable because it is not core, appears lightly used, and creates real native-process risk.

@karel-rehor

karel-rehor commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

OK after researching differences between why a segmentation fault is occurring in Linux but apparently not in Darwin, I came across the +AI+ suggestion that Darwin often uses the "spawn" start method by default in multiprocessing, while Linux relies on the traditional "fork" approach. In multiprocessing_helper:__init__ I tried the following locally.

lines 121-132

   ...
    def __init__(self, **kwargs) -> None:
        """
        Initialize defaults.

        For more information how to initialize the writer see the examples above.

        :param kwargs: arguments are passed into ``__init__`` function of ``InfluxDBClient`` and ``write_api``.
        """
        multiprocessing.set_start_method("spawn", force=True)
        multiprocessing.Process.__init__(self)
        self.kwargs = kwargs
   ...     

🎉 The integration test test_multuprocessing_helper now passes without encountering SIGSEGV. However I'm still not comfortable with this as an ultimate solution.

It still does not address all of the concerns mention by me earlier or in Aleses post. I still prefer this be removed and added as an advanced example.

We might now consider calling the set_start_method as shown here, along with marking the feature as deprecated, as we may want to remove it in a future refactoring with the objective of further simplifying the library.

N.B.

After adding the above change to my local fork. I see that the integration test now passes in CircleCI without issue.

Addendum

Further documentation https://docs.python.org/3.14/library/multiprocessing.html#contexts-and-start-methods. Note that according to this product documentation "set_start_method() should not be used more than once in the program.". So perhaps my suggestion here is not the best solution or work-around, but it is an indicator of how this might be fixed.

btw. it is mentioned in the multiprocessing documentation that "spawn... (is) the default on Windows and macOS".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor write HTTP stack: introduce internal _RestClient and bypass OpenAPI WriteService

5 participants