refactor: rest client by NguyenHoangSon96 · Pull Request #217 · InfluxCommunity/influxdb3-python

NguyenHoangSon96 · 2026-06-12T15:16:42Z

Closes #216

Proposed Changes

I'm trying to make python3 similar to other v3 clients as possible, so I removed some functionalities that currently don't exist in other clients.

Summarize

RestClient will be created in InfluxDBClient3 and injected to WriteApi; WriteApi will have all functions to write to Influxdb, so there relations are something like this InfluxDBClient3 -> WriteApi -> RestClient.

Removed features:

Debug.
org_id - only allow org name.
_return_http_data_only param.
zap-trace-span param.
_preload_content

Checklist

Copilot

Pull request overview

This PR refactors the write path to use a new synchronous RestClient (urllib3-based) instead of the previous generated client/write service wiring, and updates InfluxDBClient3 construction accordingly.

Changes:

Refactors WriteApi to issue HTTP requests via a new RestClient, adds gzip decision/compression utilities, and adds endpoint/exception translation logic.
Introduces influxdb_client_3/write_client/_sync/rest_client.py to encapsulate urllib3 request/response handling.
Updates InfluxDBClient3 initialization to pass write connection details (base_url/auth/gzip) directly into WriteApi.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 15 comments.

File	Description
`influxdb_client_3/write_client/client/write_api.py`	Major refactor of write implementation to use a new REST client and custom request building/serialization paths.
`influxdb_client_3/write_client/_sync/rest_client.py`	Adds a new urllib3-based synchronous REST client abstraction used by `WriteApi`.
`influxdb_client_3/__init__.py`	Wires new write API constructor parameters (auth/gzip/base_url) and updates public kwargs documentation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 12 comments.

Comments suppressed due to low confidence (2)

tests/test_write_api.py:251

WriteApi.write() takes parameters in (bucket, org, record, ...) order, but this test passes (org, bucket, record). This makes the test less representative of real usage and can mask issues in how query parameters are constructed.
tests/test_write_api.py:293
This call passes (org, bucket, record) instead of (bucket, org, record), which makes the timeout test less representative of real usage and inconsistent with other tests in this file.

Copilot

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (4)

influxdb_client_3/write_client/client/write_api.py:1171

__getstate__ deletes _write_service, but WriteApi no longer defines that attribute. This will raise KeyError during pickling (and del state['_subject'] / del state['_disposable'] can also KeyError depending on instance state). Use state.pop(..., None) and remove the stale _write_service reference.

    def __getstate__(self):
        """Return a dict of attributes that you want to pickle."""
        state = self.__dict__.copy()
        # Remove rx
        del state['_subject']
        del state['_disposable']
        del state['_write_service']
        return state

influxdb_client_3/write_client/client/write_api.py:1181

__setstate__ calls self.__init__(self._write_options, self._point_settings, ...), but WriteApi.__init__ requires token, bucket, and org as the first arguments. As written, unpickling will raise TypeError. Recreate the Rx batching pipeline based on the restored _write_options instead of re-calling __init__ with the wrong signature.

    def __setstate__(self, state):
        """Set your object with the provided dict."""
        self.__dict__.update(state)
        # Init Rx
        self.__init__(self._write_options,
                      self._point_settings,
                      success_callback=self._success_callback,
                      error_callback=self._error_callback,
                      retry_callback=self._retry_callback)

tests/test_write_api.py:252

WriteApi.write() expects positional arguments in the order (bucket, org, record). This call passes org first, which swaps the query parameters and makes the test less representative of real usage. Use keyword arguments (or correct positional order) to avoid accidental swaps.
tests/test_write_api.py:293
WriteApi.write() positional argument order is (bucket, org, record). This test passes org then bucket, which is easy to get wrong and can hide routing/query-param bugs. Prefer keyword arguments to make the intent explicit (and keep _request_timeout behavior under test).

Copilot

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (5)

tests/test_write_api.py:29

mock_urllib3_timeout_request is used as a side_effect for urllib3._request_methods.RequestMethods.request, which passes the RequestMethods instance as the first positional arg. The helper currently omits that parameter, so method/url are shifted and the test can behave incorrectly.
tests/test_write_api.py:252
WriteApi.write() takes bucket as the first positional argument and org as the second. This call passes them in the opposite order, so the request routing/query params will be wrong.
tests/test_write_api.py:293
WriteApi.write() takes bucket as the first positional argument and org as the second. This call passes them in the opposite order, so it won't exercise the intended timeout behavior for the right endpoint params.
influxdb_client_3/write_client/client/write_api.py:1171
__getstate__ deletes state['_write_service'], but this class no longer defines _write_service. Pickling will raise KeyError here.

        state = self.__dict__.copy()
        # Remove rx
        del state['_subject']
        del state['_disposable']
        del state['_write_service']
        return state

influxdb_client_3/write_client/client/write_api.py:1181

__setstate__ calls self.__init__(self._write_options, self._point_settings, ...), but __init__ now requires token, bucket, org, and other constructor args. This will break unpickling; it should just recreate the Rx batching pipeline based on the restored _write_options.

    def __setstate__(self, state):
        """Set your object with the provided dict."""
        self.__dict__.update(state)
        # Init Rx
        self.__init__(self._write_options,
                      self._point_settings,
                      success_callback=self._success_callback,
                      error_callback=self._error_callback,
                      retry_callback=self._retry_callback)

Copilot

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (3)

tests/test_write_api.py:252

WriteApi.write() takes positional arguments in the order (bucket, org, record). This call passes org first and bucket second, which swaps routing parameters and can hide bugs in header/query handling. Prefer keyword arguments (or swap the positionals) to make the intent unambiguous.
tests/test_write_api.py:293
Same positional-argument issue as above: this passes org first and bucket second to WriteApi.write(bucket, org, record), which inverts query parameters. Using keywords also makes the _request_timeout intent clearer.
influxdb_client_3/write_client/client/write_api.py:1181
__getstate__ / __setstate__ are broken after the refactor: _write_service is no longer an attribute, so del state['_write_service'] will raise KeyError during pickling. Additionally, __setstate__ calls __init__ with the wrong arguments for the new constructor signature (it no longer accepts just write_options and point_settings). This makes pickling/unpickling unusable.

    def __getstate__(self):
        """Return a dict of attributes that you want to pickle."""
        state = self.__dict__.copy()
        # Remove rx
        del state['_subject']
        del state['_disposable']
        del state['_write_service']
        return state

    def __setstate__(self, state):
        """Set your object with the provided dict."""
        self.__dict__.update(state)
        # Init Rx
        self.__init__(self._write_options,
                      self._point_settings,
                      success_callback=self._success_callback,
                      error_callback=self._error_callback,
                      retry_callback=self._retry_callback)

Copilot

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (3)

influxdb_client_3/write_client/client/write_api.py:1181

__getstate__ deletes state['_write_service'], but WriteApi no longer defines _write_service, so pickling will raise KeyError. Additionally, __setstate__ calls __init__ with the wrong signature (it now requires token, bucket, org, etc.), so unpickling will fail even if __getstate__ succeeds. Safer approach: pop() optional fields and recreate the Rx pipeline based on _write_options without re-calling __init__.

    def __getstate__(self):
        """Return a dict of attributes that you want to pickle."""
        state = self.__dict__.copy()
        # Remove rx
        del state['_subject']
        del state['_disposable']
        del state['_write_service']
        return state

    def __setstate__(self, state):
        """Set your object with the provided dict."""
        self.__dict__.update(state)
        # Init Rx
        self.__init__(self._write_options,
                      self._point_settings,
                      success_callback=self._success_callback,
                      error_callback=self._error_callback,
                      retry_callback=self._retry_callback)

tests/test_write_api.py:252

WriteApi.write() expects positional arguments as (bucket, org, record, ...), but this test passes them as (org, bucket, record). That swaps query params and makes the test exercise the wrong behavior. Use keyword args (or correct positional order) to match the write() signature.
tests/test_write_api.py:294
This write() call also swaps (org, bucket) positional order. Since the assertion is about timeout propagation rather than parameter order, using keyword args avoids accidentally testing the wrong thing.

Copilot

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

tests/test_write_api.py:252

In this test, WriteApi.write() expects positional args in the order (bucket, org, record). Passing org first swaps the routing parameters and can hide bugs in request construction/error handling.
tests/test_write_api.py:293
WriteApi.write() positional parameters are (bucket, org, record). This call passes org first, which makes the test exercise the wrong request routing.

Copilot

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (3)

influxdb_client_3/write_client/client/write_api.py:1179

__getstate__() deletes _write_service, but WriteApi no longer sets this attribute, so pickling will raise KeyError. Additionally, __setstate__() calls __init__() with the wrong signature (it now requires token/bucket/org/...).

    def __getstate__(self):
        """Return a dict of attributes that you want to pickle."""
        state = self.__dict__.copy()
        # Remove rx
        del state['_subject']
        del state['_disposable']
        del state['_write_service']
        return state

    def __setstate__(self, state):
        """Set your object with the provided dict."""
        self.__dict__.update(state)
        # Init Rx
        self.__init__(self._write_options,
                      self._point_settings,
                      success_callback=self._success_callback,
                      error_callback=self._error_callback,
                      retry_callback=self._retry_callback)

tests/test_write_api.py:252

Argument order is swapped in this write() call: WriteApi.write(bucket, org, record, ...) expects bucket first, org second, but this passes org then bucket. This makes the test less meaningful and can hide real regressions.
tests/test_write_api.py:293
Argument order is swapped in this write() call: WriteApi.write(bucket, org, record, ...) expects bucket first, org second. As written, the test exercises the wrong parameter mapping (and still "works" only because values are strings).

Copilot

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (2)

tests/test_write_api.py:252

WriteApi.write() takes arguments in (bucket, org, record) order. This test currently passes (org, bucket, record), which swaps query params and can hide regressions in request construction.
tests/test_write_api.py:294
Same issue as above: the positional args to write() are reversed here; the test should pass bucket first and org second (or use keyword args) so it actually exercises the correct URL/query-param behavior.

bednar

Thanks for the refactor. I found a few runtime regressions in the new write REST path that should be fixed before this lands.

bednar · 2026-07-02T08:57:25Z

+                 If the method is called asynchronously,
+                 returns the request thread.
+        """  # noqa: E501
+        local_var_params, path, path_params, query_params, header_params, body_params = \


call_api() is still synchronous: when async_req is false it returns a RESTResponse, and when it is true it returns a thread-like ApplyResult. Awaiting it here makes the real success path fail with TypeError: object RESTResponse can't be used in 'await' expression; the current test hides this by replacing call_api with an AsyncMock. This method needs either a real async implementation or should stop being declared/used as an async coroutine.

karel-rehor · 2026-07-02T09:07:26Z

@bednar @NguyenHoangSon96

When running the integration test test_multiprocessing_helper in Ubuntu 24.0.4 with python 3.12.7, I'm getting a segmentation fault whenever the underlying Apache Arrow Flight Client is closed or nulled out.

e.g.

Current thread 0x00007b84b5d97b80 (most recent call first):
  File "/home/karl/bonitoo/prjs/github.com/influxCommunity/influxdb3-python/influxdb_client_3/query/query_api.py", line 298 in close
  File "/home/karl/bonitoo/prjs/github.com/influxCommunity/influxdb3-python/influxdb_client_3/write_client/client/util/multiprocessing_helper.py", line 164 in run
  File "/home/karl/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/process.py", line 314 in _bootstrap
  File "/home/karl/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/popen_fork.py", line 71 in _launch
...

A couple of observations:

It seems to me a bit wasteful in the run() implementation of multiprocessing_helper to initialize an entirely new InfluxDBClient3, which initializes the entire query_api, just to then discard the query_api and the underlying Flight client without even making a connection. I suspect this may have something to do with the segmentation fault. Apache Arrow best practices recommends pooling and then reusing clients and to avoid needlessly creating more than might be needed.
I have not yet seen where a customer has reported this segmentation fault issue, so I'm not sure that this feature is being widely used.
Since one of the goals of this refactoring is to simplify the core of the client library so that it is like other Influxdb3-X libraries, it seems to me that the multiprocessing_helper is not part of the core. We are already removing some minor features in this PR, so that maybe multiprocessing_helper could also be removed, because it leads in some configurations to a segmentation fault and is not core. It could be added as an advanced example in the examples directory.
Just why using Apache Arrow Flight client with multiprocessing leads to a segmentation fault in some environments and configurations requires further investigation.

@bednar do you have an opinion regarding this?

karel-rehor

I see Jakub is reporting some of the same problems. Review to continue.

Copilot

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.

karel-rehor

I've still a few files to look at. I've been investigating the segmentation fault issue further but with no fresh information. There are some requests in the files I've looked at so far.

karel-rehor · 2026-07-02T11:27:29Z

            list_results = reader.to_pylist()
            self.assertEqual(data_size, len(list_results))

+    @pytest.mark.skipif(running_on_circleci, reason="Skipping this test on CircleCI")


When running this test locally, I'm getting segmentation faults.

Fatal Python error: Segmentation fault ... Current thread 0x00007b84b5d97b80 (most recent call first): File "/home/karl/bonitoo/prjs/github.com/influxCommunity/influxdb3-python/influxdb_client_3/query/query_api.py", line 298 in close File "/home/karl/bonitoo/prjs/github.com/influxCommunity/influxdb3-python/influxdb_client_3/write_client/client/util/multiprocessing_helper.py", line 164 in run File "/home/karl/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/process.py", line 314 in _bootstrap File "/home/karl/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/popen_fork.py", line 71 in _launch

Skipping a test specifically in CircleCI without a clear reason looks suspicious. It is possible that the test can run successfully locally in one environment, but then fail in another. When it fails the underlying cause needs to be addressed.

I have a further comment on this in the discussion part of this PR.

I skipped it for CI only because I thought that the reason it failed in CI was something related to how multi-processors work in the CI environment.

karel-rehor · 2026-07-02T12:43:29Z

It appears that one co-pilot recommendation in this file was applied, but it remains unresolved. Can this be marked as resolved?

karel-rehor · 2026-07-02T14:36:09Z

+
+        # Close and set _query_api to None because query_api is not needed in this process.
+        # We only need write_api.
+        self.client._query_api.close()


This is the second to last call in the stack, when I'm getting a Segmentation Fault. It appears to be coming from the internal arrow flight client.

e.g.

Fatal Python error: Segmentation fault ... Current thread 0x00007b9fa357eb80 (most recent call first): File "/home/karl/bonitoo/prjs/github.com/influxCommunity/influxdb3-python/influxdb_client_3/query/query_api.py", line 298 in close File "/home/karl/bonitoo/prjs/github.com/influxCommunity/influxdb3-python/influxdb_client_3/write_client/client/util/multiprocessing_helper.py", line 164 in run File "/home/karl/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/process.py", line 314 in _bootstrap File "/home/karl/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/popen_fork.py", line 71 in _launch

requires further investigation

Copilot

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.

+        try:
+            return await self.call_api(
+                resource_path=path,
+                method='POST',
+                query_params=query_params,
+                header_params=header_params,
+                body=body,
+                async_req=local_var_params.get('async_req'),
+                _request_timeout=local_var_params.get('_request_timeout'),
+                urlopen_kw=kwargs.get('urlopen_kw', None))
+        except ApiException as e:
+            raise self._translate_write_exception(e, use_v2_api)


alespour · 2026-07-03T11:46:06Z

+AI+ says (about SIGSEGV):

• Cause is almost certainly native gRPC/Arrow Flight state being used after fork().

Exact flow in this PR:

Test setUp() creates self.client = InfluxDBClient3(...).
InfluxDBClient3.init creates QueryApi.
QueryApi.init creates pyarrow.flight.FlightClient.
writer.start() forks the process on Linux.
Child runs MultiprocessingWriter.run().
Child creates another InfluxDBClient3, then immediately calls:

  self.client._query_api.close()

That calls:

  self._flight_client.close()

and segfaults in native code.

Why SIGSEGV instead of Python error: pyarrow.flight.FlightClient wraps native C++/gRPC resources. gRPC Python documents that it wraps multithreaded native core and has known problems with fork() when gRPC objects exist before the fork. After fork, only the calling thread survives, but native locks/threads/file descriptors/internal state may be copied in an inconsistent state. Closing the Flight client can then hit invalid native state and crash the interpreter.

So: not “bad line protocol”, not the REST write path directly. The trigger is closing an unnecessary Flight client in the child. The deeper cause is mixing Linux fork() multiprocessing with Arrow Flight/gRPC native client lifecycle.

Relevant docs:

Arrow Flight is built on gRPC: https://arrow.apache.org/docs/format/Flight.html
gRPC fork caveat: https://grpc.github.io/grpc/core/md_doc_fork_support.html

Q: Would not creating Query API in "write-only" client help?
A: Yes. That is the most direct technical mitigation.

If the write-only path never creates QueryApi, it never creates pyarrow.flight.FlightClient, so the child process has no unnecessary Arrow Flight/gRPC native object to close. That removes the exact crash trigger Karel reported.

It also improves design: a writer should not initialize the query stack. In this PR, MultiprocessingWriter only needs REST write support, but InfluxDBClient3 eagerly creates both write and query clients. That is why the helper has this suspicious cleanup:

  self.client._query_api.close()
  self.client._query_api = None

A better fix is either:

  InfluxDBClient3(..., query_api_enabled=False)

or, preferably, lazy query creation so normal users do not need a flag:

  self._query_api = None
  # create QueryApi only inside query()/query_async()

Then MultiprocessingWriter can create InfluxDBClient3 and use _write_api without ever touching Arrow Flight. That would likely avoid this SIGSEGV path.

--- BUT ---

Q: Is this helper any good, anyway??
A: Not really. It looks marginal at best and probably not worth keeping as core API. Main problems:

It is a write-only convenience wrapper around multiprocessing.Process, but it leaks too many lifecycle details: manual __del__(), poison pills, join(), queue management, batching callbacks.
It currently uses multiprocessing.Manager().Queue(), which is slower and more fragile than a plain multiprocessing.Queue() for this use case.
It subclasses Process, but exposes an API shaped like a writer object. That makes shutdown semantics awkward: terminate() is redefined to mean cleanup, while actual process termination semantics are not what users expect.
It depends on passing callbacks/options through process boundaries. On spawn platforms, or if users pass bound methods/closures that are not picklable, this can fail.
It now conflicts with InfluxDBClient3’s unified query/write construction because write-only usage still pays for query/Flight setup unless the client becomes lazy.
It is hard to test reliably. Multiprocessing plus network writes plus batching plus native Arrow/gRPC state is a poor combination for a client-library core test matrix.

My recommendation: remove it from core public API in this refactor, or deprecate it and move the pattern to examples/advanced/multiprocessing_writer.py.

If you want to keep functionality, a better design is not this helper. Use a simple documented pattern:

child process creates its own InfluxDBClient3 after process start
child only writes
parent sends records and a sentinel
child closes client on exit
no inherited client object, no inherited Flight client, no del API

That aligns with @karel-rehor ’s concern: removing it is reasonable because it is not core, appears lightly used, and creates real native-process risk.

karel-rehor · 2026-07-03T13:17:01Z

OK after researching differences between why a segmentation fault is occurring in Linux but apparently not in Darwin, I came across the +AI+ suggestion that Darwin often uses the "spawn" start method by default in multiprocessing, while Linux relies on the traditional "fork" approach. In multiprocessing_helper:__init__ I tried the following locally.

lines 121-132

   ...
    def __init__(self, **kwargs) -> None:
        """
        Initialize defaults.

        For more information how to initialize the writer see the examples above.

        :param kwargs: arguments are passed into ``__init__`` function of ``InfluxDBClient`` and ``write_api``.
        """
        multiprocessing.set_start_method("spawn", force=True)
        multiprocessing.Process.__init__(self)
        self.kwargs = kwargs
   ...

🎉 The integration test test_multuprocessing_helper now passes without encountering SIGSEGV. However I'm still not comfortable with this as an ultimate solution.

It still does not address all of the concerns mention by me earlier or in Aleses post. I still prefer this be removed and added as an advanced example.

We might now consider calling the set_start_method as shown here, along with marking the feature as deprecated, as we may want to remove it in a future refactoring with the objective of further simplifying the library.

N.B.

After adding the above change to my local fork. I see that the integration test now passes in CircleCI without issue.

Addendum

Further documentation https://docs.python.org/3.14/library/multiprocessing.html#contexts-and-start-methods. Note that according to this product documentation "set_start_method() should not be used more than once in the program.". So perhaps my suggestion here is not the best solution or work-around, but it is an indicator of how this might be fixed.

btw. it is mentioned in the multiprocessing documentation that "spawn... (is) the default on Windows and macOS".

NguyenHoangSon96 self-assigned this Jun 12, 2026

Copilot AI review requested due to automatic review settings June 12, 2026 15:16

Copilot started reviewing on behalf of NguyenHoangSon96 June 12, 2026 15:17 View session

Copilot AI reviewed Jun 12, 2026

View reviewed changes

NguyenHoangSon96 force-pushed the refactor/rest-client branch from 2b26e5c to 7d74438 Compare June 12, 2026 16:42

NguyenHoangSon96 marked this pull request as draft June 16, 2026 11:15

NguyenHoangSon96 force-pushed the refactor/rest-client branch from 52cee20 to a7076f9 Compare June 17, 2026 09:38

NguyenHoangSon96 requested a review from Copilot June 18, 2026 05:45

Copilot started reviewing on behalf of NguyenHoangSon96 June 18, 2026 05:45 View session