Commit edb2031

fix: poll status=="completed" in cloud add_document (#226)
The cloud backend previously polled tree_resp["retrieval_ready"] as the ready signal. Empirically, this flag is unreliable: documents can reach status=="completed" without retrieval_ready ever flipping, so col.add() waits out the full 10-minute timeout (120 polls, 5 seconds apart) before giving up on uploads that actually succeeded. The cloud API's canonical ready signal is status=="completed", so the poll now checks that instead.
1 parent f5de9c9 commit edb2031

1 file changed

Lines changed: 4 additions & 3 deletions

pageindex/backend/cloud.py

@@ -141,12 +141,13 @@ def add_document(self, collection: str, file_path: str) -> str:
 
         doc_id = resp["doc_id"]
 
-        # Poll until retrieval-ready
+        # Poll until indexing completes. The cloud API signals readiness via
+        # status == "completed"; retrieval_ready is not a reliable indicator.
         for _ in range(120):  # 10 min max
             tree_resp = self._request("GET", f"/doc/{self._enc(doc_id)}/", params={"type": "tree"})
-            if tree_resp.get("retrieval_ready"):
-                return doc_id
             status = tree_resp.get("status", "")
+            if status == "completed":
+                return doc_id
             if status == "failed":
                 raise CloudAPIError(f"Document {doc_id} indexing failed")
             time.sleep(5)
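
The behavior this change targets can be reproduced in isolation. The following is a minimal sketch, not the code in cloud.py: wait_until_indexed, IndexingFailed, the injected sleep parameter, and the "processing" status value are all hypothetical names chosen for illustration, and the TimeoutError after the loop is an assumption, since the diff does not show what add_document does once the 120 polls are exhausted.

```python
import time
from typing import Callable


class IndexingFailed(Exception):
    """Hypothetical stand-in for the CloudAPIError raised in cloud.py."""


def wait_until_indexed(
    fetch_status: Callable[[], dict],
    attempts: int = 120,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until status == "completed"; return the number of polls used."""
    for attempt in range(1, attempts + 1):
        resp = fetch_status()
        status = resp.get("status", "")
        if status == "completed":
            return attempt
        if status == "failed":
            raise IndexingFailed("indexing failed")
        sleep(interval)
    # Assumption: the diff does not show what happens after the loop ends.
    raise TimeoutError(f"not completed after {attempts} polls")


# Simulated responses ("processing" is an assumed intermediate status):
# the document completes on the third poll, but retrieval_ready never flips.
responses = iter([
    {"status": "processing", "retrieval_ready": False},
    {"status": "processing", "retrieval_ready": False},
    {"status": "completed", "retrieval_ready": False},
])
polls_used = wait_until_indexed(lambda: next(responses), sleep=lambda _: None)
print(polls_used)  # 3
```

With the old check on retrieval_ready, the same simulated sequence would never satisfy the ready condition, which matches the timeout described in the commit message. Injecting sleep keeps the sketch testable without real delays.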
