Hello,
I am using pysyncobj as the RAFT backend for Patroni (PostgreSQL HA orchestrator), and I ran into a reproducible crash in the RAFT journal handling code.
The exception looks like this (shortened):
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib64/python3.6/threading.py", line 937, in _bootstrap_inner
    self.run()
  File "/usr/lib64/python3.6/threading.py", line 885, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.6/site-packages/patroni/dcs/raft.py", line 274, in _autoTickThread
    self.doTick(self.conf.autoTickPeriod)
  File "/usr/local/lib/python3.6/site-packages/pysyncobj/syncobj.py", line 532, in doTick
    self._onTick(timeToWait)
  File "/usr/local/lib/python3.6/site-packages/patroni/dcs/raft.py", line 264, in _onTick
    super(KVStoreTTL, self)._onTick(timeToWait)
  File "/usr/local/lib/python3.6/site-packages/patroni/dcs/raft.py", line 100, in _onTick
    super(DynMemberSyncObj, self)._onTick(timeToWait)
  File "/usr/local/lib/python3.6/site-packages/pysyncobj/syncobj.py", line 622, in _onTick
    self._checkCommandsToApply()
  File "/usr/local/lib/python3.6/site-packages/pysyncobj/syncobj.py", line 456, in _checkCommandsToApply
    self.__raftLog.add(command, idx, term)
  File "/usr/local/lib/python3.6/site-packages/pysyncobj/journal.py", line 201, in add
    self.__journalFile.write(self.__currentOffset, cmdData)
  File "/usr/local/lib/python3.6/site-packages/pysyncobj/journal.py", line 104, in write
    self.__mm[offset:offset + size] = values
IndexError: mmap slice assignment is wrong size
- If mmap.resize() returns without raising an exception but the mapping was not actually enlarged (or enlarged less than requested for any reason), the code assumes success and proceeds to write beyond the real end of the mapping.
- There is no verification that self.__mm.size() >= offset + size before the slice assignment.
This leads to IndexError: mmap slice assignment is wrong size when the slice [offset:offset + size] gets silently truncated to the real end of the mapping, while len(values) == size.
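To illustrate the truncation described above, here is a minimal standalone sketch. It does not use pysyncobj at all; the anonymous 16-byte mapping and the values of offset and values are made up for the demonstration and simply stand in for a journal mapping that is smaller than the write requires.

```python
import mmap

# Anonymous 16-byte mapping; stands in for a mapping that was not enlarged.
mm = mmap.mmap(-1, 16)

offset = 12
values = b"12345678"      # needs bytes 12..19, but the mapping ends at 16
size = len(values)

try:
    # The slice is clamped to [12:16] (4 bytes) while len(values) is 8.
    mm[offset:offset + size] = values
    error = None
except IndexError as exc:
    error = str(exc)

print(error)
mm.close()
```

This prints "mmap slice assignment is wrong size", the same message as in the traceback.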
Suggested fix (one possible approach):
Add an explicit check after resize() and before the assignment, e.g.:
def write(self, offset, values):
    size = len(values)
    currSize = self.__mm.size()
    if offset + size > currSize:
        try:
            self.__mm.resize(int(currSize * self.__resizeFactor))
        except (SystemError, OSError):
            self.__extand(int(currSize * self.__resizeFactor) - currSize)
    # Guard: verify resize actually provided enough space
    if self.__mm.size() < offset + size:
        self.__extand(offset + size - self.__mm.size())
    self.__mm[offset:offset + size] = values
I understand that mmap.resize() should either succeed or raise, but in practice this crash shows that we can end up in a state where resize() returns without an exception and the mapping is still too small for the requested write.
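For anyone who wants to experiment with the guard outside of pysyncobj, below is a rough self-contained sketch of the same idea. Everything in it is hypothetical: the GrowableMmapFile class, its _remap helper, and the parameter names are my own stand-ins, not the library's JournalFile. One deliberate difference from the snippet above is that the guard compares against len() of the mapping rather than size(), because the Python docs state that mmap.size() returns the length of the underlying file, which can be larger than the mapped area. Whether that distinction plays a role in this crash is an assumption I have not verified.

```python
import mmap
import os
import tempfile


class GrowableMmapFile:
    """Hypothetical stand-in for a journal file wrapper (not pysyncobj code)."""

    def __init__(self, path, initial_size=16, resize_factor=2.0):
        self._resize_factor = resize_factor
        self._fd = os.open(path, os.O_RDWR | os.O_CREAT)
        if os.fstat(self._fd).st_size < initial_size:
            os.ftruncate(self._fd, initial_size)
        self._mm = mmap.mmap(self._fd, 0)

    def _remap(self, new_size):
        # Fallback path: grow the file on disk, then map it again from scratch.
        self._mm.flush()
        self._mm.close()
        os.ftruncate(self._fd, new_size)
        self._mm = mmap.mmap(self._fd, 0)

    def write(self, offset, values):
        needed = offset + len(values)
        if needed > len(self._mm):
            # Grow by the resize factor, but never by less than what is needed.
            target = max(needed, int(len(self._mm) * self._resize_factor))
            try:
                self._mm.resize(target)
            except (SystemError, OSError):
                self._remap(target)
        # Guard: do not assume resize() delivered enough mapped space.
        if len(self._mm) < needed:
            self._remap(needed)
        if len(self._mm) < needed:
            raise IOError("mapping still too small after resize")
        self._mm[offset:needed] = values

    def read(self, offset, size):
        return bytes(self._mm[offset:offset + size])

    def close(self):
        self._mm.close()
        os.close(self._fd)


# Usage: a write that crosses the end of the initial 16-byte mapping.
path = os.path.join(tempfile.mkdtemp(), "journal.bin")
jf = GrowableMmapFile(path, initial_size=16)
jf.write(12, b"12345678")
print(jf.read(12, 8))
jf.close()
```

The final explicit size check turns a silent short mapping into a clear error instead of the IndexError from the slice assignment, which may make this kind of failure easier to diagnose.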