bug: check_sara, check_marttra, nighit crash on empty or vowel-only input #1376

@phoneee

Description

Three public functions crash with IndexError when given empty strings or inputs lacking expected characters:

  1. KhaveeVerifier.check_sara("") → IndexError at line 58: word[-1] on an empty string
  2. KhaveeVerifier.check_marttra("") → IndexError at line 257: word[-1] on an empty string
  3. nighit("สํ", "าี") → IndexError at line 41: [...][0] on an empty list when w2 has no Thai consonants

Expected results

kv.check_sara("")       # → "" or raise ValueError
kv.check_marttra("")    # → "" or raise ValueError
nighit("สํ", "าี")     # → raise ValueError

Current results

kv.check_sara("")       # IndexError: string index out of range
kv.check_marttra("")    # IndexError: string index out of range
nighit("สํ", "าี")     # IndexError: list index out of range

Steps to reproduce

from pythainlp.khavee import KhaveeVerifier
kv = KhaveeVerifier()
kv.check_sara("")        # crash
kv.check_marttra("")     # crash

from pythainlp.morpheme import nighit
nighit("สํ", "าี")      # crash

PyThaiNLP version

5.3.3

Python version

3.13

Operating system and version

macOS

Possible solution

  • check_sara/check_marttra: Add if not word: return "" at function start.
  • nighit: Check that the consonant list is non-empty before indexing [0]; raise ValueError with a descriptive message otherwise.
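
The two guards proposed above could look roughly like the sketch below. It is only an illustration of the pattern, not the PyThaiNLP code: the helper names last_char_or_empty and first_consonant are hypothetical, and the consonant set is assumed to be the Unicode range U+0E01 to U+0E2E rather than whatever the library uses internally.

```python
# Hypothetical stand-ins illustrating the proposed guards.
# These are NOT the actual PyThaiNLP functions.

# Assumption: Thai consonants are the code points ก (U+0E01) .. ฮ (U+0E2E).
THAI_CONSONANTS = {chr(cp) for cp in range(0x0E01, 0x0E2F)}


def last_char_or_empty(word: str) -> str:
    """Guard for check_sara/check_marttra-style code that reads word[-1]."""
    if not word:
        # An empty string has no last character; return "" instead of crashing.
        return ""
    return word[-1]


def first_consonant(w2: str) -> str:
    """Guard for nighit-style code that takes the first consonant of w2."""
    consonants = [ch for ch in w2 if ch in THAI_CONSONANTS]
    if not consonants:
        # Fail with a descriptive error instead of IndexError on [...][0].
        raise ValueError(
            f"w2 must contain at least one Thai consonant, got {w2!r}"
        )
    return consonants[0]
```

With these guards, last_char_or_empty("") returns an empty string, and first_consonant("าี") raises ValueError because both characters are vowel signs.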

Files

  • pythainlp/khavee/core.py (lines 58, 257)
  • pythainlp/morpheme/word_formation.py (line 41)
