Possible memory leak when deleting fst in python bindings #286

@joanmo2

Hello,

I am experiencing an issue with the Python bindings of the library. I create an FST and then generate a second FST containing the n-best paths of the first one, using shortest_path with nshortest=250.

The problem appears when I explicitly delete both FSTs. Deleting the initial FST releases no memory, and deleting the FST containing the n-best paths frees only part of the memory that was allocated. This suggests a potential memory leak.

Could you take a look at it?

Code:

import gc
import math

import rustfst as fst
from memory_profiler import profile
from rustfst.algorithms.shortest_path import ShortestPathConfig


def create_fst():
    fa = fst.VectorFst()
    start_state = fa.add_state()
    fa.set_start(start_state)

    current_state = start_state
    input_symbol_table = fst.SymbolTable()
    output_symbol_table = fst.SymbolTable()
    input_symbol_table.add_symbol("<eps>")
    output_symbol_table.add_symbol("<eps>")

    for i in range(1000):
        input_symbol_table.add_symbol(str(i))
    for j in range(1, 151):
        output_symbol_table.add_symbol(str(j))
    # Create 1000 states
    for i in range(1000):
        next_state = fa.add_state()
        # Create 150 transitions between the current state and the next state
        for j in range(1, 151):
            w = -math.log(j * 0.005)
            fa.add_tr(
                current_state,
                fst.Tr(
                    input_symbol_table.find(str(i)),
                    output_symbol_table.find(str(j)),
                    w,
                    next_state,
                ),
            )
        current_state = next_state
    fa.set_final(current_state)
    return fa, input_symbol_table, output_symbol_table


def get_fst_containing_n_shortest_path(fa: fst.VectorFst, nshortest: int):
    res = fa.shortest_path(ShortestPathConfig(nshortest=nshortest))
    return res


@profile
def create_fst_and_then_get_n_best():
    fa, _, _ = create_fst()
    fb = get_fst_containing_n_shortest_path(fa, nshortest=250)
    del fa
    del fb
    gc.collect()


def main():
    create_fst_and_then_get_n_best()


if __name__ == "__main__":
    main()

Output of the profiling:

Filename: bug_rustfst.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    49     44.1 MiB     44.1 MiB           1   @profile
    50                                         def create_fst_and_then_get_n_best():
    51     48.6 MiB      4.5 MiB           1       fa, _, _ = create_fst()
    52   4495.8 MiB   4447.2 MiB           1       fb = get_fst_containing_n_shortest_path(fa, nshortest=250)
    53   4495.8 MiB      0.0 MiB           1       del fa
    54   3606.8 MiB   -889.0 MiB           1       del fb
    55   3606.8 MiB      0.0 MiB           1       gc.collect()

Please note that deleting fb releases only 889.0 MiB of the 4447.2 MiB allocated by the shortest_path call, and deleting fa releases nothing.
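A single profiled run cannot by itself tell a true leak apart from the allocator keeping freed pages for reuse. One possible check, sketched below, is to repeat the workload several times and sample the resident set size after each run: steady growth per run would point to a leak, while a plateau would suggest retained but reusable memory. The helper names `current_rss_mib` and `rss_after_each_run` are hypothetical and not part of rustfst or memory_profiler, and reading `/proc/self/statm` assumes Linux (as on the Ubuntu system above). The placeholder workload would need to be replaced with `create_fst_and_then_get_n_best` from the script above, without the `@profile` decorator.

```python
import gc
import os


def current_rss_mib() -> float:
    """Return this process's resident set size in MiB (Linux only)."""
    # /proc/self/statm reports sizes in pages; the second field is RSS.
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / (1024 * 1024)


def rss_after_each_run(workload, iterations=5):
    """Call workload() repeatedly and record RSS after each call."""
    samples = []
    for _ in range(iterations):
        workload()
        gc.collect()  # drop any Python-side garbage before sampling
        samples.append(current_rss_mib())
    return samples


if __name__ == "__main__":
    # Placeholder workload (assumption): swap in the FST function
    # from the reproduction script to test the reported behaviour.
    def workload():
        data = [0] * 1_000_000
        del data

    for i, rss in enumerate(rss_after_each_run(workload), start=1):
        print(f"run {i}: {rss:.1f} MiB")
```

If the RSS samples keep climbing by roughly the same amount on every run, the memory is not being reused and the leak hypothesis becomes much stronger.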

Used software:

Python 3.8.19
rustfst-python==1.1.2
memory-profiler==0.61.0
Ubuntu 22.04.5 LTS
