Skip to content

Data race in Dataset.AddFeedback on itemFeedback and numFeedback (follow-up to #1232) #1239

@shaun0927

Description

@shaun0927

Gorse version

master (HEAD b7e6aa9, 2026-04-16). Reproduces on Go 1.26.2 / darwin-arm64 and the 16-goroutine CI path added in #1232.

Describe the bug

#1232 added per-user locks to Dataset.AddFeedback to prevent concurrent append to userFeedback[userIndex], but two other writes in the same function remain outside the lock scope:

  1. dataset/dataset.go:237d.itemFeedback[itemIndex] = append(d.itemFeedback[itemIndex], userIndex) is executed before any lock is acquired. Concurrent goroutines that share an itemIndex race on the slice header.
  2. dataset/dataset.go:242d.numFeedback++ is protected only by a per-user lock. Two goroutines holding different user locks still race on the shared counter.

The current layout only avoids the userFeedback[userIndex] race; goroutines that write feedback for different users but share an item (a common case when loading data in parallel) still collide.

To Reproduce

// dataset/race_repro_test.go
package dataset

import (
    "fmt"
    "sync"
    "testing"
    "time"

    "github.com/gorse-io/gorse/storage/data"
    "github.com/stretchr/testify/assert"
)

func TestAddFeedbackRace(t *testing.T) {
    const numUsers, numItems = 16, 32
    ds := NewDataset(time.Now(), numUsers, numItems)
    for i := 0; i < numUsers; i++ {
        ds.AddUser(data.User{UserId: fmt.Sprintf("user%d", i)})
    }
    for j := 0; j < numItems; j++ {
        ds.AddItem(data.Item{ItemId: fmt.Sprintf("item%d", j)})
    }

    var wg sync.WaitGroup
    ts := time.Unix(1700000000, 0)
    for u := 0; u < numUsers; u++ {
        wg.Add(1)
        go func(userIdx int) {
            defer wg.Done()
            for j := 0; j < numItems; j++ {
                ds.AddFeedback(fmt.Sprintf("user%d", userIdx), fmt.Sprintf("item%d", j), ts)
            }
        }(u)
    }
    wg.Wait()
    assert.Equal(t, numUsers*numItems, ds.CountFeedback())
}

Run go test -race -run TestAddFeedbackRace -count=1 ./dataset/...:

==================
WARNING: DATA RACE
Write at 0x00c00004a2d4 by goroutine 21:
  dataset.(*Dataset).AddFeedback()
      /.../dataset/dataset.go:237 +0x76c
Previous write at 0x00c00004a2d4 by goroutine 29:
  dataset.(*Dataset).AddFeedback()
      /.../dataset/dataset.go:237 +0x76c
==================
--- FAIL: TestAddFeedbackRace (0.02s)
    Error: expected 512  actual: 437
    testing.go:1712: race detected during execution of test

So in addition to the race detector firing, 75 of 512 (14.6%) numFeedback increments are silently lost under contention. The incorrect numFeedback also leaks into CountFeedback, which is consumed by progress reporting and by downstream validation in the parallel loader.

Expected behavior

  • go test -race is clean.
  • ds.CountFeedback() equals the number of successful AddFeedback calls, independent of concurrency.

Additional context / suggested fix

A minimal correction mirrors the existing per-user lock pattern:

func (d *Dataset) AddFeedback(userId, itemId string, timestamp time.Time) {
    userIndex := d.userDict.Add(userId)
    itemIndex := d.itemDict.Add(itemId)

    d.itemDict.Lock(itemIndex)
    d.itemFeedback[itemIndex] = append(d.itemFeedback[itemIndex], userIndex)
    d.itemDict.Unlock(itemIndex)

    d.userDict.Lock(userIndex)
    d.userFeedback[userIndex] = append(d.userFeedback[userIndex], itemIndex)
    d.timestamps[userIndex]   = append(d.timestamps[userIndex], timestamp)
    d.userDict.Unlock(userIndex)

    atomic.AddInt64(&d.numFeedback, 1) // change field type to int64
}

Happy to send the PR with the regression test above if the approach looks good.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions