Skip to content

Timeseries split quadratic performance for many small chunks #1634

@jonenst

Description

@jonenst
  • Do you want to request a feature or report a bug?
    bug

  • What is the current behavior?
    When "splitting" (more like merging in this case) many small chunks into one big chunk, we have quadratic O(N^2) performance because we copy the growing chunk to a new chunk over and over.

  • If the current behavior is a bug, please provide the steps to reproduce and if possible a minimal demo of the problem

    @Test
    public void splitManyMultiChunkTimeSeriesTest() {
        TimeSeriesIndex index = Mockito.mock(TimeSeriesIndex.class);
        int tsSize = 200000;
        Mockito.when(index.getPointCount()).thenReturn(tsSize);
        TimeSeriesMetadata metadata = new TimeSeriesMetadata("ts1", TimeSeriesDataType.DOUBLE, Collections.emptyMap(),
                index);
        DoubleDataChunk[] chunks = new DoubleDataChunk[tsSize];
        for (int i = 0; i < chunks.length; i++) {
            chunks[i] = new UncompressedDoubleDataChunk(i, new double[] { i });
        }
        StoredDoubleTimeSeries timeSeries = new StoredDoubleTimeSeries(metadata, chunks);
        List<List<DoubleTimeSeries>> split = TimeSeries.split(Collections.singletonList(timeSeries), tsSize);

    }

with tsSize = 100000, this codes takes 4s to run.
with tsSize = 200000 (x2), this codes takes 15s(x4) to run.

  • What is the expected behavior?
    O(N) performance

  • What is the motivation / use case for changing the behavior?
    quadratic performance often "breaks" stuff because at some point, if the code takes to long to run (like hours or days), in practical terms it's just the same as if it was stuck in an infinite loop.

  • Please tell us about your environment:

    • PowSyBl Version: 4.1.0-SNAPSHOT

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions