Investigate use of Compact strings for TextBuffer #910

Description

@yawkat

I was looking at some heap dumps and saw that TextBuffer still uses char[] internally. I think it could instead use an approach similar to the JDK's String and StringBuilder: a byte[] that holds either Latin-1 at one byte per char, or UTF-16 at two bytes per char. This would roughly halve the buffer's memory use whenever the content is Latin-1 only, which covers most standard cases, and the saving matters most when the buffer becomes large.
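To make the idea concrete, here is a minimal sketch of such a dual-encoding buffer. The class `CompactCharBuffer` and all of its methods are hypothetical names made up for illustration; this is not TextBuffer's code and not how the JDK implements it internally, just one way the scheme could look.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only: not Jackson's TextBuffer, not JDK internals.
final class CompactCharBuffer {
    private byte[] data = new byte[16];
    private int length;             // number of chars stored, not bytes
    private boolean latin1 = true;  // true: 1 byte per char, false: UTF-16BE, 2 bytes per char

    void append(char c) {
        if (latin1 && c > 0xFF) {
            inflate(); // first char outside Latin-1 forces the wide representation
        }
        ensureCapacity((length + 1) * (latin1 ? 1 : 2));
        if (latin1) {
            data[length] = (byte) c;
        } else {
            data[2 * length] = (byte) (c >> 8);
            data[2 * length + 1] = (byte) c;
        }
        length++;
    }

    void append(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            append(s.charAt(i));
        }
    }

    // Re-encode the existing Latin-1 bytes as UTF-16BE (high byte is always zero).
    private void inflate() {
        byte[] wide = new byte[data.length * 2];
        for (int i = 0; i < length; i++) {
            wide[2 * i + 1] = data[i];
        }
        data = wide;
        latin1 = false;
    }

    private void ensureCapacity(int minBytes) {
        if (minBytes > data.length) {
            data = Arrays.copyOf(data, Math.max(minBytes, data.length * 2));
        }
    }

    // Bytes currently occupied by content.
    int byteSize() {
        return length * (latin1 ? 1 : 2);
    }

    // Caveat: decoding via a Charset replaces unpaired surrogates with U+FFFD,
    // so a real implementation would need to handle that case differently.
    @Override
    public String toString() {
        return new String(data, 0, byteSize(),
                latin1 ? StandardCharsets.ISO_8859_1 : StandardCharsets.UTF_16BE);
    }
}
```

The cost of this design is the one-time inflation copy when the first non-Latin-1 char arrives, plus a branch on every append; whether that is acceptable for TextBuffer would need measuring.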

It could also slightly improve performance when constructing String instances. Using the String constructor that takes a Charset, it's possible to create a Latin-1 String directly from bytes with a single copy, while going through the char[] constructor needs to run compaction. I don't know if this is relevant though; the compaction is probably already very fast.
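For reference, the two construction paths being compared look roughly like this. The class and method names are hypothetical; only the two `String` constructors are real JDK API. Whether the byte path is measurably faster is an open question and would need a benchmark.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class StringConstructionDemo {
    // Latin-1 bytes in, String out: each byte maps to exactly one char.
    static String fromLatin1Bytes(byte[] buf, int len) {
        return new String(buf, 0, len, StandardCharsets.ISO_8859_1);
    }

    // char[] in, String out: with compact strings enabled, the JDK has to
    // check whether the chars fit in Latin-1 and compress them if they do.
    static String fromChars(char[] buf, int len) {
        return new String(buf, 0, len);
    }
}
```

Both paths yield equal strings for Latin-1 content; the difference is only in how much work happens during construction.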

What do you think @cowtowncoder @pjfanning ?
