I recently spent some time head-scratching over a strange observation. A simple neutral SLiM model that just forward-simulated 10,000 individuals with tree-sequence recording, but without doing any simplification at all, did not exhibit the memory-usage dynamics I expected. I thought the memory footprint of the process would grow linearly with the number of generations simulated, since the tree-sequence tables would just grow and grow. Instead, the memory footprint displayed a sawtooth pattern: the memory usage would grow linearly for some number of generations, then suddenly fall by a factor of two or more, and then resume linear growth.
Eventually I realized that this was due to a relatively new feature in the macOS kernel, memory compression. Basically, the kernel observes when a given block of memory has not been accessed for a long time, and compresses it to take less memory. If anybody tries to access the memory block, the kernel decompresses it on demand. This is all invisible to the process, which only ever sees the memory in its decompressed state. It's also quite fast. Rather remarkable.
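To make the mechanism concrete, here is a toy user-space analogue in Python. It is only a sketch of the idea as described above, not how the macOS kernel actually implements it: the class name `LazyCompressedBlock`, the `idle_seconds` threshold, and the use of `zlib` are all assumptions made for illustration.

```python
import time
import zlib


class LazyCompressedBlock:
    """Toy analogue of compress-when-idle memory (hypothetical, illustrative only)."""

    def __init__(self, data: bytes, idle_seconds: float = 1.0):
        self._raw = bytes(data)    # uncompressed contents, or None while compressed
        self._packed = None        # zlib-compressed contents, or None
        self._idle_seconds = idle_seconds
        self._last_access = time.monotonic()

    def read(self) -> bytes:
        """Return the uncompressed bytes, decompressing on demand."""
        if self._raw is None:
            self._raw = zlib.decompress(self._packed)
            self._packed = None
        self._last_access = time.monotonic()
        return self._raw

    def sweep(self, now=None) -> bool:
        """Compress the block if it has been idle long enough; return True if it did."""
        now = time.monotonic() if now is None else now
        if self._raw is not None and now - self._last_access >= self._idle_seconds:
            self._packed = zlib.compress(self._raw)
            self._raw = None
            return True
        return False

    @property
    def is_compressed(self) -> bool:
        return self._packed is not None

    def resident_bytes(self) -> int:
        """Bytes currently held for this block, in whichever form it is stored."""
        return len(self._packed) if self._raw is None else len(self._raw)
```

The caller of `read()` only ever sees the uncompressed bytes, which mirrors the point that the compression is invisible to the process. A periodic call to `sweep()` plays the role of the kernel noticing idle memory.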
Once I understood that this was what was causing the sawtooth memory usage pattern, it struck me that the effectiveness of this memory compression scheme in reducing the memory usage of the tree sequence was actually pretty interesting. The kernel was apparently compressing the tree sequence by as much as 10x, with very little performance cost! It made me wonder: could tskit do the same sort of thing under the hood? Compress particular buffers in the tree sequence behind the scenes, and decompress them when they need to be accessed? If it resulted in a 10x reduction in memory and disk footprint, that would be pretty significant, right? And it might not actually be very hard to implement. Food for thought?
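One cheap way to explore the question would be to measure how well individual columns compress before committing to any design. The sketch below does that for a single synthetic column using Python's standard `zlib` and `array` modules. Everything in it is an assumption for illustration: `compression_ratio` is a hypothetical helper, the data is a made-up stand-in rather than a real tskit table, and the ratio it prints says nothing about what real tree-sequence tables would achieve.

```python
import zlib
from array import array


def compression_ratio(column: array) -> float:
    """Return uncompressed size divided by zlib-compressed size for one column."""
    raw = column.tobytes()
    packed = zlib.compress(raw, level=1)  # low level favours speed over ratio
    # Round-trip check: the compressed form must restore the exact bytes.
    assert zlib.decompress(packed) == raw
    return len(raw) / len(packed)


# Synthetic stand-in for a table column: slowly increasing C ints,
# each value repeated four times (NOT real tree-sequence data).
synthetic_ids = array("i", (i // 4 for i in range(100_000)))
ratio = compression_ratio(synthetic_ids)
print(f"synthetic column: {ratio:.1f}x smaller when compressed")
```

Running the same helper over the actual column buffers of a large unsimplified tree sequence would show whether something like the 10x figure holds outside the kernel's compressor.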