Skip to content

sis_hash_helper more efficient#86

Draft
albertz wants to merge 2 commits into
masterfrom
albert-sis-hash-helper-more-efficient
Draft

sis_hash_helper more efficient#86
albertz wants to merge 2 commits into
masterfrom
albert-sis-hash-helper-more-efficient

Conversation

@albertz
Copy link
Copy Markdown
Member

@albertz albertz commented Jun 20, 2022

This gives me some huge speedup, of a big GMM pipeline, from 12 sec runtime down to 2 sec runtime.

Edit Sorry, wrong numbers...

@albertz
Copy link
Copy Markdown
Member Author

albertz commented Jun 20, 2022

Hm ok I need to do some debugging, this seems to change the hash.

@albertz albertz marked this pull request as draft June 20, 2022 12:39
@JackTemaki
Copy link
Copy Markdown
Contributor

Yes, my setup before the change:
error(7) runnable(5) waiting(649)

My setup after the change:
error(3) runnable(12) waiting(247)

@albertz
Copy link
Copy Markdown
Member Author

albertz commented Jun 20, 2022

Ok, similar fix as in #89. Now for my simple test I get the same hash.

@albertz albertz marked this pull request as ready for review June 20, 2022 12:48
@albertz
Copy link
Copy Markdown
Member Author

albertz commented Jun 20, 2022

However, the speedup is much less now. Need to do some benchmarks. I think it should still be faster than before though.

@albertz
Copy link
Copy Markdown
Member Author

albertz commented Jun 20, 2022

However, the speedup is much less now. Need to do some benchmarks. I think it should still be faster than before though.

Actually, no? It seems slightly slower. Maybe due to higher memory consumption? I need to investigate this a bit more...

@albertz albertz marked this pull request as draft June 20, 2022 12:53
Comment thread sisyphus/hash.py
Comment thread sisyphus/hash.py
Comment thread sisyphus/hash.py
@albertz
Copy link
Copy Markdown
Member Author

albertz commented Jun 21, 2022

However, the speedup is much less now. Need to do some benchmarks. I think it should still be faster than before though.

Actually, no? It seems slightly slower. Maybe due to higher memory consumption? I need to investigate this a bit more...

I'm also thinking about actually do more clever caching. But not exactly sure where. Unfortunately the objects we get here are often dict or list or some other config objects, and they are all mutable, so you cannot really cache the hash, unless you make sure the cache is correctly invalidated, which is not easily possible always (e.g. no idea how you would do that for a dict). Maybe in JobSingleton.__call__ we can do it for the args, because afterwards the args should anyway not change anymore.

@albertz
Copy link
Copy Markdown
Member Author

albertz commented Jun 21, 2022

Also, I think the current change in this PR here does not fully captures all the recursive calls of sis_hash_helper. The recursive calls are often through obj._sis_hash calls.

@albertz albertz force-pushed the albert-sis-hash-helper-more-efficient branch from 4e6fb53 to 2f4cbe0 Compare June 21, 2022 09:01
@critias
Copy link
Copy Markdown
Contributor

critias commented Jun 21, 2022

Also, I think the current change in this PR here does not fully captures all the recursive calls of sis_hash_helper. The recursive calls are often through obj._sis_hash calls.

You could also cache the result of obj._sis_hash

@albertz albertz mentioned this pull request Jun 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants