Add support for sharing an ORT session #248
Open
quic-suppugun wants to merge 1 commit into
Currently, a new ORT session is created for every instance in a
model instance group. This change adds support for sharing one
session per instance group.
Sharing is enabled by setting
'share_session_between_instances' to "true"
in the "parameters" section of the Triton model config. Example:
parameters [
.....
{
key: "share_session_between_instances"
value: {string_value: "true"}
}
]
This is a global parameter and cannot be defined per instance
group. The user should determine if the parameter makes sense for
their setup.
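As a rough illustration of how a backend could interpret this parameter, here is a minimal sketch. It is not the PR's code: the helper name `ShareSessionEnabled`, the plain key-to-string map standing in for the parsed "parameters" section, the default of false when the key is absent, and the case-insensitive comparison are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper (not from the PR). The model config "parameters"
// section is modeled here as a simple key -> string_value map.
inline bool ShareSessionEnabled(
    const std::map<std::string, std::string>& params) {
  const auto it = params.find("share_session_between_instances");
  if (it == params.end()) {
    // Assumed default: sharing is off, so each instance gets its own session.
    return false;
  }
  // Assumption: accept "true" in any letter case; anything else means false.
  std::string value = it->second;
  std::transform(value.begin(), value.end(), value.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return value == "true";
}
```

Because the flag is read once from the model-level parameters, the same answer applies to every instance group of the model, which matches the "global parameter" behavior described above.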
When the log-info option of tritonserver is set to "1",
the logs indicate that a session is created and mapped to the
instance group when the first instance is initialized, and that it is
reused for the remaining instances.
Example:
TRITONBACKEND_ModelInstanceInitialize: <model>_0_1 (CPU device 0)
TRITONBACKEND_ModelInstanceInitialize: <model>_0_0 (CPU device 0)
Could not find a session corresponding to instance group: <model>_0
Created session for instance: <model>_0_1
Mapped session for instance group: <model>_0
Reusing session for instance: <model>_0_0
Change-Id: I6dc509b9c2451e3dd14d45f6f150b37f50b5db89
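The create-once, reuse-afterwards behavior shown in the log excerpt can be sketched as a small registry keyed by instance group. This is only an illustration under stated assumptions, not the PR's implementation: `FakeSession` stands in for a real ORT session, `SessionRegistry` is a hypothetical class, and the group name is assumed to be the instance name with its trailing `_<index>` removed, as the log lines suggest (`<model>_0_1` belongs to group `<model>_0`).

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for an ORT session object.
struct FakeSession {
  std::string created_for;  // Name of the instance that triggered creation.
};

// Hypothetical registry that hands out one shared session per instance group.
class SessionRegistry {
 public:
  // Assumption: the group name is the instance name without its last "_<index>".
  static std::string GroupName(const std::string& instance_name) {
    const std::size_t pos = instance_name.rfind('_');
    return pos == std::string::npos ? instance_name
                                    : instance_name.substr(0, pos);
  }

  // Returns the group's session, creating it for the first instance that asks.
  std::shared_ptr<FakeSession> GetOrCreate(const std::string& instance_name) {
    const std::string group = GroupName(instance_name);
    // Instances may initialize concurrently, so guard the map with a mutex.
    std::lock_guard<std::mutex> lock(mu_);
    const auto it = sessions_.find(group);
    if (it != sessions_.end()) {
      return it->second;  // Reuse the session mapped to this group.
    }
    auto session = std::make_shared<FakeSession>();
    session->created_for = instance_name;
    sessions_.emplace(group, session);  // Map the session to the group.
    return session;
  }

  // Number of distinct sessions currently mapped.
  std::size_t SessionCount() const {
    std::lock_guard<std::mutex> lock(mu_);
    return sessions_.size();
  }

 private:
  mutable std::mutex mu_;
  std::map<std::string, std::shared_ptr<FakeSession>> sessions_;
};
```

With this sketch, calling `GetOrCreate("model_0_1")` and then `GetOrCreate("model_0_0")` returns the same pointer, while an instance from a different group such as `"model_1_0"` gets its own session. Holding the session in a `std::shared_ptr` keeps it alive until the last instance of the group releases it.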
I have compiled two images based on this PR for easy use. They are:
The first image only replaces the ONNX backend while keeping everything else unchanged. The second image provides a smaller CPU version.
Hey! I was going to work on resolving the same issue with session sharing and noticed that this PR already exists, so thanks. Is there a reluctance to do this that I'm missing?
This is great, but it needs to be set together with the enable_mem_pattern parameter to solve the performance issue.