[v0] remove APHRODITE_USE_V1 from platform and v1#1590
Conversation
Signed-off-by: AlpinDale <alpindale@gmail.com>
There was a problem hiding this comment.
Code Review
This pull request focuses on removing the APHRODITE_USE_V1 environment variable, effectively making the v1 engine the default and only path. The changes primarily involve removing conditional logic based on this variable and updating log messages across various platform-specific files. The refactoring is generally well-executed, but I've identified a couple of minor issues. One is a potential change in logging behavior due to a missing scope parameter, and the other is a leftover reference to the use_v1 variable in an error message. Addressing these points will help ensure the cleanup is complete and consistent.
| logger.info_once( | ||
| "Using FlexAttention backend for %s.", | ||
| ", ".join(f"{k}={v}" for k, v in use_flex_attention_reason.items()), | ||
| ) |
There was a problem hiding this comment.
The scope="global" parameter has been removed from this logger.info_once call. This changes the logging behavior from logging once per process to once per worker (the default), which can lead to duplicated log messages in a multi-worker environment. This appears to be an oversight, as other similar logging calls in this file retained scope="global".
| logger.info_once( | |
| "Using FlexAttention backend for %s.", | |
| ", ".join(f"{k}={v}" for k, v in use_flex_attention_reason.items()), | |
| ) | |
| logger.info_once( | |
| "Using FlexAttention backend for %s.", | |
| ", ".join(f"{k}={v}" for k, v in use_flex_attention_reason.items()), | |
| scope="global", | |
| ) |
| return FLASH_ATTN | ||
| elif selected_backend: | ||
| raise ValueError( | ||
| f"Invalid attention backend for {cls.device_name}, with use_v1: {use_v1} use_mla: {use_mla}" |
There was a problem hiding this comment.
The use_v1 variable is referenced in this error message, but it's being removed as part of this refactoring. This reference should be removed to avoid confusion and make the code cleaner, as the concept of v1 is being deprecated.
| f"Invalid attention backend for {cls.device_name}, with use_v1: {use_v1} use_mla: {use_mla}" | |
| f"Invalid attention backend for {cls.device_name}, with use_mla: {use_mla}" |
No description provided.