Skip to content

Build tent#2089

Open
Dao007forever wants to merge 2 commits into
kvcache-ai:mainfrom
Dao007forever:build_tent
Open

Build tent#2089
Dao007forever wants to merge 2 commits into
kvcache-ai:mainfrom
Dao007forever:build_tent

Conversation

@Dao007forever
Copy link
Copy Markdown

Description

Module

  • Transfer Engine (mooncake-transfer-engine)
  • Mooncake Store (mooncake-store)
  • Mooncake EP (mooncake-ep)
  • Integration (mooncake-integration)
  • P2P Store (mooncake-p2p-store)
  • Python Wheel (mooncake-wheel)
  • PyTorch Backend (mooncake-pg)
  • Mooncake RL (mooncake-rl)
  • CI/CD
  • Docs
  • Other

Type of Change

  • Bug fix
  • New feature
  • Refactor
  • Breaking change
  • Documentation update
  • Other

How Has This Been Tested?

Checklist

  • I have performed a self-review of my own code.
  • I have formatted my own code using ./scripts/code_format.sh before submitting.
  • I have updated the documentation.
  • I have added tests to prove my changes are effective.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the build system and CUDA transport logic by introducing a local CUDA build script, making unit test compilation optional, and improving the NVLink transport layer with CUDA device context management and VMM allocation support. Feedback from the review focuses on improving script portability by removing hardcoded user paths, addressing a security vulnerability in LD_LIBRARY_PATH construction, and ensuring robust error handling for CUDA driver API calls.

CUresult retain_result =
cuMemRetainAllocationHandle(&generic_handle, (void*)desc.addr);
if (retain_result == CUDA_SUCCESS) {
cuMemRelease(generic_handle);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The return value of cuMemRelease is ignored. While this is a cleanup step, it is good practice to check the CUresult to ensure that the allocation handle is correctly released and to detect potential issues in the CUDA driver state.

SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" >/dev/null 2>&1 && pwd)"
REPO_ROOT="$(cd -- "${SCRIPT_DIR}/.." >/dev/null 2>&1 && pwd)"

DEPS_PREFIX="${MOONCAKE_DEPS:-/home/inf-daole/.local/mooncake-deps}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The default value for DEPS_PREFIX contains a hardcoded path specific to a user's home directory (/home/inf-daole). This makes the script less portable and may cause build failures for other developers. Consider using a more generic default or leaving it empty to require explicit configuration.

Comment thread scripts/build_wheel.sh
# ${BUILD_DIR}/ep_pg_staging when the project was built with -DWITH_EP=ON.
BUILD_DIR="${BUILD_DIR:-build}"
echo "Building wheel for Python ${PYTHON_VERSION} with output directory ${OUTPUT_DIR}"
DEPS_PREFIX="${MOONCAKE_DEPS:-/home/inf-daole/.local/mooncake-deps}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The DEPS_PREFIX variable contains a hardcoded user-specific path (/home/inf-daole). Please replace this with a more generic default to ensure the script works across different environments.

Comment thread scripts/build_wheel.sh

# Ensure LD_LIBRARY_PATH includes /usr/local/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/$(pwd)/build/mooncake-common:/usr/local/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-}:$(pwd)/${BUILD_DIR}/mooncake-common:${DEPS_PREFIX}/lib:/usr/local/lib
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

When LD_LIBRARY_PATH is empty or unset, the current construction results in a leading colon (e.g., :path/to/lib). In Linux, a leading or trailing colon in LD_LIBRARY_PATH is equivalent to including the current directory (.) in the search path. This can be a security risk if the script is executed in a directory containing untrusted shared objects.

Suggested change
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-}:$(pwd)/${BUILD_DIR}/mooncake-common:${DEPS_PREFIX}/lib:/usr/local/lib
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}${LD_LIBRARY_PATH:+:}$(pwd)/${BUILD_DIR}/mooncake-common:${DEPS_PREFIX}/lib:/usr/local/lib"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant