Improve deployment documentation and fix configuration inconsistencies #31

@drunksu

Acknowledgments

First, thank you for this excellent work! UGround achieves impressive accuracy on visual grounding tasks and the model quality is outstanding. The approach of fine-tuning Qwen2-VL for coordinate prediction works really well in practice.

Issues encountered during deployment

However, I encountered several configuration and documentation issues that made deployment challenging, especially for users with limited experience:

1. Missing dependencies

  • pyairports is not listed in the requirements but is required by vLLM 0.6.1
  • The flash-attn installation instructions are incomplete, which leads to performance warnings at runtime

2. Parameter inconsistencies

  • uground_qwen2vl.py lacks a --max-model-len parameter, which is needed for GPU memory management
  • The vLLM server mode and the local inference mode accept different parameter sets
  • There are no default GPU memory utilization settings for different GPU configurations
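
To make the parameter point concrete, here is a minimal sketch of how both entry points could share one set of engine flags. The helper names `add_shared_engine_args` and `engine_kwargs` are hypothetical and not part of the repository. The flag names mirror vLLM's `--max-model-len` and `--gpu-memory-utilization` options, and the 0.9 default is assumed to match vLLM's own default.

```python
import argparse


def add_shared_engine_args(parser):
    """Register engine flags that every inference script could share (hypothetical helper)."""
    parser.add_argument(
        "--max-model-len",
        type=int,
        default=None,
        help="Maximum context length; lower it to reduce KV-cache memory.",
    )
    parser.add_argument(
        "--gpu-memory-utilization",
        type=float,
        default=0.9,  # assumed to match vLLM's default
        help="Fraction of GPU memory the engine may use (0 < value <= 1).",
    )
    return parser


def engine_kwargs(args):
    """Turn parsed flags into keyword arguments for the engine constructor."""
    kwargs = {"gpu_memory_utilization": args.gpu_memory_utilization}
    # Only forward max_model_len when the user set it, so the model's
    # own default applies otherwise.
    if args.max_model_len is not None:
        kwargs["max_model_len"] = args.max_model_len
    return kwargs


if __name__ == "__main__":
    parser = add_shared_engine_args(argparse.ArgumentParser())
    print(engine_kwargs(parser.parse_args()))
```

The resulting dictionary could then be passed to the engine in local mode, while the server mode would receive the same two flags on its command line, so users only have to learn one set of names.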

3. Documentation gaps

  • No mention of GPU memory requirements (the 7B version needs ~16 GB)
  • No troubleshooting guide for common CUDA/memory errors
  • Path handling edge cases are not documented (e.g., relative vs. absolute paths)
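
As a rough illustration of where a memory figure like this comes from, the weights-only footprint can be estimated from the parameter count. This is a back-of-the-envelope sketch: the nominal 7e9 parameter count and the 2 bytes per parameter (16-bit weights) are assumptions, and the real checkpoint may differ.

```python
def weight_memory_gib(num_params, bytes_per_param=2):
    """Estimate the memory needed for model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Assumed nominal size: 7e9 parameters stored as 16-bit values.
weights_gib = weight_memory_gib(7e9)
print(f"weights only: {weights_gib:.1f} GiB")
```

Under these assumptions the weights alone take roughly 13 GiB. The KV cache, activations, and CUDA context come on top of that, which is why the documentation should state a total requirement rather than leave users to discover it through out-of-memory errors.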

Suggested improvements

  1. Add complete requirements.txt with version constraints
  2. Standardize parameters across inference scripts
  3. Include deployment examples for different hardware setups (single/multi-GPU)
  4. Add validation and better error messages for common issues
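
For the validation point, a minimal sketch of the kind of checks I have in mind is below. The function names `resolve_image_path` and `check_gpu_memory_utilization` are hypothetical; they only illustrate normalizing relative paths and failing early with a readable message.

```python
from pathlib import Path


def resolve_image_path(raw_path, base_dir=None):
    """Return an absolute path to an existing image file, or raise a clear error."""
    path = Path(raw_path).expanduser()
    if not path.is_absolute():
        # Relative paths are interpreted against base_dir, or the
        # current working directory when no base_dir is given.
        base = Path(base_dir) if base_dir is not None else Path.cwd()
        path = base / path
    path = path.resolve()
    if not path.is_file():
        raise FileNotFoundError(
            f"Image not found: {path} (resolved from input {raw_path!r})"
        )
    return path


def check_gpu_memory_utilization(value):
    """Reject utilization fractions outside the range (0, 1]."""
    if not 0.0 < value <= 1.0:
        raise ValueError(
            f"GPU memory utilization must be in (0, 1], got {value}"
        )
    return value
```

Running checks like these before the model is loaded would surface a bad path or setting in seconds instead of after a long model initialization.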

This would make the excellent UGround model much more accessible to the community!
