Hello. mobilecli can now take screenshots and obtain the view hierarchy, and both can be provided to an LLM. The LLM can then use this data to operate the simulator through WebDriverAgent for automated testing. While using this, I ran into a problem: for apps with complex pages, the view hierarchy consumes a lot of context. Although the data is already filtered, what remains is still substantial. Can this be optimized? I have thought of a possible solution:
- Look up specific views by name or other attributes. (Should this step return only the basic information and omit extra fields such as position, to reduce context usage?)
- If there are duplicates, the LLM can tell them apart based on position or other information.
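
To make the idea concrete, here is a rough sketch of the lookup in Python. It is only an illustration, not how mobilecli works today: the element shape (`type`, `name`, `label`, `rect`, `children`) and the function `find_elements` are hypothetical names I made up, and the real dump may use different fields.

```python
from typing import Any, Dict, Iterator, List

# Hypothetical element shape; the real view hierarchy may use other field names.
Element = Dict[str, Any]


def walk(node: Element) -> Iterator[Element]:
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.get("children") or []:
        yield from walk(child)


def find_elements(root: Element, query: str) -> List[Element]:
    """Return compact records for elements whose name or label contains query."""
    needle = query.lower()
    matches = [
        el for el in walk(root)
        if needle in str(el.get("name") or "").lower()
        or needle in str(el.get("label") or "").lower()
    ]
    # Position is only included when it is needed to tell duplicates apart.
    include_rect = len(matches) > 1
    results = []
    for el in matches:
        record = {
            "type": el.get("type"),
            "name": el.get("name"),
            "label": el.get("label"),
        }
        if include_rect:
            record["rect"] = el.get("rect")
        results.append(record)
    return results


# Made-up sample tree, only for demonstration.
sample = {
    "type": "Window",
    "name": "root",
    "children": [
        {"type": "Button", "name": "Login", "label": "Log in",
         "rect": {"x": 10, "y": 20, "width": 100, "height": 40}},
        {"type": "Button", "name": "Save", "label": "Save",
         "rect": {"x": 10, "y": 80, "width": 100, "height": 40}},
        {"type": "Button", "name": "Save", "label": "Save",
         "rect": {"x": 10, "y": 140, "width": 100, "height": 40}},
    ],
}

unique = find_elements(sample, "login")   # one match, no rect returned
duplicates = find_elements(sample, "save")  # two matches, rect included
```

With something like this, a unique match costs only three short fields, and the position is sent only in the duplicate case where the LLM actually needs it.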
Would an optimization along these lines be possible? Looking forward to your reply, thank you.