Commit 524f876

feat(Pipeline): support negative coordinates and sizes in the roi and target fields (#1093)
1 parent 4311082 commit 524f876

5 files changed

Lines changed: 185 additions & 12 deletions

docs/en_us/3.1-PipelineProtocol.md

Lines changed: 6 additions & 4 deletions
@@ -426,9 +426,11 @@ Template matching, also known as "find image."
 
 This algorithm property requires additional fields:
 
-- `roi`: *array<int, 4>* | *string*
+- `roi`: *array<int, 4>* | *array<int, 2>* | *string*
     Recognition area coordinates. Optional, default [0, 0, 0, 0], i.e. full screen.
-    - *array<int, 4>*: Recognition area coordinates, [x, y, w, h], if you want full screen, you can set it to [0, 0, 0, 0].
+    - *array<int, 4>*: Recognition area coordinates, [x, y, w, h], if you want full screen, you can set it to [0, 0, 0, 0].
+      Supports negative values: negative x or y means calculating from the right or bottom edge of the image; w or h of 0 means extending to the edge, negative means taking absolute value and treating (x, y) as the bottom-right corner instead of top-left. For example, `[-100, -100, 0, 0]` represents the area from the bottom-right corner 100 pixels inward to the edge; `[200, 200, -100, -50]` represents a 100x50 area with (200, 200) as bottom-right corner, i.e., `[100, 150, 100, 50]`.
+    - *array<int, 2>*: Fixed coordinate point `[x, y]`. Supports negative values with the same meaning as above.
     - *string*: Fill in the node name, and identify within the target range identified by a previously executed node.
 
 - `roi_offset`: *array<int, 4>*
@@ -801,8 +803,8 @@ Additional properties for this action:
     The position of the click target. Optional, default is true.
     - *true*: The target is the position just recognized in this node (i.e., itself).
     - *string*: Enter the node name, as the target, to use the position recognized by a previously executed node.
-    - *array<int, 2>*: Fixed coordinate point `[x, y]`.
-    - *array<int, 4>*: Fixed coordinate area `[x, y, w, h]`. A point is sampled inside the rectangle with higher probability near the center and lower probability near the edges. To target the entire screen, set it to [0, 0, 0, 0].
+    - *array<int, 2>*: Fixed coordinate point `[x, y]`. Supports negative values, meaning calculating from the right or bottom edge of the image. For example, `[-100, -100]` represents a position 100 pixels from the bottom-right corner.
+    - *array<int, 4>*: Fixed coordinate area `[x, y, w, h]`. A point is sampled inside the rectangle with higher probability near the center and lower probability near the edges. To target the entire screen, set it to [0, 0, 0, 0]. Supports negative values: negative x or y means calculating from the right or bottom edge of the image; w or h of 0 means extending to the edge, negative means taking absolute value and treating (x, y) as the bottom-right corner.
 
 - `target_offset`: *array<int, 4>*
     Additional movement from the `target` before clicking, where the four values are added together. Optional, default is [0, 0, 0, 0].
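
As a worked check of the two examples documented for `roi`, the arithmetic can be spelled out in Python. The 1280x720 frame size is an assumption chosen only for illustration; the second example does not depend on it.

```python
# Assumed frame size, for illustration only.
W, H = 1280, 720

# Example 1: roi = [-100, -100, 0, 0]
x, y = W - 100, H - 100   # negative x/y count back from the right/bottom edge
w, h = W - x, H - y       # zero w/h extend to the edge
example_1 = [x, y, w, h]

# Example 2: roi = [200, 200, -100, -50]
w, h = 100, 50            # absolute values of the negative sizes
x, y = 200 - w, 200 - h   # (200, 200) becomes the bottom-right corner
example_2 = [x, y, w, h]

print(example_1)  # [1180, 620, 100, 100]
print(example_2)  # [100, 150, 100, 50]
```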

docs/zh_cn/3.1-任务流水线协议.md

Lines changed: 6 additions & 4 deletions
@@ -431,9 +431,11 @@ MaaResourcePostPath(resource, "resource/debug"); // debug node uses rate_lim
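 
 This algorithm property requires additional fields:
 
-- `roi`: *array<int, 4>* | *string*
+- `roi`: *array<int, 4>* | *array<int, 2>* | *string*
     Recognition area coordinates. Optional, default [0, 0, 0, 0], i.e. full screen.
-    - *array<int, 4>*: Recognition area coordinates, [x, y, w, h]; for full screen, set it to [0, 0, 0, 0].
+    - *array<int, 4>*: Recognition area coordinates, [x, y, w, h]; for full screen, set it to [0, 0, 0, 0].
+      Supports negative values: a negative x or y is counted backward from the right or bottom edge of the image; a w or h of 0 extends to the edge, and a negative w or h is replaced by its absolute value with (x, y) treated as the bottom-right corner instead of the top-left. For example, `[-100, -100, 0, 0]` is the range from the position 100x100 inside the bottom-right corner to the edge; `[200, 200, -100, -50]` is a 100x50 area with (200, 200) as its bottom-right corner, i.e. `[100, 150, 100, 50]`.
+    - *array<int, 2>*: Fixed coordinate point `[x, y]`. Supports negative values, with the same meaning as above.
     - *string*: Fill in a node name to recognize within the target range recognized by a previously executed node.
 
 - `roi_offset`: *array<int, 4>*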
@@ -809,8 +811,8 @@ For Pipeline v2, just put these fields into `recognition.param`.
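     The position of the click target. Optional, default true.
     - *true*: The target is the position just recognized in this node (i.e. itself).
     - *string*: Fill in a node name; the target is the position recognized by a previously executed node.
-    - *array<int, 2>*: Fixed coordinate point `[x, y]`
-    - *array<int, 4>*: Fixed coordinate area `[x, y, w, h]`; a point is picked at random inside the rectangle (the closer to the center, the higher the probability; relatively lower near the edges). For full screen, set it to [0, 0, 0, 0]
+    - *array<int, 2>*: Fixed coordinate point `[x, y]`. Supports negative values, counted backward from the right or bottom edge of the image. For example, `[-100, -100]` is the position 100 pixels from the edges at the bottom-right corner.
+    - *array<int, 4>*: Fixed coordinate area `[x, y, w, h]`; a point is picked at random inside the rectangle (the closer to the center, the higher the probability; relatively lower near the edges). For full screen, set it to [0, 0, 0, 0]. Supports negative values: a negative x or y is counted backward from the right or bottom edge of the image; a w or h of 0 extends to the edge, and a negative w or h is replaced by its absolute value with (x, y) treated as the bottom-right corner.
 
 - `target_offset`: *array<int, 4>*
     Moves the click target by an additional offset from `target`; the four values are added respectively. Optional, default [0, 0, 0, 0]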

source/MaaFramework/Task/Component/Actuator.cpp

Lines changed: 6 additions & 0 deletions
@@ -6,6 +6,7 @@
 #include "MaaUtils/JsonExt.hpp"
 #include "MaaUtils/Logger.h"
 #include "Vision/TemplateComparator.h"
+#include "Vision/VisionUtils.hpp"
 
 MAA_TASK_NS_BEGIN
 
@@ -725,6 +726,11 @@ cv::Rect Actuator::get_target_rect(const MAA_RES_NS::Action::Target target, cons
 
     auto image = controller()->cached_image();
 
+    // The Region type supports negative coordinates and sizes
+    if (target.type == Target::Type::Region) {
+        raw = MAA_VISION_NS::normalize_rect(raw, image.cols, image.rows);
+    }
+
     int x = std::clamp(raw.x + target.offset.x, 0, image.cols);
     int y = std::clamp(raw.y + target.offset.y, 0, image.rows);
     int width = std::clamp(raw.width + target.offset.width, 0, image.cols - x);
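
The order of operations in this hunk (normalize first, then apply the offset and clamp) can be sketched in Python as follows. `clamp_target` is a hypothetical name, the rectangle passed in is assumed to be already normalized, and the height clamp is assumed to mirror the width clamp because that line lies outside the hunk.

```python
def clamp(value, low, high):
    """Equivalent of std::clamp for low <= high."""
    return max(low, min(value, high))


def clamp_target(rect, offset, cols, rows):
    """Apply an offset to an already-normalized rect and clamp it to the image.

    rect and offset are (x, y, w, h) tuples; cols and rows are the image size.
    """
    x = clamp(rect[0] + offset[0], 0, cols)
    y = clamp(rect[1] + offset[1], 0, rows)
    width = clamp(rect[2] + offset[2], 0, cols - x)
    # Assumption: the height line is analogous to the width line shown in the diff.
    height = clamp(rect[3] + offset[3], 0, rows - y)
    return (x, y, width, height)


# [-100, -100, 0, 0] on an assumed 1280x720 frame normalizes to (1180, 620, 100, 100).
print(clamp_target((1180, 620, 100, 100), (50, 50, 50, 50), 1280, 720))
# (1230, 670, 50, 50)
```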

source/MaaFramework/Vision/VisionUtils.hpp

Lines changed: 37 additions & 4 deletions
@@ -295,17 +295,50 @@ inline static std::vector<float> image_to_tensor(const cv::Mat& image)
     return tensor;
 }
 
+// Convert a rectangle that may contain negative values into a standard rectangle:
+// - negative x/y is counted backward from the right/bottom edge
+// - negative w/h means taking the absolute value and treating (x, y) as the bottom-right corner
+// - w/h of 0 means extending to the edge
+inline cv::Rect normalize_rect(const cv::Rect& rect, int image_width, int image_height)
+{
+    cv::Rect res = rect;
+
+    if (res.x < 0) {
+        res.x = image_width + res.x;
+    }
+    if (res.y < 0) {
+        res.y = image_height + res.y;
+    }
+
+    if (res.width < 0) {
+        res.width = -res.width;
+        res.x -= res.width;
+    }
+    if (res.height < 0) {
+        res.height = -res.height;
+        res.y -= res.height;
+    }
+
+    if (res.width == 0) {
+        res.width = image_width - res.x;
+    }
+    if (res.height == 0) {
+        res.height = image_height - res.y;
+    }
+
+    return res;
+}
+
 inline cv::Rect correct_roi(const cv::Rect& roi, const cv::Mat& image)
 {
     if (image.empty()) {
         LogError << "image is empty" << VAR(image.size());
         return roi;
     }
-    if (roi.empty()) {
-        return { 0, 0, image.cols, image.rows };
-    }
 
-    cv::Rect res = roi;
+    cv::Rect res = normalize_rect(roi, image.cols, image.rows);
+
+    // Bounds checking and correction
     if (image.cols < res.x) {
         LogError << "roi is out of range" << VAR(image.size()) << VAR(res);
         res.x = image.cols - res.width;
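
One consequence of this hunk is worth spelling out. The removed `roi.empty()` branch returned the full image for any rectangle whose width or height was not positive (that is how `cv::Rect::empty()` behaves), whereas `normalize_rect` only extends a zero size from the given origin. The Python port below is a sketch for comparison, not the framework's code; `old_correct_roi_shortcut` is a hypothetical name for the removed branch, and the 1280x720 frame is an assumption.

```python
def normalize_rect(rect, image_width, image_height):
    """Python port of the C++ normalize_rect added in this commit."""
    x, y, w, h = rect
    if x < 0:
        x += image_width
    if y < 0:
        y += image_height
    if w < 0:
        w = -w
        x -= w
    if h < 0:
        h = -h
        y -= h
    if w == 0:
        w = image_width - x
    if h == 0:
        h = image_height - y
    return (x, y, w, h)


def old_correct_roi_shortcut(rect, image_width, image_height):
    """Hypothetical stand-in for the removed branch: empty rect -> full image."""
    x, y, w, h = rect
    if w <= 0 or h <= 0:  # same condition as cv::Rect::empty()
        return (0, 0, image_width, image_height)
    return rect


W, H = 1280, 720
print(normalize_rect((0, 0, 0, 0), W, H))                # (0, 0, 1280, 720)
print(old_correct_roi_shortcut((100, 100, 0, 0), W, H))  # (0, 0, 1280, 720)
print(normalize_rect((100, 100, 0, 0), W, H))            # (100, 100, 1180, 620)
```

Under these assumptions `[0, 0, 0, 0]` still means full screen, but `[100, 100, 0, 0]` now selects the area from (100, 100) to the bottom-right edge rather than the whole frame.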

test/python/pipeline_test.py

Lines changed: 130 additions & 0 deletions
@@ -935,6 +935,135 @@ def test_repeat_params(context: Context):
     print(" PASS: repeat params")
 
 
+# ============================================================================
+# Negative roi and target parameter tests
+# ============================================================================
+
+
+def test_negative_roi_and_target(context: Context):
+    """Test parsing of negative roi and target parameters"""
+    print("\n=== test_negative_roi_and_target ===")
+
+    new_ctx = context.clone()
+
+    # Test negative roi coordinates (counted backward from the edge)
+    new_ctx.override_pipeline(
+        {
+            "NegativeRoiCoord": {
+                "recognition": "TemplateMatch",
+                "template": ["test.png"],
+                "roi": [-100, -100, 50, 50],
+            }
+        }
+    )
+    obj = new_ctx.get_node_object("NegativeRoiCoord")
+    assert_eq(obj.recognition.param.roi, [-100, -100, 50, 50], "negative roi coords")
+
+    # Test negative roi width/height ((x, y) as the bottom-right corner)
+    new_ctx.override_pipeline(
+        {
+            "NegativeRoiSize": {
+                "recognition": "TemplateMatch",
+                "template": ["test.png"],
+                "roi": [200, 200, -100, -50],
+            }
+        }
+    )
+    obj = new_ctx.get_node_object("NegativeRoiSize")
+    assert_eq(obj.recognition.param.roi, [200, 200, -100, -50], "negative roi size")
+
+    # Test roi width/height of 0 (extends to the edge)
+    new_ctx.override_pipeline(
+        {
+            "ZeroRoiSize": {
+                "recognition": "TemplateMatch",
+                "template": ["test.png"],
+                "roi": [100, 100, 0, 0],
+            }
+        }
+    )
+    obj = new_ctx.get_node_object("ZeroRoiSize")
+    assert_eq(obj.recognition.param.roi, [100, 100, 0, 0], "zero roi size")
+
+    # Test combination: negative coordinates + zero width/height (bottom-right corner to the edge)
+    new_ctx.override_pipeline(
+        {
+            "NegativeRoiCombo": {
+                "recognition": "TemplateMatch",
+                "template": ["test.png"],
+                "roi": [-100, -100, 0, 0],
+            }
+        }
+    )
+    obj = new_ctx.get_node_object("NegativeRoiCombo")
+    assert_eq(obj.recognition.param.roi, [-100, -100, 0, 0], "negative roi combo")
+
+    # Test negative coordinates in a 2-element array
+    new_ctx.override_pipeline(
+        {
+            "NegativeRoi2Elem": {
+                "recognition": "TemplateMatch",
+                "template": ["test.png"],
+                "roi": [-50, -50],
+            }
+        }
+    )
+    obj = new_ctx.get_node_object("NegativeRoi2Elem")
+    assert_eq(obj.recognition.param.roi, [-50, -50, 0, 0], "negative roi 2-element")
+
+    # Test negative target coordinates
+    new_ctx.override_pipeline(
+        {
+            "NegativeTargetCoord": {
+                "action": "Click",
+                "target": [-100, -100, 50, 50],
+            }
+        }
+    )
+    obj = new_ctx.get_node_object("NegativeTargetCoord")
+    assert_eq(obj.action.param.target, [-100, -100, 50, 50], "negative target coords")
+
+    # Test negative target width/height
+    new_ctx.override_pipeline(
+        {
+            "NegativeTargetSize": {
+                "action": "Click",
+                "target": [300, 300, -100, -100],
+            }
+        }
+    )
+    obj = new_ctx.get_node_object("NegativeTargetSize")
+    assert_eq(obj.action.param.target, [300, 300, -100, -100], "negative target size")
+
+    # Test negative target in a 2-element array
+    new_ctx.override_pipeline(
+        {
+            "NegativeTarget2Elem": {
+                "action": "Click",
+                "target": [-50, -50],
+            }
+        }
+    )
+    obj = new_ctx.get_node_object("NegativeTarget2Elem")
+    assert_eq(obj.action.param.target, [-50, -50, 0, 0], "negative target 2-element")
+
+    # Test negative coordinates for Swipe
+    new_ctx.override_pipeline(
+        {
+            "NegativeSwipe": {
+                "action": "Swipe",
+                "begin": [-100, -100, 50, 50],
+                "end": [-50, -50],
+            }
+        }
+    )
+    obj = new_ctx.get_node_object("NegativeSwipe")
+    assert_eq(obj.action.param.begin, [-100, -100, 50, 50], "negative swipe begin")
+    assert_eq(obj.action.param.end[0], [-50, -50, 0, 0], "negative swipe end")
+
+    print(" PASS: negative roi and target")
+
+
 # ============================================================================
 # Main test flow
 # ============================================================================
@@ -1033,6 +1162,7 @@ class AdditionalTestReco(CustomRecognition):
     def analyze(self, context, argv):
         test_wait_freezes(context)
         test_repeat_params(context)
+        test_negative_roi_and_target(context)
         return CustomRecognition.AnalyzeResult(
             box=(0, 0, 10, 10), detail="done"
         )
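
The assertions in this test only check how values are parsed and read back, not the rectangle they resolve to at run time. They suggest that a two-element array is reported as a four-element rectangle with zero width and height. A minimal sketch of that padding, where `pad_to_rect` is a hypothetical helper and not part of the MaaFramework Python binding:

```python
def pad_to_rect(value):
    """Return [x, y, w, h]; a 2-element point gets w = h = 0 (hypothetical helper)."""
    if len(value) == 2:
        return [value[0], value[1], 0, 0]
    if len(value) == 4:
        return list(value)
    raise ValueError("expected 2 or 4 integers")


print(pad_to_rect([-50, -50]))            # [-50, -50, 0, 0]
print(pad_to_rect([-100, -100, 50, 50]))  # [-100, -100, 50, 50]
```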
