
Commit 0b26fd1

dev: fix miku and jetarm-agent, add workspace for examples
1 parent cf905ea commit 0b26fd1

File tree

12 files changed: +275 −73 lines changed


examples/.env.example

Lines changed: 2 additions & 2 deletions
@@ -24,8 +24,8 @@ MOSS_LLM_MODEL="volcengine/doubao-seed-1-6-251015"
 # api key for accessing the model API
 MOSS_LLM_API_KEY=your_api_key_here

-# Whether to enable voice replies.
-USE_VOCIE_SPEECH="yes" # yes or no
+# Whether to enable voice replies. Enabling it requires a Volcengine synthesis app to be configured.
+USE_VOICE_SPEECH="yes" # yes or no

 # Create a Volcengine "streaming LLM TTS synthesis application"
 # and put its app id and access token here.
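The USE_VOICE_SPEECH flag above is read at runtime. A minimal sketch of interpreting it follows; the helper name and the quote-stripping behavior are assumptions for illustration, not the project's actual loader:

```python
# Hypothetical helper: interpret USE_VOICE_SPEECH the way the examples expect.
def voice_speech_enabled(env: dict) -> bool:
    # The value is "yes" or "no" (possibly quoted in the .env file);
    # default to disabled when the variable is unset.
    raw = env.get("USE_VOICE_SPEECH", "no")
    return raw.strip().strip('"').lower() == "yes"

print(voice_speech_enabled({"USE_VOICE_SPEECH": '"yes"'}))  # True
print(voice_speech_enabled({}))  # False
```

Stripping the surrounding quotes makes the check robust whether or not the .env loader keeps them.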

examples/jetarm_demo/README.md

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# jetarm demo
+
+Used to verify control of the Hiwonder 6dof robotic arm.
+
+Prerequisites for running:
+
+1. An actual Hiwonder 6dof jetarm robotic arm.
+2. On the arm's dev board, jetarm_ws is installed and compiled to a runnable state, with the jetarm_channel and jetarm_control nodes started.
+3. The dependencies for examples are installed.
+
+To run:
+
+1. Test with `python connect_pychannel_with_rcply.py --address=<address:port that jetarm_control listens on>` and check whether the arm moves.
+2. Start the agent: `python jetarm_agent.py --address=<address:port that jetarm_control listens on>`
+
+This example does not need special testing. Secondary development on the jetarm itself is fairly hard; reading the sample to see how it works is enough.

examples/jetarm_demo/connect_pychannel_with_rcply.py

Lines changed: 16 additions & 1 deletion
@@ -1,4 +1,5 @@
 import asyncio
+import argparse

 from ghoshell_moss.transports.zmq_channel.zmq_channel import ZMQChannelProxy

@@ -27,9 +28,23 @@ async def main():
     A test script for jetarm: it talks to the jetarm channel by calling the zmq provider through a zmq proxy.
     Then verify that the test script runs.
     """
+    # Build the argument parser
+    parser = argparse.ArgumentParser(description="Run the jetarm test trajectory routine")
+
+    # Add the --address option with a default value
+    parser.add_argument(
+        "--address",
+        type=str,
+        default=JETARM_ADDRESS,
+        help=f"proxy address, default: {JETARM_ADDRESS}",
+    )
+
+    # Parse the command-line arguments
+    args = parser.parse_args()
+
     chan = ZMQChannelProxy(
         name="jetarm",
-        address=JETARM_ADDRESS,
+        address=args.address,
     )

     async with chan.bootstrap() as broker:
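The argparse pattern added above can be exercised standalone. The default value here is a placeholder mirroring the diff, not necessarily the real JETARM_ADDRESS:

```python
import argparse

DEFAULT_ADDRESS = "tcp://192.168.1.15:9527"  # placeholder default for illustration

parser = argparse.ArgumentParser(description="demo of the --address option")
parser.add_argument(
    "--address",
    type=str,
    default=DEFAULT_ADDRESS,
    help=f"proxy address, default: {DEFAULT_ADDRESS}",
)

# Passing an explicit argv avoids touching sys.argv, so this runs anywhere.
args = parser.parse_args(["--address", "tcp://127.0.0.1:9000"])
print(args.address)  # tcp://127.0.0.1:9000

default_args = parser.parse_args([])
print(default_args.address)  # tcp://192.168.1.15:9527
```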

examples/jetarm_demo/jetarm_agent.py

Lines changed: 24 additions & 6 deletions
@@ -1,6 +1,7 @@
 import asyncio

 from ghoshell_container import Container
+import argparse

 from ghoshell_moss.core.shell import new_shell
 from ghoshell_moss.speech import make_baseline_tts_speech

@@ -9,20 +10,21 @@
 from ghoshell_moss.transports.zmq_channel.zmq_channel import ZMQChannelProxy
 from ghoshell_moss_contrib.agent import ModelConf, SimpleAgent
 from ghoshell_moss_contrib.agent.chat import ConsoleChat
-
-container = Container(name="jetarm_agent_container")
+from ghoshell_moss_contrib.example_ws import workspace_container, get_container
+from pathlib import Path

 ADDRESS = "tcp://192.168.1.15:9527"
 """Fill in the correct ip; first align it with the device jetarm_ws runs on and the port it listens on."""


-async def run_agent():
+async def run_agent(address: str = ADDRESS, container: Container | None = None):
+    container = container or get_container()
     # Create the Shell
     shell = new_shell(container=container)

     jetarm_chan = ZMQChannelProxy(
         name="jetarm",
-        address=ADDRESS,
+        address=address,
     )

     shell.main_channel.import_channels(jetarm_chan)

@@ -56,8 +58,24 @@ async def run_agent():


 def main():
-    # Run the async main function
-    asyncio.run(run_agent())
+    # Build the argument parser
+    parser = argparse.ArgumentParser(description="Run the jetarm agent program")
+
+    # Add the --address option with a default value
+    parser.add_argument(
+        "--address",
+        type=str,
+        default="tcp://192.168.1.15:9527",
+        help="proxy address, default: tcp://192.168.1.15:9527"
+    )
+
+    # Parse the command-line arguments
+    args = parser.parse_args()
+
+    ws_dir = Path(__file__).resolve().parent.parent.joinpath('.workspace')
+    with workspace_container(workspace_dir=ws_dir) as container:
+        # Run the async main function, passing in the address
+        asyncio.run(run_agent(address=args.address, container=container))


 # Run the main function
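The workspace_container pattern used in main() above, a context manager that yields a container bound to a workspace directory, can be sketched with contextlib. The class and function here are simplified stand-ins, not the real ghoshell implementations:

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path

class Container:
    # Minimal stand-in for the real Container, for illustration only.
    def __init__(self, workspace: Path):
        self.workspace = workspace

@contextmanager
def workspace_container(workspace_dir: Path):
    # Create the workspace directory if missing and yield a container bound
    # to it; a real implementation would also register providers and release
    # resources on exit.
    workspace_dir.mkdir(parents=True, exist_ok=True)
    yield Container(workspace_dir)

with tempfile.TemporaryDirectory() as tmp:
    ws = Path(tmp) / ".workspace"
    with workspace_container(ws) as c:
        print(c.workspace.name)  # .workspace
```

Scoping the container to a `with` block ensures every example tears down its workspace state the same way.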

examples/jetarm_ws/README.md

Lines changed: 12 additions & 12 deletions
@@ -108,18 +108,18 @@ source install/setup.zsh
 ## Core directory overview

 - `src`: core library directory
-- `jetarm_6dof_description`:
-  Holds the robot description info for jetarm;
-  rviz can also be launched: `ros2 launch jetarm_6dof_description view_model.launch.py`
-- `jetarm_driver`:
-  A node for validating a python driver, meant to implement the ros2 control interface in python. No longer used.
-- `jetarm_control`:
-  The core ros2 control implementation. Start this node and the robot can be driven. See the test cases below. deepseek
-  and similar models can also supply the various commands that ros2 control supports.
-  The script that runs this node is `ros2 launch jetarm_control jetarm_control.launch.py`
-- `jetarm_moveit2`:
-  A moveit node built on top of ros2 control (jetarm_control); all of its code should be generated by the moveit2 assistant.
-  For the exact steps, ask a model; moveit2 must be installed into the global environment beforehand.
+  - `jetarm_6dof_description`:
+    Holds the robot description info for jetarm;
+    rviz can also be launched: `ros2 launch jetarm_6dof_description view_model.launch.py`
+  - `jetarm_driver`:
+    A node for validating a python driver, meant to implement the ros2 control interface in python. No longer used.
+  - `jetarm_control`:
+    The core ros2 control implementation. Start this node and the robot can be driven. See the test cases below. deepseek
+    and similar models can also supply the various commands that ros2 control supports.
+    The script that runs this node is `ros2 launch jetarm_control jetarm_control.launch.py`
+  - `jetarm_moveit2`:
+    A moveit node built on top of ros2 control (jetarm_control); all of its code should be generated by the moveit2 assistant.
+    For the exact steps, ask a model; moveit2 must be installed into the global environment beforehand.

 # Common test commands
examples/miku/main.py

Lines changed: 30 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -1,17 +1,11 @@
 import os
 import sys

-import dotenv
-from ghoshell_common.contracts import LocalWorkspaceProvider
-
-from ghoshell_moss.speech import make_baseline_tts_speech
+from ghoshell_moss.speech import make_baseline_tts_speech, Speech
 from ghoshell_moss.speech.player.pyaudio_player import PyAudioStreamPlayer
 from ghoshell_moss.speech.volcengine_tts import VolcengineTTS, VolcengineTTSConf
 from ghoshell_moss_contrib.agent import ModelConf, SimpleAgent

-current_dir = os.path.dirname(os.path.abspath(__file__))
-sys.path.append(current_dir)
-
 import asyncio
 from os.path import dirname, join
@@ -27,18 +21,19 @@
 from miku_channels.head import head_chan
 from miku_channels.leg import left_leg_chan, right_leg_chan
 from miku_channels.necktie import necktie_chan
-
 from ghoshell_moss.core.shell import new_shell
+from ghoshell_moss_contrib.example_ws import workspace_container, get_example_speech
+import pathlib
+
+# Add the current directory to the path.
+current_dir = os.path.dirname(os.path.abspath(__file__))
+sys.path.append(current_dir)

 # Global state
 model: live2d.LAppModel | None = None
-container = Container()
 WIDTH = 600
 HEIGHT = 800

-WORKSPACE_DIR = ".workspace"
-dotenv.load_dotenv(f"{WORKSPACE_DIR}/.env")
-

 # Initialize Pygame and Live2D
 def init_pygame():
@@ -50,7 +45,7 @@ def init_pygame():


 # Initialize the Live2D model
-def init_live2d(model_path: str):
+def init_live2d(model_path: str, container: Container):
     global model
     live2d.init()
     live2d.glInit()
@@ -101,11 +96,9 @@ def stop_speak(*args):
     speaking_event.clear()


-async def run_agent():
+async def run_agent(container: Container, speech: Speech | None = None):
     loop = asyncio.get_running_loop()

-    container.register(LocalWorkspaceProvider(".workspace"))
-
     # Create the Shell
     shell = new_shell(container=container)
@@ -142,12 +135,15 @@ async def speaking():
     player = PyAudioStreamPlayer()
     player.on_play(start_speak)
     player.on_play_done(stop_speak)
-    tts = VolcengineTTS(conf=VolcengineTTSConf(default_speaker="saturn_zh_female_keainvsheng_tob"))
+    speech = speech or container.get(Speech)
+    if speech is None:
+        tts = VolcengineTTS(conf=VolcengineTTSConf(default_speaker="saturn_zh_female_keainvsheng_tob"))
+        speech = make_baseline_tts_speech(player=player, tts=tts)

     agent = SimpleAgent(
         instruction="You are miku, with a live2d digital-human body. You are a cute and enthusiastic digital human.",
         shell=shell,
-        speech=make_baseline_tts_speech(player=player, tts=tts),
+        speech=speech,
         model=ModelConf(
             kwargs={
                 "thinking": {
@@ -162,22 +158,22 @@ async def speaking():
     await speaking_task


-async def run_agent_and_render():
+async def run_agent_and_render(container: Container, speech: Speech | None = None):
     # Initialize Pygame and Live2D
     screen, display = init_pygame()
     model_path = join(dirname(__file__), "model/miku.model3.json")
-    init_live2d(model_path)
+    init_live2d(model_path, container)

     # Keep the window open until the user closes it
     running = True
     clock = pygame.time.Clock()
     font = pygame.font.SysFont(None, 24)

     # Create a task to run the agent
-    agent_task = asyncio.create_task(run_agent())
+    agent_task = asyncio.create_task(run_agent(container, speech))

     try:
-        while running:
+        while running and not agent_task.done():
             # Handle pygame events
             for event in pygame.event.get():
                 if event.type == pygame.QUIT:
@@ -201,20 +197,26 @@ async def run_agent_and_render():
             pass
     finally:
         # Cancel the agent task
-        agent_task.cancel()
-        try:
-            await agent_task
-        except asyncio.CancelledError:
-            pass
+        if not agent_task.done():
+            agent_task.cancel()
+            try:
+                await agent_task
+            except asyncio.CancelledError:
+                pass

         # Clean up resources
         live2d.dispose()
         pygame.quit()


+WORKSPACE_DIR = pathlib.Path(__file__).parent.parent.joinpath('.workspace')
+
+
 def main():
     # Run the async main function
-    asyncio.run(run_agent_and_render())
+    with workspace_container(WORKSPACE_DIR) as container:
+        speech = get_example_speech(container)
+        asyncio.run(run_agent_and_render(container, speech))


 # Run the main function
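The task lifecycle used in the render loop above (poll `done()` each frame, cancel and await in `finally`) can be sketched in isolation; the coroutine names here are stand-ins for the real agent and render loop:

```python
import asyncio

async def agent() -> str:
    # Stand-in for the real agent coroutine; it would normally run for a long time.
    await asyncio.sleep(10)
    return "finished"

async def render_loop() -> bool:
    agent_task = asyncio.create_task(agent())
    frames = 0
    try:
        # Render until the window closes or the agent stops on its own.
        while frames < 3 and not agent_task.done():
            await asyncio.sleep(0.01)  # one "frame"
            frames += 1
    finally:
        # Only cancel if the task is still running, then await the
        # cancellation so cleanup inside the task can complete.
        if not agent_task.done():
            agent_task.cancel()
            try:
                await agent_task
            except asyncio.CancelledError:
                pass
    return agent_task.cancelled()

print(asyncio.run(render_loop()))  # True
```

Guarding `cancel()` with `done()` avoids re-raising into a task that already returned, which is exactly the bug this commit fixes.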

examples/miku/miku_channels/body.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,8 @@ async def on_policy_run():
     # Wait for the other Motions to finish
     while not model.IsMotionFinished():
         await asyncio.sleep(0.1)
+        if not body_chan.is_running():
+            break
     model.ResetExpressions()  # prevent expressions from overlapping
     model.ResetExpression()
     # The Policy's Priority is set to 1 (lower) to ensure other Motions can interrupt the Policy Motion
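The early-exit check added above amounts to a generic poll loop that stops when either the motion finishes or the channel shuts down. A sketch under assumed names (the callables stand in for `model.IsMotionFinished` and `body_chan.is_running`):

```python
import asyncio

async def wait_motion_finished(is_finished, channel_running, poll: float = 0.01) -> bool:
    # Poll until the motion ends, but bail out if the channel stops running,
    # so the policy coroutine does not spin forever during shutdown.
    while not is_finished():
        await asyncio.sleep(poll)
        if not channel_running():
            return False  # channel stopped before the motion completed
    return True

ticks = {"n": 0}
def finishes_after_three():
    ticks["n"] += 1
    return ticks["n"] > 3

print(asyncio.run(wait_motion_finished(finishes_after_three, lambda: True)))  # True
print(asyncio.run(wait_motion_finished(lambda: False, lambda: False)))  # False
```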

pyproject.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,7 @@ contrib = [
     "rich>=14.2.0",
     "javascript>=1!1.2.6",
     "opencv-python>=4.13.0.92",
+    "loadenv>=0.1.1",
 ]

 [tool.setuptools]

src/ghoshell_moss/message/abcd.py

Lines changed: 15 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -386,16 +386,16 @@ class Message(BaseModel, WithAdditional):
     seq: Literal["head", "delta", "incomplete", "completed"] = Field(
         default="completed",
         description="The message's transport state, currently split into head, delta, and tail packets."
-        "- head packet: signals that a message stream has started producing. Usually used to tell the frontend UI to pre-render the message container"
-        "- delta packet: carries one delta with minimal payload, for streaming"
-        "- tail packet: carries the complete result of gluing all delta packets together, for storage or display."
-        "Tail packets come in two kinds, completed and incomplete."
-        "- completed means the message body has been fully transmitted."
-        "- incomplete means transmission has not finished, but the message may still need to be used directly."
-        "A concrete example: while the model handles multi-channel input, a visual signal demands a response, but an asr input has not yet completed;"
-        "at that moment the LLM still needs to see the unfinished speech input, i.e. the incomplete message."
-        "But in the next turn, once the asr has finished, the history no longer needs to show the incomplete packet."
-        "So incomplete mainly serves to show an intermediate glued result in the keyframes of the LLM's reasoning.",
+                    "- head packet: signals that a message stream has started producing. Usually used to tell the frontend UI to pre-render the message container"
+                    "- delta packet: carries one delta with minimal payload, for streaming"
+                    "- tail packet: carries the complete result of gluing all delta packets together, for storage or display."
+                    "Tail packets come in two kinds, completed and incomplete."
+                    "- completed means the message body has been fully transmitted."
+                    "- incomplete means transmission has not finished, but the message may still need to be used directly."
+                    "A concrete example: while the model handles multi-channel input, a visual signal demands a response, but an asr input has not yet completed;"
+                    "at that moment the LLM still needs to see the unfinished speech input, i.e. the incomplete message."
+                    "But in the next turn, once the asr has finished, the history no longer needs to show the incomplete packet."
+                    "So incomplete mainly serves to show an intermediate glued result in the keyframes of the LLM's reasoning.",
     )
     delta: Optional[Delta] = Field(
         default=None,

@@ -405,11 +405,11 @@ class Message(BaseModel, WithAdditional):

     @classmethod
     def new(
-            cls,
-            *,
-            role: Literal["assistant", "system", "developer", "user", ""] = "",
-            name: Optional[str] = None,
-            id: Optional[str] = None,
+        cls,
+        *,
+        role: Literal["assistant", "system", "developer", "user", ""] = "",
+        name: Optional[str] = None,
+        id: Optional[str] = None,
     ):
         """
         Syntactic sugar for creating a message.
src/ghoshell_moss/speech/__init__.py

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,14 +1,14 @@
 from ghoshell_common.contracts import LoggerItf

-from ghoshell_moss.core.concepts.speech import TTS, StreamAudioPlayer
+from ghoshell_moss.core.concepts.speech import TTS, StreamAudioPlayer, Speech, SpeechStream
 from ghoshell_moss.speech.mock import MockSpeech
 from ghoshell_moss.speech.stream_tts_speech import TTSSpeech, TTSSpeechStream


 def make_baseline_tts_speech(
-        player: StreamAudioPlayer | None = None,
-        tts: TTS | None = None,
-        logger: LoggerItf | None = None,
+    player: StreamAudioPlayer | None = None,
+    tts: TTS | None = None,
+    logger: LoggerItf | None = None,
 ) -> TTSSpeech:
     """
     Baseline example.
