[Model] CogACT

### ID (slug)

cogact

### Name

CogACT

### Organization

Tsinghua University / Microsoft Research Asia

### Year

2024

### Description (English)

CogACT is a componentized Vision-Language-Action architecture that decouples cognition from action. It uses powerful Vision-Language Models to extract cognitive features, which then condition a specialized Diffusion Transformer-based action module that predicts continuous, temporally correlated robot action sequences.
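
The split described above can be illustrated with a toy sketch. This is not the CogACT code or API: every function name, the array sizes, and the simplified refinement loop (a stand-in for a real diffusion sampler such as DDPM or DDIM) are assumptions made for illustration. It only shows the data flow: a cognition feature is computed once, then conditions an action module that turns Gaussian noise into a multi-step action sequence.

```python
import numpy as np

# Illustrative sizes only; these are assumptions, not values from the paper.
HORIZON, ACTION_DIM, FEATURE_DIM = 16, 7, 32


def extract_cognition_feature(image, instruction, rng):
    """Hypothetical stand-in for the vision-language model.

    A real system would encode the image and instruction; this toy ignores
    both and returns random numbers so the sketch stays self-contained.
    """
    return rng.standard_normal(FEATURE_DIM)


def toy_denoiser(noisy_actions, step, cognition, weights):
    """Hypothetical stand-in for the Diffusion Transformer action module.

    Returns a "noise" estimate that depends on the cognition feature, which
    is where conditioning enters. `step` is unused in this toy; a real
    denoiser would also embed the diffusion timestep.
    """
    target = np.tanh(cognition @ weights)  # shape (ACTION_DIM,)
    return noisy_actions - target          # broadcasts over the horizon


def sample_actions(cognition, weights, rng, num_steps=10):
    """Refine Gaussian noise into an action sequence over num_steps steps.

    This is a simplified refinement loop, not a real DDPM/DDIM update rule.
    """
    actions = rng.standard_normal((HORIZON, ACTION_DIM))
    for step in reversed(range(num_steps)):
        predicted_noise = toy_denoiser(actions, step, cognition, weights)
        actions = actions - predicted_noise / (step + 1)
    return actions


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.standard_normal((FEATURE_DIM, ACTION_DIM))
    cognition = extract_cognition_feature(None, "pick up the cup", rng)
    actions = sample_actions(cognition, weights, rng)
    print(actions.shape)
```

The point of the separation is that the cognition step runs once per observation, while the action module can be sized, trained, and sampled independently of the vision-language backbone.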

### Description (Korean)

CogACT는 기존의 단일 신경망 모델들과 달리 인지와 행동을 명확히 분리한 컴포넌트형 비전-언어-행동 모델입니다. 강력한 비전-언어 모델을 통해 인지적 특징을 추출하고, 이를 조건으로 특화된 Diffusion Transformer 기반의 행동 모델이 복잡하고 연속적인 로봇의 물리적 제어 궤적을 예측하도록 설계되었습니다.

### GitHub URL

https://github.com/microsoft/CogACT

### Paper URL (arXiv)

https://cogact.github.io/CogACT_paper.pdf

### HuggingFace URL

https://huggingface.co/CogACT

### Project Page URL

https://cogact.github.io/

### Categories

- [x] manipulation
- [ ] locomotion
- [ ] navigation
- [ ] dexterous
- [ ] whole-body
- [ ] aerial

### Hardware Targets

- [x] manipulator
- [ ] humanoid
- [ ] quadruped
- [ ] biped
- [ ] mobile
- [ ] drone
- [ ] hand

### Learning Methods

- [x] VLA
- [ ] IL
- [ ] RL
- [x] diffusion
- [ ] world_model
- [ ] sim2real

### Framework

- [x] pytorch
- [ ] jax
- [ ] tensorflow
- [ ] other

### Communication

- [ ] ros2
- [ ] grpc
- [ ] lcm
- [ ] zenoh

### Tags (optional)

VLA, diffusion, foundation-model

### Checklist

- [x] The model is open-source (code or weights publicly available)
- [x] At least one URL (GitHub, paper, or HuggingFace) is provided
- [x] I have read the contribution guidelines
