Commit 2dfd444

✨ add support for new filetypes
1 parent a0a6c2e commit 2dfd444

18 files changed: +364 -35 lines changed

backend/database/attachment_db.py

Lines changed: 1 addition & 0 deletions
@@ -272,6 +272,7 @@ def get_content_type(file_path: str) -> str:
  '.html': 'text/html',
  '.htm': 'text/html',
  '.json': 'application/json',
+ '.epub': 'application/epub+zip',
  '.xml': 'application/xml',
  '.zip': 'application/zip',
  '.rar': 'application/x-rar-compressed',

doc/docs/en/sdk/data-process.md

Lines changed: 3 additions & 3 deletions
@@ -43,10 +43,10 @@ def file_process(self,

  ## 📁 Supported File Formats

- - **Text files**: .txt, .md, .csv
- - **Documents**: .pdf, .docx, .pptx
+ - **Text files**: .txt, .md, .csv, .json
+ - **Documents**: .pdf, .docx, .pptx, .epub
  - **Images**: .jpg, .png, .gif (with OCR)
- - **Web content**: HTML, URLs
+ - **Web content**: HTML, URLs, XML
  - **Archives**: .zip, .tar

  ## 💡 Usage Examples

doc/docs/en/user-guide/knowledge-base.md

Lines changed: 3 additions & 1 deletion
@@ -26,12 +26,14 @@ Create and manage knowledge bases, upload documents, and generate summaries. Kno
  ### Supported File Formats

  Nexent supports multiple file formats, including:
- - **Text:** .txt, .md
+ - **Text:** .txt, .md, .csv, .json
  - **PDF:** .pdf
  - **Word:** .docx
  - **PowerPoint:** .pptx
+ - **EPUB:** .epub
  - **Excel:** .xlsx
  - **Data files:** .csv
+ - **Web content:** .html, .xml

  ## 📊 Knowledge Base Summary

doc/docs/en/user-guide/start-chat.md

Lines changed: 2 additions & 2 deletions
@@ -79,8 +79,8 @@ You can upload files during a chat so the agent can reason over their content:
  - Or drag files directly into the chat area

  2. **Supported File Formats**
- - **Documents:** PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx)
- - **Text:** Markdown (.md), Plain text (.txt)
+ - **Documents:** PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx), EPUB (.epub), HTML (.html), XML (.xml)
+ - **Text & Data:** Markdown (.md), Plain text (.txt), JSON (.json), CSV (.csv)
  - **Images:** JPG, PNG, GIF, and other common formats

  3. **File Processing Flow**

doc/docs/zh/sdk/data-process.md

Lines changed: 3 additions & 0 deletions
@@ -98,6 +98,9 @@ def file_process(self,
  - `.odt` - OpenDocument text
  - `.pptx` - PowerPoint 2007 and later
  - `.ppt` - PowerPoint 97-2003
+ - `.xml` - XML data files
+ - `.json` - JSON data files
+ - `.csv` - Comma-separated values files

  ## 💡 Usage Examples

doc/docs/zh/user-guide/knowledge-base.md

Lines changed: 3 additions & 1 deletion
@@ -26,12 +26,14 @@

  Nexent supports multiple file formats, including:

- - **Text**: .txt, .md files
+ - **Text**: .txt, .md, .json files
  - **PDF**: .pdf files
  - **Word**: .docx files
  - **PowerPoint**: .pptx files
  - **Excel**: .xlsx files
+ - **EPUB**: .epub files
  - **Data files**: .csv files
+ - **Web content**: .html, .xml files

  ## 📊 Knowledge Base Summary

doc/docs/zh/user-guide/start-chat.md

Lines changed: 2 additions & 2 deletions
@@ -79,8 +79,8 @@ Nexent supports voice input, so you can interact with the agent by voice.
  - Or drag files directly into the chat area

  2. **Supported File Formats**
- - **Documents**: PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx)
- - **Text**: Markdown (.md), plain text (.txt)
+ - **Documents**: PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx), EPUB (.epub), HTML (.html), XML (.xml)
+ - **Text**: Markdown (.md), plain text (.txt), JSON (.json), CSV (.csv)
  - **Images**: JPG, PNG, GIF, and other common image formats

  3. **File Processing Flow**

frontend/app/[locale]/knowledges/components/upload/UploadArea.tsx

Lines changed: 1 addition & 1 deletion
@@ -233,7 +233,7 @@ const UploadArea = forwardRef<UploadAreaRef, UploadAreaProps>(
  fileList,
  onChange: handleChange,
  customRequest: handleCustomRequest,
- accept: ".pdf,.docx,.pptx,.xlsx,.md,.txt,.csv",
+ accept: ".pdf,.docx,.pptx,.xlsx,.md,.txt,.csv,.json,.epub,.xml,.html",
  showUploadList: true,
  disabled: disabled,
  progress: {
progress: {

frontend/const/chatConfig.ts

Lines changed: 5 additions & 4 deletions
@@ -9,6 +9,7 @@ export const chatConfig = {
  "application/json",
  "application/xml",
  "text/markdown",
+ "text/csv",
  ],

  // Supported text file extensions
@@ -36,10 +37,10 @@ export const chatConfig = {
  imageExtensions: ["jpg", "jpeg", "png", "gif", "webp", "svg", "bmp"],

  // Supported document file extensions
- documentExtensions: ["pdf", "doc", "docx", "xls", "xlsx", "ppt", "pptx"],
+ documentExtensions: ["pdf", "doc", "docx", "xls", "xlsx", "ppt", "pptx", "epub", "html", "xml"],

  // Supported text document extensions
- supportedTextExtensions: ["md", "markdown", "txt"],
+ supportedTextExtensions: ["md", "markdown", "txt", "csv", "json"],

  // File icon mapping configuration
  fileIcons: {
@@ -50,7 +51,7 @@ export const chatConfig = {
  word: ["doc", "docx"],

  // Plain text files
- text: ["txt"],
+ text: ["txt", "epub"],

  // Markdown files
  markdown: ["md"],
@@ -62,7 +63,7 @@ export const chatConfig = {
  powerpoint: ["ppt", "pptx"],

  // HTML files
- html: ["html", "htm"],
+ html: ["html", "htm", "xml"],

  // Code files
  code: ["css", "js", "ts", "jsx", "tsx", "php", "py", "java", "c", "cpp", "cs"],

frontend/const/knowledgeBase.ts

Lines changed: 17 additions & 2 deletions
@@ -120,7 +120,12 @@ export const FILE_EXTENSIONS = {
  PPT: 'ppt',
  PPTX: 'pptx',
  TXT: 'txt',
- MD: 'md'
+ MD: 'md',
+ EPUB: 'epub',
+ CSV: 'csv',
+ HTML: 'html',
+ XML: 'xml',
+ JSON: 'json'
  } as const;

  // File type constants
@@ -131,6 +136,11 @@ export const FILE_TYPES = {
  POWERPOINT: 'PowerPoint',
  TEXT: 'Text',
  MARKDOWN: 'Markdown',
+ EPUB: 'EPUB',
+ CSV: 'CSV',
+ JSON: 'JSON',
+ HTML: 'HTML',
+ XML: 'XML',
  UNKNOWN: 'Unknown'
  } as const;

@@ -144,5 +154,10 @@ export const EXTENSION_TO_TYPE_MAP = {
  [FILE_EXTENSIONS.PPT]: FILE_TYPES.POWERPOINT,
  [FILE_EXTENSIONS.PPTX]: FILE_TYPES.POWERPOINT,
  [FILE_EXTENSIONS.TXT]: FILE_TYPES.TEXT,
- [FILE_EXTENSIONS.MD]: FILE_TYPES.MARKDOWN
+ [FILE_EXTENSIONS.MD]: FILE_TYPES.MARKDOWN,
+ [FILE_EXTENSIONS.CSV]: FILE_TYPES.CSV,
+ [FILE_EXTENSIONS.JSON]: FILE_TYPES.JSON,
+ [FILE_EXTENSIONS.HTML]: FILE_TYPES.HTML,
+ [FILE_EXTENSIONS.XML]: FILE_TYPES.XML,
+ [FILE_EXTENSIONS.EPUB]: FILE_TYPES.EPUB
  } as const;
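
Editor's sketch (not part of the commit): an extension-to-type map like this one is easy to get subtly wrong, for example by mapping an extension to another extension constant instead of a `FILE_TYPES` label. The snippet below shows one possible lookup helper and a consistency check that flags such entries. It uses trimmed, hypothetical copies of the constants and a `Map` rather than the object literal in the diff; `getFileType` and `findInvalidMappings` are assumed names, not functions from the repository.

```typescript
// Trimmed, hypothetical copies of the constants; the real objects hold more entries.
const FILE_EXTENSIONS = {
  MD: "md", EPUB: "epub", CSV: "csv", HTML: "html", XML: "xml", JSON: "json",
} as const;

const FILE_TYPES = {
  MARKDOWN: "Markdown", EPUB: "EPUB", CSV: "CSV", JSON: "JSON",
  HTML: "HTML", XML: "XML", UNKNOWN: "Unknown",
} as const;

// A Map avoids accidental hits on inherited object keys such as "constructor".
const EXTENSION_TO_TYPE = new Map<string, string>([
  [FILE_EXTENSIONS.MD, FILE_TYPES.MARKDOWN],
  [FILE_EXTENSIONS.CSV, FILE_TYPES.CSV],
  [FILE_EXTENSIONS.JSON, FILE_TYPES.JSON],
  [FILE_EXTENSIONS.HTML, FILE_TYPES.HTML],
  [FILE_EXTENSIONS.XML, FILE_TYPES.XML],
  [FILE_EXTENSIONS.EPUB, FILE_TYPES.EPUB],
]);

// Resolve a display label from a file name, falling back to "Unknown".
function getFileType(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  const ext = dot > 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  return EXTENSION_TO_TYPE.get(ext) ?? FILE_TYPES.UNKNOWN;
}

// Return the extensions whose mapped value is not one of the FILE_TYPES labels.
function findInvalidMappings(map: Map<string, string>): string[] {
  const labels: string[] = Object.keys(FILE_TYPES).map(
    (key) => FILE_TYPES[key as keyof typeof FILE_TYPES],
  );
  const invalid: string[] = [];
  map.forEach((label, ext) => {
    if (labels.indexOf(label) === -1) invalid.push(ext);
  });
  return invalid;
}
```

Running `findInvalidMappings` in a unit test would surface a mis-mapped entry before it reaches the UI, where it would otherwise show up only as an odd type label.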
