Overview
This PR adds support for multiple file formats (PDF, TXT, JSON, TSV) to the LocalMind RAG system, addressing the "Coming Soon" items mentioned in the README.
Changes Made
New Features
- Support for PDF, TXT, JSON, and TSV file formats in addition to CSV
- Automatic file format detection based on extension and MIME type
- File upload functionality using multer middleware
- Comprehensive file validation (type, size, format)
- New endpoint: GET /api/v1/dataset/formats, which lists the supported formats
- Automatic file cleanup after processing or on error
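The format detection described above could look roughly like the following. This is a minimal sketch, not the code from DataSet.fileLoader.ts: the names `SupportedFormat`, `EXTENSION_MAP`, `MIME_MAP`, and `detectFormat` are hypothetical, and the rule that the extension is checked first with the MIME type as a fallback is an assumption.

```typescript
// Hypothetical type; the PR's actual enum in DataSet.type.ts is not shown.
type SupportedFormat = "csv" | "pdf" | "txt" | "json" | "tsv";

// Extension lookup (keys are lowercase and include the leading dot).
const EXTENSION_MAP: Record<string, SupportedFormat> = {
  ".csv": "csv",
  ".pdf": "pdf",
  ".txt": "txt",
  ".json": "json",
  ".tsv": "tsv",
};

// MIME type lookup, used only when the extension is missing or unknown.
const MIME_MAP: Record<string, SupportedFormat> = {
  "text/csv": "csv",
  "application/pdf": "pdf",
  "text/plain": "txt",
  "application/json": "json",
  "text/tab-separated-values": "tsv",
};

// Assumed precedence: extension first, then MIME type. Returns null for
// anything that is not a supported format.
function detectFormat(filename: string, mimeType?: string): SupportedFormat | null {
  const dot = filename.lastIndexOf(".");
  const ext = dot > 0 ? filename.slice(dot).toLowerCase() : "";
  const byExtension = EXTENSION_MAP[ext];
  if (byExtension) return byExtension;

  if (mimeType) {
    // Drop parameters such as "; charset=utf-8" before the lookup.
    const base = (mimeType.split(";")[0] ?? "").trim().toLowerCase();
    return MIME_MAP[base] ?? null;
  }
  return null;
}
```

A validator would typically reject the upload when this returns null, before any parsing starts.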
Files Added
- DataSet.fileLoader.ts - Universal file loader supporting all formats
- DataSet.multer.ts - Multer configuration for file uploads
- DataSet.validator.ts - File validation middleware
- DataSet.type.ts - TypeScript interfaces and enums
- README.md - Complete documentation with API examples
- test-samples/ - Sample files for each supported format
Files Modified
- DataSet.controller.ts - Updated with multi-format support
- DataSet.routes.ts - Changed from GET to POST with file upload
- package.json - Added pdf-parse and multer dependencies
Dependencies Added
- pdf-parse - For PDF document parsing
- multer - For handling file uploads
- @types/multer - TypeScript types for multer
Breaking Changes
⚠️ API endpoint changed: the upload endpoint moved from GET /upload to POST /upload and now expects a multipart/form-data body.
Migration:
# Before
GET /api/v1/dataset/upload
# After
POST /api/v1/dataset/upload
Content-Type: multipart/form-data
Body: file (form field)
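For callers updating to the new endpoint, a client request might be built as in the sketch below. It assumes Node 18+ or a browser (global `fetch`, `FormData`, `Blob`). The base URL `http://localhost:3000` and the helper names `buildUploadRequest` and `uploadDataset` are assumptions for illustration; only the path and the `file` form field come from this PR.

```typescript
// Assumed base URL; the PR specifies only the path.
const BASE_URL = "http://localhost:3000";

interface UploadRequest {
  url: string;
  init: { method: "POST"; body: FormData };
}

// Builds the multipart request without sending it, so it can be inspected.
function buildUploadRequest(contents: string, filename: string, mimeType: string): UploadRequest {
  const form = new FormData();
  // "file" is the form field name given in the migration notes.
  form.append("file", new Blob([contents], { type: mimeType }), filename);
  // No explicit Content-Type header: fetch sets multipart/form-data
  // together with the boundary when the body is a FormData instance.
  return {
    url: `${BASE_URL}/api/v1/dataset/upload`,
    init: { method: "POST", body: form },
  };
}

// Sends the upload; callers should check response.ok before reading the body.
async function uploadDataset(contents: string, filename: string, mimeType: string): Promise<Response> {
  const { url, init } = buildUploadRequest(contents, filename, mimeType);
  return fetch(url, init);
}
```

Setting the Content-Type header by hand is a common mistake here, since it omits the boundary and the server-side multipart parser then fails to read the file.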