diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 74 additions & 0 deletions
diff --git a/__init__.py b/__init__.py
Lines changed: 1 addition & 0 deletions
diff --git a/requirements.txt b/requirements.txt
Lines changed: 1 addition & 0 deletions
diff --git a/tools/pdf_converter.py b/tools/pdf_converter.py
Lines changed: 136 additions & 47 deletions
@@ -0,0 +1,74 @@
+# Changelog
+
+All notable changes to DeepCode will be documented in this file.
+
+## [1.0.6-jm] - 2025-10-19
+
+### Added
+- **Dynamic Model Limit Detection**: New `utils/model_limits.py` module that automatically detects and adapts to any LLM model's token limits and pricing
+- **Loop Detection System**: `utils/loop_detector.py` prevents infinite loops by detecting repeated tool calls, timeouts, and progress stalls
+- **Progress Tracking**: 8-phase progress tracking (5% → 100%) with file-level progress indicators in both UI and terminal
+- **Abort Mechanism**: "Stop Processing" button in UI with global abort flag for clean process termination
+- **Cache Cleanup Scripts**: `start_clean.bat` and `start_clean.ps1` to clear Python cache before starting
+- **Enhanced Error Display**: Real-time error messages in both UI and terminal with timestamps
+- **File Progress Tracking**: Shows files completed/total with estimated time remaining
+
+### Fixed
+- **Critical: False Error Detection**: Fixed overly aggressive error detection that was marking successful operations as failures, causing premature abort and empty file generation
+- **Critical: Empty File Generation**: Files now contain actual code instead of being empty (2-byte files)
+- **Unique Folder Naming**: Each project run now creates `paper_{timestamp}` folders instead of reusing `pdf_output`
+- **PDF Save Location**: PDFs now save to `deepcode_lab/papers/` instead of system temp directory
+- **Duplicate Folder Prevention**: Added session state caching to prevent duplicate folder creation on UI reruns
+- **Token Limit Compliance**: Fixed `max_tokens` to respect model limits dynamically (e.g., gpt-4o-mini's 16,384-token output limit)
+- **Empty Plan Detection**: System now fails early with clear error messages when initial plan is empty or invalid
+- **Process Hanging**: Fixed infinite loops and hangs on errors; the process now exits cleanly
+- **Token Cost Tracking**: Restored accurate token usage and cost display (was showing $0.0000)
+- **PDF to Markdown Conversion**: Fixed automatic conversion and file location handling
+- **Document Segmentation**: Now honors the 50K-character threshold configured in `mcp_agent.config.yaml`
+- **Error Propagation**: Abort mechanism now properly stops process after 10 consecutive real errors
+
+### Changed
+- **Model-Aware Token Management**: Token limits now adapt automatically based on configured model instead of hardcoded values
+- **Cost Calculation**: Dynamic pricing based on actual model rates (OpenAI, Anthropic)
+- **Retry Logic**: Token limits for retries now respect model maximum (87.5% → 95% → 98% of max)
+- **Segmentation Workflow**: Better integration with code implementation phase
+- **Error Handling**: Enhanced error propagation - errors no longer reported as "success"
+- **UI Display**: Shows project folder name after PDF conversion for better visibility
+- **Terminal Logging**: Added timestamps to all progress messages
+
+### Technical Improvements
+- Added document-segmentation server to code implementation workflow for better token management
+- Improved error handling in agent orchestration engine with proper cleanup
+- Enhanced subprocess handling on Windows (hide console windows, prevent hanging)
+- Better LibreOffice detection on Windows using direct path checking
+- Fixed input data format consistency (JSON with `paper_path` key)
+- Added comprehensive logging throughout the pipeline
+- Improved resource cleanup on errors and process termination
+
+### Documentation
+- Translated Chinese comments to English in core workflow files
+- Added inline documentation for new utility modules
+- Created startup scripts with clear usage instructions
+
+### Breaking Changes
+- None - all changes are backward compatible
+
+### Known Issues
+- Terminal may show trailing "Calling Tool..." line after completion (cosmetic display artifact - process completes successfully)
+- Some Chinese comments remain in non-critical files (cli, tools) - translation in progress
+- A warning about the optional `tiktoken` package may appear (it does not affect functionality)
+
+### Success Metrics
+- ✅ Complete end-to-end workflow: DOCX upload → PDF conversion → Markdown → Segmentation → Planning → Code generation
+- ✅ Files generated with actual code content (15+ files with proper implementation)
+- ✅ Single folder per project run (no duplicates)
+- ✅ Dynamic token management working across different models
+- ✅ Accurate cost tracking per model
+- ✅ Clean process termination with proper error handling
+
+---
+
+## [1.0.5] - Previous Release
+
+See previous releases for earlier changes.
+
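The changelog's retry rule (87.5% → 95% → 98% of the model maximum) can be illustrated with a small sketch. This is not the code in `utils/model_limits.py`, which this diff does not show; `retry_token_limits` and `RETRY_FRACTIONS` are hypothetical names, and the rounding (truncation toward zero) is an assumption.

```python
from typing import List, Sequence

# Fractions of the model maximum tried on successive retries,
# taken from the changelog entry (87.5% -> 95% -> 98%).
RETRY_FRACTIONS: Sequence[float] = (0.875, 0.95, 0.98)


def retry_token_limits(
    model_max_tokens: int, fractions: Sequence[float] = RETRY_FRACTIONS
) -> List[int]:
    """Return the max_tokens value to request on each successive retry.

    Hypothetical helper: every value is capped at the model maximum so a
    retry can never ask for more tokens than the model allows.
    """
    if model_max_tokens <= 0:
        raise ValueError("model_max_tokens must be positive")
    return [min(model_max_tokens, int(model_max_tokens * f)) for f in fractions]


if __name__ == "__main__":
    # Using the 16,384-token limit quoted in the changelog for gpt-4o-mini.
    print(retry_token_limits(16384))
```

Keeping the fractions in one tuple means the escalation schedule can be changed without touching the call sites.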
@@ -8,6 +8,7 @@
 __version__ = "1.2.0"
 __author__ = "DeepCode Team"
 __url__ = "https://github.com/HKUDS/DeepCode"
+__repo__ = "https://github.com/Jany-M/DeepCode/"
 
 # Import main components for easy access
 from utils import FileProcessor, DialogueLogger
 
@@ -10,6 +10,7 @@ fastapi>=0.104.0
 google-genai
 mcp-agent
 mcp-server-git
+openapi
 nest_asyncio
 openai
 pathlib2
 
@@ -18,8 +18,9 @@
 import tempfile
 import shutil
 import platform
+import os
 from pathlib import Path
-from typing import Union, Optional, Dict, Any
+from typing import Union, Optional, Dict, Any, List
 
 
 class PDFConverter:
@@ -40,6 +41,39 @@ def __init__(self) -> None:
         """Initialize the PDF converter."""
         pass
 
+    @staticmethod
+    def find_libreoffice_windows() -> Optional[str]:
+        """
+        Find LibreOffice installation on Windows.
+        
+        Returns:
+            Path to soffice.exe if found, None otherwise
+        """
+        if platform.system() != "Windows":
+            return None
+            
+        # Common LibreOffice installation paths on Windows
+        possible_paths = [
+            r"C:\Program Files\LibreOffice\program\soffice.exe",
+            r"C:\Program Files (x86)\LibreOffice\program\soffice.exe",
+        ]
+        
+        # Also check PROGRAMFILES environment variables
+        program_files = os.environ.get("PROGRAMFILES")
+        program_files_x86 = os.environ.get("PROGRAMFILES(X86)")
+        
+        if program_files:
+            possible_paths.append(os.path.join(program_files, "LibreOffice", "program", "soffice.exe"))
+        if program_files_x86:
+            possible_paths.append(os.path.join(program_files_x86, "LibreOffice", "program", "soffice.exe"))
+        
+        # Check each path
+        for path in possible_paths:
+            if os.path.exists(path):
+                return path
+                
+        return None
+
     @staticmethod
     def convert_office_to_pdf(
         doc_path: Union[str, Path], output_dir: Optional[str] = None
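`find_libreoffice_windows()` lists the default install roots and then appends the `PROGRAMFILES` roots, which usually resolve to the same paths and are therefore checked twice. A possible refinement, sketched below, builds a de-duplicated candidate list and takes the existence check as a parameter so it can be tested on any OS. `candidate_soffice_paths` and `first_existing` are hypothetical names, not part of this PR.

```python
import ntpath
import os
from typing import Callable, Iterable, List, Mapping, Optional

# Default install roots, matching the hard-coded paths in the hunk above.
DEFAULT_ROOTS = (r"C:\Program Files", r"C:\Program Files (x86)")


def candidate_soffice_paths(env: Mapping[str, str]) -> List[str]:
    """Build an ordered, de-duplicated list of possible soffice.exe paths."""
    roots = list(DEFAULT_ROOTS)
    for var in ("PROGRAMFILES", "PROGRAMFILES(X86)"):
        value = env.get(var)
        if value:
            roots.append(value)

    seen = set()
    candidates: List[str] = []
    for root in roots:
        # ntpath gives Windows-style joins even when run on another OS.
        path = ntpath.join(root, "LibreOffice", "program", "soffice.exe")
        key = path.lower()  # compare case-insensitively, as Windows paths usually are
        if key not in seen:
            seen.add(key)
            candidates.append(path)
    return candidates


def first_existing(
    paths: Iterable[str], exists: Callable[[str], bool] = os.path.exists
) -> Optional[str]:
    """Return the first path for which exists() is true, else None."""
    for path in paths:
        if exists(path):
            return path
    return None
```

On Windows this would be called as `first_existing(candidate_soffice_paths(os.environ))`.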
@@ -67,7 +101,15 @@ def convert_office_to_pdf(
             if output_dir:
                 base_output_dir = Path(output_dir)
             else:
-                base_output_dir = doc_path.parent / "pdf_output"
+                # Generate a timestamped folder name (1-second resolution) so each run gets its own folder
+                import time
+                timestamp = int(time.time())
+                folder_name = f"paper_{timestamp}"
+                
+                # Save to workspace instead of temp directory
+                workspace_base = Path(os.getcwd()) / "deepcode_lab" / "papers"
+                workspace_base.mkdir(parents=True, exist_ok=True)
+                base_output_dir = workspace_base / folder_name
 
             base_output_dir.mkdir(parents=True, exist_ok=True)
 
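Because `int(time.time())` has one-second resolution, two conversions started within the same second would share a `paper_{timestamp}` folder. One way to guarantee a fresh folder is to let `mkdir(exist_ok=False)` detect the collision and retry with a numeric suffix. The sketch below is an illustration only: `make_unique_run_dir` and the `_1`, `_2` suffix scheme are assumptions, not part of this PR.

```python
import time
from pathlib import Path
from typing import Optional


def make_unique_run_dir(
    base: Path, prefix: str = "paper", now: Optional[int] = None
) -> Path:
    """Create and return a new folder named <prefix>_<timestamp>[_<n>] under base.

    `now` can be passed to fix the timestamp (useful in tests); by default
    the current Unix time in whole seconds is used, as in the diff.
    """
    stamp = int(time.time()) if now is None else now
    base.mkdir(parents=True, exist_ok=True)

    attempt = 0
    while True:
        suffix = "" if attempt == 0 else f"_{attempt}"
        candidate = base / f"{prefix}_{stamp}{suffix}"
        try:
            # exist_ok=False makes creation atomic: a collision raises
            # FileExistsError instead of silently reusing the folder.
            candidate.mkdir(exist_ok=False)
            return candidate
        except FileExistsError:
            attempt += 1
```

The same helper could serve both `convert_office_to_pdf` and `convert_text_to_pdf`, which currently repeat the folder-naming block.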
@@ -86,26 +128,41 @@ def convert_office_to_pdf(
 
             # Hide console window on Windows
             if platform.system() == "Windows":
-                subprocess_kwargs["creationflags"] = (
-                    0x08000000  # subprocess.CREATE_NO_WINDOW
-                )
-
-            try:
-                result = subprocess.run(
-                    ["libreoffice", "--version"], **subprocess_kwargs
-                )
-                libreoffice_available = True
-                working_libreoffice_cmd = "libreoffice"
-                logging.info(f"LibreOffice detected: {result.stdout.strip()}")  # type: ignore
-            except (
-                subprocess.CalledProcessError,
-                FileNotFoundError,
-                subprocess.TimeoutExpired,
-            ):
-                pass
-
-            # Try alternative commands for LibreOffice
-            if not libreoffice_available:
+                # Use CREATE_NO_WINDOW to prevent console window from appearing
+                subprocess_kwargs["creationflags"] = 0x08000000
+                # Also configure startupinfo to hide window
+                startupinfo = subprocess.STARTUPINFO()
+                startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
+                startupinfo.wShowWindow = subprocess.SW_HIDE
+                subprocess_kwargs["startupinfo"] = startupinfo
+
+            # On Windows, try to find LibreOffice in standard installation paths first
+            # Don't run --version check on Windows as it can cause window/hanging issues
+            if platform.system() == "Windows":
+                windows_path = PDFConverter.find_libreoffice_windows()
+                if windows_path:
+                    libreoffice_available = True
+                    working_libreoffice_cmd = windows_path
+                    logging.info(f"LibreOffice detected at {windows_path}")
+
+            # On non-Windows systems, try standard commands
+            if not libreoffice_available and platform.system() != "Windows":
+                try:
+                    result = subprocess.run(
+                        ["libreoffice", "--version"], **subprocess_kwargs
+                    )
+                    libreoffice_available = True
+                    working_libreoffice_cmd = "libreoffice"
+                    logging.info(f"LibreOffice detected: {result.stdout.strip()}")  # type: ignore
+                except (
+                    subprocess.CalledProcessError,
+                    FileNotFoundError,
+                    subprocess.TimeoutExpired,
+                ):
+                    pass
+
+            # Try alternative commands for LibreOffice (non-Windows)
+            if not libreoffice_available and platform.system() != "Windows":
                 for cmd in ["soffice", "libreoffice"]:
                     try:
                         result = subprocess.run([cmd, "--version"], **subprocess_kwargs)
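The window-hiding setup appears twice in this file (once for the version probe, once for the conversion call). A small helper could hold it in one place. The sketch below uses only documented `subprocess` attributes, which exist on Windows only, so they are accessed behind the platform check. `hidden_window_kwargs` is a hypothetical name.

```python
import platform
import subprocess
from typing import Any, Dict


def hidden_window_kwargs() -> Dict[str, Any]:
    """Return extra subprocess.run() kwargs that suppress console windows.

    On non-Windows platforms this returns an empty dict, so the result can
    always be merged into the caller's kwargs.
    """
    if platform.system() != "Windows":
        return {}

    # These attributes are only defined by the Windows build of Python.
    startupinfo = subprocess.STARTUPINFO()
    startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
    startupinfo.wShowWindow = subprocess.SW_HIDE
    return {
        "creationflags": subprocess.CREATE_NO_WINDOW,  # 0x08000000
        "startupinfo": startupinfo,
    }
```

A caller would then write `subprocess_kwargs.update(hidden_window_kwargs())` instead of repeating the five-line block.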
@@ -142,7 +199,13 @@ def convert_office_to_pdf(
 
                 # Use the working LibreOffice command first, then try alternatives if it fails
                 commands_to_try = [working_libreoffice_cmd]
-                if working_libreoffice_cmd == "libreoffice":
+                
+                # Add alternative commands based on what was found
+                if platform.system() == "Windows" and working_libreoffice_cmd:
+                    # If we're using the full Windows path, also try standard commands
+                    if "Program Files" in working_libreoffice_cmd:
+                        commands_to_try.extend(["soffice", "libreoffice"])
+                elif working_libreoffice_cmd == "libreoffice":
                     commands_to_try.append("soffice")
                 else:
                     commands_to_try.append("libreoffice")
@@ -173,9 +236,12 @@ def convert_office_to_pdf(
 
                         # Hide console window on Windows
                         if platform.system() == "Windows":
-                            convert_subprocess_kwargs["creationflags"] = (
-                                0x08000000  # subprocess.CREATE_NO_WINDOW
-                            )
+                            convert_subprocess_kwargs["creationflags"] = 0x08000000  # subprocess.CREATE_NO_WINDOW
+                            # Also configure startupinfo to hide window
+                            startupinfo = subprocess.STARTUPINFO()
+                            startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
+                            startupinfo.wShowWindow = subprocess.SW_HIDE
+                            convert_subprocess_kwargs["startupinfo"] = startupinfo
 
                         result = subprocess.run(
                             convert_cmd, **convert_subprocess_kwargs
@@ -227,6 +293,10 @@ def convert_office_to_pdf(
                 # Copy PDF to final output directory
                 final_pdf_path = base_output_dir / f"{name_without_suff}.pdf"
                 shutil.copy2(pdf_path, final_pdf_path)
+                
+                print(f"✅ PDF saved to: {final_pdf_path}")
+                print(f"   File size: {final_pdf_path.stat().st_size} bytes")
+                print(f"   Parent folder: {base_output_dir}")
 
                 return final_pdf_path
 
@@ -281,7 +351,15 @@ def convert_text_to_pdf(
             if output_dir:
                 base_output_dir = Path(output_dir)
             else:
-                base_output_dir = text_path.parent / "pdf_output"
+                # Generate a timestamped folder name (1-second resolution) so each run gets its own folder
+                import time
+                timestamp = int(time.time())
+                folder_name = f"paper_{timestamp}"
+                
+                # Save to workspace instead of temp directory
+                workspace_base = Path(os.getcwd()) / "deepcode_lab" / "papers"
+                workspace_base.mkdir(parents=True, exist_ok=True)
+                base_output_dir = workspace_base / folder_name
 
             base_output_dir.mkdir(parents=True, exist_ok=True)
             pdf_path = base_output_dir / f"{text_path.stem}.pdf"
@@ -435,6 +513,10 @@ def convert_text_to_pdf(
                     f"PDF conversion failed for {text_path.name} - generated PDF is empty or corrupted."
                 )
 
+            print(f"✅ PDF saved to: {pdf_path}")
+            print(f"   File size: {pdf_path.stat().st_size} bytes")
+            print(f"   Parent folder: {base_output_dir}")
+            
             return pdf_path
 
         except Exception as e:
@@ -532,27 +614,34 @@ def check_dependencies(self) -> dict:
         }
 
         # Check LibreOffice
-        try:
-            subprocess_kwargs: Dict[str, Any] = {
-                "capture_output": True,
-                "text": True,
-                "check": True,
-                "encoding": "utf-8",
-                "errors": "ignore",
-            }
-
-            if platform.system() == "Windows":
-                subprocess_kwargs["creationflags"] = (
-                    0x08000000  # subprocess.CREATE_NO_WINDOW
-                )
-
-            subprocess.run(["libreoffice", "--version"], **subprocess_kwargs)
-            results["libreoffice"] = True
-        except (subprocess.CalledProcessError, FileNotFoundError):
-            try:
-                subprocess.run(["soffice", "--version"], **subprocess_kwargs)
+        # On Windows, just check if the executable exists (don't run it to avoid window issues)
+        if platform.system() == "Windows":
+            windows_path = PDFConverter.find_libreoffice_windows()
+            if windows_path:
                 results["libreoffice"] = True
-            except (subprocess.CalledProcessError, FileNotFoundError):
+        else:
+            # On non-Windows systems, try running the version command
+            try:
+                subprocess_kwargs: Dict[str, Any] = {
+                    "capture_output": True,
+                    "text": True,
+                    "check": True,
+                    "timeout": 5,
+                    "encoding": "utf-8",
+                    "errors": "ignore",
+                }
+
+                try:
+                    subprocess.run(["libreoffice", "--version"], **subprocess_kwargs)
+                    results["libreoffice"] = True
+                except (subprocess.CalledProcessError, FileNotFoundError, subprocess.TimeoutExpired):
+                    try:
+                        subprocess.run(["soffice", "--version"], **subprocess_kwargs)
+                        results["libreoffice"] = True
+                    except (subprocess.CalledProcessError, FileNotFoundError, subprocess.TimeoutExpired):
+                        pass
+            except Exception:
+                # If any unexpected error occurs during LibreOffice check, silently pass
                 pass
 
         # Check ReportLab
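The nested `try`/`except` probe for `libreoffice` and then `soffice` could be flattened into a loop over candidate commands. The sketch below is one possible shape and assumes the same 5-second timeout the diff adds. `first_working_command` is a hypothetical name, and the extra `OSError` catch (for example a permission error on the binary) is an addition beyond what the diff handles.

```python
import subprocess
from typing import Iterable, Optional


def first_working_command(
    commands: Iterable[str], timeout: float = 5.0
) -> Optional[str]:
    """Return the first command whose `--version` call exits with code 0."""
    for cmd in commands:
        try:
            subprocess.run(
                [cmd, "--version"],
                capture_output=True,
                text=True,
                check=True,
                timeout=timeout,
            )
        except (
            subprocess.CalledProcessError,  # non-zero exit code
            subprocess.TimeoutExpired,      # probe took longer than `timeout`
            OSError,                        # includes FileNotFoundError
        ):
            continue
        return cmd
    return None
```

With this helper the non-Windows branch reduces to `results["libreoffice"] = first_working_command(["libreoffice", "soffice"]) is not None`.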