fix: preserve Chinese characters in permalinks #103
phernandez merged 2 commits into basicmachines-co:main
claude: is this fix safe for permalinks?
The code changes in this PR are safe for permalinks for the following reasons:
The changes align with the UTF-8 handling already exercised in the test files, which provide comprehensive coverage for UTF-8 characters throughout the system. Because permalinks are a critical part of the knowledge graph, preserving Chinese characters improves usability for Chinese users without compromising functionality.
Hi @andyxinweiminicloud, thanks for the fix. I'm going to merge this PR, then add a bit of logic to handle Unicode in permalinks for Latin characters while keeping your fixes for Chinese characters.
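As a rough illustration of the follow-up described in this comment, the sketch below strips accents from Latin letters while leaving Chinese characters alone. It is not the maintainer's code: the function name strip_latin_accents is hypothetical, it uses the standard-library unicodedata module rather than unidecode, and it only handles precomposed characters such as "é" (letters without a decomposition, such as "ø", pass through unchanged).

```python
import unicodedata


def strip_latin_accents(text: str) -> str:
    """Hypothetical helper: remove accents from Latin letters only."""
    out = []
    for ch in text:
        # Decompose a single character, e.g. "é" -> "e" + U+0301.
        decomposed = unicodedata.normalize("NFD", ch)
        base = decomposed[0]
        if len(decomposed) > 1 and base.isascii() and base.isalpha():
            # An ASCII letter followed by combining marks: keep the letter.
            out.append(base)
        else:
            # Chinese and any other non-Latin character is kept as written.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(strip_latin_accents("Café 中文 Señor"))
```

Under these assumptions, accented Latin text becomes plain ASCII while the Chinese portion of the title is untouched.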
Problem
The current implementation converts all Unicode characters to ASCII using unidecode, which causes Chinese characters in permalinks to be transliterated. This makes it difficult for Chinese users to recognize the permalinks, as they no longer match the original Chinese titles.
Solution
Modified the generate_permalink function in utils.py to preserve non-ASCII characters while still properly handling ASCII characters:
unidecode
This change allows Chinese characters to be preserved in permalinks, making them more intuitive and readable for Chinese users while maintaining compatibility with existing systems.
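The idea of keeping non-ASCII characters while still normalizing the ASCII parts of a path can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the name generate_permalink_sketch is hypothetical, it uses only the standard library, and the specific rules (lowercasing, turning spaces and underscores into hyphens, dropping the file extension) are assumptions about what a permalink should look like.

```python
import os
import re


def generate_permalink_sketch(file_path: str) -> str:
    """Hypothetical stand-in for generate_permalink; not the project's code."""
    # Drop the file extension, e.g. "notes/中文 Title.md" -> "notes/中文 Title".
    base, _ext = os.path.splitext(str(file_path))
    # Treat Windows-style separators like forward slashes.
    base = base.replace("\\", "/")

    segments = []
    for segment in base.split("/"):
        # Lowercasing affects Latin letters; Chinese characters are unchanged.
        segment = segment.lower()
        # Runs of whitespace or underscores become a single hyphen.
        segment = re.sub(r"[\s_]+", "-", segment)
        # \w is Unicode-aware for str patterns, so Chinese characters survive
        # while punctuation is removed.
        segment = re.sub(r"[^\w-]", "", segment)
        # Collapse repeated hyphens and trim them from the ends.
        segment = re.sub(r"-{2,}", "-", segment).strip("-")
        if segment:
            segments.append(segment)
    return "/".join(segments)


if __name__ == "__main__":
    print(generate_permalink_sketch("notes/中文 Title.md"))
```

The key design choice in this sketch is filtering with a Unicode-aware character class instead of transliterating, so a Chinese title stays recognizable in the resulting permalink.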
Testing
Successfully tested with various Chinese file paths and titles. Both file paths and permalinks correctly preserve Chinese characters while still properly processing English text.