UnicodeDecodeError (testing files included) #44

@pureair

Environment:

Python version: 3.12.2
adb version: Android Debug Bridge version 1.0.41 / Version 35.0.1-11580240
Operating System: Windows 10

Error Description:

The adbsync command fails with a UnicodeDecodeError when encountering specific folder/file names on the Android device. The error message indicates:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 62: invalid start byte
Full error log is at the end of this report.
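For reference, the same message can be reproduced outside adbsync. The snippet below is a minimal sketch: `raw_line` is a made-up stand-in for one raw line of `ls -la` output (62 ASCII bytes followed by a stray 0xa0), not the actual bytes from the device. It shows that 0xa0 is rejected because it is a UTF-8 continuation byte and can never start a character.

```python
# Hypothetical stand-in for one raw line of `ls -la` output:
# 62 ASCII bytes, then a lone 0xa0 byte, then more ASCII.
raw_line = b"x" * 62 + b"\xa0" + b" rest of line\r\n"

message = None
bad_offset = None
try:
    raw_line.decode("utf-8")
except UnicodeDecodeError as exc:
    # exc.start is the byte offset reported as "position" in the message.
    message = str(exc)
    bad_offset = exc.start

print(message)
print(bad_offset)
```

The reported position is a byte offset into the whole line being decoded, so it may point at a column of the listing other than the file name.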

Investigation:

I have narrowed down the problematic folders/files and included an archive (tester.tar.gz) containing their names and folder structure for further analysis. The contents of the files have been emptied; only the filenames are kept.

Each of the three folders behaves as follows:

  1. It is pushed to the phone successfully the first time (there are no existing files on the phone yet).
  2. It fails to be pulled from the phone.
  3. It fails to be pushed to the phone again (existing files on the phone are examined first).
  4. a) The folder with only English characters can be pulled with "--adb-encoding latin1", but the defect remains (i.e. repeating steps 1, 2 and 3 with the newly pulled files gives the same result). I think this means the filenames are already valid UTF-8, so nothing changes.
    b) The folder with some Chinese characters cannot be pulled with "--adb-encoding latin1", potentially because of bad filenames (error log below):
[INFO] SYNCING
[INFO]
[INFO] Empty delete tree
[INFO]
[INFO] Copying copy tree
[INFO] .\
[INFO] ./éè´é¸-HOYO-MiX - åç¥-éªèç群æ The Stellar Moments\
[INFO] ./éè´é¸-HOYO-MiX - åç¥-éªèç群æ The Stellar Moments/01. Bard's Adventure è¯äººçå·¥ä½.m4a
[CRITICAL] Non-zero exit code from adb pull
[CRITICAL] Exiting

They cannot be pulled with "--adb-encoding gb2312", gbk, gb18030, utf-8, utf-16, etc. either; each attempt fails with 'utf-8' codec can't decode byte 0x0b in position xx (or 0xe9, 0xb8, etc.).
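A short sketch of why latin1 behaves differently from the other encodings: latin1 maps every byte value to a character, so decoding never raises, but UTF-8 file names come out as mojibake like the names in the log above. The sample string is an assumption chosen for illustration and is not taken from the attached archive.

```python
# Example CJK text (an assumption for illustration only).
name = "原神"
utf8_bytes = name.encode("utf-8")        # 6 bytes: 3 per character

# latin1 assigns a character to every byte value 0..255, so this never raises.
as_latin1 = utf8_bytes.decode("latin1")  # mojibake, one character per byte

# The mapping is lossless, so the original text can be recovered.
recovered = as_latin1.encode("latin1").decode("utf-8")

# Every possible byte decodes under latin1.
all_bytes_text = bytes(range(256)).decode("latin1")

print(len(as_latin1), recovered == name, len(all_bytes_text))
```

If the mojibake string is later used as a path without being mapped back to the original bytes, it no longer names the file on the device, which could explain why the subsequent adb pull fails.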

I also looked at the hex of the folder and file names in the folder with only English characters; there is no 0xa0 byte in either the folder name or the file names.
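Since the byte is not in the names themselves, it may help to look at the raw line that fails to decode. The helper below is a hypothetical diagnostic sketch (`describe_decode_failure` is not part of BetterADBSync) that reports the offset of the first undecodable byte together with some surrounding context; the sample line is invented.

```python
def describe_decode_failure(raw: bytes, encoding: str = "utf-8", context: int = 8):
    """Return details about the first undecodable byte in raw, or None if it decodes."""
    try:
        raw.decode(encoding)
        return None
    except UnicodeDecodeError as exc:
        lo = max(exc.start - context, 0)
        hi = min(exc.end + context, len(raw))
        window = raw[lo:hi]
        return {
            "offset": exc.start,
            "byte": raw[exc.start],
            # Space-separated hex dump of the bytes around the failure.
            "context_hex": window.hex(" "),
            # Undecodable bytes are shown as \xNN escapes instead of raising.
            "context_text": window.decode(encoding, errors="backslashreplace"),
        }


# Invented sample: a listing-like line with a stray 0xa0 before the name.
sample = b"drwxrwx--- 2 root " + b"\xa0" + b"name"
info = describe_decode_failure(sample)
print(info)
```

Running this on the bytes returned for the failing directory would show which column of the listing contains the 0xa0 byte.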


Full error log:

PS G:\> adbsync -n pull /sdcard/Music ./
* daemon not running; starting now at tcp:5037
* daemon started successfully
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\USERX\AppData\Local\Programs\Python\Python312\Scripts\adbsync.exe\__main__.py", line 7, in <module>
  File "C:\Users\USERX\AppData\Local\Programs\Python\Python312\Lib\site-packages\BetterADBSync\__init__.py", line 374, in main
    files_tree_source = fs_source.get_files_tree(path_source, follow_links = args.copy_links)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\USERX\AppData\Local\Programs\Python\Python312\Lib\site-packages\BetterADBSync\FileSystems\Base.py", line 45, in get_files_tree
    return self._get_files_tree(tree_path, statObject, follow_links = follow_links)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\USERX\AppData\Local\Programs\Python\Python312\Lib\site-packages\BetterADBSync\FileSystems\Base.py", line 33, in _get_files_tree
    tree[filename] = self._get_files_tree(
                     ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\USERX\AppData\Local\Programs\Python\Python312\Lib\site-packages\BetterADBSync\FileSystems\Base.py", line 30, in _get_files_tree
    for filename, stat_object_child, in self.lstat_in_dir(tree_path):
  File "C:\Users\USERX\AppData\Local\Programs\Python\Python312\Lib\site-packages\BetterADBSync\FileSystems\Android.py", line 176, in lstat_in_dir
    for line in self.adb_shell(["ls", "-la", path]):
  File "C:\Users\USERX\AppData\Local\Programs\Python\Python312\Lib\site-packages\BetterADBSync\FileSystems\Android.py", line 87, in adb_shell
    adb_line = adb_line.decode(self.adb_encoding).rstrip("\r\n")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 62: invalid start byte
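The last frame of the traceback decodes each line of adb output with the default strict error handling. One possible direction, offered only as a sketch and not as the project's fix, is to decode with `errors="surrogateescape"`, which keeps undecodable bytes as lone surrogates so they can be converted back to the exact original bytes later. The function names below are hypothetical, and whether the rest of BetterADBSync can handle such strings is an open assumption.

```python
def decode_adb_line(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode one line of adb output without raising on invalid bytes."""
    # Invalid bytes become lone surrogates (for example 0xa0 -> U+DCA0).
    return raw.decode(encoding, errors="surrogateescape").rstrip("\r\n")


def encode_adb_text(text: str, encoding: str = "utf-8") -> bytes:
    """Turn a string produced by decode_adb_line back into the original bytes."""
    return text.encode(encoding, errors="surrogateescape")


raw = b"abc\xa0def\r\n"           # invented line containing a stray 0xa0
text = decode_adb_line(raw)
round_tripped = encode_adb_text(text)
print(repr(text), round_tripped)
```

Strings containing lone surrogates cannot be printed or written with a strict encoder, so logging would need `errors="backslashreplace"` or similar.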
