Gaggle uses standardized error codes to make error handling more predictable and debugging easier. Each error carries a code (E001 to E010) that can be matched programmatically. When troubleshooting, look for the bracketed code (like [E003]) and refer to the corresponding section below.
All errors follow this format:
[Exxx] Error description: additional details
Example:
[E002] Dataset not found: owner/invalid-dataset
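Because the code always appears in brackets at the start of the message, a script can pull it out and branch on it. The following is a minimal sketch, assuming the message is already available as a string; the helper name gaggle_error_code and the suggested actions are illustrative and not part of Gaggle.

```shell
# Hypothetical helper: print the bracketed code (e.g. E002) from a Gaggle
# error message. Prints nothing and returns 1 if no code is found.
gaggle_error_code() {
  code=$(printf '%s\n' "$1" | sed -n 's/^\[\(E[0-9][0-9][0-9]\)].*/\1/p')
  [ -n "$code" ] && printf '%s\n' "$code"
}

msg='[E002] Dataset not found: owner/invalid-dataset'

# Branch on the extracted code; the actions are placeholders.
case "$(gaggle_error_code "$msg")" in
  E001) echo "check credentials" ;;
  E002) echo "check dataset path" ;;
  E003) echo "retry later" ;;
  *)    echo "unhandled error" ;;
esac
```

Matching on the code rather than the description keeps scripts working if the wording of a message changes.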
Description:
Kaggle API credentials are invalid, missing, or incorrectly formatted.
Common Causes:
- Wrong username or API key
- Missing credentials (no environment variables or kaggle.json)
- Expired API key
- Incorrectly formatted kaggle.json file
Example:
[E001] Invalid Kaggle credentials: Username or API key not found
Solutions:
- Set credentials via SQL:
  select gaggle_set_credentials('your-username', 'your-api-key');
- Or via environment variables:
  export KAGGLE_USERNAME=...
  export KAGGLE_KEY=...
- Or create ~/.kaggle/kaggle.json with your username and key (chmod 600)
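For the kaggle.json option, a small script can create the file with restrictive permissions in one step. This is a sketch, assuming the standard Kaggle layout with "username" and "key" fields; the function name write_kaggle_json is hypothetical, and the values shown in the usage line are placeholders.

```shell
# Hypothetical helper: write kaggle.json into the given directory so that
# only the owner can read or write it.
# ASSUMPTION: username and key contain no quotes or backslashes, since the
# values are not JSON-escaped here.
write_kaggle_json() {
  dir=$1 user=$2 key=$3
  mkdir -p "$dir" || return 1
  (
    umask 077   # the file is never created with wider permissions
    printf '{"username": "%s", "key": "%s"}\n' "$user" "$key" > "$dir/kaggle.json"
  ) || return 1
  chmod 600 "$dir/kaggle.json"
}

# Usage (placeholders, not real credentials):
# write_kaggle_json "$HOME/.kaggle" 'your-username' 'your-api-key'
```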
Description:
The requested dataset does not exist on Kaggle or is not accessible.
Common Causes:
- Typo in dataset path
- Dataset was deleted or made private
- Wrong owner name
- Dataset requires special permissions
Example:
[E002] Dataset not found: owner/nonexistent-dataset
Solutions:
- Verify dataset path on Kaggle:
  - Visit https://www.kaggle.com/datasets/owner/dataset-name
  - Check spelling and owner name
- Search for the dataset:
  select gaggle_search('dataset keywords', 1, 10);
- Check dataset availability:
  - Check dataset is public
  - Verify you have access rights
Description:
Network error occurred during communication with Kaggle API.
Common Causes:
- No internet connection
- Kaggle API is down
- Firewall blocking requests
- Request timed out
- Rate limiting
Example:
[E003] HTTP request failed: Connection timeout after 30s
Solutions:
- Check internet connection:
  ping www.kaggle.com
- Increase timeout:
  export GAGGLE_HTTP_TIMEOUT=120 # 2 minutes
- Check Kaggle API status:
  - Check https://www.kaggle.com is accessible
- Retry with backoff:
  export GAGGLE_HTTP_RETRY_ATTEMPTS=5
  export GAGGLE_HTTP_RETRY_DELAY=2
  export GAGGLE_HTTP_RETRY_MAX_DELAY=30
- Check firewall settings:
  - Check outbound HTTPS (port 443) is allowed
  - Check corporate proxy settings
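To get a feel for how the three retry settings interact, the sketch below prints a wait schedule. It assumes the delay is in seconds and doubles after each failed attempt until it reaches the maximum; that doubling rule is an illustration only, and Gaggle's actual schedule may differ. The function name backoff_schedule is hypothetical, and the fallback values simply mirror the example settings above rather than documented defaults.

```shell
# Hypothetical helper: print the wait before each retry.
# ASSUMPTION: delay is in seconds, doubles per attempt, and is capped at
# GAGGLE_HTTP_RETRY_MAX_DELAY. Gaggle's real behavior may differ.
backoff_schedule() {
  attempts=${GAGGLE_HTTP_RETRY_ATTEMPTS:-5}
  delay=${GAGGLE_HTTP_RETRY_DELAY:-2}
  max_delay=${GAGGLE_HTTP_RETRY_MAX_DELAY:-30}
  i=1
  while [ "$i" -le "$attempts" ]; do
    echo "retry $i: wait ${delay}s"
    delay=$((delay * 2))
    # Never wait longer than the configured maximum.
    [ "$delay" -gt "$max_delay" ] && delay=$max_delay
    i=$((i + 1))
  done
}

backoff_schedule
```

Under these assumptions, the values 5, 2, and 30 give waits of 2, 4, 8, 16, and 30 seconds, so raising the maximum delay only matters once the doubled delay would exceed it.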
Description:
Dataset path format is invalid or contains forbidden characters.
Common Causes:
- Missing slash in path
- Path traversal attempts (../)
- Too many path components
- Control characters in path
- Path too long (>4096 characters)
Example:
[E004] Invalid dataset path: Must be in format 'owner/dataset-name'
Valid Path Format:
owner/dataset-name
owner/dataset-name@v2 (with version)
Invalid Paths:
ownerdataset # Missing slash
owner/dataset/extra # Too many components
../dataset # Path traversal
owner/. # Dot component
Solutions:
- Use correct format:
  select gaggle_download('owner/dataset-name');
- Check for special characters:
  - Avoid: "..", ".", and control characters
  - Allowed: letters, numbers, hyphens, underscores
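The rules above can be checked before a path is ever passed to Gaggle. The following sketch approximates them in a shell function; the name is_valid_dataset_path is hypothetical, the version suffix is assumed to be "@v" followed by digits, and Gaggle's own validation may accept or reject cases this check does not cover.

```shell
# Hypothetical pre-check that approximates the path rules described above.
is_valid_dataset_path() {
  path=$1
  # Length limit taken from the list of common causes.
  [ "${#path}" -le 4096 ] || return 1
  # Reject any character outside the allowed set; this also rules out
  # dots, whitespace, and control characters.
  case $path in
    *[!A-Za-z0-9_/@-]*) return 1 ;;
  esac
  # Exactly two components, optionally followed by @v<number> (assumed form).
  printf '%s\n' "$path" | grep -Eq '^[A-Za-z0-9_-]+/[A-Za-z0-9_-]+(@v[0-9]+)?$'
}

for p in owner/dataset-name owner/dataset-name@v2 ownerdataset \
         owner/dataset/extra ../dataset owner/.; do
  if is_valid_dataset_path "$p"; then
    echo "ok       $p"
  else
    echo "invalid  $p"
  fi
done
```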
Description:
Error reading from or writing to the file system.
Common Causes:
- Insufficient disk space
- Permission denied
- File not found
- Directory not writable
- Disk full
Example:
[E005] IO error: Permission denied (os error 13)
Solutions:
- Check disk space:
  df -h
- Check permissions:
  ls -la ~/.cache/gaggle_cache
  chmod -R u+rw ~/.cache/gaggle_cache
- Verify cache directory:
  select gaggle_cache_info();
- Change cache directory:
  export GAGGLE_CACHE_DIR=/path/with/space
- Clean up cache:
  select gaggle_clear_cache();
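The disk-space and permission checks can be combined into one quick diagnostic. This is a sketch: the function name check_cache_dir is hypothetical, and it assumes the cache lives at GAGGLE_CACHE_DIR or, failing that, at the ~/.cache/gaggle_cache path used in the commands above.

```shell
# Hypothetical diagnostic: confirm the cache directory exists and is
# writable, then report free space on its filesystem.
# ASSUMPTION: the fallback path matches the one used in the commands above.
check_cache_dir() {
  dir=${1:-${GAGGLE_CACHE_DIR:-$HOME/.cache/gaggle_cache}}
  [ -d "$dir" ] || { echo "missing: $dir"; return 1; }
  [ -w "$dir" ] || { echo "not writable: $dir"; return 1; }
  # df -P gives a stable one-line-per-filesystem layout; column 4 is the
  # available space in 1 KB blocks.
  avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  echo "ok: $dir (${avail_kb} KB free)"
}

# Usage:
# check_cache_dir                  # check the configured cache directory
# check_cache_dir /path/with/space # check a candidate directory first
```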
Description:
Error parsing or serializing JSON data.
Common Causes:
- Corrupted cache metadata
- Invalid JSON response from Kaggle API
- Encoding issues
- Malformed JSON
Example:
[E006] JSON serialization error: expected `,` or `}` at line 5 column 10
Solutions:
- Clear cache:
  select gaggle_clear_cache();
- Re-download dataset:
  select gaggle_update_dataset('owner/dataset');
- Check Kaggle API response manually:
  curl -u username:key https://www.kaggle.com/api/v1/datasets/view/owner/dataset
Description:
Error extracting downloaded ZIP file.
Common Causes:
- Corrupted download
- ZIP bomb protection triggered (>10GB uncompressed)
- Path traversal in ZIP
- Symlinks in ZIP
- Invalid ZIP format
Example:
[E007] ZIP extraction failed: ZIP file too large (exceeds 10GB)
Solutions:
- Re-download dataset:
  select gaggle_update_dataset('owner/dataset');
- Check dataset size:
  select gaggle_info('owner/dataset');
- For large datasets:
  - Note: the 10GB uncompressed limit is a security feature
  - Consider using a different dataset or a smaller subset
- Check ZIP integrity:
  unzip -t /path/to/dataset.zip
Description:
Error parsing CSV file format.
Common Causes:
- Malformed CSV
- Inconsistent column count
- Invalid quotes or delimiters
- Encoding issues
Example:
[E008] CSV parsing error: record 145 has different field count
Solutions:
- Check CSV format:
  head -20 /path/to/file.csv
- Use DuckDB's flexible CSV reader:
  select * from read_csv_auto('kaggle:owner/dataset/file.csv', ignore_errors := true);
- Try different parser options:
  select * from read_csv('kaggle:owner/dataset/file.csv', delim := ';', quote := '"', escape := '\');
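When the error reports an inconsistent field count, it can help to locate the offending lines before changing parser options. The sketch below does a naive count with awk; the function name csv_field_mismatches is hypothetical, and it assumes no quoted field contains the delimiter or a newline, so treat its output as a hint rather than a verdict.

```shell
# Hypothetical helper: list lines whose field count differs from the header.
# Usage: csv_field_mismatches FILE [DELIMITER]   (delimiter defaults to ,)
# ASSUMPTION: naive split; quoted fields containing the delimiter or a
# newline are not handled.
csv_field_mismatches() {
  awk -F"${2:-,}" '
    NR == 1 { n = NF; next }   # remember the header width
    NF != n { print NR ": " NF " fields, expected " n }
  ' "$1"
}

# Usage:
# csv_field_mismatches /path/to/file.csv
# csv_field_mismatches /path/to/file.csv ';'
```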
Description:
String is not valid UTF-8.
Common Causes:
- Binary data in string field
- Wrong character encoding
- Corrupted data
- FFI boundary issues
Example:
[E009] Invalid UTF-8 string
Solutions:
- Check file encoding:
  file -i /path/to/file.csv
- Convert to UTF-8:
  iconv -f ISO-8859-1 -t UTF-8 input.csv > output.csv
- Use DuckDB encoding options:
  select * from read_csv('file.csv', encoding := 'ISO-8859-1');
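Before converting a file, it is worth confirming that it really is invalid UTF-8. One way to do this, sketched below, is to ask iconv to convert the file from UTF-8 to UTF-8 and check whether it succeeds. The function name is_valid_utf8 is hypothetical, and the check relies on the local iconv exiting with a non-zero status on an invalid byte sequence, which is an assumption about the installed implementation.

```shell
# Hypothetical helper: succeed only if the file decodes cleanly as UTF-8.
# ASSUMPTION: the local iconv exits non-zero on an invalid byte sequence.
is_valid_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Usage:
# if is_valid_utf8 input.csv; then
#   echo "already UTF-8"
# else
#   iconv -f ISO-8859-1 -t UTF-8 input.csv > output.csv
# fi
```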
Description:
NULL pointer passed to FFI function.
Common Causes:
- Internal programming error
- Invalid function call
- Memory corruption
Example:
[E010] Null pointer passed
Solutions:
- This is typically an internal error
- Report as a bug if you encounter this
- Include reproduction steps