I can take this up after #1194 is merged, as some modifications have already been made there. @acul71 @yashksaini-coder
Phase Description for Original Bitswap py-cid Migration Issue
This document describes the full 4-phase plan for completing the original Bitswap py-cid migration work.
Phase 1: Dependency + Compatibility Layer
What we will do
Phase 2: py-cid Core Refactor in
Current status:
Contributions for Phase 3 or Phase 4 are welcome; please coordinate on issue #1181 or in this discussion to avoid overlapping PRs.
py-libp2p CID Status and py-cid Integration Opportunities
Executive Summary
This document analyzes the current CID (Content Identifier) implementation in py-libp2p and identifies opportunities to leverage the full-featured py-cid library to improve code quality, maintainability, and standards compliance.
Current State: py-libp2p uses a simplified, custom CID implementation in libp2p/bitswap/cid.py that handles basic CID operations but lacks many features available in the standard py-cid library.
Recommendation: Migrate to py-cid to gain proper CID encoding/decoding, multibase support, builder patterns, and other advanced features that would significantly improve the codebase.
Current CID Implementation
Location: libp2p/bitswap/cid.py
The current implementation provides basic CID functionality:
Features:
Limitations:
CODEC_DAG_PB = 0x70 and CODEC_RAW = 0x55)
Current Usage Pattern:
py-cid Features Overview
Core Features
Full CIDv0 and CIDv1 Support
Builder Pattern
- V0Builder() and V1Builder(codec, mh_type) for fluent CID creation
Prefix Operations
- Prefix class for CID metadata (version, codec, multihash type/length)
CIDSet
JSON/IPLD Format Support
- to_json_dict() / from_json_dict() for IPLD JSON format
- CIDJSONEncoder for JSON serialization
Advanced Parsing
- /ipfs/ path parsing (extracts CID from URLs/paths)
- from_reader() for stream parsing
- from_bytes_strict() with trailing bytes validation
- extract_encoding() to detect multibase encoding
Version Conversion
- cidv0.to_v1() and cidv1.to_v0() (when codec allows)
Utility Methods
- to_bytes(), to_text(), from_text() for marshaling
- key_string() for use as dictionary keys
- loggable() for logging
- defined() to check CID validity
Integration Opportunities
1. Replace Custom CID Implementation
Location: libp2p/bitswap/cid.py
Current Issues:
Improvement:
Replace the custom implementation with py-cid wrapper functions that maintain backward compatibility while providing full functionality.
Example Migration:
Files Affected:
- libp2p/bitswap/cid.py - Complete rewrite
- libp2p/bitswap/__init__.py - Update exports
- bitswap.cid
2. Improve CID String Representation
Location: Throughout libp2p/bitswap/
Current Issue:
CIDs are represented as hex strings using the .hex() method, which is not a standard CID string format.
Current Pattern:
Improvement:
Use proper CID string encoding (base58 for v0, multibase for v1).
Example:
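A minimal, stdlib-only sketch of the difference between the hex form and a standard CIDv1 string. It does not use py-cid; the helper names `cidv1_bytes` and `cidv1_to_base32` are hypothetical and only illustrate the binary layout (version, codec, multihash) and the base32 multibase text form. It assumes sha2-256 and codec codes that fit in a single varint byte.

```python
import base64
import hashlib

# Multicodec / multihash codes. CODEC_RAW matches the constant named in this document.
CODEC_RAW = 0x55
SHA2_256 = 0x12


def cidv1_bytes(data: bytes, codec: int = CODEC_RAW) -> bytes:
    """Build a binary CIDv1: <version><codec><multihash>.

    Assumption: every code used here is < 0x80, so each varint is one byte.
    """
    digest = hashlib.sha256(data).digest()
    multihash = bytes([SHA2_256, len(digest)]) + digest
    return bytes([0x01, codec]) + multihash


def cidv1_to_base32(cid: bytes) -> str:
    """Multibase base32: prefix 'b' + lowercase RFC 4648 base32 without padding."""
    return "b" + base64.b32encode(cid).decode("ascii").lower().rstrip("=")


cid = cidv1_bytes(b"hello")
print(cid.hex())             # hex form: not a standard CID string
print(cidv1_to_base32(cid))  # standard CIDv1 text form, starts with "bafkrei"
```

The hex form is only meaningful inside this codebase, while the base32 form is what other IPFS tools expect to see and accept.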
Files Affected:
- libp2p/bitswap/client.py - ~20 occurrences
- libp2p/bitswap/dag.py - ~15 occurrences
- libp2p/bitswap/dag_pb.py - ~2 occurrences
- examples/bitswap/bitswap.py - ~10 occurrences
3. Enhance Block Store with CID Objects
Location: libp2p/bitswap/block_store.py
Current Issue:
Block store uses raw bytes as keys, which makes CID operations difficult.
Current Implementation:
Improvement:
Use CID objects as keys (they're hashable) or add CID-aware methods.
Example:
Benefits:
4. Add CIDSet for Wantlist Management
Location: libp2p/bitswap/client.py
Current Issue:
The wantlist uses a dictionary with bytes keys; a CIDSet would give it deduplication and efficient set operations.
Current Implementation:
Improvement:
Use CIDSet for tracking unique CIDs, with separate metadata storage.
Example:
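A rough sketch of the "set plus separate metadata" layout described above. Everything here is hypothetical: a built-in `set` stands in for py-cid's CIDSet, raw `bytes` stand in for CID objects, and `WantlistSketch` is not an existing py-libp2p class.

```python
from dataclasses import dataclass, field


@dataclass
class WantlistSketch:
    """Hypothetical wantlist: a set for membership, a dict for per-CID metadata."""

    wanted: set[bytes] = field(default_factory=set)
    priority: dict[bytes, int] = field(default_factory=dict)

    def add(self, cid: bytes, priority: int = 1) -> bool:
        """Add a CID; return False if it was already wanted (priority is still updated)."""
        is_new = cid not in self.wanted
        self.wanted.add(cid)
        self.priority[cid] = priority
        return is_new

    def remove(self, cid: bytes) -> None:
        """Forget a CID and its metadata; a no-op if it was never wanted."""
        self.wanted.discard(cid)
        self.priority.pop(cid, None)


wantlist = WantlistSketch()
wantlist.add(b"cid-a", priority=5)
wantlist.add(b"cid-a", priority=7)  # duplicate: the set stays at one entry
print(len(wantlist.wanted), wantlist.priority[b"cid-a"])
```

Keeping membership in a set makes questions such as "which wanted CIDs did this peer just announce?" a plain set intersection.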
Benefits:
5. Implement Proper CID Parsing from Strings
Location: examples/bitswap/bitswap.py
Current Issue:
Example code manually parses hex strings, which doesn't support standard CID string formats.
Current Pattern:
Improvement:
Use py-cid's from_string(), which supports:
- CIDv0 strings (e.g., Qm...)
- CIDv1 strings (e.g., bafy...)
- /ipfs/ paths (e.g., /ipfs/Qm...)
- URLs (e.g., https://ipfs.io/ipfs/Qm...)
Example:
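To illustrate the input shapes listed above, here is a stdlib-only sketch that isolates the CID portion of a string. It is not py-cid's algorithm and it does not decode or validate the CID; `extract_cid_string` and the `QmExample` / `bafyExample` placeholders are hypothetical.

```python
from urllib.parse import urlparse


def extract_cid_string(value: str) -> str:
    """Return the CID portion of a bare CID string, an /ipfs/ path, or a gateway URL."""
    text = value.strip()
    # For URLs, keep only the path component (e.g. "/ipfs/<cid>/file").
    if "://" in text:
        text = urlparse(text).path
    marker = "/ipfs/"
    if marker in text:
        text = text.split(marker, 1)[1]
    # Drop any sub-path that follows the CID.
    cid_str = text.split("/", 1)[0]
    if not cid_str:
        raise ValueError(f"no CID found in {value!r}")
    return cid_str


# Placeholder strings, not real CIDs.
for arg in ("QmExample", "/ipfs/bafyExample", "https://ipfs.io/ipfs/QmExample/readme"):
    print(extract_cid_string(arg))
```

In the example script, a helper like this would sit in front of the command-line argument handling, with the result then passed to the real CID parser.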
Files Affected:
- examples/bitswap/bitswap.py - Command-line argument parsing
6. Add Builder Pattern for CID Creation
Location: libp2p/bitswap/dag.py, libp2p/bitswap/cid.py
Current Issue:
CID creation is done with low-level functions, making it error-prone.
Current Pattern:
Improvement:
Use builder pattern for clearer, more maintainable code.
Example:
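The idea behind a builder can be sketched without py-cid: fix the codec and hash type once, then reuse the object for every block. `V1BuilderSketch` and its `sum` method are hypothetical names modelled on the V1Builder(codec, mh_type) signature mentioned earlier; the real py-cid method names may differ. The sketch assumes sha2-256 and single-byte codec codes.

```python
import hashlib
from dataclasses import dataclass

SHA2_256 = 0x12  # multihash code for sha2-256


@dataclass(frozen=True)
class V1BuilderSketch:
    """Hypothetical builder: fix codec and hash type once, then create many CIDs."""

    codec: int
    mh_type: int = SHA2_256

    def sum(self, data: bytes) -> bytes:
        """Hash data and return binary CIDv1 bytes."""
        if self.mh_type != SHA2_256:
            raise ValueError("this sketch only supports sha2-256")
        if not 0 <= self.codec < 0x80:
            raise ValueError("this sketch only supports single-byte codec codes")
        digest = hashlib.sha256(data).digest()
        return bytes([0x01, self.codec, self.mh_type, len(digest)]) + digest


dag_pb = V1BuilderSketch(codec=0x70)  # 0x70 is CODEC_DAG_PB in this document
cids = [dag_pb.sum(chunk) for chunk in (b"chunk-1", b"chunk-2")]
print([c.hex()[:8] for c in cids])
```

Because the codec and hash type live in one place, a caller cannot accidentally mix them between blocks of the same DAG.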
Benefits:
7. Add Prefix Support for Bitswap v1.1.0+
Location: libp2p/bitswap/cid.py
Current Issue:
Custom prefix extraction doesn't leverage py-cid's Prefix class.
Current Implementation:
Improvement:
Use py-cid's Prefix class for proper prefix operations.
Example:
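As a sketch of what a prefix contains, the following stdlib-only code reads the four leading varints of a binary CIDv1 (version, codec, multihash type, multihash length). `PrefixSketch`, `read_varint`, and `prefix_of` are hypothetical helpers for illustration, not py-cid's Prefix API.

```python
import hashlib
from typing import NamedTuple


class PrefixSketch(NamedTuple):
    """Hypothetical stand-in for CID prefix metadata."""

    version: int
    codec: int
    mh_type: int
    mh_length: int


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 varint at pos; return (value, next position)."""
    value = shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def prefix_of(cid: bytes) -> PrefixSketch:
    """Read the four leading varints of a binary CIDv1."""
    version, pos = read_varint(cid, 0)
    if version != 1:
        raise ValueError("this sketch only handles CIDv1")
    codec, pos = read_varint(cid, pos)
    mh_type, pos = read_varint(cid, pos)
    mh_length, pos = read_varint(cid, pos)
    return PrefixSketch(version, codec, mh_type, mh_length)


digest = hashlib.sha256(b"hello").digest()
cid = bytes([0x01, 0x70, 0x12, len(digest)]) + digest
print(prefix_of(cid))
```

The prefix carries everything a receiver needs to recompute a block's CID from its data, which is why it is useful to expose it as a structured value rather than a byte slice.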
Benefits:
8. Add JSON/IPLD Format Support
Location: API endpoints, serialization code
Current Issue:
No support for IPLD JSON format ({"/": "<cid-string>"}).
Improvement:
Add JSON serialization support for APIs and data exchange.
Example:
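The IPLD JSON link shape is small enough to sketch with the standard library. The functions below are hypothetical module-level helpers that work on CID strings, not py-cid's to_json_dict() / from_json_dict() methods, and `bafyExample` is a placeholder rather than a real CID.

```python
import json


def cid_to_json_dict(cid_str: str) -> dict[str, str]:
    """Wrap a CID string in the IPLD JSON link form {"/": "<cid-string>"}."""
    return {"/": cid_str}


def cid_from_json_dict(obj: dict) -> str:
    """Unwrap an IPLD JSON link, rejecting anything with a different shape."""
    if set(obj) != {"/"} or not isinstance(obj["/"], str):
        raise ValueError("not an IPLD JSON link")
    return obj["/"]


# Round-trip a placeholder CID string through JSON text.
encoded = json.dumps({"name": "readme", "link": cid_to_json_dict("bafyExample")})
decoded = json.loads(encoded)
print(encoded)
print(cid_from_json_dict(decoded["link"]))
```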
Use Cases:
9. Improve CID Validation and Error Handling
Location: Throughout bitswap module
Current Issue:
Limited validation, unclear error messages.
Improvement:
Use py-cid's validation and error handling.
Example:
Benefits:
10. Add Stream Parsing Support
Location: Protocol message parsing
Current Issue:
No support for parsing CIDs from streams.
Improvement:
Use from_reader() for efficient stream parsing.
Example:
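Stream parsing means consuming exactly the bytes of one CID and leaving the rest of the stream untouched. The sketch below shows that idea for binary CIDv1 using only the standard library; `read_varint` and `read_cidv1` are hypothetical helpers, not py-cid's from_reader().

```python
import hashlib
import io
from typing import BinaryIO


def read_varint(stream: BinaryIO) -> int:
    """Read one unsigned LEB128 varint from the stream, byte by byte."""
    value = shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside a varint")
        value |= (chunk[0] & 0x7F) << shift
        if not chunk[0] & 0x80:
            return value
        shift += 7


def read_cidv1(stream: BinaryIO) -> tuple[int, int, bytes]:
    """Consume exactly one binary CIDv1; return (codec, mh_type, digest)."""
    version = read_varint(stream)
    if version != 1:
        raise ValueError(f"unsupported CID version {version}")
    codec = read_varint(stream)
    mh_type = read_varint(stream)
    mh_length = read_varint(stream)
    digest = stream.read(mh_length)
    if len(digest) != mh_length:
        raise EOFError("stream ended inside the multihash digest")
    return codec, mh_type, digest


digest = hashlib.sha256(b"hello").digest()
cid = bytes([0x01, 0x55, 0x12, len(digest)]) + digest
stream = io.BytesIO(cid + b"next message")
codec, mh_type, parsed = read_cidv1(stream)
print(hex(codec), hex(mh_type), parsed == digest)
print(stream.read())  # bytes after the CID are left for the caller
```

Because the length is read from the multihash header, the caller does not need to know the CID size in advance or buffer the whole message.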
Use Cases:
Migration Strategy
Phase 1: Add py-cid as Dependency
Add py-cid to pyproject.toml:
Update imports to include py-cid alongside the current implementation
Phase 2: Create Compatibility Layer
Create wrapper functions in libp2p/bitswap/cid.py that:
Example wrapper:
Phase 3: Gradual Migration
Phase 4: Complete Migration
Benefits of Migration
1. Standards Compliance
2. Code Quality
3. Developer Experience
4. Features
/ipfs/...)
5. Performance
6. Future-Proofing
Specific Code Locations for Improvement
High Priority
- libp2p/bitswap/cid.py (236 lines)
- libp2p/bitswap/block_store.py (109 lines)
- libp2p/bitswap/client.py (859 lines)
- libp2p/bitswap/dag.py (585 lines)
- examples/bitswap/bitswap.py (400+ lines)
Medium Priority
- libp2p/bitswap/dag_pb.py (269 lines)
- libp2p/bitswap/__init__.py (64 lines)
- Tests (tests/core/bitswap/test_cid.py)
Low Priority
Documentation
Type Hints
Example: Complete Migration of One Function
Before (Current Implementation)
After (With py-cid)
Conclusion
Migrating py-libp2p to use py-cid would provide significant benefits:
The migration can be done gradually with a compatibility layer, minimizing breaking changes while providing immediate benefits for new code.
Recommended Next Steps:
libp2p/bitswap/cid.py
References
libp2p/bitswap/cid.py