IPFS Datasets Python Documentation

Welcome to the comprehensive documentation for IPFS Datasets Python - a production-ready decentralized AI data platform.

🚀 Quick Navigation

For New Users

For Developers

Latest Features (February 2026)

🗄️ IPLD Vector Database - Production-ready distributed vector search

  • Vector Store Guides - Implementation guides
  • 18 MCP tools for vector operations
  • 95% test coverage, 150+ tests

📚 Knowledge Graphs Enhanced - Modular extraction and query system

  • Knowledge Graph Guides - Complete documentation
  • 110+ new tests with comprehensive coverage
  • Unified query engine with hybrid search

📖 Documentation Reorganized - Clean, structured hierarchy

  • Archive - Historical reports and planning docs
  • Guides - 45 organized feature guides
  • Reduced clutter: 85% fewer files in docs root

Documentation Structure

Core Documentation (Root Level)

Essential guides for all users:

Organized Documentation Directories

guides/ - Feature Guides and How-To Documentation

Organized by feature and component:

  • guides/knowledge_graphs/ - Knowledge graph documentation (16 guides)

    • Implementation guides, migration paths, quick references
    • Entity extraction, relationship mapping, query engine
  • guides/processors/ - Processor subsystem documentation (29 guides)

    • Architecture, migration guides, quick references
    • File conversion, multimedia processing, data transformation
  • guides/deployment/ - Deployment and runner setup guides

  • guides/tools/ - Tool-specific documentation (MCP, scrapers, web search)

  • guides/infrastructure/ - Infrastructure and CI/CD guides

  • guides/security/ - Security, audit logging, and governance

  • guides/reference/ - API reference and technical documentation

tutorials/ - Step-by-Step Tutorials

Hands-on tutorials for specific features and use cases.

examples/ - Usage Examples

Code samples and practical examples for common scenarios.

architecture/ - System Architecture

Technical design documents and architecture diagrams.

reports/ - Project Reports

Historical completion reports and project summaries (44+ files).

archive/ - Archived Documentation

Historical documentation and deprecated content.

Component Documentation

Direct links to module documentation:

  • Vector Stores - IPLD vector database, FAISS, Qdrant, Elasticsearch
  • Embeddings - Embedding generation and management
  • Search - Advanced search including RAG and GraphRAG
  • Knowledge Graphs - Extraction, query, and storage
  • PDF Processing - PDF analysis and processing
  • Multimedia - Media processing capabilities
  • LLM - Language model integration
  • MCP Tools - 200+ tools for AI assistants
  • IPLD - InterPlanetary Linked Data
  • Audit - Security and audit logging

🔍 Finding Documentation

By Topic

  • Vector Search & Embeddings: See guides/knowledge_graphs/ and ../ipfs_datasets_py/vector_stores/
  • Knowledge Graphs: See guides/knowledge_graphs/ for all KG documentation
  • File Processing: See guides/processors/ for file conversion and multimedia
  • MCP Integration: See guides/tools/ for MCP server and tool documentation
  • Deployment: See guides/deployment/ for production deployment guides
  • Security: See guides/security/ for audit logging and governance

By Use Case

  • Getting Started: Start with getting_started.md and installation.md
  • Building AI Applications: See user_guide.md and MCP documentation
  • Contributing Code: See developer_guide.md and architecture docs
  • Production Deployment: See guides/deployment/ and guides/infrastructure/

Documentation Maintenance

Guidelines

  1. Centralization: All documentation lives in the docs/ directory
  2. Organization: Follow the established structure for different doc types
  3. Cross-Referencing: Use relative links between documentation files
  4. Archive Old Docs: Move completed session/phase reports to archive/
  5. Update Guides: Keep permanent guides in guides/ up to date
  6. Index Files: Each subdirectory should include a README.md or index file
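Guideline 6 lends itself to an automated check. The following is a minimal sketch, not part of this repository: the function name `dirs_missing_index` and the accepted index file names (`README.md`, `index.md`) are assumptions chosen for illustration. It walks a docs tree and lists every subdirectory that has no index file.

```python
from pathlib import Path

# Assumed index file names; adjust to whatever the project actually accepts.
INDEX_NAMES = ("README.md", "index.md")


def dirs_missing_index(docs_root):
    """Return subdirectories of docs_root (recursive) that lack an index file.

    Paths are returned relative to docs_root, in sorted order, using
    forward slashes so the output is the same on every platform.
    """
    root = Path(docs_root)
    missing = []
    for directory in sorted(p for p in root.rglob("*") if p.is_dir()):
        has_index = any((directory / name).is_file() for name in INDEX_NAMES)
        if not has_index:
            missing.append(directory.relative_to(root).as_posix())
    return missing
```

Calling `dirs_missing_index("docs")` from the repository root would return an empty list when every subdirectory follows the guideline, which makes the function easy to wire into a CI step or a pre-commit hook.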

Build and publish docs site

The repository includes a root-level mkdocs.yml that publishes the site from docs/, including the generated API pages in docs/api/.

  • Build site locally: mkdocs build
  • Serve docs locally: mkdocs serve

If MkDocs is not installed in your environment, install it with pip install mkdocs.
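For scripted builds, the same commands can be wrapped in a small Python helper. This is a sketch under stated assumptions: the helper names `mkdocs_command` and `run_mkdocs` are hypothetical and not part of this repository, and the only MkDocs features used are the `build` and `serve` subcommands and the `--strict` flag.

```python
from __future__ import annotations

import shutil
import subprocess
import sys


def mkdocs_command(action: str = "build", strict: bool = False) -> list[str]:
    """Return the MkDocs command line for "build" or "serve"."""
    if action not in {"build", "serve"}:
        raise ValueError(f"unsupported action: {action!r}")
    cmd = ["mkdocs", action]
    if strict:
        # --strict makes MkDocs fail on warnings such as broken links.
        cmd.append("--strict")
    return cmd


def run_mkdocs(action: str = "build", strict: bool = False) -> int:
    """Run MkDocs from the current directory and return its exit code."""
    if shutil.which("mkdocs") is None:
        # Mirror the install hint given in this README.
        print("MkDocs not found; install it with: pip install mkdocs",
              file=sys.stderr)
        return 1
    return subprocess.run(mkdocs_command(action, strict)).returncode
```

Run `run_mkdocs("build", strict=True)` from the repository root, where mkdocs.yml lives, so that MkDocs can find its configuration.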

Recent Reorganization (February 2026)

The documentation was comprehensively reorganized to improve navigation:

  • Archived 100+ files: Session reports, phase completions, and planning docs moved to archive/
  • Created guides structure: 45 permanent guides organized by feature in guides/
  • Reduced clutter: 85% reduction in docs root files (177 → 27 core files)

For details, see DOCS_REORGANIZATION_2026_02_16.md.

Need Help?

  • General Questions: See FAQ or User Guide
  • Bug Reports: Open an issue on GitHub
  • Feature Requests: Check existing issues or open a new one
  • Contributing: See CONTRIBUTING.md for guidelines