Skip to content

Add LLMKube to Inference Platform section #373

@Defilan

Description

@Defilan

LLMKube is a Kubernetes operator for llama.cpp-native LLM inference.

  • GitHub: https://github.com/defilantech/llmkube
  • Apache 2.0 license
  • CRD-based model and inference service management
  • NVIDIA CUDA and Apple Silicon Metal GPU support
  • Multi-GPU layer sharding, pre-flight memory validation
  • Helm chart, Prometheus metrics, OpenAI-compatible API

I'm the creator and maintainer. Happy to provide any additional info needed.

Metadata

Metadata

Assignees

No one assigned

    Labels

    needs-priorityIndicates a PR lacks a label and requires one.needs-triageIndicates an issue or PR lacks a label and requires one.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions