Config-driven language plugin factory for CodeGraph. Takes a declarative language configuration and returns a fully-formed LanguagePlugin, reducing per-language plugin code from ~500-1000 lines to ~100-150 lines of config.
The factory:
- Sets up grammar helpers (extension-to-grammar mapping)
- Wires up standard or overridden extractors (with Phase 2 configs)
- Composes `extractAllEntities` from individual extractors
- Returns a `LanguagePlugin` conforming to the `@codegraph/types` interface
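
The composition step can be pictured with a small sketch. Everything here is an assumption made for illustration: `SyntaxNode`, `Entity`, `makeExtractor`, and `composeExtractors` are hypothetical, simplified stand-ins, not the actual types or helpers from `@codegraph/types` or this package.

```typescript
// Hypothetical, simplified types; the real ones live in @codegraph/types.
interface SyntaxNode {
  type: string;
  name: string;
}

interface Entity {
  kind: string;
  name: string;
}

type Extractor = (nodes: SyntaxNode[]) => Entity[];

// Build a generic extractor that keeps nodes whose type is listed in nodeTypes.
function makeExtractor(kind: string, nodeTypes: string[]): Extractor {
  const wanted = new Set(nodeTypes);
  return (nodes) =>
    nodes
      .filter((node) => wanted.has(node.type))
      .map((node) => ({ kind, name: node.name }));
}

// Compose a single extractAllEntities function from individual extractors.
// Results are concatenated in the order the extractors are listed.
function composeExtractors(extractors: Extractor[]): Extractor {
  return (nodes) => extractors.flatMap((extract) => extract(nodes));
}

const extractAllEntities = composeExtractors([
  makeExtractor('function', ['method', 'singleton_method']),
  makeExtractor('class', ['class']),
]);

// A flat list of nodes stands in for a parsed tree-sitter tree.
const sampleNodes: SyntaxNode[] = [
  { type: 'class', name: 'User' },
  { type: 'method', name: 'save' },
  { type: 'comment', name: '' },
];

const entities = extractAllEntities(sampleNodes);
```

In this sketch the comment node is dropped because no extractor claims its type, and the function entity comes before the class entity because the function extractor is listed first.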
Core fields:
- `id` / `displayName` / `extensions` -- Language identity
- `grammar` -- Tree-sitter grammar object
- `grammarPackage` -- Optional package name for lazy loading
- `nodeTypes` -- Maps entity types to tree-sitter node type names: `functions`, `classes`, `interfaces`, `variables`, `imports`, `types`, `calls`
- `fields` -- Customizable field name mappings (`name`, `parameters`, `returnType`, `body`, `superclass`, `callee`)
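
As a rough illustration of how these fields fit together, the sketch below models them as a TypeScript type and shows two things a factory might do with them: fill in default field names and build an extension lookup. The `CoreConfig` type, the `DEFAULT_FIELDS` values, and both helper functions are assumptions made for this example, not the package's actual definitions.

```typescript
// Hypothetical shape of the core config fields, for illustration only.
type EntityKind =
  | 'functions' | 'classes' | 'interfaces' | 'variables'
  | 'imports' | 'types' | 'calls';

interface FieldNames {
  name: string;
  parameters: string;
  returnType: string;
  body: string;
  superclass: string;
  callee: string;
}

interface CoreConfig {
  id: string;
  displayName: string;
  extensions: string[];
  grammar: unknown;          // tree-sitter grammar object
  grammarPackage?: string;   // optional package name for lazy loading
  nodeTypes: Partial<Record<EntityKind, string[]>>;
  fields?: Partial<FieldNames>;
}

// Assumed defaults; the real factory may use different field names.
const DEFAULT_FIELDS: FieldNames = {
  name: 'name',
  parameters: 'parameters',
  returnType: 'return_type',
  body: 'body',
  superclass: 'superclass',
  callee: 'function',
};

// Merge a language's custom field names over the defaults.
function resolveFields(config: CoreConfig): FieldNames {
  return { ...DEFAULT_FIELDS, ...config.fields };
}

// Map each file extension to the config that handles it.
function buildExtensionMap(configs: CoreConfig[]): Map<string, CoreConfig> {
  const byExtension = new Map<string, CoreConfig>();
  for (const config of configs) {
    for (const ext of config.extensions) {
      byExtension.set(ext.toLowerCase(), config);
    }
  }
  return byExtension;
}

const rubyConfig: CoreConfig = {
  id: 'ruby',
  displayName: 'Ruby',
  extensions: ['.rb', '.rake', '.gemspec'],
  grammar: {},               // placeholder for the real grammar object
  nodeTypes: { functions: ['method', 'singleton_method'], classes: ['class'] },
  fields: { callee: 'method' },
};

const rubyFields = resolveFields(rubyConfig);
const extensionMap = buildExtensionMap([rubyConfig]);
```

Making `nodeTypes` and `fields` partial mirrors the config-only style: a language lists only what differs or applies, and everything else falls back to a default.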
Declarative configs for language-specific behavior without writing custom extractors:
- `ImportConfig` (Phase 2a) -- Structured import extraction: module field/node types, specifier extraction, alias handling, namespace detection, quote stripping.
- `ParamConfig` (Phase 2b) -- Rich parameter extraction: typed parameters, default values, name filtering (`self`, `cls`, `this`), custom identifier node types.
- `DocstringConfig` (Phase 2c) -- Documentation extraction strategies: `'preceding-comment'` (Java/Go/Rust style), `'body-first-string'` (Python style), `'attribute'`, or `'none'`.
- `VisibilityConfig` (Phase 2d) -- Export/visibility detection strategies: `'modifier'` (pub/public keywords), `'naming'` (Go uppercase, Python `_prefix`), `'keyword'` (JS export), or `'all-public'`. Also supports async and abstract modifier detection.
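
To make the strategy idea concrete, here is a minimal sketch of how the four visibility strategies could be dispatched. Only the strategy names come from the list above; the option names (`publicModifiers`, `isPublicName`, `exportKeyword`), the `Declaration` type, and the `isExported` function are hypothetical and may not match the real `VisibilityConfig`.

```typescript
// Strategy names are from the docs; everything else here is assumed.
type VisibilityStrategy = 'modifier' | 'naming' | 'keyword' | 'all-public';

interface VisibilityConfig {
  strategy: VisibilityStrategy;
  publicModifiers?: string[];                 // used by 'modifier'
  isPublicName?: (name: string) => boolean;   // used by 'naming'
  exportKeyword?: string;                     // used by 'keyword'
}

// Simplified stand-in for a declaration found in the syntax tree.
interface Declaration {
  name: string;
  modifiers: string[];
}

function isExported(decl: Declaration, config: VisibilityConfig): boolean {
  switch (config.strategy) {
    case 'all-public':
      return true;
    case 'modifier': {
      const allowed = config.publicModifiers ?? ['pub', 'public'];
      return decl.modifiers.some((m) => allowed.includes(m));
    }
    case 'keyword':
      return decl.modifiers.includes(config.exportKeyword ?? 'export');
    case 'naming':
      // Without a naming rule, treat every name as public.
      return config.isPublicName ? config.isPublicName(decl.name) : true;
  }
}

// Go: exported identifiers start with an uppercase letter.
const goVisibility: VisibilityConfig = {
  strategy: 'naming',
  isPublicName: (name) => /^[A-Z]/.test(name),
};

// Python: a leading underscore marks a name as private by convention.
const pythonVisibility: VisibilityConfig = {
  strategy: 'naming',
  isPublicName: (name) => !name.startsWith('_'),
};

// Rust: the `pub` modifier marks a public item.
const rustVisibility: VisibilityConfig = {
  strategy: 'modifier',
  publicModifiers: ['pub'],
};

const goParseExported = isExported({ name: 'Parse', modifiers: [] }, goVisibility);
const goHelperExported = isExported({ name: 'parse', modifiers: [] }, goVisibility);
const pyPrivateExported = isExported({ name: '_helper', modifiers: [] }, pythonVisibility);
const rustPubExported = isExported({ name: 'run', modifiers: ['pub'] }, rustVisibility);
```

The point of the declarative form is that each language supplies a few lines of data, while the dispatch logic is written once in the factory.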
The overrides field allows replacing any generic extractor with a custom implementation:
- `extractFunctions`, `extractClasses`, `extractInterfaces`, `extractVariables`, `extractImports`, `extractTypes`, `extractCalls`, `extractInheritance`
- Lightweight overrides: `isExported`, `isAsync`, `isAbstract`, `extractDocstring`, `extractParameters`, `extractReturnType`, `extractClassName`, `extractSuperclasses`
- `builtinFunctions` -- Set of names to skip in call extraction
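
One plausible way to apply overrides is to merge them over a table of generic extractors and then wrap call extraction with the `builtinFunctions` filter. The sketch below assumes that approach; `resolveExtractors`, `genericExtractors`, and the simplified `SyntaxNode` type are hypothetical names for illustration, and only two of the extractors are modeled.

```typescript
// Simplified stand-in for a tree-sitter node.
interface SyntaxNode {
  type: string;
  name: string;
}

// Only two extractors are modeled here to keep the sketch short.
interface Extractors {
  extractFunctions: (nodes: SyntaxNode[]) => string[];
  extractCalls: (nodes: SyntaxNode[]) => string[];
}

interface Overrides extends Partial<Extractors> {
  builtinFunctions?: Set<string>;
}

// Generic extractor: collect the names of all nodes with the given type.
const namesOfType = (type: string) => (nodes: SyntaxNode[]): string[] =>
  nodes.filter((node) => node.type === type).map((node) => node.name);

const genericExtractors: Extractors = {
  extractFunctions: namesOfType('method'),
  extractCalls: namesOfType('call'),
};

// Custom extractors replace generic ones; builtinFunctions filters call results.
function resolveExtractors(overrides: Overrides = {}): Extractors {
  const { builtinFunctions, ...custom } = overrides;
  const merged: Extractors = { ...genericExtractors, ...custom };
  if (!builtinFunctions) {
    return merged;
  }
  const extractCalls = merged.extractCalls;
  return {
    ...merged,
    extractCalls: (nodes) =>
      extractCalls(nodes).filter((name) => !builtinFunctions.has(name)),
  };
}

const nodes: SyntaxNode[] = [
  { type: 'method', name: 'save' },
  { type: 'call', name: 'puts' },
  { type: 'call', name: 'validate' },
];

const withOverrides = resolveExtractors({
  extractFunctions: () => ['custom'],
  builtinFunctions: new Set(['puts']),
});

const overriddenFunctions = withOverrides.extractFunctions(nodes);
const filteredCalls = withOverrides.extractCalls(nodes);
const defaultFunctions = resolveExtractors().extractFunctions(nodes);
```

With this layout a language can replace a single extractor without touching the rest, which is consistent with how the overrides are described above.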
Tier-2 languages use this factory directly with config-only definitions:
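```typescript
import { createLanguagePlugin } from '@codegraph/plugin-generic';
// Assumption: the grammar object comes from the tree-sitter-ruby package;
// adjust this import to match how your project loads grammars.
import RubyGrammar from 'tree-sitter-ruby';

export const rubyPlugin = createLanguagePlugin({
  id: 'ruby',
  displayName: 'Ruby',
  extensions: ['.rb', '.rake', '.gemspec'],
  grammar: RubyGrammar,
  // Map entity types to the tree-sitter node types that represent them.
  nodeTypes: {
    functions: ['method', 'singleton_method'],
    classes: ['class'],
    imports: ['call'], // require/require_relative
    calls: ['call', 'method_call'],
  },
});
```

Tier-1 languages (Python, Go, Rust) use the factory with custom override extractors for full-fidelity extraction.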