Zoekt Query Language Guide

This guide explains the Zoekt query language, used for searching text within Git repositories. Zoekt queries allow combining multiple filters and expressions using logical operators, negations, and grouping. Here's how to craft queries effectively.

For a brief overview of Zoekt's query syntax, see these great docs from neogrok.


Syntax Overview

A query is made up of expressions. An expression can be:

  • A negation (e.g., -)
  • A field (e.g., repo:)
  • A grouping (e.g., parentheses ())

Multiple expressions are combined with the or operator. AND is implicit: expressions written next to each other, separated by spaces, are ANDed together.


Query Components

1. Fields

Fields restrict your query to specific criteria. Here's a list of fields and their usage:

| Field     | Aliases | Values                             | Description                                 | Examples                        |
|-----------|---------|------------------------------------|---------------------------------------------|---------------------------------|
| archived: | a:      | yes or no                          | Filters archived repositories.              | archived:yes                    |
| case:     | c:      | yes, no, or auto                   | Matches case-sensitive or insensitive text. | case:yes content:"Foo"          |
| content:  | c:      | Text (string or regex)             | Searches content of files.                  | content:"search term"           |
| file:     | f:      | Text (string or regex)             | Searches file names.                        | file:"main.go"                  |
| fork:     | f:      | yes or no                          | Filters forked repositories.                | fork:no                         |
| lang:     | l:      | Text                               | Filters by programming language.            | lang:python                     |
| public:   |         | yes or no                          | Filters public repositories.                | public:yes                      |
| regex:    |         | Regex pattern                      | Matches content using a regular expression. | regex:/foo.*bar/                |
| repo:     | r:      | Text (string or regex)             | Filters repositories by name.               | repo:"github.com/user/project"  |
| sym:      |         | Text                               | Searches for symbol names.                  | sym:"MyFunction"                |
| branch:   | b:      | Text                               | Searches within a specific branch.          | branch:main                     |
| type:     | t:      | filematch, filename, file, or repo | Limits result types.                        | type:filematch                  |

2. Negation

Negate an expression using the - symbol.

Examples:

  • Exclude a repository:
    -repo:"github.com/example/repo"
    
  • Exclude a language:
    -lang:javascript
    

3. Grouping

Group queries using parentheses () to create complex logic.

Examples:

  • Match either of two repositories:
    (repo:repo1 or repo:repo2)
    
  • Find test in either Python or JavaScript files:
    content:test (lang:python or lang:javascript)
    

4. Logical Operators

Use or to combine multiple expressions.

Examples:

  • Match files in either of two languages:
    lang:go or lang:java
    

The and operator is implicit: expressions separated by a space are combined with AND, so there is no need to write it out.
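
When queries are assembled programmatically, the rules above (implicit AND, explicit or, leading - for negation) can be captured in a few small helpers. The following Python sketch is illustrative only: the function names all_of, any_of, and negate are hypothetical and are not part of Zoekt.

```python
def all_of(*exprs: str) -> str:
    """AND is implicit: join expressions with a single space."""
    return " ".join(exprs)


def any_of(*exprs: str) -> str:
    """OR is explicit; parenthesize the group so it composes safely."""
    return "(" + " or ".join(exprs) + ")"


def negate(expr: str) -> str:
    """Prefix a single field or a parenthesized group with a minus sign."""
    return "-" + expr


query = all_of(
    "content:test",
    any_of("lang:python", "lang:javascript"),
    negate("archived:yes"),
)
print(query)
# content:test (lang:python or lang:javascript) -archived:yes
```

Parenthesizing every or group is a deliberate choice here: it keeps the group intact when it is later placed next to other expressions.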


Special Query Types

Filtering by Repository Type

Zoekt supports filtering repositories by various attributes:

public:yes archived:no fork:no

This finds repositories that are public, not archived, and not forks.

Result Type Control

The type: operator controls what kind of results are returned:

type:repo content:config

This returns repository names instead of file matches. Valid values include:

  • filematch - Returns file content matches (default)
  • filename - Returns only matching filenames
  • repo - Returns only repository names

type: applies to the whole expression in its current scope, including or clauses. For example, type:repo foo or bar is equivalent to type:repo (foo or bar). Use parentheses to scope type: to only one branch, for example (type:repo foo) or bar.


Special Query Values

  • Boolean Values: Use yes or no for fields like archived: or fork:.

  • Text Fields: Text fields (content:, repo:, etc.) accept:

    • Strings: "my text"
    • Regular expressions: /my.*regex/
  • Escape Characters: To include special characters, use backslashes (\).

Examples:

  • Match the string foo"bar:
    content:"foo\"bar"
    
  • Match the regex foo.*bar:
    content:/foo.*bar/
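
If search terms come from user input, quoting them by hand is error-prone. The sketch below shows one way to build a quoted content: term in Python. The helper names quote and content are hypothetical, and the code assumes that a literal backslash inside a quoted string is written as a doubled backslash, which this guide does not state explicitly.

```python
def quote(text: str) -> str:
    """Wrap text in double quotes, escaping backslashes and quotes.

    Assumption: a backslash inside a quoted string is escaped by doubling it.
    Backslashes are handled first so the escapes added for quotes are not
    escaped a second time.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def content(text: str) -> str:
    """Build a content: term that matches the given literal string."""
    return "content:" + quote(text)


print(content('foo"bar'))
# content:"foo\"bar"
```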
    

Case Sensitivity

Zoekt supports three case sensitivity modes:

  • case:yes - Exact case matching
  • case:no - Case-insensitive matching
  • case:auto - Automatically detect based on pattern (default)

In auto mode, if the pattern contains uppercase letters, the search will be case-sensitive; otherwise, it will be case-insensitive.
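
The auto rule can be written as a small predicate. This Python sketch models only the behavior described above and is not Zoekt's implementation; the function name is_case_sensitive is hypothetical, and the real engine may treat some patterns (for example, regex escapes that contain uppercase letters) differently.

```python
def is_case_sensitive(pattern: str, mode: str = "auto") -> bool:
    """Return True if a search for `pattern` should match case exactly."""
    if mode == "yes":
        return True
    if mode == "no":
        return False
    if mode == "auto":
        # Any uppercase letter in the pattern switches on exact-case matching.
        return any(ch.isupper() for ch in pattern)
    raise ValueError(f"unknown case mode: {mode!r}")


print(is_case_sensitive("todo"))        # False
print(is_case_sensitive("TODO"))        # True
print(is_case_sensitive("TODO", "no"))  # False
```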


Advanced Examples

  1. Search for content in Python files in public repositories:

    lang:python public:yes content:"my_function"
    
  2. Exclude archived repositories and match a regex:

    archived:no regex:/error.*handler/
    
  3. Find files named README.md in forks:

    file:"README.md" fork:yes
    
  4. Search within a specific branch:

    branch:main content:"TODO"
    
  5. Combine multiple fields:

    (repo:"github.com/example" or repo:"github.com/test") lang:go
    

Tips

  1. Combine Filters: You can combine as many fields as needed. For instance:

    repo:"github.com/example" lang:go content:"init"
    
  2. Use Regular Expressions: Make complex content searches more powerful:

    content:/func\s+\w+\s*\(/
    
  3. Case Sensitivity: Use case:yes for exact matches:

    case:yes content:"ExactMatch"
    
  4. Match Specific File Types:

    file:".*\.go" content:"package main"
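
Before running a regular expression against a large index, it can help to try it on a few sample lines locally. The Python sketch below checks the pattern from tip 2 against hand-written Go snippets. It assumes the pattern behaves the same in Python's re module as in Zoekt's regex engine; that holds for the basic constructs used here (\s, \w, +, *), but it is not guaranteed for every regex feature.

```python
import re

# The pattern from the content:/func\s+\w+\s*\(/ example, without the slashes.
pattern = re.compile(r"func\s+\w+\s*\(")

samples = [
    "func main() {",
    "func  helper (x int) int {",
    "func (s *Server) Start() error {",  # method with a receiver
    "var f = func() {}",                 # anonymous function
]

matches = [line for line in samples if pattern.search(line)]
for line in matches:
    print(line)
# func main() {
# func  helper (x int) int {
```

The last two samples do not match because the pattern requires a name directly after func, so methods with receivers and anonymous functions need a different expression.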
    

EBNF Summary

query       = conjunction , { "or" , conjunction } ;

conjunction = expression , { expression } ;

expression  = negation
            | grouping
            | field ;

negation    = "-" , expression ;

grouping    = "(" , query , ")" ;

field       = ( ( "archived:" | "a:" ) , boolean )
            | ( ( "case:" | "c:" ) , ("yes" | "no" | "auto") )
            | ( ( "content:" | "c:" ) , text )
            | ( ( "file:" | "f:" ) , text )
            | ( ( "fork:" | "f:" ) , boolean )
            | ( ( "lang:" | "l:" ) , text )
            | ( ( "public:" ) , boolean )
            | ( ( "regex:" ) , text )
            | ( ( "repo:" | "r:" ) , text )
            | ( ( "sym:" ) , text )
            | ( ( "branch:" | "b:" ) , text )
            | ( ( "type:" | "t:" ) , type );

boolean     = "yes" | "no" ;
text        = string | regex ;
string      = '"' , { character | escape } , '"' ;
regex       = '/' , { character | escape } , '/' ;

type        = "filematch" | "filename" | "file" | "repo" ;