Python JSONPath extends the RFC 9535 specification with additional features and relaxed rules. If you need strict compliance with RFC 9535, set strict=True when calling findall(), finditer(), etc., which enforces the standard without these extensions.
In this guide, we first outline the standard syntax (see the specification for the formal definition), and then describe the non-standard extensions and their semantics in detail.
Think of a JSON document as a tree: objects (mappings) and arrays can contain other objects, arrays, or scalar values. Each of these (object, array, or scalar) is a node in the tree. The outermost value, usually an object or array, is called the root node.
In this guide, a JSON "document" may refer to:
- A file containing valid JSON text
- A Python string containing valid JSON text
- A Python object composed of dictionaries (or any Mapping), lists (or any Sequence), strings, numbers, booleans, or None
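As a small illustration of these three forms, the sketch below uses only the standard library `json` module. The temporary file name `document.json` is hypothetical and stands in for any file containing JSON text.

```python
import json
import tempfile
from pathlib import Path

# A Python string containing valid JSON text.
text = '{"categories": [{"id": 1, "name": "fiction"}]}'

# The equivalent Python object, built from dicts, lists, strings and numbers.
data = json.loads(text)

# A file containing valid JSON text (a temporary file stands in for a real one).
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "document.json"
    path.write_text(text, encoding="utf-8")
    with path.open(encoding="utf-8") as fd:
        from_file = json.load(fd)

# All three forms describe the same tree of nodes.
print(from_file == data)
```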
A JSONPath expression (aka "query") is made up of a sequence of segments. Each segment contains one or more selectors:
- A segment corresponds to a step in the path from one set of nodes to the next.
- A selector describes how to choose nodes within that step (for example, by name, by index, or by wildcard).
What follows is a description of these selectors, starting with the standard ones defined in RFC 9535.
The root identifier, $, refers to the outermost node in the target document. This can be an object, an array, or a scalar value.
A query containing only the root identifier simply returns the entire input document.
Example query
$
{
"categories": [
{ "id": 1, "name": "fiction" },
{ "id": 2, "name": "non-fiction" }
]
}
[
{
"categories": [
{ "id": 1, "name": "fiction" },
{ "id": 2, "name": "non-fiction" }
]
}
]
A name selector matches the value of an object member by its key. You can write it in either shorthand notation (.thing) or bracket notation (['thing'] or ["thing"]).
Dot notation can be used when the property name is a valid identifier. Bracket notation is required when the property name contains spaces, special characters, or starts with a number.
Example query
$.book.title
{
"book": {
"title": "Moby Dick",
"author": "Herman Melville"
}
}
["Moby Dick"]
The index selector selects an element from an array by its index. Indices are zero-based and enclosed in brackets, [0]. If the index is negative, items are selected from the end of the array.
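To make the handling of negative and out-of-range indices concrete, here is a plain-Python sketch of the index selector's behaviour. The helper `select_index` is hypothetical and written for illustration only; it is not part of the library.

```python
from typing import Any, List


def select_index(node: Any, index: int) -> List[Any]:
    """Return a list holding the selected element, or an empty list."""
    if not isinstance(node, list):
        return []  # the index selector only applies to arrays
    # A negative index counts back from the end of the array.
    normalized = index if index >= 0 else len(node) + index
    if 0 <= normalized < len(node):
        return [node[normalized]]
    return []  # out of range: nothing is selected and no error is raised


categories = ["fiction", "non-fiction", "poetry"]
first = select_index(categories, 0)
last = select_index(categories, -1)
missing = select_index(categories, 5)
```

Returning a list rather than a single value mirrors the fact that a selector produces zero or more nodes.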
Example query
$.categories[0].name
{
"categories": [
{ "id": 1, "name": "fiction" },
{ "id": 2, "name": "non-fiction" }
]
}
["fiction"]
The wildcard selector matches all member values of an object or all elements in an array. It can be written as .* (shorthand notation) or [*] (bracket notation).
Example query
$.categories[*].name
{
"categories": [
{ "id": 1, "name": "fiction" },
{ "id": 2, "name": "non-fiction" }
]
}
["fiction", "non-fiction"]
The slice selector allows you to select a range of elements from an array. A start index, ending index and step size are all optional and separated by colons, [start:end:step]. Negative indices count from the end of the array, just like standard Python slicing.
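The comparison with Python slicing can be checked directly on a list. The snippet below uses ordinary Python slices, not the library, to show what each combination of start, end and step picks out.

```python
items = ["a", "b", "c", "d", "e", "f"]

# [start:end:step] maps onto Python's slice syntax.
every_other = items[1:4:2]    # comparable to $.items[1:4:2]
last_two = items[-2:]         # comparable to $.items[-2:]
reversed_items = items[::-1]  # comparable to $.items[::-1]

print(every_other, last_two, reversed_items)
```

One difference worth keeping in mind: in RFC 9535 a step of 0 selects nothing, whereas a Python slice with a step of 0 raises ValueError.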
Example query
$.items[1:4:2]
{
"items": ["a", "b", "c", "d", "e", "f"]
}
["b", "d"]
Filters allow you to remove nodes from a selection based on a Boolean expression, [?expression]. A filter expression evaluates each node in the context of either the root ($) or current (@) node.
When filtering a mapping-like object, @ identifies the current member value. When filtering a sequence-like object, @ identifies the current element.
Comparison operators include ==, !=, <, >, <=, and >=. Logical operators && (and) and || (or) can combine terms, and parentheses can be used to group expressions.
A filter expression on its own - without a comparison - is treated as an existence test.
Example query
$..products[?(@.price < $.price_cap)]
{
"price_cap": 10,
"products": [
{ "name": "apple", "price": 5 },
{ "name": "orange", "price": 12 },
{ "name": "banana", "price": 8 }
]
}
[
{ "name": "apple", "price": 5 },
{ "name": "banana", "price": 8 }
]
Filter expressions can also call predefined function extensions.
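The roles of $ and @ in the example above can be spelled out by hand. The following plain-Python sketch does not use the library; it accesses the products array directly instead of searching for it with a descendant segment, and treats a product without a price as a non-match.

```python
document = {
    "price_cap": 10,
    "products": [
        {"name": "apple", "price": 5},
        {"name": "orange", "price": 12},
        {"name": "banana", "price": 8},
    ],
}

root = document  # plays the role of $

selected = [
    current  # plays the role of @, one array element at a time
    for current in root["products"]
    if "price" in current and current["price"] < root["price_cap"]
]

names = [product["name"] for product in selected]
```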
So far we've seen shorthand notation (.selector) and segments with just one selector ([selector]). Here we cover the descendant segment and segments with multiple selectors.
A segment can include multiple selectors separated by commas and enclosed in square brackets ([selector, selector, ...]). Any valid selector (names, indices, slices, filters, or wildcards) can appear in the list.
Example query
$.store.book[0,2]
{
"store": {
"book": [
{ "title": "Book A", "price": 10 },
{ "title": "Book B", "price": 12 },
{ "title": "Book C", "price": 8 }
]
}
}
[
{ "title": "Book A", "price": 10 },
{ "title": "Book C", "price": 8 }
]
The descendant segment (..) visits all object member values and array elements under the current object or array, applying the selector or selectors that follow to each visited node. It must be followed by a shorthand selector (names, wildcards, etc.) or a bracketed list of one or more selectors.
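One way to picture this traversal is as a recursive walk over the document. The sketch below is a plain-Python approximation, not the library's implementation: the helpers `descendants` and `descendant_name` are hypothetical, and the depth-first visiting order is an assumption made for the example.

```python
from typing import Any, Iterator, List


def descendants(node: Any) -> Iterator[Any]:
    """Yield the node itself, then every value nested beneath it."""
    yield node
    if isinstance(node, dict):
        for value in node.values():
            yield from descendants(value)
    elif isinstance(node, list):
        for element in node:
            yield from descendants(element)


def descendant_name(root: Any, name: str) -> List[Any]:
    """Apply a name selector to every visited node, roughly like $..name."""
    return [
        node[name]
        for node in descendants(root)
        if isinstance(node, dict) and name in node
    ]


store = {
    "store": {
        "book": [{"title": "Book A", "price": 10}],
        "bicycle": {"color": "red", "price": 19.95},
    }
}
prices = descendant_name(store, "price")
```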
Example query
$..price
{
"store": {
"book": [
{ "title": "Book A", "price": 10 },
{ "title": "Book B", "price": 12 }
],
"bicycle": { "color": "red", "price": 19.95 }
}
}
[10, 12, 19.95]
The selectors and identifiers described in this section are extensions to the RFC 9535 specification. They are enabled by default. To disable all non-standard features, set strict=True when constructing a JSONPathEnvironment or when calling findall(), finditer(), etc.
Also note that when strict=False:
- The root identifier ($) is optional and paths starting with a dot (.) are OK. .thing is the same as $.thing, as are thing and $["thing"].
- Leading and trailing whitespace is OK.
- Explicit comparisons to undefined (aka missing) are supported as well as implicit existence tests.
New in version 2.0.0
The key selector, .~name or [~'name'], selects at most one name from an object member. It is syntactically similar to the standard name selector, with the addition of a tilde (~) prefix.
When applied to a JSON object, the key selector selects the name from an object member, if that name exists, or nothing if it does not exist. This complements the standard name selector, which selects the value from a name/value pair.
When applied to an array or primitive value, the key selector selects nothing.
Key selector strings must follow the same processing semantics as name selector strings, as described in section 2.3.2.1 of RFC 9535.
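The contrast between the name selector and the key selector can be sketched in plain Python. The helpers `select_name` and `select_key` are hypothetical names used only for this illustration; they are not library functions.

```python
from collections.abc import Mapping
from typing import Any, List


def select_name(node: Any, name: str) -> List[Any]:
    """Standard name selector: the member value, if the name exists."""
    if isinstance(node, Mapping) and name in node:
        return [node[name]]
    return []


def select_key(node: Any, name: str) -> List[str]:
    """Key selector: the member name itself, if it exists."""
    if isinstance(node, Mapping) and name in node:
        return [name]
    return []  # arrays, primitives and missing names select nothing


data = {"a": [{"b": "x", "c": "z"}, {"b": "y"}]}

value_of_c = select_name(data["a"][0], "c")  # comparable to $.a[0].c
key_c = select_key(data["a"][0], "c")        # comparable to $.a[0].~c
no_key = select_key(data["a"][1], "c")       # comparable to $.a[1].~c
```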
!!! info
    The key selector is introduced to facilitate valid normalized paths for nodes produced by the [keys selector](#keys-selector) and the [keys filter selector](#keys-filter-selector). I don't expect it will be of much use elsewhere.
selector = name-selector /
wildcard-selector /
slice-selector /
index-selector /
filter-selector /
key-selector /
keys-selector /
keys-filter-selector
key-selector = "~" name-selector
child-segment = bracketed-selection /
("."
(wildcard-selector /
member-name-shorthand /
member-key-shorthand))
descendant-segment = ".." (bracketed-selection /
wildcard-selector /
member-name-shorthand /
member-key-shorthand)
member-key-shorthand = "~" name-first *name-char
{
"a": [{ "b": "x", "c": "z" }, { "b": "y" }]
}
| Query | Result | Result Paths | Comment |
|---|---|---|---|
| `$.a[0].~c` | `"c"` | `$['a'][0][~'c']` | Key of nested object |
| `$.a[1].~c` | | | Key does not exist |
| `$..[~'b']` | `"b"` `"b"` | `$['a'][0][~'b']` `$['a'][1][~'b']` | Descendant, single quoted key |
| `$..[~"b"]` | `"b"` `"b"` | `$['a'][0][~'b']` `$['a'][1][~'b']` | Descendant, double quoted key |
New in version 0.6.0
The keys selector, ~ or [~], selects all names from an object’s name/value members. This complements the standard wildcard selector, which selects all values from an object’s name/value pairs.
As with the wildcard selector, the order of nodes resulting from a keys selector is not stipulated.
When applied to an array or primitive value, the keys selector selects nothing.
The normalized path of a node selected using the keys selector uses key selector syntax.
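In plain-Python terms, the keys selector relates to the wildcard selector roughly as `dict.keys()` relates to `dict.values()`. The helpers below are hypothetical and only illustrate that relationship. Python dictionaries happen to preserve insertion order, but as noted above the selector itself does not promise any order.

```python
from collections.abc import Mapping
from typing import Any, List


def select_keys(node: Any) -> List[str]:
    """Keys selector: every member name of an object, nothing otherwise."""
    if isinstance(node, Mapping):
        return list(node.keys())
    return []


def select_values(node: Any) -> List[Any]:
    """Wildcard selector: every member value or array element."""
    if isinstance(node, Mapping):
        return list(node.values())
    if isinstance(node, list):
        return list(node)
    return []


data = {"a": [{"b": "x", "c": "z"}, {"b": "y"}]}

object_keys = select_keys(data["a"][0])      # comparable to $.a[0].~
array_keys = select_keys(data["a"])          # comparable to $.a.~
object_values = select_values(data["a"][0])  # comparable to $.a[0].*
```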
keys-selector = "~"
{
"a": [{ "b": "x", "c": "z" }, { "b": "y" }]
}
| Query | Result | Result Paths | Comment |
|---|---|---|---|
| `$.a[0].~` | `"b"` `"c"` | `$['a'][0][~'b']` `$['a'][0][~'c']` | Object keys |
| `$.a.~` | | | Array keys |
| `$.a[0][~, ~]` | `"b"` `"c"` `"c"` `"b"` | `$['a'][0][~'b']` `$['a'][0][~'c']` `$['a'][0][~'c']` `$['a'][0][~'b']` | Non-deterministic ordering |
| `$..[~]` | `"a"` `"b"` `"c"` `"b"` | `$[~'a']` `$['a'][0][~'b']` `$['a'][0][~'c']` `$['a'][1][~'b']` | Descendant keys |
New in version 2.0.0
The keys filter selector selects names from an object’s name/value members. It is syntactically similar to the standard filter selector, with the addition of a tilde (~) prefix.
~?<logical-expr>
Whereas the standard filter selector will produce a node for each value from an object’s name/value members - when its expression evaluates to logical true - the keys filter selector produces a node for each name in an object’s name/value members.
Logical expression syntax and semantics otherwise match that of the standard filter selector. @ still refers to the current member value. See also the current key identifier.
When applied to an array or primitive value, the keys filter selector selects nothing.
The normalized path of a node selected using the keys filter selector uses key selector syntax.
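The difference from a standard filter can be sketched as follows: the test is still applied to each member value, but the names of the matching members are returned. This is a plain-Python approximation with a hypothetical helper, `keys_filter`, and Python callables standing in for the logical expression.

```python
from collections.abc import Mapping
from typing import Any, Callable, List


def keys_filter(node: Any, predicate: Callable[[Any], bool]) -> List[str]:
    """Return the names of members whose value satisfies the predicate."""
    if not isinstance(node, Mapping):
        return []  # arrays and primitive values select nothing
    return [name for name, value in node.items() if predicate(value)]


data = [{"a": [1, 2, 3], "b": [4, 5]}, {"c": {"x": [1, 2]}}, {"d": [1, 2, 3]}]

# Comparable to $.*[~?length(@) > 2]
long_values = [
    name
    for member in data
    for name in keys_filter(member, lambda value: len(value) > 2)
]

# Comparable to $.*[~?@.x], an existence test on the member value
has_x = [
    name
    for member in data
    for name in keys_filter(
        member, lambda value: isinstance(value, Mapping) and "x" in value
    )
]

# Comparable to $[~?(true == true)], applied to an array
from_array = keys_filter(data, lambda value: True)
```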
keys-filter-selector = "~?" S logical-expr
[{ "a": [1, 2, 3], "b": [4, 5] }, { "c": { "x": [1, 2] } }, { "d": [1, 2, 3] }]
| Query | Result | Result Paths | Comment |
|---|---|---|---|
| `$.*[~?length(@) > 2]` | `"a"` `"d"` | `$[0][~'a']` `$[2][~'d']` | Conditionally select object keys |
| `$.*[~?@.x]` | `"c"` | `$[1][~'c']` | Existence test |
| `$[~?(true == true)]` | | | Keys from an array |
New in version 2.0.0
The singular query selector consists of an embedded absolute singular query, the result of which is used as an object member name or array element index.
If the embedded query resolves to a string or int value, at most one object member value or array element value is selected. Otherwise the singular query selector selects nothing.
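The two-step evaluation (resolve the embedded query first, then use its value as a name or index) can be sketched in plain Python. The helper `select_with_resolved` is hypothetical. It handles only non-negative indices, and this sketch makes no claim about how the library treats negative values.

```python
from collections.abc import Mapping
from typing import Any, List


def select_with_resolved(node: Any, resolved: Any) -> List[Any]:
    """Use an already-resolved query value as a member name or array index."""
    if isinstance(resolved, bool):
        return []  # bool is a subclass of int in Python, so rule it out first
    if isinstance(resolved, str) and isinstance(node, Mapping):
        return [node[resolved]] if resolved in node else []
    if isinstance(resolved, int) and isinstance(node, list):
        return [node[resolved]] if 0 <= resolved < len(node) else []
    return []  # any other value selects nothing


data = {
    "a": {"j": [1, 2, 3], "p": {"q": [4, 5, 6]}},
    "b": ["j", "p", "q"],
    "c d": {"x": {"y": 1}},
}

# Comparable to $.a[$.b[1]]: the embedded query resolves to "p"
by_name = select_with_resolved(data["a"], data["b"][1])

# Comparable to $.a.j[$['c d'].x.y]: the embedded query resolves to 1
by_index = select_with_resolved(data["a"]["j"], data["c d"]["x"]["y"])

# Comparable to $.a[$.b]: the embedded query resolves to an array
nothing = select_with_resolved(data["a"], data["b"])
```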
selector = name-selector /
wildcard-selector /
slice-selector /
index-selector /
filter-selector /
singular-query-selector
singular-query-selector = abs-singular-query
{
"a": {
"j": [1, 2, 3],
"p": {
"q": [4, 5, 6]
}
},
"b": ["j", "p", "q"],
"c d": {
"x": {
"y": 1
}
}
}
| Query | Result | Result Path | Comment |
|---|---|---|---|
| `$.a[$.b[1]]` | `{"q": [4, 5, 6]}` | `$['a']['p']` | Object name from embedded singular query |
| `$.a.j[$['c d'].x.y]` | `2` | `$['a']['j'][1]` | Array index from embedded singular query |
| `$.a[$.b]` | | | Embedded singular query does not resolve to a string or int value |
# is the current key identifier. # will be the name of the current object member, or index of the current array element. This complements the current node identifier (@), which refers to a member value or array element, respectively.
It is a syntax error to follow the current key identifier with segments, as if it were a filter query.
When used as an argument to a function, the current key is of ValueType, and outside a function call it must be compared.
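A plain-Python way to think about # and @ together is as the key and value produced by `dict.items()` for objects, or by `enumerate()` for arrays. The sketch below uses a hypothetical helper, `filter_with_key`, and substitutes `str.startswith` for the regular expression function, so it only approximates a query that uses match().

```python
from collections.abc import Mapping
from typing import Any, Callable, List


def filter_with_key(node: Any, predicate: Callable[[Any, Any], bool]) -> List[Any]:
    """Keep values for which predicate(key, value) is true."""
    if isinstance(node, Mapping):
        pairs = list(node.items())      # key is the member name
    elif isinstance(node, list):
        pairs = list(enumerate(node))   # key is the element index
    else:
        return []
    return [value for key, value in pairs if predicate(key, value)]


data = {"abc": [1, 2, 3], "def": [4, 5], "abx": [6], "aby": []}

# Roughly $[?match(#, '^ab.*') && length(@) > 0]
by_name = filter_with_key(
    data, lambda key, value: key.startswith("ab") and len(value) > 0
)

# Roughly $.abc[?(# >= 1)]
by_index = filter_with_key(data["abc"], lambda key, value: key >= 1)
```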
comparable = literal /
singular-query / ; singular query value
function-expr / ; ValueType
current-key-identifier
function-argument = literal /
filter-query / ; (includes singular-query)
logical-expr /
function-expr /
current-key-identifier
current-key-identifier = "#"
{ "abc": [1, 2, 3], "def": [4, 5], "abx": [6], "aby": [] }
| Query | Result | Result Path | Comment |
|---|---|---|---|
| `$[?match(#, '^ab.*') && length(@) > 0 ]` | `[1,2,3]` `[6]` | `$['abc']` `$['abx']` | Match on object names |
| `$.abc[?(# >= 1)]` | `2` `3` | `$['abc'][1]` `$['abc'][2]` | Compare current array index |
New in version 0.11.0
The pseudo root identifier (^) behaves like the standard root identifier ($), but conceptually wraps the target JSON document in a single-element array. This allows the root document itself to be conditionally selected by filters.
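The wrapping idea can be shown in a few lines of plain Python. This is a conceptual sketch rather than the library's implementation, and the helper `select_root_if` is hypothetical. Because the document becomes the only element of a list, an ordinary filter over that list can keep or discard the document as a whole.

```python
from typing import Any, List


def select_root_if(document: Any, threshold: float) -> List[Any]:
    """Roughly ^[?@.a.b > threshold]: keep the whole document if the test passes."""
    wrapped = [document]  # what ^ conceptually sees

    def test(current: Any) -> bool:
        a = current.get("a") if isinstance(current, dict) else None
        b = a.get("b") if isinstance(a, dict) else None
        return isinstance(b, (int, float)) and b > threshold

    return [current for current in wrapped if test(current)]


document = {"a": {"b": 42}, "n": 7}

selected = select_root_if(document, 7)        # the document is kept
not_selected = select_root_if(document, 100)  # nothing is selected
```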
jsonpath-query = (root-identifier / pseudo-root-identifier) segments
root-identifier = "$"
pseudo-root-identifier = "^"
{ "a": { "b": 42 }, "n": 7 }
| Query | Result | Result Path | Comment |
|---|---|---|---|
| `^[?@.a.b > 7]` | `{ "a": { "b": 42 }, "n": 7 }` | `^[0]` | Conditionally select the root value |
| `^[?@.a.b > value(^.*.n)]` | `{ "a": { "b": 42 }, "n": 7 }` | `^[0]` | Embedded pseudo root query |
The filter context identifier (_) starts an embedded query, similar to the root identifier ($) and current node identifier (@), but targets JSON-like data passed as the filter_context argument to findall() and finditer().
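Conceptually, the filter context is a second document that filter expressions can read from alongside the target document. The plain-Python sketch below shows only that idea, with a separate dictionary playing the role of the data passed as filter_context; it does not call the library.

```python
document = {"a": [{"b": 42}, {"b": 3}]}
filter_context = {"c": 42}  # plays the role of the data reached through _

# Roughly $.a[?@.b == _.c]
selected = [
    current  # plays the role of @
    for current in document["a"]
    if "b" in current
    and "c" in filter_context
    and current["b"] == filter_context["c"]
]
```

Keeping thresholds or lookup values in the filter context means the same query string can be reused with different external values.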
current-node-identifier = "@"
extra-context-identifier = "_"
filter-query = rel-query / extra-context-query / jsonpath-query
rel-query = current-node-identifier segments
extra-context-query = extra-context-identifier segments
singular-query = rel-singular-query / abs-singular-query / extra-context-singular-query
rel-singular-query = current-node-identifier singular-query-segments
abs-singular-query = root-identifier singular-query-segments
extra-context-singular-query = extra-context-identifier singular-query-segments
Target document:
{ "a": [{ "b": 42 }, { "b": 3 }] }
Filter context (passed as filter_context):
{ "c": 42 }
| Query | Result | Result Path | Comment |
|---|---|---|---|
| `$.a[?@.b == _.c]` | `{ "b": 42 }` | `$['a'][0]` | Comparison with extra context singular query |
In addition to the operators described below, the standard logical and operator (&&) is aliased as and, the standard logical or operator (||) is aliased as or, and null is aliased as nil and none.
Also, true, false, null and their aliases can start with an upper case letter.
The membership operators test whether one value occurs within another.
An infix expression using contains evaluates to true if the right-hand side is a member of the left-hand side, and false otherwise.
- If the left-hand side is an object and the right-hand side is a string, the result is true if the object has a member with that name.
- If the left-hand side is an array, the result is true if any element of the array is equal to the right-hand side.
- For scalars (strings, numbers, booleans, null), contains always evaluates to false.
The in operator is equivalent to contains with operands reversed. This makes contains and in symmetric, so either form may be used depending on which reads more naturally in context.
A list literal is a comma-separated list of JSONPath expression literals enclosed in square brackets. List literals should appear on the left-hand side of contains or the right-hand side of in.
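The rules above translate almost line for line into Python. The functions `contains` and `is_in` below are hypothetical helpers for illustration, not library functions. Note that Python's == treats True and 1 as equal, a detail this sketch ignores.

```python
from collections.abc import Mapping
from typing import Any


def contains(left: Any, right: Any) -> bool:
    """Sketch of `left contains right` following the rules listed above."""
    if isinstance(left, Mapping):
        # Object: true if it has a member with that name.
        return isinstance(right, str) and right in left
    if isinstance(left, list):
        # Array (or list literal): true if any element equals the right-hand side.
        return any(element == right for element in left)
    return False  # scalars never contain anything


def is_in(left: Any, right: Any) -> bool:
    """Sketch of `left in right`: contains with the operands reversed."""
    return contains(right, left)


array_hit = contains(["foo", "bar"], "foo")   # like @.a contains 'foo' on an array
object_hit = contains({"foo": "bar"}, "foo")  # like @.a contains 'foo' on an object
list_literal_hit = contains(["bar", "baz"], "bar")  # like ['bar', 'baz'] contains @.a
scalar_miss = contains("foobar", "foo")       # strings are scalars here
reversed_hit = is_in("foo", ["foo", "bar"])   # like 'foo' in @.a
```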
basic-expr = paren-expr /
comparison-expr /
membership-expr /
test-expr
membership-expr = membership-operand S membership-operator S membership-operand
membership-operator = "contains" / "in"
membership-operand = literal /
singular-query / ; singular query value
function-expr / ; ValueType
list-literal
list-literal = "[" S literal *(S "," S literal) S "]"
{
"x": [{ "a": ["foo", "bar"] }, { "a": ["bar"] }],
"y": [{ "a": { "foo": "bar" } }, { "a": { "bar": "baz" } }],
"z": [{ "a": "foo" }, { "a": "bar" }]
}
| Query | Result | Result Path | Comment |
|---|---|---|---|
| `$.x[?@.a contains 'foo']` | `{"a": ["foo", "bar"]}` | `$['x'][0]` | Array contains string literal |
| `$.y[?@.a contains 'foo']` | `{"a": {"foo": "bar"}}` | `$['y'][0]` | Object contains string literal |
| `$.x[?'foo' in @.a]` | `{"a": ["foo", "bar"]}` | `$['x'][0]` | String literal in array |
| `$.y[?'foo' in @.a]` | `{"a": {"foo": "bar"}}` | `$['y'][0]` | String literal in object |
| `$.z[?(['bar', 'baz'] contains @.a)]` | `{"a": "bar"}` | `$['z'][1]` | List literal contains embedded query |
=~ is an infix operator that matches the left-hand side with a regular expression literal on the right-hand side. Regular expression literals use a syntax similar to that found in JavaScript, where the pattern to match is surrounded by slashes, /pattern/, optionally followed by flags, /pattern/flags.
$..products[?(@.description =~ /.*trainers/i)]
The union or concatenation operator, |, combines matches from two or more paths.
The intersection operator, &, produces matches that are common to both left and right paths.
Note that compound queries are not allowed inside filter expressions.
jsonpath-query = root-identifier segments
compound-jsonpath-query = jsonpath-query compound-op jsonpath-query
compound-op = "|" /
"&"
$..products.*.price | $.price_cap
$.categories[?(@.name == 'footwear')].products.* & $.categories[?(@.name == 'headwear')].products.*