Commit 968c0aa

add readme
1 parent 88401c3 commit 968c0aa

1 file changed

Lines changed: 123 additions & 0 deletions

README.md

# SearchExpressionParser

![Swift 4.2](https://img.shields.io/badge/Swift-4.2-blue.svg?style=flat)
![Version](https://img.shields.io/github/tag/CleanCocoa/SearchExpressionParser.svg?style=flat)
![License](https://img.shields.io/github/license/CleanCocoa/SearchExpressionParser.svg?style=flat)
![Platform](https://img.shields.io/badge/platform-macOS-lightgrey.svg?style=flat)
[![Carthage compatible](https://img.shields.io/badge/Carthage-compatible-4BC51D.svg?style=flat)](https://github.com/Carthage/Carthage)

Parses search strings (as in: what you put into a search engine) into evaluable expressions.

## Parsing

You call `Parser.parse(searchString:)`, which returns a tree of the parsed expressions. You can then ask the resulting `Expression` object whether it matches a given haystack, for example:

```swift
import SearchExpressionParser
guard let expr = try? Parser.parse(searchString: "Hello") else { fatalError() }
expr.isSatisfied(by: "Hello World!") // true
```

Empty search strings evaluate to a wildcard matching anything.

### Efficient full-text search

To use search expressions efficiently in an app, I found it beneficial to operate on an all-lowercase representation of the text and search it with C's `strstr`.

In a note-taking app, for example, consider keeping a lowercased copy of your notes in memory and using C string comparison for the expressions.

First, make your text conform to the `CStringExpressionSatisfiable` protocol:

```swift
import Foundation
import SearchExpressionParser

struct Note {
    let text: String
    private let cString: [CChar]

    init(text: String) {
        self.text = text
        self.cString = text
            // Prefer precomposed characters over decomposed grapheme clusters
            .precomposedStringWithCanonicalMapping
            .cString(using: .utf8)!
    }
}

extension Note: CStringExpressionSatisfiable {
    func matches(needle: [CChar]) -> Bool {
        // strstr returns nil when the needle does not occur in the haystack
        return strstr(self.cString, needle) != nil
    }
}
```

Then pass this object to the expression.

```swift
// String(contentsOfFile:encoding:) throws; `try!` keeps the example short
let warAndPeace = Note(text: try! String(contentsOfFile: "books/Tolstoy/War-and-Peace.txt", encoding: .utf8))
let protagonist = try! Parser.parse(searchString: "\"Pierre Bezukhov\" OR \"Pyotr Kirillovich\"")
protagonist.isSatisfied(by: warAndPeace) // true
```

This sadly puts the burden of implementing the matching algorithm on your side, but that is by design: you keep a C string around instead of relying on the framework to convert the text for you on the fly for every search, which would defeat the purpose. The speed gain over regular `String.contains` matching, which gets even slower when emoji are involved, is well worth the couple of extra lines of code.
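
The `Note` example stores the text as-is. A lowercasing variant, following the advice at the top of this section, might look like the sketch below. `LowercasedText` and `lowercasedNeedle` are hypothetical names and not part of the framework. The sketch also assumes you lowercase the needle yourself; it does not rely on the framework doing that for you.

```swift
import Foundation

// Hypothetical type for illustration; not part of SearchExpressionParser.
struct LowercasedText {
    private let cString: [CChar]

    init(_ text: String) {
        // Lowercase and normalize once, so every later search is a plain byte scan.
        self.cString = Array(
            text.lowercased()
                .precomposedStringWithCanonicalMapping
                .utf8CString
        )
    }

    /// `needle` must be a lowercased, NUL-terminated UTF-8 C string.
    func matches(needle: [CChar]) -> Bool {
        return strstr(cString, needle) != nil
    }
}

// Hypothetical helper: prepares a search term the same way as the haystack.
func lowercasedNeedle(_ term: String) -> [CChar] {
    return Array(term.lowercased().precomposedStringWithCanonicalMapping.utf8CString)
}

let text = LowercasedText("Hello World!")
let found = text.matches(needle: lowercasedNeedle("WORLD"))   // true
let missing = text.matches(needle: lowercasedNeedle("mars"))  // false
```

Normalizing haystack and needle the same way matters here: `strstr` compares bytes, so a precomposed "é" in one and a decomposed "é" in the other would not match.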

### Operators

Operators are all caps: `AND`, `OR`, `NOT`/`!`.

- `foo bar baz` is equivalent to `foo AND bar AND baz`
- `NOT b` equals `!b`
- `! b` (note the space) is `! AND b`
- `"!b"` is a phrase search for "!b", matching the literal exclamation mark
- Escaping works in addition to phrase search, too: `\!b`
- Escaping in phrase searches also works: `hello "you \"lovely\" specimen"`
- Escaping operator keywords treats them as literals: `\AND`. Note that a lowercase "and" is not treated as an operator; only all-caps is.
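
To make these rules concrete, here is a minimal tokenizer sketch that follows the list above. It is not the framework's own lexer: `Token`, `tokenize`, and `withImplicitAnd` are hypothetical names, parentheses are left out, and inserting an implicit `AND` before a `NOT` is my assumption.

```swift
// Hypothetical token type; the framework's own lexer may look different.
enum Token: Equatable {
    case word(String)    // bare or escaped search term
    case phrase(String)  // "quoted phrase"
    case and, or, not

    var isOperand: Bool {
        switch self {
        case .word, .phrase: return true
        case .and, .or, .not: return false
        }
    }
}

// Inserts AND between adjacent terms, so `foo bar` reads as `foo AND bar`.
func withImplicitAnd(_ tokens: [Token]) -> [Token] {
    var result: [Token] = []
    for token in tokens {
        if let last = result.last, last.isOperand, token.isOperand || token == .not {
            result.append(.and)
        }
        result.append(token)
    }
    return result
}

func tokenize(_ input: String) -> [Token] {
    let chars = Array(input)
    var tokens: [Token] = []
    var i = 0

    while i < chars.count {
        let c = chars[i]
        if c == " " {
            i += 1
        } else if c == "!", i + 1 < chars.count, chars[i + 1] != " " {
            // "!b" negates what follows; a lone "!" falls through to the word case.
            tokens.append(.not)
            i += 1
        } else if c == "\"" {
            // Phrase: read up to the next unescaped quote.
            var phrase = ""
            i += 1
            while i < chars.count, chars[i] != "\"" {
                if chars[i] == "\\", i + 1 < chars.count { i += 1 }
                phrase.append(chars[i])
                i += 1
            }
            i += 1  // skip the closing quote
            tokens.append(.phrase(phrase))
        } else {
            // Bare word: a backslash makes the next character literal.
            var word = ""
            var escaped = false
            while i < chars.count, chars[i] != " " {
                if chars[i] == "\\", i + 1 < chars.count {
                    escaped = true
                    i += 1
                }
                word.append(chars[i])
                i += 1
            }
            switch (escaped, word) {
            case (false, "AND"): tokens.append(.and)
            case (false, "OR"): tokens.append(.or)
            case (false, "NOT"): tokens.append(.not)
            default: tokens.append(.word(word))
            }
        }
    }
    return withImplicitAnd(tokens)
}

let implicit = tokenize("foo bar baz")  // word(foo), and, word(bar), and, word(baz)
let negated = tokenize("!b")            // not, word(b)
let literal = tokenize("! b")           // word(!), and, word(b)
let escapedOp = tokenize("\\AND")       // word(AND)
```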

You can parenthesize expressions:

    !(foo OR (baz AND !bar))

By De Morgan's laws, this is equivalent to:

    !foo AND (!baz OR bar)
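
Rewrites like this are easy to get wrong, so a brute-force truth table is a cheap check. The sketch below compares the parenthesized term with the expansion that De Morgan's laws give, `!foo AND (!baz OR bar)`, using plain Swift booleans. The function names are made up for this example, and nothing here touches the parser.

```swift
// The parenthesized term, written with Swift's boolean operators.
func original(_ foo: Bool, _ baz: Bool, _ bar: Bool) -> Bool {
    return !(foo || (baz && !bar))
}

// The expansion via De Morgan's laws.
func expanded(_ foo: Bool, _ baz: Bool, _ bar: Bool) -> Bool {
    return !foo && (!baz || bar)
}

// Try all eight combinations and count disagreements.
var mismatches = 0
for foo in [false, true] {
    for baz in [false, true] {
        for bar in [false, true] {
            if original(foo, baz, bar) != expanded(foo, baz, bar) {
                mismatches += 1
            }
        }
    }
}
print(mismatches) // 0
```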

There is no real operator precedence implementation yet, because the full-text search context I built this for didn't need one.

The `Expression` object of this nested term looks like this:

    // !(foo OR (baz AND !bar))
    NotNode(
        OrNode(lhs: ContainsNode("foo"),
               rhs: AndNode(lhs: ContainsNode("baz"),
                            rhs: NotNode(ContainsNode("bar")))))

### Expressions

When you call the high-level `Parser.parse(searchString:)` entry point, you get back an object that conforms to `Expression`.

The `Expression` protocol is:

    public protocol Expression {
        func isSatisfied(by satisfiable: StringExpressionSatisfiable) -> Bool
        func isSatisfied(by satisfiable: CStringExpressionSatisfiable) -> Bool
    }

You pass the haystack, e.g. the text you want to search, to `isSatisfied(by:)`.

When the case of words doesn't matter, remember that it's much faster to make the text you want to search conform to `CStringExpressionSatisfiable` and pass _that_ in instead. See above for details.

The expressions provided are:

- `AnythingNode` matches anything you pass to it; it's the wildcard or empty search.
- `ContainsNode` represents a check similar to `String.contains`.
- `NotNode` wraps one other node and inverts its result.
- `AndNode` and `OrNode` both take two other nodes and combine their results with the equivalent boolean operator.
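
To illustrate how such a tree evaluates, here is a self-contained sketch that collapses the node types into a single enum. `Node` is a hypothetical stand-in for this README only; the framework ships separate node types and its implementation may differ.

```swift
import Foundation

// Hypothetical stand-in for the node types listed above.
indirect enum Node {
    case anything               // wildcard, matches every haystack
    case contains(String)       // substring check
    case not(Node)              // inverts the wrapped node
    case and(Node, Node)        // both sides must match
    case or(Node, Node)         // at least one side must match

    func isSatisfied(by haystack: String) -> Bool {
        switch self {
        case .anything:
            return true
        case .contains(let needle):
            return haystack.contains(needle)
        case .not(let node):
            return !node.isSatisfied(by: haystack)
        case .and(let lhs, let rhs):
            return lhs.isSatisfied(by: haystack) && rhs.isSatisfied(by: haystack)
        case .or(let lhs, let rhs):
            return lhs.isSatisfied(by: haystack) || rhs.isSatisfied(by: haystack)
        }
    }
}

// !(foo OR (baz AND !bar))
let expr: Node = .not(.or(.contains("foo"),
                          .and(.contains("baz"), .not(.contains("bar")))))

print(expr.isSatisfied(by: "baz bar"))   // true: no "foo", and "baz" comes with "bar"
print(expr.isSatisfied(by: "just baz"))  // false: "baz" without "bar"
print(Node.anything.isSatisfied(by: "")) // true
```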

## License

Copyright (c) 2018 Christian Tietze. Distributed under the MIT License.

## Apps that use this

- [The Archive](https://zettelkasten.de/the-archive/), a fast plain-text note-taking app for macOS.
