IBX-7987: Node filter for extraction text #155
konradoboza
left a comment
Interesting approach and convenient usage. LGTM.
    /**
     * Return false to preserve the node, true to remove it.
     */
    public function filter(DOMNode $node): bool;
IMHO the result of false/true combined with filter might be a bit misleading. My first thought was that it should work the opposite way. Maybe changing it to filterOut would be more clear?
I personally stand with the original naming.
I have to agree with @alongosz here.
Most methods (ArrayCollection::filter) and functions (array_filter) work in the opposite way:
- when true, an entry is preserved
- when false, an entry is removed
Therefore, I'd suggest reversing the logic to comply with generally established PHP practice.
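For reference, a minimal sketch of the array_filter convention mentioned above. The node names in the list are illustrative only and are not taken from the PR.

```php
<?php

declare(strict_types=1);

// Illustrative node names; only the callback semantics matter here.
$nodes = ['para', 'ezconfig', 'title'];

// array_filter keeps an entry when the callback returns true
// and drops it when the callback returns false.
$kept = array_filter(
    $nodes,
    static function (string $name): bool {
        return $name !== 'ezconfig';
    }
);

// array_filter preserves the original keys, so reindex before printing.
print_r(array_values($kept));
```

Under this convention, a filter method returning true would mean "preserve the node", which is the opposite of the docblock quoted in the diff.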
….php Co-authored-by: Andrew Longosz <alongosz@users.noreply.github.com>




Background
See JIRA issue.
Node filtering
This PR introduces an extension point that allows excluding nodes from text extraction:
\Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor\NodeFilterInterface implementations should be tagged via ibexa.field_type.richtext.text_extractor.node_filter
Common use case
\Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor\NodeFilterFactoryInterface [1] covers the common use case: excluding a node with a given path, e.g. /foo/bar/baz.
Usage example:
[1] The Factory pattern prevents the exposure of implementation details within contracts.
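To illustrate the idea of path-based exclusion, here is a minimal, self-contained sketch. It is not the PR's implementation: the interface below is a local stand-in that copies the signature quoted in the review (true removes the node, which may have been reversed after the discussion), and PathNodeFilter and extractText are hypothetical names introduced only for this example. The real factory's method names are not shown in this conversation, so the sketch does not use it.

```php
<?php

declare(strict_types=1);

// Local stand-in for the contract, using the signature quoted in the review.
interface NodeFilterInterface
{
    /**
     * Return false to preserve the node, true to remove it.
     */
    public function filter(DOMNode $node): bool;
}

// Hypothetical filter: removes the element whose element-name path equals $path.
final class PathNodeFilter implements NodeFilterInterface
{
    /** @var string */
    private $path;

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    public function filter(DOMNode $node): bool
    {
        return $this->buildPath($node) === $this->path;
    }

    // Walks up through the ancestor elements and joins their local names.
    // Non-element nodes (e.g. text) yield "/", so they never match a path.
    private function buildPath(DOMNode $node): string
    {
        $names = [];
        for ($current = $node; $current instanceof DOMElement; $current = $current->parentNode) {
            array_unshift($names, $current->localName);
        }

        return '/' . implode('/', $names);
    }
}

// Hypothetical extractor: concatenates text nodes, skipping filtered subtrees.
function extractText(DOMNode $node, NodeFilterInterface $filter): string
{
    if ($filter->filter($node)) {
        return '';
    }

    if ($node instanceof DOMText) {
        return (string) $node->nodeValue;
    }

    if (!$node->hasChildNodes()) {
        return '';
    }

    $text = '';
    foreach ($node->childNodes as $child) {
        $text .= extractText($child, $filter);
    }

    return $text;
}

$document = new DOMDocument();
$document->loadXML('<foo><bar><baz>hidden</baz><qux>visible</qux></bar></foo>');

// Prints "visible": the /foo/bar/baz subtree is skipped.
echo extractText($document->documentElement, new PathNodeFilter('/foo/bar/baz')), PHP_EOL;
```

The path is compared by element names only, so namespaced documents or sibling indexes would need a more careful matcher.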
Nodes excluded out of the box
The following nodes are excluded by default (XPath syntax):
//eztemplate/ezconfig
Notes for QA
Identify other elements that should potentially be excluded from the full-text index.