Commit 033fec4

[improve][pip] PIP-404: Introduce per ledger properties (#23837)
1 parent bfbd5cf commit 033fec4

1 file changed

Lines changed: 91 additions & 0 deletions

@@ -0,0 +1,91 @@
# PIP-404: Introduce per-ledger properties

# Background knowledge

BookKeeper has no secondary index, so we can't efficiently query entries by message metadata.
`ManagedCursor` provides the method `asyncFindNewestMatching`, which finds the newest entry that matches a given
predicate by binary search (see `OpFindNewest.java`).
In https://github.com/apache/pulsar/pull/22792, we optimized seeking by timestamp: `LedgerInfo#timestamp` is used to
calculate the range of ledgers that may contain the target timestamp, so we don't need to scan all ledgers.

However, when `AppendIndexMetadataInterceptor` is enabled and we want to query entries by `BrokerEntryMetadata#index`,
there is no equally efficient way:
we have to binary-search across all ledgers to find the target entry.

# Motivation

Introduce per-ledger properties so that extra per-ledger information can be stored in `LedgerInfo`.
With them, we can query entries by an `incremental index`, such as `BrokerEntryMetadata#index`, more efficiently.

# Goals

## In Scope

* Provide a generic way to set per-ledger properties that any plugin can use.

## Out of Scope

* Setting the actual extra per-ledger properties. This should be done by the specific plugin, such as `KoP`.

# High Level Design

We can store the `incremental index` of the first entry in each ledger, say `BrokerEntryMetadata#index`, in
`LedgerInfo#properties`.
When we want to query entries by `BrokerEntryMetadata#index`, we can use `LedgerInfo#properties` to calculate the range
of ledgers that may contain the target index, so we don't need to scan all ledgers.

In https://github.com/apache/pulsar/pull/22792, we provided a new method to find the newest entry within a given range of entries:
```java
void asyncFindNewestMatching(FindPositionConstraint constraint, Predicate<Entry> condition,
                             Position startPosition, Position endPosition, FindEntryCallback callback,
                             Object ctx, boolean isFindFromLedger) {
}
```
We can use this method directly in the above scenario.

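The ledger-range calculation described in this section could be sketched roughly as follows. This is a minimal illustration, not Pulsar code: the class `LedgerIndexLookup`, its methods, and the property key `firstIndex` are hypothetical names, and the sketch assumes each ledger stores the index of its first entry as a decimal string in its properties. Given a target index, it picks the ledger with the greatest first index that is not greater than the target; the boundaries of that ledger could then serve as `startPosition` and `endPosition` for `asyncFindNewestMatching`.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.OptionalLong;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Pulsar.
class LedgerIndexLookup {
    // Assumed property key; the proposal does not define one.
    static final String FIRST_INDEX_KEY = "firstIndex";

    // Builds a sorted map from "index of the first entry" to ledger id.
    // Ledgers that lack the property are skipped.
    static NavigableMap<Long, Long> indexByFirstEntry(Map<Long, Map<String, String>> propertiesByLedger) {
        NavigableMap<Long, Long> firstIndexToLedger = new TreeMap<>();
        propertiesByLedger.forEach((ledgerId, props) -> {
            String first = props.get(FIRST_INDEX_KEY);
            if (first != null) {
                firstIndexToLedger.put(Long.parseLong(first), ledgerId);
            }
        });
        return firstIndexToLedger;
    }

    // Returns the ledger whose first index is the greatest one <= targetIndex, if any.
    static OptionalLong findCandidateLedger(NavigableMap<Long, Long> firstIndexToLedger, long targetIndex) {
        Map.Entry<Long, Long> floor = firstIndexToLedger.floorEntry(targetIndex);
        return floor == null ? OptionalLong.empty() : OptionalLong.of(floor.getValue());
    }

    public static void main(String[] args) {
        Map<Long, Map<String, String>> props = Map.of(
                10L, Map.of(FIRST_INDEX_KEY, "0"),
                11L, Map.of(FIRST_INDEX_KEY, "500"),
                12L, Map.of(FIRST_INDEX_KEY, "1200"));
        NavigableMap<Long, Long> lookup = indexByFirstEntry(props);
        System.out.println(findCandidateLedger(lookup, 750)); // prints OptionalLong[11]
    }
}
```

A sorted map keeps the lookup logarithmic in the number of ledgers. Because ledgers without the property are skipped here, a real implementation would presumably need to fall back to the full search for them.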
# Detailed Design

## Public-facing Changes

### Public API

* Add the following methods to `ManagedLedger`:

```java
CompletableFuture<Void> asyncAddLedgerProperty(long ledgerId, String key, String value);
CompletableFuture<Void> asyncRemoveLedgerProperty(long ledgerId, String key);
```

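To illustrate how a plugin might call these two methods, here is a minimal in-memory stand-in. It is not Pulsar's `ManagedLedger`: the class `InMemoryLedgerProperties` and the read accessor `getLedgerProperty` are hypothetical, and the behavior shown (adding overwrites an existing key, removing a missing key is a no-op, futures complete immediately) is an assumption, since the proposal does not spell out these semantics.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for illustration only; the real ManagedLedger persists LedgerInfo in the metadata store.
class InMemoryLedgerProperties {
    private final Map<Long, Map<String, String>> propertiesByLedger = new ConcurrentHashMap<>();

    // Assumed semantics: an existing value for the same key is overwritten.
    CompletableFuture<Void> asyncAddLedgerProperty(long ledgerId, String key, String value) {
        propertiesByLedger.computeIfAbsent(ledgerId, id -> new ConcurrentHashMap<>()).put(key, value);
        return CompletableFuture.completedFuture(null);
    }

    // Assumed semantics: removing a key that does not exist succeeds silently.
    CompletableFuture<Void> asyncRemoveLedgerProperty(long ledgerId, String key) {
        Map<String, String> props = propertiesByLedger.get(ledgerId);
        if (props != null) {
            props.remove(key);
        }
        return CompletableFuture.completedFuture(null);
    }

    // Hypothetical read accessor, added only so the example can show the result.
    String getLedgerProperty(long ledgerId, String key) {
        Map<String, String> props = propertiesByLedger.get(ledgerId);
        return props == null ? null : props.get(key);
    }

    public static void main(String[] args) {
        InMemoryLedgerProperties ledger = new InMemoryLedgerProperties();
        ledger.asyncAddLedgerProperty(11L, "firstIndex", "500")
                .thenCompose(v -> ledger.asyncRemoveLedgerProperty(10L, "firstIndex"))
                .join();
        System.out.println(ledger.getLedgerProperty(11L, "firstIndex")); // prints 500
    }
}
```

Returning `CompletableFuture<Void>` lets callers chain the update with follow-up work, as the `thenCompose` call shows, without blocking on the metadata write.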
### Binary protocol

* Add a new field `properties` to `LedgerInfo`:

```protobuf
message LedgerInfo {
    required int64 ledgerId = 1;
    optional int64 entries = 2;
    optional int64 size = 3;
    optional int64 timestamp = 4;
    optional OffloadContext offloadContext = 5;
    // Add the following field
    repeated KeyValue properties = 6;
}
```

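Since `properties` is a repeated list of key/value pairs rather than a map, a reader would typically flatten it before doing lookups. The sketch below shows one way to do that. It uses a hand-written `KeyValue` stand-in instead of the generated protobuf class, and it assumes that `KeyValue` carries a string key and a string value and that a later entry with the same key should win; neither is specified by the proposal. The class name `LedgerPropertiesView` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only; not the generated protobuf code.
class LedgerPropertiesView {
    // Stand-in for the protobuf KeyValue message (assumed shape: string key, string value).
    static final class KeyValue {
        final String key;
        final String value;

        KeyValue(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Flattens the repeated field into a map; a later entry overwrites an earlier one with the same key.
    static Map<String, String> toMap(List<KeyValue> properties) {
        Map<String, String> result = new LinkedHashMap<>();
        for (KeyValue kv : properties) {
            result.put(kv.key, kv.value);
        }
        return result;
    }

    public static void main(String[] args) {
        List<KeyValue> properties = List.of(
                new KeyValue("firstIndex", "0"),
                new KeyValue("firstIndex", "500"));
        System.out.println(toMap(properties)); // prints {firstIndex=500}
    }
}
```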
# Backward & Forward Compatibility

It is fully backward compatible: `properties` is a new `repeated` field, so existing `LedgerInfo` metadata that lacks it can still be parsed.

# Alternatives

None

# Links

<!--
Updated afterwards
-->

* Mailing List discussion thread: https://lists.apache.org/thread/bcf13todophd05tn0qrrdwqyw8yvboly
* Mailing List voting thread: https://lists.apache.org/thread/dtj9ccntsjorb54yrb5fr3pppwl4m2r5
