|
| 1 | +# PIP-404: Introduce per-ledger properties |
| 2 | + |
| 3 | +# Background knowledge |
| 4 | + |
| 5 | +BookKeeper has no secondary index, so we can't query entries by message metadata efficiently. |
| 6 | +The `ManagedCursor` provides a method `asyncFindNewestMatching` to find the newest entry that matches the given |
| 7 | +predicate by binary search (see `OpFindNewest.java`). |
| 8 | +In https://github.com/apache/pulsar/pull/22792, we optimized seeking by timestamp by using `LedgerInfo#timestamp` |
| 9 | +to calculate the range of ledgers that may contain the target timestamp, so we no longer need to scan all |
| 10 | +ledgers. |
| 11 | + |
| 12 | +However, when `AppendIndexMetadataInterceptor` is enabled and we want to query entries by `BrokerEntryMetadata#index`, |
| 13 | +there is no comparable optimization: |
| 14 | +we have to binary-search across all ledgers to find the target entry. |
| 15 | + |
| 16 | +# Motivation |
| 17 | + |
| 18 | +Introduce per-ledger properties so that extra per-ledger information can be stored in `LedgerInfo`. |
| 19 | +This lets us query entries by an incremental index, such as `BrokerEntryMetadata#index`, more efficiently. |
| 20 | + |
| 21 | +# Goals |
| 22 | + |
| 23 | +## In Scope |
| 24 | + |
| 25 | +* Provide a generic way to set per-ledger properties that any plugin can use. |
| 26 | + |
| 27 | +## Out of Scope |
| 28 | + |
| 29 | +* Setting specific per-ledger properties. This is left to the individual plugin, such as `KoP`. |
| 30 | + |
| 31 | +# High Level Design |
| 32 | + |
| 33 | +We can store the incremental index of the first entry in each ledger, for example its `BrokerEntryMetadata#index`, |
| 34 | +in `LedgerInfo#properties`. |
| 35 | +When we want to query entries by `BrokerEntryMetadata#index`, we can use `LedgerInfo#properties` to calculate the range |
| 36 | +of ledgers that may contain the target index, so we don't need to scan all ledgers. |
| 37 | + |
| 38 | +In https://github.com/apache/pulsar/pull/22792, we provided a new method to find the newest entry within a given range of entries: |
| 39 | +```java |
| 40 | +void asyncFindNewestMatching(FindPositionConstraint constraint, Predicate<Entry> condition, |
| 41 | + Position startPosition, Position endPosition, FindEntryCallback callback, |
| 42 | + Object ctx, boolean isFindFromLedger) { |
| 43 | + } |
| 44 | +``` |
| 45 | +We can use this method directly in the above scenario. |
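
The narrowing step described above could look roughly like the following. This is only a sketch under stated assumptions: the property key `firstIndex`, the class `LedgerIndexLookup`, and the plain `NavigableMap` standing in for the managed ledger's `LedgerInfo` list are all hypothetical and not part of this proposal. It assumes ledgers are ordered by ID and that the first index grows with the ledger ID.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class LedgerIndexLookup {
    // Hypothetical property key; the proposal does not define one.
    static final String FIRST_INDEX_KEY = "firstIndex";

    /**
     * Returns the ID of the last ledger whose recorded first index is <= targetIndex,
     * or -1 if no such ledger exists. Ledgers without the property are skipped here;
     * a real implementation would need a fallback for them.
     */
    static long findCandidateLedger(NavigableMap<Long, Map<String, String>> ledgers,
                                    long targetIndex) {
        long candidate = -1L;
        for (Map.Entry<Long, Map<String, String>> e : ledgers.entrySet()) {
            String raw = e.getValue().get(FIRST_INDEX_KEY);
            if (raw == null) {
                continue; // no property recorded for this ledger
            }
            long firstIndex = Long.parseLong(raw);
            if (firstIndex > targetIndex) {
                break; // this ledger and all later ones start after the target
            }
            candidate = e.getKey();
        }
        return candidate;
    }

    public static void main(String[] args) {
        NavigableMap<Long, Map<String, String>> ledgers = new TreeMap<>();
        ledgers.put(10L, Map.of(FIRST_INDEX_KEY, "0"));
        ledgers.put(11L, Map.of(FIRST_INDEX_KEY, "100"));
        ledgers.put(12L, Map.of(FIRST_INDEX_KEY, "250"));
        System.out.println(findCandidateLedger(ledgers, 150L)); // prints 11
    }
}
```

The start and end positions of the candidate ledger could then be passed as `startPosition` and `endPosition` to `asyncFindNewestMatching`, so the binary search touches a single ledger instead of all of them.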
| 46 | + |
| 47 | +# Detailed Design |
| 48 | + |
| 49 | +## Public-facing Changes |
| 50 | + |
| 51 | +### Public API |
| 52 | + |
| 53 | +* Add the following methods to `ManagedLedger`: |
| 54 | + |
| 55 | +```java |
| 56 | + CompletableFuture<Void> asyncAddLedgerProperty(long ledgerId, String key, String value); |
| 57 | + CompletableFuture<Void> asyncRemoveLedgerProperty(long ledgerId, String key); |
| 58 | +``` |
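
To illustrate how a plugin might call these two methods, here is a minimal sketch. The interface `LedgerPropertyOps` and the in-memory class `InMemoryLedgerProperties` are hypothetical stand-ins that only mirror the two proposed signatures; they are not the actual `ManagedLedger` implementation, which would presumably persist the properties in the `LedgerInfo` metadata.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in exposing only the two proposed methods.
interface LedgerPropertyOps {
    CompletableFuture<Void> asyncAddLedgerProperty(long ledgerId, String key, String value);
    CompletableFuture<Void> asyncRemoveLedgerProperty(long ledgerId, String key);
}

// In-memory implementation for illustration only.
class InMemoryLedgerProperties implements LedgerPropertyOps {
    final Map<Long, Map<String, String>> store = new ConcurrentHashMap<>();

    @Override
    public CompletableFuture<Void> asyncAddLedgerProperty(long ledgerId, String key, String value) {
        store.computeIfAbsent(ledgerId, id -> new ConcurrentHashMap<>()).put(key, value);
        return CompletableFuture.completedFuture(null);
    }

    @Override
    public CompletableFuture<Void> asyncRemoveLedgerProperty(long ledgerId, String key) {
        Map<String, String> props = store.get(ledgerId);
        if (props != null) {
            props.remove(key);
        }
        return CompletableFuture.completedFuture(null);
    }
}
```

A plugin could, for example, call `asyncAddLedgerProperty(ledgerId, "firstIndex", "100")` when a new ledger is opened, where `firstIndex` is again an assumed key name.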
| 59 | + |
| 60 | +### Binary protocol |
| 61 | + |
| 62 | +* Add a new field `properties` to `LedgerInfo`: |
| 63 | + |
| 64 | +```protobuf |
| 65 | +message LedgerInfo { |
| 66 | + required int64 ledgerId = 1; |
| 67 | + optional int64 entries = 2; |
| 68 | + optional int64 size = 3; |
| 69 | + optional int64 timestamp = 4; |
| 70 | + optional OffloadContext offloadContext = 5; |
| 71 | + // Add the following field |
| 72 | + repeated KeyValue properties = 6; |
| 73 | +} |
| 74 | +``` |
| 75 | + |
| 76 | +# Backward & Forward Compatibility |
| 77 | + |
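| 78 | +The change is fully backward compatible: `properties` is a new `repeated` field, so `LedgerInfo` records written without it can still be read. |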
| 79 | + |
| 80 | +# Alternatives |
| 81 | + |
| 82 | +None |
| 83 | + |
| 84 | +# Links |
| 85 | + |
| 86 | +<!-- |
| 87 | +Updated afterwards |
| 88 | +--> |
| 89 | + |
| 90 | +* Mailing List discussion thread: https://lists.apache.org/thread/bcf13todophd05tn0qrrdwqyw8yvboly |
| 91 | +* Mailing List voting thread: https://lists.apache.org/thread/dtj9ccntsjorb54yrb5fr3pppwl4m2r5 |