Roll back cherry-picking of the 207 branch UT fix. This PR will be merged after Sachin merges the rollback commits. #50
Open
debasatwa29 wants to merge 14 commits into
Summary: Add BloomFilterStreamFanOutHashBasedNumberedShardSpec
One pager: https://docs.google.com/document/d/173EgL8wRLGrF2o8_xMtfIPPM9HE1GDRrC28s6YsMjXc/edit?usp=sharing
Reviewers: O1139 Druid, yyang
Reviewed By: O1139 Druid, yyang
Subscribers: jenkins, yyang, mleonard, shawncao, realtime-analytics
Differential Revision: https://phabricator.pinadmin.com/D719672
commit: faf5d00
Summary: Allow mixed shard spec types for real-time ingestion
Background: Real-time ingestion into an existing data source throws exceptions if the shard spec type is changed. This diff adds an optional config to allow the change. If the config is not set, behavior is unchanged.
Reviewers: O1139 Druid, itallam
Reviewed By: O1139 Druid, itallam
Subscribers: itallam, jenkins, shawncao, realtime-analytics
Differential Revision: https://phabricator.pinadmin.com/D726029
commit: 64dc058
Summary: Fix real time shard spec compatibility issue
Reviewers: O1139 Druid, yyang, itallam
Reviewed By: O1139 Druid, yyang, itallam
Subscribers: jenkins, shawncao, realtime-analytics
Differential Revision: https://phabricator.pinadmin.com/D726744
commit: 763e96f
Summary: Add in memory bitmap support when rollup is false
Reviewers: O1139 Druid, jgu, yyang, itallam
Reviewed By: O1139 Druid, jgu, yyang, itallam
Subscribers: jenkins, shawncao, realtime-analytics
JIRA Issue(s): RTA-2719
Differential Revision: https://phabricator.pinadmin.com/D729042
commit: 0a54b60
…o monitor latency to insert rows to bloom filter and add an option to config monitor only certain data source in a broker stage
Summary: Add a metric to monitor pending persist submissions, add a metric to monitor the latency of inserting rows into the bloom filter, and add an option to configure monitoring of only certain data sources in a broker stage
Reviewers: O1139 Druid, itallam
Reviewed By: O1139 Druid, itallam
Subscribers: shawncao, realtime-analytics
Differential Revision: https://phabricator.pinadmin.com/D732268
commit: 6106c47
Summary:
Currently the Druid services export <hostname>:<port> to ZK by default, which works well when running
on Teletraan, as the hostnames are valid EC2 URLs. But when running on K8s, the hostnames become unresolvable
Pod names. This can be fixed by supporting the export of <IP>:<port> as the service address.
Test Plan:
Tested manually.
{F28879950}
Reviewers: O1139 Druid, ericnguyen
Reviewed By: O1139 Druid, ericnguyen
Differential Revision: https://phabricator.pinadmin.com/D750919
commit: 643d4a0
Signed-off-by: ssagare <ssagare@pinterest.com>
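The address change in the commit above (643d4a0) can be illustrated with a small sketch. This is not the Druid implementation: the function name `advertised_address`, the `use_ip` flag, and the injectable `resolver` are all hypothetical names chosen for illustration. It only shows the idea of announcing `<IP>:<port>` in place of `<hostname>:<port>` when the hostname would not be resolvable by other services.

```python
import socket
from typing import Callable, Optional


def advertised_address(
    port: int,
    use_ip: bool,
    hostname: Optional[str] = None,
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> str:
    """Build the address a service would announce (hypothetical helper).

    With use_ip=False the hostname is announced as-is; with use_ip=True the
    hostname is resolved locally and the IP is announced instead.
    """
    host = hostname or socket.gethostname()
    if use_ip:
        # Resolve on the local host, where the Pod name is still known.
        host = resolver(host)
    return f"{host}:{port}"


# Usage with a stubbed resolver so the example does not depend on real DNS.
print(advertised_address(8082, False, "druid-broker-0"))
print(advertised_address(8082, True, "druid-broker-0", resolver=lambda h: "10.0.0.7"))
```

The resolver is passed in as a parameter only so the sketch can be exercised without network access; a real service would more likely read its IP from the Pod environment.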
Summary: Set useInMemoryBitmapInQuery default to true. Now there's only one knob `enableInMemoryBitmap` in ingestion spec that controls whether to use in memory bitmap
Reviewers: O1139 Druid, itallam
Reviewed By: O1139 Druid, itallam
Subscribers: jenkins, shawncao, #realtime-analytics
Differential Revision: https://phabricator.pinadmin.com/D754879
Signed-off-by: ssagare <ssagare@pinterest.com>
Summary: Stream namespaced fan-out shard spec. Add namespace support to the stream fan-out shard spec.
Test Plan: Made the corresponding change in the DRUIDHADOOP repo. In the Flink producer schema, use the partition dimension together with fanOutSize to calculate the Kafka partition. Ingest some data. Query by timeline to get row info. Use the partition dimension value from the timeline query to query by filter. Both results are the same. Summed the metric ct and verified that it is the same for both as well.
Reviewers: O1139 Druid, jwang
Reviewed By: O1139 Druid, jwang
Subscribers: jwang, jenkins, mleonard, #realtime-analytics
Differential Revision: https://phabricator.pinadmin.com/D755523
commit: 648473d
Signed-off-by: ssagare <ssagare@pinterest.com>
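The test plan above mentions computing the Kafka partition from the partition dimension together with fanOutSize. The commit does not spell out the formula, so the following is only one plausible scheme, written under stated assumptions: partitions are split into contiguous groups of `fan_out_size`, the dimension value is hashed (CRC32 here, an arbitrary choice) to pick a group, and a record may land on any partition in that group. The function name and parameters are hypothetical.

```python
import zlib
from typing import List


def fan_out_partitions(dim_value: str, num_partitions: int, fan_out_size: int) -> List[int]:
    """Return the candidate partitions for a dimension value (assumed scheme).

    Assumes num_partitions is a positive multiple of fan_out_size.
    """
    if fan_out_size <= 0 or num_partitions % fan_out_size != 0:
        raise ValueError("num_partitions must be a positive multiple of fan_out_size")
    num_groups = num_partitions // fan_out_size
    # Hash the partition dimension value to choose a group deterministically.
    group = zlib.crc32(dim_value.encode("utf-8")) % num_groups
    start = group * fan_out_size
    # Every partition in the group is a valid target for this value.
    return list(range(start, start + fan_out_size))


candidates = fan_out_partitions("partner_42", num_partitions=12, fan_out_size=3)
print(candidates)
```

Under this assumed layout, a query filtering on one dimension value only needs to consider the `fan_out_size` partitions in that value's group, which is what makes the filter query and the timeline query comparable.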
…; fix an issue on loading bloom filters in broker
Summary:
Pull upstream fix apache#10664 to remove confusing error messages in the log: "Not all bytes were read from the S3ObjectInputStream"
Add a query context returnEmptyResults for debugging the effect of pruning
Fix a read-only byte buffer exception that prevented the broker from loading bloom filters
Reviewers: O1139 Druid, yyang
Reviewed By: O1139 Druid, yyang
Subscribers: jenkins, shawncao, realtime-analytics
Differential Revision: https://phabricator.pinadmin.com/D755650
commit: 54b73af
Signed-off-by: ssagare <ssagare@pinterest.com>
…ema definition to support both real time and batch segments
Summary:
Add generic bloom filter index creation support in ingestion spec schema definition to support both real time and batch segments
Added a flag `createBloomFilterIndex`, which defaults to false and can optionally be set to true for any String dimension in the ingestion schema. When set, a bloom filter index is created for that dimension, which broker hosts can later use to prune segments. The segment pruning logic in the broker process now considers filters on dimensions that have bloom filter indexes, in addition to filters on the partition dimensions used by some shard specs such as HashBasedShardSpec, SingleDimensionShardSpec and BloomFilterNamedShardSpec. The bloom filter index can be enabled regardless of which shard spec is in use and regardless of whether a segment is created by batch or real-time ingestion.
"dimensionsSpec": {
  "dimensions": [
    {"name": "partner_id", "type": "string", "createBloomFilterIndex": true},
    "eventtype",
    "app",
    {"name": "root_pin_id", "type": "string", "createBloomFilterIndex": true},
    "pin_id",
    "contenttype",
    "pinformat"
  ]
}
Test Plan: Unit test and integration test
Reviewers: O1139 Druid, jgu, itallam
Reviewed By: O1139 Druid, jgu, itallam
Subscribers: jenkins, mleonard, shawncao, #realtime-analytics
Differential Revision: https://phabricator.pinadmin.com/D747062
commit: f1f73d1
Signed-off-by: ssagare <ssagare@pinterest.com>
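The pruning idea described in the commit above (f1f73d1) can be sketched in a few lines. This is not the broker's code: the `BloomFilter` class, the `prune_segments` function, the filter sizes, and the SHA-256 based hashing are all assumptions made for illustration. The sketch shows why a per-dimension bloom filter lets a broker skip segments: a bloom filter never reports a value it contains as absent, so a segment whose filter says "not present" can be dropped safely, while a segment with no index on the dimension must be kept.

```python
import hashlib
from typing import Dict, Iterator, List


class BloomFilter:
    """Tiny illustrative bloom filter backed by a Python int used as a bit set."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, value: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting the value with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value: str) -> None:
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def prune_segments(
    segment_filters: Dict[str, Dict[str, BloomFilter]], dimension: str, value: str
) -> List[str]:
    """Keep segments that may contain dimension == value (hypothetical helper)."""
    kept = []
    for segment_id, filters in segment_filters.items():
        bloom = filters.get(dimension)
        # No index on this dimension: the segment cannot be ruled out.
        if bloom is None or bloom.might_contain(value):
            kept.append(segment_id)
    return kept


seg_a = BloomFilter()
seg_a.add("p1")
segments = {
    "seg_a": {"partner_id": seg_a},          # indexed, contains "p1"
    "seg_b": {"partner_id": BloomFilter()},  # indexed, empty filter
    "seg_c": {},                             # no bloom filter index
}
print(prune_segments(segments, "partner_id", "p1"))
```

Because false positives are possible, pruning of this kind only reduces the set of segments scanned; the query still has to apply the real filter to the rows in the segments that remain.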
Fixes # Roll back cherry-picking of the 207 branch UT fix.
Roll back cherry-picking of the 207 branch UT fix. This PR will be merged after Sachin merges the rollback commits.
This PR has: