Add operator select which selects a sample from one of the inputs#3130
Add operator select which selects a sample from one of the inputs#3130mzient wants to merge 4 commits into
select which selects a sample from one of the inputs#3130Conversation
based on a run-time provided index. Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
|
CI MESSAGE: [2553368]: BUILD STARTED |
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
|
CI MESSAGE: [2553418]: BUILD STARTED |
|
CI MESSAGE: [2553418]: BUILD FAILED |
| namespace dali { | ||
|
|
||
| DALI_SCHEMA(Select) | ||
| .DocStr(R"(Builds a batch by selecting each sample from one of the input batches. |
There was a problem hiding this comment.
| .DocStr(R"(Builds a batch by selecting each sample from one of the input batches. | |
| .DocStr(R"(Builds an output by selecting each sample from one of the inputs. |
I don't think we need to stress out that the input/output is batch of samples.
| .DocStr(R"(Builds a batch by selecting each sample from one of the input batches. | ||
|
|
||
| This operator is useful for conditionally selecting results of different operations. | ||
| The shapes of the corresponding samples in the inputs may differ, but the number of dimensions |
| sample is taken. | ||
| Providing a negative index will produce an empty tensor with the same number of dimensions as |
There was a problem hiding this comment.
| sample is taken. | |
| Providing a negative index will produce an empty tensor with the same number of dimensions as | |
| sample is taken. | |
| Providing a negative index will produce an empty tensor with the same number of dimensions as |
If you want to have a newline.
| char *out_ptr = static_cast<char*>(out.raw_mutable_tensor(i)); | ||
| const char *in_ptr = static_cast<const char*>(inp.raw_tensor(i)); | ||
| ptrdiff_t start = 0; | ||
| for (int block = 0; block < blocks; block++) { | ||
| ptrdiff_t end = sample_size * (block + 1) / blocks; | ||
| tp.AddWork([in_ptr, out_ptr, start, end](int) { | ||
| memcpy(out_ptr + start, in_ptr + start, end - start); | ||
| }, end - start); | ||
| start = end; | ||
| } |
There was a problem hiding this comment.
I think this is a similar pattern used numpy reader. Maybe we can have a fucntion for this (even, make contiguous can use that in future).
There was a problem hiding this comment.
I guess we can factor it out when there are more usages of this kind of copy. NumpyReader is a bit more involved, though.
Regarding MakeContiguous - we won't have cpu2cpu (mixed) MakeContiguous when we unify workspaces and buffer objects.
| "All inputs must have the same type. " | ||
| "Got: ", inp0.type().id(), " and ", inp.type().id())); | ||
|
|
||
| DALI_ENFORCE(inp.sample_dim() == sample_dim, make_string( |
There was a problem hiding this comment.
Shouldn't we check layouts as well?
There was a problem hiding this comment.
No; the semantics are that we pick the first non-empty layout - this sort-of implies that they can differ - and there are checks in the executor that require the non-empty input layouts to be of correct length.
| return out | ||
| check_single_input(fn.squeeze, axis_names="YZ", get_data=get_data, input_layout="HWCYZ") | ||
|
|
||
| def test_select_cpu(): |
There was a problem hiding this comment.
I think you need to update test_dali_variable_batch_size.py as well.
There was a problem hiding this comment.
The regular test uses variable batch size. I don't think we need to double that.
There was a problem hiding this comment.
We need as the test_dali_variable_batch_size.py checks if all operators are tested. There is no other way to enforce this kind of test.
| data = fn.external_source(source=get_data, layout="HWC") | ||
| data2 = fn.external_source(source=get_data, layout="HWC") | ||
| data3 = fn.external_source(source=get_data, layout="HWC") | ||
| idx = fn.random.uniform(range=[0, 3], dtype=types.INT32) |
There was a problem hiding this comment.
| idx = fn.random.uniform(range=[0, 3], dtype=types.INT32) | |
| idx = fn.random.uniform(range=[0, 2], dtype=types.INT32) |
|
|
||
| from numpy import random | ||
| from numpy.core.fromnumeric import shape | ||
| from nvidia.dali import Pipeline |
There was a problem hiding this comment.
The decorator makes it easier to have a function that returns a pipeline (it does away with forwarding arguments, with pipe, set_outputs, etc - here there's no such function and forwarding arguments would likely make the code more repetitive.
| def test_error_inconsistent_ndim(): | ||
| def check(device): | ||
| pipe = Pipeline(1, 3, 0) | ||
| pipe.set_outputs(fn.select(np.float32([0,1]), np.float32([[2,3,4]]), input_idx=2)) |
There was a problem hiding this comment.
| pipe.set_outputs(fn.select(np.float32([0,1]), np.float32([[2,3,4]]), input_idx=2)) | |
| pipe.set_outputs(fn.select(np.float32([0,1]), np.float32([[2,3,4]]), input_idx=1)) |
To make sure that dimension problem is the only one and the order of error reporting doesn't matter here.
| def test_error_inconsistent_type(): | ||
| def check(device): | ||
| pipe = Pipeline(1, 3, 0) | ||
| pipe.set_outputs(fn.select(np.float32([0,1]), np.int32([2,3,4]), input_idx=2)) |
There was a problem hiding this comment.
| pipe.set_outputs(fn.select(np.float32([0,1]), np.int32([2,3,4]), input_idx=2)) | |
| pipe.set_outputs(fn.select(np.float32([0,1]), np.float32([[2,3,4]]), input_idx=1)) |
To make sure that dimension problem is the only one and the order of error reporting doesn't matter here.
There was a problem hiding this comment.
It's about type:
| pipe.set_outputs(fn.select(np.float32([0,1]), np.int32([2,3,4]), input_idx=2)) | |
| pipe.set_outputs(fn.select(np.float32([0,1]), np.int32([2,3,4]), input_idx=1)) |
There was a problem hiding this comment.
Right. Just a copy paste from the previous example.
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
| np.random.seed(1234) | ||
|
|
||
| def generate_data(ndim, ninp, type, max_batch_size): | ||
| batch_size = np.random.randint(1, max_batch_size+1) |
There was a problem hiding this comment.
Variable batch size - voila!
|
CI MESSAGE: [2558517]: BUILD STARTED |
|
CI MESSAGE: [2558517]: BUILD FAILED |
|
CI MESSAGE: [3840109]: BUILD STARTED |
Signed-off-by: Michał Zientkiewicz mzient@gmail.com
Why we need this PR?
Pick one, remove the rest
What happened in this PR?
Fill relevant points, put NA otherwise. Replace anything inside []
JIRA TASK: DALI-2202