updated bmerge() to support joins on complex columns with zero imaginary part, treating them as double#7085
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #7085 +/- ##
==========================================
- Coverage 98.69% 98.69% -0.01%
==========================================
Files 79 79
Lines 14677 14692 +15
==========================================
+ Hits 14486 14500 +14
- Misses 191 192 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Generated via commit 906d3af Download link for the artifact containing the test results: ↓ atime-results.zip
|
|
I do not use complex numbers so I don't really understand what is a typical use case, can you please explain? |
This comment was marked as off-topic.
This comment was marked as off-topic.
|
that is not a typical use case |
|
you should add an example in an Rd file with a typical use case |
|
I've included an example and a brief description in the .Rd file. Thanks! |
| try(products_all[sales, on = .(id = product_id), nomatch = 0]) | ||
|
|
||
| \dontshow{ | ||
| rm(products_clean, sales, products_all) |
There was a problem hiding this comment.
we don't have any rm() commands in dontshow in other Rd files, please remove.
| products_all = data.table(id = c(101+0i, 102+1i, 103+0i), name = c("widget", "gadget", "thingamajig")) | ||
|
|
||
| # The join fails because the entire 'id' column is checked first. | ||
| try(products_all[sales, on = .(id = product_id), nomatch = 0]) |
There was a problem hiding this comment.
please remove.
examples should show typical use cases, not errors.
| \item If a \code{complex} column contains values with only a zero imaginary part (e.g., \code{10+0i}), it is treated as a \code{double} for the join, allowing it to match with \code{integer} and \code{double} columns successfully. | ||
| \item If any value in a \code{complex} join column has a non-zero imaginary part (e.g., \code{10+2i}), the join will stop with an error, as there is no defined way to sort or match such a number against a real number. |
There was a problem hiding this comment.
I wonder if this feature is desirable. For the new example using current master I get
> products_clean[sales, on = .(id = product_id), nomatch = 0]
Erreur dans bmerge(i, x, leftcols, rightcols, roll, rollends, nomatch, mult, :
typeof x.id (complex) != typeof i.product_id (integer)which seems totally reasonable and actionable. If I want a join, I would need to convert the types to be the same.
So I would suggest closing this pr and opening a new one that clarifies the documentation. What do you think @jangorecki @MichaelChirico @ben-schwen @aitap ?
(I don't really use complex numbers so I don't understand if this is a typical or desirable use case, but I guess any user is capable of converting complex to another joinable type)
There was a problem hiding this comment.
Personally I would not add examples of joining of complex columns into the manual. It is just too uncommon. Unit tests yes, NEWS entry yes, but manual examples not really.
There was a problem hiding this comment.
Hi, thank you for the detailed feedback.
I understand there are two key takeaways from the discussion:
- Documentation: I will remove the examples from the .Rd file along with the rm() command, and I will open a new PR specifically for the documentation cleanup.
- Feature Design: There is an ongoing higher-level discussion about whether data.table should automatically coerce complex to double for joins, or if it's better to retain the existing behavior and require the user to handle coercion explicitly.
Please let me know the final decision on the feature design and how you'd like me to proceed. Thanks!
There was a problem hiding this comment.
please ask @MichaelChirico for review because he filed the original issue.
|
Hi @MichaelChirico, could you please share the next steps for this PR so I can proceed towards closing it? |

closes #6627
in this pr I
bmerge()to detect and handlecomplextypes.Im(x) == 0, column is coerced todouble.hi @tdhock @joshhwuu @MichaelChirico can you please review this when you have time.
thanks.