fix:[9018]Fixed RunArray slice offsets#9036
Conversation
|
Looking into test(intermittent) case failure. |
b17d380 to
d9e54fb
Compare
|
Need help on. |
Can ignore, it's an unrelated issue: #9024 |
Jefffrey
left a comment
There was a problem hiding this comment.
Thanks for picking this up. Left some comments, also changed PR body to not close original issue as I still need to check what other kernels/code might need fixing.
d9e54fb to
6fb02cd
Compare
Thanks for reviewing and feedback. If you permit, once this PR is good to go, can take this. |
arrow-buffer/src/buffer/run.rs
Outdated
| /// Similar to [`values`] but accounts for logical slicing, returning only the values | ||
| /// that are part of the logical slice of this buffer. | ||
| /// | ||
| /// [`values`]: Self::values | ||
| pub fn values_slice(&self) -> &[E] { |
There was a problem hiding this comment.
I think for this docstring we also need a clear warning note that the first & last run values may be inaccurate as they don't take into account the logical slicing. I wonder if there is a better way to handle this without too much overhead 🤔
There was a problem hiding this comment.
Thanks. Added warning but not sure logical run ends API(logical_run_ends) is needed or not, but added it. Please feel free to suggest to remove it.
There was a problem hiding this comment.
I quite like the logical_run_ends() API actually; it makes it harder to unintentionally misuse values_slice(). I'm thinking we remove values_slice() and expose only logical_run_ends(), since users can always derive values_slice() for themselves manually anyway.
My only comment is perhaps rename logical_run_ends() to something else, since it might be confusing with how we describe physical vs logical in the docstring for the struct. Perhaps something like sliced_values()?
There was a problem hiding this comment.
Thanks. Removed values_slice and renamed logical_run_ends to sliced_values. Without values_slice, it is unclear whether adding get_physical_len would be useful for end users?
There was a problem hiding this comment.
Without
values_slice, it is unclear whether addingget_physical_lenwould be useful for end users?
Could you elaborate on this? I don't recall discussing adding a get_physical_len() API 🤔
There was a problem hiding this comment.
It's my suggestion. I was thinking, would a get_physical_len API be useful for end users?
There was a problem hiding this comment.
It's probably fine to omit for now, I don't see a use for it
6fb02cd to
57e6a3c
Compare
|
@Jefffrey Please help to close this. Thanks |
|
I'll try take another look soon. |
Jefffrey
left a comment
There was a problem hiding this comment.
Apologies it took me so long to re-review, I'll try to prioritise this so we can get it over the line 👍
arrow-buffer/src/buffer/run.rs
Outdated
| /// Similar to [`values`] but accounts for logical slicing, returning only the values | ||
| /// that are part of the logical slice of this buffer. | ||
| /// | ||
| /// [`values`]: Self::values | ||
| pub fn values_slice(&self) -> &[E] { |
There was a problem hiding this comment.
I quite like the logical_run_ends() API actually; it makes it harder to unintentionally misuse values_slice(). I'm thinking we remove values_slice() and expose only logical_run_ends(), since users can always derive values_slice() for themselves manually anyway.
My only comment is perhaps rename logical_run_ends() to something else, since it might be confusing with how we describe physical vs logical in the docstring for the struct. Perhaps something like sliced_values()?
57e6a3c to
a2b3af9
Compare
arrow-buffer/src/buffer/run.rs
Outdated
| assert_eq!(buffer.get_end_physical_index(), 0); | ||
| } | ||
| #[test] | ||
| fn test_sliced_buffer_access() { |
There was a problem hiding this comment.
This test probably can be removed now that we don't have values_slice()
arrow-buffer/src/buffer/run.rs
Outdated
| /// Similar to [`values`] but accounts for logical slicing, returning only the values | ||
| /// that are part of the logical slice of this buffer. | ||
| /// | ||
| /// [`values`]: Self::values | ||
| pub fn values_slice(&self) -> &[E] { |
There was a problem hiding this comment.
Without
values_slice, it is unclear whether addingget_physical_lenwould be useful for end users?
Could you elaborate on this? I don't recall discussing adding a get_physical_len() API 🤔
a2b3af9 to
bfbe554
Compare
|
@Jefffrey Thank you for approving this request. I appreciate your feedback, which helps improve it with each iteration. |
|
Thanks @manishkr |
# Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. --> - Part of apache#9018 . # Rationale for this change <!-- Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes. --> To consider offset in slicing of RunArray. # What changes are included in this PR? <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> 1. Considered offset in slicing of RunArray. 2. Enhanced RunArray slice API. # Are these changes tested? <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 3. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> yes # Are there any user-facing changes? <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. If there are any breaking changes to public APIs, please call them out. --> Yes, extended API to access RunArray slices directly than getting it from index.
Which issue does this PR close?
RunArrays #9018 .Rationale for this change
To consider offset in slicing of RunArray.
What changes are included in this PR?
Are these changes tested?
yes
Are there any user-facing changes?
Yes, extended API to access RunArray slices directly than getting it from index.