Avoid out-of-bounds access in overlaps
#7598
If 'ux' contains 0 rows, pretend that all comparisons against its non-existent elements fail.
This used to happen when from[i] was 0. (No match on non-range columns?)
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff            @@
##           master   #7598     +/-   ##
=========================================
+ Coverage   99.00%   99.01%   +0.01%
=========================================
  Files          87       87
  Lines       16893    16896       +3
=========================================
+ Hits        16725    16730       +5
+ Misses        168      166       -2
Generated via commit eab8609. Download link for the artifact containing the test results: atime-results.zip
Is this ready to merge and be included in 1.18.2, or should it move to the next patch release?
I haven't found a chance to review -- @ben-schwen or @jangorecki?
FWIW,
Technically this one was harmless (and thus not caught by sanitizers) because the preceding VECSEXP header always contained a 0, preventing the branch where VECTOR_ELT() would be called with a negative index.
Most of these had separate checks for either
    for (int i=0; i<rows; ++i) {
      const int len=totlen;
      int wlen=0, j=0, m=0;
      const int k = (from[i]>0) ? from[i] : 1;
      if (k == to[i]) {
        wlen = count[k-1];
      } else if (k < to[i]) {
        tmp1 = VECTOR_ELT(lookup, k-1);
        tmp2 = VECTOR_ELT(type_lookup, to[i]-1);
        while (j<count[k-1] && m<type_count[to[i]-1]) {
          if ( INTEGER(tmp1)[j] == INTEGER(tmp2)[m] ) {
            ++wlen; ++j; ++m;
          } else if ( INTEGER(tmp1)[j] > INTEGER(tmp2)[m] ) {
            ++m;
          } else ++j;
        }
      }
      totlen += wlen;
      if (len == totlen)
        ++totlen;
    }
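The counting rule in the loop above can be isolated in plain C. The sketch below is not data.table code: `total_len` and `matches` are hypothetical names, and it assumes the per-row match count has already been computed. It only illustrates how `totlen += wlen; if (len == totlen) ++totlen;` makes every row contribute at least one slot, presumably so that a row without overlaps can still be reported.

```c
#include <stddef.h>

/* Hypothetical helper, not part of data.table: sum the per-row match
 * counts, reserving at least one output slot for every row. */
int total_len(const int *matches, size_t rows)
{
    int totlen = 0;
    for (size_t i = 0; i < rows; ++i) {
        const int len = totlen;   /* length before this row */
        totlen += matches[i];     /* one slot per match */
        if (len == totlen)        /* no matches for this row... */
            ++totlen;             /* ...still reserve one slot */
    }
    return totlen;
}
```

With match counts of 2, 0, 3 and 0, the rows contribute 2, 1, 3 and 1 slots respectively.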
    for (int i=0; i<rows; ++i) {
      const int len = totlen;
      const int k = from[i];
      if (k > 0) {
        if (k == to[i]) {
          totlen += count[k-1];
        } else if (k < to[i]) {
          int *s = INTEGER(VECTOR_ELT(lookup, k-1));
          int *e = INTEGER(VECTOR_ELT(type_lookup, to[i]-1));
          int scount = count[k-1], ecount = type_count[to[i]-1];
          for (int j=0, m=0; j < scount && m < ecount; ) {
            if (s[j] == e[m]) { ++totlen; ++j; ++m; }
            else if (s[j] > e[m]) ++m;
            else ++j;
          }
        }
        if (len == totlen) ++totlen;
      }
    }
We are always so picky about moving INTEGER out of loops, but apparently here we did not care yet. Definitely not for this PR, but it should be followed up!
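As a standalone illustration of that follow-up, here is a minimal sketch in plain C with the data pointers fetched once before the loop. `count_common`, `s` and `e` are hypothetical stand-ins for the results of `INTEGER(VECTOR_ELT(...))`, and the inputs are assumed to be sorted in increasing order, which is what a merge-style loop of this shape needs.

```c
#include <assert.h>

/* Hypothetical stand-in for the inner loop: count values present in both
 * sorted arrays. The caller passes raw int pointers, so nothing like
 * INTEGER() is re-evaluated on every iteration. */
int count_common(const int *s, int scount, const int *e, int ecount)
{
    assert(scount >= 0 && ecount >= 0);
    int n = 0;
    for (int j = 0, m = 0; j < scount && m < ecount; ) {
        if (s[j] == e[m]) { ++n; ++j; ++m; }  /* shared value */
        else if (s[j] > e[m]) ++m;            /* advance the smaller side */
        else ++j;
    }
    return n;
}
```

Every branch advances at least one of the two indices, so the loop always terminates.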
The only line I'm worried about is the creation of the return array, when we skip now (see suggested test).
Besides that LGTM
@aitap feel free to merge once that is cleared
If possible, I would favor this going into 1.18.2. It's a long-standing bug of >10 years which only surfaced now when R >
Co-authored-by: Benjamin Schwendinger <[email protected]>
The underflow is covered by already existing tests.
* Add tests
* overlaps: avoid accessing length-0 vectors in ux
  If 'ux' contains 0 rows, pretend that all comparisons against its non-existent elements fail.
* overlaps: avoid 'lookup' list overflow
  This used to happen when from[i] was 0. (No match on non-range columns?)
* NEWS entry
* overlaps: uncomment one more underflow test
  Technically this one was harmless (and thus not caught by sanitizers) because the preceding VECSEXP header always contained a 0, preventing the branch where VECTOR_ELT() would be called with a negative index.
* test formatting
* Update src/ijoin.c
  Co-authored-by: Benjamin Schwendinger <[email protected]>
* Update src/ijoin.c
  Co-authored-by: Benjamin Schwendinger <[email protected]>
* Update src/ijoin.c
  Co-authored-by: Benjamin Schwendinger <[email protected]>
* Update inst/tests/tests.Rraw
* overlaps: uncomment the remaining underflow test
  The underflow is covered by already existing tests.
---------
Co-authored-by: Benjamin Schwendinger <[email protected]>
@TysonStanley I cherry-picked this into 1.18.2. I guess we are good to go to submit.

When `y` has no rows, the data pointers of the columns of `unique(y[,...])` are poisoned. Instead of trying to dereference them, pretend that all comparisons fail and fill the return value with `NA`s.
When there are no matches on the non-range columns, the index in `from[i]` may become 0. So uncomment the `(from[i]>0) ? from[i] : 1` checks instead of trusting `from[i]` and accessing `VECTOR_ELT(lookup, -1)`. I am especially interested in someone double-checking this part because there are other places where `from[i]` is not checked for being `> 0`. Are they reachable with `from[i] == 0`?
Fixes: #7597
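The two guards described above can be sketched in plain C without the R API. Everything here is an assumption made for illustration: `safe_count` and `fill_na` are hypothetical helpers, `count` stands in for the per-group counts, and `NA_INT` mimics R's integer `NA`, which is represented as `INT_MIN`.

```c
#include <limits.h>
#include <stddef.h>

#define NA_INT INT_MIN  /* stand-in for R's NA_INTEGER */

/* from_i is a 1-based index where 0 means "no match" (assumption based on
 * the description above). Return 0 matches instead of reading count[-1]. */
int safe_count(const int *count, int ngroups, int from_i)
{
    if (from_i <= 0 || from_i > ngroups)
        return 0;
    return count[from_i - 1];
}

/* When the lookup side has no rows there is nothing to compare against,
 * so every result is NA and no data pointer is dereferenced. */
void fill_na(int *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = NA_INT;
}
```

Note that the PR itself clamps the index with `(from[i]>0) ? from[i] : 1`; returning 0 for an out-of-range index is a different, simplified choice made only for this sketch.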