-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Optimize String length computation. #1685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
JAVA-5842
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, just a few minor changes requests.
/** | ||
* Detects the position of the first NULL (0x00) byte in a 64-bit word using SWAR technique. | ||
* <a href="https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord"> | ||
*/ | ||
private int computeCStringLength(final int prevPos) { | ||
ensureOpen(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove this check, as it's a private method and duplicates a check from the public method that calls it?
|
||
result: | ||
00000000 00000000 10000000 00000000 00000000 00000000 00000000 00000000 | ||
^^^^^^^^ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^^^^^^^^ | |
^^^^^^^^ |
result: | ||
00000000 00000000 10000000 00000000 00000000 00000000 00000000 00000000 | ||
^^^^^^^^ | ||
The high bit is set only at the 0x00 byte position. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The high bit is set only at the 0x00 byte position. | |
The high bit is set only at the 0x00 byte position. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if you meant to center this...
@@ -182,11 +182,58 @@ public void skipCString() { | |||
buffer.position(pos + length); | |||
} | |||
|
|||
/** | |||
* Detects the position of the first NULL (0x00) byte in a 64-bit word using SWAR technique. | |||
* <a href="https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have reason to believe that this link will still be around in, say, 5 years? If not, maybe just go with Wikipedia.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, seems Wiki URL like https://en.wikipedia.org/wiki/SWAR is better
pos += 8; | ||
} | ||
|
||
// Process remaining bytes one-by-one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems one-by-one
is more commonly used as one by one
long word = buffer.getLong(pos); | ||
/* | ||
Subtract 0x0101010101010101L to cause a borrow on 0x00 bytes. | ||
if original byte is 00000000 -> 00000000 - 00000001 = 11111111 (borrow causes high bit set to 1). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional. A little bit less clear than
if original byte is 00000000, then 00000000 - 00000001 = 11111111 (borrow causes high bit set to 1).
result: | ||
00000000 00000000 10000000 00000000 00000000 00000000 00000000 00000000 | ||
^^^^^^^^ | ||
The high bit is set only at the 0x00 byte position. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional. the above comment seems duplicated and could be deleted
00000000 00000000 11111111 00000000 00000001 00000001 00000000 00000111 | ||
|
||
ANDing mask with 0x8080808080808080 isolates the high bit (0x80) in positions where | ||
the original byte was 0x00, setting the high bit to 1 only at the 0x00 byte position. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional. maybe the following verbiage is little bit clearer:
, by setting the high bit to 1 only at the 0x00 byte position.
JAVA-5842