[stdlib] Add utf8 safeguards, fix chr method, add unicode and utf16 parsing for String #3239

Draft: wants to merge 30 commits into base: nightly
7e4f0df
add better safeguards and fix chr method
martinvuyk Jul 13, 2024
7134f8f
update changelog
martinvuyk Jul 13, 2024
ab84608
rename to from_unicode
martinvuyk Jul 13, 2024
53d7038
move from_unicode to be static method
martinvuyk Jul 13, 2024
6d480b7
fix from_unicode
martinvuyk Jul 13, 2024
5236388
fix docstring
martinvuyk Jul 13, 2024
439aa21
fix indentation
martinvuyk Jul 13, 2024
c6f2dfb
fix list constructor
martinvuyk Jul 13, 2024
20bf017
fix use less lines
martinvuyk Jul 13, 2024
9a62b42
add utf16 decode
martinvuyk Jul 13, 2024
0bbc386
fix changelog
martinvuyk Jul 13, 2024
74e698b
fix detail
martinvuyk Jul 13, 2024
bf4093d
fix detail
martinvuyk Jul 13, 2024
5a2af26
fix detail
martinvuyk Jul 13, 2024
30c027f
fix detail
martinvuyk Jul 13, 2024
ddcbf0d
fix detail
martinvuyk Jul 13, 2024
9f5ee3b
simplify utf16 internals
martinvuyk Jul 13, 2024
fcc789c
fix detail
martinvuyk Jul 13, 2024
e08bc57
fix detail
martinvuyk Jul 13, 2024
9ffd5e6
fix detail
martinvuyk Jul 14, 2024
afb537a
fix detail
martinvuyk Jul 14, 2024
805041e
fix detail
martinvuyk Jul 14, 2024
0fcdf50
fix detail
martinvuyk Jul 14, 2024
be5a203
fix detail
martinvuyk Jul 14, 2024
fccdbcd
fix detail
martinvuyk Jul 14, 2024
f46ce80
add suggestion from @mzaks
martinvuyk Jul 14, 2024
6b47694
fix use unsafe_get
martinvuyk Jul 16, 2024
ca38ca3
Merge remote-tracking branch 'upstream/nightly' into add-utf8-safeguards
martinvuyk Jul 16, 2024
af3be58
use variant for unicode parsing
martinvuyk Jul 16, 2024
a4eedb0
Merge remote-tracking branch 'upstream/nightly' into add-utf8-safeguards
martinvuyk Jul 16, 2024
fix detail
Signed-off-by: martinvuyk <[email protected]>
martinvuyk committed Jul 13, 2024
commit ddcbf0d3dbe98feddbf14399b3d6c52c36fb0bf3
3 changes: 1 addition & 2 deletions stdlib/src/builtin/string.mojo
@@ -2377,11 +2377,10 @@ struct String(
                 curr_ptr[0] = 0xFF
             else:
                 num_bytes = 4
-                alias bit_over_32 = 0x1_00_00
                 alias low_10b = 0b0011_1111_1111 # get lower 10 bits
                 var c2 = int(values.unsafe_get(values_idx + 1))
                 var value = ((int(c) & low_10b) << 10) | (c2 & low_10b)
-                var unicode = bit_over_32 + value
+                var unicode = 2**16 + value
                 curr_ptr[0] = UInt8(0xF0 | (unicode >> 18))
                 curr_ptr[1] = UInt8(c_byte | ((unicode >> 12) & low_6b))
                 curr_ptr[2] = UInt8(c_byte | ((unicode >> 6) & low_6b))
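For reviewers who want to check the arithmetic in this hunk, here is a minimal sketch of the same surrogate-pair step, written in Python rather than Mojo so it can be run standalone. The function name `surrogate_pair_to_utf8` is hypothetical. The values of `c_byte` (0x80) and `low_6b` (0x3F) and the fourth output byte all fall outside the hunk shown, so they are assumptions based on the standard UTF-8 layout, not taken from the PR.

```python
def surrogate_pair_to_utf8(high: int, low: int) -> bytes:
    """Turn one UTF-16 surrogate pair into its 4-byte UTF-8 encoding.

    Hypothetical helper mirroring the arithmetic in the hunk; it is not
    part of the PR or of the Mojo stdlib.
    """
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid UTF-16 surrogate pair")

    low_10b = 0b0011_1111_1111  # lower 10 bits, as in the hunk
    low_6b = 0b0011_1111        # assumed: payload mask of a continuation byte
    c_byte = 0b1000_0000        # assumed: continuation byte prefix

    # Each surrogate carries 10 payload bits; the pair encodes (code point - 2**16).
    value = ((high & low_10b) << 10) | (low & low_10b)
    unicode = 2**16 + value

    return bytes([
        0xF0 | (unicode >> 18),               # leading byte: top 3 bits
        c_byte | ((unicode >> 12) & low_6b),
        c_byte | ((unicode >> 6) & low_6b),
        c_byte | (unicode & low_6b),          # assumed: last byte is outside the hunk
    ])
```

A quick comparison against Python's own encoder: `surrogate_pair_to_utf8(0xD83D, 0xDE00) == "\U0001F600".encode("utf-8")`. The constant `2**16` is the same value as the removed `0x1_00_00` alias, so the commit changes only the spelling, not the result.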