fix: Raise on data mismatch in str.json_decode
#19347
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #19347      +/-   ##
==========================================
- Coverage   80.21%   80.18%   -0.03%
==========================================
  Files        1523     1523
  Lines      209979   210096     +117
  Branches     2432     2432
==========================================
+ Hits       168426   168472      +46
- Misses      40997    41068      +71
  Partials      556      556
Force-pushed the str.json_decode branch from f63e4a3 to fdb806f
@@ -287,6 +287,8 @@ where
        }
    }

    let allow_extra_fields_in_struct = self.schema.is_some();
Currently, a user can select a subset of columns within a struct field by passing a partial schema. This works only because JSON deserialization silently ignores extra fields.
This PR changes that behavior to raise an error on extra fields. To keep column selection working, we do not raise on extra keys when the schema was provided by the user.
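The rule described above can be sketched in isolation. This is a minimal illustration, not the code from this PR: `check_struct_keys` and `validate` are hypothetical helpers, field names are plain string slices, and only the `allow_extra_fields_in_struct = schema.is_some()` idea is taken from the diff.

```rust
/// Hypothetical helper (not the Polars implementation): reject keys that are
/// not part of the expected struct fields unless extra fields are allowed.
fn check_struct_keys(
    expected: &[&str],
    found: &[&str],
    allow_extra_fields: bool,
) -> Result<(), String> {
    if allow_extra_fields {
        // A user-provided (possibly partial) schema acts as a column selection.
        return Ok(());
    }
    match found.iter().find(|k| !expected.contains(*k)) {
        Some(extra) => Err(format!("extra field in struct data: {extra}")),
        None => Ok(()),
    }
}

/// Hypothetical wrapper mirroring the flag in the diff: extra keys are
/// tolerated only when the caller supplied a schema.
fn validate(
    user_schema: Option<&[&str]>,
    inferred: &[&str],
    found: &[&str],
) -> Result<(), String> {
    let allow_extra_fields_in_struct = user_schema.is_some();
    let expected = user_schema.unwrap_or(inferred);
    check_struct_keys(expected, found, allow_extra_fields_in_struct)
}
```

Under these assumptions, a row with keys `a` and `b` passes against a user schema containing only `a`, but fails when the expected fields `a` come from inference.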
            _ => None,
            BorrowedValue::Static(StaticNode::Null) => None,
            _ => {
                err_idx = if err_idx == rows.len() { i } else { err_idx };
Ensure we only push NULL if the value itself was "null", and add tracking of an "error index" to all of the deserializers, which records the position of the first parsing error.
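A self-contained sketch of the error-index pattern follows. It is an illustration under assumptions, not the deserializer from this PR: the `Value` enum stands in for the parsed JSON value type, and `deserialize_bools` and its error message are hypothetical. Only the sentinel trick (`err_idx` starts at `rows.len()` and is set once) comes from the diff.

```rust
/// Stand-in for a parsed JSON value (assumption; the PR uses a borrowed
/// value type from its JSON parser).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Int(i64),
    Str(String),
}

/// Hypothetical boolean deserializer: only a literal null becomes `None`;
/// any other mismatched value records the index of the first failure.
fn deserialize_bools(rows: &[Value]) -> Result<Vec<Option<bool>>, String> {
    // `rows.len()` is the sentinel meaning "no error seen yet".
    let mut err_idx = rows.len();
    let out: Vec<Option<bool>> = rows
        .iter()
        .enumerate()
        .map(|(i, row)| match row {
            Value::Bool(v) => Some(*v),
            Value::Null => None,
            _ => {
                // Keep only the first error position.
                err_idx = if err_idx == rows.len() { i } else { err_idx };
                None
            }
        })
        .collect();

    if err_idx != rows.len() {
        return Err(format!(
            "error deserializing value {:?} at index {} as boolean",
            rows[err_idx], err_idx
        ));
    }
    Ok(out)
}
```

Recording the index inside the loop and raising afterwards keeps the per-row path free of early returns. In this sketch the placeholder `None` pushed for a bad row is never observed, because the function returns an error instead.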
Force-pushed from fdb806f to 8f4d95a
Fixes #13061
Todo: I think our NDJSON and JSON readers also suffer from this issue