Skip to content

fix: Avoid integer overflow in substr()#20199

Open
neilconway wants to merge 1 commit intoapache:mainfrom
neilconway:neilc/substr-overflow
Open

fix: Avoid integer overflow in substr()#20199
neilconway wants to merge 1 commit intoapache:mainfrom
neilconway:neilc/substr-overflow

Conversation

@neilconway
Copy link
Contributor

Rationale for this change

Evaluating SELECT SUBSTR('', 2, 9223372036854775807); yields (in a debug build):

thread 'main' (41414592) panicked at datafusion/functions/src/string/split_part.rs:236:47:
attempt to negate with overflow
stack backtrace:
   0: __rustc::rust_begin_unwind
             at /rustc/ded5c06cf21d2b93bffd5d884aa6e96934ee4234/library/std/src/panicking.rs:698:5
   1: core::panicking::panic_fmt
             at /rustc/ded5c06cf21d2b93bffd5d884aa6e96934ee4234/library/core/src/panicking.rs:80:14
   2: core::panicking::panic_const::panic_const_neg_overflow
             at /rustc/ded5c06cf21d2b93bffd5d884aa6e96934ee4234/library/core/src/panicking.rs:180:17
   3: datafusion_functions::string::split_part::split_part_impl::{{closure}}
             at ./datafusion/functions/src/string/split_part.rs:236:47
   4: core::iter::traits::iterator::Iterator::try_for_each::call::{{closure}}
             at /Users/neilconway/.rustup/toolchains/1.92.0-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/iter/traits/iterator.rs:2485:26
[...]

Found via fuzzing.

Are these changes tested?

Yes, added a unit test.

@github-actions github-actions bot added the functions Changes to functions implementation label Feb 7, 2026
Copy link
Contributor

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better to saturate at i64::MAX or just error out?

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @neilconway and @Jefffrey

// HACK: can be simplified if function has specialized
// implementation for `ScalarValue` (implement without `make_scalar_function()`)
let avg_prefix_len = start
let total_prefix_len = start
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

won't using floating point arithmetic be slower? Perhaps we could simply return an error if we got an overflow? Or use saturating_add ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

functions Changes to functions implementation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants