bufio scanner token too long #5043

Closed
SoftTools59654 opened this issue Feb 20, 2024 · 3 comments · Fixed by #5045 or #5048
Labels
bug Something isn't working community

Comments

@SoftTools59654

I encountered the following error while importing data in the line format:
bufio scanner token too long
The problem appears to be related to the 64 KB limit.

@SoftTools59654 SoftTools59654 added the bug Something isn't working label Feb 20, 2024
@philrz philrz transferred this issue from brimdata/zui Feb 21, 2024
@philrz

philrz commented Feb 21, 2024

@SoftTools59654: I've moved this to the Zed repo since it's something we'd have to address first at that level.

Here's a repro with Zed commit 4dc6236.

$ zq -version
Version: v1.14.0-2-g4dc62369

$ perl -E 'say "=" x 65536' | zq -i line 'count()' -
stdio:stdin: bufio.Scanner: token too long

It works with one fewer character, however.

$ perl -E 'say "=" x 65535' | zq -i line 'count()' -
1(uint64)

While it seems there's likely to always be some upper limit on the size of values, upon seeing this @mccanne remarked that we should indeed be able to increase the current limit to something measured in megabytes.
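For anyone curious where the 64 KB figure comes from, here is a minimal Go sketch that reproduces the same boundary with a plain `bufio.Scanner` left at its defaults. It is not the Zed reader code; `scanOneLine` and `tooLong` are hypothetical helpers written for illustration, and the only assumption is that the line reader relies on `bufio.Scanner` with its default maximum token size (`bufio.MaxScanTokenSize`).

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// scanOneLine (hypothetical helper) feeds one line of n '=' bytes plus a
// trailing newline to a bufio.Scanner with default settings and returns
// whatever error the scan ends with (nil on success).
func scanOneLine(n int) error {
	input := strings.Repeat("=", n) + "\n"
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		// Discard the token; only the final error matters here.
	}
	return sc.Err()
}

// tooLong (hypothetical helper) reports whether err is bufio.ErrTooLong.
func tooLong(err error) bool {
	return errors.Is(err, bufio.ErrTooLong)
}

func main() {
	fmt.Println(bufio.MaxScanTokenSize) // 65536
	fmt.Println(scanOneLine(65535))     // <nil>
	fmt.Println(scanOneLine(65536))     // bufio.Scanner: token too long
}
```

The line and its newline have to fit in the scanner's buffer together, which would explain why 65535 characters succeed and 65536 do not.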

@mattnibs mattnibs self-assigned this Feb 21, 2024
mattnibs added a commit that referenced this issue Feb 21, 2024
Increase the max buffer size for the lineio reader from the default 64KB
to 25MB.

Closes #5043
mattnibs added a commit that referenced this issue Feb 22, 2024
Increase the max buffer size for the lineio reader from the default 64KB
to 25MB.

Closes #5043
@philrz

philrz commented Feb 23, 2024

Verified in Zed commit f9325b7.

Per the attached PRs, the buffer that was formerly 64 KB has now been increased to 25 MB. Therefore the crossover point is now:

$ zq -version
Version: v1.14.0-4-gf9325b72

$ perl -E 'say "=" x 26214399' | zq -i line 'count()' -
1(uint64)

$ perl -E 'say "=" x 26214400' | zq -i line 'count()' -
stdio:stdin: bufio.Scanner: token too long

Thanks @mattnibs!
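As a rough illustration of why the crossover lands at 26214400 (25 × 1024 × 1024), here is a small Go sketch that raises a scanner's limit through `Scanner.Buffer`. This is a sketch under assumptions, not the code from the linked PRs: `maxLineSize` and `scanLine` are hypothetical names, and it assumes the fix amounts to passing a larger maximum to `bufio.Scanner`.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// maxLineSize is an assumed constant mirroring the 25 MB figure quoted above.
const maxLineSize = 25 * 1024 * 1024 // 26214400 bytes

// scanLine (hypothetical helper) scans one line of n '=' bytes plus a newline
// using a scanner whose buffer may grow up to max bytes. It returns the number
// of lines read and the final scan error (nil on success).
func scanLine(n, max int) (int, error) {
	input := strings.Repeat("=", n) + "\n"
	sc := bufio.NewScanner(strings.NewReader(input))
	// A nil initial buffer lets the scanner allocate and grow its own,
	// capped at max. Buffer must be called before the first Scan.
	sc.Buffer(nil, max)
	lines := 0
	for sc.Scan() {
		lines++
	}
	return lines, sc.Err()
}

func main() {
	fmt.Println(scanLine(maxLineSize-1, maxLineSize)) // 1 <nil>
	fmt.Println(scanLine(maxLineSize, maxLineSize))   // 0 bufio.Scanner: token too long
}
```

Some limit still applies whatever value is chosen; the scanner only allocates up to the maximum when a line actually needs it, so a generous cap costs little for typical input.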

@philrz

philrz commented Feb 23, 2024

@SoftTools59654: I saw you clicked a 👍 on the last comment so it looks like you're already aware this buffer has been increased at the Zed layer. I'm not sure if you're a Zui Insiders user, but FYI the latest Zui Insiders release 1.6.1-13 does include the newer Zed so you can start loading bigger values with that before the next set of GA releases comes out. I also wanted to make you aware that testing the effects of the Zed change in Zui revealed a separate limit at the Zui layer where values bigger than about 2 MB seem to be currently rendered blank, so issue brimdata/zui#3016 will be used to track addressing that.
