Feature Request: Very large CSV file support #4

Open
jschuur opened this issue Jun 14, 2024 · 5 comments

Comments

@jschuur

jschuur commented Jun 14, 2024

I loaded an 18 gig CSV file and the app used... 18 gig of RAM :)

The DuckDB docs describe 'out-of-core processing' for larger-than-memory workloads, which might be a way to address this:

https://duckdb.org/docs/guides/performance/how_to_tune_workloads.html#larger-than-memory-workloads-out-of-core-processing
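For illustration only, here is a rough sketch of the kind of setup that guide points at, assuming DuckDB is driven through its Python package (the thread does not say how the app talks to DuckDB). `memory_limit` and `temp_directory` are DuckDB settings; the helper names `out_of_core_config` and `count_rows` and the default values are hypothetical.

```python
from pathlib import Path


def out_of_core_config(spill_dir: Path, memory_limit: str = "4GB") -> dict:
    """Build DuckDB settings that cap memory use and name a spill directory."""
    return {
        "memory_limit": memory_limit,      # upper bound on DuckDB's memory use
        "temp_directory": str(spill_dir),  # where larger-than-memory work can spill
    }


def count_rows(csv_path: str, config: dict) -> int:
    """Count the rows of a CSV with an in-memory DuckDB database."""
    import duckdb  # third-party package (pip install duckdb), imported lazily

    con = duckdb.connect(":memory:", config=config)
    try:
        row = con.execute(
            "SELECT count(*) FROM read_csv_auto(?)", [csv_path]
        ).fetchone()
        return row[0]
    finally:
        con.close()
```

Something like `count_rows("big.csv", out_of_core_config(Path("/tmp/duckdb_spill")))` would then run the query with a memory cap and a place to spill, though how much RAM a given query really needs still depends on the operators involved.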

@huyphams

Did you double-click to open it, or import it and then open it, @jschuur?

@jschuur
Author

jschuur commented Jun 24, 2024

> Did you double-click to open it, or import it and then open it, @jschuur?

Originally I double-clicked to open it.

Pressing Cmd-O, dragging the file into the 'Untitled' workspace, and then opening it showed 4 gig of RAM used instead of almost 18.

When I exited and restarted TableTool and then selected the file from that workspace again, Activity Monitor only showed it using 73 megs of RAM.

@huyphams

When you double-click a file, the app uses RAM only. When you import it into the app, it uses the disk, because the data now lives in the workspace.

> 4 gig of RAM
> 73 megs of RAM.

That difference comes from how macOS caches the content, so let macOS manage it.
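As a rough illustration of why a disk-backed file can show up as very little RAM, here is a small sketch that scans a file through a read-only memory map, so the operating system decides which pages stay cached. This is not a description of how the app is implemented; `count_lines_mmap` and the demo file are hypothetical.

```python
import mmap
import os
import tempfile


def count_lines_mmap(path: str) -> int:
    """Count newline bytes by scanning a read-only memory map of the file."""
    if os.path.getsize(path) == 0:
        return 0  # an empty file cannot be memory-mapped
    count = 0
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        # Pages are loaded on demand and can be evicted by the OS at any time,
        # so the process does not need to hold the whole file in its own heap.
        pos = mm.find(b"\n")
        while pos != -1:
            count += 1
            pos = mm.find(b"\n", pos + 1)
    return count


# Tiny demo file standing in for a very large CSV.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("id,value\n1,a\n2,b\n")
    demo_path = tmp.name

line_count = count_lines_mmap(demo_path)
os.remove(demo_path)
print(line_count)  # 3
```

With file-backed access like this, the memory figure a process monitor reports can swing widely between runs, since cached pages belong to the OS rather than to the app.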

@jschuur
Author

jschuur commented Jun 24, 2024

> When you double-click a file, the app uses RAM only. When you import it into the app, it uses the disk, because the data now lives in the workspace.
>
> > 4 gig of RAM
> > 73 megs of RAM.
>
> That difference comes from how macOS caches the content, so let macOS manage it.

I can see that there might be a technical reason for this, but from the user's perspective, this kind of behaviour is not very intuitive and could use some more guidance.

Had I not known about this nuance, my personal use case would probably have been to double-click on new CSV files, as opposed to going back to the same one via a workspace or adding them there first.

@huyphams

Yep, maybe setting a temporary path when using memory mode is a good idea. I agree with you, @jschuur.
