Reduce estimation time cost #2577

Merged Nov 12, 2021 (2 commits)

Member

@wjsi wjsi commented Nov 8, 2021

What do these changes do?

Reduces the time cost of size estimation when running sys.getsizeof on a DataFrame object by measuring a sample of the data instead of every element.
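
The sampling idea can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code merged in this PR: `estimate_total_size`, its `sample_size` and `seed` parameters, and the test data are all hypothetical, and a plain list of strings stands in for an object-dtype DataFrame column.

```python
import random
import sys


def estimate_total_size(values, sample_size=100, seed=0):
    """Estimate the summed sys.getsizeof of `values` from a random sample.

    Hypothetical helper for illustration only; not the function added in this PR.
    """
    n = len(values)
    if n == 0:
        return 0
    if n <= sample_size:
        # Small input: measuring every element is cheap, so stay exact.
        return sum(sys.getsizeof(v) for v in values)
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    mean_size = sum(sys.getsizeof(v) for v in sample) / sample_size
    # Scale the per-element sample mean up to the full length.
    return int(mean_size * n)


# Assumption: a list of short strings stands in for an object-dtype column.
column = ["x" * (i % 50) for i in range(100_000)]
exact = sum(sys.getsizeof(v) for v in column)
approx = estimate_total_size(column, sample_size=1000)
print(exact, approx)
```

The exact pass calls sys.getsizeof once per element, while the estimate calls it only `sample_size` times, so its cost no longer grows with the number of rows. The trade-off is accuracy: the estimate can drift when element sizes are heavily skewed.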

Related issue number

Fixes #2565

Check code requirements

  • tests added / passed (if needed)
  • Ensure all linting tests pass, see here for how to run them

@wjsi added type: enhancement request mod: dataframe to be backported labels Nov 8, 2021
@wjsi added this to the v0.9.0a1 milestone Nov 8, 2021
@wjsi force-pushed the enh/size_est_time branch from ccc4af7 to df34c1d November 8, 2021 12:56
@wjsi marked this pull request as draft November 9, 2021 00:47
@wjsi force-pushed the enh/size_est_time branch 3 times, most recently from 7d80cbc to 4822b99 November 10, 2021 08:30
@wjsi force-pushed the enh/size_est_time branch from 4822b99 to 81e174d November 10, 2021 09:48
@wjsi marked this pull request as ready for review November 10, 2021 10:38
@wjsi changed the title from "[WIP] Reduce estimation time cost" to "Reduce estimation time cost" Nov 11, 2021
Collaborator

@qinxuye qinxuye left a comment

LGTM

Contributor

@hekaisheng hekaisheng left a comment

LGTM

@hekaisheng merged commit cc9d34f into mars-project:master Nov 12, 2021
@wjsi deleted the enh/size_est_time branch November 22, 2021 07:32
wjsi added a commit to wjsi/mars that referenced this pull request Dec 7, 2021
@wjsi added backported already and removed to be backported labels Dec 8, 2021
chaokunyang pushed a commit to chaokunyang/mars that referenced this pull request May 31, 2022
Development

Successfully merging this pull request may close these issues.

Use sample method to estimate DataFrame memory cost
3 participants