Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the disk_usage() function work on macOS #537

Merged
merged 2 commits into from
Aug 17, 2023

Conversation

asmeurer
Copy link
Contributor

Fixes the issue described at
#507 (comment)

I don't know yet what should be done on Windows. Windows support will need to be investigated separately. It will likely be a lot more work than macOS support.

Fixes # .

Description

This pull request:

  • a
  • b
  • c

Pull request checklist

  • Did you test this change locally?
  • Did you update the documentaion (if required)?
  • Did you add/update relevant tests for this change (if required)?

Additional information

Fixes the issue described at
conda-incubator#507 (comment)

I don't know yet what should be done on Windows. Windows support will need to
be investigated separately. It will likely be a lot more work than macOS support.
@costrouc
Copy link
Member

costrouc commented Aug 16, 2023

@asmeurer looks like there are some formatting issues.

As a side note I wonder if we are trying to achieve cross platform support. Would it make sense for us to use libraries that have already implemented this? E.g. psutil might be a heavier dependency.

Seem that psutil support filesystem https://psutil.readthedocs.io/en/latest/index.html?highlight=disk%20usage#psutil.disk_usage. But reading those docs it looks like they also recommend shutil.disk_usage which I'd be in favor of using.

@asmeurer
Copy link
Contributor Author

asmeurer commented Aug 16, 2023

Yes, I saw shutil.disk_usage, and it's actually already being used in orm.py. But it seems to return the total usage of the volume, not the usage of the given directory. For example, here's the output in my conda-store-server directory.

>>> import shutil
>>> shutil.disk_usage('.')
usage(total=1000240963584, used=857797771264, free=142443192320)

vs.

$ du -sAB1
1105606	.

I wonder if the usage in orm.py is intended to be this or not?

I'm not sure if psutil has a du equivalent, though maybe some other library does. It seems it can be computed in pure Python https://stackoverflow.com/questions/1392413/calculating-a-directorys-size-using-python.

There are also some subtleties around things like apparent size vs. disk usage. This function uses apparent size and I've maintained that on Mac with the -A flag. I don't know if it's that important for our usage (the apparent size is usually smaller because data doesn't fill blocks entirely, but it can also be larger with filesystems that support sparse files or compression)

asmeurer added a commit to asmeurer/conda-store that referenced this pull request Aug 16, 2023
conda-docker doesn't work on Mac, and making it work is not straightforward,
so for now we just disable it when we aren't on Linux.

With this and conda-incubator#537, basic Mac support seems to work with --standalone (conda-incubator#507).
I'm not sure what else I should be testing, but I can create environments and
they appear to be functional.
@costrouc
Copy link
Member

@asmeurer you are totally right. I was confusing disk usage and directory usage. LGTM!

@costrouc costrouc merged commit 148dfb4 into conda-incubator:main Aug 17, 2023
7 checks passed
@asmeurer
Copy link
Contributor Author

Also I just want to point out that at a cursory glance most of the solutions in that StackOverflow question are wrong because they don't account for hard links, which is especially important for conda environments. du does handle hard links correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done 💪🏾
Development

Successfully merging this pull request may close these issues.

3 participants