Add options to the "delete-cache" command #1065
Comments
We are interested in cleaning cached models that were not accessed in the past N days. Would it make sense to provide an option for this? For the time being, I think this can be done with the Python API, but the CLI would be more convenient.

from datetime import datetime

from huggingface_hub import scan_cache_dir

expiry = 21  # in days
now = datetime.now()
cache_info = scan_cache_dir()

# Collect every revision of each repo not accessed in the last `expiry` days.
to_clean = []
for repo in cache_info.repos:
    delta = now - datetime.fromtimestamp(repo.last_accessed)
    if delta.days >= expiry:
        print(f"{repo.size_on_disk_str:>8}", f"{delta.days:>4} days", repo.repo_id)
        to_clean += [revision.commit_hash for revision in repo.revisions]

delete_strategy = cache_info.delete_revisions(*to_clean)
print(f"Will free {delete_strategy.expected_freed_size_str}.")
# delete_strategy.execute()  # uncomment to actually delete
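The age check in that snippet could also be pulled into a small helper, so the cutoff logic can be exercised without touching a real cache. This is only a sketch: `stale_revisions` and the `SimpleNamespace` stand-ins are hypothetical names for illustration, not part of huggingface_hub. It assumes repo objects expose `last_accessed` (a POSIX timestamp) and `revisions` with a `commit_hash`, as used in the snippet.

```python
import time
from types import SimpleNamespace

SECONDS_PER_DAY = 86400


def stale_revisions(repos, max_age_days, now=None):
    """Return sorted commit hashes of repos not accessed for `max_age_days` days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    hashes = []
    for repo in repos:
        if repo.last_accessed <= cutoff:
            hashes.extend(rev.commit_hash for rev in repo.revisions)
    return sorted(hashes)


# Stand-in objects mimicking only the attributes used above (illustrative).
NOW = 1_700_000_000
fake_repos = [
    SimpleNamespace(
        repo_id="org/old",
        last_accessed=NOW - 30 * SECONDS_PER_DAY,
        revisions=[SimpleNamespace(commit_hash="aaa"), SimpleNamespace(commit_hash="bbb")],
    ),
    SimpleNamespace(
        repo_id="org/new",
        last_accessed=NOW - 2 * SECONDS_PER_DAY,
        revisions=[SimpleNamespace(commit_hash="ccc")],
    ),
]
stale = stale_revisions(fake_repos, max_age_days=21, now=NOW)
print(stale)
```

With a real cache, the same helper would presumably be fed `scan_cache_dir().repos`, and its result passed to `cache_info.delete_revisions(*hashes)` as in the snippet.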
@Wauplin is this issue still open? Does this need a solution?
I'll end up doing one or several of these while fixing #2219 as well. The one that most bothers me right now is that the sort order for
A CLI tool was introduced in #1025. It allows scanning and deleting the HF cache directory, which is especially useful when the hard drive gets full.
Currently, a list of repos is printed with details such as revision name, size and last modified date. The user can select which revisions to delete, either via a terminal UI (if huggingface_hub[cli] is installed) or via a temporary file to edit (if the TUI is not supported).

At the moment the selection is entirely manual, to let the user decide what to do. To ease the process, we discussed implementing new CLI options:
- --filter to filter repo names
- --sort to sort by age, alphabetically, size, ...
- --limit to display only the top X repos
- --keep-last to keep only the last revision of each repo

Each option can be implemented separately in a different PR. The CLI implementation can be found in ./commands/delete_cache.py, while the cache scan tool itself is in ./utils/_cache_manager.py.
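To make the proposed options more concrete, here is a rough sketch of how they might operate on the scanned repos. The exact semantics are not settled in this issue, so everything below is an assumption: --filter is treated as a glob on the repo id, --sort accepts a few hypothetical keys, and "last revision" is taken to mean the one with the most recent `last_modified`. The helper names and the stand-in dataclasses are illustrative and are not part of huggingface_hub.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class Revision:  # stand-in for a cached revision (illustrative only)
    commit_hash: str
    last_modified: float


@dataclass
class Repo:  # stand-in for a cached repo (illustrative only)
    repo_id: str
    size_on_disk: int
    last_accessed: float
    revisions: list = field(default_factory=list)


# Hypothetical sort keys; each maps to (key function, reverse flag).
SORT_KEYS = {
    "name": (lambda r: r.repo_id, False),
    "size": (lambda r: r.size_on_disk, True),    # largest first
    "age": (lambda r: r.last_accessed, False),   # least recently accessed first
}


def select_repos(repos, pattern=None, sort=None, limit=None):
    """Apply --filter, --sort and --limit (in that order) to a list of repos."""
    selected = [r for r in repos if pattern is None or fnmatch(r.repo_id, pattern)]
    if sort is not None:
        key, reverse = SORT_KEYS[sort]
        selected.sort(key=key, reverse=reverse)
    return selected[:limit] if limit is not None else selected


def revisions_to_drop(repo):
    """--keep-last: every commit hash except the most recently modified one."""
    ordered = sorted(repo.revisions, key=lambda rev: rev.last_modified)
    return [rev.commit_hash for rev in ordered[:-1]]


repos = [
    Repo("org/bert", 400, 100.0, [Revision("a1", 1.0), Revision("a2", 5.0)]),
    Repo("org/gpt2", 900, 300.0, [Revision("b1", 2.0)]),
    Repo("other/t5", 700, 200.0,
         [Revision("c1", 3.0), Revision("c2", 2.0), Revision("c3", 9.0)]),
]
top = select_repos(repos, pattern="org/*", sort="size", limit=1)
print([r.repo_id for r in top])
print(revisions_to_drop(repos[2]))
```

Keeping selection separate from deletion means each option could land in its own PR, with the resulting hashes still going through the existing manual confirmation step.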