Make example datasets optional #73

Closed
kasnerz opened this issue Sep 10, 2024 · 1 comment · Fixed by #103
Labels: enhancement (New feature or request), high priority (Tasks which should be finished ASAP)

Comments

@kasnerz
Collaborator

kasnerz commented Sep 10, 2024

The factgenie repository is almost 100 MB when cloned from scratch. The majority of that is example datasets and outputs.

I realized that while the example datasets can be helpful, they can also be perceived as bloatware.

We should probably distribute them separately from the main repository, so that the default factgenie installation is as lightweight as can be.


Update: I found out that the majority of the bloat was in fact caused by Factgenie.mp4, which had once been uploaded to the main repository (and remained in the history even after deletion). I fixed it using git-filter-repo:

git-filter-repo --path Factgenie.mp4 --invert-paths

The repository is now only a few MB in size 💪
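
For anyone who wants to check what dominates a repository's history before attempting a rewrite like this, here is a minimal sketch. It is not taken from the factgenie repository: the function name `largest_blobs` is hypothetical, and the pipeline only combines standard `git rev-list` and `git cat-file` options. It lists the largest blobs reachable from any ref, including files that were deleted from the working tree but still live in history.

```shell
# Hypothetical helper (not part of factgenie): list the N largest blobs
# reachable from any ref, including files deleted in later commits.
# Output format: <size-in-bytes> <path>
largest_blobs() {
    n="${1:-10}"
    # rev-list prints "<object-id> <path>" for every reachable object
    git rev-list --objects --all |
        # %(rest) carries the path through; output is "<type> <size> <path>"
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        # keep blobs only and drop the leading "blob " (5 characters)
        awk '$1 == "blob" { print substr($0, 6) }' |
        # largest first
        sort -n -r -k1,1 |
        head -n "$n"
}
```

Running `largest_blobs 5` from inside a clone prints the five largest files ever committed. A single large media file at the top of that list would be a candidate for removal with `--invert-paths`.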

We tried to preserve the git history as much as possible, but please let us know if this causes any issues with your local git branches.

The main point of the issue still holds, though: we should make downloading the example datasets optional.

@kasnerz kasnerz added the enhancement New feature or request label Sep 10, 2024
@oplatek
Member

oplatek commented Sep 10, 2024

@kasnerz Git LFS is a good way to accept demo datasets. I "described" how to use it here: https://github.com/ufal/factgenie/wiki/05-Developer-Notes#%EF%B8%8F-handling-large-files

@kasnerz kasnerz added the high priority Tasks which should be finished ASAP label Sep 16, 2024
@kasnerz kasnerz added the in progress Working on it! label Sep 25, 2024
@kasnerz kasnerz removed the in progress Working on it! label Oct 15, 2024