convert-*.py: GGUF Naming Convention Refactor and Metadata Override Refactor #7499

Merged

mofosyne merged 66 commits into ggerganov:master from mofosyne:refactor-convert-py

Jul 18, 2024

Commits on Jul 15, 2024

convert-*.py: licence -> license

mofosyne committed Jul 15, 2024
d3a936f

convert-*.py: add --get-outfile command and refactor

mofosyne committed Jul 15, 2024
dbb1b47

convert-*.py: add basename and finetune metadata

mofosyne committed Jul 15, 2024
a42c2b7

convert-*.py: model card metadata

mofosyne committed Jul 15, 2024
916872f

convert-*.py: metadata class moved to utility

mofosyne committed Jul 15, 2024
4d5f18a

convert-*.py: encoding_scheme --> output_type

mofosyne committed Jul 15, 2024
5c263cb

convert-*.py: parse model card in metadata util. Add license_link and license_name to kv store

mofosyne committed Jul 15, 2024
b36e391

convert-*.py: add base_version and add tags

mofosyne committed Jul 15, 2024
8f73408

convert-*.py: add parameter size class

mofosyne committed Jul 15, 2024
0f1d50f

convert-*.py: add datasets and language to KV store

mofosyne committed Jul 15, 2024
684c604

convert-*.py: move per model weight estimation away from util back to main script
```
plus some refactoring
```
mofosyne committed Jul 15, 2024
b1927ee

convert-*.py: enable --model-name direct metadata override

mofosyne committed Jul 15, 2024
f7c2079

convert-*.py: add general.organization to kv store

mofosyne committed Jul 15, 2024
5a86dfa

convert-*.py: add quantized_by and enhance heuristics

mofosyne committed Jul 15, 2024
dd15712

convert-*.py: adjust help message

mofosyne committed Jul 15, 2024
b0553f4

convert-*.py: use heuristics to parse _name_or_path

mofosyne committed Jul 15, 2024
4d5cd06

convert-*.py: base_model is actually in spec for model cards

mofosyne committed Jul 15, 2024
32e80e0

convert-*.py: refactor parameter weight class

mofosyne committed Jul 15, 2024
54918ad

convert-*.py: need to include self in per_model_weight_count_estimation()

mofosyne committed Jul 15, 2024
39472a0

convert-*.py: add heuristic to directory name fallback
```
Also add source_url for huggingface url
```
mofosyne committed Jul 15, 2024
3625a42

convert-*.py: add unittest to metadata class

mofosyne committed Jul 15, 2024
91e65d9

convert-*.py: adjusted authorship KV store

mofosyne committed Jul 15, 2024
d060fcd

convert-*.py: separated unit test, hf_repo to repo_url

mofosyne committed Jul 15, 2024
eaa47f5

convert-*.py: Remove self.model_name that was left in since last rebase

mofosyne committed Jul 15, 2024
e973443

convert_hf_to_gguf.py: optional, dataclass removed from type as it was unused

mofosyne committed Jul 15, 2024
5011eef

convert_hf_to_gguf.py: rebase error correction

mofosyne committed Jul 15, 2024
2f23927

convert_hf_to_gguf.py: Remove code that is already in fill_templated_filename() and GGUFWriter()

mofosyne committed Jul 15, 2024
4dc8ddd

gguf_writer.py: generate tensor uuid if missing

mofosyne committed Jul 15, 2024
007708e

test: remove test_gguf.py and remove test_generate_any_missing_uuid()

mofosyne committed Jul 15, 2024
7ecb8f0

convert-*.py: autogenerate general.uuid if missing

mofosyne committed Jul 15, 2024
fdc5a3f

convert-*.py: write_tensors() --> prepare_tensors_for_writing()

mofosyne committed Jul 15, 2024
2a976e1

convert-*.py: refactor per model weight count estimation

mofosyne committed Jul 15, 2024
59a01df

convert-*.py: pyright type fixes

mofosyne committed Jul 15, 2024
dd14b8f

Apply suggestions from code review
```
Co-authored-by: compilade <git@compilade.net>
```
mofosyne and compilade committed Jul 15, 2024
74383ba

convert-*.py: cast not required if Metadata.load_metadata_override returned a dict[str, Any] instead of a dict[str, object]
```
Co-authored-by: compilade <git@compilade.net>
```
mofosyne and compilade committed Jul 15, 2024
4c91d07

convert-*.py: Removing the redundant metadata is not None from all conditions, and indenting them.
```
Co-authored-by: compilade <git@compilade.net>
```
mofosyne and compilade committed Jul 15, 2024
6eb08ac

convert-*.py: parameter_class_attribute --> size_label

mofosyne committed Jul 15, 2024
f8b5931

convert-*.py: remove redundant gguf_writer.add_name() calls

mofosyne committed Jul 15, 2024
64707b6

convert-*.py: prepare_tensors_for_writing() --> prepare_tensors()
```
> Especially since it can be used for other purposes than "for writing", like preparing the tensors to then count and sum all their sizes.

Co-authored-by: compilade <git@compilade.net>
```
mofosyne and compilade committed Jul 15, 2024
04c4fff

convert-*.py: import cast from typing and other refactor

mofosyne committed Jul 15, 2024
f2b425c

convert-*.py: remove autogenerated uuid

mofosyne committed Jul 15, 2024
ad217d7

Update convert_hf_to_gguf.py
```
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
```
mofosyne and ngxson committed Jul 15, 2024
60278e4

Update convert_hf_to_gguf.py
```
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
```
mofosyne and ngxson committed Jul 15, 2024
aa4e589

Update constants.py : spacing correction

mofosyne committed Jul 15, 2024
2c06030

constants.py : Revert removal of backward compatibility KEY_GENERAL_SOURCE_URL

mofosyne committed Jul 15, 2024
8156835

convert-*.py: remove reference to uuid generation

mofosyne committed Jul 15, 2024
ccff6c7

Apply suggestions from code review
```
Co-authored-by: compilade <git@compilade.net>
```
mofosyne and compilade committed Jul 15, 2024
455c0e5

convert-*.py: dict_item --> Iterable

mofosyne committed Jul 15, 2024
5ab1a84

convert-*.py: update nix package to add python frontmatter

mofosyne committed Jul 15, 2024
5cdb03b

convert-*.py: add logger and refactor load_model_card()

mofosyne committed Jul 15, 2024
9954b64

convert-*.py: quantized_by in model card is not relevant for converted gguf

mofosyne committed Jul 15, 2024
abc351c

convert-*.py: pathlib.Path exist() --> is_file() or is_dir()

mofosyne committed Jul 15, 2024
144a7ec

convert-*.py: per_model_weight_count_estimation() tensor arg type is Iterable[tuple[str, LazyTensor]]

mofosyne committed Jul 15, 2024
8629b7b

convert-*.py: flake8 newline missing

mofosyne committed Jul 15, 2024
4e37611

convert-*.py: more rigorous regexp for get_model_id_components()

mofosyne committed Jul 15, 2024
f98f109

convert-*.py: flake8 remove blank line

mofosyne committed Jul 15, 2024
3b1766a

gguf-py : use pyyaml instead of python-frontmatter
```
HF transformers already depends on pyyaml for model cards,
so it should already be in the environment
of the users of the convert scripts, unlike python-frontmatter.

This should be completely equivalent since the model cards
seem to use only YAML and never TOML.
```
compilade authored and mofosyne committed Jul 15, 2024
78a42fb

convert_hf : use GGUFWriter to count model parameters

compilade authored and mofosyne committed Jul 15, 2024
417d7a7

metadata.py: account for decimal point in size label within model id components

mofosyne committed Jul 15, 2024
9a925b5

Update convert_hf_to_gguf.py
```
It might help with the convert_lora_to_gguf.py script if default values were added here

Co-authored-by: compilade <git@compilade.net>
```
mofosyne and compilade committed Jul 15, 2024
c7b3616

Commits on Jul 16, 2024

Merge branch 'master' into refactor-convert-py

mofosyne committed Jul 16, 2024
5da16bb

convert-*.py: Add naming_convention_vocab_only()

mofosyne committed Jul 16, 2024
eb0bf6b

convert_lora_to_gguf.py: remove model_name parameter. Doesn't exist in LoraModel()

mofosyne committed Jul 16, 2024
7e9271c

Commits on Jul 18, 2024

gguf-py : extract metadata from model name more resiliently

Using more than one regex to annotate the parts of the name,
this way, the order doesn't have to be fixed
and this should work correctly for more edge cases.

Also, the total parameter count of the model is used to figure out
if a size label is not actually a size label, but a context size.

* convert_lora : fix duplicate model type key

compilade committed Jul 18, 2024

2c18a9a
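
As a rough illustration of the idea in the commit message above (several independent regexes, each tagging one kind of name part, plus a check against the total parameter count), here is a minimal sketch. It is not the code from gguf-py: the function `annotate_name_parts`, the three patterns, the finetune keyword list, and the "less than a tenth of the real parameter count" threshold are all assumptions made up for this example, as is the model id used to exercise it.

```python
import re

# Hypothetical patterns for illustration only; the real gguf-py code may differ.
SIZE_RE = re.compile(r"(?:(\d+)x)?(\d+(?:\.\d+)?)([KMBT])", re.IGNORECASE)
VERSION_RE = re.compile(r"v\d+(?:\.\d+)*", re.IGNORECASE)
FINETUNE_RE = re.compile(r"instruct|chat|code", re.IGNORECASE)

UNIT = {"K": 1e3, "M": 1e6, "B": 1e9, "T": 1e12}


def label_to_count(match):
    """Convert a matched label such as '7B' or '8x7B' to a parameter count."""
    experts = int(match.group(1)) if match.group(1) else 1
    return experts * float(match.group(2)) * UNIT[match.group(3).upper()]


def annotate_name_parts(model_id, total_params=0):
    """Tag each dash-separated part of a model id, in any order."""
    name = model_id.split("/")[-1]  # drop an optional "organization/" prefix
    annotated = []
    for part in name.split("-"):
        size = SIZE_RE.fullmatch(part)
        if size:
            kind = "size_label"
            # Assumed threshold: a "size" far below the known parameter
            # count is more plausibly a context length (e.g. "32k").
            if total_params > 0 and label_to_count(size) < total_params / 10:
                kind = "context_size"
        elif VERSION_RE.fullmatch(part):
            kind = "version"
        elif FINETUNE_RE.fullmatch(part):
            kind = "finetune"
        else:
            kind = "basename"
        annotated.append((part, kind))
    return annotated


if __name__ == "__main__":
    # Made-up model id; 7.24e9 stands in for a counted parameter total.
    for part, kind in annotate_name_parts(
        "example-org/Example-7B-Instruct-v0.2-32k", total_params=7_240_000_000
    ):
        print(f"{part}: {kind}")
```

Because each part is classified on its own, the parts can appear in any order, and `32k` is only treated as a context size when a parameter count is supplied to compare against.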

gguf-py : fix flake8 lint

compilade committed Jul 18, 2024
4c9932c

gguf-py : handle more name metadata extraction edge cases
```
* gguf-py : output the split plan on stdout when using dry_run

* convert_hf : unify vocab naming convention with the standard one

This also adds a way to name LoRA models.
```
compilade committed Jul 18, 2024
73899f7
