Add private configs #1996

Merged
merged 121 commits into from
Dec 11, 2023
121 commits
e6b9729
Add the Tokenizer object logic
JosselinSomervilleRoberts Oct 4, 2023
38784f7
Changed most clients to use a Tokenizer object
JosselinSomervilleRoberts Oct 5, 2023
daf431b
Changed remaining clients to use a Tokenizer object
JosselinSomervilleRoberts Oct 6, 2023
cfd0c15
Removed calls to CachableClient.tokenize()
JosselinSomervilleRoberts Oct 6, 2023
4bf3c46
Add TODOs
JosselinSomervilleRoberts Oct 7, 2023
d6341ef
Make client methods abstract
JosselinSomervilleRoberts Oct 7, 2023
d2ac135
Resolve merge conflicts
JosselinSomervilleRoberts Oct 9, 2023
4e83fd2
Fix ICE Tokenizer test
JosselinSomervilleRoberts Oct 9, 2023
0445a78
Fix Critique breaking change
JosselinSomervilleRoberts Oct 9, 2023
e1cbe32
Revert fix
JosselinSomervilleRoberts Oct 9, 2023
40337f7
Fix all window service test issues except for Cohere
JosselinSomervilleRoberts Oct 10, 2023
037a869
Resolve merge conflicts with HuggingFace refactorization
JosselinSomervilleRoberts Oct 11, 2023
1077699
Refactor CachableClient -> CachingClient
JosselinSomervilleRoberts Oct 11, 2023
b0fefef
Refactor yalm_tokenizer_src -> yalm_tokenizer_data
JosselinSomervilleRoberts Oct 11, 2023
0e47750
Merge #1891
JosselinSomervilleRoberts Oct 11, 2023
c1b06e3
First draft of the model deployment/metadata refactorization
JosselinSomervilleRoberts Oct 17, 2023
802b2ec
Fix one of the TODO
JosselinSomervilleRoberts Oct 17, 2023
a907774
Merge branch 'main' into joss-refactor-1-tokenizer
JosselinSomervilleRoberts Oct 17, 2023
c9aa4fd
CachableTokenizer -> CachingTokenizer
JosselinSomervilleRoberts Oct 17, 2023
b1badb7
Port VLM model Idefics to use new Tokenizer logic
JosselinSomervilleRoberts Oct 17, 2023
57fd565
Add TODOs to remove tokenize and decode methods from Client
JosselinSomervilleRoberts Oct 17, 2023
50bb454
Change methods of CachingTokenizer to preserve existing Cache
JosselinSomervilleRoberts Oct 17, 2023
9876162
Change raw_request to request in _tokenization_raw_response_to_tokens…
JosselinSomervilleRoberts Oct 18, 2023
359994c
Merge branch 'main' into joss-refactor-1-tokenizer
JosselinSomervilleRoberts Oct 18, 2023
4d8c080
First draft of the model deployment/metadata refactorization
JosselinSomervilleRoberts Oct 17, 2023
ec97f8d
Fix one of the TODO
JosselinSomervilleRoberts Oct 17, 2023
068b199
Merge branch 'joss-refactor-4-deployments' of https://github.com/stan…
JosselinSomervilleRoberts Oct 18, 2023
427e723
Fixing black and mypy
JosselinSomervilleRoberts Oct 18, 2023
15d3cc4
Merge main
JosselinSomervilleRoberts Oct 25, 2023
06d5707
Support model in conf
JosselinSomervilleRoberts Oct 25, 2023
cebc733
Mode model definition to yaml
JosselinSomervilleRoberts Oct 27, 2023
e73c8a9
Done all metadata until Cohere (included)
JosselinSomervilleRoberts Oct 28, 2023
6edb232
Fix helm-summarize
JosselinSomervilleRoberts Oct 28, 2023
ae414ca
AI21 models done
JosselinSomervilleRoberts Oct 28, 2023
01be528
AI21 comment updated
JosselinSomervilleRoberts Oct 28, 2023
24e3446
Aleph Alpha deployments done
JosselinSomervilleRoberts Oct 28, 2023
5a3e504
Change AlephAlpha API key name
JosselinSomervilleRoberts Oct 28, 2023
57fd268
Merge branch 'main' into joss-refactor-4-deployments
JosselinSomervilleRoberts Oct 30, 2023
a412734
Add api_key binding for tokenizers
JosselinSomervilleRoberts Oct 30, 2023
faacc1b
Added AI21 Tokenizer
JosselinSomervilleRoberts Oct 30, 2023
f2fcd3e
Added AlephAlpha Tokenizer
JosselinSomervilleRoberts Oct 31, 2023
3495980
Added most of the model metadatas
JosselinSomervilleRoberts Oct 31, 2023
2fadd8c
Add last model metadatas
JosselinSomervilleRoberts Nov 1, 2023
71d76e6
Correct few errors in metadats
JosselinSomervilleRoberts Nov 1, 2023
bc8b068
Added Anthropic, BigScience and BigCode model deployments
JosselinSomervilleRoberts Nov 1, 2023
48ac8eb
Removed tags that were not necessary anymore with the new architecture
JosselinSomervilleRoberts Nov 1, 2023
3ca69f1
Added and tested tokenizers for Anthropic, BigCode and BigScience
JosselinSomervilleRoberts Nov 1, 2023
7f6dc4e
Added Cohere deployments
JosselinSomervilleRoberts Nov 1, 2023
56796cc
Use register_model_metadata
JosselinSomervilleRoberts Nov 1, 2023
a605581
Added Cohere command models and deprecated old Cohere models
JosselinSomervilleRoberts Nov 3, 2023
f27ac03
Merge branch 'main' into joss-refactor-4-deployments
JosselinSomervilleRoberts Nov 3, 2023
3b875cb
Added all tokenizers
JosselinSomervilleRoberts Nov 3, 2023
a3c9479
Added almost all model deployments (except palm, neurips and simple) …
JosselinSomervilleRoberts Nov 3, 2023
1a71401
Added deprecated field and updated many Together models
JosselinSomervilleRoberts Nov 4, 2023
3603549
Cleaning up
JosselinSomervilleRoberts Nov 4, 2023
077c1e5
Updating arguments for dependency injection
JosselinSomervilleRoberts Nov 4, 2023
15f60d9
Clean up handling of old keyword 'model'
JosselinSomervilleRoberts Nov 4, 2023
080702b
Better handle backward compatibility and remove auto model metadata
JosselinSomervilleRoberts Nov 4, 2023
af28659
Set some together model to legacy
JosselinSomervilleRoberts Nov 6, 2023
08fc833
Merge branch 'main' into joss-refactor-4-deployments
JosselinSomervilleRoberts Nov 6, 2023
c47c436
Nearly all tests should now pass
JosselinSomervilleRoberts Nov 7, 2023
578185a
All tests should now pass
JosselinSomervilleRoberts Nov 7, 2023
2d1ebb5
Merge branch 'main' into joss-refactor-4-deployments
JosselinSomervilleRoberts Nov 7, 2023
e5436d3
Trying to make the regression tests pass
JosselinSomervilleRoberts Nov 7, 2023
48e01c9
Lazy instantiate Aleph Alpha Client to pass regression test
JosselinSomervilleRoberts Nov 7, 2023
64b1685
Trying to make summarize work as expected
JosselinSomervilleRoberts Nov 7, 2023
84b6f48
Trying to make summarize compatible with old runs
JosselinSomervilleRoberts Nov 7, 2023
97377bf
helm-summarize is now compatible with old HELM
JosselinSomervilleRoberts Nov 7, 2023
bf4e201
Merge branch 'main' into joss-refactor-4-deployments
JosselinSomervilleRoberts Nov 8, 2023
7d0af69
Changes to frontend
JosselinSomervilleRoberts Nov 8, 2023
00d730f
Making sure all the input type for helm-run are handled properly
JosselinSomervilleRoberts Nov 9, 2023
205b871
Remove # ========= # in configs
JosselinSomervilleRoberts Nov 9, 2023
e177a1c
Rename host group to host organization
JosselinSomervilleRoberts Nov 9, 2023
401fcb4
Deleting get_default_deployment_for_model()
JosselinSomervilleRoberts Nov 9, 2023
ccff08e
Change creator organization to host organization for Cache
JosselinSomervilleRoberts Nov 9, 2023
9875991
A lot of small changes to answer some comments on the PR
JosselinSomervilleRoberts Nov 9, 2023
a892bab
Rename model_metadatas.yaml to singular
JosselinSomervilleRoberts Nov 9, 2023
8f0e964
Set openai/text-embedding-ada-002 as non deprecated
JosselinSomervilleRoberts Nov 9, 2023
fa04702
Merge branch 'main' into joss-refactor-4-deployments
JosselinSomervilleRoberts Nov 9, 2023
67a7351
Remove fancy headers
JosselinSomervilleRoberts Nov 9, 2023
3c61bf6
Add back model to adapter_keys_shown
JosselinSomervilleRoberts Nov 9, 2023
7628ce5
Revert frontend changes
JosselinSomervilleRoberts Nov 9, 2023
8901151
Added request.model_deployment
JosselinSomervilleRoberts Nov 9, 2023
606382a
Remove calls to client.tokenize
JosselinSomervilleRoberts Nov 9, 2023
33d8470
Added Lit GPT
JosselinSomervilleRoberts Nov 9, 2023
40c18cc
Added Neurips local
JosselinSomervilleRoberts Nov 9, 2023
47704c8
Fix YAML typo for lit-gpt
JosselinSomervilleRoberts Nov 9, 2023
d8d97ff
Fix test_run_entry
JosselinSomervilleRoberts Nov 9, 2023
4b461a3
Add some missing comments
JosselinSomervilleRoberts Nov 9, 2023
d630f48
Add private config files
JosselinSomervilleRoberts Nov 10, 2023
7cf4901
Add .gitignore
JosselinSomervilleRoberts Nov 10, 2023
bb277b8
Gitignore is not working
JosselinSomervilleRoberts Nov 10, 2023
f855d69
Trying to fix .gitignore
JosselinSomervilleRoberts Nov 10, 2023
3e490ad
Merge branch 'main' into joss-refactor-4-deployments
JosselinSomervilleRoberts Nov 13, 2023
853eca6
Add test to ensure that all models are available
JosselinSomervilleRoberts Nov 13, 2023
c6f8df7
Update comment style
JosselinSomervilleRoberts Nov 13, 2023
1fa74ce
Update get_deployment_name_from_model_arg()
JosselinSomervilleRoberts Nov 13, 2023
e4a8db4
Rename maybe_register_helm and move it to its own file
JosselinSomervilleRoberts Nov 14, 2023
8e515b5
Remove deprecation warning for cases like mode=text
JosselinSomervilleRoberts Nov 14, 2023
e0ca539
Split test read run specs in several tests
JosselinSomervilleRoberts Nov 14, 2023
d3f0e4e
Change Exception to Warning when deployments are found
JosselinSomervilleRoberts Nov 14, 2023
3f97b6d
Use importlib so that local paths work on a pypi install
JosselinSomervilleRoberts Nov 14, 2023
7cecdbd
Add default model metadata registration for huggingface models
JosselinSomervilleRoberts Nov 14, 2023
a8d931b
Changing Request so that model and model_deployment are always both f…
JosselinSomervilleRoberts Nov 14, 2023
57431f1
Fix test server service
JosselinSomervilleRoberts Nov 14, 2023
67252bc
Update tutorial
JosselinSomervilleRoberts Nov 15, 2023
abf9d0a
Alternative model deployment proposal (#2002)
yifanmai Nov 15, 2023
6ba0199
Merge main
JosselinSomervilleRoberts Nov 15, 2023
4b849a7
Fix helm-run and a few tests
JosselinSomervilleRoberts Nov 15, 2023
5749a8f
Merge branch 'main' into joss-refactor-4-deployments
JosselinSomervilleRoberts Nov 15, 2023
6f8a13c
Fix broken test
JosselinSomervilleRoberts Nov 17, 2023
e7eb250
Small fixes to the configs
JosselinSomervilleRoberts Nov 17, 2023
f177503
Fix Mistral #1998
JosselinSomervilleRoberts Nov 17, 2023
a3323bd
Fix files that were still not speciying both model and deployment in …
JosselinSomervilleRoberts Nov 17, 2023
c2d3e03
Fix some mypy issues
JosselinSomervilleRoberts Nov 17, 2023
5ce5648
Merge branch 'main' into joss-refactor-4-deployments
JosselinSomervilleRoberts Nov 17, 2023
0959f1d
Merge branch 'joss-refactor-4-deployments' into joss-refactor-8-private
JosselinSomervilleRoberts Nov 18, 2023
190efc0
Merge branch 'main' into joss-refactor-8-private
JosselinSomervilleRoberts Nov 18, 2023
cd09e52
Merge branch 'main' into joss-refactor-8-private
JosselinSomervilleRoberts Nov 20, 2023
0cbc583
Move private configs to prod_env
JosselinSomervilleRoberts Nov 20, 2023
42d3af7
Merge branch 'main' into joss-refactor-8-private
yifanmai Dec 11, 2023
8 changes: 4 additions & 4 deletions src/helm/benchmark/config_registry.py
@@ -5,10 +5,10 @@
 HELM_REGISTERED: bool = False


-def register_helm_configurations():
+def register_helm_configurations(base_path: str = "prod_env"):
     global HELM_REGISTERED
     if not HELM_REGISTERED:
-        register_metadatas_if_not_already_registered()
-        register_tokenizers_if_not_already_registered()
-        register_deployments_if_not_already_registered()
+        register_metadatas_if_not_already_registered(base_path)
+        register_tokenizers_if_not_already_registered(base_path)
+        register_deployments_if_not_already_registered(base_path)
         HELM_REGISTERED = True
4 changes: 3 additions & 1 deletion src/helm/benchmark/model_deployment_registry.py
@@ -177,9 +177,11 @@ def get_model_names_with_tokenizer(tokenizer_name: str) -> List[str]:
     return [deployment.model_name or deployment.name for deployment in deployments]


-def register_deployments_if_not_already_registered() -> None:
+def register_deployments_if_not_already_registered(base_path: str = "prod_env") -> None:
     global DEPLOYMENTS_REGISTERED
     if not DEPLOYMENTS_REGISTERED:
         path: str = resources.files(CONFIG_PACKAGE).joinpath(MODEL_DEPLOYMENTS_FILE)
+        private_path: str = os.path.join(base_path, MODEL_DEPLOYMENTS_FILE)
         maybe_register_model_deployments_from_base_path(path)
+        maybe_register_model_deployments_from_base_path(private_path)
         DEPLOYMENTS_REGISTERED = True
4 changes: 3 additions & 1 deletion src/helm/benchmark/model_metadata_registry.py
@@ -200,11 +200,13 @@ def get_all_instruction_following_models() -> List[str]:
     return get_model_names_with_tag(INSTRUCTION_FOLLOWING_MODEL_TAG)


-def register_metadatas_if_not_already_registered() -> None:
+def register_metadatas_if_not_already_registered(base_path: str = "prod_env") -> None:
     global METADATAS_REGISTERED
     if not METADATAS_REGISTERED:
         path: str = resources.files(CONFIG_PACKAGE).joinpath(MODEL_METADATA_FILE)
+        private_path: str = os.path.join(base_path, MODEL_METADATA_FILE)
         maybe_register_model_metadata_from_base_path(path)
+        maybe_register_model_metadata_from_base_path(private_path)
         METADATAS_REGISTERED = True


8 changes: 7 additions & 1 deletion src/helm/benchmark/presentation/summarize.py
@@ -1349,6 +1349,12 @@ def main():
         help="Number of instance ids we're using; only for annotating scenario spec instance ids file",
         default=1000,
     )
+    parser.add_argument(
+        "--local-path",
+        type=str,
+        help="If running locally, the path for `ServerService`.",
+        default="prod_env",
+    )
     parser.add_argument(
         "--allow-unknown-models",
         type=bool,
@@ -1378,7 +1384,7 @@ def main():
     else:
         raise ValueError("Exactly one of --release or --suite must be specified.")

-    register_helm_configurations()
+    register_helm_configurations(base_path=args.local_path)

     # Output JSON files summarizing the benchmark results which will be loaded in the web interface
     summarizer = Summarizer(
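The effect of the new `--local-path` flag in `summarize.py` can be illustrated with a small standalone parser. This is only a sketch: `parse_args` and the stub `register_helm_configurations` below are hypothetical stand-ins that return the path they would read from, rather than the real HELM functions.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse only the flag added in this hunk; the real script defines many more."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--local-path",
        type=str,
        help="If running locally, the path for `ServerService`.",
        default="prod_env",
    )
    return parser.parse_args(argv)


def register_helm_configurations(base_path: str = "prod_env") -> str:
    """Stub for illustration: report which directory private configs would be read from."""
    return base_path


# argparse exposes `--local-path` as the attribute `local_path`.
default_base = register_helm_configurations(base_path=parse_args([]).local_path)
custom_base = register_helm_configurations(base_path=parse_args(["--local-path", "my_env"]).local_path)
```

With no flag given, the private configs are looked up under `prod_env`, which matches the default of `base_path` in the registry functions.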
20 changes: 1 addition & 19 deletions src/helm/benchmark/run.py
@@ -13,8 +13,6 @@
 from helm.common.object_spec import parse_object_spec, get_class_by_name
 from helm.proxy.services.remote_service import create_authentication, add_service_args

-from helm.benchmark.model_metadata_registry import register_model_metadata_from_path
-from helm.benchmark.model_deployment_registry import register_model_deployments_from_path
 from helm.benchmark.config_registry import register_helm_configurations
 from helm.benchmark.adaptation.adapter_spec import AdapterSpec
 from helm.benchmark import vlm_run_specs  # noqa
@@ -246,18 +244,6 @@ def main():
         default=None,
         help="Full class name of the Runner class to use. If unset, uses the default Runner.",
     )
-    parser.add_argument(
-        "--model-metadata-paths",
-        nargs="+",
-        help="Experimental: Where to read model metadata from",
-        default=[],
-    )
-    parser.add_argument(
-        "--model-deployment-paths",
-        nargs="+",
-        help="Experimental: Where to read model deployments from",
-        default=[],
-    )
     add_run_args(parser)
     args = parser.parse_args()
     validate_args(args)
@@ -266,10 +252,6 @@ def main():
         register_huggingface_hub_model_from_flag_value(huggingface_model_name)
     for huggingface_model_path in args.enable_local_huggingface_models:
         register_huggingface_local_model_from_flag_value(huggingface_model_path)
-    for model_metadata_path in args.model_metadata_paths:
-        register_model_metadata_from_path(model_metadata_path)
-    for model_deployment_paths in args.model_deployment_paths:
-        register_model_deployments_from_path(model_deployment_paths)
Review comment (Collaborator):

Could you keep these flags for now? It looks like some people are using them, e.g. #2110.


     run_entries: List[RunEntry] = []
     if args.conf_paths:
@@ -284,7 +266,7 @@ def main():
     ensure_directory_exists(args.output_path)
     set_benchmark_output_path(args.output_path)

-    register_helm_configurations()
+    register_helm_configurations(base_path=args.local_path)

     run_specs = run_entries_to_run_specs(
         run_entries=run_entries,
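One way to honor the review comment on this file would be to keep the two experimental flags next to the new base-path lookup. The sketch below shows that combination under stated assumptions: `build_parser` and `config_sources` are hypothetical helpers, `--local-path` is declared here explicitly only to keep the example self-contained, and the ordering (explicit paths first, then the base path) simply follows the order of the calls in the pre-PR `main()`.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser keeping the experimental flags alongside --local-path."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--local-path", type=str, default="prod_env")
    parser.add_argument(
        "--model-metadata-paths",
        nargs="+",
        help="Experimental: Where to read model metadata from",
        default=[],
    )
    parser.add_argument(
        "--model-deployment-paths",
        nargs="+",
        help="Experimental: Where to read model deployments from",
        default=[],
    )
    return parser


def config_sources(args: argparse.Namespace) -> List[str]:
    """Return the locations that would be registered, in registration order."""
    sources: List[str] = []
    sources.extend(args.model_metadata_paths)    # explicit metadata files first
    sources.extend(args.model_deployment_paths)  # then explicit deployment files
    sources.append(args.local_path)              # finally the base path (prod_env by default)
    return sources
```

For example, `config_sources(build_parser().parse_args(["--model-deployment-paths", "a.yaml", "b.yaml"]))` lists the two explicit files followed by `prod_env`.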
4 changes: 3 additions & 1 deletion src/helm/benchmark/tokenizer_config_registry.py
@@ -70,9 +70,11 @@ def get_tokenizer_config(name: str) -> Optional[TokenizerConfig]:
     return TOKENIZER_NAME_TO_CONFIG.get(name)


-def register_tokenizers_if_not_already_registered() -> None:
+def register_tokenizers_if_not_already_registered(base_path: str = "prod_env") -> None:
     global TOKENIZERS_REGISTERED
     if not TOKENIZERS_REGISTERED:
         path: str = resources.files(CONFIG_PACKAGE).joinpath(TOKENIZER_CONFIGS_FILE)
+        private_path: str = os.path.join(base_path, TOKENIZER_CONFIGS_FILE)
         maybe_register_tokenizer_configs_from_base_path(path)
+        maybe_register_tokenizer_configs_from_base_path(private_path)
         TOKENIZERS_REGISTERED = True
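The two paths in this hunk are resolved differently: the packaged file is located through `importlib.resources`, so it follows the installed package, while the private file is a plain path joined onto `base_path`. A minimal sketch of that difference follows, assuming Python 3.9 or later for `resources.files`. The standard-library `email` package and its `__init__.py` are stand-ins for `CONFIG_PACKAGE` and the config file name, chosen only so the snippet runs without HELM installed.

```python
import os
from importlib import resources

# Stand-ins for CONFIG_PACKAGE / TOKENIZER_CONFIGS_FILE from the diff (assumptions of this sketch).
PACKAGE = "email"
PACKAGED_FILE = "__init__.py"

# Resolved relative to wherever the package is installed, independent of the working directory.
packaged_path = resources.files(PACKAGE).joinpath(PACKAGED_FILE)

# Resolved relative to the current working directory unless base_path is absolute.
base_path = "prod_env"
private_path = os.path.join(base_path, "tokenizer_configs.yaml")

packaged_exists = packaged_path.is_file()
private_is_relative = not os.path.isabs(private_path)
```

In practice this means the private file is only found when the command is run from the directory that contains `prod_env`, or when an absolute base path is passed.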
9 changes: 6 additions & 3 deletions src/helm/config/model_deployments.yaml
@@ -2,7 +2,12 @@
 # Some models have several deployments, each with different parameters.

 # If you want to add a new deployment, you can technically do it here but we recommend
-# you to do it in private/model_deployments.yaml instead.
+# you to do it in prod_env/model_deployments.yaml instead.
+
+# Follow the template of this file to add a new deployment. You can copy paste this to get started:
+# # This file defines all the model deployments that you do not want to be public.
+# model_deployments: [] # Leave empty to disable private model deployments
+

 model_deployments:

@@ -506,8 +511,6 @@ model_deployments:
       class_name: "helm.benchmark.window_services.gpt2_window_service.GPT2WindowService"
       args: {}

-
-
   # HuggingFaceM4
   - name: HuggingFaceM4/idefics-9b
     model_name: HuggingFaceM4/idefics-9b
7 changes: 6 additions & 1 deletion src/helm/config/model_metadata.yaml
@@ -2,7 +2,12 @@
 # The model names here should match the model names in model_deployments.yaml.

 # If you want to add a new model, you can technically do it here but we recommend
-# you to do it in private/model_metadata.yaml instead.
+# you to do it in prod_env/model_metadata.yaml instead.
+
+# Follow the template of this file to add a new model. You can copy paste this to get started:
+# # This file contains the metadata for private models
+# models: [] # Leave empty to disable private models
+

 models:

10 changes: 10 additions & 0 deletions src/helm/config/tokenizer_configs.yaml
@@ -1,3 +1,13 @@
+# This file defines all the tokenizers that are supported by the Helm API.
+
+# If you want to add a new tokenizer, you can technically do it here but we recommend
+# you to do it in prod_env/tokenizer_configs.yaml instead.
+
+# Follow the template of this file to add a new tokenizer. You can copy paste this to get started:
+# # This file contains the tokenizer configs for the private tokenizers
+# tokenizer_configs: [] # Leave empty to disable private tokenizers
+
+
 tokenizer_configs:

   - name: simple/model1
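The three copy-paste templates quoted in the YAML comments above could be written into a `prod_env` directory with a few lines of Python. This is a hedged convenience sketch, not something the PR ships: `write_private_config_templates` and `PRIVATE_CONFIG_TEMPLATES` are hypothetical names, and only the file names and template contents come from the diff.

```python
import os
from typing import Dict, List

# Template contents copied from the comments added in this PR.
PRIVATE_CONFIG_TEMPLATES: Dict[str, str] = {
    "model_deployments.yaml": (
        "# This file defines all the model deployments that you do not want to be public.\n"
        "model_deployments: [] # Leave empty to disable private model deployments\n"
    ),
    "model_metadata.yaml": (
        "# This file contains the metadata for private models\n"
        "models: [] # Leave empty to disable private models\n"
    ),
    "tokenizer_configs.yaml": (
        "# This file contains the tokenizer configs for the private tokenizers\n"
        "tokenizer_configs: [] # Leave empty to disable private tokenizers\n"
    ),
}


def write_private_config_templates(base_path: str = "prod_env") -> List[str]:
    """Create any missing template files under base_path; return the names that were written."""
    os.makedirs(base_path, exist_ok=True)
    written: List[str] = []
    for file_name, content in PRIVATE_CONFIG_TEMPLATES.items():
        path = os.path.join(base_path, file_name)
        if os.path.exists(path):
            continue  # never overwrite an existing private config
        with open(path, "w") as f:
            f.write(content)
        written.append(file_name)
    return written
```

Calling `write_private_config_templates()` twice is safe in this sketch: the second call finds the files already present and writes nothing.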