[WIP] add benchmarking scripts #615

Draft: wants to merge 7 commits into main


radekosmulski
Contributor

No description provided.


@radekosmulski radekosmulski marked this pull request as draft February 15, 2023 04:38
@@ -0,0 +1,428 @@
{
Contributor

@bschifferer bschifferer Feb 15, 2023

This line gave me an error because I do not have this directory.


Contributor Author

This should be fixed now.

@@ -0,0 +1,428 @@
{
Contributor

@bschifferer bschifferer Feb 15, 2023

timeit is nice, but we should write the output to a file so that we can save it.

We also need a warm-up phase (the first requests are normally slow). We need something like:

import time

import cudf
import numpy as np
import tritonclient.grpc as grpcclient
import nvtabular.inference.triton as nvt_triton  # assumed import path; adjust to your environment

MODEL_NAME_PT = "t4r_pytorch_pt"

# Warm-up: the first requests are normally slow, so send 200 requests without timing them.
for _ in range(200):
    payload = cudf.DataFrame(data={'sess_pid_seq': np.random.randint(0, 390001, 20), 'id': 0}).groupby('id').agg({'sess_pid_seq': list})
    with grpcclient.InferenceServerClient("localhost:8001") as client:
        col_names = ['sess_pid_seq']
        inputs = nvt_triton.convert_df_to_triton_input(col_names, payload, grpcclient.InferInput)
        response = client.infer(MODEL_NAME_PT, inputs)

# Collecting: time 200 requests and keep each latency in seconds.
# The timed region includes client setup and input conversion, not only the infer call.
out = []
for _ in range(200):
    payload = cudf.DataFrame(data={'sess_pid_seq': np.random.randint(0, 390001, 20), 'id': 0}).groupby('id').agg({'sess_pid_seq': list})

    start_time = time.time()
    with grpcclient.InferenceServerClient("localhost:8001") as client:
        col_names = ['sess_pid_seq']
        inputs = nvt_triton.convert_df_to_triton_input(col_names, payload, grpcclient.InferInput)
        response = client.infer(MODEL_NAME_PT, inputs)
    end_time = time.time()
    out.append(end_time - start_time)
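
The warm-up, timing, and save-to-file steps above could be wrapped in a small helper. This is only a sketch, not code from this PR: `benchmark`, `save_latencies`, `fake_request`, and the output file name are hypothetical, and a stand-in function replaces the real Triton request so the example runs without a server.

```python
import csv
import os
import statistics
import tempfile
import time


def benchmark(request_fn, warmup=200, runs=200):
    """Call request_fn `warmup` times untimed, then return `runs` latencies in seconds."""
    for _ in range(warmup):
        request_fn()
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        request_fn()
        latencies.append(time.perf_counter() - start)
    return latencies


def save_latencies(latencies, path):
    """Write one latency per row so results can be compared across runs."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["request", "latency_s"])
        for i, latency in enumerate(latencies):
            writer.writerow([i, f"{latency:.6f}"])


def fake_request():
    # Stand-in for the real client.infer(...) call; replace it with the actual request.
    sum(range(1000))


latencies = benchmark(fake_request, warmup=20, runs=50)
out_path = os.path.join(tempfile.gettempdir(), "latencies.csv")  # hypothetical location
save_latencies(latencies, out_path)

# statistics.quantiles with n=100 returns 99 cut points; index 94 is the 95th percentile.
p95 = statistics.quantiles(latencies, n=100)[94]
print(f"mean={statistics.mean(latencies):.6f}s  p95={p95:.6f}s  saved to {out_path}")
```

To benchmark the served model, `fake_request` would be swapped for a function that builds the payload and calls `client.infer`. Reporting a tail percentile next to the mean helps because a few slow requests can hide behind a good average.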




@rnyak rnyak self-requested a review February 15, 2023 13:57
@rnyak
Contributor

rnyak commented Feb 15, 2023

@radekosmulski can you add some explanation to the notebook of what its purpose is? What data is being downloaded/used? Thanks.

@@ -0,0 +1,428 @@
{
Contributor

@rnyak rnyak Feb 15, 2023

Line #15.    apt-get install unzip -y

Can you add some explanation here for users who don't know what's going on:

  • What data is used?
  • Where do the trained models come from? Who trained them? What script should we use to train a model and export it?
  • What does the rees46_ecom_dataset_small_for_ci.zip file include? Did you generate it? Does it contain the exported trained models?

Thanks


Contributor Author

Added a whole new notebook documenting training! 🙂 I will keep adding information as I go.

@@ -0,0 +1,428 @@
{
Contributor

@rnyak rnyak Feb 15, 2023

/transformers4rec/TF4Rec/models/ --> When and how were these models exported to that folder? Can you add some explanation?


Contributor Author

Absolutely! Added a notebook with steps for training and exporting models.

@rnyak rnyak modified the milestones: Merlin 23.02, Merlin 23.03 Feb 15, 2023
4 participants