eval-callback: Example how to use eval callback for debugging #6576
Conversation
Pretty cool. Some notes:
ggml_tensor * t; // the tensor to inspect
char * data;     // pointer to the tensor data in host memory

// walk every element; ne[] holds the number of elements per dimension, nb[] the strides in bytes
for (int64_t i3 = 0; i3 < t->ne[3]; i3++) {
    for (int64_t i2 = 0; i2 < t->ne[2]; i2++) {
        for (int64_t i1 = 0; i1 < t->ne[1]; i1++) {
            for (int64_t i0 = 0; i0 < t->ne[0]; i0++) {
                size_t i = i3*t->nb[3] + i2*t->nb[2] + i1*t->nb[1] + i0*t->nb[0];
                float v = *(float *)(data + i); // assumes a GGML_TYPE_F32 tensor
            }
        }
    }
}
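For reference, here is a minimal sketch (my own, not the exact code in this PR) of how such a loop could be driven from the scheduler eval callback; it assumes the ggml_backend_sched_eval_callback signature and copies the data to host first, since the tensor may live in a device buffer:

#include <cstdint>
#include <cstdio>
#include <vector>

#include "ggml.h"
#include "ggml-backend.h"

// sketch of a ggml_backend_sched_eval_callback: the scheduler first calls it with
// ask == true so we can opt in to a node, then with ask == false once the node has
// been computed and its data can be read
static bool debug_eval_cb(struct ggml_tensor * t, bool ask, void * user_data) {
    (void) user_data;

    if (ask) {
        return true; // observe every node
    }

    // the tensor may live in a device buffer, so copy it to host memory first
    const bool is_host = ggml_backend_buffer_is_host(t->buffer);

    std::vector<uint8_t> buf;
    if (!is_host) {
        buf.resize(ggml_nbytes(t));
        ggml_backend_tensor_get(t, buf.data(), 0, buf.size());
    }
    const char * data = is_host ? (const char *) t->data : (const char *) buf.data();

    printf("%s: %s (%s)\n", __func__, t->name, ggml_type_name(t->type));
    // ... iterate the elements with the nb[]-based loop above ...
    (void) data;

    return true;
}

In llama.cpp the callback is attached through the cb_eval / cb_eval_user_data fields of the context params, and this PR routes them through the common gpt_params as well (per the commit log below).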
Looks better, cool!
@slaren Could you have a quick look at this when you have time, please?
@slaren @ggerganov Can I merge?
Nice tool! I've found that also printing the total sum of the elements in the tensor is sometimes a useful metric to look at when debugging.
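For what it's worth, a minimal sketch of that, reusing the t / data variables from the loop earlier in the thread and assuming an F32 tensor:

// accumulate the total in double to limit rounding error, then print it next to the tensor name
double sum = 0.0;
for (int64_t i3 = 0; i3 < t->ne[3]; i3++) {
    for (int64_t i2 = 0; i2 < t->ne[2]; i2++) {
        for (int64_t i1 = 0; i1 < t->ne[1]; i1++) {
            for (int64_t i0 = 0; i0 < t->ne[0]; i0++) {
                size_t i = i3*t->nb[3] + i2*t->nb[2] + i1*t->nb[1] + i0*t->nb[0];
                sum += *(const float *)(data + i);
            }
        }
    }
}
printf("%s: sum = %f\n", t->name, sum);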
Merge after slaren approves
Not a big deal, but I think that
}

printf("%s: %24s = (%s) %10s(%s{%s}, %s}) = {%s}\n", __func__,
       t->name, ggml_type_name(t->type), ggml_op_name(t->op),
I forgot to mention that you can use ggml_op_desc instead of ggml_op_name to get proper names for unary ops too, instead of just UNARY.
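For illustration, a small sketch of the swap; as I read the ggml API, ggml_op_desc takes the tensor itself and resolves the concrete unary op, whereas ggml_op_name only sees the generic op enum:

#include <cstdio>
#include "ggml.h"

// ggml_op_name(t->op) prints "UNARY" for every unary op;
// ggml_op_desc(t) resolves the actual op, e.g. "SILU" or "GELU"
static void print_node(const struct ggml_tensor * t) {
    printf("%24s = (%s) %10s(...)\n", t->name, ggml_type_name(t->type), ggml_op_desc(t));
}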
I'm not sure, but it looks like this PR is behind the currently failing CI builds on OSX, and I'm not currently smart enough to figure out why. Example: https://github.com/ggerganov/llama.cpp/actions/runs/8651988935/job/23723903645#step:5:4773 Also, I could be wrong, but this might have snuck a libcurl dependency in at a level that we aren't comfortable with, but again, I don't have a full handle on this yet either.
Maybe adding
…nov#6576)

* gguf-debug: Example how to use ggml callback for debugging
* gguf-debug: no mutex, verify type, fix stride.
* llama: cv eval: move cb eval field in common gpt_params
* ggml_debug: use common gpt_params to pass cb eval. Fix get tensor SIGV random.
* ggml_debug: ci: add tests
* ggml_debug: EOL in CMakeLists.txt
* ggml_debug: Remove unused param n_batch, no batching here
* ggml_debug: fix trailing spaces
* ggml_debug: fix trailing spaces
* common: fix cb_eval and user data not initialized
* ci: build revert label
* ggml_debug: add main test label
* doc: add a model: add a link to ggml-debug
* ggml-debug: add to make toolchain
* ggml-debug: tests add the main label
* ggml-debug: ci add test curl label
* common: allow the warmup to be disabled in llama_init_from_gpt_params
* ci: add curl test
* ggml-debug: better tensor type support
* gitignore : ggml-debug
* ggml-debug: printing also the sum of each tensor
* ggml-debug: remove block size
* eval-callback: renamed from ggml-debug
* eval-callback: fix make toolchain

---------

Co-authored-by: slaren <slarengh@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Motivation
It can be useful to debug the inference graph, for example to compare it against the original PyTorch implementation.
It can also be necessary to retrieve intermediate tensor data, for example for imatrix or more advanced NLP use cases.
This example shows how to use the inference callback.
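As a rough sketch of how the wiring looks on the example's side (field names taken from the commit log above; treat the exact API as an assumption rather than the final code):

#include "common.h"
#include "llama.h"

#include <tuple>

int main(int argc, char ** argv) {
    gpt_params params;
    if (!gpt_params_parse(argc, argv, params)) {
        return 1;
    }

    // hook the eval callback into the common init path (cb_eval fields added by this PR)
    params.cb_eval           = debug_eval_cb; // callback as sketched in the review thread above
    params.cb_eval_user_data = nullptr;
    params.warmup            = false;         // the PR also allows disabling the warmup run

    llama_backend_init();

    llama_model   * model = nullptr;
    llama_context * ctx   = nullptr;
    std::tie(model, ctx) = llama_init_from_gpt_params(params);

    // ... evaluate a prompt as usual; the callback fires for every graph node ...

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}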
Suggestions from @slaren, thanks:
DbrxForCausalLM
#6515 (comment)
Changes
New example ggml-debug that prints each operation and outputs tensors' data.
Example
Output