-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ntt cpu #533
ntt cpu #533
Conversation
e7add84
to
ba7d500
Compare
847ca3f
to
eb79777
Compare
600f4e2
to
707c474
Compare
82a9534
to
fd29c06
Compare
icicle_v3/tests/test_field_api.cpp
Outdated
auto out_main = std::make_unique<TypeParam[]>(N); | ||
auto out_ref = std::make_unique<TypeParam[]>(N); | ||
// // Randomize config | ||
// TODO - iterate over different configs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when are you going to do it? I think it's better than random since we know what we check. Otherwise CI can pass sometimes and fail in others, based on the randomized case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update: since rust tests are covering many cases, this test can doesn't have to as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
icicle_v3/tests/test_field_api.cpp
Outdated
const NTTDir dir = NTTDir::kForward; | ||
|
||
// Print config | ||
ICICLE_LOG_DEBUG << "logn: " << logn; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when done, please remove those logs.
icicle_v3/tests/test_field_api.cpp
Outdated
config.batch_size = batch_size; // default: 1 | ||
config.columns_batch = columns_batch; // default: false | ||
config.ordering = ordering; // default: kNN | ||
config.are_inputs_on_device = true; // TODO, ask yuval why set to true? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for CPU it doesn't matter but for devices with own memory, like GPU/ZPU/FPGA, we need to know where the data resides.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
REGISTER_NTT_INIT_DOMAIN_BACKEND("CPU", (cpu_ntt_init_domain)); | ||
REGISTER_NTT_INIT_DOMAIN_BACKEND("CPU_REF", (cpu_ntt_init_domain)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do CPU and CPU_REF share the domain or will it be allocated for each?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we talked, I left it like this.
|
||
#ifdef EXT_FIELD | ||
REGISTER_NTT_EXT_FIELD_BACKEND("CPU", (cpu_ntt<scalar_t, extension_t>)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably this case also needs to be registered for CPU_REF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
|
||
using namespace field_config; | ||
using namespace icicle; | ||
namespace ntt_template { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also rename the namespace to maybe cpu_ntt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe "cpu_ntt_h" or something else? "cpu_ntt" creates errors in cpu_ntt.cpp
// calculate twiddles | ||
// Note: radix-2 INTT needs ONE in last element (in addition to first element), therefore have n+1 elements | ||
|
||
std::unique_ptr<S[]> temp_twiddles(new S[s_ntt_domain.max_size + 1]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add comment why you need the temp twiddles. It's because otherwise twiddles are not nullptr but still not computed right?
BTW you'd better use std::make_unique<S[]>(max_size+1) instead of new.
1979bb0
to
5cc8257
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall looks good but I wrote a few comments
NTTDir dir = NTTDir::kForward, | ||
bool columns_batch = false) | ||
{ | ||
int size = 1 << logn; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that this can potentially overflow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why? isn't logn maximum 30?
template <typename S = scalar_t, typename E = scalar_t> | ||
eIcicleError cpu_ntt_ref(const Device& device, const E* input, int size, NTTDir dir, NTTConfig<S>& config, E* output) | ||
{ | ||
if (size & (size - 1)) { return eIcicleError::INVALID_ARGUMENT; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use ICICLE_LOG_ERROR to log errors to user in case of invalid inputs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
temp_cosets[i] = coset; | ||
} | ||
arbitrary_coset = temp_cosets.get(); | ||
temp_cosets.release(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
release?
Why don't you allocate the arbitrary coset directly? as a unique_ptr I mean
break; | ||
} | ||
} | ||
if (coset_stride == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add comments explaining what is being done in every step. It would make the code easier to understand
|
||
// NTT/INTT | ||
if (dit) { | ||
ICICLE_LOG_DEBUG << "DIT"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
either print a clear message or remove
if (dir == NTTDir::kInverse) { | ||
// Normalize results | ||
S inv_size = S::inv_log_size(logn); | ||
ICICLE_LOG_DEBUG << "inv_size: " << inv_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here too
dba7352
to
2e73477
Compare
b351426
to
4473fc3
Compare
9251bd1
to
fb4e977
Compare
fb4e977
to
38a68a0
Compare
No description provided.