[NPUW] Weightless serialization #28469

Open. smirnov-alexey wants to merge 35 commits into master.


Contributor

@smirnov-alexey smirnov-alexey commented Jan 15, 2025

@smirnov-alexey smirnov-alexey changed the title WIP: [DO NOT MERGE][NPUW] Weightless serialization [NPUW] Weightless serialization Jan 20, 2025
}

// Weightless
// FIXME: all serialization needs a good rewriting
Contributor

Agreed. As a first step, we might need to distinguish the different bool flags by creating named constants and writing those instead of bare false and true.
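
One way to read this suggestion, as a minimal sketch: give each serialized boolean a name so the write site documents itself. The `WeightsMode` enum and the `write_flag`/`read_flag` helpers are hypothetical and are not taken from the NPUW sources; plain `constexpr bool` constants would serve the same purpose.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical flag type: a scoped enum makes the meaning visible at the call
// site and stops an unrelated bool from being passed by accident.
enum class WeightsMode : std::uint8_t { Embedded = 0, Weightless = 1 };

// Hypothetical helper: writes the flag as a single byte.
inline void write_flag(std::ostream& stream, WeightsMode mode) {
    stream.put(static_cast<char>(mode));
}

// Hypothetical helper: reads the byte back; anything other than 1 is treated
// as Embedded in this sketch.
inline WeightsMode read_flag(std::istream& stream) {
    const int byte = stream.get();
    return byte == 1 ? WeightsMode::Weightless : WeightsMode::Embedded;
}

// Round-trips a flag through an in-memory stream.
inline WeightsMode round_trip(WeightsMode mode) {
    std::stringstream stream;
    write_flag(stream, mode);  // reads better than write(stream, true)
    return read_flag(stream);
}
```

With this shape, a call such as `write_flag(stream, WeightsMode::Weightless)` states what is being stored without a comment next to it.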

// Identify either full flow or weightless
bool is_weightless = false;
if (m_non_llm_props.count(ov::cache_mode.name()) &&
    m_non_llm_props.at(ov::cache_mode.name()).as<std::string>() == "OPTIMIZE_SIZE") {
Contributor

Suggested change
- m_non_llm_props.at(ov::cache_mode.name()).as<std::string>() == "OPTIMIZE_SIZE") {
+ m_non_llm_props.at(ov::cache_mode.name()).as<CacheMode>() == CacheMode::OPTIMIZE_SIZE) {

BTW, for GPU the weightless cache is used in CacheMode::OPTIMIZE_SIZE, because in the general case it would degrade import_model performance.
Is it the same for NPU, or can you use the weightless cache in all cases?

Contributor Author

It's the same in our case. With the weightless flow, performance is worse.

Contributor Author

Fixed the code
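
For readers following the thread, the difference between the string comparison and the typed comparison can be sketched roughly as follows. This is a standalone illustration, not the plugin code: `CacheMode` and `Props` are local stand-ins built on `std::any`, and the `"CACHE_MODE"` key is an assumption about what the property name resolves to.

```cpp
#include <any>
#include <map>
#include <string>

// Stand-in for the real enum, defined here only so the sketch compiles alone.
enum class CacheMode { OPTIMIZE_SIZE, OPTIMIZE_SPEED };

// Stand-in for a property map that holds type-erased values.
using Props = std::map<std::string, std::any>;

// Returns true only when the cache mode property is present, holds a
// CacheMode value, and that value is OPTIMIZE_SIZE.
inline bool is_weightless(const Props& props) {
    const auto it = props.find("CACHE_MODE");  // assumed key name
    if (it == props.end()) {
        return false;  // property absent: full flow
    }
    // any_cast on a pointer yields nullptr on a type mismatch instead of throwing.
    const auto* mode = std::any_cast<CacheMode>(&it->second);
    return mode != nullptr && *mode == CacheMode::OPTIMIZE_SIZE;
}
```

Comparing enum values means a misspelled mode name fails at compile time, whereas a misspelled string literal would silently select the full flow.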

Contributor

@pereanub pereanub left a comment

You have some failing tests in NPU functional tests, please check them as well.

Also, I'm not sure that we need to remove the property from the compiler properties, given backward compatibility. @PatrikStepan, @csoka, any ideas? Later edit: or maybe we don't need this, since we don't register it in the plugin either.

Comment on lines 382 to 402
// FIXME: create proper op identificators instead of int
std::visit(overloaded{[&stream](const op::Concat& op) {
                          write(stream, int{0});
                          op.serialize(stream);
                      },
                      [&stream](const op::Const& op) {
                          write(stream, int{1});
                          op.serialize(stream);
                      },
                      [&stream](const op::Convert& op) {
                          write(stream, int{2});
                          op.serialize(stream);
                      },
                      [&stream](const op::Permute& op) {
                          write(stream, int{3});
                          op.serialize(stream);
                      },
                      [&stream](const op::Unpack& op) {
                          write(stream, int{4});
                          op.serialize(stream);
                      }},
Contributor

uhhh

Contributor Author

Oops, missed that

Contributor Author

Fixed
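
The thread does not show what the fix looked like. One plausible direction for the FIXME in the snippet, sketched under assumptions: replace the bare `int` literals with a scoped enum and derive the tag from the variant alternative. The empty op structs, `OpType`, `tag_of`, and `write_tag` below are hypothetical names used only for illustration, and only three of the five ops are modeled to keep the sketch short.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <type_traits>
#include <variant>

// Hypothetical, empty stand-ins for the op types in the reviewed snippet.
struct Concat {};
struct Const {};
struct Convert {};

// Hypothetical on-disk identifiers; explicit values keep the format stable
// even if the enumerators are reordered later.
enum class OpType : std::int32_t { Concat = 0, Const = 1, Convert = 2 };

using Transform = std::variant<Concat, Const, Convert>;

// Maps each alternative to its tag at compile time.
template <typename Op>
constexpr OpType tag_of() {
    if constexpr (std::is_same_v<Op, Concat>) {
        return OpType::Concat;
    } else if constexpr (std::is_same_v<Op, Const>) {
        return OpType::Const;
    } else {
        static_assert(std::is_same_v<Op, Convert>, "unknown op type");
        return OpType::Convert;
    }
}

// Returns the tag of whichever alternative the variant currently holds.
inline OpType tag(const Transform& t) {
    return std::visit([](const auto& op) { return tag_of<std::decay_t<decltype(op)>>(); }, t);
}

// Writes the tag as a fixed-width integer in host byte order.
inline void write_tag(std::ostream& stream, const Transform& t) {
    const auto raw = static_cast<std::int32_t>(tag(t));
    stream.write(reinterpret_cast<const char*>(&raw), sizeof(raw));
}
```

A single generic lambda replaces the five near-identical overloads, and adding an op without a tag becomes a compile error rather than a silent gap in the format.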

Comment on lines 412 to 430
switch (op_type) {
case 0:
    lt_impl->m_transform = op::Concat::deserialize(stream);
    break;
case 1:
    lt_impl->m_transform = op::Const::deserialize(stream);
    break;
case 2:
    lt_impl->m_transform = op::Convert::deserialize(stream);
    break;
case 3:
    lt_impl->m_transform = op::Permute::deserialize(stream);
    break;
case 4:
    lt_impl->m_transform = op::Unpack::deserialize(stream);
    break;
default:
    NPUW_ASSERT(false && "Unsupported type");
    break;
Contributor

ohh

Contributor Author

Fixed
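
Again, the actual fix is not shown here. On the read side, the magic numbers in the `switch` could be replaced by named enumerators, with the raw value validated before use. The sketch below is an assumption about one way to do that: `OpType` and `read_op_type` are hypothetical, and throwing `std::runtime_error` is a substitution for the `NPUW_ASSERT` in the snippet, chosen so the example stays self-contained.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical identifiers mirroring the integer values in the reviewed snippet.
enum class OpType : std::int32_t { Concat = 0, Const = 1, Convert = 2, Permute = 3, Unpack = 4 };

// Reads a raw tag from the stream and returns it only if it names a known op.
inline OpType read_op_type(std::istream& stream) {
    std::int32_t raw = 0;
    stream.read(reinterpret_cast<char*>(&raw), sizeof(raw));
    if (!stream) {
        throw std::runtime_error("Truncated stream while reading op type");
    }
    switch (static_cast<OpType>(raw)) {
    case OpType::Concat:
    case OpType::Const:
    case OpType::Convert:
    case OpType::Permute:
    case OpType::Unpack:
        return static_cast<OpType>(raw);
    }
    // Reached only for values that match no enumerator.
    throw std::runtime_error("Unsupported op type: " + std::to_string(raw));
}
```

The caller can then dispatch with `case OpType::Concat:` and so on, and most compilers will warn when a newly added enumerator is not handled.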

Labels: category: NPU (OpenVINO NPU plugin), category: NPUW (NPUW plugin)
6 participants