Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring Storage #12119

Open
wants to merge 35 commits into
base: develop
Choose a base branch
from
Open

Refactoring Storage #12119

wants to merge 35 commits into from

Conversation

jdunkerley
Copy link
Member

Pull Request Description

Important Notes

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the
    Scala,
    Java,
    TypeScript,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • Unit tests have been written where possible.
  • If meaningful changes were made to logic or tests affecting Enso Cloud integration in the libraries,
    or the Snowflake database integration, a run of the Extra Tests has been scheduled.
    • If applicable, it is suggested to paste a link to a successful run of the Extra Tests.

@jdunkerley jdunkerley added the CI: No changelog needed Do not require a changelog entry for this PR. label Jan 25, 2025
@jdunkerley jdunkerley marked this pull request as ready for review January 26, 2025 21:58
Comment on lines +95 to +103
} else if (storage instanceof ColumnDoubleStorage doubleStorage) {
long n = doubleStorage.getSize();
for (long i = 0; i < n; i++) {
if (storage.isNothing(i)) {
appendNulls(1);
} else {
appendDouble(doubleStorage.getItemAsDouble(i));
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About these: I'm a little bit slightly worried that because currently all (most?) storages are the 'base' e.g. DoubleStorage, this more 'abstract' variant will rarely run and so is not covered that much by tests.

This is a recurring problem that there are often more edge cases in the implementation of these low-level operations than seems feasible to have high-level tests (especially also ones shared with DB which are often slower).

Not for this PR but perhaps we need to add some 'unit' tests to std-table that test the internals more, to reinforce our correctness here?

Copy link
Member Author

@jdunkerley jdunkerley Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No DoubleStorage is a ColumnDoubleStorage so this will run.
All the typed storages are their specific ColumnStorage.

  • TypedStorage<T> => ColumnStorage<T>
  • BooleanStorage => ColumnStorage<Boolean> and ColumnBooleanStorage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on some tests on the builders to check we get the correct interfaces out makes sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No DoubleStorage is a ColumnDoubleStorage so this will run. All the typed storages are their specific ColumnStorage.

No, because above it is a more specific branch that has a specialization for DoubleStorage specifically. Which is good. But that means that specialization above runs, not this generic code. That was my point above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok - sorry yes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It felt sensible to add these in at this point but will revisit as pull operations out one by one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These will be used pretty quickly as ConstantColumnDouble would be a ColumnDoubleStorage for example.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should have some java unit tests for these type of internals. I had a similar problem when I was looking to change the expression lanaguage.

Comment on lines 46 to 50
@Override
protected SpecializedStorage<String> newInstance(String[] data, int size) {
return new StringStorage(data, size, type);
public TextType getType() {
// As the type is fixed, we can safely cast it.
return (TextType) super.getType();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm tempted to make the Type also part of the generic arguments of SpecializedStorage to avoid these casts.

Please treat as a very optional suggestion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: No changelog needed Do not require a changelog entry for this PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants