Refactoring Storage #12119
base: develop
df39a40 to b2d2195
Replace getItemAsObject with getBoxed.
Still in progress.
Left `AddGroupNumber`, `AddRowNumber` and `AddRunning`.
Remove last usage of `getItem`. Fix so can build.
Working on size still.
Still to rename back getBoxed and getPrimitive. Failing 10 tests. Rebase next.
66e54f1 to f95c03a
std-bits/table/src/main/java/org/enso/table/parsing/TypeInferringParser.java
getItemAsLong and getItemAsBoolean.
std-bits/table/src/main/java/org/enso/table/data/column/builder/Builder.java
} else if (storage instanceof ColumnDoubleStorage doubleStorage) {
  long n = doubleStorage.getSize();
  for (long i = 0; i < n; i++) {
    if (storage.isNothing(i)) {
      appendNulls(1);
    } else {
      appendDouble(doubleStorage.getItemAsDouble(i));
    }
  }
About these: I'm slightly worried that, because currently all (most?) storages are the 'base' ones, e.g. `DoubleStorage`, this more 'abstract' variant will rarely run and so is not covered much by tests.
This is a recurring problem: there are often more edge cases in the implementation of these low-level operations than it seems feasible to cover with high-level tests (especially ones shared with DB, which are often slower).
Not for this PR, but perhaps we need to add some 'unit' tests to `std-table` that test the internals more, to reinforce our correctness here?
No, `DoubleStorage` is a `ColumnDoubleStorage`, so this will run.
All the typed storages are their specific `ColumnStorage`:
`TypedStorage<T>` => `ColumnStorage<T>`
`BooleanStorage` => `ColumnStorage<Boolean>` and `ColumnBooleanStorage`.
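To make the relationship described above concrete, here is a minimal sketch. It is not the PR's code: the interface members are reduced to the methods that appear in this thread (`getSize`, `isNothing`, `getItemAsDouble`, `getItemAsBoolean`), and the assumption that the specialised interfaces extend `ColumnStorage<T>` is mine.

```java
final class StorageHierarchySketch {
  // Simplified stand-ins for the interfaces named in the thread.
  // Their members and the `extends` relationships are assumptions.
  interface ColumnStorage<T> {
    long getSize();

    boolean isNothing(long index);
  }

  interface ColumnDoubleStorage extends ColumnStorage<Double> {
    double getItemAsDouble(long index);
  }

  interface ColumnBooleanStorage extends ColumnStorage<Boolean> {
    boolean getItemAsBoolean(long index);
  }

  // A concrete typed storage is also its specific Column*Storage.
  static final class DoubleStorage implements ColumnDoubleStorage {
    private final double[] data;

    DoubleStorage(double... data) {
      this.data = data;
    }

    @Override
    public long getSize() {
      return data.length;
    }

    @Override
    public boolean isNothing(long index) {
      return false; // this sketch has no missing values
    }

    @Override
    public double getItemAsDouble(long index) {
      return data[(int) index];
    }
  }

  static final class BooleanStorage implements ColumnBooleanStorage {
    private final boolean[] data;

    BooleanStorage(boolean... data) {
      this.data = data;
    }

    @Override
    public long getSize() {
      return data.length;
    }

    @Override
    public boolean isNothing(long index) {
      return false; // this sketch has no missing values
    }

    @Override
    public boolean getItemAsBoolean(long index) {
      return data[(int) index];
    }
  }

  /** True when the value can be handled through the double-specific interface. */
  static boolean isDoubleColumn(Object storage) {
    return storage instanceof ColumnDoubleStorage;
  }

  public static void main(String[] args) {
    Object storage = new DoubleStorage(1.0, 2.0);
    System.out.println(isDoubleColumn(storage));
    System.out.println(storage instanceof ColumnStorage<?>);
  }
}
```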
Agreed, some tests on the builders to check that we get the correct interfaces out make sense.
> No, `DoubleStorage` is a `ColumnDoubleStorage`, so this will run. All the typed storages are their specific `ColumnStorage`.

No, because above it there is a more specific branch that has a specialization for `DoubleStorage` specifically. Which is good. But that means that the specialization above runs, not this generic code. That was my point above.
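The point about branch order can be illustrated with a small sketch. It assumes a dispatch chain shaped like the one under review (a `DoubleStorage` specialization first, then the generic `ColumnDoubleStorage` branch); `ComputedDoubles` and `branchFor` are hypothetical names introduced only for this illustration.

```java
final class DispatchOrderSketch {
  // Reduced stand-in for the interface used in the generic branch.
  interface ColumnDoubleStorage {
    long getSize();

    boolean isNothing(long index);

    double getItemAsDouble(long index);
  }

  /** Stand-in for the concrete 'base' storage backed by an array. */
  static final class DoubleStorage implements ColumnDoubleStorage {
    private final double[] data;

    DoubleStorage(double... data) {
      this.data = data;
    }

    @Override
    public long getSize() {
      return data.length;
    }

    @Override
    public boolean isNothing(long index) {
      return false;
    }

    @Override
    public double getItemAsDouble(long index) {
      return data[(int) index];
    }
  }

  /** Hypothetical non-base implementation, present only to reach the generic branch. */
  static final class ComputedDoubles implements ColumnDoubleStorage {
    private final long size;

    ComputedDoubles(long size) {
      this.size = size;
    }

    @Override
    public long getSize() {
      return size;
    }

    @Override
    public boolean isNothing(long index) {
      return false;
    }

    @Override
    public double getItemAsDouble(long index) {
      return index * 0.5;
    }
  }

  /** Reports which branch of an if/else-if chain a given storage would take. */
  static String branchFor(Object storage) {
    if (storage instanceof DoubleStorage) {
      // The more specific check comes first, so the base storage never falls through.
      return "specialized";
    } else if (storage instanceof ColumnDoubleStorage) {
      return "generic";
    }
    return "unsupported";
  }

  public static void main(String[] args) {
    System.out.println(branchFor(new DoubleStorage(1.0, 2.0)));
    System.out.println(branchFor(new ComputedDoubles(3)));
  }
}
```

Under these assumptions, a test that only ever builds `DoubleStorage` instances never executes the generic branch, which is the coverage gap raised in the thread.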
Ah ok - sorry yes.
It felt sensible to add these in at this point, but I will revisit as I pull operations out one by one.
These will be used pretty quickly, as `ConstantColumnDouble` would be a `ColumnDoubleStorage`, for example.
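As a rough illustration of that kind of storage, the sketch below shows a constant column consumed through a generic loop similar to the `Builder` branch quoted earlier. The shape of `ConstantColumnDouble` (a value plus a size) and the `appendAll` helper are assumptions made for this example, not the planned implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class ConstantColumnSketch {
  // Reduced stand-in for the interface used in the generic branch.
  interface ColumnDoubleStorage {
    long getSize();

    boolean isNothing(long index);

    double getItemAsDouble(long index);
  }

  /** Assumed shape: one value repeated `size` times, with no backing array. */
  static final class ConstantColumnDouble implements ColumnDoubleStorage {
    private final double value;
    private final long size;

    ConstantColumnDouble(double value, long size) {
      this.value = value;
      this.size = size;
    }

    @Override
    public long getSize() {
      return size;
    }

    @Override
    public boolean isNothing(long index) {
      return false; // a constant double column has no missing entries in this sketch
    }

    @Override
    public double getItemAsDouble(long index) {
      if (index < 0 || index >= size) {
        throw new IndexOutOfBoundsException("Index: " + index);
      }
      return value;
    }
  }

  /** Hypothetical helper that copies any ColumnDoubleStorage, using null for missing values. */
  static List<Double> appendAll(ColumnDoubleStorage storage) {
    List<Double> out = new ArrayList<>();
    long n = storage.getSize();
    for (long i = 0; i < n; i++) {
      if (storage.isNothing(i)) {
        out.add(null);
      } else {
        out.add(storage.getItemAsDouble(i));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(appendAll(new ConstantColumnDouble(2.5, 3)));
  }
}
```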
I agree that we should have some Java unit tests for these types of internals. I had a similar problem when I was looking to change the expression language.
@Override
protected SpecializedStorage<String> newInstance(String[] data, int size) {
  return new StringStorage(data, size, type);
public TextType getType() {
  // As the type is fixed, we can safely cast it.
  return (TextType) super.getType();
}
I'm tempted to make the `Type` also part of the generic arguments of `SpecializedStorage` to avoid these casts.
Please treat this as a very optional suggestion.
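A minimal sketch of what that suggestion could look like, assuming a second type parameter bounded by a common storage-type interface. `StorageType`, the single `maxLength` field on `TextType`, and the constructor shapes are simplified stand-ins rather than the real classes.

```java
final class TypedStorageSketch {
  /** Stand-in for the common supertype of column value types. */
  interface StorageType {}

  /** Simplified stand-in; the real TextType may carry different fields. */
  static final class TextType implements StorageType {
    final long maxLength;

    TextType(long maxLength) {
      this.maxLength = maxLength;
    }
  }

  /** The value type S is a generic argument, so getType() needs no cast in subclasses. */
  abstract static class SpecializedStorage<T, S extends StorageType> {
    private final T[] data;
    private final S type;

    SpecializedStorage(T[] data, S type) {
      this.data = data;
      this.type = type;
    }

    S getType() {
      return type;
    }

    int getSize() {
      return data.length;
    }
  }

  static final class StringStorage extends SpecializedStorage<String, TextType> {
    StringStorage(String[] data, TextType type) {
      super(data, type);
    }
    // No getType() override is needed: the inherited method already returns TextType.
  }

  public static void main(String[] args) {
    StringStorage storage = new StringStorage(new String[] {"a", "b"}, new TextType(10));
    TextType type = storage.getType(); // compiles without a cast
    System.out.println(type.maxLength);
  }
}
```

The trade-off is that every subclass and every use site of `SpecializedStorage` then has to name the extra type argument, which may be why the reviewer flags it as optional.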
Pull Request Description
Important Notes
Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala, Java, TypeScript, and Rust style guides. In case you are using a language not listed above, follow the Rust style guide.
or the Snowflake database integration, a run of the Extra Tests has been scheduled.