Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[c++] Hide internal dimensions #3266

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

XanthosXanthopoulos
Copy link
Collaborator

@XanthosXanthopoulos XanthosXanthopoulos commented Oct 31, 2024

Geometry dataframe creates some extra internal dimensions to support spatial indexing. When returning back the index column names and arrow schema these dimensions (being implementation-specific) should be hidden and replaces with soma_geometry as index column.

Additionally setting a range for spatial axes should only need the axis name and not the internal index names.

Comment on lines 247 to 248
std::shared_ptr<Context> ctx,
std::shared_ptr<Array> tiledb_array);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you can change the implementation so these are passed as const references instead of pointers without too much pain.

Copy link
Member

@johnkerl johnkerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please include in this PR's description some copy/paste of:

  • Example use of how the user might keystroke out the index_column_names and domain at create
    • Either for SOMAGeometryDataFrame.create, and/or
    • For an ingestion step at the experiment level
  • What domain might look like when they read it back

Also:

libtiledbsoma/src/utils/arrow_adapter.h Outdated Show resolved Hide resolved
libtiledbsoma/src/utils/arrow_adapter.h Outdated Show resolved Hide resolved
Base automatically changed from xan/geometry_cast_util to main November 17, 2024 23:22
Copy link

codecov bot commented Nov 18, 2024

Codecov Report

Attention: Patch coverage is 11.33333% with 133 lines in your changes missing coverage. Please review.

Project coverage is 66.83%. Comparing base (946d3f7) to head (87cef7c).

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #3266       +/-   ##
===========================================
- Coverage   85.14%   66.83%   -18.32%     
===========================================
  Files          53      138       +85     
  Lines        5568    17689    +12121     
  Branches        0     1108     +1108     
===========================================
+ Hits         4741    11822     +7081     
- Misses        827     5484     +4657     
- Partials        0      383      +383     
Flag Coverage Δ
libtiledbsoma 58.36% <11.33%> (?)
python 85.25% <ø> (+0.10%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
python_api 85.25% <ø> (+0.10%) ⬆️
libtiledbsoma 53.17% <10.13%> (∅)


// Both min and max dimension share the same domain
if (ArraySchemaExperimental::current_domain(
*this->ctx()->tiledb_ctx(), *this->tiledb_schema())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You reference both of these throughout

*this->ctx()->tiledb_ctx(), *this->tiledb_schema())

Please make an

auto ctx = this->ctx()->tiledb_ctx();

and

auto schema = this->tiledb_schema();

*
* @return ArrowSchema
*/
static std::unique_ptr<ArrowSchema> arrow_schema_from_tiledb_dimension(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish you had made these helpers as a separate PR. As an ongoing ask I am continuing to ask for split-out PRs.

This PR is only 300 lines, which isn't a lot, but, all the codecov warnings are making this hard to read -- as if it had been a high-line-count PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ask you to split this not due to high line count but because you've got two different things going on. Pulling out helper functions is distinct from the goal of the PR (hiding internal dimensions) and is worth a separate PR. Then you can stack the dimension-hiding on top of it as a second PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically, please split out a PR with only the arrow_adapter.h and arrow_adapter.cc mods. That, I will understand quickly, and will be able to review quickly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants