Replace `dump` functions with `operator<<` overloads #5026

kounelisagis · 2024-05-30T09:36:35Z

In TileDB-Py, schema.dump() is called without any argument, so it writes to C stdout, which Jupyter does not capture:

TileDB/tiledb/sm/array_schema/array_schema.cc

Lines 693 to 694 in 76dbda4

    
    if (out == nullptr)
      out = stdout;

Implementing a function that captures the output of dump() in a string, which Python then prints to sys.stdout, will fix the problem. Of course, ArraySchema::dump(std::string*) calls the dump() functions of other classes, so those need string-capturing versions as well.

But we don't even want to write functions called dump internally; that's a C idiom; we will reserve that term for the C API only. Since we're dealing with text output, everything here can be done better by overloading operator<< for std::ostream.

TYPE: IMPROVEMENT
DESC: Output of schema.dump() in TileDB-Py is not captured by Jupyter. Replacing dump functions with operator<< overloads makes it possible to print the resulting string from Python.

tiledb/sm/c_api/tiledb.h

tiledb/sm/array_schema/array_schema.h

eric-hughes-tiledb

All these functions that implement a "dump" should be implemented as nonmember functions. This is in part to ensure correctness. There's nothing in such a function for a schema that should ever need to rely on private member variables or private member functions. Using nonmember functions ensures that those private members can't be used.
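As a rough illustration of the nonmember approach, the sketch below prints a class through its public accessors only. `ToySchema` and its accessors are hypothetical stand-ins invented for this example, not TileDB's `ArraySchema`.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a schema class; not TileDB's ArraySchema.
class ToySchema {
 public:
  ToySchema(std::string array_type, std::vector<std::string> attributes)
      : array_type_(std::move(array_type))
      , attributes_(std::move(attributes)) {
  }

  const std::string& array_type() const {
    return array_type_;
  }
  const std::vector<std::string>& attributes() const {
    return attributes_;
  }

 private:
  std::string array_type_;
  std::vector<std::string> attributes_;
};

// Nonmember overload: it is not a friend, so it can only reach the public
// accessors. Any accidental use of private state fails to compile.
std::ostream& operator<<(std::ostream& os, const ToySchema& schema) {
  os << "- Array type: " << schema.array_type() << '\n';
  for (const auto& name : schema.attributes()) {
    os << "- Attribute: " << name << '\n';
  }
  return os;
}
```

Because the overload takes any `std::ostream`, the same code can write to `std::cout`, a file stream, or a `std::ostringstream` without change.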

For historical reasons, our serialization and deserialization functions, both for storage and network, are implemented as member functions, but it would be better if they were moved out themselves. That's out of scope of this PR, but all the new code going in can be done this way.

We don't even want to write functions called dump internally; that's a C idiom; we will reserve that term for the C API only. Since we're dealing with text output, everything here can be done better by overloading operator<< for std::ostream. We don't want functions that return strings.

Changes required:

No new dump functions.
New overloads of operator<<.
Implement C API functions with stringstream and fstream internally.
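One possible shape for the third item is sketched below: a C-style entry point that builds the text with a `std::ostringstream` and hands a copy to the caller. The names `toy_schema_t` and `toy_schema_dump_str`, the `malloc`-based ownership, and the integer return codes are assumptions made for this example; they are not the signatures of the real TileDB C API.

```cpp
#include <cstdlib>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical handle type standing in for a C API object.
struct toy_schema_t {
  std::string name;
  int rank;
};

// The class-specific formatting lives in operator<< only.
std::ostream& operator<<(std::ostream& os, const toy_schema_t& schema) {
  return os << "- Name: " << schema.name << '\n'
            << "- Rank: " << schema.rank << '\n';
}

// Hypothetical C-style function (a real API would declare it extern "C").
// On success, stores a malloc'd, NUL-terminated copy of the text in *out
// and returns 0; the caller releases it with free(). Returns -1 on error.
int toy_schema_dump_str(const toy_schema_t* schema, char** out) {
  if (schema == nullptr || out == nullptr) {
    return -1;
  }
  std::ostringstream ss;
  ss << *schema;
  if (!ss) {
    return -1;
  }
  const std::string text = ss.str();
  char* copy = static_cast<char*>(std::malloc(text.size() + 1));
  if (copy == nullptr) {
    return -1;
  }
  std::memcpy(copy, text.c_str(), text.size() + 1);
  *out = copy;
  return 0;
}
```

The interface function contains no formatting logic of its own, so a binding such as TileDB-Py could print the returned text through its own `sys.stdout`.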

For a quick reference for the right function signatures, see the section "Stream Insertion and Extraction" in https://stackoverflow.com/questions/4421706/what-are-the-basic-rules-and-idioms-for-operator-overloading.

tiledb/sm/array_schema/attribute.h

tiledb/sm/array_schema/array_schema.h

eric-hughes-tiledb

The various T::dump(FILE *) functions should all be inlined into their call sites in the C API implementation functions. The class-specific parts are now in operator<< functions and all that's remaining in these is a bit of interface code. Also, in all of these functions fwrite is better than fprintf.

tiledb/sm/array_schema/array_schema.cc

tiledb/sm/array_schema/attribute.h

tiledb/sm/array_schema/array_schema.cc

tiledb/sm/array_schema/array_schema.h

tiledb/sm/array_schema/attribute.cc

eric-hughes-tiledb

Even without dealing with the issues with Filter, there are a number of mostly-small changes to be made.

For class Filter, it looks like the following would work:

1. A single definition of operator<< for filters.
2. A pure virtual function Filter::output that (1) calls. This should be protected.
3. An override of Filter::output in each class derived from Filter. Also protected.
4. A friend declaration for operator<< so that Filter::output is available.
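The four points above can be sketched as follows. The namespace `toy` and the derived classes `CompressionFilter` and `NoopFilter` are invented for this example, and the `output(std::ostream&)` signature is an assumption; the actual signature in the PR may differ.

```cpp
#include <ostream>
#include <sstream>
#include <string>

namespace toy {

class Filter;

// (1) A single definition of operator<< serves every filter type.
std::ostream& operator<<(std::ostream& os, const Filter& filter);

class Filter {
 public:
  virtual ~Filter() = default;

 protected:
  // (2) Pure virtual hook that operator<< calls; protected, so it is not
  // part of the public interface.
  virtual void output(std::ostream& os) const = 0;

 private:
  // (4) Friendship gives operator<< access to the protected hook.
  friend std::ostream& operator<<(std::ostream& os, const Filter& filter);
};

// (3) Each derived class overrides the hook, also as protected.
class CompressionFilter : public Filter {
 public:
  explicit CompressionFilter(int level)
      : level_(level) {
  }

 protected:
  void output(std::ostream& os) const override {
    os << "Compression: LEVEL=" << level_;
  }

 private:
  int level_;
};

class NoopFilter : public Filter {
 protected:
  void output(std::ostream& os) const override {
    os << "NoOp";
  }
};

std::ostream& operator<<(std::ostream& os, const Filter& filter) {
  filter.output(os);  // Virtual dispatch picks the derived override.
  return os;
}

}  // namespace toy
```

Streaming a `const toy::Filter&` that refers to a `CompressionFilter` reaches the derived `output` through virtual dispatch, so no per-class `operator<<` is needed.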

tiledb/sm/array_schema/attribute.cc

tiledb/sm/array_schema/dimension.cc

tiledb/sm/array_schema/dimension.h

tiledb/sm/array_schema/dimension_label.h

tiledb/sm/array_schema/domain.h

tiledb/sm/array_schema/enumeration.cc

tiledb/sm/c_api/tiledb.h

tiledb/sm/cpp_api/schema_base.h

kounelisagis · 2024-06-10T18:07:54Z

Even without dealing with the issues with Filter, there are a number of mostly-small changes to be made.

For class Filter, it looks like the following would work:

1. A single definition of operator<< for filters.
2. A pure virtual function Filter::output that (1) calls. This should be protected.
3. An override of Filter::output in each class derived from Filter. Also protected.
4. A friend declaration for operator<< so that Filter::output is available.

@eric-hughes-tiledb, making it a friend requires the operator<< to be in tiledb::sm, which forces all the other operator<< overloads into that namespace as well.

eric-hughes-tiledb · 2024-06-10T18:34:47Z

making it a friend causes the need to be in tiledb::sm

Friend declarations can be for namespace-qualified identifiers. friend ::something for the global namespace, for instance.
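A minimal sketch of a namespace-qualified friend declaration, assuming a hypothetical class `toy::Gadget` whose `operator<<` lives in the global namespace. A qualified friend declaration can only name a function that has already been declared, so the operator is declared before the class.

```cpp
#include <ostream>
#include <sstream>

namespace toy {
class Gadget;  // Forward declaration so the operator can be declared first.
}

// The operator is declared in the global namespace, outside toy.
std::ostream& operator<<(std::ostream& os, const toy::Gadget& gadget);

namespace toy {

class Gadget {
 public:
  explicit Gadget(int id)
      : id_(id) {
  }

 private:
  int id_;

  // Qualified friend declaration: befriends the global-namespace operator
  // rather than declaring a new toy::operator<<.
  friend std::ostream& ::operator<<(std::ostream& os, const Gadget& gadget);
};

}  // namespace toy

std::ostream& operator<<(std::ostream& os, const toy::Gadget& gadget) {
  return os << "Gadget#" << gadget.id_;  // Private access via friendship.
}
```

One trade-off to keep in mind: a global-namespace operator is not found by argument-dependent lookup on `toy::Gadget`, so code inside another namespace that declares its own `operator<<` may hide it.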

eric-hughes-tiledb

I won't approve this until ArraySchema::dump is gone, along with all of its ilk. Now that we have proper stream output, there's nothing left of value in these functions.

All the assert statements here have to go. If the need for [[maybe_unused]] wasn't clear enough, assert does not always check the return value of an I/O function. (It does so only in some compiles, but not all.) We always need to check the return value of I/O functions.
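To make the point about return values concrete, here is a small sketch of the interface code that remains once formatting has moved into `operator<<`: the text is built in a stream, written with `fwrite`, and the result is checked with ordinary control flow rather than `assert` (which compiles to nothing when `NDEBUG` is defined). The helper names `write_all` and `dump_to_file` are hypothetical.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical helper: writes all of `text` to `out` and reports whether
// every byte was written. A null `out` falls back to stdout.
bool write_all(const std::string& text, std::FILE* out) {
  if (out == nullptr) {
    out = stdout;
  }
  const std::size_t written = std::fwrite(text.data(), 1, text.size(), out);
  // This check runs in every build type, unlike an assert.
  return written == text.size();
}

// Hypothetical helper: formats any streamable value, then writes it out.
template <class T>
bool dump_to_file(const T& value, std::FILE* out) {
  std::ostringstream ss;
  ss << value;
  if (!ss) {
    return false;
  }
  return write_all(ss.str(), out);
}
```

`fwrite` writes the bytes as given, whereas passing the text as an `fprintf` format string would misinterpret any `%` characters it contains.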

The signature of Filter::output has to change. And we don't need Filter::dump any longer after this change; it and all its overrides can just go.

There are a number of places that use \n explicitly. These should be replaced with std::endl.

tiledb/sm/array_schema/array_schema.cc

tiledb/sm/array_schema/current_domain.cc

tiledb/sm/array_schema/dimension.cc

tiledb/sm/array_schema/ndrectangle.cc

tiledb/sm/array_schema/enumeration.cc

tiledb/sm/array_schema/attribute.cc

tiledb/sm/array_schema/dimension_label.cc

tiledb/sm/c_api/tiledb.h

tiledb/sm/filter/filter.h

tiledb/api/c_api/attribute/attribute_api.cc

tiledb/api/c_api/domain/domain_api.cc

eric-hughes-tiledb

I couldn't resolve all your suggestions

I have. The problem with the regression test seems to have been a preexisting defect; there was missing inclusion of the external dimension header.

I went ahead and fixed up all my last comments as well, since I already was building the branch locally.

LGTM.

The result is a solid improvement over the prior.

teo-tsirpanis

These APIs do not exist.

tiledb/doxygen/source/c-api.rst

tiledb/api/c_api/dimension/dimension_api.cc

tiledb/api/c_api/attribute/attribute_api.cc

teo-tsirpanis

Thanks!

[sc-32959] In TileDB-Py `schema.dump()` is called without any argument, leading writes to C `stdout` which Jupyter does not capture: https://github.com/TileDB-Inc/TileDB/blob/76dbda43d98bff7b71527fbffd0dfd657437b00f/tiledb/sm/array_schema/array_schema.cc#L693-L694 Implementing a function that captures the output of the `dump()` to a string and then prints it to `sys.stdout` will fix the problem. Of course, `ArraySchema::dump(std::string*)` calls the `dump()` functions of other classes, so those need to be implemented as well. **But** we don't even want to write functions called dump internally; that's a C idiom; we will reserve that term for the C API only. Since we're dealing with text output, everything here can be done better by overloading `operator<<` for `std::ostream`. --- TYPE: IMPROVEMENT DESC: Output of `schema.dump()` in TileDB-Py is not captured by Jupyter. Replacing `dump` functions with `operator<<` overloads will give the ability to print the resulted string from Python. --------- Co-authored-by: Eric Hughes <eric.hughes@tiledb.com> (cherry picked from commit f40530f)

[sc-32959] Following #5026, `dump` C++ APIs were mistakenly deleted. This PR reimplements them as deprecated. `operator<<` remains the canonical interface. --- TYPE: NO_HISTORY DESC: Reimplement `dump` C++ APIs as deprecated. --------- Co-authored-by: Luc Rancourt <lucrancourt@gmail.com> Co-authored-by: KiterLuc <67824247+KiterLuc@users.noreply.github.com> Co-authored-by: Theodore Tsirpanis <theodore.tsirpanis@tiledb.com>

…dump(FILE*)` with `operator<<` overload. (#5266) After #5026, the only `dump` API that remains without a string counterpart is on `FragmentInfo`. This PR adds the `tiledb_fragment_info_dump_str` C API and replaces `FragmentInfo::dump(FILE*)` with an `operator<<` overload. [sc-50605] --- TYPE: C_API | CPP_API DESC: Add `tiledb_fragment_info_dump_str` C API and replace `FragmentInfo::dump(FILE*)` with `operator<<` overload. --------- Co-authored-by: Theodore Tsirpanis <theodore.tsirpanis@tiledb.com>

Add dump functions

76dbda4

kounelisagis requested a review from ihnorton May 30, 2024 09:36

teo-tsirpanis reviewed May 30, 2024

tiledb/sm/c_api/tiledb.h (outdated)

Rename the API to the existing one

2c08b88

kounelisagis force-pushed the agis/add-dump-functions-with-str-argument branch from 8f974d8 to 2c08b88 Compare May 30, 2024 11:36

kounelisagis added 2 commits May 30, 2024 18:08

Remove duplicate code

2e50554

Merge dev

a8d2f26

kounelisagis marked this pull request as ready for review May 30, 2024 15:32

kounelisagis requested a review from teo-tsirpanis May 30, 2024 15:32

teo-tsirpanis reviewed May 30, 2024

tiledb/sm/array_schema/array_schema.h (outdated)

eric-hughes-tiledb suggested changes May 30, 2024

tiledb/sm/array_schema/attribute.h (outdated)

tiledb/sm/array_schema/array_schema.h (outdated)

eric-hughes-tiledb mentioned this pull request May 31, 2024

Migrate APIs out of StorageManager: store_array_schema. #5019

Merged

Change basic dump functions to operator<< overloads

f30a995

kounelisagis requested a review from eric-hughes-tiledb June 4, 2024 00:07

eric-hughes-tiledb reviewed Jun 5, 2024

eric-hughes-tiledb suggested changes Jun 5, 2024

TileDB-Inc deleted a comment from eric-hughes-tiledb Jun 10, 2024

Fix comments

67d7ed4

kounelisagis force-pushed the agis/add-dump-functions-with-str-argument branch from f5451fe to 67d7ed4 Compare June 11, 2024 00:01

kounelisagis requested a review from eric-hughes-tiledb June 11, 2024 00:01

kounelisagis added 2 commits June 11, 2024 03:40

Merge branch 'dev' into agis/add-dump-functions-with-str-argument

276c406

Fix merge errors

7baa8fd

kounelisagis force-pushed the agis/add-dump-functions-with-str-argument branch from 91bbef2 to 7baa8fd Compare June 11, 2024 01:05

kounelisagis and others added 3 commits June 11, 2024 15:34

Merge branch 'dev' into agis/add-dump-functions-with-str-argument

a8c64ee

Merge branch 'dev' into agis/add-dump-functions-with-str-argument

be6edf6

Fix windows

c0ebda3

eric-hughes-tiledb suggested changes Jun 14, 2024

eric-hughes-tiledb self-requested a review June 14, 2024 00:34

eric-hughes-tiledb reviewed Jul 3, 2024

tiledb/api/c_api/attribute/attribute_api.cc (outdated)

tiledb/api/c_api/domain/domain_api.cc (outdated)

eric-hughes-tiledb added 2 commits July 3, 2024 12:07

Fix a consequence of header changes.

0f8a70f

More of it

3d74fde

eric-hughes-tiledb approved these changes Jul 3, 2024

kounelisagis changed the title ~~Add dump functions with std::string* argument~~ Replace dump functions with operator<< overloads Jul 4, 2024

teo-tsirpanis reviewed Jul 4, 2024

tiledb/doxygen/source/c-api.rst (outdated)

tiledb/doxygen/source/c-api.rst (outdated)

kounelisagis force-pushed the agis/add-dump-functions-with-str-argument branch 2 times, most recently from 700a974 to e72a39c Compare July 4, 2024 09:37

kounelisagis requested a review from teo-tsirpanis July 4, 2024 11:10

teo-tsirpanis reviewed Jul 4, 2024

tiledb/api/c_api/dimension/dimension_api.cc

Document new dump_str APIs

7d71079

kounelisagis force-pushed the agis/add-dump-functions-with-str-argument branch from 0813e4b to 7d71079 Compare July 4, 2024 11:37

teo-tsirpanis reviewed Jul 4, 2024

tiledb/api/c_api/attribute/attribute_api.cc

Add tiledb_dimension_dump_str

bde9480

kounelisagis requested a review from teo-tsirpanis July 4, 2024 13:31

teo-tsirpanis approved these changes Jul 4, 2024

Merge branch 'dev' into agis/add-dump-functions-with-str-argument

44399c2

KiterLuc merged commit f40530f into dev Jul 9, 2024
62 checks passed

KiterLuc deleted the agis/add-dump-functions-with-str-argument branch July 9, 2024 18:07

kounelisagis mentioned this pull request Jul 9, 2024

Update HISTORY for release 0.32.0 against TileDB 2.26.0 and replace dump() calls with operator<< TileDB-Inc/TileDB-Py#1975

Merged

eddelbuettel mentioned this pull request Jul 10, 2024

Refactor three removed dump() calls into stringstream and << use TileDB-Inc/TileDB-R#727

Merged

kounelisagis mentioned this pull request Jul 11, 2024

Reimplement C++ dump APIs as deprecated. #5179

Merged

kounelisagis mentioned this pull request Aug 28, 2024

Add tiledb_fragment_info_dump_str C API and replace FragmentInfo::dump(FILE*) with operator<< overload. #5266

Merged

kounelisagis mentioned this pull request Sep 12, 2024

Make dump calls conditional to fix errors when building against <2.26 TileDB-Inc/TileDB-Py#2062

Merged
