Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

postgres: add boolean array #453

Merged
merged 2 commits into from
Jul 3, 2023

Conversation

Amar1729
Copy link
Contributor

Not sure I'm doing this in the right way, but this is my attempt to add support for BOOLEAN[] arrays ("_bool") to the postgres typesystem.

With these changes, I'm getting this error when I attempt to run the python postgres tests:

$ just test-python -k postgres -x

...
RuntimeError: No conversion rule from BoolArray(true) to connectorx::pandas::typesystem::PandasTypeSystem.

But I'm not sure how to fix that error. Any advice?

@wangxiaoying
Copy link
Contributor

Hi @Amar1729 , thanks for the PR!

To resolve this error, you need to add the conversion from Postgres::BooleanArray to PandasTypeSystem::BooleanArray in this file. Basically add a line like this:

{ BoolArray[Vec<bool>]                           => BoolArray[Vec<bool>]       | conversion auto_vec }

@Amar1729
Copy link
Contributor Author

@wangxiaoying sorry for the delay on this! got busy IRL.

I think I'm pretty close to this - muddled through most of the impls that i was missing last time, including the ArrowAssoc for Vec<bool> ones.

Now i'm getting the following errors:

$ just test-python -k postgres
...
FAILED connectorx/tests/test_postgres.py::test_postgres_types_csv - RuntimeError: Cannot produce a bool, context: {t,f}.
FAILED connectorx/tests/test_postgres.py::test_postgres_types_simple - RuntimeError: Cannot produce a alloc::vec::Vec<bool>, context: {t,f}.

for which I'm a little bit stumped. From that error it seems to me like the postgres_arrow::{CsvProtocol, SimpleProtocol} are both using the defined impl Produce<'r, bool> , but i added bool to both impl_csv_vec_produce and impl_simple_vec_produce ...

@wangxiaoying
Copy link
Contributor

wangxiaoying commented Jun 17, 2023

Hi @Amar1729 , I think the error for csv is because postgres's rust driver does not support return bool type directly for csv protocol directly. The bool type it return for csv protocol is a string (doc), for each "t" means true and "f" means false. So we need to manually write the function to convert the result like here. For both bool and Option<bool>. So you probably also need to manually write conversion function to produce vec<bool> as well.

Similar for simple protocol. The result of getting the value in simple protocol is in string try_get, and similar to csv, we get the string value first and then parse its result into the corresponding type (e.g., int, bool) like here using .parse(). Actually I don't remember what's its return value for bool type but seems here unlike csv we can directly parse the single value of bool. However, for the vector type, we directly split the result by , and then use parse() like here, which may not work for vector<bool>. You may want to print out what's its result look like and then write a conversion function to parse it.

@Amar1729
Copy link
Contributor Author

Alright, i think i figured it out! I've got postgres-related python tests passing:

$ just test-python -k postgres
========================== 31 passed, 5 skipped, 101 deselected, 1 xfailed in 4.54s ==========================

Let me know if there's anything I need to change.

  • I wasn't sure about the "right" way to change the arrow_assoc macros
  • are the new Produce impls OK? or should they be made into macros too?
  • are there more tests I need to add? I've only updated the test_postgres.py and seed data
  • I can rebase/squash these commits if everything looks good.

@Amar1729
Copy link
Contributor Author

Amar1729 commented Jun 18, 2023

I added a test in connectorx/tests/test_polars.rs as well, but couldn't figure out how to create a null Series, so the query is slightly different from the other tests for array types in test_polars. cargo test --features all test_pg_pl_bool_array passes

@Amar1729 Amar1729 force-pushed the feat/postgres-bool-array branch 2 times, most recently from c5288e0 to f873fc0 Compare June 18, 2023 18:49
@Amar1729
Copy link
Contributor Author

My bad, forgot cargo fmt 😞 . updated

@Amar1729 Amar1729 changed the title Draft: postgres add boolean array postgres: add boolean array Jun 23, 2023
@Amar1729
Copy link
Contributor Author

@wangxiaoying should be good to re-run CI on this, I ran cargo fmt on the rust code i changed.

@Amar1729
Copy link
Contributor Author

Let me know if there's anything else that need to be changed / any edge cases i might have missed / whether i should squash or rework the commits. otherwise i think this is done 👍

@wangxiaoying
Copy link
Contributor

Hi @Amar1729 , thank you for the PR! It looks good to me. The support is done for all the destinations and the test looks good to me. I think we can merge this PR after rebase. : )

@Amar1729
Copy link
Contributor Author

@wangxiaoying no problem! What should i rebase, it looks like this branch is based off of current main for this repo? Or do you want me to squash all the commits into one?

@wangxiaoying
Copy link
Contributor

@wangxiaoying no problem! What should i rebase, it looks like this branch is based off of current main for this repo? Or do you want me to squash all the commits into one?

Hi @Amar1729 , sorry for the late reply. I think we can just squash all the commit in this PR into one.

- update typesystems, destinations
- update pandas_columns, transports, and postgres
  - implement Vec<bool>, Option<Vec<bool>> for PostgresCSVSourceParser
    and PostgresSimpleSourceParser (parsing for `t` and `f` input)
- modify `impl_arrow_assoc_vec` to take three arguments
  - had to change the macro since booleans use `MutableBooleanArray`
- update `test_postgres.py`, `test_polars.rs`
@Amar1729
Copy link
Contributor Author

Amar1729 commented Jul 1, 2023

@wangxiaoying no problem! done.

@wangxiaoying wangxiaoying merged commit fe1b38a into sfu-db:main Jul 3, 2023
@Amar1729 Amar1729 deleted the feat/postgres-bool-array branch July 4, 2023 07:36
@wangxiaoying
Copy link
Contributor

Thanks @Amar1729 ! It is now merged, and included in 0.3.2-alpha.7, which will be released when this is done!

@Amar1729
Copy link
Contributor Author

Amar1729 commented Jul 5, 2023

wo0t thanks for the update (and maintaining this library)!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants