Separate unit tests for index types #580

jsmassa · 2022-11-21T18:37:37Z

Addresses #520

Turns all default values for a database into dynamic vars, so they can be set in tests.edn to test specific information.

The idea is to use the default configuration for all tests, so each test has to be written only once.

Problematic with this strategy are

:attribute-refs? because the entity IDs are shifted and depending on the function used either keyword or eids are returned, and
:schema-on-write because it needs a schema to be transacted, which means that here entity IDs are also shifted

For these cases, we would need some kind of translation, similar to the way it is done in the test/datahike/test/attribute-refs test namespaces. If possible, I would like to get rid of these test namespaces as they only contain a subset of the tests and are cumbersome to maintain.
Also, we would need to rewrite a lot of the tests to not check for the entity id but for example run queries as it is already done in some updated tests

Unproblematic, however, are config options such like

:store-backend and
:keep-history?

Please, tell me if this strategy makes sense or if there is some problem with it that I have overlooked.

I am also interested to hear which config options you think should be tested, or which combinations you think would make sense.

tests.edn

.circleci/config.yml

tests.edn

whilo · 2022-12-25T12:31:59Z

The strategy makes sense and normalizing the tests like this must be the key strategy to avoid duplication and arbitrary symmetry breaking (i.e. having different tests for different configurations). Thanks for addressing this @jsmassa !

I am not familiar enough with the attribute-ref mismatches. For schema settings I would suggest to always assume the stricter requirements (schema-on-write) and then potentially skip it automatically for schema-on-read. This can be done with a function in the test namespaces that dispatches on the schema configuration. Do you think this would work? Would this also be possible for the attribute-refs?

Also, is there still a reason to have attribute-refs only optionally? Edit: Yes, of course, because they depend on schema-on-write. But does this mean that when we always provide a schema all tests can be run with attribute-refs?

whilo · 2022-12-27T13:57:41Z

A related serious problem I just stumbled over in #332 is that many tests use the same database, which worked so far for the memory database because the connections were not (properly) shared. It would be good if the db :id for the memory database could be automatically retrieved from the test name. Can we address this here as well?

Edit: I have solved this there for now by manually duplicating the connections for the tests.

jsmassa · 2023-01-05T16:09:12Z

The schema isn't really a problem. Yes, we can always add a schema, but we could also only add it when required. That doesn't make much of a difference, I think.
The most problematic thing is still the entity ID which is being checked in a lot of the tests, so there is heaps of rewriting to do. Entity IDs for user entities are shifted using attribute references because system attributes like :db/ident or :db/valueType also need an entity ID and those are added to the database before the first user entities can be transacted.

jsmassa · 2023-01-05T16:09:30Z

The schema isn't really a problem. Yes, we can always add a schema, but we could also only add it when required. That doesn't make much of a difference, I think.
The most problematic thing is still the entity ID which is being checked in a lot of the tests, so there is heaps of rewriting to do. Entity IDs for user entities are shifted using attribute references because system attributes like :db/ident or :db/valueType also need an entity ID and those are added to the database before the first user entities can be transacted.

whilo · 2023-01-08T12:05:03Z

Could we just grab max-eid at the beginning of the tests and use this as the offset?

whilo

Looks good to me. Nice work!

jsmassa · 2023-01-10T20:17:43Z

Could we just grab max-eid at the beginning of the tests and use this as the offset?

This is approximately how I did it for the attribute reference tests. Some of the tests used specific input datoms though in addition to the expected output datoms, so the input had to be manipulated as well. This lead me to write some helper functions to shift collections of datoms according to the amount of the system entity datoms. So far, there are also separate functions to shift the value as well which is needed for attributes with reference value type. Of course, this can be made simpler with a more complex helper function detecting when the value is a reference and has to be shifted.

In some instances there might not be a way around using functions to shift the entity IDs to the correct values, but I would say that in the end, a database user will not care about which ID an entity gets allocated inside the database or if the attributes are saved as references or keywords. This is why I think that, ideally, we should avoid checking those details for high-level functions like transact, q or pull.

I would say that the strategies I used for the attribute reference tests in test/datahike/test/attribute_refs/ are ok and - with some improvements - can be used similarly to refactor most tests to work with all configurations. I would hope though that we can find a reformulation of a lot of the tests that make them more focused on the actual fact they are challenging and better readable than they would be with a habitual shifting of numbers.

whilo · 2023-01-11T18:40:32Z

Could we just grab max-eid at the beginning of the tests and use this as the offset?

This is approximately how I did it for the attribute reference tests. Some of the tests used specific input datoms though in addition to the expected output datoms, so the input had to be manipulated as well. This lead me to write some helper functions to shift collections of datoms according to the amount of the system entity datoms. So far, there are also separate functions to shift the value as well which is needed for attributes with reference value type. Of course, this can be made simpler with a more complex helper function detecting when the value is a reference and has to be shifted.

In some instances there might not be a way around using functions to shift the entity IDs to the correct values, but I would say that in the end, a database user will not care about which ID an entity gets allocated inside the database or if the attributes are saved as references or keywords. This is why I think that, ideally, we should avoid checking those details for high-level functions like transact, q or pull.

I would say that the strategies I used for the attribute reference tests in test/datahike/test/attribute_refs/ are ok and - with some improvements - can be used similarly to refactor most tests to work with all configurations. I would hope though that we can find a reformulation of a lot of the tests that make them more focused on the actual fact they are challenging and better readable than they would be with a habitual shifting of numbers.

I agree that we should not overfit to specific eids and rewrite tests to not rely on them in general is the right way forward.

whilo

👍

* separate index type tests * Do not use tools/unittest * Remove focused flag * Remove unnecessary ssh bits

separate index type tests

2de70a9

jsmassa force-pushed the 520-config-tests branch from 357f14b to 2de70a9 Compare November 21, 2022 18:53

jsmassa changed the title ~~separate index type tests~~ Separate unit tests for index types Nov 21, 2022

Do not use tools/unittest

8eec4d3

jsmassa self-assigned this Nov 21, 2022

jsmassa mentioned this pull request Nov 21, 2022

Add config tests to workflow #556

Closed

jsmassa commented Nov 21, 2022

View reviewed changes

tests.edn Show resolved Hide resolved

jsmassa requested review from kordano, whilo and TimoKramer November 21, 2022 19:59

TimoKramer previously approved these changes Nov 22, 2022

View reviewed changes

.circleci/config.yml Outdated Show resolved Hide resolved

.circleci/config.yml Outdated Show resolved Hide resolved

tests.edn Show resolved Hide resolved

tests.edn Show resolved Hide resolved

jsmassa added 2 commits November 22, 2022 19:25

Remove focused flag

8363042

Remove unnecessary ssh bits

d9fc407

jsmassa dismissed TimoKramer’s stale review via d9fc407 November 22, 2022 18:36

jsmassa marked this pull request as ready for review December 22, 2022 12:30

whilo previously approved these changes Jan 8, 2023

View reviewed changes

Merge main

a0cfb72

jsmassa dismissed whilo’s stale review via a0cfb72 January 10, 2023 19:05

Fix alignment

caa7d97

jsmassa requested review from whilo and TimoKramer January 11, 2023 13:18

whilo approved these changes Jan 11, 2023

View reviewed changes

jsmassa merged commit bdd20b6 into main Jan 11, 2023

jsmassa deleted the 520-config-tests branch January 11, 2023 21:30

whilo pushed a commit that referenced this pull request Mar 5, 2023

Separate unit tests for index types (#580)

7732a6a

* separate index type tests * Do not use tools/unittest * Remove focused flag * Remove unnecessary ssh bits

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Separate unit tests for index types #580

Separate unit tests for index types #580

jsmassa commented Nov 21, 2022 •

edited

Loading

whilo commented Dec 25, 2022 •

edited

Loading

whilo commented Dec 27, 2022 •

edited

Loading

jsmassa commented Jan 5, 2023

jsmassa commented Jan 5, 2023

whilo commented Jan 8, 2023

whilo left a comment

jsmassa commented Jan 10, 2023 •

edited

Loading

whilo commented Jan 11, 2023

whilo left a comment

Separate unit tests for index types #580

Separate unit tests for index types #580

Conversation

jsmassa commented Nov 21, 2022 • edited Loading

whilo commented Dec 25, 2022 • edited Loading

whilo commented Dec 27, 2022 • edited Loading

jsmassa commented Jan 5, 2023

jsmassa commented Jan 5, 2023

whilo commented Jan 8, 2023

whilo left a comment

Choose a reason for hiding this comment

jsmassa commented Jan 10, 2023 • edited Loading

whilo commented Jan 11, 2023

whilo left a comment

Choose a reason for hiding this comment

jsmassa commented Nov 21, 2022 •

edited

Loading

whilo commented Dec 25, 2022 •

edited

Loading

whilo commented Dec 27, 2022 •

edited

Loading

jsmassa commented Jan 10, 2023 •

edited

Loading