No data in database if nested objects within array #109
Comments
@CStejmar intriguing, not the behaviour I would have expected! So it sounds like the denesting on the schema is actually working as expected, which is great, but the denesting on the records is broken somehow. I think the likely place to start is to double check the tests here. Those were thrown together somewhat hastily and could use some love. I'll poke around with those and see if I can get more details for you, but that's where I'd start.
@CStejmar I think I found the problem. It looks like the denesting logic for the schema is in fact the problem. Your observation about the issue is exactly correct, and luckily your full-blown sandbox tests can be reduced down to just a single test. So what's happening is that we are denesting the records correctly and placing values under the column path inside of the table, but in the schema denesting we are not removing the table path as a prefix.
So yeah, we just need to figure out why that's happening and fix it. I'm not sure how breaking that will be for users of this repo... Feel free to fork and start working on that branch. I've pushed up the simplified reproducing test.
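To illustrate the mismatch described in this comment, here is a minimal sketch. It is not the repository's denesting code: the helper names `relative_column_path` and `column_name` are hypothetical, and the table path `("adoption", "immunizations")` is an assumption inferred from the column name in the reported error message.

```python
def relative_column_path(table_path, column_path):
    """Drop the table path prefix from a column path, if it is present."""
    n = len(table_path)
    if column_path[:n] == table_path:
        return column_path[n:]
    return column_path


def column_name(path, separator="__"):
    """Join path segments into a flat column name (separator is an assumption)."""
    return separator.join(path)


# Hypothetical paths modelled on the error message in this issue.
table_path = ("adoption", "immunizations")
schema_path = ("adoption", "immunizations", "vaccination_type", "shot")
record_path = ("vaccination_type", "shot")

# The schema keeps the table prefix while the record does not, so the
# NOT NULL schema column never receives the record's value.
schema_column = column_name(schema_path)
record_column = column_name(record_path)

# Stripping the table prefix on the schema side makes the two agree.
fixed_schema_column = column_name(relative_column_path(table_path, schema_path))
```

With the prefix left in place, the schema defines `adoption__immunizations__vaccination_type__shot` while the record writes to `vaccination_type__shot`, which would explain the not-null violation on the former.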
@AlexanderMann Great! Thanks for your analysis! I will fork and branch from where you left off 👍
issue: datamill-co#109 The table_schemas and the table_records did not match for more complex and nested schemas.
I think I solved the issue: #111. All tests (branched from master) pass, and I now get data in the database for nested objects in arrays, which I didn't get before! However, we might want it the other way around, where we instead change the record to match the schema we have now, to keep the table naming structure we had before. What do you think @AlexanderMann?
I noticed one thing now when looking through my tables in a test database. The objects look fine and all data enters the database. However, the names of subtables (arrays in the schemas) are missing an array name when we have arrays within arrays: the first array name is dropped from the table name. With denest fix for schema:
Without denest fix/fixing record instead:
The above has errors in the naming of nested objects. So my conclusion is that we need some more work on the fix regarding this denesting before merging. I will start looking into it now!
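As a rough sketch of the naming rule at stake, the snippet below builds a subtable name from the root table plus every enclosing array path, so that no array name is dropped. The function `subtable_name`, the `__` separator, and the nesting `cats -> adoption.immunizations[] -> doses[]` are all hypothetical and only serve as an illustration; they are not taken from the repository or from the tables mentioned above.

```python
SEPARATOR = "__"  # assumed separator, matching the column name in the error message


def subtable_name(root_table, array_paths):
    """Join the root table name with every enclosing array path, outermost first."""
    parts = [root_table]
    for array_path in array_paths:
        parts.extend(array_path)
    return SEPARATOR.join(parts)


# Hypothetical nesting: cats -> adoption.immunizations[] -> doses[]
name = subtable_name("cats", [("adoption", "immunizations"), ("doses",)])
print(name)
```

Under this rule, a name for the inner array that lacks the outer array's segments would indicate the dropped-name problem described above.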
Update: the pull request has been updated to fix the above problem with incorrect naming of subtables: #111. The plan ahead is to first merge the tests @AlexanderMann set up in PR #110, and then rebase the changes from #111 on top of that to see if the tests pass. If they do not pass, more work is needed.
Fix and tests are merged to master.
Hi,
I think I have encountered a problem with nested objects in arrays. An object in an array is fine, but when I have an object within another object in an array (array->object->object) a problem arises: no data is seen in that last-level object, even though I can see in the record data that values exist. If I instead have an array at the last level (array->object->array), it behaves as expected.
I have done tests with this in the module test_sandbox, using a modified CATS_SCHEMA and FakeStream.generate_record_message (also modified to match my schema) to generate test data. If I make the last object's properties optional in the test, for example with "type": ["null", "string"], everything passes because the object's properties are all null. However, if I use "type": "string" for a property, the test fails with:

CRITICAL ('Exception writing records', IntegrityError('null value in column "adoption__immunizations__vaccination_type__shot" violates not-null constraint\nDETAIL: Failing row contains (Rabies, 2537-09-12 15:34:00+02, null, 1, 1554384634, 0).\nCONTEXT: COPY tmp_36a9e5fb_ac07_49f0_ac26_8fa9f1f69885, line 1: "Rabies,2537-09-12 13:34:00.0000+00:00,NULL,1,1554384634,0"\n'))

To me it seems like this case isn't handled, and I can only find tests of array_of_array and object_of_object in the tests directory. Does anyone have a clue and can point me in the right direction?

I will attach a snippet of what I added to the test_sandbox.py file so that you can test my examples and more easily understand my problem.
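The reporter's attached snippet is not reproduced here. As a stand-in, the following is a hypothetical schema fragment showing the array->object->object shape described above. The property names are inferred from the error message and are assumptions, not the actual modified CATS_SCHEMA; the helper `leaf_paths` is likewise only illustrative.

```python
# Hypothetical fragment; names are inferred from the error message only.
NESTED_SCHEMA = {
    "type": "object",
    "properties": {
        "adoption": {
            "type": "object",
            "properties": {
                "immunizations": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "type": {"type": "string"},
                            "vaccination_type": {
                                "type": "object",
                                "properties": {
                                    # Reported: ["null", "string"] passes,
                                    # a plain "string" fails.
                                    "shot": {"type": "string"},
                                },
                            },
                        },
                    },
                },
            },
        },
    },
}


def leaf_paths(schema, path=()):
    """Yield (path, json_type) for each non-object leaf, descending into array items."""
    if schema.get("type") == "object":
        for name, sub in schema.get("properties", {}).items():
            yield from leaf_paths(sub, path + (name,))
    elif schema.get("type") == "array":
        yield from leaf_paths(schema["items"], path)
    else:
        yield path, schema.get("type")


paths = dict(leaf_paths(NESTED_SCHEMA))
```

Here `paths` contains the leaf `("adoption", "immunizations", "vaccination_type", "shot")`, which is the object-inside-object-inside-array property that ends up as the failing column.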