Flatten _validation.toml #2

Open
chStaiger opened this issue Jul 19, 2024 · 2 comments


@chStaiger
Member

Reading in the validation.toml produces a deeply nested structure of dictionaries and lists of dictionaries. We need to discuss how to flatten it a bit.
@StefanoRapisarda, can you please look at how I am using the TOML in the code and suggest how to flatten the data structure?

@chStaiger
Member Author

Here is how I read in the TOML file:

val_path = Path("data/metadata_test/_validation_schema.toml") # param for client

And this is the code to get the information into a workable Python data structure:

def flatten_list_of_dicts(key: str, val: str, data: list) -> dict:
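The body of `flatten_list_of_dicts` is not included in this comment, so its actual behaviour is unknown. One plausible reading of the signature, offered purely as an assumption, is that it collapses a list of dictionaries into a single dictionary by using one field as the key and another as the value. The field names `"name"` and `"type"` in the example data are hypothetical:

```python
def flatten_list_of_dicts(key: str, val: str, data: list) -> dict:
    """Collapse a list of dicts into one dict mapping item[key] -> item[val].

    Assumed behaviour; the original function body is not shown in the issue.
    """
    return {item[key]: item[val] for item in data}


# Hypothetical input resembling a TOML array of tables.
columns = [
    {"name": "host_id", "type": "string"},
    {"name": "host_age", "type": "integer"},
]
flat = flatten_list_of_dicts("name", "type", columns)
```

If two items share the same `key` value, the later one overwrites the earlier one, which may or may not be acceptable for the validation schema.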

@chStaiger chStaiger mentioned this issue Jul 19, 2024
@StefanoRapisarda
Collaborator

StefanoRapisarda commented Jul 24, 2024

I made a new version of the validation metadata ("_validation_schema_v2.toml") and renamed the old one to v1.

This is an example of the v2 file structure:


[hosts]
host_id.type = "string"
host_id.format = "AA0_00000"
host_groupNumber.type = "integer"
host_sex.type = "string"
host_sex.format = "A"
host_sex.values = ["M","F"]
host_age.type = "integer"
host_death.type = "integer"
host_species.type = "string"
host_breed.type = "string"

The levels are <file_name> --> <column_name> --> <column_attribute>. So, to access the type of the host_id column in the events file, use metadata["events"]["host_id"]["type"].

Not every column has every possible attribute, so I added the complete list of possible attributes at the very beginning of the file (metadata_keys). Compared to the previous version, this structure removes one layer of nesting, with the caveat that you now need to check each column for the keywords listed in metadata_keys.
