docs: Instructions for gen_oscal and fix_any added to website.md (#389)
* updated with description of how to build models with gen_oscal and fix_any

* Added text for order_classes

Co-authored-by: Chris Butler <chris@thebutlers.me>
fsuits and butler54 authored Mar 7, 2021
1 parent fcbaa23 commit 5053e52
Showing 1 changed file with 32 additions and 0 deletions.
32 changes: 32 additions & 0 deletions docs/contributing/website.md
@@ -45,4 +45,36 @@ This automation script principally ensures that:

running `make docs-automation` will ensure that the website is ready to deploy.

## Building the models from the OSCAL schemas.

The creation of the OSCAL models in ```trestle/oscal``` is a multi-step process:
- The OSCAL schemas are downloaded as modules from NIST into the ```nist-source/json/schema``` directory.
- The script ```scripts/gen_oscal.py``` loads each schema file and converts it to Pydantic/Python with ```datamodel-codegen```.
- The generated python files may need some fixup, so a separate script ```scripts/fix_any.py``` is run on each file.
- Note that there is one schema specific to IBM needs and it is loaded from ```3rd-party-schema-documents/IBM_target_schema_v1.0.0.json```.
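
The conversion step above can be sketched roughly as follows. This is a minimal illustration and not the contents of the generation script: the helper name `codegen_command` and the output naming scheme are assumptions. The `--input`, `--input-file-type` and `--output` options are standard `datamodel-codegen` flags, though the real script may pass different or additional ones.

```python
from pathlib import Path
from typing import List

SCHEMA_DIR = Path('nist-source/json/schema')  # schema location named in the text
MODEL_DIR = Path('trestle/oscal')             # model location named in the text


def codegen_command(schema: Path, out_dir: Path = MODEL_DIR) -> List[str]:
    """Build a datamodel-codegen call for one schema file (hypothetical helper)."""
    # Assumed naming scheme: some-schema.json -> some_schema.py
    out_file = out_dir / (schema.stem.replace('-', '_') + '.py')
    return [
        'datamodel-codegen',
        '--input', str(schema),
        '--input-file-type', 'jsonschema',
        '--output', str(out_file),
    ]


def plan(schema_dir: Path = SCHEMA_DIR) -> List[List[str]]:
    """List the commands for every schema found; nothing is executed here."""
    return [codegen_command(schema) for schema in sorted(schema_dir.glob('*.json'))]
```

Each command returned by `plan()` could be handed to `subprocess.run`, with the fixup script applied to the resulting file afterwards. In practice `make code-gen` drives the whole sequence.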

The whole process is handled in the Makefile by ```make code-gen```. A normal user would never need to run this but developers may need to, particularly if there are changes to the OSCAL schemas.

Also note that the dependent tools, pydantic and datamodel-codegen, may be updated by a fresh ```make install``` or ```make develop```, which may in turn change the generated model files.

### Items handled by ```fix_any.py```.
The original motivation for this script was to replace the many cases where the type assigned to a variable was simply ```Any```. With ```Any``` no type enforcement applies to that variable, which defeats the purpose of the strict type enforcement provided by Pydantic. As of this writing the number of such cases has been reduced to just one, which is handled by the script.
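
A quick way to check how many `Any` annotations remain in a generated file is to scan it with the standard `ast` module. The sketch below is not part of the fixup script; the function name `find_any_fields` and the sample classes are made up for illustration.

```python
import ast
from typing import List, Tuple


def find_any_fields(source: str) -> List[Tuple[str, str]]:
    """Return (class name, field name) pairs whose annotation mentions a bare Any."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for stmt in node.body:
            # Only annotated assignments such as "value: Any" are model fields.
            if not isinstance(stmt, ast.AnnAssign) or not isinstance(stmt.target, ast.Name):
                continue
            # Collect every plain name used in the annotation, e.g. Optional, Any.
            names = {n.id for n in ast.walk(stmt.annotation) if isinstance(n, ast.Name)}
            if 'Any' in names:
                hits.append((node.name, stmt.target.id))
    return hits


# Invented sample; the source is only parsed, never executed.
SAMPLE = '''
class Loose(BaseModel):
    value: Any
    maybe: Optional[Any] = None
    label: str


class Strict(BaseModel):
    count: int
'''

print(find_any_fields(SAMPLE))  # [('Loose', 'value'), ('Loose', 'maybe')]
```

Note that the scan only looks for the bare name `Any`; an annotation written as `typing.Any` would need an extra check.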

The remaining issues handled by the script are:
- Certain items in lists have 'Item' appended to the name, which can be confusing because some items in the schema do in fact end with 'Item'. As a result the script removes 'Item' in the name except when it is expected.
- Some items in self-referential classes are assigned ```= Field(None, min_items=1)```, which results in ```ValueError: On field "parts" the following field constraints are set but not enforced: min_items.``` An example in class ```Part``` is ```parts: Optional[List[Part]] = Field(None, min_items=1)```. The workaround is to change the assignment to ```= None```.
- Any timestamp generated by ```datamodel-codegen``` is removed from the generated file, so that it does not show up as a spurious diff in the file history. This is expected to be handled directly by ```datamodel-codegen``` later.
- Finally, in order to guarantee there are no induced forward references in the files, the classes are reordered to minimize the need for forward references, and any that can't be avoided are explicitly provided at the bottom of the file.
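
Two of the fixups above are plain text substitutions and can be sketched with regular expressions. This is an illustration only, not the code of the fixup script: the function names are hypothetical, and the timestamp pattern assumes the comment header that `datamodel-codegen` writes at the top of a file has the form `#   timestamp: ...`.

```python
import re

# Assumed header line, e.g. "#   timestamp: 2021-03-07T12:00:00+00:00"
TIMESTAMP_LINE = re.compile(r'^#[ \t]+timestamp:.*\n', re.MULTILINE)
# The constraint that Pydantic reports as set but not enforced.
MIN_ITEMS_FIELD = re.compile(r'=\s*Field\(\s*None\s*,\s*min_items\s*=\s*1\s*\)')
CLASS_DEF = re.compile(r'^class\s+(\w+)')


def strip_timestamp(source: str) -> str:
    """Drop the timestamp comment so regenerated files do not differ needlessly."""
    return TIMESTAMP_LINE.sub('', source)


def relax_self_referential_min_items(source: str) -> str:
    """Replace the Field(...) default with None, but only where a class refers to itself."""
    current = None
    out = []
    for line in source.splitlines(keepends=True):
        match = CLASS_DEF.match(line)
        if match:
            current = match.group(1)
        elif current and re.search(r':.*\b%s\b.*=' % re.escape(current), line):
            # The annotation mentions the enclosing class, e.g. List[Part] inside Part.
            line = MIN_ITEMS_FIELD.sub('= None', line)
        out.append(line)
    return ''.join(out)


SAMPLE = (
    '# generated by datamodel-codegen:\n'
    '#   filename:  schema.json\n'
    '#   timestamp: 2021-03-07T12:00:00+00:00\n'
    '\n'
    'class Part(BaseModel):\n'
    '    names: Optional[List[str]] = Field(None, min_items=1)\n'
    '    parts: Optional[List[Part]] = Field(None, min_items=1)\n'
)

print(relax_self_referential_min_items(strip_timestamp(SAMPLE)))
```

The line-based approach keeps the sketch short; it leaves the `names` field untouched because its annotation does not mention `Part`.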

### Side effects in the generated models.
The resulting models have some side effects that users and developers should be aware of:
- The current OSCAL schemas have situations where objects are defined within different classes in a schema using the same name, but the contents of those classes are different. ```datamodel-codegen``` handles this by creating separate classes as needed and appending 1, 2 etc. to the names, keeping them distinct. The resulting high level classes that reference them behave as expected, but if components of those classes are added in a granular way by a user or developer, the correct index must be used.
- The generated files have many classes that simply have a ```__root__``` element defined, along with a description. Such classes don't have particular value in such a simple form and could instead simply be defined in the parent class.
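
To see which classes in a generated file were given numeric suffixes, a developer could group class names by their base name. The sketch below is a heuristic under stated assumptions: the function name `numbered_variants` and the sample class names are invented, and a class whose real name happens to end in a digit would be grouped as well.

```python
import ast
import re
from collections import defaultdict
from typing import Dict, List


def numbered_variants(source: str) -> Dict[str, List[str]]:
    """Group top-level class names that differ only by a trailing number."""
    class_names = [n.name for n in ast.parse(source).body if isinstance(n, ast.ClassDef)]
    groups = defaultdict(list)
    for name in class_names:
        base = re.sub(r'\d+$', '', name)  # State1 -> State
        groups[base].append(name)
    # Keep only names that actually occur in more than one variant.
    return {base: names for base, names in groups.items() if len(names) > 1}


# Invented sample; the source is only parsed, never executed.
SAMPLE = '''
class State(BaseModel):
    a: str


class State1(BaseModel):
    b: int


class Party(BaseModel):
    c: str


class State2(BaseModel):
    d: float
'''

print(numbered_variants(SAMPLE))  # {'State': ['State', 'State1', 'State2']}
```

The listing shows which variants exist; choosing the correct one still requires checking which parent class references it.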

### Seeing the changes induced by ```fix_any.py``` on the classes.
As a convenience for developers, a separate script, ```scripts/order_classes.py```, is available, which orders the classes in a given file alphabetically. If you run it on the files before and after applying ```fix_any.py```, you can use a normal diff tool to see the changes made. It is strictly a development tool for doing the comparison; the resulting files will not work since they will have forward references.
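
The idea behind the comparison can be sketched with the standard `ast` and `difflib` modules. This is not the code of the ordering script, only a minimal illustration of sorting classes by name before diffing; the function names and sample inputs are hypothetical.

```python
import ast
import difflib
from typing import List


def classes_alphabetically(source: str) -> List[str]:
    """Return the source text of each top-level class, sorted by class name."""
    classes = [n for n in ast.parse(source).body if isinstance(n, ast.ClassDef)]
    classes.sort(key=lambda n: n.name)
    return [ast.get_source_segment(source, n) for n in classes]


def class_diff(before: str, after: str) -> str:
    """Unified diff of two files after putting their classes in the same order."""
    left = '\n\n'.join(classes_alphabetically(before)).splitlines()
    right = '\n\n'.join(classes_alphabetically(after)).splitlines()
    return '\n'.join(difflib.unified_diff(left, right, 'before', 'after', lineterm=''))


# Invented inputs: same classes in a different order, with one changed field.
BEFORE = 'class B:\n    x: int = 1\n\n\nclass A:\n    parts: list = Field(None, min_items=1)\n'
AFTER = 'class A:\n    parts: list = None\n\n\nclass B:\n    x: int = 1\n'

print(class_diff(BEFORE, AFTER))
```

Because both inputs are sorted the same way, the reordering itself produces no diff lines and only the changed field shows up.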


## Expectations on developers of trestle functionality
