-
Notifications
You must be signed in to change notification settings - Fork 518
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reflect current code state for $import in the doc #3894
Conversation
@@ -3,7 +3,7 @@ | |||
There are two modes of $import supported today- | |||
|
|||
1. Initial mode is intended to load FHIR resources into an empty FHIR server. Initial mode only supports CREATE operations and, when enabled, blocks API writes to the FHIR server. | |||
1. Incremental mode is optimized to load data into FHIR server periodically and doesn't block writes via API. It also allows you to load lastUpdated and versionId from resource Meta (if present in resource JSON). | |||
1. Incremental mode is optimized to load data into FHIR server periodically and doesn't block writes via API. It also allows you to load lastUpdated, or both lastUpdated and versionId, from resource Meta (if present in resource JSON). There are no performance differences between incremental and initial modes. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Xoriant in their run identified performance difference. Lets double click offline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my tests on hospital data with ~1B resources performed on correctly scaled FHIR server and database I saw zero performance differences. This is as expected, because slight differences in compute on FHIR server related to processing modes do not change the database t-log bottleneck, and only this bottleneck identifies the performance of import.
docs/BulkImport.md
Outdated
* Conditional references in resources are not supported for initial mode import. | ||
* If multiple resources share the same resource ID, then only one of those resources will be imported at random and an error will be logged corresponding to the remaining resources sharing the ID. | ||
* In initial mode, resources with conditional references will be logged as errors. In incremental mode, resources with conditional references will be loaded but references will not be resolved. | ||
* If multiple resources share the same resource ID, in initial mode, only one of those resources will be imported at random and an error will be logged corresponding to the remaining resources sharing the ID. In incremental mode, if lastUpdated is not provided, behavior is the same as in the initial. If lastUpdated is provided, then resource versions are deduplicated by lastUpdated, and only duplicates are reported as errors, the rest is loaded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If lastUpdated is provided, then resource versions are deduplicated by lastUpdated, and only duplicates are reported as errors, the rest is loaded. --- is this true with allowed version flag ? In case there are multiple resources with same version id and different lastUpdated, then error will be reported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This is "universally" true irrespective of allow negative versions flag.
- If there are multiple records for the same resource with different lastUpdated and the same version, then version conflict will be reported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For statement "In incremental mode, if lastUpdated is not provided, behavior is the same as in the initial." - Initial mode import creates resources and not updates. So if lastUpdated is not provided but incremental version ids are provided, then the resource is updated.
Should we update the statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"If there are multiple records for the same resource with different lastUpdated and the same version, then version conflict will be reported." - Version conflict will be reported on the randomly selected resource ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which record is rejected depends on order of processing across batches. If records are processed in different batches, there is no way to predict which record is loaded and which is rejected. If import is very small, and only single batch is created, then record with max version will be loaded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For statement "In incremental mode, if lastUpdated is not provided, behavior is the same as in the initial." - Initial mode import creates resources and not updates. So if lastUpdated is not provided but incremental version ids are provided, then the resource is updated.
Should we update the statement?
Input versions are ignored if lastUpdated is not provided. This has always been incremental mode behavior. Statement in question refers only to deduplication logic, which deals with case of more than one record per resource. This statement is not targeting main difference between incremental update versus initial insert. I think current statement is correct.
I added some clarifications, please look.
Several corrections/clarifications