[FIX] Augment tempfile.SpooledTemporaryFile() for expected behavior #2963
Description:
This PR creates a new class that augments tempfile.SpooledTemporaryFile. The current implementation in the standard library does not satisfy the API expected of a file object. Augmenting the class methods and importing io.IOBase should get the class close to where it needs to be. Until python/cpython#3249 is merged, released, and used by this project, we will need to keep this augmented class around.
Technical details:
While this issue has existed since Python 3.0, it only surfaced in this project with the upgrade from pandas v0.25.x to pandas v1.0.2+, which relies on API missing from the SpooledTemporaryFile implementation. (pandas v1.0.0 and v1.0.1 introduced a separate bug, which was fixed in v1.0.2.) See https://bugs.python.org/issue26175 and python/cpython#3249
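As a rough illustration of the approach described above (not the code in this PR), the sketch below subclasses tempfile.SpooledTemporaryFile, mixes in io.IOBase, and delegates the missing capability checks to the wrapped file. The class name `SpooledTempFileIOBase` is hypothetical, inheriting from io.IOBase is an assumption about how it is used, and the sketch relies on the private `_file` attribute, which is a CPython implementation detail.

```python
import io
import tempfile


class SpooledTempFileIOBase(tempfile.SpooledTemporaryFile, io.IOBase):
    """Hypothetical sketch: expose io.IOBase capability checks.

    Each method delegates to the underlying file object, which is an
    in-memory buffer until max_size is exceeded and a real temp file after.
    """

    def readable(self):
        return self._file.readable()

    def writable(self):
        return self._file.writable()

    def seekable(self):
        return self._file.seekable()


with SpooledTempFileIOBase(max_size=1024, mode="w+b") as spooled:
    spooled.write(b"a,b\n1,2\n")
    spooled.seek(0)
    # io.TextIOWrapper calls readable(), writable(), and seekable() on the
    # object it wraps, so it needs the methods defined above.
    text = io.TextIOWrapper(spooled, encoding="utf-8")
    first_line = text.readline()
    # Detach so the wrapper does not close the spooled file when discarded.
    text.detach()
```

Consumers that wrap the file in a text layer, as in the usage above, are the kind of caller that fails when these methods are absent. On Python versions where the standard library class already provides them, the overrides are redundant but harmless.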
Locally, loadcfda, load_rosetta, load_city_county_state_code, and downloads were tested to ensure the changes did not break other areas. The FABS and FPDS ETL scripts also use the class for obtaining delete records and should be tested. Several other ETL scripts (load_tas, load_dabs_submission_window_schedule) contain a code path for loading from a file; the nightly pipeline does not use that path because it connects directly to broker to refresh data.
Requirements for PR merge:
Area for explaining above N/A when needed: