Another potential speedup here is to leverage the pyarrow+pandas integration. This should be more mature with pandas v2. Pandas is pushing more in this direction as well, slated to make pyarrow a required dependency in v3.
Unfortunately, it's not as simple as setting engine='pyarrow'. I tried briefly with 11743ac. If we want to go down this route, it might be best to convert the metadata TSV to parquet upfront, which would require rewriting some logic (I'm not sure how much). Previous discussion on using parquet for metadata: (1, 2)
Context
See parent issue for context on how Pandas is used in augur filter and why it is slow.
There are some potential optimizations to the current code that do not require the full rewrite needed for #1574.
Progress
--output-strains and --output-metadata (#1469)
Not pursued