Releases: kharchenkolab/Baysor
v0.7.1
Added
- Option --polygon-format=GeometryCollectionLegacy: a format similar to GeometryCollection, but with integer cell IDs. Used for compatibility with Xenium Ranger. This option will be deprecated after Xenium Ranger is updated.
Fixed
- CLI argument parsing
Changed
- polygons.json now has polygons for all cells in the dataset. Cells with 1 or 2 points now have polygons with 4 vertices.
- Polygon vertices are now slightly shifted farther from the cell center to avoid molecules on the border.
v0.7.0
Added
- New algorithm for NCV estimation based on Random Indexing. It is used by default now.
- Support of Parquet format for input molecules (can directly use transcripts.parquet from Xenium)
- Now, in case of continuous z-stack (like Xenium), 3D polygons are estimated by binning the z-stack into 20 slices.
- Added support for string IDs for prior segmentation. To specify the label for non-assigned molecules, use --config.segmentation.unassigned_prior_label.
- New plotting configuration options (see example_config.toml)
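The 20-slice binning mentioned above can be illustrated with a minimal sketch. It assumes equal-width slices between the smallest and largest z value; the notes do not say whether Baysor uses equal-width or another binning scheme, and the function name `bin_z` is hypothetical.

```python
def bin_z(z_values, n_slices=20):
    """Assign each z coordinate to one of n_slices equal-width bins (0-based).

    Assumption: equal-width slices over [min(z), max(z)]. This is an
    illustration only, not Baysor's implementation.
    """
    z_min, z_max = min(z_values), max(z_values)
    if z_max == z_min:
        # Degenerate stack: every molecule sits in a single slice.
        return [0] * len(z_values)
    width = (z_max - z_min) / n_slices
    # Clamp so that z == z_max lands in the last slice instead of slice n_slices.
    return [min(int((z - z_min) / width), n_slices - 1) for z in z_values]


if __name__ == "__main__":
    print(bin_z([0.0, 0.4, 5.0, 9.99, 10.0]))
```

Each molecule then carries a slice index, and a 2D polygon can be estimated per slice from the molecules that share that index.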
Removed
- no-ncv-estimation was removed, as the NCV algorithm is fast and memory-efficient now
- The Dirichlet sampling was removed, as it slowed down the algorithm significantly, but didn't improve the segmentation quality
- Correspondingly, parameters new_component_weight and new_component_fraction were removed
Fixed
- Improved multithreading
- Optimized the algorithm performance
- Improved algorithm for polygon estimation: fixed bugs and reduced overlaps
- Fixed a bug in the split step, improved cell continuity
Changed
- Cell IDs in polygon GeoJSON are now strings to match segmentation.csv
- The whole polygon.json format was changed from GeometryCollection to FeatureCollection to match 10x format.
- Polygons are now saved by default using the 10x FeatureCollection format. Parameter save-polygons is replaced with polygon-format=FeatureCollection. Set it to GeometryCollection to save polygons in the format from Baysor v0.6 or to none to prevent saving polygons.
- 2D and 3D polygons are now stored in separate files (polygons_2d/3d.json)
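For readers migrating downstream scripts, the structural difference between the two GeoJSON container types can be sketched as follows. The sketch follows the generic GeoJSON layout only. The notes do not state where the old format stored the cell ID, so the member name `cell` (the `id_key` default) is an assumption, as are the function name and the sample data.

```python
import json


def geometry_collection_to_features(gc, id_key="cell"):
    """Wrap each geometry of a GeometryCollection in a GeoJSON Feature.

    id_key is a HYPOTHETICAL member name holding the cell ID on each
    geometry; if it is absent, the position in the list is used instead.
    IDs are emitted as strings, matching the change described above.
    """
    features = []
    for index, geom in enumerate(gc["geometries"]):
        cell_id = geom.get(id_key, index)
        features.append({
            "type": "Feature",
            "id": str(cell_id),
            "geometry": {"type": geom["type"], "coordinates": geom["coordinates"]},
            "properties": {},
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    # Illustrative data, not taken from a real Baysor run.
    old = {
        "type": "GeometryCollection",
        "geometries": [
            {"type": "Polygon", "cell": 7,
             "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]},
        ],
    }
    print(json.dumps(geometry_collection_to_features(old), indent=2))
```

Check a real output file before relying on any of these member names.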
Version 0.6.2
Version 0.6.1
[0.6.1] — 2023-05-11
Fixed
- Fixed a bug with prior segmentation loading
- Fixed a bug with compartment genes
Version 0.6.0
[0.6.0] — 2023-04-20
Added
- New output cell QC parameters avg_assignment_confidence, max_cluster_frac and lifespan
- Segmented cells are now saved to loom instead of TSV. To return to the old behavior, use count-matrix-format="tsv"
- Minimal multi-threading (see README)
Removed
- iters and n-cells-init parameters were removed from the CLI shortcuts. To change them, use the config or the --config.segmentation.iters and --config.segmentation.n_cells_init parameters (see the 'Advanced configuration' section in the README).
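The dotted flag names above read naturally as paths into a nested config: a section (segmentation) followed by a key (iters, n_cells_init). The sketch below illustrates that reading only. It is not Baysor's argument parser, the function name `apply_override` is hypothetical, and the numeric values are placeholders rather than recommended settings.

```python
def apply_override(config, flag, value):
    """Set a nested config entry from a flag like '--config.segmentation.iters'.

    Illustration of the dotted-path idea only; not Baysor's CLI code.
    """
    prefix = "--config."
    if not flag.startswith(prefix):
        raise ValueError(f"not a config override: {flag}")
    *sections, key = flag[len(prefix):].split(".")
    node = config
    for name in sections:
        # Create intermediate sections on demand.
        node = node.setdefault(name, {})
    node[key] = value
    return config


if __name__ == "__main__":
    config = {"segmentation": {"iters": 500}}  # placeholder values
    apply_override(config, "--config.segmentation.iters", 1000)
    apply_override(config, "--config.segmentation.n_cells_init", 20000)
    print(config)
```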
Changed
- Breaking changes in config file structure and CLI
- Greatly improved responsiveness of the CLI and simplified installation process
- Major refactoring of the code
- Various performance improvements
- Faster and more precise algorithm for estimating boundary polygons. Now each cell has exactly one polygon in the output GeoJSON.
- For method details, see Awrangjeb, 2015; the approach is quite similar.
- Closes #15 and #41, potentially also #32 and #37.
- Using sparse PCA for NCV estimation on large datasets
- baysor segfree output is now fully compatible with loom v3 format
- Cells and NCVs now have IDs in the format {type}{run_id}-{cell_id}, where type is C for cells and V for NCVs, and run_id is a unique ID of Baysor run
- --save-polygons now works regardless of -p
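Scripts that consume these IDs may need to split them back into parts. The following is a minimal sketch of one way to do that for the {type}{run_id}-{cell_id} pattern. It assumes the cell ID follows the last hyphen and that the run ID is non-empty; the notes do not specify the character set of either part. The names `ParsedId` and `parse_id` and the sample IDs are hypothetical.

```python
from typing import NamedTuple


class ParsedId(NamedTuple):
    kind: str      # "cell" for the C prefix, "ncv" for the V prefix
    run_id: str
    cell_id: str


def parse_id(identifier):
    """Split an ID of the form {type}{run_id}-{cell_id}.

    Assumption: the cell ID is whatever follows the LAST hyphen, so a
    run ID containing hyphens would still parse.
    """
    head, sep, cell_id = identifier.rpartition("-")
    if not sep or len(head) < 2 or head[0] not in ("C", "V") or not cell_id:
        raise ValueError(f"unexpected ID format: {identifier!r}")
    kind = "cell" if head[0] == "C" else "ncv"
    return ParsedId(kind, head[1:], cell_id)


if __name__ == "__main__":
    # Made-up IDs for illustration.
    print(parse_id("Cabc123-17"))
    print(parse_id("Vabc123-4"))
```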
v0.5.2
[0.5.2] — 2022-06-29
Fixed
- Fixed some package versions; dependencies should cause fewer bugs now
- Fixed some bugs
- Fixed random seed for all CLI runs
- Adjusted some CLI parameters
Added
- Added segfree run option to extract NCV vectors
Version 0.5.1
Changed
- Slightly optimized compilation time
- Minor updates of core formulas
Added
- CLI parameter no-ncv-estimation to disable estimation of NCVs
- MacOS build
Version 0.5.0
Changed
- scale-std can now be specified from CLI parameters
- Several bugs fixed
- Allow missing genes in the input data
- Updates in the core algorithm
- All diagnostic plots were updated
Added
- exclude-genes option that removes genes from the data frame before segmentation.
- Segmentation of compartments based on the list of compartment-specific genes
- Using information about compartment per molecule in the segmentation algorithm when available
- 3D segmentation
Removed
- The data cannot be split by frames anymore. This functionality didn't work well previously and was hard to maintain.
Version 0.4.3
Changed
- Fixed the CLI installation bug
- More information in logging
- Better initialization for molecule clustering. This should improve cell separability a lot!
- Some memory optimization
Added
- Estimating scale when prior segmentation is provided as a CSV column
- Added the option --save-polygons=GeoJSON to save cell boundary polygons in the GeoJSON format
Version 0.4.2
Changed
- Fixed Makefile julia version
- Improved polygon visualization
- Regressed to Plots 1.6.0 because of the performance issues
- Fixed docker build
- Small bug fixes
- min-transcripts-per-center is renamed to min-molecules-per-segment and is working now
- Fixed visualization of prior segmentation