Meeting Notes

Demetris Roumis edited this page Aug 6, 2024 · 107 revisions

240806 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Mateusz, Simon

Moving forward, will schedule meetings as needed

Maxime:

Mateusz:

  • Scale Bars on subplots
    • READY FOR (API/Functionality) REVIEW: Attach scalebar to subplot range Bokeh 13921
    • data coordinate location and length specification

Simon

  • Review Bokeh 13921
  • Scale Bars on subplots
    • TODO: Attach ScaleBar to subcoordinate axis HoloViews #6292
    • paused for ~2 weeks given dev releases
  • HoloNote Annotations
    • WIP: 1D Annotation GUI Holonote #99
      • Batch select and link to action (e.g. rename category, delete entire category) https://github.com/holoviz/holonote/issues/129
        • long(er) term roadmap item. Check-in ~4 weeks.

Philipp

  • Streaming performance
    • WIP: Stream-following behavior HoloViews #5318
      • TODO: file issue for backtracking glitch and link it in existing PR
      • TODO: Recompute range based on every new data point
    • no progress

Demetris

  • WIP: Public-facing docs
    • Host datasets on HoloViz S3
    • Create PRs for examples.holoviz repo
  • Rough benchmarking
    • TODO: Neuro #99 Getting a couple before/after estimated numbers to report
  • Annotations
    • TODO: check holonote gui on multiple browsers
    • TODO: Add a warning for users on Bokeh < 3.5 about the new HoloViews RangeTool styling

240730 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Mateusz, Jim, Simon

Maxime:

  • Zoom enhancements for subcoordinate-Y plots
    • READY TO MERGE!: zoom by subcoord_y group. hit-tested group-wise wheel zoom PR
  • Multi-chan TS in hvPlot
    • MERGED: Optimize handling of wide datasets hvPlot #1350
    • READY FOR REVIEW: subcoordinate_y into hvPlot API hvPlot #1379
      • TODO: Demetris to check whether this is sufficient for the CZI use case

Mateusz:

  • Scale Bars on subplots
    • PR CREATED: Attach scalebar to subplot range Bokeh 13921
    • Still working on location and length specification
  • Zoom enhancements for subcoordinate-Y plots
    • MERGED: Improve respect for maintain_focus = False when zooming Bokeh #14000

Simon

Philipp

  • Streaming performance
    • WIP: Stream-following behavior HoloViews #5318
      • TODO: file issue for backtracking glitch and link it in existing PR
      • TODO: Recompute range based on every new data point

Demetris

  • WIP: Public facing docs
    • Created examples website categories and will now update dedicated 'Neuroscience' page with CZI examples
  • Rough benchmarking
    • TODO: Neuro #99 Getting a couple before/after estimated numbers to report
  • HoloNote Annotations
    • DONE: Finish adding tests and mark the HoloNote visibility GUI sync PR as ready for review
    • DONE: Fix handling of no init data, color specified condition for vis gui PR

240723 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Mateusz, Maxime, Jim

Mateusz:

  • Scale Bars on subplots
    • EARLY WIP: Attach scalebar to subplot range Bokeh 13921
      • WIP branch exists, no PR yet
  • Zoom enhancements for subcoordinate-Y plots
    • INVESTIGATING: Lack of range expansion when there remains unbounded directions, even when maintain_focus=False Bokeh #13827
      • working branch exists, will create a PR soon
  • Streaming
    • DONE: (next) play with Philipp's streaming example to inspect event timing and 'backtracking' glitch
      • reporting findings on internal Bokeh slack
      • the glitch is on the Python side, not BokehJS. Maybe Panel.

Philipp

  • Streaming performance
    • WIP: Stream-following behavior HoloViews #5318
      • merge PR?
      • TODO: Address 'backtracking' glitches.
      • TODO: Recompute range based on every new data point
      • TODO: file issue for backtracking glitch and link it in existing PR

Maxime:

  • Zoom enhancements for subcoordinate-Y plots
  • Multi-chan TS in hvPlot
  • Docs:
    • TODO: Public-facing website (DR and ML - Improve Examples website with page for Neuro category).
      • Start discussion this Thursday?

Demetris

  • Rough benchmarking
    • TODO: Neuro #99 Getting a couple before/after estimated numbers to report
  • HoloNote Annotations
    • Review: visibility GUI sync updates HoloNote #123
    • TODO: file issue about remaining AnnotationTable GUI requirements
    • TODO: DR keep testing 2D annotator and file bug issues for Simon
  • WIP: Public facing docs
    • Deep image stack: annotations now trigger the subcoord_y timeseries, but the entire timeseries plot is remade on every event.

240716 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Mateusz

Mateusz:

  • Scale Bars on subplots
    • EARLY WIP: Attach scalebar to subplot range Bokeh 13921
      • Some progress: using the subcoord range as the source is working, but positioning remains an issue.
  • Zoom enhancements for subcoordinate-Y plots
    • INVESTIGATING: Lack of range expansion when there remains unbounded directions, even when maintain_focus=False Bokeh #13827
      • still investigating, will provide update next week
  • Streaming
    • TODO: (next) play with Philipp's streaming example to inspect event timing and 'backtracking' glitch

Philipp

  • Streaming performance
    • WIP: Stream-following behavior HoloViews #5318
      • TODO: (waiting on Mateusz) address 'backtracking' glitches.
      • TODO: recompute range based on every new data point

Maxime:

Demetris

  • Rough benchmarking
    • TODO: Neuro #99 Getting a couple before/after estimated numbers to report
  • HoloNote Annotations
    • WIP: visibility GUI sync updates HoloNote #123
    • TODO: file issue about remaining AnnotationTable GUI requirements
    • TODO: DR keep testing 2D annotator and file bug issues for Simon
  • WIP: Public facing docs
  • TODO: REVIEW zoom by subcoord_y group. hit-tested group-wise wheel zoom PR

240709 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Mateusz

  • Streaming timeseries
    • WIP: Stream-following behavior HoloViews #5318
      • TODO: address 'backtracking' glitches. Communicate an example with Mateusz to inspect.

Philipp

  • Streaming performance
    • MERGED: De-parameterize the options Keywords objects HoloViews #6314
      • Parameterized objects are expensive to create at scale and with many nesting levels
    • MERGED: Freeze models while updating plot(s) HoloViews #6315
      • 5-10 % speedup
  • Streaming timeseries
    • WIP: Stream-following behavior HoloViews #5318
      • TODO: address 'backtracking' glitches. Communicate an example with Mateusz to inspect.
        • Not much temporal event management in Bokeh: currently, if the data changes, the state is destroyed and rebuilt.
        • Race condition when streaming data: when I continuously stream and then trigger some action that causes a redraw, it seems to cause a weird backtracking issue. I suspect this sequence of events is causing it:
          1. Stream events are sent continuously; an event arrives and the CDS data is updated
          2. At the same time, a user triggers a redraw on the frontend, e.g. they reset the plot or resize the window
          3. This user action triggers a redraw, which uses the updated full data
          4. The stream event is processed and the new data is drawn again, even though the previous redraw already included the new data
        • Is that at all plausible? In any case I think it is a Bokeh issue.
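The sequence hypothesized above can be walked through in a toy model. The sketch below is plain Python and does not use Bokeh, Panel, or HoloViews; `ToyPlot`, its methods, `run`, and the `drop_pending` flag are hypothetical names invented for illustration. It only shows how a full redraw followed by a still-queued stream patch would paint the same point twice, and it makes no claim about where the real glitch originates.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPlot:
    """Toy stand-in for a streaming plot; not Bokeh's actual event handling."""
    data: list = field(default_factory=list)     # plays the role of the CDS data
    drawn: list = field(default_factory=list)    # points painted on the canvas
    pending: list = field(default_factory=list)  # stream patches awaiting paint

    def stream(self, points):
        # Step 1: the data source is updated at once, the paint is queued.
        self.data.extend(points)
        self.pending.append(list(points))

    def full_redraw(self, drop_pending=False):
        # Steps 2-3: a reset or resize repaints from the full, updated data.
        self.drawn = list(self.data)
        if drop_pending:
            # Hypothetical mitigation: queued patches are already on screen.
            self.pending.clear()

    def process_pending(self):
        # Step 4: queued stream patches are painted incrementally.
        for patch in self.pending:
            self.drawn.extend(patch)
        self.pending.clear()


def run(drop_pending):
    plot = ToyPlot()
    plot.stream([1, 2])
    plot.process_pending()
    plot.stream([3])                # a new point arrives...
    plot.full_redraw(drop_pending)  # ...while the user resets the plot
    plot.process_pending()
    return plot.drawn
```

In this model `run(False)` ends with the last point painted twice, while `run(True)` paints each point once, because the redraw discards patches it has already covered.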

Mateusz:

  • Scale Bars on subplots
    • EARLY WIP: Attach scalebar to subplot range Bokeh 13921
      • more to share next week
  • Zoom enhancements for subcoordinate-Y plots
    • INVESTIGATING: Lack of range expansion when there remains unbounded directions, even when maintain_focus=False Bokeh #13827
      • some duplication of work going on.
  • Bokeh performance improvement (Do not bill to CZI)
    • Optimize code paths involving data array indices Bokeh 13562 "Unlikely to make a big impact on its own" - Mateusz
    • _map_data being called too often Bokeh #4967
      • could potentially be a big performance improvement, but it would be too big a task, so paused on CZI billing for now
  • Streaming
    • TODO: play with Philipp's streaming example to inspect event timing and 'backtracking' glitch

Maxime:

Simon

  • HoloNote Annotations
    • WIP: 1D Annotation GUI Holonote #99
      • MERGED: Tabulator widget #106
        • initial PR merged. Will make further improvements in another PR:
          • e.g. Batch select and link to action (e.g. rename category, delete entire category)
    • 2D Annotation
      • TODO: DR keep testing 2D annotator and file bug issues for Simon
  • Scale Bars on subplots

Demetris

  • Rough benchmarking
    • TODO: Neuro #99 Getting a couple before/after estimated numbers to report
  • HoloNote Annotations
    • MERGED: handle empty indicator when groupby is set HoloNote #118
    • WIP: visibility GUI sync updates PR
    • TODO: file issue about remaining AnnotationTable GUI requirements
    • TODO: DR keep testing 2D annotator and file bug issues for Simon
  • Streaming
  • Public facing docs
    • Many waveform snippets
    • multi-chan timeseries
    • deep image stack
    • large image navigation (neuroglancer nb)
    • spike raster
    • streaming timeseries, streaming images with overlays
    • Public-facing website (DR and ML - Improve Examples website with page for Neuro category)
  • PM
    • File issues
    • update project board and hours spreadsheet
    • Collaborator comms

240702 HB4N CZI R5 dev sync

Attendees: Demetris, Maxime, Simon

July priorities:

Maxime:

Mateusz:

  • Scale Bars on subplots
  • Zoom enhancements for subcoordinate-Y plots
    • Lack of range expansion when there remains unbounded directions, even when maintain_focus=False Bokeh #13827
  • Bokeh performance improvement (OSS time - Do not bill to CZI)
    • Optimize code paths involving data array indices Bokeh 13562 "Unlikely to make a big impact on its own" - Mateusz
    • _map_data being called too often Bokeh #4967

Simon

  • HoloNote Annotations
  • Scale Bars on subplots

Philipp

  • Streaming
    • Stream-following reset after pan and then reset

Demetris

  • Rough benchmarking
    • Getting a couple before/after estimated numbers to report
  • HoloNote Annotations
    • handle empty PR
    • visibility GUI from empty with groupby PR
  • Streaming
    • baseline workflow
  • Public facing docs
    • Many waveform snippets
    • multi-chan timeseries
    • deep image stack
    • large image navigation (neuroglancer nb)
    • spike raster
    • streaming timeseries, streaming images with overlays
    • Public-facing website (DR and ML - Improve Examples website with page for Neuro category)
  • PM
    • File issues
    • update project board and hours spreadsheet
    • Collaborator comms
    • Start collecting content for final report

240625 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Simon

Remaining Initiatives

Deliverables 1-2 - Bokeh Performance and Benchmarking

  • Bokeh performance improvements (Pause billing)
    • Optimize code paths involving data array indices [Bokeh 13562](https://github.com/bokeh/bokeh/pull/13562)
      • "Unlikely to make a big impact on its own"
    • _map_data being called too often Bokeh #4967
  • Benchmarking (Pause billing)

Deliverable 3 - Multi-Chan Timeseries

  • Multi-chan TS in hvPlot

  • Zoom enhancements for subcoordinate-Y plots

    • Lack of range expansion when there remains unbounded directions, even when maintain_focus=False Bokeh #13827
    • Support hit-tested group-wise wheel zoom renderers for subcoordinate_y HoloViews #6268 (Maxime)
    • Customize zoom icons for subcoordinate-y plots HoloViews #6241
  • Hover tooltips

    • Group or hide hover tools in Bokeh toolbar HoloViews #6252
      • Simon started playing with this (applying group_tools on overlays) but didn't seem to work yet.

Deliverable 4 - Large Images (including deep image stacks)

Deliverable 5 - Range Annotations (holonote and scale bars)

Deliverable 6-7 - Streaming

  • streaming timeseries workflow extension (D6)
  • streaming images with overlays workflow (D7)
  • Stream-following reset after pan and then reset (tool)

Deliverable 8 - Public-facing docs

  • Many waveform snippets (DR)
  • multi-chan timeseries (DR)
  • deep image stack (DR)
  • large image navigation (neuroglancer nb) (DR)
  • spike raster (DR)
  • streaming timeseries, streaming images with overlays (DR)
  • Public-facing website (DR and ML - Improve Examples website with page for Neuro category)

Other updates:

Maxime

Simon

Mateusz

Demetris

  • Scoping remaining tasks
  • Tested start_gesture for Bokeh Range Tool
  • TODO: open issue in HoloViews to support tap behavior by default for source plots for RangeToolLink
  • Working on remaining public docs workflows
  • TODO: Demetris test larger datasets with the medium approach to the multi-chan ts workflow
  • TODO: solicit feedback from collaborators

Philipp

  • Might work on (Stream-following reset) (and/or Simon)

240618 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Simon, Maxime, Jim

Mateusz

Demetris

  • Presented Multi-chan timeseries (med and large version) and neuroglancer workflows at CZI meeting
  • Tested new HoloViews and filed some issues/feedback
  • Tested data pyramid in both standalone and notebook
  • Updated repo Readme
  • Reviewing PRs
  • TODO: continue testing start_gesture for Bokeh Range Tool, and open issue in HoloViews to support tap behavior by default for source plots for RangeToolLink
  • TODO: Work on remaining public docs workflows
  • TODO: Scope remaining tasks and hours for Q3, including a sprint
  • TODO: Demetris test larger datasets with the medium approach to the multi-chan ts workflow

Andrew

Philipp

Maxime

Simon

240604 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Andrew, Simon, Maxime, Mateusz

Andrew

  • Tested out Philipp's y-range PR
    • TODO: Philipp and Andrew sync on the zoom-in issue
  • Data pyramid approach

Maxime

  • Subcoord_y group-wise normalization docs
    • Demetris reviewed and contributed
    • Merged
  • zoom by subcoord y group
    • reviewing contextual zooming
    • testing out a proxies approach to merge into groups
      • Mateusz: proxy approach won't work..
  • subcoord_y in hvPlot
  • Benchmarking (latency to display, interaction update) of small, medium, large multi-chan timeseries workflows
    • nothing yet

Simon

  • scale bar
    • Will include it into HoloViews 1.19
    • Demetris contributed to docs
  • holonote 1D GUI
    • Demetris reviewed and added suggestions
    • TODO: check that additions sync

Philipp

Mateusz

240528 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Andrew, Simon, Maxime, Mateusz, Jim

Mateusz

Philipp

  • Planning to finish up the NDOverlay key for subcoord_y tomorrow.
    • Maxime is depending on this to get index support into hvPlot
    • Still need to figure out how to support the grouping
  • Still working on taking advantage of indexing optimization for wide dataframes with col per chan
    • TODO: For wide DF, Maxime will try to use the col name as the ['channel'] dim name and the label is the implied common vdim ['amplitude'].
      • This should avoid having to make copies of the DF and allow for using the single-slice downsampling operations
  • subcoord-y + dmap + RangeToolLink (6010, 6136)
    • no progress

Maxime

  • Subcoord_y group-wise normalization docs
    • TODO: Demetris review Maxime's PR
  • zoom by subcoord y group
    • original PR merged
    • opened issue (where) to customize zoom icons
    • Mateusz is working on something to group tools which may be relevant (no PR yet)
    • TODO: improve further when Bokeh supports contextual zooming
  • subcoord_y in hvPlot
    • wip
    • TODO: For wide DF, Maxime will try to use the col name as the ['channel'] dim name and the label is the implied common vdim ['amplitude'].
      • This should avoid having to make copies of the DF and allow for using the single-slice downsampling operations
  • opened issue about creating ridge plot with hvPlot (out of CZI scope for now)
  • Benchmarking (latency to display, interaction update) of small, medium, large multi-chan timeseries workflows
    • experimenting with ASV
    • will open a new repo for benchmarking in holoviz-dev

Simon

  • scale bar
    • Will include it into HoloViews 1.19
    • Add toolbar to hide visibility, tests, logic, docs
  • holonote 1D GUI
    • Last week: draft PR of tabulator sub widget
    • TODO: Demetris to play around with it
  • Reviewing Bokeh PRs

Andrew

  • data pyramid approach
    • Had added a downscaling factor dependency on the channel number
      • DR found it wasn't working for many channels
    • Cannot load subset of channels because subcoord-y complicates the translation of viewport y range to channel
      • probably fine for now because often users would be slicing only in time anyway..
    • Found some potential optimizations along with Philipp after the meeting. Confirmed bottleneck was the rendering, not loading.
      • Will continue testing and optimizing the speed.
    • Will need to sync with Philipp on https://github.com/holoviz/holoviews/pull/6247
      • Mateusz: 'hold render' will manually control for batch rendering. Document hold may not be ideal because it will hold everything.
      • TODO: Andrew investigate manual hold render.

240521 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Andrew, Simon, Maxime, Mateusz, Jim

Philipp

  • Planning to finish up the NDOverlay key for subcoord_y tomorrow.
    • Maxime is depending on this to get index support into hvPlot
    • Still need to figure out how to support the grouping
  • Still working on taking advantage of indexing optimization for wide dataframes with col per chan
    • TODO: For wide DF, Maxime will try to use the col name as the ['channel'] dim name and the label is the implied common vdim ['amplitude'].
      • This should avoid having to make copies of the DF and allow for using the single-slice downsampling operations
  • subcoord-y + dmap + RangeToolLink (6010, 6136)
    • no progress

Maxime

  • Subcoord_y group-wise normalization docs
    • TODO: Demetris review Maxime's PR
  • zoom by subcoord y group
    • original PR merged
    • opened issue (where) to customize zoom icons
    • Mateusz is working on something to group tools which may be relevant (no PR yet)
    • TODO: improve further when Bokeh supports contextual zooming
  • subcoord_y in hvPlot
    • wip
  • opened issue about creating ridge plot with hvPlot (out of CZI scope for now)
  • Benchmarking (latency to display, interaction update) of small, medium, large multi-chan timeseries workflows
    • will work on this, this week

Mateusz

  • cursor-aware zoom tool
    • Finishing up this week!
    • Philipp added clarification about the requirement for ANY intersection of a collection of renderers, which is not yet supported.
  • Box-like range selection in rangetool
    • WIP PR
    • configurable tap gesture is a novel concept and opens up a lot of possibilities
    • Final review needed?
  • PR for Handles for the rangetool is upcoming
  • NEXT:

Demetris:

  • Working on public docs - to be made into HoloViz example Series of workflows, also a blog post
    • revised to focus on the medium-sized workflow

Simon

  • scale bar
    • no progress
  • holonote 1D GUI
    • WIP, scoped out the tabulator sub widget into PR
  • minmax
    • merged

Andrew

  • data pyramid approach
    • Had added a downscaling factor dependency on the channel number
      • DR found it wasn't working for many channels
    • Cannot load subset of channels because subcoord-y complicates the translation of viewport y range to channel
      • probably fine for now because often users would be slicing only in time anyway..
    • Found some potential optimizations along with Philipp after the meeting. Confirmed bottleneck was the rendering, not loading.
      • Will continue testing and optimizing the speed.

240430 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Andrew, Simon, Maxime, Mateusz, Jim

Demetris

  • Working on public docs - to be made into HoloViz example Series of workflows, also a blog post
    • Need to finish adding narrative text
    • File various issues

Simon

  • scale bar
    • WIP
  • holonote 1D GUI
    • WIP
  • minmax
    • almost done

Philipp

  • Take advantage of indexing optimization for wide dataframes with col per chan

Maxime

  • Subcoord_y group-wise normalization
    • merged
    • TODO: add minimal docs or example to docstring
  • zoom by group
    • merged
    • TODO: improve when Bokeh supports contextual zooming
  • ylim + subcoordy
    • merged
    • TODO: Maxime open issue to track overlay/level zoom tools
    • TODO: Maxime open issue for different icons for per-level zoom
  • Benchmarking (latency to display, interaction update) of small, medium, large multi-chan timeseries workflows
    • no
  • expose subcoord_y in hvPlot
    • no

Mateusz

Andrew

  • CarbonPlan released 0.2.0 with Andrew's change to ndpyramid
  • NEXT: review Demetris' large workflow notebook when it's ready for review
  • Streaming?

240423 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Andrew, Simon, Maxime, Mateusz, Jim

Demetris

  • HoloViews apply hard bounds
    • final edits, added docs, tests
    • merged!
    • TODO: check again with subcoord y given #6124
  • hvPlot allow downsample algo as input PR
    • merged
  • Contributed to HoloNote overlay PR
    • merged
  • Filed bokeh issue about maintain_focus not working 13827
    • assign to Mateusz?
  • Created mockup of 1D annotator GUI for holonote #99
  • Collected a set of holonote/panel issues, discussed with Simon
    • e.g. panel overflow #6757
  • Primary focus this week is on user facing docs

Simon

  • Pandas index handling in HoloViews (test with medium size (~2GB) subcoordinate_y example)
    • Philipp will share result
  • Scale Bars in HoloViews (tested with multi-group, custom unit subcoordinate_y example)
    • WIP, tests needed
  • Create new Annotate1D GUI #99
  • HoloNote overlay and retaining opts PR
    • merged
  • minmax interval
    • Philipp review? not yet.. need small improvements, examples
  • Just released HoloViews dev 1.19.21a1
    • People can test popups and pandas indexing

Maxime:

  • Subcoord_y group-wise normalization and zoom
    • merged (subcoordinate_group_ranges)
    • consider adding minimal docs or example to docstring
  • zoom by group
    • merged.. will be improved when Bokeh supports contextual zooming
  • ylim + subcoordy
    • merged
    • TODO: Maxime open issue to track overlay/level zoom tools
    • TODO: Maxime open issue for different icons for per-level zoom
  • Benchmarking (latency to display, interaction update) of small, medium, large multi-chan timeseries workflows
    • WIP
  • expose subcoord_y in hvPlot
    • later

Philipp:

Andrew:

  • large, multi-time-res code blocks
    • waiting on (subcoord_y + dmap + rangetool)
  • custom hover tooltips in HoloViews
    • merged

Mateusz:

240416 HB4N CZI R5 dev sync

Demetris

  • HoloViews apply hard bounds
    • reviewed by Simon, Philipp, will merge soon
  • hvPlot downsample algos
    • reviewed by Maxime, merged
  • Author narrative text, final code blocks for workflow
    • next up
  • Working with Simon to produce a minimal Range Annotation feature
    • next up

Maxime:

  • Subcoord_y group-wise normalization and zoom
    • ready for review (by Philipp)
  • zoom by group
    • ready for review (by Philipp)
  • ylim + subcoordy (WIP)
    • merge after zoom by group
  • reverse elements
    • merged
  • Benchmarking (latency to display, interaction update) of small, medium, large multi-chan timeseries workflows
    • next up
  • expose subcoord_y in hvPlot
    • later

Philipp:

Andrew:

  • large, multi-time-res code blocks
  • custom hover tooltips in HoloViews
    • reviewed by Simon
    • will merge

Simon:

  • Pandas index handling in HoloViews (test with medium size (~2GB) subcoordinate_y example)
    • ready for review (by Philipp)
    • Demetris will test with czi examples when merged
    • Maxime test with hvplot examples
    • Simon/Andrew test with holoviews, geoviews examples
  • Scale Bars in HoloViews (tested with multi-group, custom unit subcoordinate_y example)
    • WIP
    • tests
  • minmax interval
    • TODO: Philipp to take a look at #6134
  • Working with Demetris to produce a minimal Range Annotation feature

Mateusz:

240409 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Simon, Andrew, Mateusz

April 'Sprint' Tasks: https://github.com/orgs/holoviz-topics/projects/1/views/9

Everyone:

  • Contribute to API discussion on the Multi-Chan Timeseries Example (link to be shared)
  • Code review
  • Track hours per task and report at end of April (so we can evaluate the initial estimates)

Mateusz:

Demetris:

  • Author narrative text, final code blocks for workflow
  • Working with Simon to produce a minimal Range Annotation feature
  • Apply navigable hard bounds for largest of data+padding, x/ylims, dim ranges
  • hvplot downsample options
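The "navigable hard bounds" item above describes taking the largest extent among the padded data range, explicit x/y limits, and dimension ranges. A minimal sketch of that rule in plain Python follows; the function name `hard_bounds`, its parameters, and the choice to express padding as a fraction of the data span are assumptions made for illustration, not the HoloViews implementation.

```python
def hard_bounds(data_range, padding=0.0, lims=None, dim_range=None):
    """Return the widest (low, high) extent among the candidate ranges.

    data_range: (low, high) of the data along one axis
    padding:    fraction of the data span added on each side (assumption)
    lims:       optional explicit axis limits, e.g. xlim or ylim
    dim_range:  optional range declared on the dimension
    """
    low, high = data_range
    span = high - low
    candidates = [(low - padding * span, high + padding * span)]
    for rng in (lims, dim_range):
        if rng is not None:
            candidates.append(rng)
    # The navigable area must contain every candidate range.
    return (min(c[0] for c in candidates), max(c[1] for c in candidates))
```

For example, data spanning 0 to 10 with 10% padding, limits of (-5, 8), and a dimension range of (2, 20) would give bounds of (-5, 20), so panning and zooming could not leave that interval.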

Andrew:

  • large, multi-time-res code blocks
  • custom hover tooltips in HoloViews

Simon:

  • Pandas index handling in HoloViews (test with medium size (~2GB) subcoordinate_y example)
  • Scale Bars in HoloViews (tested with multi-group, custom unit subcoordinate_y example)
  • Working with Demetris to produce a minimal Range Annotation feature

Maxime:

  • Subcoord_y group-wise normalization and zoom
  • Misc subcoord_y issues (reverse elements, ylim)
  • Benchmarking (latency to display, interaction update) of small, medium, large multi-chan timeseries workflows
  • expose subcoord_y in hvPlot

Philipp:

240402 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Simon, Andrew, Mateusz

Demetris

  • large image volume improvements
  • In process of planning Q2
  • We need an April sprint focused on Multi-Channel Timeseries performance (and features/bugs). Start next week and run for 2-3 weeks? Output -> Public facing holoviz-example-format demo with
  • Backlog
    • subcoordinate_y by group
    • subcoordinate_y y-ranges (probably also solves subcoordinate_y + dmap + minimap)
    • subcoordinate_y groupwise normalization
    • scale bar support in holoviews
    • max extents support in holoviews
    • annotations
    • pandas indexing
    • multi-resolution approach (datatree, ndpyramid)
    • benchmarking
  • Dropped chunk-aware minimap in favor of xarray datatree work

240319 HB4N CZI R5 dev sync

Attendees: Demetris, Philipp, Simon, Andrew, Jim

Demetris

  • Large Image Volume workflow
    • Neuroglancer Panel App
    • Hopefully can get merged into Neuroglancer. Requested here. Should I push for this in some other way?
  • TODO: Review/Prioritize tasks and priorities for Q2

Andrew

  • Multi-res Datatree
    • on init, it executes the DynamicMap three times before settling. No ideas yet.
    • Need a path forward for: Unrelated updates in overlay with dmap triggers computation of another dmap #6135
    • Status on using generalization of carbonplan ndpyramid without custom adaptation?
      • TODO: Andrew create a PR in ndpyramid for pyramid_downsample
    • Performance improvements are waiting on Simon's Index support PR and Maxime for fixes to subcoordinate y-range from RangeXY stream

240312 HB4N CZI R5 dev sync

Attendees: Mateusz, Demetris, Maxime, Philipp, Simon

Demetris:

Simon:

Maxime:

Mateusz:

Andrew

  • Multi-res Datatree
    • on init, it executes the DynamicMap three times before settling. No ideas yet.
    • Need a path forward for: Unrelated updates in overlay with dmap triggers computation of another dmap #6135

240305 HB4N CZI R5 dev sync

Attendees: Mateusz, Demetris, Andrew, Maxime, Philipp, Simon

Maxime:

Simon:

Mateusz:

Andrew

  • Multi-res Datatree
    • using plotsize stream
    • Philipp is investigating dmap triggering issues

Demetris

  • Provided summary of progress and plans for Bokeh blog post
  • Filed issue on Bokeh for cursor-position aware zoom tool.
  • Created MRE for y-range minimap w/subcoords issue

240227 HB4N CZI R5 dev sync

Attendees: Mateusz, Demetris, Andrew, Simon, Maxime, Philipp, Jim

Demetris

  • WIP holonote for batch processing
  • WIP Updated waveform workflow to align with hvPlot large timeseries work
  • Important for subcoordy - Adding group and label args to hover tooltips if supplied (avoids having to pass a custom tooltip)
  • TODO: file issue on Bokeh about zoom applying to the group under the cursor position

Andrew

  • Multi-res Datatree
    • miniscope workflow: the bottleneck was that the DynamicMap was triggering an expensive computation on every HLine/VLine update, so now using .persist
    • try rasterize( , streams=rangexy, plotsize=)
    • controlling the y range from the rangetool link does not work, likely conflict with the second stream
    • zoom_mul: try using the plotsize stream to get the number of pixels on screen, then find the pyramid level that works for that
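The last idea above (use the on-screen pixel count to choose a pyramid level) can be sketched as a small pure-Python helper. This is an illustration under stated assumptions: level 0 is full resolution, every further level downsamples by a constant `factor`, and the plot width in pixels is supplied by the caller (in practice it would come from a plot-size stream, as the note suggests). `pick_pyramid_level` and its parameters are hypothetical names, not part of ndpyramid or HoloViews.

```python
def pick_pyramid_level(visible_samples, plot_width_px, n_levels, factor=2):
    """Pick the coarsest level that still gives >= 1 sample per pixel.

    visible_samples: number of full-resolution samples in the current x-range
    plot_width_px:   width of the plot on screen, in pixels
    n_levels:        number of pyramid levels available (level 0 = full res)
    factor:          downsampling factor between consecutive levels (assumed)
    """
    if factor <= 1 or n_levels < 1 or plot_width_px < 1:
        raise ValueError("need factor > 1, n_levels >= 1, plot_width_px >= 1")
    level = 0
    samples = visible_samples
    # Step to a coarser level while it would still cover every pixel.
    while level < n_levels - 1 and samples / factor >= plot_width_px:
        samples /= factor
        level += 1
    return level
```

Zooming in shrinks `visible_samples`, so the helper moves toward level 0; zooming out moves it toward the coarsest level, capped at `n_levels - 1`.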

Simon:

Mateusz:

Maxime:

  • 8: Apply subcoordinate zoom by group: [HoloViews #5901](https://github.com/holoviz/holoviews/pull/6122)
    • DR left feedback
  • Add operation for group-wise normalisation #6124
    • try to avoid changing the data itself
    • Will investigate soon.
  • 12: Benchmark
    • Working on it now. questioning whether things should be locked. deprioritize locking for now.
    • Utilizing pytest
    • Pytest benchmark, defining the parameter space is easy
    • Process for comparing two different commits:
      • Define multiple environments in the same folder
      • Orchestrator that will create multiple environments
      • Merge the results
  • Handling wide dataframes in hvPlot (ML will open issue)
    • issue created?
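The final "merge the results" step of the commit-comparison process above could look roughly like the sketch below. It assumes each environment's run has already been reduced to a mapping from benchmark name to a timing in seconds; that format, and the name `compare_runs`, are assumptions for illustration and not the output format of pytest-benchmark.

```python
def compare_runs(before, after):
    """Merge two {benchmark_name: seconds} mappings into one comparison.

    Only benchmarks present in both runs are compared; a speedup above 1.0
    means the 'after' commit was faster.
    """
    merged = {}
    for name in sorted(before.keys() & after.keys()):
        merged[name] = {
            "before": before[name],
            "after": after[name],
            "speedup": before[name] / after[name],
        }
    return merged
```

Benchmarks that exist in only one of the two runs are dropped here; a real orchestrator might prefer to report them separately.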

240220 HB4N CZI R5 dev sync

Attendees: Mateusz, Demetris, Andrew, Simon, Maxime

Simon:

  • 30: visualize large data stacked timeseries (re-assigned from Maxime to Simon)
  • 2: Add Bokeh ScaleBar support in HoloViews #5948
    • Now unblocked, with 3.4 dev/alpha release. Will start work again on this soon
  • Expose max_interval and min_interval as options for Bokeh plots #6060
    • Now unblocked, with 3.4 dev/alpha release. Will start work again on this soon
  • 8: Batch selection by description HoloNote #45
    • DR will try out the code to see if it meets requirements for neuro.

Mateusz:

Maxime:

  • 8: Apply subcoordinate zoom by group: [HoloViews #5901](https://github.com/holoviz/holoviews/pull/6122)
    • Ready for feedback
  • Add operation for group-wise normalisation #6124
    • normalizing the data
    • probably should just be adjusting the y range of the subcoordinate axis instead
    • Number of lines should be the bottleneck (elements being applied to), not the length of the curves.
  • 12: Benchmark
    • Will restart work on benchmarking this week.
  • Handling wide dataframes in hvPlot (ML will open issue)
    • WIP. unclear what to write.
    • We think what happens now is that hvPlot resets the index in a few places, which makes the data quite a bit larger. So the task is to avoid resetting the index. The operation for downsampling now in HoloViews is dependent on the index being the same for many timeseries columns.
    • See issue https://github.com/holoviz/holoviews/issues/6058 for context
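The group-wise normalisation discussion above leans toward leaving the data untouched and adjusting the y-range of each subcoordinate axis instead. A minimal sketch of the range computation is shown below in plain Python; `group_ranges` and the input layout (channel name to values, channel name to group label) are hypothetical and only illustrate the idea of one shared range per group.

```python
def group_ranges(channels, groups):
    """Compute one shared (min, max) y-range per group of channels.

    channels: {channel_name: sequence of values}
    groups:   {channel_name: group_label}
    The data itself is not modified; only the ranges are derived.
    """
    ranges = {}
    for name, values in channels.items():
        label = groups[name]
        low, high = min(values), max(values)
        if label in ranges:
            # Widen the existing group range to include this channel.
            low = min(low, ranges[label][0])
            high = max(high, ranges[label][1])
        ranges[label] = (low, high)
    return ranges
```

Each channel's subcoordinate axis would then use its group's range, so channels in the same group share a scale while, say, EEG and EMG traces are scaled independently. The work loops once per channel, which fits the note that the number of lines, not curve length, should dominate the cost of applying it.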

Andrew

  • Multi-res Datatree
    • Need to ask Philipp about persist
    • Priorities discussed. DR posted to get Philipp's thoughts

Demetris

  • Project management, hours accounting
  • Providing feedback on PRs
  • Looking at holonote for batch processing
  • waveform workflow
  • public facing docs

240213 HB4N CZI R5 dev sync

Attendees: Mateusz, Demetris, Philipp, Simon

Mateusz:

Simon:

Maxime (out)

Andrew

240206 HB4N CZI R5 dev sync

Attendees: Mateusz, Demetris, Philipp, Simon, Maxime

Mateusz:

Simon:

Maxime:

  • 8: Apply subcoordinate zoom by group: HoloViews #5901
    • Will start again this week
  • 12: Benchmark
    • Started recently. Got rid of ASV in favor of pytest benchmark. Might have something this week for discussion.
  • Handling wide dataframes in hvPlot (ML will open issue)
    • Not opened yet

Andrew

240130 HB4N CZI R5 dev sync

Attendees: Mateusz, Demetris, Philipp, Simon, Maxime, Jim

Mateusz:

Simon:

Maxime:

  • 8: Apply subcoordinate zoom by group: HoloViews #5901
    • No PR yet, but with JLS suggested HoloViews op (non-default) to apply in-group norm
  • 12: Benchmark
    • no progress yet
  • Handling wide dataframes in hvPlot (ML will open issue)
    • Not opened yet

240122 HB4N CZI R5 dev sync

Attendees: Mateusz, Demetris, Philipp, Simon, Maxime, Jim

Mateusz:

Simon:

Maxime:

  • 8: Apply subcoordinate zoom by group: HoloViews #5901
  • 12: Benchmark
  • Handling wide dataframes in hvPlot (ML will open issue)

Demetris

  • Populated the POC xarray-datatree task with use-cases and datasets. @PR, Is it ready for Andrew?
  • Benchmarking materials prepared and handed over to Maxime

240116 HB4N CZI R5 dev sync

Attendees: Mateusz, Demetris, Philipp, Simon

Mateusz:

Simon:

  • 2 - Add Bokeh ScaleBar support in HoloViews #5948
    • toggle the visibility with Bokeh toolbar element? Simon has a hack
    • waiting on Mateusz for example on custom units
    • Mateusz will add example to visibility toggle (with JS) in the PR
  • 2 - Viewport specific rendering HoloViews 6017
    • Philipp will follow up
  • 8 - Batch selection by description HoloNote #45
  • Edit range by manipulating rendered annotation element

Maxime:

Philipp:

Demetris:

  • With Philipp, better specify the datatree project for andrew
  • Benchmarking.. prepare materials meet with Maxime for handover
  • add dependency field in project

240109 HB4N CZI R5 dev sync

Attendees: Mateusz, Maxime, Demetris, Philipp

Mateusz:

Maxime:

  • Zoom tools automatically vertically scaled on subcoordinate_y overlays #6051
    • JLS reviewed... ready for merge
  • will start working on applying subcoordinate zoom by group this week
  • visualize large data stacked timeseries.. Philipp started addressing.. please review/support his progress

Philipp:

  • downsampling bottleneck: 90% of the time is spent slicing the range in the viewport.. exacerbated by having many traces
    • solution: slice once... have a single wide pandas DF that you only need to slice once
    • the resampling part is cheaper, but we should merge the PR optionally using tsdownsampler directly anyway
  • 3 levels of optimizations:
    • downsample op should check if data is shared, if so slice once and reuse
    • slicing on an index is faster than on a col
    • optimize the downsampling (more of an impact for scaling up the number of curves)
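A minimal sketch of the "slice once" idea above, assuming all traces share one sorted time index in a single wide pandas DataFrame. The names `wide`, `slice_viewport`, and `decimate` are hypothetical, and the stride-based decimation is only a stand-in for a real downsampler; it is not the algorithm used by HoloViews or tsdownsample.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 50 channels sharing one time index (names are illustrative).
n_samples, n_channels = 10_000, 50
time = np.arange(n_samples) / 1000.0  # seconds
rng = np.random.default_rng(0)
wide = pd.DataFrame(
    rng.standard_normal((n_samples, n_channels)),
    index=pd.Index(time, name="time"),
    columns=[f"ch{i}" for i in range(n_channels)],
)

def slice_viewport(df, x_start, x_end):
    """Slice every channel at once using the shared, sorted index."""
    return df.loc[x_start:x_end]

def decimate(df, max_points):
    """Stand-in for a real downsampler: keep every k-th row."""
    step = max(1, int(np.ceil(len(df) / max_points)))
    return df.iloc[::step]

view = slice_viewport(wide, 2.0, 3.0)   # one slice covers all 50 traces
small = decimate(view, max_points=200)  # each column is then one trace
```

The viewport slice is done once on the index rather than once per trace on a column, which is the saving described in the first two optimization levels.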

Demetris

  • hard bounds PR
  • create a large image workflow
  • public facing resources
  • PM

231219 HB4N CZI R5 dev sync

Attendees: Mateusz, Simon, Maxime, Demetris, Philipp

Mateusz:

Simon:

  • 4 - Fix invert axis for image stack viewer bottom view HoloViews #5801
    • no update. Hopefully will address this month
    • deprioritized for now
  • 8 - Add Bokeh ScaleBar support in HoloViews #5948
    • toggle the visibility
    • waiting on Mateusz for example on custom units
    • Scope: Chart(?) elements, element2d, RGB/Image types, Timeseries/subcoordinate_y
    • shooting for Q4 23
  • Viewport specific rendering HoloViews 6017
    • shooting for Q4 23
  • 24 (16 this quarter, 8 next quarter) - Chunk-aware minimap HoloViews #5953
    • Not going to happen by EOY.. punt to Q1 2024.

Maxime:

Demetris:

  • Met with Minian group; they submitted a CZI grant to help fund maintenance of their package. This would unblock us, but would happen in the summer 2024 at the earliest. Proceeding with more general imaging workflow tasks.
  • Just refreshed the image-stack workflow
  • Got the bokeh trace level logging to work with our benchmarking system, next step is to incorporate a full workflow.
  • Will now focus on improving project management and prioritizing goals for Q1
  • Will also start creating public facing resources based on what we have so far

231212 HB4N CZI R5 dev sync

Attendees: Mateusz, Simon, Maxime, Demetris, Philipp

Maxime:

  • 4 - Using rasterize on the source plot breaks the RangeToolLink: HoloViews #5908
    • ready to merge PR? YES merged
  • 16 - Apply subcoordinate zoom by group: HoloViews #5901
  • 8 - Automatically add subplot-configured zoom tools when using subcoordinate_y: [HoloViews #5902](https://github.com/holoviz/holoviews/issues/5902).
    • Maxime will test Bokeh implementation and send Mateusz if buggy
  • 24 - Benchmark (and profile) HoloViz workflows (12 this quarter, 12 next)
    • DR posted details in log render count PR.. If I don't get a response today, I'll start with the Trace approach

Simon:

  • 4 - Fix invert axis for image stack viewer bottom view HoloViews #5801
    • no update. Hopefully will address this month
  • 8 - Add Bokeh ScaleBar support in HoloViews #5948
    • https://github.com/holoviz/holoviews/pull/6002
    • .opts(scalebar=True, scale_opts={"background_fill_alpha": 1})
    • SH met with DR to talk about scalebars for subcoordy plots
    • this interacts with group-wise stacked timeseries
    • There is already a mechanism for user to define their own unit and the order of magnitudes needed
    • Mateusz will create an example for a user to set their own unit
    • MP has PR (WIP) for toolbar for any UI component, might be a useful implementation rather than a panel widget.
  • 24 (16 this quarter, 8 next quarter) - Chunk-aware minimap HoloViews #5953
    • Not going to happen by EOY.. punt to Q1 2024.
  • Viewport specific rendering HoloViews 6017
    • requires RangeUpdate events fixed in Bokeh

Mateusz:

231205 HB4N CZI R5 dev sync

Attendees: Mateusz, Simon, Maxime, Demetris, Philipp

  • Jim would like the minimap to act like a slider - more constrained, not zoomable, etc. This is currently possible, as shown by our multitimeseries workflow, but might need better API/default.
  • Minimap could incorporate box select as way to update the range.
    • DR will try to see if this is already possible

Maxime:

  • 4 - Using rasterize on the source plot breaks the RangeToolLink: HoloViews #5908

    • DR will test tomorrow
    • SH will review and merge later this week
  • 16 - Apply subcoordinate zoom by group: HoloViews #5901

  • 8 - Automatically add subplot-configured zoom tools when using subcoordinate_y: [HoloViews #5902](https://github.com/holoviz/holoviews/issues/5902)

  • 24 - Benchmark (and profile) HoloViz workflows (12 this quarter, 12 next)

    • no progress (waiting on Demetris and the benchmark messaging PR in Bokeh)

Simon:

Mateusz:

231128 HB4N CZI R5 dev sync

Attendees: Mateusz, Simon, Maxime, Demetris, Philipp

Simon:

  • 4 - Plot synchronized annotations across a HoloViews container HoloNote #14
  • merged
  • 4 - Fix invert axis for image stack viewer bottom view HoloViews #5801
  • work in progress
  • 8 - Add Bokeh ScaleBar support in HoloViews #5948
  • Simon will create a WIP PR
  • 24 (16 this quarter, 8 next quarter) - Chunk-aware minimap HoloViews #5953
  • Not going to happen by EOY.. punt to Q1 2024.

Maxime:

  • 4 - Using rasterize on the source plot breaks the RangeToolLink: HoloViews #5908
  • 16 - Apply subcoordinate zoom by group: HoloViews #5901
  • 8 - Automatically add subplot-configured zoom tools when using subcoordinate_y: [HoloViews #5902](https://github.com/holoviz/holoviews/issues/5902)
  • 24 - Benchmark (and profile) HoloViz workflows (12 this quarter, 12 next)
    • no progress (waiting on Demetris and the benchmark messaging PR in Bokeh)

Mateusz:

231114 HB4N CZI R5 dev sync

Attendees: Mateusz, Simon, Maxime, Demetris, Philipp, Jim (part)

  • Minian chat with Phil, Denise, Daniel, us might happen in early Dec. Trying to get package in healthier state.

Simon:

  • 4 - Plot synchronized annotations across a HoloViews container HoloNote #14
    • Might be already implemented (but buggy).
    • TODO: Simon chat w Demetris
  • 4 - Fix invert axis for image stack viewer bottom view HoloViews #5801
    • Simon will implement specifying shared axes
  • 8 - Add Bokeh ScaleBar support in HoloViews #5948
    • No progress yet.
    • TODO: Simon will chat with Philipp about implementation.
  • 24 (16 this quarter, 8 next quarter) - Chunk-aware minimap HoloViews #5953
    • no progress yet

Maxime:

  • 4 - Using rasterize on the source plot breaks the RangeToolLink: HoloViews #5908
    • Will attempt this week
  • 16 - Apply subcoordinate zoom by group: HoloViews #5901
    • no progress
  • 8 - Automatically add subplot-configured zoom tools when using subcoordinate_y: [HoloViews #5902](https://github.com/holoviz/holoviews/issues/5902)
    • no progress
  • 24 - Benchmark (and profile) HoloViz workflows (12 this quarter, 12 next)
    • no progress (waiting on Demetris and the benchmark messaging PR in Bokeh)

Mateusz:

231031 HB4N CZI R5 dev sync

Attendees: Philipp, Demetris, Maxime, Ian, Simon

Ian (departing):

Simon: Est Hours - Tasks for this Quarter:

Maxime: Est Hours - Tasks for this Quarter:

Demetris

  • Project planning for year 2
  • CZI R6 drafting and collaborators
  • cleaning up multitimeseries and imagestack workflows for maxime

230926 HB4N CZI dev sync

Attendees: Jim, Demetris, Ian, Simon, Mateusz, Maxime

DR: TODO - figure out when we can start billing for the second CZI year. Check with numfocus if we can bill in the interim.

Ian:

  • Extending datashader functionality to take in 2D xarray with a shared dimension and using bokeh to plot stacked timeseries
  • Ian to open a WIP PR that DR can cite in the report.

Simon:

  • Vectorized annotations in holonote. The particular work inside holonote will be billed to CZI and he has a PR. DR link to it in the report. - The vectorized annotations work in HoloViews is not to be billed to CZI.
  • reviewed Maxime's PR on subcoordinate_y. Next up:
  • styling
  • UI elements (for eeg viewer). Very likely won't have time for it this week.
  • annotating for shared kdim is later, but on the roadmap.

Mateusz:

  • Scale bar PR is finished. Ian reviewed. Mateusz needs to address the review.. mostly improving docstrings. WILL BE MERGED THIS WEEK 👍
  • Subplot zoom.. still working on testing.. but should be fine to be merged this week.
  • box limits PR has been merged

Maxime:

  • merged PR about Rangelinking to an overlay.
  • finished initial subcoordinate_y PR. DR just found two bugs.. one, which Maxime has already fixed, about being able to use annotations on the subplot overlay.
  • The other is more complex and impacts being able to use rasterize on the minimap. It currently breaks. This touches DynamicMaps and is not easy to debug.
    • DR will file issue on this rasterize issue so we can go ahead and merge the subcoordinate_y. (Update: issue created)

230919 HB4N CZI dev sync

Maxime

  • separated the link to overlay into new PR and it has been merged
  • working on finishing the subcoord PR
    • fixed being able to display a single element that had subcoord_y set
    • When using the boolean with subcoordinate_y=True, the axes that are created are linear and centered on incrementing integers, e.g. the first subplot coord is centered on 0 and the next is centered on 1. So when you create the minimap, it has to match this axis, e.g. if you have 9 channels, the y axes will go from 0-8 (I think).
    • The other mode is to provide a range that has to be between 0 and 1. The axis you get from this is not the same as when using the boolean approach, but this should be fine for now.
    • The auto labeling wasn't really working. After a discussion with Demetris, we decided to require a label for now.
    • At a later stage we will also address the group of the subplots.. which should allow for interactive scaling per group. (DR added FR)
    • Enabling subcoordinate_y should probably automatically add subplot-configured zoom tools (DR added FR)
    • DR reviewed docs
    • Simon will code review
    • TODO: merge PR and then follow up with another that adds the subcoordinate zooming work that mateusz is working on.
  • Demetris will provide Maxime with a couple of sentences for a blogpost about CZI work.
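A rough sketch of the boolean-mode layout described in Maxime's notes above, where trace i is centered on integer i. The band height of 1 around each center and the helper names are assumptions for illustration only, not HoloViews internals.

```python
def boolean_subcoordinates(n_channels):
    """Assumed layout for subcoordinate_y=True: trace i is centered on integer i."""
    centers = list(range(n_channels))
    # Assumption: each trace gets a band of height 1 around its center.
    bands = [(c - 0.5, c + 0.5) for c in centers]
    return centers, bands

def minimap_y_range(n_channels):
    """y-range a minimap would need in order to cover every band."""
    _, bands = boolean_subcoordinates(n_channels)
    return bands[0][0], bands[-1][1]

centers, bands = boolean_subcoordinates(9)
print(centers[0], centers[-1])  # 0 8
print(minimap_y_range(9))       # (-0.5, 8.5)
```

Under these assumptions, 9 channels have centers running from 0 to 8, matching the note above; whether the axis itself ends exactly at the outer centers or at the band edges would need to be checked against the PR.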

Mateusz

  • subcoord zoom PR https://github.com/bokeh/bokeh/pull/13345
    • Recommend setting the renderers explicitly because you have full control. When using auto renderers you have a mix of frame ranges and renderers and it gets complex in its behavior.
    • Ian and Demetris have now reviewed. should be merged soon
    • Mateusz doesn't really like the current level integer API. He thinks it might be done better with a list of ranges. Might still try to do this as a fully explicit mode which avoids all the complexity currently involved in determining what should be zoomed. To be done in the future.
  • box limits PR https://github.com/bokeh/bokeh/pull/13365
    • box annotation used as the overlay for the rangetool
    • "Just needs tests then should be done"
    • Ian did a quick review. Now merged
  • scale bar PR needs to be merged asap https://github.com/bokeh/bokeh/pull/13319
    • geographic projections and 2D will be in a future PR
    • "should be ready tomorrow for review". As of 230922, it is not ready for review. "There are some layout glitches and alignment issues I'm working on and I need to finish writing docstrings and add some more tests". The new target is Monday 25th.
  • Additional PR for minor rangetool issues is low priority for now and not very relevant to CZI because it's mostly for when bounds are not set

Simon:

  • The big new API PR has been merged. This particular PR should not be included in the interim report to CZI since more of the time spent on it was billed elsewhere. So DR will just mention the parts related to CZI work in the report.
  • Working on a dashboard app for annotating
  • Next:
    • will work on incorporating vspans and hspans into annotator (CZI billing inside holonote).
      • Mateusz said that hit testing is broken for strips
    • DR will check on billing gap to CZI
    • Plan for annotation UI elements
    • Styling the annotations
  • Next time ask about: Annotating multiple plots that share a kdim

Ian: large data handling

  • single session, one main nwb file with spikes and metadata
  • multiple probes, each with their own LFP data file
  • We now have a way with kerchunk to access the lfp files without copying it all to zarr
    • he was trying to do that so we could do multiple probes in a single reference file system
      • we can't do that right now because the original nwb lfp probe files each have different chunk sizes
      • so we'll just have one reference set per probe and manage that internally.. which is fine because there are currently small numbers of probes for the Allen datasets (and each probe has a massive number of channels). But aligning the chunking strategy is a point of feedback that we could give to the Allen folks.
  • one of the forms of data in the files is spike waveform data. Ian made a couple examples of plotting the waveforms using either the allensdk or just using h5py. Have now handed this over to Demetris. Unclear how the waveforms are being identified.. there's some integer mapping. Demetris will try to figure this out.
  • Plotting LFP data. How do we go from visualizing a small number of timeseries at once, to plotting a lot of curves? We didn't have support in datashader for xarray 2D array data, so he's working on that now. It doesn't use any of the approach for Bokeh subplots or HoloViews subcoordinate_y.. it's just mapping the rows of a 2D xarray to a normalized stacked view of timeseries.. So it's very divergent from all the other work and is just an experiment in using datashader before we start converting the different approaches.

230918 1:1 with Ian and Demetris re: Large Data Handling

  • no copy json zarr file of references single probe now works.
    • There had been two issues, first was misnaming of fields that Demetris had identified.
    • Second issue was about compression. To uncompress gzip apparently you need to specify zlib with numcodecs.
  • So now the zarr json ref file could be local and we could pull from the probe file on S3.
  • Ideally, we would go on to do a two probe zarr json ref file, essentially organizing the multiple nwb files into a single ref file, essentially stacking into a reference array and adding a new dim of 'probe' to the 'time' and 'channel' array dims.
    • However, the original NWB LFP files are using different chunking from one probe file to the other (for reasons...?) and so these cannot be combined into a single virtual array in the zarr json ref file. You cannot create a chunked dask array where the chunks vary in a dimension other than their own.
    • So either we go back to creating a full zarr copy of the data, or we live with one zarr json ref file per nwb probe file. Probably the latter, but we'd need to effectively iterate over that in our own code.. but maybe it's not so bad since a single session isn't going to have that many neuropixel probes (our test dataset has 6 probes).
    • ** But this would be a point of valuable feedback for the allen institute when we establish communication.. they should align chunks
  • Spike waveforms. Ian created a couple demos. Data is in the Session file.. so it's the main file that contains references to the LFP files.
    • The first utilizes the Allen SDK, which has the benefit of autodownloading the data you ask for.
    • The second is Ian's version that reads the NWB file, pulls out the data you want, and plots it in a simple way.
    • Ian stopped at this point because he does not understand the identification of the spike waveforms.. it has an integer unit index.. so if he plots the first spikes using the allen SDK, then they're not the first few spikes when he plots them with his approach.. so there's some mapping going on that we would need to figure out.
  • Scaling up of the Bokeh plotting of LFP data.
    • Started to use datashader for this but datashader does not yet support this. Datashader supports all sorts of interesting combinations of pandas data frames, but for xarrays, it will only plot a 1D vs 1D as a timeseries. So you'd have to pull out the 2D xarray into separate arrays.
    • So Ian is working on a PR to implement multiple timeseries from an xarray DataArray.. sharing x and plotting each value as the y. There's already a pandas implementation of this so it shouldn't be too difficult. He has it working for simple situations but there are things to do with cuda and anti-aliasing. But once this is done we would be able to pass any of the xarray/dask DataArrays via the zarr/kerchunk reference and it should plot it.
    • This approach using datashader would not, as currently being worked on, make use of the subplots in Bokeh.. we would have to apply an offset to the data.
    • We could also try to see how bad it would be performance-wise to call datashader on each individual subplot/channel.. Although this wouldn't work with lines overlapping each other.. right? so the subcoordinate ranges would need to stay in swimlanes.. which is also not ideal.
    • Ian will continue working on 2D xarray support for datashader this week and then we'll play around with the implementation and go from there.
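The "apply an offset to the data" idea above can be sketched in plain NumPy, assuming a 2D array with one channel per row. The function name `stack_traces` and the per-row peak normalization are illustrative choices, not what the datashader PR does.

```python
import numpy as np

def stack_traces(data, spacing=1.0):
    """Normalize each row of a (channels, time) array and offset it by its row index."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=1, keepdims=True)
    peak = np.abs(centered).max(axis=1, keepdims=True)
    peak[peak == 0] = 1.0  # avoid dividing flat traces by zero
    normalized = 0.5 * centered / peak  # each trace now stays within [-0.5, 0.5]
    offsets = spacing * np.arange(data.shape[0])[:, None]
    return normalized + offsets

# Hypothetical LFP-like data with very different amplitudes per channel.
rng = np.random.default_rng(1)
lfp = rng.standard_normal((4, 1000)) * np.array([[1.0], [10.0], [100.0], [0.1]])
stacked = stack_traces(lfp)
```

With `spacing=1.0` each trace stays inside its own swimlane, which is the non-overlapping case discussed above; a smaller spacing would let neighboring traces overlap.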

230905 HB4N CZI dev sync WIP.. DR will clean up notes soon

Attendees: Mateusz, Jim, Demetris, Simon, Maxime, Ian

  • Scale bar is very close to being finished, the main thing left is geographic projections
  • range tool implementing movement limits PR WIP
  • subplot zoom is WIP, still working on wheel zoom tool, other problem is the configuration/api for having a user specify which layer/tier the tool should be applied to. Thinking about having an arg to specify integer which would map to a tier of subplot nesting that this tool would apply to.
    • Global zooming is still a valid thing to have, but somewhat overlaps in functionality if using a minimap. Ideally the zoom in the Y axis would snap to channels instead of showing half traces.. probably requires categorical axes.
  • Holonote ... big rewrite last week. Changing how the annotations work internally and this week starting to work on the features.
    • Showed demo of creating linked annotations for elements that don't share a key dimension
    • Next: testing, style annotations, mock-up UI
      • MP: zero tap latency is coming to Bokeh so the distinction between single and double tap will not be useful anymore (as is currently used in holonote).. so we need to find another UI
        • idea: inline toolbar.. another set of tools appears that is only relevant to the current selection
        • idea: long press brings up a menu?
      • context menus
    • we need support for 3 types of deletion: delete single annotation, delete multiple selected annotations that may cross categories, and delete all annotations of a given category.
    • right now we have selection tools for data sources but probably what we want is some kind of object selection that allows you to perform some action on them. Then another one like right click on the particular annotation and tell it to select all of this kind or something
  • Maxime - upcoming work will be to pull out Philipp's work on using rangelinktool with an overlay and then clean up the subcoordinates PR.
  • Ian - Large data handling - NWB. coming back from 2 weeks PTO. Demetris has continued work.
    • Trying to use Zarr and Kerchunk to access original datasets and plot LFP data from NWB file. The kerchunk aspect not quite working yet.
    • upcoming tasks:
      • Take Demetris' example of plotting a chunk of this zarr data and scale that to plot all of the data
      • Look at some ideas about minimaps that we have in our original workflow examples to get that integrated with the dask, xarray, zarr, kerchunk approach.
      • Get example of working with spike data
  • Demetris took probe data from NWB file and plotted a representation of the probe alongside the timeseries data.. no linking yet.

230829 HB4N CZI dev sync

Attendees: Mateusz, Jim, Demetris, Philipp, Simon, Maxime

  • Bokeh 3.3 will be released when this round of CZI tasks is complete. Mateusz working on:
    • Scale bar PR will be ready for review by Philipp tomorrow
      • There is also a PR for axes improvements for scale indicator... but this is probably now superseded by newer scale bar PR.
    • Range tool respecting bounds PR. This works, just requires tests.
      • There is also work (no PR yet) for RangeTool UI - the RangeTool can disappear off the frame and there's no way to retrieve it. This is a regression from having made it an editable box annotation type that doesn't have bounds handling. This should also be available this week.
    • Support for zoom on subplots PR.
      • zoom in/out tools work. working on wheel zoom right now.
      • Needs some general API for making this configurable because right now if you have subcoordinates then it always uses the subcoordinates and can't be configured to zoom on the top level range coordinate system.
      • Potentially ready later this week.
    • PR for mask_data() to only include data visible in the current viewport, aligning with a related Holoviews issue.
      • Useful to have, but pointless from CZI perspective right now, so not as important as scale bar and zoom fixes. Probably remove the CZI tag on this. done.
    • Subcoordinates/subplots in Bokeh. Stabilize the API and create documentation Stacked traces. TODOs posted on the PR
    • TODO: Maxime will take over Philipp's PR
    • TODO: Mateusz/Maxime check whether the ytick misalignment for stacked traces persists in Bokeh without Categorical axis
    • TODO: Philipp check if using Categorical axis in HoloViews resolves the ytick alignment.
    • TODO: Philipp will pull out fix for link on overlay element into another PR and hand that off to Maxime.
    • Mateusz: switching to categorical axis will allow for filtering the display of subcoordinate subplots
    • API for subcoordinate_y in HoloViews. Should it be on the individual element or container level?
      • Jim and Demetris leaning towards the container level, as this is a sort of new container type. Philipp leaning towards the individual element level, but maybe we can/should implement at both levels?
      • For CZI purposes, as long as the spacing can be automated, I (Demetris) don't care that much what level it's at, as long as it's doable.
      • TODO: Philipp will do a first pass implementation of boolean/automated subcoordinate_y and then hand it off to Maxime.

Next meeting:

  • Annotations
  • First-year report
  • large data handling

230818 HB4N CZI Large-data-handling sync update from Ian

Attendees: Ian, Demetris

  • Finished most of Phase 1 (ephys ecosystem review) from large-data-handling GOAL issue. He hasn't looked back at the Pangeo stuff, but Ian is already familiar with that stuff.
  • He has got the Allen SDK installed and is playing with it a bit
    • It doesn't really work for him because the download is too slow and many assumptions therefore break (timeouts on downloads, etc.). However, before the download timeout, it shows you the URI that it's downloading from, so we can use that to get the file and rename it appropriately.
    • By default, they are getting data via HTTP so unsure where that's coming from. They also have files servable from S3, but users have to pay for that. The S3 access approach is something we'll want to do later because the Zarr stuff will work better with S3 likely, but it will just be user-pays.
  • single session NWB file (2.7 GB), contains spikes data
    • This link downloads the 2.7 GB file: ecephys_session_715093703.nwb
    • 12 GB of LFP data is dynamically downloaded as needed
    • So far he has read in data with h5py from two probes and rechunked them into a local zarr file containing a 3D array (probes, times, channels), about 2 million time samples and about 90 channels per probe. Although there are slightly different time lengths and different channel numbers across the probes. So he is having to nan-pad in the channel and time dims.
      • That's annoying that they have different time lengths.. Ian hasn't yet checked whether they are on the same clock and some start or end at a different point (best case scenario) or if they have actually different timestamps (bad).
      • If we concat the probes into a 3D array then we might have a 2D array for time coords where each probe series has a different time coord. but hopefully not.
  • Has started kerchunking the LFP data but hasn't yet figured out how to actually access the data that is nested within the hdf5 file.
  • Once we have a demo of accessing and plotting the data from zarr (ideally kerchunk-referenced to NWB, otherwise a local copy), then we should run a simple time-to-plot comparison against the allen SDK.
    • accessing from zarr will be working in parallel with dask and hopefully the defaults will already be better than what allen SDK can do.
  • UPDATE: Ian has now sent DR some draft scripts to start playing with.
  • TODO: DR Next meeting show Ian this blog post. We should benchmark these visualizations with our data access/viz approach.
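The NaN-padding into a 3D (probes, times, channels) array mentioned above can be sketched as follows. The function name is hypothetical, and padding at the end of each axis assumes the probes start on the same clock, which the notes say has not been verified yet.

```python
import numpy as np

def pad_and_stack(probes):
    """Stack 2D (time, channel) arrays into (probe, time, channel), NaN-padding the short ones."""
    n_time = max(p.shape[0] for p in probes)
    n_chan = max(p.shape[1] for p in probes)
    out = np.full((len(probes), n_time, n_chan), np.nan)
    for i, p in enumerate(probes):
        out[i, : p.shape[0], : p.shape[1]] = p
    return out

# Tiny stand-ins for two probes with different lengths and channel counts.
probe_a = np.ones((5, 3))
probe_b = np.ones((4, 2))
cube = pad_and_stack([probe_a, probe_b])
print(cube.shape)                 # (2, 5, 3)
print(int(np.isnan(cube).sum()))  # 7
```

If the probes turn out to have genuinely different timestamps, a shared time axis like this would not be enough and each probe would need its own time coordinate.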

230816 HB4N CZI dev sync

Attendees: Jim, Philipp, Ian, Simon, Demetris, Mateusz

TODOs:

  • (Mateusz) - Allow the Bokeh zoom tool to scale an earlier/intermediate range on a plot that contains subcoordinates/subplots instead of the final range. FR filed.
  • (Philipp) - Fix issue in PR causing y-tick to depart from data trace when subcoordinates_y ranges overlap.
  • (Demetris) - Update the ephys workflow to work with LFP data

Minutes:

  • Stacked traces status, review requirements
    • Goal is full delivery (Bokeh + HoloViews) by mid Sept. (on track? - Maybe, but tight)
    • Rescaling traces = zooming into individual traces without increasing the spacing between the different traces
      • We don't necessarily need another Bokeh tool, but we need to be able to configure the existing zoom tools and have them work with subcoordinates
      • Mateusz: I think this is already possible, just need to experiment and see if you can configure them with renderers and assuming that it will use the right kind of range artists configured with those subcoordinates (which, we are unsure about), then it will just zoom in on those subcoordinates. It's unclear if the zoom tools take the renderers or they just take the plots and then zoom on the default range. Unclear how they interact with subcoordinates right now.
      • TODO: Mateusz will try to see what happens if you basically attach renderers to zoom tools. He will work on this next if it doesn't currently work.
        • UPDATE: Mateusz tried this and it doesn't work as expected because it doesn't know about subcoordinates. So this is a missing feature.
        • TODO: Have the zoom tool scale an earlier/intermediate/subcoordinate/subplot range instead of the final range.
      • Custom tools from Bokeh would need to control a new scale factor in HoloViews... something like a subcoordinate_scale parameter.
    • Offsetting the traces (and aligning with the ytick)
      • With the current subcoordinates_y PR, the yticks don't travel at all when panning (when yranges are set to overlap)
      • Mateusz: this is probably an issue with the PR because this works as expected with his Bokeh version.
      • Philipp: we could scale the offset of the target space so that the starting sample is centered. That is a computation that HoloViews/Panel could automatically handle. We could manually compute the y_ranges and then upstream it into HoloViz.
      • UPDATE: Demetris decided to drop the alignment of the left-most data point for now, as LFP and EEG data generally should have slow drift removed prior to plotting, so the y-tick should just be positioned close to (median?) the trace in order to be useful. However, there is still an issue with the subcoordinates_y PR preventing reasonable alignment of the ytick when panning with overlapping ranges. TODO for Philipp.
  • Annotation status, review requirements
    • Goal is full delivery by mid Sept. (on track? - Maybe)
    • Simon did a quick walkthrough of a notebook that uses holonote on the EEG viewer workflow. Still plenty of work to do.
      • right now, need to define an empty holoviews element to start building the annotator instance.
      • category/description colors don't work yet
      • there's an open vectorized spans PR
    • Simon will recreate MNE annotator functionality with holonote and see what gaps there are to start addressing them
  • Large data handling
    • Goal is MVP with ephys data by mid Sept. (on track? - Maybe)
    • DR prepared list of things for Ian to work through and become familiarized with
    • Ian is now getting familiarized with NWB ephys format and working to daskify the visualization and handling of it.
    • Also looking at use of zarr files for either local or remote access in a sensibly chunked way. We'll create some examples and then plug into real workflows.
    • The NWB file (~3GB) that Ian is working with doesn't contain the full band raw data, but it contains the spikes, and it contains references to remote LFP (low-frequency, continuous) data (which can get downloaded and stored externally as needed - about 12 GB). This is a good use case for us because we could use local or remote zarr to pull down the bits we want.
    • TODO: DR will update the ephys viewer workflow to explicitly display LFP (rather than full band data).
  • Benchmarking status
    • Ready to merge the PRs?
    • What will be utilized for the 1-year report? - A large multi-timeseries display time/interaction improvement
    • Sidelined for now.. will revisit as we make progress with Large data handling
  • Multiple Minimaps?
    • Philipp: I see no particular reason why we shouldn't do the same as MNE and try for a single minimap that controls multiple data-types in the same target plot.
    • Mateusz: also note that before canvas layouts are done, you can't share annotations between different plots (argument against having multiple plots/minimaps).
  • Imaging - DR is in the process of setting up a meeting with the original Minian developer
  • Scale bar status, review requirements
    • Goal is full delivery by mid Sept. (on track?). Ran out of time. Will get a status update offline.

230802 HB4N CZI dev sync

Attendees: Jim, Philipp, Mateusz, Demetris (DR), Ian

prior TODOs:

  • (in progress) Mateusz complete PR for independent subaxes in Bokeh
  • (next) Mateusz complete PR for scale bar in Bokeh
  • Ian/DR get benchmarking working on DR's computer
  • (in progress) Ian make better benchmark tests using some of the example workflow content, parameterized on data size and Bokeh output backend (2D canvas vs WebGL).
  • (in progress) Ian add additional timings (like zoom/pan)
  • (later) Ian have playwright take a screenshot at the end of the test
  • (in progress) DR try to create a pure Bokeh example showing unintuitive behavior of zoom out stopping when hitting a single bound, file issue
  • (in progress) DR file issue about RangeTool allowing pan beyond hard bounds of linked plot
  • (in progress) DR write up requirements and suggestion for a better rangetool API

Minutes:

Benchmarking:

  • Ian completed the preliminary benchmarking PR, which is now merged with a pure Bokeh benchmark; DR and IT can run it and reproduce similar results
  • DR helped debug and produced starter code for a holoviews+panel benchmark
  • DR tried benchmarking a panel app .servable() from an .ipynb.. won't continue with this for now
  • Ian is currently working on extending the benchmarking framework to cover:
    • First, more realistic use cases (panel + holoviews)
    • Second, timing an interactive zoom event
    • Both things above currently work when run once. When run multiple times, there are problems with setup, timing, and teardown: we get multiple timings for a single setup. So either asv is doing something incorrect, we are misunderstanding something, or maybe the word 'setup' is being used inappropriately somewhere.
    • Also, the current holoviews benchmark creates a new hv.Curve when we press a button.. so it's a replacement of a bokeh plot.
      • Depending on whether it's done with a hv.DynamicMap, it might be just a replacement of the underlying columndatasource or a full replacement of the plot.. so we might have to plan the benchmark to handle both.
      • Ian noted that the current replacement produces a new Bokeh ID. Philipp doesn't think this is a sensible thing to benchmark. We don't want to be optimizing/supporting this approach.
      • TODO: Ian ping Philipp if feedback/guidance is needed for the approach to benchmark 'latency to first display' for a HoloViews plot
  • Mateusz was doing some work with selectors and hit testing; things were slow, so he tried to benchmark. He found that hit testing was a fraction of a percent of the entire update, while canvas rendering was about 98% of the duration. The point is that it may be hard to improve latency to display if we stay with canvas rendering. Ian noted that we are accounting for this because our current testing includes both canvas and WebGL, but for a one-shot plot WebGL doesn't show any improvement. Maybe we will get to a point where WebGL is faster.
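
A minimal sketch of the setup/timing/teardown structure discussed above, assuming asv's documented class convention (`params`, `param_names`, `setup`, `teardown`, and `time_*` methods). The class name, parameter values, and the placeholder workload are hypothetical and are not taken from the benchmarking PR. Per the asv docs, `setup` is not re-run between the `number` iterations of a single sample, which may be one reason for seeing multiple timings per setup.

```python
import math


class TimeseriesRenderSuite:
    """Hypothetical asv-style benchmark, parameterized on data size and backend."""

    # asv runs each time_* method once per combination of these parameters.
    params = ([1_000, 10_000, 100_000], ["canvas", "webgl"])
    param_names = ["n_points", "output_backend"]

    def setup(self, n_points, output_backend):
        # Pre-generate the data so that data creation is excluded from the timing.
        self.x = list(range(n_points))
        self.y = [math.sin(i / 50.0) for i in self.x]
        self.rendered = None

    def time_render(self, n_points, output_backend):
        # Placeholder workload (assumption): a real benchmark would drive a
        # browser, e.g. via playwright, and wait for the render to finish.
        self.rendered = (output_backend, len(self.x), sum(self.y))

    def teardown(self, n_points, output_backend):
        # Release the data so that repeated runs start from a clean state.
        self.x = self.y = self.rendered = None
```

Keeping data generation in `setup` means only the body of `time_render` contributes to the reported time.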

WebGL Image:

Sizing issues:

  • DR worked on responsive sizing of eeg viewer #64
  • DR filed issue with Panel: HoloViews/Bokeh aspect not being respected, and different behavior with fast template #5343

Data:

  • DR added EEG sim blink artifacts and channel correlations #68
  • DR Added some real EEG data into the workflow, working on imaging data

Minimap/RangeTool:

  • DR filed issue: Range tool is variably and incorrectly present on the target plot in Panel app #5315

Large data handling

  • DR will set meeting with Ian to discuss large data handling approach for imaging work soon

Subaxes/Canvas Layouts

  • Mateusz making progress on canvas layouts allowing for subplot
  • can now render multiple plots (as grid plot) on the same canvas
  • a new addition is that we can span some annotations across plots
  • there is still A LOT of work to do here, as there are many assumptions about a plot being a single canvas + cartesian frame that are no longer true.
  • still don't have a proper rendering pipeline of well-defined stages (update geometry, paint, etc.), so when coordinates change we don't know what gets updated or when. So far this was handled by reactive properties of changing events, but this breaks for single changes because of a dependence on event order.
  • Philipp: a lot of this work is about multiple plots on one canvas, while the main objective here is being multiple subaxes on one plot, right?
  • Mateusz: Depends on how you view it.. if you want multiple subaxes, you technically have multiple subplots. The point is about being able to coordinate coordinates between the different plots, especially as you start wanting to have an annotation spanning across them.
  • DR may create a pure bokeh example of the eeg workflow to assist Mateusz's efforts
  • DR mentioned the current approach requires plotting lines with data offset by Y, but then creating a custom hover tool referencing the original y values so that the hover tooltips are accurate and not reflecting the offset values.
  • Mateusz mentioned the existence of a subplot function which was part of the original subcoordinate work that is not documented but presented in the open ridgeplot example PR. The caveat is that you can't have a line between two coordinate systems. You can't have multiple coordinate systems used by a single annotation. So you can't span things automatically... you'd have to revert back to working with the frame (source) coordinate system, which is usually not that meaningful. Scales are not used in this API right now. Also you can't have an axis with this, so on the left/y-axis there will be no ticks for the inner subplots, but that's fine as the hover is there to reveal the values.
  • After the meeting, Philipp exposed .subplot as subcoordinate_y in a prototype PR
  • It's been requested by other projects as well to have multiple plots, aligned with x axis time, and annotations across them
  • how would a scale bar work with this approach? (DR is confused about this point.. didn't follow the convo)
  • how to handle ongoing canvas/subcoordinates work with CZI billing?
    • let's work on the .subplot approach for a few days and see how far we can get with it and then see how to proceed with the canvas/subcoordinates work.
    • there's a risk of pulling Mateusz off of this because he is making progress on it now and coming back to this later might be difficult.
    • Mateusz has to pause anyway for a couple days to finish scale bar work
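
The offset-plus-custom-hover approach DR described above can be sketched in plain Python as follows. This is only an illustration of the data preparation; the function name, column names (`y_plot`, `y_orig`), and example channel values are hypothetical.

```python
def stack_channels(channels, spacing):
    """Build plotting columns for vertically stacked traces.

    channels: dict mapping channel name -> list of samples.
    spacing: vertical distance between adjacent traces (data units).
    Returns a dict mapping channel name -> columns, where 'y_plot' is the
    offset trace to draw and 'y_orig' keeps the true values for hover.
    """
    stacked = {}
    for index, (name, samples) in enumerate(channels.items()):
        offset = index * spacing
        stacked[name] = {
            "x": list(range(len(samples))),
            "y_plot": [value + offset for value in samples],
            "y_orig": list(samples),
            "offset": offset,
        }
    return stacked


channels = {"Fz": [0.0, 1.0, -1.0], "Cz": [0.5, 0.0, -0.5]}
columns = stack_channels(channels, spacing=10.0)
print(columns["Cz"]["y_plot"])  # offset values used for drawing
print(columns["Cz"]["y_orig"])  # original values to show on hover
```

In Bokeh, each of these column dicts could back a ColumnDataSource, with the line drawn from `y_plot` and the hover tooltip referencing `@y_orig`. A subcoordinate-based approach would make this bookkeeping unnecessary.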

Scale bar

  • Mateusz will work on this now

Other

  • DR tried to build the Minian package for osx_arm64.. learning a lot but unsuccessful so far.. packaging is complicated :(

230719 HB4N CZI dev sync

  • Attendees: Ian, Demetris, Jim, Philipp, Simon, Mateusz
  • Ian
    • Benchmarking PR
    • TODO: IT/DR get benchmarking working on DR's computer
    • TODO: IT make better benchmark tests using some of the example workflow content, parameterized on data size and Bokeh output backend (2D canvas vs WebGL).
    • TODO: IT add additional timings
      • right now timing of time taken to serialize and deserialize pre-generated data and render a single frame
      • we could also time just the data transfer or just the first frame
      • interactions (zoom/pan)
    • What about testing with Jupyter?
      • Philipp: The complication is probably not worth it (right now)... complicated to keep such a CI working, etc. Philipp does do full Jupyter testing using playwright in Panel, but it's a pain. Regardless, the comms between app serving and jupyter are pretty similar (web socket); some differences with binary serialization (but this is probably not a bottleneck).
      • Jim suggested using nbconvert (i.e. execute nbconvert from command line with subprocess and give it the execute preprocessor, etc) to show that there is no error / sanity check. But asv benchmarking is primarily about profiling.
      • compromise: TODO: IT have playwright take a screenshot at the end of the test (to log that we are benchmarking a successful render)
      • Mateusz, Demetris: agree, Jupyter testing likely not necessary right now
      • Conclusion: If specific concerns about Jupyter arise, let's log them and come back to it during a subsequent round of benchmarking efforts.
    • ASV has to be run from an installable source repo, so Ian added a pyproject.toml to the root (just for local install, not for release)
    • We can set up a github.io page for persistence of benchmark runs and auto-push to it. Some of us would run these periodically on our local machines (not CI, it's unreliable) and then push to aggregate the results with machine-source metadata.
    • ASV's GUI is poor but works and we should avoid the temptation to Panel-ise it. :)
    • Most people care about benchmarking the history of a repo over released versions, but we care more about benchmarking various approaches
    • For testing, we could use tags and every month we have a shared fully pinned environment that we collectively test.
    • Mateusz: if you saturate the rendering pipeline then hover will never work. There's no merging of events in a queue and no multithreading; we just wait until something is done before doing the next thing. [Discussion about web workers and multithreading and implementing the comm protocol between the main thread and workers. Web workers have to have their own bundles, so bokehjs would need big changes. Render to an offscreen canvas in a worker, then transfer the bitmap buffer over to the main thread and compose on the final canvas.] Mateusz thinks the benchmarking will lead to this because it will be clear that things like hover will be unusable as data scales and things take longer to compute. Maybe a future grant proposal.
  • Mateusz:
    • [regarding independent subaxes and scale bar in Bokeh] - making progress and will have something to show early next week.. TODO: send out a request for feedback when PR is ready
  • Simon:
  • Demetris
    • benchmarking with Ian
    • wrote requirement notes for annotations
      • sync with Simon about them when he is back
    • providing updates for collaborators and investigating other viz approaches
    • will be incorporating real data into workflows
    • added initial bound ranges for rangetoollink (minimap)
      • TODO: DR write up requirements and suggestion for a better RangeTool API - discuss at next meeting
    • We received feedback from collaborator not liking being able to pan/zoom outside of data range
      • Since 2016, there has been discussion of data-range-restricted interactivity being the default, but it hasn't been implemented https://github.com/holoviz/holoviews/issues/1019
      • Quirk: You can still drag the RangeTool beyond the bounds of the linked plot.
        • Mateusz: probably there is a difference between interactive behavior and non-interactive
        • Philipp: we need to clip at the bounds as well (?). API for hard bounds is the main issue.
        • TODO: DR create issue and assign to Philipp and Simon
    • Unexpected behavior: zooming out stops whenever it hits a single bound, rather than continuing to zoom out in the other directions.
      • TODO: DR try to create a pure Bokeh example showing what the unintuitive behavior is, then assign Mateusz
      • Mateusz: it does entire range at once so it will not work (as is).
  • Jim:
    • interacted with fastplotlib authors (Kushal, Caitlin) at SciPy
      • their demos were entirely in Jupyter - they had a remote machine rendering into a frame buffer that's embedded in their Jupyter notebook.
      • workflow targets are pretty similar to ours
      • they have no text rendering or axes yet (this is a huge lift). They are banking on pygfx having axes soon and they would just inherit that.
      • helpful to have as a demonstration of raw power and just to compare how quickly you can put pixels on screen
        • wouldn't be bad for us to be within X% of theirs, as long as we are meeting the user requirements
        • we could massively speed up Bokeh (e.g. by dropping axes or scale) but we would lose generality
      • they are working with Almar Klein (from VisPy, WebGL in Bokeh)
      • Almar is now working on pygfx (based on newer tech like webGPU, Vulkan, etc)
      • Is there a use case for blitting an image or timeseries onto the screen and updating this fast? Is it worth the effort? Is it a big enough use case?
        • if yes, maybe a follow-up funding opportunity - all the benefits of fastplotlib but with a truly optimized Bokeh with WebGPU (would take many years)
      • we don't need to be the fastest, just fast enough for what it's used for.. center things on user need.
    • napari
      • Being used alongside Jupyter.. so running a Jupyter-based workflow but then the rendering is happening in a separate window.
      • workflow targets are pretty similar to our imaging workflows
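
Jim's nbconvert suggestion from the Jupyter-testing discussion above (run nbconvert from the command line via subprocess as an error/sanity check) could look roughly like this. It is a sketch only: the function names and paths are hypothetical, and it assumes the `jupyter nbconvert` CLI with its `--to`, `--execute`, and `--output-dir` options is installed.

```python
import subprocess


def nbconvert_command(notebook_path, output_dir):
    """Return the command that executes a notebook via the nbconvert CLI."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",         # write the executed notebook back out
        "--execute",                # run every cell before converting
        "--output-dir", output_dir,
        notebook_path,
    ]


def notebook_runs_cleanly(notebook_path, output_dir):
    """Sanity check: True if nbconvert executed the notebook without error."""
    result = subprocess.run(
        nbconvert_command(notebook_path, output_dir),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

This only checks that the notebook runs without raising; it says nothing about rendering time, which is why it complements rather than replaces the asv benchmarks.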

230706 HB4N CZI dev sync

  • Attendees: Demetris, Philipp, Ian, Mateusz, Simon
  • Reminder about goals for the grant
  • Discussion about timeline and deliverables for year 1
  • Simon will downstream Mateusz's upcoming work in Bokeh on scalebar and independent subaxes to HoloViews
  • Simon will work on 1D/2D annotations in the video-viewer and eeg-viewer workflows
  • Benchmarking
    • Ian has a Bokeh branch that writes to the console before and after a render so that playwright can capture time to render
    • Next milestone will be to get a testbench running with airspeed velocity
    • writing to the console more than a few times will degrade performance and cause variability, but just a couple writes shouldn't impact
      • We need the neuro repo to be pip installable in order to work with ASV
  • Scale bar (in Bokeh)
    • Mateusz: maybe 2 days of work
  • Independent subaxes: Mateusz's focus for July
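
As an illustration of the console-marker timing idea above, here is a minimal sketch that pairs "before render" and "after render" console messages and returns the elapsed time. The marker strings and the message tuples are hypothetical; in practice the messages might be collected with a playwright console handler (`page.on("console", ...)`).

```python
def render_durations(console_messages,
                     start_marker="render:start",
                     end_marker="render:end"):
    """Pair start/end console markers and return elapsed times in ms.

    console_messages: iterable of (timestamp_ms, text) tuples in arrival order.
    Messages that are not markers are ignored.
    """
    durations = []
    started_at = None
    for timestamp_ms, text in console_messages:
        if text == start_marker:
            started_at = timestamp_ms
        elif text == end_marker and started_at is not None:
            durations.append(timestamp_ms - started_at)
            started_at = None
    return durations


messages = [
    (100.0, "render:start"),
    (101.5, "unrelated log line"),
    (142.0, "render:end"),
    (300.0, "render:start"),
    (325.0, "render:end"),
]
print(render_durations(messages))
```

Only two writes per render are needed, which fits the note above that a couple of console writes shouldn't affect performance.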

230609 HB4N CZI dev sync

  • Attendees: Jim, Demetris, Philipp, Jean-Luc, Ian, Mateusz, Victoria
  • Discussed grant proposal goals, project phases, repo structure
  • Discussed current in-progress generalized workflows
  • Discussed project board, 'GOAL:' tasks and their related tags
  • Discussed HB4N-CZI Grant Hours Accounting spreadsheet
    • ⭐ TODO: Ian, Mateusz, Jean-Luc, Philipp, Jim please provide Demetris with feedback on est hours and task lead assignments
  • Demetris' next task is API iteration
    • ⭐ TODO: DR send Simon task for next API check
  • Philipp: live capture?
    • Roadmap includes streaming data as a future direction (so we'll see if we get to it), but maybe a more generalized demonstration of streaming display that is somewhat recording-device agnostic.
  • Philipp: unpack Large Data Handling Task?
    • Ian is lead because he is a core dev on both Bokeh and Datashader
    • Jim: We go into this without any strict preconception, just with the scenario - what we're trying to optimize - and look at different approaches: Datashader, WebGL, Bokeh, LTTB. Probably some of these are good for certain situations, but the goal is for Ian to identify what those cases are and probably at some point get Mateusz to do fundamental data/array handling which would help multiple approaches. Benchmark and then sort out what to do on a case-by-case basis.
    • Mateusz: How large is large?
      • Different for each modality. Estimate: Ca-imaging ~50 GB; EEG ~0.5 GB; Ephys ~200 GB
    • Philipp: At what point do we take over the data handling for the different specialized use cases? Are we talking about custom readers, data formats, etc?
      • The separation of Generalized vs Specialized workflows is partly so that we can focus on the core task in our wheelhouse at first - visualizing data in familiar in memory forms. The idea is that once we solve the more generalized setup for the different modalities, then this work would selectively extend into more specialized use cases as needed. For instance, with MNE, we could potentially target a new backend for their raw.plot() that utilizes their existing data format and machinery as much as possible, while benefiting from any performance and API improvements we've made working directly from numpy arrays.
    • Philipp: To really address some of the more difficult use-cases (ephys), we need to consider streaming from an on-disk format sooner than later.
      • Jim: Topic for maybe next year: hierarchical data storage formats (xarray datatree) that store downsampled versions and allow for tiered (drill-down/up) access as you zoom. But for this year let's focus on existing file formats and improvements that we can make with existing infra/formats.
      • Demetris: also depends on modality... Minian pipeline (ca-imaging) uses xarray/zarr/dask storage/reading which would be more amenable to such hierarchical approaches and minor changes, but changing ephys/eeg disk formats and getting community adoption is infeasible.
  • Jean-Luc: How do we know what are appropriately generalized workflows?
    • Demetris: drawing on our intuition and the cross-section of what would be useful to our collaborators, the wider community, HoloViz, the CZI grant, and potential future collaborators.
  • Jean-Luc: Are there going to be rounds of feedback from the community?
    • Demetris: By around November of this year, I hope we have developed a set of viable workflows that our collaborators can begin to utilize. It's completely acceptable if there are certain limitations and if we need to acknowledge and apologize for some aspects of these workflows. However, I'd like us to establish a specific list of future plans and understand our constraints at that time. Essentially, we will be presenting ourselves to our collaborators in the fourth quarter of this year, welcoming their feedback. Simultaneously, we'll begin publicizing our efforts to the wider community.
      • ⭐ TODO: DR strategize publicizing our efforts to the wider community
  • Philipp: To what extent do we overlap or reimplement napari?
    • Demetris/Jim: napari is a Python application for microscopy. The main differentiation lies in the intended usage. While napari is seen as a comprehensive suite for users, our project aims to create modular tools that integrate into people's existing workflows to address a specific issue and then let them move on with their analysis. Despite these differences, the team expressed interest in cooperating with napari, acknowledging potential areas for coordination and integration to maximize user benefit.
  • Jean-Luc: other relevant software?
    • Available lists of software in different modalities are in the GitHub Wiki
  • Jean-Luc: How flexible is roadmap?
    • The specific roadmap dates on the Project Board are just suggestions.. the important aspect is that we have a first pass at as many of the identified goals as possible before the end of Q4 2023
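
To make Jim's idea of storing downsampled versions for tiered (drill-down/up) access more concrete, here is a small in-memory sketch. It is not tied to xarray datatree or any on-disk format; the function names, the min/max pooling choice, and the level-selection rule are all assumptions for illustration.

```python
def build_pyramid(samples, factor=4, min_length=8):
    """Return a list of levels; level 0 holds the raw data as (min, max) pairs.

    Each coarser level stores (min, max) per block of `factor` entries so
    that the signal envelope survives downsampling. `factor` must be >= 2.
    """
    levels = [[(value, value) for value in samples]]
    while len(levels[-1]) > min_length:
        previous = levels[-1]
        coarser = []
        for start in range(0, len(previous), factor):
            block = previous[start:start + factor]
            coarser.append((min(lo for lo, _ in block),
                            max(hi for _, hi in block)))
        levels.append(coarser)
    return levels


def pick_level(levels, visible_fraction, pixel_width):
    """Choose the coarsest level that still gives >= one entry per pixel."""
    for level in reversed(levels):
        if len(level) * visible_fraction >= pixel_width:
            return level
    return levels[0]


levels = build_pyramid(list(range(1024)))
zoomed_out = pick_level(levels, visible_fraction=1.0, pixel_width=200)
zoomed_in = pick_level(levels, visible_fraction=0.1, pixel_width=100)
print(len(zoomed_out), len(zoomed_in))
```

Zooming out reads from a coarse level, while zooming in falls back toward the raw data, which is the tiered access pattern described above.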

230607 Cai Lab sync

  • Attendees: Demetris, Joe, Austin
  • Purpose: Sync on bottlenecks and development plans related to the minian pipeline
  • Intros about CZI Grant, HoloViz, Bokeh, Demetris
  • Question about preprocessing vs analysis: This grant's current work is more relevant to preprocessing because we are focusing on raw or near-raw data visualization, but it is certainly relevant to downstream analysis steps, and we can develop more analysis-specific examples.
  • Question about multiple modality: We are certainly interested in visualizing different modalities (EMG, EEG, CaImaging, behavior) together (added to readme list of intended workflows).
    • TODO: Joe/Austin will send MM data! (typically used for sleep staging)
  • Question about plotly: Explained HoloViews backends.
    • Use of plotly is for 3D plots.
  • Comment: HoloViews (Bokeh) interactivity doesn't always work in VS Code. They are using VS Code over JupyterLab or Notebook because of its better git UI, remote ssh sessions, and better text editing.
    • TODO: DR follow up and ask Joe, Austin to report any issues on GitHub after they've tried the latest version of packages.
  • Priorities:
    • first: Video viewer workflow
    • second: Minian CNMF dashboard
  • Austin doesn't use the CNMF viewer because it takes a long time to load, although it would be extremely helpful if it did work. "It's the least used but could potentially be the most helpful". Right now they are setting parameters based on the first video chunk. Then apply those parameters to the rest of the videos and inspect the max projection to see if those parameters worked. If they need to adjust the parameters, they would need to run the whole pipeline again.
  • Walkthrough of holoviz-topics/neuro repo
    • Went through video-viewer readme, workflow
    • There was a comment that video playback was not smooth for them. DR explained that running .compute() will load the xarray/dask dataset into memory and make the video play more smoothly, but that requires having enough available memory. A better solution would be smooth playback with chunked reading, through some of the approaches we are considering as part of the large-data-handling task.
    • If Joe/Austin want to contribute directly, DR can give them access. Alternatively, they can fork and submit pull requests from that.
    • It's important for us to run the benchmarking and workflow iterations on real data as well
      • Sending data: What should they send?
        • There is one dataset that is 42 hours (non-continuous, recorded over a month) ~ 1.59 TB. This is not common.
          • One approach they've tried with this dataset is spatially aligning the first frame from each video, then concatenating the videos and running part of the pipeline on the whole thing.
        • A more common, upper-end data duration is 1-2 hours. The average is closer to 20 minutes.
        • sending a 20 min, 1 hour, 2 hour would be great. Especially if it included simultaneous EEG (.edf) data and behavior data. TODO: DR remind about this. Also offer to just try downloading this from their server since this wouldn't be that much. We may still want the 42 hour dataset, but we can wait a bit on that, and if so, it probably should be mailed.
  • Temporal update plot in the Minian pipeline would also be really useful to improve. It's just a timeseries plot, but after some duration (~1 hr) it becomes unusable.
  • TODO: DR try to replicate issue with this app using a longer recording when real data is received.
  • TODO: DR add to readme workflow incubation list (DONE)

230515 Large Data Handling

  • Attendees: Jim, Demetris, Ian
  • Purpose: Sync with Ian about plans and start discussing approaches for handling large data

Element workflows: Demetris is actively working on these.

  • Ephys: ephys viewer (long duration and large number of timeseries vertically stacked)
  • Ephys: waveform viewer (large numbers of timeseries overlaid)
  • Ephys: spike raster (long duration and large number of series of timestamp markers vertically stacked)
  • EEG: eeg viewer (long duration and large number of timeseries vertically stacked)
  • Imaging: video viewer (large number of frames)

Outline of key features and challenges:

  • For all element workflows: scale bar, large data handling, Bokeh? annotation tool, (later - streaming data)
  • For Ephys and EEG: linked selection from layout, 1D annotation
  • For Imaging: 2D annotation, linked selection from 2D ROI annotations to aggregated timeseries for each ROI.

Large data handling:

  • Main issues/questions per data type:
  • Videos:
    1. How fast can a user scrub/navigate through a video that is potentially bigger than available memory?
    2. Given 2D (x,y) annotated/selected ROIs (like circled neurons) in a particular frame or aggregate frame, how fast can we return the aggregated z-timeseries for each ROI?
  • Timeseries:
    1. How can users efficiently visualize some useful representation of data that is potentially bigger than memory?
    2. How can users smoothly (with low-latency) zoom/pan through the data (over x-time and over y-channel) and retain interactive tools (e.g. hover)?
  • Probable phases for large data handling task:
    • Setup benchmarking
      • Using simulated and real data (Demetris is actively working on this), create a benchmarking approach that captures not only the concrete (e.g. Python-JS comms) stuff, but more importantly the user experience.
        • Where should large real data go?
        • Ian is happy to receive the Imaging data offered by Cai lab.
        • Jim suggests maybe Ian eventually copying the data and sending it along to Demetris and/or some server place for LTS (not urgent). Maybe after this grant we can seek funds for maintenance of performance testing server.
      • We have to codify “useable by a user” and have a system to report how the metrics change with different iterations and approaches to large data handling.
      • This will likely include some semi-automated in-browser benchmarks like playwright testing that records frame rate (for the case of video scrubbing) given some type of simulated user interaction.
      • Hopefully the benchmarking infrastructure remains in place and useful beyond the grant timeline.
    • Find a solution for most of the dataset sizes (probably up to some point)
      • This involves understanding the distribution of dataset sizes (Demetris is actively working on this)
      • See more in ‘Potential solution approaches’ below
    • If there is one, find the point of diminishing returns with the solution from step #2 and then explore a complementary/compromise solution for dataset sizes beyond this point.
  • Potential solution approaches
    • Raw performance approaches:
      • Bokeh (non-WebGL)
      • Bokeh with WebGL
        • Recently, it has been suggested that anyone working with timeseries should be using Bokeh with WebGL.
        • Ian: Currently, no WebGL support for images/videos in Bokeh, but it would be a small effort to try this.
          • TODO: try out WebGL support for images/videos in Bokeh, at least for benchmarking purposes. Bill it to CZI.
      • Decimation
        • e.g. with HoloViews recent work with Largest Triangle Three Buckets (LTTB).
        • This approach is good for very long timeseries if you care about the envelope, but if you have brief, high-amplitude noise then this is suboptimal as you will only see the envelope rather than the actual signal.
          • follow up question to JB: so it would be good for raw ephys where we have brief, high-amplitude spikes?
          • Yes, it would preserve the spikes much better than random or strided sampling. Good point!
      • Datashading
        • Useful for viewing both the outliers and the central tendency in a long time series (not just the envelope) as well as to disambiguate a lot of overlapping timeseries with potentially a small number of groupings.
    • Orthogonal approaches:
      • Scrubber / Minimap:
        • In the case of a Minimap - see some useful derived representation that spans the full dataset and assists with drill-down within a separate, primary display window that shows a subset in focus.
        • A scrubber is just a representation of the length of the data.
        • This approach can be combined with any of the raw performance approaches above, although the size of the primary display window might depend on the raw performance approach being used.
        • Minimap implementation may involve some combination of progressive datashading, lazy rendering, strided sampling, or combined use of xarray datatree (see below). The approach for a minimap also depends on the on-disk format because if using something partitioned like zarr and try to do strided sampling, it might just read the whole dataset (e.g. one sample per partition) and lock up the UI in the process.
      • xarray datatree:
        • promising way to have an on-disk file format already at different resolutions - you prepare the data to support multi-scale rendering of the data later.
        • This would also combine nicely with any of the raw performance approaches; e.g. you could store datashader-aggregated versions or LTTB-decimated versions.
        • It could also be combined with the scrubber/minimap approach.
      • Improvements on how arrays are passed between Python and Bokeh
        • This is a Mateusz task, but Ian should be aware and involved.
  • Hardware and user scenarios
    • DR: There is precedent for recommending that researchers use a GPU/CUDA approach in other very popular neuro packages (e.g. deeplabcut), and researchers are generally fine with that.
    • Probably the hardware tier to initially aim for is something like top-of-the-line macbook pro specs.
    • Unless it’s clearly the only useable option, we shouldn’t assume every user has access to a HPC cluster.
      • Although a common scenario is that a lab has an enormous amount of data and it lives on some lab server and the researchers invest in server memory, cores, and decent GPU.
    • We could have HoloViz gridded dataset guides that direct users toward certain approaches based on their type of data (videos, long but few timeseries, short but many timeseries, etc.) and infrastructure, so it’s fine to have multiple solutions for large data handling.
  • Data
    • Imaging
      • specs:
        • Typically they record at 30fps
        • Most frame sizes are 608 x 608 pixels - some are 1k x 1k px
        • Videos are stored 1000 frames per .avi file.
        • A 10 minute video is around 4GB
        • Typically use FFV1 data compression (lossless)
        • Atypical longest recordings can last 48 hours (~1 TB)
      • Datasets
    • Additional imaging data and specs, including for EEG, intracranial electrophysiology (ephys) forthcoming from Demetris
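
For the decimation approach listed under 'Potential solution approaches' above, the following is a plain-Python sketch of Largest Triangle Three Buckets (LTTB). It follows the commonly published form of the algorithm and is not the HoloViews implementation; the function name and the example data are illustrative assumptions.

```python
def lttb(x, y, n_out):
    """Downsample (x, y) to n_out points with Largest Triangle Three Buckets."""
    n = len(x)
    if n_out >= n or n_out < 3:
        return list(x), list(y)

    # The first and last points are always kept; the remaining points are
    # split into n_out - 2 buckets of (fractional) size `every`.
    every = (n - 2) / (n_out - 2)
    out_x, out_y = [x[0]], [y[0]]
    a = 0  # index of the most recently selected point

    for i in range(n_out - 2):
        # Average point of the next bucket (the last point for the final bucket).
        next_start = int((i + 1) * every) + 1
        next_end = min(int((i + 2) * every) + 1, n)
        count = next_end - next_start
        avg_x = sum(x[next_start:next_end]) / count
        avg_y = sum(y[next_start:next_end]) / count

        # In the current bucket, keep the point that forms the largest
        # triangle with the previous selection and the next-bucket average.
        start = int(i * every) + 1
        end = int((i + 1) * every) + 1
        best_index, best_area = start, -1.0
        for j in range(start, end):
            area = abs((x[a] - avg_x) * (y[j] - y[a])
                       - (x[a] - x[j]) * (avg_y - y[a])) / 2
            if area > best_area:
                best_index, best_area = j, area
        out_x.append(x[best_index])
        out_y.append(y[best_index])
        a = best_index

    out_x.append(x[-1])
    out_y.append(y[-1])
    return out_x, out_y


# A flat trace with one brief, high-amplitude spike at sample 50.
xs = list(range(100))
ys = [0.0] * 100
ys[50] = 10.0
dec_x, dec_y = lttb(xs, ys, 10)
print(dec_x)
```

In this example the spike at sample 50 is retained, whereas strided sampling such as `ys[::11]` skips it, which matches the point above that LTTB preserves brief spikes better than random or strided sampling.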