Creating a standard starfish.wdl that can be run with any recipe file #1364

Merged (7 commits into master on May 20, 2019)

Conversation

@shanaxel42 (Collaborator) commented May 17, 2019

Created a standard starfish.wdl file that can be used to process an experiment per FOV. It takes the following inputs:
- the path to experiment.json
- the number of FOVs in the experiment
- the URL of a Python recipe file

I think the easiest way to support user-defined pipelines is to let users upload a recipe file to their own repos; that way they don't need to make a PR to starfish to get a recipe in. Any recipe file just needs to implement a method called process_fov(field_num: int, experiement_str: str) that processes a single FOV and returns a decoded intensity table. starfish.wdl parallelizes per FOV, downloads the user-defined recipe file as recipe.py, imports it as a module called recipe, and runs recipe.process_fov(field_num: int, experiement_str: str). It then saves each decoded intensity table and concatenates them all into decoded_concatenated.csv.
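For reference, the contract described above amounts to a recipe.py shaped roughly like the sketch below (illustrative only: the spot-detection step is a placeholder, the parameter name follows the corrected spelling suggested in the review comments below, and only the calls that also appear in this PR's example recipes are real starfish API).

# recipe.py -- minimal sketch of the process_fov contract (illustrative, not part of this PR)
import starfish


def process_fov(field_num: int, experiment_str: str):
    """Process a single field of view and return a decoded spot table."""
    fov_str: str = f"fov_{int(field_num):03d}"

    # load the experiment pointed to by experiment.json
    experiment = starfish.Experiment.from_json(experiment_str)

    # ... user-defined filtering / registration / spot detection for fov_str goes here ...
    intensities = ...  # placeholder for the IntensityTable produced by the user's detector

    # decode against the experiment's codebook and return a tabular result
    decoded = experiment.codebook.decode_per_round_max(intensities)
    return decoded.to_decoded_spots()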

There are also three examples: the ISS SpaceTx, ISS published, and MERFISH published pipelines.

I also deleted the old, outdated wdl directory.

Fixes #1280, but it probably needs some more documentation.
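To make the driver side concrete, what starfish.wdl effectively runs for each FOV boils down to roughly the sketch below (not the PR's actual code: the importlib mechanism, the argument handling, and the save_csv call are assumptions consistent with the description above).

# per-FOV driver sketch: run after `wget -O recipe.py <recipe url>` so that the
# downloaded file is importable as a module named `recipe` (illustrative only)
import importlib
import sys


def run_one_fov(field_num: int, experiment_str: str, output_csv: str):
    recipe = importlib.import_module("recipe")

    # the recipe must expose process_fov(field_num, experiment_str) and return a decoded table
    decoded = recipe.process_fov(field_num, experiment_str)

    # write the per-FOV table; these CSVs are later concatenated into decoded_concatenated.csv
    decoded.save_csv(output_csv)  # assumption: the returned table offers a CSV writer


if __name__ == "__main__":
    run_one_fov(int(sys.argv[1]), sys.argv[2], sys.argv[3])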

@codecov-io commented May 17, 2019

Codecov Report

Merging #1364 into master will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master    #1364   +/-   ##
=======================================
  Coverage   89.05%   89.05%           
=======================================
  Files         147      147           
  Lines        5355     5355           
=======================================
  Hits         4769     4769           
  Misses        586      586


@shanaxel42 requested a review from ambrosejcarr on May 18, 2019
@ambrosejcarr (Member) left a comment

Great ideas! I've requested a few changes, but it should be fine to merge afterwards. This will need to be refactored when we successfully figure out per-round processing, but I think that PR-to-be is the right place to do that work. 👍

from starfish.types import Axes


def process_fov(field_num: int, experiement_str: str):
Member:

Suggested change
def process_fov(field_num: int, experiement_str: str):
def process_fov(field_num: int, experiment_str: str):
"""Process a single field of view of ISS data
Parameters
----------
field_num : int
the field of view to process
experiment_str : int
path of experiment json file
Returns
-------
DecodedSpots :
tabular object containing the locations of detected spots.
"""

fov_str: str = f"fov_{int(field_num):03d}"

# load experiment
experiment = starfish.Experiment.from_json(experiement_str)
Member:

Suggested change
experiment = starfish.Experiment.from_json(experiement_str)
experiment = starfish.Experiment.from_json(experiment_str)

from starfish.types import Axes


def process_fov(field_num: int, experiement_str: str):
Member:

Suggested change
def process_fov(field_num: int, experiement_str: str):
def process_fov(field_num: int, experiment_str: str):
"""Process a single field of view of ISS data
Parameters
----------
field_num : int
the field of view to process
experiment_str : int
path of experiment json file
Returns
-------
DecodedSpots :
tabular object containing the locations of detected spots.
"""

fov_str: str = f"fov_{int(field_num):03d}"

# load experiment
experiment = starfish.Experiment.from_json(experiement_str)
Member:

Suggested change
experiment = starfish.Experiment.from_json(experiement_str)
experiment = starfish.Experiment.from_json(experiment_str)

# load experiment
experiment = starfish.Experiment.from_json(experiement_str)

print(f"loading fov: {fov_str}")
Member:

Suggested change
print(f"loading fov: {fov_str}")
print(f"Loading fov: {fov_str}")

Other recipes had upper-case printouts.

ghp = Filter.GaussianHighPass(sigma=3)
high_passed = ghp.run(imgs, verbose=True, in_place=False)

print("deconvoling")
Member:

Suggested change
print("deconvoling")
print("Deconvolve")

dpsf = Filter.DeconvolvePSF(num_iter=15, sigma=2, clip_method=Clip.SCALE_BY_CHUNK)
deconvolved = dpsf.run(high_passed, verbose=True, in_place=False)

print("guassian low pass")
Member:

Suggested change
print("guassian low pass")
print("Gaussian Low Pass")

scaled = data / scale_factors[selector[Axes.ROUND.value], selector[Axes.CH.value]]
filtered_imgs.set_slice(selector, scaled, [Axes.ZPLANE])

print("decoding")
Member:

Suggested change
print("decoding")
print("Decode")


String experiment
Int field_of_view
String recipe_file
Member:

Suggested change
String recipe_file
File recipe_file

If you declare this as a File, the WDL runner will localize it for you; then you can delete the wget line below.

Collaborator Author:

This does not work. The file has to be local already: https://github.com/openwdl/wdl/blob/master/versions/1.0/SPEC.md#task-inputs

String recipe_file

command <<<
wget -O recipe.py ${recipe_file}
Member:

Suggested change
wget -O recipe.py ${recipe_file}

@shanaxel42 requested a review from ttung on May 20, 2019
@shanaxel42 merged commit 445e40a into master on May 20, 2019
@shanaxel42 deleted the saxelrod-standard-wdl branch on May 20, 2019

masking_radius = 15
print("Filter WhiteTophat")
filt = Filter.WhiteTophat(masking_radius, is_volume=False)
Collaborator:

I'd just do:

Suggested change
filt = Filter.WhiteTophat(masking_radius, is_volume=False)
filt = Filter.WhiteTophat(masking_radius=15, is_volume=False)

print("Filter WhiteTophat")
filt = Filter.WhiteTophat(masking_radius, is_volume=False)

filtered_imgs = filt.run(imgs, verbose=True, in_place=False)
Collaborator:

Consider moving the registration to before filtering so you can do in-place processing.
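In other words, something like the sketch below, reusing the names from this recipe (whether every downstream step supports in_place=True is an assumption to verify):

# sketch: register first, then filter the registered stack in place to avoid an extra copy
registered_imgs = warp.run(imgs, transforms_list=transforms_list, in_place=False, verbose=True)
filt.run(registered_imgs, verbose=True, in_place=True)  # modifies registered_imgs in place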

registered_imgs = warp.run(filtered_imgs, transforms_list=transforms_list, in_place=False, verbose=True)

print("Detecting")
p = DetectSpots.BlobDetector(
Collaborator:

Suggested change
p = DetectSpots.BlobDetector(
detector = DetectSpots.BlobDetector(


decoded = experiment.codebook.decode_per_round_max(intensities)
df = decoded.to_decoded_spots()
return df
Collaborator:

Add a newline.


# find threshold
tmp = dots.sel({Axes.ROUND:0, Axes.CH:0, Axes.ZPLANE:0})
dots_threshold = np.percentile(np.ravel(tmp.xarray.values), 50)
Collaborator:

This is incompatible with the pipeline architecture. Do we have an issue for fixing this?

glp = Filter.GaussianLowPass(sigma=1)
low_passed = glp.run(deconvolved, in_place=False, verbose=True)

scale_factors = {
Collaborator:

This is also incompatible with the pipeline infrastructure.


spot_intensities = initial_spot_intensities.loc[initial_spot_intensities[Features.PASSES_THRESHOLDS]]
df = spot_intensities.to_decoded_spots()
return df
Collaborator:

Add a newline after this.


command <<<
python <<CODE
files = "${sep=' ' decoded_csvs}".strip().split()
Collaborator:

It's kind of concerning that this is a bunch of Python code floating in the ether.

Collaborator Author:

Yeah, I thought about moving this into starfish and making a wdl_utils file, but I thought that might be kind of random if it were the only WDL-related thing... I'm down to do that, though, if you think it makes more sense.
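For what it's worth, if this did move into the package, a wdl_utils helper could look roughly like the sketch below (hypothetical: the module, function, and CLI shape are made up; only the behavior of concatenating the per-FOV decoded CSVs into decoded_concatenated.csv comes from this PR).

# hypothetical starfish wdl_utils sketch (names are illustrative, not the PR's code)
import sys

import pandas as pd


def concatenate_decoded_csvs(csv_paths, output_path="decoded_concatenated.csv"):
    """Concatenate per-FOV decoded spot tables into a single CSV."""
    tables = [pd.read_csv(path) for path in csv_paths]
    pd.concat(tables, ignore_index=True).to_csv(output_path, index=False)


if __name__ == "__main__":
    # e.g. python wdl_utils.py decoded_0.csv decoded_1.csv ... (paths come from the WDL scatter)
    concatenate_decoded_csvs(sys.argv[1:])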


Successfully merging this pull request may close these issues.

Define a strategy to run user-defined pipelines using WDL