Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lightr #267

Closed
13 of 19 tasks
Bisaloo opened this issue Nov 26, 2018 · 39 comments
Closed
13 of 19 tasks

lightr #267

Bisaloo opened this issue Nov 26, 2018 · 39 comments

Comments

@Bisaloo
Copy link
Member

Bisaloo commented Nov 26, 2018

Summary

  • What does this package do? (explain in 50 words or less):

lightr imports UV-VIS reflectance/transmission/absorbance proprietary file formats in R. It also allows the import of related metadata that are critical to ensure reproducibility but that are often discarded by other tools.

  • Paste the full DESCRIPTION file inside a code block below:
Package: lightr
Title: Read Spectrometric Data in R
Version: 0.0.0.9000
Authors@R: c(
    person("Hugo", "Gruson", role = c("cre", "aut"),
    email = REDACTED,
    comment = c(ORCID = "0000-0002-4094-1476")),
    person("Rafael", "Maia", role = "aut",
    email = "REDACTED",
    comment = c(ORCID = "0000-0002-7563-9795")),
    person("Thomas", "White", role = "aut",
    email = "REDACTED",
    comment = c(ORCID = "0000-0001-8493-9450"))
    )
Description: Parse various UV-VIS reflectance/transmittance/absorbance spectra
    file formats to extract spectral data and metadata.
Imports:
    pbmcapply,
    xml2
Suggests:
    covr,
    knitr,
    rmarkdown,
    spelling,
    testthat
URL: https://bisaloo.github.io/lightr, https://github.com/Bisaloo/lightr
BugReports: https://github.com/Bisaloo/lightr/issues
License: GPL (>=2)
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.1.1
Roxygen: list(markdown = TRUE)
Language: en-GB
VignetteBuilder: knitr
  • URL for the package (the development repository, not a stylized html page):

https://github.com/Bisaloo/lightr

  • Please indicate which category or categories from our package fit policies this package falls under and why? (e.g., data retrieval, reproducibility. If you are unsure, we suggest you make a pre-submission inquiry.):

    • data extraction, because the package parses multiple scientific data file formats
    • (reproducibility) because it also extracts metadata saved during the recording
  •   Who is the target audience and what are scientific applications of this package?  

People working with UV-VIS reflectance/transmittance/absorbance spectra, colour science (#colsci). People developing package to analyse spectral data with vision models (lightr has few dependencies). Even non-R users who do not own the proprietary software to convert the proprietary formats to csv files.

There is partial overlap with some other packages, as described in the README but to my knowledge, none of them have the same aims as lightr.

  •   If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Requirements

Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
  • has a CRAN and OSI accepted license.
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a vignette with examples of its essential functions and uses. [in progress]
  • has a test suite.
  • has continuous integration, including reporting of test coverage, using services such as Travis CI, Coveralls and/or CodeCov.
  • I agree to abide by ROpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

Publication options

  • Do you intend for this package to go on CRAN?
  • Do you wish to automatically submit to the Journal of Open Source Software? If so:
    • The package has an obvious research application according to JOSS's definition.
    • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
    • The package is deposited in a long-term repository with the DOI:
    • (Do not submit your package separately to JOSS)
  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
    • The package is novel and will be of interest to the broad readership of the journal.
    • The manuscript describing the package is no longer than 3000 words.
    • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
    • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
    • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
    • (Please do not submit your package separately to Methods in Ecology and Evolution)

Detail

  • Does R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:

  • Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:

Some function names do not meet rOpenSci criteria (e.g. getspec() and getmetadata()) but this is kept for backwards compatibility with the package pavo, from which lightr originated. Synonyms have been added (get_spec() and get_metadata()) but it is at the moment unlikely that the old names will be deprecated.

  • If this is a resubmission following rejection, please explain the change in circumstances:

  • If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:


As mentioned in the template, some pieces still need polishing, mainly the vignettes but before going for one last push, I'd like to know if you would be interested.

@sckott
Copy link
Contributor

sckott commented Nov 26, 2018

thanks for your pre-submission inquiry @Bisaloo We are discussing now and will get back to you soon

@karthik
Copy link
Member

karthik commented Nov 27, 2018

👋 @Bisaloo. I will be your editor for this submission. Stay tuned for next steps.

@karthik
Copy link
Member

karthik commented Dec 10, 2018

Editor checks:

  • Fit: The package meets criteria for fit and overlap
  • Automated tests: Package has a testing suite and is tested via Travis-CI or another CI service.
  • License: The package has a CRAN or OSI accepted license
  • Repository: The repository link resolves correctly
  • Archive (JOSS only, may be post-review): The repository DOI resolves correctly
  • Version (JOSS only, may be post-review): Does the release version given match the GitHub release (v1.0.0)?

Editor comments

Package looks to be in excellent shape and your test coverage is fantastic! Goodpractice had very little to say and the only notable suggestion is to drop spelling from your suggests. I'm looking for appropriate reviewers now but if you have any suggestions for independent reviewers, please let me know. 🙏

  ✖ fix this R CMD check ERROR: Package suggested but not
available: ‘spelling’ The suggested packages are required for a complete check. Checking can be attempted without them by setting the environment variable _R_CHECK_FORCE_SUGGESTS_ to a false value. See section ‘The DESCRIPTION file’ in the ‘Writing R Extensions’ manual.

Reviewers: TBD
Due date: TBD

@Bisaloo
Copy link
Member Author

Bisaloo commented Dec 11, 2018

Thank you for your comment!

I don't really understand why spelling is problematic. It's used in many packages on CRAN without any issue (https://cran.r-project.org/package=spelling). Could you expand a bit more why I should remove it please?

Also, is it okay if I keep working on master until reviewers are found?

@karthik karthik changed the title Pre-submission enquiry: lightr lightr Dec 12, 2018
@Bisaloo
Copy link
Member Author

Bisaloo commented Jan 21, 2019

@karthik, are you having trouble finding reviewers?

If so, I don't think it's necessary to find someone with a background in colour science or spectrometry. Anyone with experience with unusual file/data formats would probably be qualified. Some basic regex skills and knowledge of binary data (for parse_trm()) are a plus but not required IMO.

@karthik
Copy link
Member

karthik commented Jan 23, 2019

@Bisaloo I'm still having trouble recruiting appropriate reviewers (many declines). Do you have any independent suggestions that I could approach?

@Bisaloo
Copy link
Member Author

Bisaloo commented Jan 24, 2019

@karthik, thanks for the update 👍 .

Here are some people with a colour science background that code:

  • Felipe Gawryszewski is the author of the colourvision R package but I just had a look and he doesn't import spectra. The user has to provide a dataframe as far as I understand.
  • Cassie Stoddard developed the tetracolorspace Matlab program but I don't know how much she uses R and as far as I understand, tetracolorspace only imports a small subset of spectra files (those that are "csv-like")
  • As mentioned in the source code, one function in lightr is an almost direct translation from a matlab script I found online. I don't know the author at all (I don't even know if that email is still working) but they should be very qualified to review lightr (assuming they have some R knowledge).

Beyond that, as I said, I don't think it is required to have a background in colour science to review lightr. It is not (as opposed to colourvision or pavo) a package to perform colour vision models. It focuses on data import and manipulation so I'm confident anyone with a solid general knowledge of R could provide a good feedback.

@Bisaloo
Copy link
Member Author

Bisaloo commented Feb 4, 2019

Here is the profile for a potential reviewer IMO. Maybe you will know someone who fits this description:

Someone who is familiar with file manipulation (list.files(), path(), tempdir(), file_ext()) and import (unzip(), file(), readLines()) in R, including binary files (readBin(), intToUtf8()) and XML files (via the xml2 package). Some metadata is imported through string manipulation (grep(), gsub(), strsplit()) as well but the regex patterns generally remain simple.

I think it helps to think about this package as similar to any other file import package (such as rio). The data type is just accessory here.

I'll try to add a vignette to describe the structure / internal organisation of the package at some point, which should greatly help for the review.

@karthik
Copy link
Member

karthik commented Feb 8, 2019

Thanks you @Bisaloo
I have reached out to more folks and will update this thread very soon. Apologies again for this unusual delay.

@karthik
Copy link
Member

karthik commented Feb 12, 2019

After numerous delays (thanks for the patience), assigning reviewer 1.

@kjuraic
Deadline: March 5, 2019

@kjuraic Please email or reach out if you have any trouble during the review process. 🙏

@Bisaloo
Copy link
Member Author

Bisaloo commented Mar 22, 2019

@karthik, any news? Is there anything I can do on my end to help?

@karthik
Copy link
Member

karthik commented Mar 26, 2019

@kjuraic 👋Can you please update the thread with your review. Feel free to reach out to me if you've run into any issues. 🙏

@karthik
Copy link
Member

karthik commented Mar 26, 2019

@Bisaloo Thanks for checking in. Let me get an update from Krunoslav

@karthik
Copy link
Member

karthik commented Apr 25, 2019

Apologies for the unusual delays @Bisaloo. I am reassigning reviewers now.

@karthik
Copy link
Member

karthik commented May 21, 2019

Hello @Bisaloo
I am so sorry for all the delays. Since I have been unable to get a response from the reviewers I assigned, I am switching reviewers now and we'll fast track the review from this point forward. @jeroen will complete a technical review soon, and I along with a domain reviewer will also share a review. Thanks for your patience. 🙏

@Bisaloo
Copy link
Member Author

Bisaloo commented May 22, 2019

@karthik, thanks for the update! That sounds good.

@jeroen
Copy link
Member

jeroen commented May 28, 2019

Hi Hugo,

Thanks for submitting your package to rOpenSci!

I'm not an expert in spectrometry hence my review focuses on general R packaging practices and some implementation advice.

Overall

Overall the package is in good shape. The pkgdown site looks nice and provides a good starting point for new users. All functions include an example using test data in various formats which ship with the package. The package passes CMD check with high testing coverage.

Improvements

My main concern is about validation of the parsing functions. The package implements custom readers for a range of binary and xml based spectrometry formats, but there is little documentation on where to find the specification of these formats (if one exists), and how you check that the output of the parsers is accurate.

Writing parsers is increadibly difficult and it is easy to overlook edge cases or odd properties of a format. I don't know how complex these formats are, but from the raw output data it is difficult to judge the correctness. The parsing unit tests verify that the output contains data and metadata, but do not verify the data itself.

You could improve the package by trying to show (where possible) that the data from your examples corresponds to what one would expect to see. For example you could extend the examples by plotting or modelling the output data, such that it can be visually inspected. As a dummy user, I would enjoy an intro vignette with a worked example of reading and modelling spectral data, and seeing a pattern appear. This would also help with illustrating how lightr is used in combination with other R packages to perform a complete analysis.

You mention that there is some overlap between lightr and other spectrometry R packages. For the formats that are also supported by another package, you could consdider adding a unit test to check if both packages return the same data. If only proprietary reference software is available, perhaps you could show summary or visual statistics that allow potential users to compare the output with the other software.

Summary

The package is in good shape for a first release and I recommend it is onboarded. It can be further improved with validation of the parsers and an vignettes with worked examples showing how lightr synergies with other packages.

I think more potential users are willing to give the package a try when you show how lightr fits into a complete analysis, and make it convincing that the results correspond to those of other (proprietary) software.

@Bisaloo
Copy link
Member Author

Bisaloo commented Jun 5, 2019

Thank you for your review @jeroen! You bring up very valid points but I'm not yet quite sure if/how I can address all of them. I'll need a couple of days to think about it. Here is a first attempt at answering some of your concerns. I will also perform some changes on the packages following your recommendations whenever I get a moment (hopefully next week).

but there is little documentation on where to find the specification of these formats (if one exists)

To my knowledge, the only format that has been defined as a standard are jdx files. I do say it in the docs but I could try to think of a more prominent place where to put the reference:

https://github.com/Bisaloo/lightr/blob/b06c2b3c5512ff798c7ce82933c003bbdb4422bc/R/parse_jdx.R#L7-L9

The parsing unit tests verify that the output contains data and metadata, but do not verify the data itself.

You are correct that I don't compare the output to known results from an outside source but I however implemented regression tests to compare the output with that obtained from previous versions of lightr (https://github.com/Bisaloo/lightr/tree/master/tests/testthat/known_output)

You could improve the package by trying to show (where possible) that the data from your examples corresponds to what one would expect to see.

The main issue here is that other tools to extract this data, including proprietary tools discard a lot of information, as I mention in the README. The only way we know this data is correct is because we know how the measurements were made (i.e. we know the spectrometer model, the integration time, etc.) and how reflectance spectra should look for such an object. But I don't see a convenient way to demonstrate this to the user.

What I can do though is compare the output from:

for the same data (i.e. before and after export by official proprietary software). This is not straightforward because as mentioned previously, data is processed in an undocumented way while exported (metadata is partially discarded, spectral data is sometimes interpolated with an undocumented method, etc.).

For example you could extend the examples by plotting or modelling the output data, such that it can be visually inspected. As a dummy user, I would enjoy an intro vignette with a worked example of reading and modelling spectral data, and seeing a pattern appear. This would also help with illustrating how lightr is used in combination with other R packages to perform a complete analysis.

This is definitely a great idea and I will try to add this soon! An realistic example with measurements on one object and the results of the visual models (via pavo) next to a picture of this object could be a further way to convince the user that lightr at least produces data that is coherent with what we can observe with our naked eye. This would actually rely a bit more on the functionality from pavo than that from lightr but it doesn't matter.

For the formats that are also supported by another package, you could consdider adding a unit test to check if both packages return the same data.

Good point. I'll look into this.

If only proprietary reference software is available, perhaps you could show summary or visual statistics that allow potential users to compare the output with the other software.

This would be addressed in the intro vignette you propose.

I think more potential users are willing to give the package a try when you show [...] make it convincing that the results correspond to those of other (proprietary) software.

I'm wondering what would be the best way to convey this to the (potential) users. I doubt many users will dig into the tests folder. But I don't really see this as a vignette either. Maybe something derived from covrpage?


TODO:

  • Create intro vignette (with hummingbird data?)
  • Check tests from other packages and duplicate them in lightr
  • Compare output of lightr for "raw" vs "exported" OceanOptics files.
  • Compare output of lightr for "raw" vs "exported" Avantes files.
  • Add covrpage

@Bisaloo
Copy link
Member Author

Bisaloo commented Jul 25, 2019

Hi @karthik, just wondering what's the status here.

I can try to address the remaining concerns from @jeroen in the next couple of weeks, by providing a new vignette which:

  • shows that we get the same output with my parsers and the proprietary parsers from Avantes and OceanOptics
  • shows that the spectra match what we observe in human vision
  • demonstrates how this data can then be used with a vision modelling tool such as pavo

Would be that enough or should I wait for another review?

@karthik
Copy link
Member

karthik commented Aug 1, 2019

Hi @Bisaloo,
It would be good for you to address these comments now. I'll be your second reviewer and will provide my review within a week. Once you address any additional issues, we can proceed with the next steps.

PS: I'll be away from email till August 6.

@karthik
Copy link
Member

karthik commented Oct 18, 2019

Adding to Jeroen's feedback, here is my review. Most comments are minor, and once I see a response, I can move quickly to get this accepted.

Review of lightr package

The lightr package provides the ability to parse spectrometry data from different file formats and from multiple different scientific instruments. The alternatives are all proprietary software.

General comments

  • In the package down documentation, the two main functions are impossible to test without any example files to read from. When running this in my console, I get:
Warning message:
No files found. Try a different value for argument "ext". 

Which makes sense. At least in the examples, can you point it to some example files that you can include with the package itself?

  • Test coverage for the package is great! 🙌

  • In the function documentation, when describing specific instruments (e.g. OceanOptics, Avantes), link to those where possible.

  • Unless the package API is set in stone, I would recommend remaining some of the functions. Currently here is what is exported:

[1] "convert_tocsv"  "get_metadata"   "get_spec"       "parse_abs"     
 [5] "parse_generic"  "parse_jaz"      "parse_jazirrad" "parse_jdx"     
 [9] "parse_procspec" "parse_roh"      "parse_trm"      "parse_trt"     
[13] "parse_ttt"  

By adding some sort of standard prefix, e.g. lr_ it would allow for easy autocompletion in Rstudio and other text editors/IDEs. Functions like get_metadata are so generic that it will likely cause namespace conflicts with other packages.

Minor comments

  • Version number still shows as Version: 0.0.0.9000. You might want to bump the minor version at least with each major change so it's evident that the package is not super early in development. Leaving the version number at the devtools default does act as a signal.
  • The url field has two URLs:
    URL: https://bisaloo.github.io/lightr, https://github.com/Bisaloo/lightr
    Perhaps link to just one? Maybe the package down docs (which then link to the repo itself) so that the field is easy to parse for other purposes. @jeroen do you have other thoughts here on how it might impact the docs parser?
    -"If you can't find the best parse fror your specific file" README.md:119. fror should be for. You can check spelling on the whole package with spelling::spell_check_package()

@Bisaloo
Copy link
Member Author

Bisaloo commented Oct 19, 2019

Thank you for your comments @karthik 👍! please see my changes below:

General comments

  • In the package down documentation, the two main functions are impossible to test without any example files to read from. When running this in my console, I get:
Warning message:
No files found. Try a different value for argument "ext". 

Which makes sense. At least in the examples, can you point it to some example files that you can include with the package itself?

ropensci/lightr@2beb31e

  • Test coverage for the package is great! 🙌

  • In the function documentation, when describing specific instruments (e.g. OceanOptics, Avantes), link to those where possible.

ropensci/lightr@22ee916

Unless the package API is set in stone, I would recommend remaining some of the functions. Currently here is what is exported:

[1] "convert_tocsv"  "get_metadata"   "get_spec"       "parse_abs"     
[5] "parse_generic"  "parse_jaz"      "parse_jazirrad" "parse_jdx"     
[9] "parse_procspec" "parse_roh"      "parse_trm"      "parse_trt"     
[13] "parse_ttt"  

By adding some sort of standard prefix, e.g. lr_ it would allow for easy autocompletion in Rstudio and other text editors/IDEs. Functions like get_metadata are so generic that it will likely cause namespace conflicts with other packages.

Would you do this for all exported functions or just the three main ones which are the most likely to have name conflicts (get_spec(), get_metadata() and convert_tocsv())?

Minor comments

  • Version number still shows as Version: 0.0.0.9000. You might want to bump the minor version at least with each major change so it's evident that the package is not super early in development. Leaving the version number at the devtools default does act as a signal.

ropensci/lightr@b915992

  • The url field has two URLs:
    URL: https://bisaloo.github.io/lightr, https://github.com/Bisaloo/lightr
    Perhaps link to just one? Maybe the package down docs (which then link to the repo itself) so that the field is easy to parse for other purposes. @jeroen do you have other thoughts here on how it might impact the docs parser?

Actually, the recommended practice for pkgdown is to have both URL. The github.com URL is required to auto-link to the github repo and fill out the bug report field (in the sidebar and the github logo in the top navbar). The github.io is required for cross-linking from other packages (https://pkgdown.r-lib.org/articles/linking.html#across-packages).

  • "If you can't find the best parse fror your specific file" README.md:119. fror should be for. You can check spelling on the whole package with spelling::spell_check_package()

ropensci/lightr@515d193

@karthik
Copy link
Member

karthik commented Oct 19, 2019

Would you do this for all exported functions or just the three main ones which are the most likely to have name conflicts (get_spec(), get_metadata() and convert_tocsv())?

Generally for all exported functions unless you have a strong reason not to.

@Bisaloo
Copy link
Member Author

Bisaloo commented Oct 20, 2019

Ok, thanks! Done in ropensci/lightr@a0ed298.

@karthik
Copy link
Member

karthik commented Oct 21, 2019

Thanks @Bisaloo! I'll do a run through tomorrow before I sign off.

@karthik
Copy link
Member

karthik commented Oct 21, 2019

To run the examples in your README, you'll need a slight change to locate the inst/testdata/ files

lr_get_metadata(system.file("testdata/procspec_files" , package = "lightr"),
                ext = "ProcSpec")

Same for the example below.

@Bisaloo
Copy link
Member Author

Bisaloo commented Oct 21, 2019

Right, it makes more sense since users have to install the package to run this anyways 👍. It's done!

@karthik
Copy link
Member

karthik commented Oct 21, 2019

Congrats @, your submission has been approved! 🎉 Thank you for submitting and to @jeroen for a thorough and timely review. To-dos:

  • Transfer the repo to the rOpenSci organization under "Settings" in your repo. I have invited you to a team that should allow you to do so. You'll be made admin once you do.

  • Add the rOpenSci footer to the bottom of your README

[![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)

  • Fix any links in badges for CI and coverage to point to the ropensci URL. (We'll turn on the services on our end as needed)

Welcome aboard!

@karthik
Copy link
Member

karthik commented Oct 21, 2019

@Bisaloo Once you transfer, the documentation will also be auto generated at docs.ropensci.org/lightr

You'll need to update that URL in your DESCRIPTION (and the top of your repo) as well after the transfer.

@Bisaloo
Copy link
Member Author

Bisaloo commented Oct 21, 2019

It's done. Thanks @karthik and @jeroen for your helpful comments!

@karthik
Copy link
Member

karthik commented Oct 21, 2019

Thanks @Bisaloo! 🙏

1 similar comment
@karthik
Copy link
Member

karthik commented Oct 21, 2019

Thanks @Bisaloo! 🙏

@karthik karthik closed this as completed Oct 21, 2019
@Bisaloo
Copy link
Member Author

Bisaloo commented Oct 21, 2019

Should I go ahead and submit this to JOSS once I add a DOI?

@karthik
Copy link
Member

karthik commented Oct 21, 2019

You can add a DOI after the JOSS submission is accepted. So go ahead and submit there. In the meantime, you can compile your own paper here: http://whedon.theoj.org/ and fix any mistakes or issues such as broken references.

Also reading over your current paper, you provide no examples of a scientific application. That will be necessary for this software to get accepted to JOSS.

@Bisaloo
Copy link
Member Author

Bisaloo commented Oct 21, 2019

Also reading over your current paper, you provide no examples of a scientific application. That will be necessary for this software to get accepted to JOSS.

Okay, thanks for the advice! I'll do this tomorrow then.

@karthik
Copy link
Member

karthik commented Oct 21, 2019

If it helps you can read over some accepted JOSS papers about R packages: https://joss.theoj.org/papers/in/R to get a sense of what level of detail is needed there. I'm also an editor at JOSS and can provide feedback if necessary.

@jeroen
Copy link
Member

jeroen commented Oct 23, 2019

Docs are now live on https://docs.ropensci.org/lightr/

@Bisaloo
Copy link
Member Author

Bisaloo commented Oct 24, 2019

@karthik, I had a look at the most recent R papers and they don't seem too provide many more examples of applications. Do you have something specific in mind that would make a valuable addition?

@karthik
Copy link
Member

karthik commented Oct 25, 2019

I think it would be useful to use a test dataset + lightr to show what can be done as a small example. Here is an example. If you submit as is, that's the first question an editor will as you about.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants