From ff63c567144ba2672b46dd27cbae72e63ae525e4 Mon Sep 17 00:00:00 2001
From: Ben Dichter
Date: Thu, 18 Jul 2024 10:45:02 -0400
Subject: [PATCH] update upload page to point to the NWB GUIDE for conversion
 and for upload

---
 docs/13_upload.md | 144 ++++++++++++++++++++++------------------
 1 file changed, 68 insertions(+), 76 deletions(-)

diff --git a/docs/13_upload.md b/docs/13_upload.md
index 3d56f902..bf13ca22 100644
--- a/docs/13_upload.md
+++ b/docs/13_upload.md
@@ -1,91 +1,83 @@
 # Creating Dandisets and Uploading Data

-To create a new Dandiset and upload your data, you need to have a DANDI account.
+This page provides instructions for creating a new Dandiset and uploading data to DANDI.
+
+## **Prerequisites**
+1. **Convert data to NWB.** Your data should already be in NWB format (2.1+) in a local folder, which we will call `<source_folder>`. We suggest starting the conversion with only a small amount of data so that common issues can be spotted early in the process.
+   This step can be complex depending on your data. Consider using the following tools:
+    1. [NWB Graphical User Interface for Data Entry (GUIDE)](https://nwb-guide.readthedocs.io/en/stable/) is a cross-platform desktop application for converting data from common proprietary formats to NWB and uploading it to DANDI.
+    2. [NeuroConv](https://neuroconv.readthedocs.io/) is a Python library that automates conversion to NWB from a variety of popular formats. See the [Conversion Gallery](https://neuroconv.readthedocs.io/en/main/conversion_examples_gallery/index.html) for example conversion scripts.
+    3. [PyNWB](https://pynwb.readthedocs.io/en/stable/) and [MatNWB](https://github.com/NeurodataWithoutBorders/matnwb) are APIs in Python and MATLAB that allow full flexibility in reading and writing data. ([PyNWB tutorials](https://pynwb.readthedocs.io/en/stable/tutorials/index.html), [MatNWB tutorials](https://github.com/NeurodataWithoutBorders/matnwb?tab=readme-ov-file#tutorials))
+    4. The [NWB Overview Docs](https://nwb-overview.readthedocs.io) point to more tools that are helpful for working with NWB files.
+
+   Feel free to [reach out to us for help](https://github.com/dandi/helpdesk/discussions).
+
+1. **Choose a server.**
+   - **Production server**: https://dandiarchive.org. This is the main server for DANDI and should be used for sharing neuroscience data.
+     When you create a Dandiset, a permanent ID is automatically assigned to it.
+     This Dandiset can be fully public or embargoed according to NIH policy.
+     All data are uploaded as draft and can be adjusted before publishing on the production server.
+   - **Development server**: https://gui-staging.dandiarchive.org. This server is for testing and learning how to use DANDI.
+     It is not intended for sharing data, but is recommended for trying out the DANDI CLI and GUI and as a testing platform for developers.
+     Note that the development server should not be used to stage your data.
+
+   The instructions below will alert you to where the commands for interacting with these two servers differ slightly.
+1. **Register for DANDI and copy the API key.** To create a new Dandiset and upload your data, you need to have a DANDI account.
+   * If you do not already have an account, see the [Create a DANDI Account](./16_account.md) page for instructions.
+   * Once you are logged in, copy your API key: click on your user initials in the top-right corner.
+     Production (https://dandiarchive.org) and staging (https://gui-staging.dandiarchive.org) servers have different API keys and different logins.
+   * Store your API key somewhere that the CLI can find it; see ["Storing Access Credentials"](#storing-access-credentials) below. A minimal example follows this list.
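+
+   For instance, a minimal sketch for a bash-like shell (the value shown is a made-up placeholder, not a real key): the DANDI CLI picks up the API key from the `DANDI_API_KEY` environment variable, so exporting it makes the key available for the current session:
+
+       # Placeholder value for illustration; paste the API key copied from the web UI
+       export DANDI_API_KEY="xxxx-your-api-key-here"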

-## Create a Dandiset and Add Data
-
-You can create a new Dandiset at https://dandiarchive.org. This Dandiset can be fully
-public or embargoed
-according to NIH policy.
-When you create a Dandiset, a permanent ID is automatically assigned to it.
-To prevent the production server from being inundated with test Dandisets, we encourage developers to develop
-against the development server (https://gui-staging.dandiarchive.org/). Note
-that the development server
-should not be used to stage your data. All data are uploaded as draft and can be adjusted before publishing on
-the production server. The development server is primarily used by users learning to use DANDI or by developers.
-
-The below instructions will alert you to where the commands for interacting with these
-two different servers differ slightly.

+### **Data upload/management workflow**

-### **Setup**

+The NWB GUIDE provides a graphical interface for inspecting and validating NWB files, as well as for uploading data to
+DANDI. See the [NWB GUIDE Dataset Publication Tutorial](https://nwb-guide.readthedocs.io/en/latest/tutorials/dataset_publication.html) for more information.

-1. To create a new Dandiset and upload your data, you need to have a DANDI account. See the [Create a DANDI Account](./16_account.md) page.
-1. Log in to DANDI and copy your API key. Click on your user initials in the
-   top-right corner after logging in. Production (dandiarchive.org) and staging (gui-staging.dandiarchive.org) servers
-   have different API keys and different logins.
-1. Locally:
-    1. Create a Python environment. This is not required, but strongly recommended; e.g.
-       [miniconda](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands),
-       [virtualenv](https://docs.python.org/3/library/venv.html).
-    2. Install the DANDI CLI into your Python environment:

+The instructions below show how to do the same thing programmatically using the command-line interface (CLI).
+The CLI approach may be more suitable for users who are comfortable with the command line, who need to automate the process, or who have advanced use cases.

-        pip install -U dandi

+1. **Create a new Dandiset.**
+   * Click `NEW DANDISET` in the Web application (top right corner) after logging in.
+   * You will be asked to enter basic metadata: a name (title) and description (abstract) for your dataset.
+   * After you provide a name and description, the dataset identifier will be created; we will call this `<dataset_id>`.
+1. Check your files for [NWB Best Practices](https://nwbinspector.readthedocs.io/en/dev/best_practices/best_practices_index.html) with the [NWBInspector](https://nwbinspector.readthedocs.io/en/dev/user_guide/user_guide_index.html).
+   Install the Python library (`pip install -U nwbinspector`) and run:

-    3. Store your API key somewhere that the CLI can find it; see ["Storing
-    Access Credentials"](#storing-access-credentials) below.

+       nwbinspector <source_folder> --config dandi
+
+   If the report is too large to efficiently navigate in your console, you can save it to a file using

-### **Data upload/management workflow**

+       nwbinspector <source_folder> --config dandi --report-file-path <report_location>.txt
+
+   For more details and other options, run:
+
+       nwbinspector --help
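+
+   If you have manually confirmed that a flagged pattern is actually correct for your data, you can skip that check on subsequent runs. A minimal sketch (`check_data_orientation`, mentioned below, is used purely as an example; see the NWBInspector CLI documentation referenced below for the full behavior of the `--ignore` flag):
+
+       # Re-run the DANDI checks while skipping one manually reviewed check
+       nwbinspector <source_folder> --config dandi --ignore check_data_orientation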
-1. Register a Dandiset to generate an identifier. You will be asked to enter
-   basic metadata: a name (title) and description (abstract) for your dataset.
-   Click `NEW DANDISET` in the Web application (top right corner) after logging in.
-   After you provide a name and description, the dataset identifier will be created;
-   we will call this `<dataset_id>`.
-1. NWB format:
-    1. Convert your data to NWB 2.1+ in a local folder. Let's call this `<source_folder>`.
-       We suggest beginning the conversion process using only a small amount of data so that common issues may be spotted early in the process.
-       This step can be complex depending on your data.
-       [NeuroConv](https://neuroconv.readthedocs.io/) automates
-       conversion to NWB from a variety of popular formats.
-       [nwb-overview.readthedocs.io](https://nwb-overview.readthedocs.io)
-       points to more tools helpful for working with NWB files, and [BIDS
-       converters](https://bids.neuroimaging.io/benefits.html#converters)
-       if you are preparing a BIDS dataset containing NWB files.
-       Feel free to [reach out to us for help](https://github.com/dandi/helpdesk/discussions).
-    2. Check your files for [NWB Best Practices](https://nwbinspector.readthedocs.io/en/dev/best_practices/best_practices_index.html) by installing
-       the [NWBInspector](https://nwbinspector.readthedocs.io/en/dev/user_guide/user_guide_index.html) (`pip install -U nwbinspector`) and running
-
-        nwbinspector <source_folder> --config dandi
-
-    3. Thoroughly read the NWBInspector report and try to address as many issues as possible. **DANDI will prevent validation and upload of any issues
-       labeled as level 'CRITICAL' or above when using the `--config dandi` option.**
+   Thoroughly read the NWBInspector report and try to address as many issues as possible.
+   **DANDI will prevent validation and upload of any issues labeled as level 'CRITICAL' or above when using the `--config dandi` option.**
    See ["Validation Levels for NWB Files"](./135_validation.md) for more information about validation criteria for uploading NWB
-   files and which are deemed critical. We recommend regularly running the inspector early in the process to generate the best NWB files possible.
-   Note that some autodetected violations, such as `check_data_orientation`, may be safely ignored in the event
-   that the data is confirmed to be in the correct form; this can be done using either the `--ignore <check_name>` flag or a config file. See [the NWBInspector CLI documentation](https://nwbinspector.readthedocs.io/en/dev/user_guide/using_the_command_line_interface.html) for more details and other options, or type `nwbinspector --help`.
-   If the report is too large to efficiently navigate in your console, you can save a report using
-
-        nwbinspector <source_folder> --config dandi --report-file-path <report_location>.txt
-
-    4. Once your files are confirmed to adhere to the Best Practices, perform an official validation of the NWB files by running: `dandi validate --ignore DANDI.NO_DANDISET_FOUND <source_folder>`.
-       **If you are having trouble with validation, make sure the conversions were run with the most recent version of `dandi`, `PyNWB` and `MatNWB`.**
-    5. Now, prepare and fully validate again within the dandiset folder used for upload:
-
-        dandi download https://dandiarchive.org/dandiset/<dataset_id>/draft
-        cd <dataset_id>
-        dandi organize <source_folder> -f dry
-        dandi organize <source_folder>
-        dandi validate .
-        dandi upload
-
-       Note that the `organize` steps should not be used if you are preparing a BIDS dataset with the NWB files.
-       Uploading to the development server is controlled via `-i` option, e.g.
-       `dandi upload -i dandi-staging`.
-       Note that validation is also done during `upload`, but ensuring compliance using `validate` prior to upload helps avoid interruptions of the lengthier upload process due to validation failures.
-    6. Add metadata by visiting your Dandiset landing page:
-       `https://dandiarchive.org/dandiset/<dataset_id>/draft` and clicking on the `METADATA` link.
+   files and which are deemed critical. We recommend regularly running the inspector early in the process to generate the best NWB files possible. Note that some auto-detected violations, such as `check_data_orientation`, may be safely ignored in the event
+   that the data is confirmed to be in the correct form. See [the NWBInspector CLI documentation](https://nwbinspector.readthedocs.io/en/dev/user_guide/using_the_command_line_interface.html) for more information.
+1. Once your files are confirmed to adhere to the Best Practices, perform an official validation of the NWB files by running `dandi validate --ignore DANDI.NO_DANDISET_FOUND <source_folder>`.
+   **If you are having trouble with validation, make sure the conversions were run with the most recent versions of `dandi`, `PyNWB`, and `MatNWB`.**
+1. Upload the data to DANDI. This can either be done through the NWB GUIDE, or programmatically:
+
+       dandi download https://dandiarchive.org/dandiset/<dataset_id>/draft
+       cd <dataset_id>
+       dandi organize <source_folder> -f dry
+       dandi organize <source_folder>
+       dandi validate .
+       dandi upload
+
+   Note that the `organize` steps should not be used if you are preparing a BIDS dataset with the NWB files.
+   Uploading to the development server is controlled via the `-i` option, e.g.
+   `dandi upload -i dandi-staging`, as sketched below.
+   Note that validation is also done during `upload`, but ensuring compliance using `validate` prior to upload helps avoid interruptions of the lengthier upload process due to validation failures.
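+
+   For example, a minimal sketch of a development-server run (assuming your test Dandiset was created on https://gui-staging.dandiarchive.org and your API key for that server is configured):
+
+       # Fetch the draft Dandiset from the staging server, then upload back to it
+       dandi download https://gui-staging.dandiarchive.org/dandiset/<dataset_id>/draft
+       cd <dataset_id>
+       dandi upload -i dandi-staging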
+1. Add metadata by visiting your Dandiset landing page:
+   `https://dandiarchive.org/dandiset/<dataset_id>/draft` and clicking on the `METADATA` link.

 If you have an issue using the Python CLI, see the [Dandi Debugging section](./15_debugging.md).