# Creating Dandisets and Uploading Data

This page provides instructions for creating a new Dandiset and uploading data to DANDI.

## **Prerequisites**
1. **Convert data to NWB.** You should start by converting your data to NWB format (2.1+). We suggest beginning the conversion process using only a small amount of data so that common issues may be spotted earlier in the process.
This step can be complex depending on your data. Consider using the following tools (a minimal conversion sketch follows this list):

1. **[NWB Graphical User Interface for Data Entry (GUIDE)](https://nwb-guide.readthedocs.io/en/stable/)** is a cross-platform desktop application for converting data from common proprietary formats to NWB and uploading it to DANDI.
2. **[NeuroConv](https://neuroconv.readthedocs.io/)** is a Python library that automates conversion to NWB from a variety of popular formats. See the [Conversion Gallery](https://neuroconv.readthedocs.io/en/main/conversion_examples_gallery/index.html) for example conversion scripts.
3. **[PyNWB](https://pynwb.readthedocs.io/en/stable/)** and **[MatNWB](https://github.com/NeurodataWithoutBorders/matnwb)** are APIs in Python and MATLAB that allow full flexibility in reading and writing data. ([PyNWB tutorials](https://pynwb.readthedocs.io/en/stable/tutorials/index.html), [MatNWB tutorials](https://github.com/NeurodataWithoutBorders/matnwb?tab=readme-ov-file#tutorials))
4. **[NWB Overview Docs](https://nwb-overview.readthedocs.io)** points to more tools helpful for working with NWB files.

Feel free to [reach out to us for help](https://github.com/dandi/helpdesk/discussions).

2. **Choose a server.**
- **Production server**: https://dandiarchive.org. This is the main server for DANDI and should be used for sharing neuroscience data.
When you create a Dandiset, a permanent ID is automatically assigned to it.
This Dandiset can be fully public or embargoed according to NIH policy.
All data are uploaded as draft and can be adjusted before publishing on the production server.
- **Development server**: https://gui-staging.dandiarchive.org. This server is for learning how to use DANDI and for testing the DANDI CLI and GUI, including as a testing platform for developers.
It should not be used to share or stage your data.

The below instructions will alert you to where the commands for interacting with these two different servers differ slightly.

3. **Register for DANDI and copy the API key.** To create a new Dandiset and upload your data, you need to have a DANDI account.
* If you do not already have an account, see the [Create a DANDI Account](./16_account.md) page for instructions.
* Once you are logged in, copy your API key.
Click on your user initials in the top-right corner after logging in.
Production (https://dandiarchive.org) and staging (https://gui-staging.dandiarchive.org) servers have different API keys and different logins.
* Store your API key somewhere that the CLI can find it; see ["Storing Access Credentials"](#storing-access-credentials) below.
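
To make the conversion prerequisite concrete, below is a minimal sketch that writes a nearly empty NWB file with PyNWB; the session metadata and the `example.nwb` filename are placeholders, and a real conversion would add subject and acquisition data (e.g. via NeuroConv):

```python
# Minimal sketch (assumes `pip install pynwb`): create and write an NWB file.
from datetime import datetime
from uuid import uuid4

from dateutil.tz import tzlocal
from pynwb import NWBFile, NWBHDF5IO

nwbfile = NWBFile(
    session_description="example session",       # placeholder metadata
    identifier=str(uuid4()),                     # globally unique file ID
    session_start_time=datetime.now(tzlocal()),  # must be timezone-aware
)

# A real conversion would add Subject info, acquisition data, etc. here.
with NWBHDF5IO("example.nwb", "w") as io:
    io.write(nwbfile)
```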

## **Data upload/management workflow**

The NWB GUIDE provides a graphical interface for inspecting and validating NWB files, as well as for uploading data to
DANDI. See the **[NWB GUIDE Dataset Publication Tutorial](https://nwb-guide.readthedocs.io/en/latest/tutorials/dataset_publication.html)** for more information.

The below instructions show how to do the same thing programmatically using the command line interface (CLI).
The CLI approach may be more suitable for users who are comfortable with the command line, need to automate the process, or have advanced use cases.

1. **Create a new Dandiset.**
* Click `NEW DANDISET` in the Web application (top right corner) after logging in.
* You will be asked to enter basic metadata: a name (title) and description (abstract) for your dataset.
* After you provide a name and description, the dataset identifier will be created; we will call this `<dataset_id>`.
1. **Check your files for [NWB Best Practices](https://nwbinspector.readthedocs.io/en/dev/best_practices/best_practices_index.html).**
Run [NWB Inspector](https://nwbinspector.readthedocs.io/en/dev/user_guide/user_guide_index.html) programmatically. Install the Python library (`pip install -U nwbinspector`) and run:

nwbinspector <source_folder> --config dandi

If the report is too large to efficiently navigate in your console, you can save a report using

nwbinspector <source_folder> --config dandi --report-file-path <report_location>.txt

For more details and other options, run:

nwbinspector --help

Thoroughly read the NWBInspector report and try to address as many issues as possible.
**DANDI will prevent validation and upload of any issues labeled as level 'CRITICAL' or above when using the `--config dandi` option.**
See ["Validation Levels for NWB Files"](./135_validation.md) for more information about validation criteria for uploading NWB files and which are deemed critical. We recommend regularly running the inspector early in the process to generate the best NWB files possible.
Note that some auto-detected violations, such as `check_data_orientation`, may be safely ignored in the event that the data is confirmed to be in the correct form; this can be done using either the `--ignore <name_of_check_to_suppress>` flag or a config file. See [the NWBInspector CLI documentation](https://nwbinspector.readthedocs.io/en/dev/user_guide/using_the_command_line_interface.html) for more information.
A Python sketch of the same inspection appears after this workflow list.

1. **Install the [DANDI Client](https://pypi.org/project/dandi/).**

pip install -U dandi
1. **Validate NWB files.** Once your files are confirmed to adhere to the Best Practices, perform an official validation of the NWB files by running:

dandi validate --ignore DANDI.NO_DANDISET_FOUND <source_folder>

**If you are having trouble with validation, make sure the conversions were run with the most recent versions of `dandi`, `PyNWB`, and `MatNWB`.**
1. **Upload the data to DANDI.** This can be done either through the NWB GUIDE or programmatically:

dandi download https://dandiarchive.org/dandiset/<dataset_id>/draft
cd <dataset_id>
dandi organize <source_folder> -f dry
dandi organize <source_folder>
dandi validate .
dandi upload

Note that the `organize` steps should not be used if you are preparing a BIDS dataset containing NWB files.
Uploading to the development server is controlled via the `-i` option, e.g. `dandi upload -i dandi-staging`.
Note that validation is also run during `upload`, but ensuring compliance with `validate` prior to upload helps avoid interruptions of the lengthier upload process due to validation failures.
A sketch for programmatically verifying the uploaded assets appears below, after this list.
1. **Add metadata to the Dandiset.** Visit your Dandiset landing page:
`https://dandiarchive.org/dandiset/<dataset_id>/draft` and click on the `METADATA` link.
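
As referenced in the file-checking step above, NWB Inspector can also be run from Python. This is a hedged sketch assuming the library's `inspect_all` and `load_config` helpers; consult the NWB Inspector documentation if the API differs in your version:

```python
# Sketch (assumes `pip install -U nwbinspector`): run the DANDI checks from Python.
from nwbinspector import inspect_all, load_config

# Equivalent in spirit to `nwbinspector <source_folder> --config dandi`.
messages = list(inspect_all(path="<source_folder>", config=load_config("dandi")))
for message in messages:
    print(message)  # each message names the failed check, file, and location
```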

If you have an issue using the DANDI Client, see the [DANDI Debugging section](./15_debugging.md).
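
If you want to confirm an upload programmatically, the following sketch uses the DANDI Python client (installed with the `dandi` package above) to list the assets in your draft Dandiset; `<dataset_id>` is the identifier from the first workflow step:

```python
# Sketch: list the assets of a draft Dandiset to verify an upload.
from dandi.dandiapi import DandiAPIClient

# Use "dandi-staging" instead of "dandi" for the development server.
with DandiAPIClient.for_dandi_instance("dandi") as client:
    dandiset = client.get_dandiset("<dataset_id>", "draft")
    for asset in dandiset.get_assets():
        print(asset.path)
```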

## Storing Access Credentials
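
As one illustration (a sketch, not the only supported method): the DANDI CLI reads the `DANDI_API_KEY` environment variable, so a script that drives the CLI can set it before invoking `dandi`; the key value here is a placeholder and should never be hard-coded in real scripts:

```python
# Sketch: supply the API key via the DANDI_API_KEY environment variable.
import os
import subprocess

os.environ["DANDI_API_KEY"] = "<your_api_key>"  # placeholder; load from a secret store
subprocess.run(["dandi", "upload", "-i", "dandi-staging"], check=True)
```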

