Create dataset_description.xlsx
Under this feature, SODA lets you quickly and accurately prepare the dataset_description metadata file which is mandatory for all SPARC datasets. SODA provides a convenient interface, which is more intuitive than the Excel spreadsheet template. It also makes use of information from your dataset on Pennsieve and the SPARC Airtable sheet to help you populate some of the fields easily. The expected structure of this file, generated automatically by SODA, is explained in our corresponding "How to" page if you would like to learn about it.
Step 1:
Currently, there are 2 ways to start creating your dataset_description.xlsx file with SODA:
1A. Start from scratch with SODA
1B. Continue working on an existing, locally-stored dataset_description.xlsx file (Coming soon)
If you start with (1A), you can go straight to Step 2.
If you start with (1B), you will be able to specify the location of the existing dataset_description file. Clicking "Confirm" after specifying the file path will load the information from the file into SODA.
Step 2:
The subsequent interface divides the dataset description file into six convenient sections to facilitate your task. Go through them successively and populate the various fields as indicated (Mandatory fields are indicated in the user interface and also bolded below):
- Dataset Information (high-level information about your dataset):
- Name: Descriptive title for the dataset. This field should exactly match the name of your dataset on Pennsieve. To pick it from your list of Pennsieve datasets, click "Help me select my dataset from Pennsieve" (to create a new Pennsieve dataset, see the "Create a new dataset" section).
- Description: Brief description of the study and the dataset. If you select a dataset from Pennsieve, this will be populated automatically from your dataset subtitle on Pennsieve for your convenience (see "Add metadata to dataset" feature for adding a description).
- Study Information (high-level information about the study of your dataset):
- Keywords: A set of 3-5 keywords (other than those used in the name and description) that will help users find your dataset once it is published.
- Number of subjects: The number of unique subjects in this dataset. It should match the subjects metadata file and must be greater than or equal to 1.
- Number of samples: The number of unique samples in this dataset. It should match the samples metadata file. Set to zero if there are no samples.
- Award and Contributor Information (information about your SPARC award and the contributors to your dataset)
- SPARC Award number associated with this dataset: Type the SPARC award associated with your dataset. If you need help with your award information, click "Help me with my award number and contributor information". If you have an Airtable account connected to SODA, you will be able to select your SPARC award from a dropdown list.
- Contributor Information (information about the contributors to your dataset): Click "Add a contributor" to start adding contributors to your dataset description file.
- Provide information about each contributor to the dataset. Note that the "Contributor" list is compiled from the SPARC Airtable sheet based on the SPARC award selected. Select a Contributor to have the ORCID ID, Contributor Affiliation, and Contributor Role populated automatically (if specified in the SPARC Airtable sheet). Select "I want to add a contributor not listed above" in the footer of the popup if you would like to enter a Contributor name manually (although we suggest entering it directly in the SPARC Airtable sheet).
- Check "Is Contact person?" if the contributor is the contact person for the dataset. Exactly one of the contributors should be the contact person.
- Click "Add" to add the contributor to SODA's contributor table. Each contributor added to the table will be added to the dataset description file when it is generated.
- Protocol Information (information about protocol(s) related to this dataset)
Click on "Add a protocol" to start adding protocol information related to your dataset and specify the following:
- Protocol URL: The protocols.io URL of the protocol. When you select a protocol title in the preceding field (Protocol title), SODA fills in the protocol location or link for this field automatically.
- Protocol description: Optionally provide a short description of the protocol.
- Completeness Information (Optional) (information about potential relation with other dataset(s))
- Completeness of dataset: Is the dataset as uploaded complete, or is it part of an ongoing study? Select "hasNext" to indicate that you expect more data on different subjects as a continuation of this study. Select "hasChildren" to indicate that you expect more data on the same subjects or on samples derived from those subjects. Leave empty if neither applies.
- Parent dataset(s): If this dataset is part of a larger dataset, or references subjects or samples from a parent dataset, select the prior dataset. You need only the last dataset, not all datasets. If samples and subjects are from multiple parent datasets, select them all. Leave empty if none.
- Title for a complete dataset: Give a provisional title for the entire dataset. Leave empty if not applicable.
- Additional Information (Optional)
- Other funding sources: Specify other funding sources, if any. Press "Enter" on your keyboard after typing each one. Leave empty if none.
- Acknowledgments: Specify any acknowledgments beyond funding and contributors. Leave empty if none.
- Additional link(s) (information about article(s) related to this dataset): Click the "Add additional link" button to start adding links related to your dataset. Specify the following:
- Link type: Select the type of link from the following:
- Originating Article DOI: DOIs of published articles that were generated from this dataset
- Additional links: URLs of additional resources used by this dataset (e.g., a link to a code repository)
- Link: Enter the link
- Link description: Optionally provide a short description of the link.
- Click "Add" to add the specified link to SODA's link table. All links and descriptions added to the table will be included in your dataset description file when it is generated.
- After you complete all steps, click on "Generate" to generate your dataset description file. A warning message may show up if any mandatory fields are missing. You may decide to go back and address the issues or generate the file anyway (and address the issues later).
- In the contributors' table, you can drag and drop rows to organize contributors in the order that they should appear in the dataset_description file. You can also remove/edit one with the respective delete/edit button.
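The constraints stated in the steps above (3-5 keywords, at least one subject, a non-negative number of samples, and exactly one contact person) can be sketched as a small pre-check. This is only an illustration and not SODA's actual validation code: the `Contributor` and `DatasetDescription` classes, their field names, and the `find_issues` function are all hypothetical, and treating a negative sample count as an error is an assumption inferred from "Set to zero if there are no samples".

```python
from dataclasses import dataclass, field


@dataclass
class Contributor:
    # Hypothetical stand-in for one row of the contributor table.
    name: str
    is_contact_person: bool = False


@dataclass
class DatasetDescription:
    # Hypothetical container for a few of the fields described above.
    keywords: list[str]
    number_of_subjects: int
    number_of_samples: int
    contributors: list[Contributor] = field(default_factory=list)


def find_issues(desc: DatasetDescription) -> list[str]:
    """Return a list of warnings; an empty list means no issue was found."""
    issues = []
    if not 3 <= len(desc.keywords) <= 5:
        issues.append("Expected 3-5 keywords")
    if desc.number_of_subjects < 1:
        issues.append("Number of subjects must be greater than or equal to 1")
    if desc.number_of_samples < 0:
        # Assumption: zero means "no samples", so negatives are invalid.
        issues.append("Number of samples cannot be negative")
    contacts = sum(c.is_contact_person for c in desc.contributors)
    if contacts != 1:
        issues.append("Exactly one contributor should be the contact person")
    return issues


if __name__ == "__main__":
    example = DatasetDescription(
        keywords=["vagus nerve", "stimulation", "rat"],
        number_of_subjects=4,
        number_of_samples=0,
        contributors=[
            Contributor("A. Researcher", is_contact_person=True),
            Contributor("B. Researcher", is_contact_person=True),
        ],
    )
    for issue in find_issues(example):
        print(issue)
```

Returning a list of warnings rather than raising an error mirrors the behavior described for "Generate", where you may still produce the file and address the reported issues later.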