
Dataverse upload

John Hoffer edited this page Jun 21, 2023 · 5 revisions

Setting up

Here are the steps needed to prepare to upload to the Dataverse with AWS.

  • Uploading SOURCE-ZIP-FILE to the minerva-dataverse-uploads bucket on AWS S3
  • Creating an AWS EC2 instance (a cloud server) with an associated security group
  • Creating/Locating your user-specific private key file
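The first step above can also be done from a terminal on your own machine. The following is a minimal sketch, assuming the AWS CLI is installed and configured locally; the function name `upload_zip_to_s3` is hypothetical and only the bucket name comes from this page.

```shell
# Hypothetical helper: copy a local zip file into the minerva-dataverse-uploads bucket.
upload_zip_to_s3() {
  source_zip="$1"
  # Keep the same file name in the bucket so later steps can refer to it.
  aws s3 cp "${source_zip}" "s3://minerva-dataverse-uploads/$(basename "${source_zip}")"
}

# Usage: upload_zip_to_s3 SOURCE-ZIP-FILE
```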

Connecting to AWS EC2

Open a terminal. Using the path to YOUR-PRIVATE-KEY.pem, which you have just downloaded, connect to YOUR-EC2-PREFIX.compute-1.amazonaws.com with ssh:

ssh -i YOUR-PRIVATE-KEY.pem ec2-user@YOUR-EC2-PREFIX.compute-1.amazonaws.com
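If ssh refuses the key because its file permissions are too open, restricting the file to owner read-only usually resolves it. The sketch below wraps both steps; the function name `connect_to_ec2` is hypothetical, and the host name pattern is the one shown above.

```shell
# Hypothetical helper: lock down the key file, then connect as ec2-user.
connect_to_ec2() {
  key_file="$1"
  ec2_prefix="$2"
  # ssh rejects private keys that other users can read.
  chmod 400 "${key_file}" || return 1
  ssh -i "${key_file}" "ec2-user@${ec2_prefix}.compute-1.amazonaws.com"
}

# Usage: connect_to_ec2 YOUR-PRIVATE-KEY.pem YOUR-EC2-PREFIX
```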

Now, you're connected to your EC2 instance.

Uploading from AWS EC2 to Dataverse

While connected to EC2, install Java and download the uploader on the EC2 instance:

sudo yum install java
wget https://github.com/GlobalDataverseCommunityConsortium/dataverse-uploader/releases/download/v1.1.0/DVUploader-v1.1.0.jar
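Before continuing, it can help to confirm that both steps succeeded. This is a minimal sketch, assuming the jar was downloaded into the current directory; the function name `check_uploader_ready` is hypothetical.

```shell
# Hypothetical helper: confirm java is on the PATH and the uploader jar is present.
check_uploader_ready() {
  if ! command -v java >/dev/null 2>&1; then
    echo "java not found; rerun the install step" >&2
    return 1
  fi
  # -s is true only if the file exists and is not empty.
  if [ ! -s DVUploader-v1.1.0.jar ]; then
    echo "DVUploader-v1.1.0.jar is missing or empty; rerun the download" >&2
    return 1
  fi
  echo "Uploader is ready"
}

# Usage: check_uploader_ready
```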

Run aws configure, entering your AWS ACCESS KEY and AWS SECRET KEY when asked. Then, copy SOURCE-ZIP-FILE from AWS S3. This assumes that the source zip file is already available on AWS within the minerva-dataverse-uploads S3 bucket.

aws s3 cp s3://minerva-dataverse-uploads/SOURCE-ZIP-FILE SOURCE-ZIP-FILE
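A failed or partial copy is easier to catch here than during the upload. The sketch below runs the same copy and then checks that a non-empty file arrived; the function name `fetch_source_zip` is hypothetical.

```shell
# Hypothetical helper: copy the zip from S3 and confirm a non-empty file arrived.
fetch_source_zip() {
  zip_name="$1"
  aws s3 cp "s3://minerva-dataverse-uploads/${zip_name}" "${zip_name}" || return 1
  # -s is true only if the file exists and is larger than zero bytes.
  if [ -s "${zip_name}" ]; then
    echo "Downloaded ${zip_name}"
  else
    echo "Download failed or file is empty: ${zip_name}" >&2
    return 1
  fi
}

# Usage: fetch_source_zip SOURCE-ZIP-FILE
```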

Upload, substituting YOUR-DATAVERSE-API-KEY, TARGET-DOI and SOURCE-ZIP-FILE:

java -jar DVUploader-v1.1.0.jar -directupload -key=YOUR-DATAVERSE-API-KEY -did=doi:TARGET-DOI -server=https://dataverse.harvard.edu SOURCE-ZIP-FILE
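To keep the three substitutions in one place, the upload command can be wrapped in a small function. This is only a sketch: the function name `upload_to_dataverse` and the environment variable `DATAVERSE_API_KEY` are hypothetical, and the uploader flags are exactly those shown above.

```shell
# Hypothetical helper: run the uploader with the key taken from an environment variable.
upload_to_dataverse() {
  target_doi="$1"
  source_zip="$2"
  # DATAVERSE_API_KEY is an assumed variable name, not something the uploader reads itself.
  if [ -z "${DATAVERSE_API_KEY:-}" ]; then
    echo "Set DATAVERSE_API_KEY first" >&2
    return 1
  fi
  java -jar DVUploader-v1.1.0.jar -directupload \
    -key="${DATAVERSE_API_KEY}" \
    -did="doi:${target_doi}" \
    -server=https://dataverse.harvard.edu \
    "${source_zip}"
}

# Usage: upload_to_dataverse TARGET-DOI SOURCE-ZIP-FILE
```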

Once the upload completes, please follow the remaining steps to clean up your AWS resources.

This is needed to avoid accruing costs for unused resources.

Terminating the EC2 instance

Open EC2 controls from AWS management

Visit this link to start, then open the page to manage all active EC2 services.


Find your EC2 instance


Open your EC2 instance's security group in a new tab


In the previous tab, terminate your EC2 instance


Now, return to the tab for your Security Group.
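For those who prefer the command line, the termination step has an AWS CLI equivalent. This is a sketch under assumptions: it needs the AWS CLI configured with permission to manage EC2, and it should be run from a machine other than the instance being terminated. The function name `terminate_instance` and the placeholder YOUR-INSTANCE-ID are hypothetical.

```shell
# Hypothetical helper: terminate an instance and wait until AWS reports it as terminated.
terminate_instance() {
  instance_id="$1"
  aws ec2 terminate-instances --instance-ids "${instance_id}" || return 1
  # Blocks until the instance reaches the terminated state.
  aws ec2 wait instance-terminated --instance-ids "${instance_id}"
}

# Usage: terminate_instance YOUR-INSTANCE-ID
```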

Deleting the Security Group

Delete any Inbound Rules


Delete any Outbound Rules


Delete the security group


If you see a warning against deletion, the EC2 instance hasn't terminated yet

Once the instance has terminated, delete the security group again

Security group deleted


Now, the EC2 instance is terminated and the security group is deleted.
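The security group deletion can also be scripted with the AWS CLI. Because deletion fails while the instance is still terminating, this sketch retries a few times; the function name `delete_security_group_with_retry`, the placeholder YOUR-SECURITY-GROUP-ID, the default of 10 attempts and the 15-second pause are all assumptions.

```shell
# Hypothetical helper: retry security group deletion until it succeeds or attempts run out.
delete_security_group_with_retry() {
  group_id="$1"
  max_attempts="${2:-10}"
  attempt=1
  while [ "${attempt}" -le "${max_attempts}" ]; do
    if aws ec2 delete-security-group --group-id "${group_id}"; then
      echo "Deleted security group ${group_id}"
      return 0
    fi
    echo "Attempt ${attempt} failed; the instance may still be terminating." >&2
    attempt=$((attempt + 1))
    # Pause before the next attempt.
    sleep 15
  done
  return 1
}

# Usage: delete_security_group_with_retry YOUR-SECURITY-GROUP-ID
```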
