TRSA (Trusted Remote Storage Agent) and variable-level metadata upload #5213

Open
akio-sone opened this issue Oct 18, 2018 · 16 comments
Labels
Feature: File Upload & Handling Type: Feature a feature request User Role: Depositor Creates datasets, uploads data, etc.

Comments

@akio-sone
Contributor

Related issues

Related Documents about TRSA

permission required

public

@pdurbin
Member

pdurbin commented Nov 28, 2018

@akio-sone @jonc1438 I mentioned you over at #4821 (comment) but I wanted to highlight that for Make Data Count, we plan to only count downloads that go through Glassfish which means that downloads from TRSA, downloads from rsync, and downloads directly from Swift won't be counted.

@jonc1438

jonc1438 commented Nov 28, 2018 via email

@pdurbin
Member

pdurbin commented Nov 28, 2018

@jonc1438 ah, ok, so it sounds like data from a TRSA will still go through Glassfish. This is sort of like how an S3 download works. The user interacts with Glassfish so we'll count the download even if the file ultimately gets downloaded directly from S3 (if dataverse.files.s3-download-redirect is set to true). Thanks!
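A minimal sketch of the counting rule as described in this comment and the one before it: a download is counted when the user's request first reaches the application server (Glassfish), even if the bytes are then served from elsewhere. The path labels and the `is_counted` helper are hypothetical names used only for illustration and are not part of Dataverse; treating a TRSA download as going through Glassfish follows the "it sounds like" reading above and is an assumption.

```python
# Hypothetical labels for the download paths named in this thread.
# Paths where the user's request first reaches Glassfish:
THROUGH_APP_SERVER = {
    "glassfish",           # file served by the application server itself
    "s3_redirect",         # Glassfish redirects the user to S3
    "trsa_via_glassfish",  # assumption: TRSA data still goes through Glassfish
}

# Paths that never touch Glassfish, so nothing is recorded:
BYPASSES_APP_SERVER = {"rsync", "swift_direct"}


def is_counted(path: str) -> bool:
    """Return True if a download over this path would be counted."""
    if path in THROUGH_APP_SERVER:
        return True
    if path in BYPASSES_APP_SERVER:
        return False
    raise ValueError(f"unknown download path: {path!r}")


if __name__ == "__main__":
    for path in sorted(THROUGH_APP_SERVER | BYPASSES_APP_SERVER):
        print(path, is_counted(path))
```

The point of the sketch is that the deciding factor is where the request enters, not where the file is finally read from.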

@pdurbin
Member

pdurbin commented May 3, 2019

@jonc1438 @akio-sone @donsizemore great meeting this week. Would it be helpful if I stubbed out a workflow diagram similar to the one at http://guides.dataverse.org/en/4.13/admin/make-data-count.html#architecture ? I know you have nice diagrams at http://cyberimpact.us/architecture-overview/ and http://cyberimpact.us/dataverse-trusted-remote-storage-agent-update/ but @kcondon and I were talking about a diagram that's a little lower level, about communication back and forth between the different components, like that Make Data Count diagram. Please let me know. Thanks.

@djbrooke
Contributor

djbrooke commented May 3, 2019

@jonc1438 @akio-sone @donsizemore @pdurbin please coordinate this with @scolapasta. Thanks!

@jonc1438

jonc1438 commented May 3, 2019 via email

@scolapasta
Contributor

@pdurbin @jonc1438 I think a diagram could help us wrap our thoughts around this for our future discussions. Once you get started, let me know, and I can help add my understanding.

@pdurbin
Member

pdurbin commented Aug 7, 2019

I'm currently reviewing pull request #6068 by @akio-sone (I'm about 20% through it, reading from top to bottom) and I have a few comments and questions:

  • Is this the right issue to look at (and some of the links above, some of which are 404s) to understand the context of the pull request?
  • I'm thinking that perhaps I should start that diagram we talked about above.
  • What do we call this work from a "software features" perspective? I recently revised https://dataverse.org/software-features and will put a screenshot below. Is "TRSA" a (future) feature of Dataverse? I generally try to use friendlier terms when talking about features.
  • How do we plan to document this work in the guides?

[Image: Screen Shot 2019-08-07 at 1 27 58 PM]

@pdurbin
Member

pdurbin commented Aug 7, 2019

I'm thinking that perhaps I should start that diagram we talked about above.

@jonc1438 @akio-sone @donsizemore I just created the following diagram when reviewing #6068 and I could use some help with it.

[Image: trsa]

Here's the "source" for the diagram (.txt added to upload to this issue): trsa.uml.txt

Here's how I create a png from it:

java -jar /tmp/plantuml.jar -tpng trsa.uml
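For anyone who wants to script that step, here is a rough sketch that wraps the same command in Python. The helper names `plantuml_command` and `render_png` are made up for illustration. The jar path is just the one from the command above. The returned `.png` path assumes PlantUML's usual behaviour of writing the image next to the source under the same base name, which may not hold if the diagram source names its own output file. The attachment on this issue carries a `.txt` suffix, so it would need to be renamed to `trsa.uml` first.

```python
import subprocess
from pathlib import Path


def plantuml_command(source, jar="/tmp/plantuml.jar"):
    """Build the argument list for the PlantUML call shown above."""
    return ["java", "-jar", str(jar), "-tpng", str(source)]


def render_png(source, jar="/tmp/plantuml.jar"):
    """Run PlantUML on `source` and return the expected PNG path.

    Assumes Java and the PlantUML jar are available, and that the PNG
    is written next to the source with the same base name.
    """
    subprocess.run(plantuml_command(source, jar), check=True)
    return Path(source).with_suffix(".png")


if __name__ == "__main__":
    # Print the command instead of running it, so this works without Java.
    print(" ".join(plantuml_command("trsa.uml")))
```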

I'm basing this on what I'm seeing in pull request #6068 rather than any diagrams I've seen elsewhere. I figure we can update the diagram as more components are added. Apologies for all my misunderstanding of the various components. Please help me make corrections and please let me know if I should add this to Akio's branch.

@jonc1438

jonc1438 commented Aug 7, 2019 via email

@pdurbin
Member

pdurbin commented Nov 12, 2019

@jonc1438 it makes sense. Thanks! 😄 To be clear, I'm not talking about writing a lot of documentation. I'm talking about some diagrams similar to the ones you've been putting on the IMPACT blog.

I recently stumbled upon https://github.com/OdumInstitute/trsa-web/blob/jee8line/src/main/resources/doc/uml-diagrams-trsa-web.puml by @akio-sone and it looks great! It creates 9 diagrams but here's the big one that's pretty much exactly what I was asking for. It's a much better version of what I was trying to do myself above without knowing all the moving pieces. 😄

[Image: uml-diagrams-trsa-web]

@pacian
Copy link

pacian commented Apr 13, 2022

Hello,
We are very much interested in TRSA. Can you explain where you are with this task? Denmark has selected Dataverse to be the TDR for the country, and at some point we will be very interested in where the development of TRSA stands.

@pdurbin
Member

pdurbin commented Apr 13, 2022

@pacian hi! It looks like my last comment was in 2019. You might want to check out the talk by @jonc1438 at the 2020 community meeting: https://youtu.be/LHyiA3JeiwE?t=1466

I'll let others who are closer to the TRSA project give an update on what's new since then.

Exciting news about Denmark! Thanks!

@akio-sone
Contributor Author

@pacian @pdurbin Odum's Dataverse fork has a branch named trsa-api that has a new API endpoint to receive/save the payload of metadata from a TRSA instance without invoking the ingest. The latest update is based on version 5.10.1. Since the branch includes features that do not immediately benefit Dataverse per se, we are working with @qqmyers to sift out the essential changes from the current modifications and later merge them into the develop branch of Dataverse. As for TRSA itself, its source tree is available from Odum's GitHub site and it is undergoing a major UI makeover, hopefully to be committed to the develop branch soon.

@pacian

pacian commented Apr 29, 2022

Thank you very much for the update. It looks like we may have several Storage Locations to be used as remote storage.

@pdurbin
Member

pdurbin commented Oct 10, 2022

My understanding is that this "remote storage" PR that just shipped with 5.12 should help TRSA:

From the excellent "Adding Lots of Zeros to the Size of Datafiles" talk by @qqmyers in June:

[Image: Screen Shot 2022-10-10 at 7 54 50 AM]

@pdurbin pdurbin moved this to Community Backlog (Phil) in IQSS Dataverse Project Nov 2, 2022
@pdurbin pdurbin added Type: Feature a feature request User Role: Depositor Creates datasets, uploads data, etc. labels Oct 9, 2023
6 participants