Bug: unpublished files can be downloaded by users with the File Downloader role (DownloadFile permission) on that file or inherited from dataset or dataverse #2648
Comments
Please write unit tests for testing the permissions, instead of going by beliefs.
Yes! In fact, please start writing down the way things should work, i.e. the requirements, let the accountable people sign them off, then write the tests, see that the tests fail in the current situation and then start fixing the code. (I've been saying this for a while and it's not meant to critique individuals, but testing is essential in these parts of the code.) |
@bencomp I'm not aware of a requirements doc for this part of the code. When you ask for documentation, are you asking for developer-related documentation in the form of requirements? Or are you asking for end user documentation in the form of additions to the User Guide? Do know that #2653 was opened yesterday for the User Guide; it says, "Permissions: write up more information about each specific permission." The permission in question for this issue is the "DownloadFile" permission. When you say unit tests do you really mean integration tests? I'm pretty sure one would need a running system (Glassfish, Postgres, etc.) to test this issue. With enough time I could write failing integration tests that demonstrate the issue (there are many edge cases though) and then fix the code to make the tests pass. I'd be starting from scratch though, so it would be time consuming. @bencomp I'm glad you're championing a culture of automated testing because it's an interest of mine as well. I've written some integration tests at https://github.com/IQSS/dataverse/tree/v4.2/src/test/java/edu/harvard/iq/dataverse/api but they are not executed automatically on every build like the unit tests are. I run them when I think of them. And actually, I can't even run too many at once because of database deadlocks: #2460 |
@pdurbin I don't think you need a running system to test enforcement of permissions, because that happens in the Dataverse code, doesn't it? You might need to create some My suggestion about writing documentation first was aimed at preventing being 6 months into production and then find out different people understand core functionality differently. From the comments on #2645 I understand that this is the case. Also: how can you mark this issue solved if you don't test for every (edge) case? Can you at least start with the case that fails right now? |
@bencomp the main edge cases to test for this issue are when a permission is inherited, when it is implicit vs. explicit. That's why I wrote "either directly or through inheritance, it is believed." Both implicit and explicit permissions should be tested for this issue. Oh, and groups vs. users are edge cases. There are a variety of types of groups (explicit, IP, Shib). https://github.com/bencomp/dataverse/blob/4.3-dev/src/test/java/edu/harvard/iq/dataverse/authorization/DataverseRolePermissionHelperTest.java looks interesting but it doesn't test the "Data Access" API ( i.e. http://localhost:8080/api/access/datafile/12 ). I'm all for refactoring API code to make it testable by unit tests but I'm honestly not sure how to do this. @bencomp if you can show the way, please do! As I mentioned at http://irclog.iq.harvard.edu/dataverse/2015-10-20 perhaps http://guides.dataverse.org/en/latest/developers/testing.html should be updated to communicate to developers what's expected with regard to automated testing (unit tests and integration tests). Also, it would be great to have integration tests run on every commit at https://travis-ci.org/IQSS/dataverse but I haven't figured out how to do this. Right now Travis only runs unit tests, not integration tests. |
So it looks like every download is a call to For unit tests, you could mock parts of the code that go into the entity management, or provide a mock entity manager that has every possible situation that you want to support in Dataverse.
whether the one definitive authorisation check
I think the edge cases are the combinations that should throw the exceptions. All valid combinations are just cases that need to be handled correctly. The case at hand is of course the combination
The rule "should the download be allowed?" should be answered "no" under these conditions. As I said, you want to break up this rule. You should have a unit test for the publication status check and another one for the file restriction check. Then you can treat many cases as a single one:
|
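A minimal sketch of the split suggested above, with one function for the publication status check and one for the file restriction check, so each can be unit tested without a running system. The class and method names are hypothetical and are not taken from the Dataverse code base. The outcome for an unpublished, restricted file when the user holds both permissions is an assumption; the thread does not state it.

```java
// Hypothetical sketch: names are illustrative, not Dataverse classes.
final class DownloadRule {

    // Check 1: publication status. Published files pass; unpublished files
    // pass only for users who may view unpublished content.
    static boolean passesPublicationCheck(boolean published, boolean canViewUnpublished) {
        return published || canViewUnpublished;
    }

    // Check 2: file restriction. Unrestricted files pass; restricted files
    // pass only for users holding the DownloadFile permission.
    static boolean passesRestrictionCheck(boolean restricted, boolean hasDownloadFile) {
        return !restricted || hasDownloadFile;
    }

    // The download is allowed only if both checks pass.
    // Assumption: an unpublished, restricted file is downloadable when the
    // user holds both permissions; the thread does not confirm this case.
    static boolean mayDownload(boolean published, boolean restricted,
                               boolean canViewUnpublished, boolean hasDownloadFile) {
        return passesPublicationCheck(published, canViewUnpublished)
                && passesRestrictionCheck(restricted, hasDownloadFile);
    }

    public static void main(String[] args) {
        // The case reported in this issue: unpublished file, DownloadFile only.
        System.out.println(mayDownload(false, false, false, true));
    }
}
```

Because the two checks are independent, the many permission sources (direct, inherited, user, group) only need to be resolved into the two boolean inputs before the rule is applied.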
I couldn't agree more. Pinging @michbarsinai for feedback on this great writeup since he implemented the permission system. Thanks @bencomp ! |
Note that manual setup is required in the GUI due to #2497
@landreev as we discussed I'm passing this issue to you to look at the commit I just pushed (852742e) which is a no-op (no change in behavior) but adds more logging and javadoc in the area of the bug. The commit also includes an updated quick-and-dirty script with three test scenarios (but more should be written): https://github.com/IQSS/dataverse/blob/5a235110a482b41ea81c1a51fb5a3146121ba5c0/scripts/issues/2648/reproduce |
…lished version could be downloaded);
OK, I believe it should be working properly now, but please test using your scripts. if (permissionService.on(df).has(Permission.DownloadFile)) { |
I'm on 3ae2013 and I'm getting an exception when I try to restrict a file. Here's a stack trace about "org.eclipse.persistence.exceptions.OptimisticLockException Exception Description: The object [edu.harvard.iq.dvn.core.study.FileMetadata[id=1]] cannot be merged because it has changed or been deleted since it was last read": stack.txt |
I just dropped my database and tried again to restrict a file but I can't via the GUI (and it's not possible via the API per #2497). As with the previous stack trace I uploaded with my last comment, this one also ended with "Caused by: java.lang.NullPointerException at edu.harvard.iq.dataverse.DatasetPage.populateDatasetUpdateFailureMessage(DatasetPage.java:2247)" This worked before recent changes by @landreev and @sekmiller . Do either of you know what's going on? If you can reproduce this it probably deserves a separate issue. "Can't restrict files" or whatever. |
Kevin, please re-test this: both your ("exotic") case of a non-owner user with the ViewUnpublished permission (they should be able to download non-restricted unpublished files), and whether everything else download permissions-related is still working properly. |
@landreev Hold the presses, found another issue: a restricted file can be downloaded through the API by a user who has only view-unpublished dataset permissions. Also, no request access button appears in the UI in this case. Additional setup info: restricted file: 50by1000.tab, file id=2700. Given the above, this API call should return 403: Otherwise the initial problem and all likely cases work as expected. |
OK, yeah, just as I suspected - only the filemetadata is marked as restricted; but not the file: SELECT restricted FROM filemetadata WHERE datafile_id=2700 This is how DatasetPage works (only marks the filemetadata, when you restrict the file; the file object becomes restricted only when the dataset is published). This is done on purpose, actually. Because when you have a published version with a non-restricted file, and you try to restrict it - we only want it to be restricted in the next DRAFT version. And we don't want it to affect the unrestricted status of that published file immediately. But only when it's finalized, by publishing the DRAFT. Of course whoever implemented it didn't think of this exotic case - an unpublished file that only exists in a DRAFT, which another user has been given permission to view. So it looks like for such files we DO want to mark the DataFile object as restricted right away. |
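The distinction described above can be illustrated with a small sketch. It computes an effective restriction status from the two flags rather than marking the file object, which is a different technique from the fix proposed in the comment. All names are hypothetical, and the three boolean inputs are assumptions standing in for the real entity fields.

```java
// Hypothetical sketch; these are not the Dataverse entity classes.
final class RestrictionStatus {

    // dataFileRestricted: flag on the file object, normally set at publication.
    // draftMetadataRestricted: flag on the DRAFT version's filemetadata row.
    // everPublished: whether the file appears in any published version.
    static boolean effectivelyRestricted(boolean dataFileRestricted,
                                         boolean draftMetadataRestricted,
                                         boolean everPublished) {
        if (everPublished) {
            // A pending restriction in the draft must not change the
            // status of the already published file.
            return dataFileRestricted;
        }
        // Draft-only file: honor the draft restriction right away.
        return dataFileRestricted || draftMetadataRestricted;
    }

    public static void main(String[] args) {
        // The "exotic" case: the file exists only in a draft and was
        // restricted there, but the file object is not yet marked.
        System.out.println(effectivelyRestricted(false, true, false));
    }
}
```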
@kcondon: |
Could you say that the restriction status of any file (i.e. |
@landreev |
OK, should now be working, for reals! |
@bencomp: |
Please write an automated test and use a function table listing all inputs and corresponding outputs to help prevent overlooking anything. :) |
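One way to read the function table suggestion is a test that enumerates every input combination and compares a decision function against a table of expected outputs. The sketch below does that for four boolean inputs. The `Rule` interface, the class name, and the table itself are hypothetical; the row for an unpublished, restricted file with both permissions is an assumption, since the thread does not state the expected result.

```java
// Hypothetical sketch of a function-table test; not Dataverse code.
final class FunctionTableCheck {

    // Decision function under test.
    interface Rule {
        boolean allow(boolean published, boolean restricted,
                      boolean viewUnpublished, boolean downloadFile);
    }

    // Expected output per input combination. The index is built from the
    // bits (high to low): published, restricted, viewUnpublished, downloadFile.
    static final boolean[] EXPECTED = {
        false, false, true,  true,   // unpublished, unrestricted
        false, false, false, true,   // unpublished, restricted (last entry is an assumption)
        true,  true,  true,  true,   // published, unrestricted
        false, true,  false, true    // published, restricted
    };

    // Counts the rows where the rule disagrees with the table.
    static int countMismatches(Rule rule) {
        int mismatches = 0;
        for (int i = 0; i < EXPECTED.length; i++) {
            boolean published = (i & 8) != 0;
            boolean restricted = (i & 4) != 0;
            boolean viewUnpublished = (i & 2) != 0;
            boolean downloadFile = (i & 1) != 0;
            if (rule.allow(published, restricted, viewUnpublished, downloadFile) != EXPECTED[i]) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        Rule candidate = (p, r, vu, df) -> (p || vu) && (!r || df);
        System.out.println("mismatches: " + countMismatches(candidate));
    }
}
```

Writing the table first makes every combination an explicit decision that can be reviewed and signed off before the code is changed.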
Note: for API data access testing, use both session and API key authentication. |
@bencomp |
Closing |
OK, this issue was a total disaster. I feel bad about it. Of course it didn't even have any business being in 4.2.1. I picked it up as a result of a misunderstanding; I thought I was being asked to just add 3 lines to the Access class. But once I realized it was a bigger issue, I should've undone any changes and tagged it 4.3 (this had nothing to do with performance; and the original bug was narrow in scope and there was a workaround...). Because in the end I was working on it in a hurry and did more harm than good... In my defense, I hadn't done much work on permissions before, except for hooking up this API method to the PermissionService originally. The only positive side here is that in the process we've realized/been reminded of how messy that code was, that the business logic for these permission lookups is spread all over the app, that there are no automated tests for this, etc. I'll be opening more tickets for all these issues; hopefully we'll address them soon. |
As clarified in c8e49f2 the File Downloader role should only apply to published files. Currently, if a user (or group, presumably) is assigned the DownloadFile permission to an unpublished file (either directly or through inheritance, it is believed), that user can download the file if he or she knows the file id. That is to say, if a user clicks a link to http://localhost:8080/api/access/datafile/12 (for example) and is logged in to Dataverse, the file can be downloaded even if it hasn't been published. This is not the intended behavior.
The intention is that while a dataset is being prepared and files are being set to restricted, the File Downloader role can be assigned ahead of time, before the dataset is published. The role should have no effect, however, until the dataset is published, as discussed at #2645. (This should probably be documented.)
Testing of this issue should include assigning the DownloadFile permission at various levels such as at the file itself, at the file's parent dataset, and at the dataset's parent dataverse. Testing should include assignment of the role to both users and groups.