Problems with handling spaces in file/directory names #229

Closed
phillipross opened this issue Apr 16, 2017 · 2 comments

@phillipross
Contributor

I have files and directories stored in Manta whose names contain spaces. The node-manta tools seem to allow this (I populated the directories with muntar), and the node-manta tools mls and mfind list the entries with spaces correctly:

$ mls ~~/stor/ica/images_2016_1022/abh  | grep " " | head -2
Adelaide images/
April 2014/
$ mfind ~~/stor/ica/images_2016_1022/abh  | grep " " | head -2
/philross/stor/ica/images_2016_1022/abh/Adelaide images
/philross/stor/ica/images_2016_1022/abh/April 2014

Unfortunately, the java-manta client seems to return these paths with spaces replaced by plus symbols. Requesting one of these paths then fails, because the path does not exist with plus symbols; it exists with spaces.

The solution isn't as simple as URLEncoder.encode(path, "UTF-8") either, since that would also replace the path separators with %2F.
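To make the problem concrete, here is a minimal sketch of what `java.net.URLEncoder` does to a whole path. The class and method names (`UrlEncoderDemo`, `encodeWholePath`) are made up for illustration and are not part of java-manta; the path is one of the examples listed above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class UrlEncoderDemo {
    // Hypothetical helper: encodes the entire path in a single call,
    // which is the approach being ruled out here.
    static String encodeWholePath(String path) {
        try {
            // URLEncoder produces application/x-www-form-urlencoded output:
            // a space becomes '+', and '/' becomes %2F.
            return URLEncoder.encode(path, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String path = "/philross/stor/ica/images_2016_1022/abh/April 2014";
        // prints: %2Fphilross%2Fstor%2Fica%2Fimages_2016_1022%2Fabh%2FApril+2014
        System.out.println(encodeWholePath(path));
    }
}
```

Both the separators and the space come out in a form that is wrong for a URL path.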

@phillipross
Contributor Author

It appears the MantaUtils.formatPath method uses URLEncoder.encode, which changes spaces to plus symbols, and Manta then stores the paths with literal plus symbols. I think spaces should be encoded as %20 instead of the plus symbol.

The following shows a series of tests I did with curl:

$ manta "/$MANTA_USER/stor/java-manta-testing" -XPUT -H"content-type: application/json; type=directory"
$ manta "/$MANTA_USER/stor/java-manta-testing"
$ manta "/$MANTA_USER/stor/java-manta-testing/a b c" -XPUT -H"content-type: application/json; type=directory"
{"BadHTTPRequest": "Request was malformed"}
$ manta "/$MANTA_USER/stor/java-manta-testing/a+b%20c" -XPUT -H"content-type: application/json; type=directory"
$ manta "/$MANTA_USER/stor/java-manta-testing"
{"name":"a+b c","type":"directory","mtime":"2017-04-16T04:44:05.852Z"}

In the output, you can see that creating directories with spaces in the path makes manta unhappy. But plus symbols are indeed stored as plus symbols, while %20 chars are stored as spaces.
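One way to get %20 while keeping the path separators is to encode each segment separately. The following is only a sketch under stated assumptions, not the java-manta implementation: `PathEncodingSketch`, `encodePath`, and `encodeSegment` are hypothetical names, and it assumes the server decodes `%2B` back to a literal plus symbol (the curl test above only shows an unencoded `+` being stored as a plus).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class PathEncodingSketch {
    // Hypothetical helper: percent-encodes each path segment on its own,
    // so the '/' separators are never touched.
    static String encodePath(String rawPath) {
        // The -1 limit keeps empty leading/trailing segments,
        // so a leading or trailing '/' survives the round trip.
        String[] segments = rawPath.split("/", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                out.append('/');
            }
            out.append(encodeSegment(segments[i]));
        }
        return out.toString();
    }

    static String encodeSegment(String segment) {
        try {
            // URLEncoder turns a literal '+' into %2B, so every '+' left in
            // its output stands for a space and can be rewritten as %20.
            return URLEncoder.encode(segment, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints: /user/stor/java-manta-testing/a%2Bb%20c
        System.out.println(encodePath("/user/stor/java-manta-testing/a+b c"));
    }
}
```

This keeps a space and a plus symbol distinguishable on the wire, which is the distinction the curl test relies on.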

@cburroughs cburroughs self-assigned this Apr 17, 2017
cburroughs added a commit to cburroughs/java-manta that referenced this issue Apr 21, 2017
When Manta paths appear as part of a URL (which they do on most
API requests), reserved characters must be encoded just like in any
other URL.

Paths were encoded inconsistently (in that they were sometimes
encoded, sometimes not, and sometimes encoded twice).  When paths were
encoded, spaces were not encoded correctly (turned to `+`) because,
despite the tantalizing name, `java.net.URLEncoder` does not
encode strings for inclusion in a URL.

The contract for `MantaObject.getPath` has been clarified. It should
always return the "real" (un-encoded) path.

ref TritonDataCenter#229 TritonDataCenter#230 231
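
The "encode exactly once" idea in this commit message can be illustrated with the JDK's multi-argument `java.net.URI` constructor, which quotes illegal path characters and always quotes `%`. This is a sketch only: `EncodeOnceSketch`, `RemoteObject`, and `toRequestPath` are hypothetical names, and nothing here is taken from the actual java-manta change.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class EncodeOnceSketch {
    // Hypothetical stand-in for an object whose getPath() always returns
    // the "real" (un-encoded) path.
    static final class RemoteObject {
        private final String path;

        RemoteObject(String path) {
            this.path = path;
        }

        String getPath() {
            return path;
        }
    }

    // Hypothetical helper: encodes a raw path at the point where the
    // request URL is built, and nowhere else.
    static String toRequestPath(String rawPath) {
        try {
            // The multi-argument URI constructor quotes illegal characters
            // such as spaces and leaves '/' and '+' alone.
            return new URI(null, null, rawPath, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        RemoteObject dir = new RemoteObject("/user/stor/abh/April 2014");
        String once = toRequestPath(dir.getPath());
        // Feeding an already-encoded path back in encodes the '%' again.
        String twice = toRequestPath(once);
        System.out.println(once);   // prints: /user/stor/abh/April%202014
        System.out.println(twice);  // prints: /user/stor/abh/April%25202014
    }
}
```

Keeping the stored path un-encoded means there is a single place where encoding happens, so the double-encoded form in the second line cannot arise by accident.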
cburroughs added a commit to cburroughs/java-manta that referenced this issue Apr 21, 2017
cburroughs added a commit to cburroughs/java-manta that referenced this issue Apr 24, 2017
cburroughs added a commit that referenced this issue Apr 24, 2017
@cburroughs
Contributor

Thank you for the detailed report. In master, spaces are now preserved as spaces (matching the node behavior).
