Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Could not parse MediaType #6663

Closed
asialjim opened this issue Aug 2, 2023 · 5 comments
Closed

Could not parse MediaType #6663

asialjim opened this issue Aug 2, 2023 · 5 comments
Labels
Milestone

Comments

@asialjim
Copy link

asialjim commented Aug 2, 2023

When GuavaAPI parse Content-Type String to GuavaMediaType, there exception happen:
Maven Dependency:

       <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>32.1.2-jre</version>
        </dependency>

Code:

        MediaType parse = MediaType.parse("multipart/form-data;charset=UTF-8;boundary =--jelsaflflksafjel");
        System.out.println(parse);

Exception:

/home/asialjim/.jdks/corretto-17.0.7/bin/java -javaagent:/home/asialjim/.local/share/JetBrains/Toolbox/apps/IDEA-U/ch-0/232.8660.185/lib/idea_rt.jar=43365:/home/asialjim/.local/share/JetBrains/Toolbox/apps/IDEA-U/ch-0/232.8660.185/bin -Dfile.encoding=UTF-8 -classpath /home/asialjim/IdeaProjects/test/guava-test/target/classes:/home/asialjim/.m2/repository/com/google/guava/guava/32.1.2-jre/guava-32.1.2-jre.jar:/home/asialjim/.m2/repository/com/google/guava/failureaccess/1.0.1/failureaccess-1.0.1.jar:/home/asialjim/.m2/repository/com/google/guava/listenablefuture/9999.0-empty-to-avoid-conflict-with-guava/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar:/home/asialjim/.m2/repository/com/google/code/findbugs/jsr305/3.0.2/jsr305-3.0.2.jar:/home/asialjim/.m2/repository/org/checkerframework/checker-qual/3.33.0/checker-qual-3.33.0.jar:/home/asialjim/.m2/repository/com/google/errorprone/error_prone_annotations/2.18.0/error_prone_annotations-2.18.0.jar:/home/asialjim/.m2/repository/com/google/j2objc/j2objc-annotations/2.8/j2objc-annotations-2.8.jar io.github.asialjim.frame.remote.Main
Exception in thread "main" java.lang.IllegalArgumentException: Could not parse 'multipart/form-data;charset=UTF-8;boundary =--jelsaflflksafjel'
	at com.google.common.net.MediaType.parse(MediaType.java:1085)
	at io.github.asialjim.frame.remote.Main.main(Main.java:8)
Caused by: java.lang.IllegalStateException
	at com.google.common.base.Preconditions.checkState(Preconditions.java:496)
	at com.google.common.net.MediaType$Tokenizer.consumeCharacter(MediaType.java:1123)
	at com.google.common.net.MediaType.parse(MediaType.java:1063)
	... 1 more

When I delete the space charactor between 'boundary' and '=---' like this;

        MediaType parse = MediaType.parse("multipart/form-data;charset=UTF-8;boundary=--jelsaflflksafjel");
        System.out.println(parse);

Then It's worked!

By the way, This Content-Type: 'multipart/form-data;charset=UTF-8;boundary =--jelsaflflksafjel' with space charactor was generated by trip framework

@cgdecker
Copy link
Member

cgdecker commented Aug 2, 2023

From what I can tell, that Content-Type is illegal according to RFC 2045, which MediaType conforms to. A property for a media type is of the form attribute=value and the attribute cannot contain a space under any circumstances. The value can only contain a space if it's quoted.

A space seems to be allowed between the ; and the parameter though, e.g. multipart/form-data; charset=UTF-8; boundary=--jelsaflflksafjel. That said, in the time I've been looking at the RFCs I haven't seen anything that obviously states that spaces are allowed there, making me wonder if the RFCs might technically allow spaces around any parts of the syntax (in which case you could also legally write multipart / form-data, and the space after boundary could also be considered skippable rather than an invalid part of the parameter's attribute.

One other thing: the class Javadoc for MediaType states:

Note that this specifically does not represent the value of the MIME Content-Type header and as such has no support for header-specific considerations such as line folding and comments.

So trying to directly parse the value of a Content-Type header to a MediaType may not be the right thing to do anyway unless the API returning the string is accounting for and removing things like comments and line folding.

@cgdecker
Copy link
Member

cgdecker commented Aug 2, 2023

Digging around a little more, it looks like according to RFC 822, whitespace should be allowed around the = in a parameter (as well as around the / between the type and subtype, technically at least), though I'm still not 100% clear on that.

@cgdecker cgdecker added the type=defect Bug, not working as expected label Aug 2, 2023
@asialjim
Copy link
Author

asialjim commented Aug 3, 2023

Indeed, according to RFC 2045, that content type is illegal, but when I tried to parse it using the corresponding API of the Spring web framework, it also worked properly

        org.springframework.http.MediaType mediaType = org.springframework.http.MediaType.valueOf("multipart/form-data;charset=UTF-8;boundary =--jelsaflflksafjel");
        System.out.println(mediaType.getCharset());
        System.out.println(mediaType);
        mediaType = org.springframework.http.MediaType.valueOf("multipart/form-data; charset = UTF-8;boundary =--jelsaflflksafjel");
        System.out.println(mediaType.getCharset());
        System.out.println(mediaType);

Console Output

UTF-8
multipart/form-data;charset=UTF-8;boundary=--jelsaflflksafjel
UTF-8
multipart/form-data;charset=UTF-8;boundary=--jelsaflflksafjel

So I believe that if we can handle this defect, it will be more robust

@cgdecker
Copy link
Member

cgdecker commented Aug 3, 2023

As I mentioned in my second comment, I think my initial read was wrong and it actually is legal according to RFC 2045. I'm thinking it should be fine to allow whitespace before and after any of the separators, /, ;, and =.

@asialjim
Copy link
Author

asialjim commented Aug 4, 2023

Yes, it is.

copybara-service bot pushed a commit that referenced this issue Aug 7, 2023
…either side of the `=` separator in a parameter (`attribute=value`) as well as on either side of the `/` between the type and subtype (because nothing in the specification suggests that it be treated differently than the `;` or the `=`).

[RFC 822](https://datatracker.ietf.org/doc/html/rfc822), which specifies the notation used to describe the syntax of a media type in [RFC 2045](https://datatracker.ietf.org/doc/html/rfc2045#section-5.1), seems to [indicate](https://datatracker.ietf.org/doc/html/rfc822#section-3.1.4) that spaces should be allowed here (and in fact if it didn't, nothing in the specification would allow them around the `;` either).

Fixes #6663.

RELNOTES=`net`: Made `MediaType.parse` allow and skip over whitespace around the `/` and `=` separator tokens in addition to the `;` separator for which it was already being allowed.
PiperOrigin-RevId: 553222041
@cpovirk cpovirk added this to the 32.1.3 milestone Oct 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment