Fisehara/decode uri before parsing #77

fisehara · 2024-01-03T15:16:39Z

No description provided.

thgreasi

lgtm, but I would prefer Page to make the call since I'm not familiar with this codebase.
The only weird case I could think of is whether it's fine to decode the URL part before the ?.

Page-

I still want to be confident that the changed parsing matches what the OData spec expects, but at least with the change as-is there's a bunch of pegjs-level stuff to improve

Page- · 2024-01-04T14:19:47Z

odata-parser.pegjs

@@ -641,19 +642,17 @@ DurationNumber =
 Text =
 	// This regex is equivalent to `(!ReservedUriComponent)`
 	text:$[^:/?#\[\]@!$*&()+,;=]*


Suggested change

-	text:$[^:/?#\[\]@!$*&()+,;=]*
+	$[^:/?#\[\]@!$*&()+,;=]*

Page- · 2024-01-04T14:24:31Z

odata-parser.pegjs

@@ -551,7 +552,7 @@ SubPathSegment =
 ResourceName =
 	// This regex is equivalent to `!(ReservedUriComponent / [ %])`
 	resourceName:$[^:/?#\[\]@!$*&()+,;= %]+
-	{ return decodeURIComponent(resourceName) }
+	{ return resourceName }


Suggested change

-	{ return resourceName }

Page- · 2024-01-04T14:24:35Z

odata-parser.pegjs

@@ -551,7 +552,7 @@ SubPathSegment =
 ResourceName =
 	// This regex is equivalent to `!(ReservedUriComponent / [ %])`
 	resourceName:$[^:/?#\[\]@!$*&()+,;= %]+


Suggested change

-	resourceName:$[^:/?#\[\]@!$*&()+,;= %]+
+	$[^:/?#\[\]@!$*&()+,;= %]+
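A rough illustration of why the label and the action can be dropped: the character class already excludes `%`, so a matched resource name can never contain a percent-escape and `decodeURIComponent` would return it unchanged. The sketch mirrors the class as a JavaScript regex; `RESOURCE_NAME` and `matchResourceName` are hypothetical names used only for illustration and are not part of the grammar.

```javascript
// Hypothetical JS mirror of the ResourceName character class in the grammar.
const RESOURCE_NAME = /^[^:\/?#\[\]@!$*&()+,;= %]+/;

// Returns the leading run of characters accepted by the class, or null.
function matchResourceName(input) {
	const m = RESOURCE_NAME.exec(input);
	return m === null ? null : m[0];
}

const name = matchResourceName('pilot(1)');
console.log(name); // 'pilot'

// '%' can never be part of a match, so decoding the match changes nothing.
console.log(decodeURIComponent(name) === name); // true

// A percent-escape ends the match rather than being decoded.
console.log(matchResourceName('pi%6Cot')); // 'pi'
```

The last call shows the flip side: an escaped resource name is not accepted as a whole, which is the behaviour the rest of this thread debates.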

Page- · 2024-01-04T14:26:16Z

odata-parser.pegjs


 Sign =
 		'+'
-	/	'%2B'
 		{ return '+' }
 	/	'-'
 	/	''

 // TODO: This should really be done treating everything the same, but for now this hack should allow FF to work passably.
 Apostrophe =


This should probably be in-lined now that it only checks a single character and there (probably?) won't be any benefit from memoizing, and definitely no benefit from deduplicating

This comment still stands

Parser is not parsing percent-encoded characters like %27 (apostrophe), %28 '(' or %29 ')'
Change-type: patch
Signed-off-by: fisehara <harald@balena.io>

fisehara · 2024-01-05T10:27:58Z

@Page-
The OData spec defines that percent-decoding should happen only on:

path segment
query option name
query option value

Given this syntax:

odata-parser/odata-parser.pegjs

Lines 116 to 133 in 6f762e8

    
QueryOptions =
	options:QueryOption|1..,'&'|
	{ return CollapseObjectArray(options) }

QueryOption =
		Dollar
		@(	SelectOption
		/	FilterByOption
		/	ExpandOption
		/	SortOption
		/	TopOption
		/	SkipOption
		/	CountOption
		/	InlineCountOption
		/	FormatOption
		)
	/	OperationParam
	/	ParameterAliasOption

The QueryOption should first be split into QueryName and QueryValue, and both should be percent-decoded, before the option-specific QueryValue is parsed.
Should we implement it this way?
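To make the proposed order concrete, here is a minimal sketch outside the grammar, assuming plain string handling. It splits on the raw `&` and `=` first and only then percent-decodes each name and value. `splitQueryOptions` is a hypothetical helper, not something that exists in odata-parser.

```javascript
// Hypothetical helper: split on the raw separators first, decode afterwards.
function splitQueryOptions(rawQuery) {
	return rawQuery
		.split('&')
		.filter((part) => part.length > 0)
		.map((part) => {
			// Only the first '=' separates the option name from its value.
			const eq = part.indexOf('=');
			const rawName = eq === -1 ? part : part.slice(0, eq);
			const rawValue = eq === -1 ? '' : part.slice(eq + 1);
			return {
				// decodeURIComponent throws a URIError on malformed escapes
				// and does not turn '+' into a space.
				name: decodeURIComponent(rawName),
				value: decodeURIComponent(rawValue),
			};
		});
}

const options = splitQueryOptions("%24filter=name%20eq%20'a%26b'&$top=5");
console.log(options[0].name); // '$filter'
console.log(options[0].value); // "name eq 'a&b'"
console.log(options[1].value); // '5'
```

Each decoded value would then be handed to the option-specific rule (filter, select, and so on), which is the extra pass whose cost is weighed later in the thread.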

Page- · 2024-01-05T11:55:10Z

odata-parser.pegjs

 	)

 spaces =
-	space*
+	[\ +]*


There's no need to escape the space character

Suggested change

-	[\ +]*
+	[ +]*

Page- · 2024-01-05T11:58:03Z

odata-parser.pegjs

@@ -583,7 +583,7 @@ Duration =
 	'duration'
 	Apostrophe
 	// Sign must appear first if it appears
-	sign:Sign
+	sign:$[+\-]?


Fwiw the - doesn't actually need to be escaped if it's the first or last character in the character class
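The same rule holds for JavaScript regular expressions, which makes it easy to check in isolation. This is only a sketch in plain JS regex syntax; it assumes the grammar's character classes treat a trailing `-` the same way, as the comment above states.

```javascript
// A '-' in last position is a literal, so both classes accept '+' or '-'.
const escaped = /^[+\-]?$/;
const unescaped = /^[+-]?$/;

for (const sign of ['+', '-', '', ',', '+-']) {
	console.log(JSON.stringify(sign), escaped.test(sign), unescaped.test(sign));
}

// In the middle of a class an unescaped '-' forms a range instead:
// '+' (0x2B) through '/' (0x2F) also covers ',' '-' and '.'.
const range = /^[+-\/]$/;
console.log(range.test(',')); // true
```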

Page- · 2024-01-05T11:59:39Z

odata-parser.pegjs

@@ -132,7 +134,6 @@ QueryOption =

 Dollar '$ query options' =


Since this only checks a single character it makes sense to in-line imo

Page- · 2024-01-05T11:59:57Z

odata-parser.pegjs


 Sign =
 		'+'
-	/	'%2B'
 		{ return '+' }
 	/	'-'
 	/	''

 // TODO: This should really be done treating everything the same, but for now this hack should allow FF to work passably.
 Apostrophe =


This comment still stands

Page- · 2024-01-05T12:01:42Z

odata-parser.pegjs


-space =


Fwiw I don't mind leaving this as an alias to avoid issues where in future people may add an extra thing that wants to check for spaces but don't realize that in a url a + can be equivalent to a space and so only check ' ' instead of [ +]. Unless of course there's a significant performance benefit
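A short sketch of the pitfall described above, using only built-in JavaScript functions: `decodeURIComponent` leaves `+` alone, while form-style query decoding (as done by `URLSearchParams`) reads it as a space. `SPACES` is a hypothetical regex mirror of the `[ +]*` class, not a name from the grammar.

```javascript
// decodeURIComponent only handles %XX escapes; '+' stays as it is.
console.log(decodeURIComponent('name+eq+1')); // 'name+eq+1'
console.log(decodeURIComponent('name%20eq%201')); // 'name eq 1'

// URLSearchParams applies form-style decoding, where '+' means a space.
const params = new URLSearchParams('$filter=name+eq+1');
console.log(params.get('$filter')); // 'name eq 1'

// Hypothetical mirror of `[ +]*`: skips both spellings of a space in a raw query.
const SPACES = /^[ +]*/;
console.log(SPACES.exec('+ +eq')[0].length); // 3
```

A rule that checks only ' ' would stop at the first `+` in the last example, which is the mistake a shared `space` alias guards against.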

Page- · 2024-01-05T12:03:46Z

Yeah, I think we should match what the OData spec expects @fisehara because the better we match the more tooling/documentation/etc will be cross compatible

fisehara · 2024-01-09T08:37:34Z

Pre-percent-decoding would violate OData parsing. For example:

OData separates query options on '&' and '='.
If a request percent-encodes '&' as '%26' inside a filter string and pre-decoding is applied, a stray '&' ends up in the request and is treated as a separator.

As of now this module is the most loaded module in the pinejs stack, and an OData / percent-encoding parity change would increase the load by >= 10%, which is not desirable.
Postponing work on spec parity for now.
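The failure case can be reproduced with built-in string functions alone. This is a sketch on an assumed example query, not output from the parser itself.

```javascript
// The client escaped '&' inside the string literal so it is data, not a separator.
const rawQuery = "$filter=name%20eq%20'a%26b'&$top=5";

// Decoding the whole query up front turns the escaped '&' into a separator.
const preDecoded = decodeURIComponent(rawQuery).split('&');
console.log(preDecoded.length); // 3: the filter literal is cut in two

// Splitting on the raw '&' first keeps the filter value in one piece.
const splitFirst = rawQuery.split('&').map((part) => decodeURIComponent(part));
console.log(splitFirst.length); // 2
console.log(splitFirst[0]); // "$filter=name eq 'a&b'"
```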

fisehara marked this pull request as draft January 3, 2024 15:17

fisehara force-pushed the fisehara/decode-uri-before-parsing branch from 4107312 to 3143f0c Compare January 4, 2024 09:32

fisehara requested a review from Page- January 4, 2024 09:33

fisehara force-pushed the fisehara/decode-uri-before-parsing branch 5 times, most recently from 7973d74 to b08feb7 Compare January 4, 2024 10:24

fisehara marked this pull request as ready for review January 4, 2024 10:24

fisehara linked an issue Jan 4, 2024 that may be closed by this pull request

Support all RFC3986 § 2.2 reserved characters and § 2.1 percent-encoding uppercase and lowercase #73

Open

flowzone-app bot enabled auto-merge January 4, 2024 10:26

fisehara force-pushed the fisehara/decode-uri-before-parsing branch from b08feb7 to 86af243 Compare January 4, 2024 11:37

thgreasi reviewed Jan 4, 2024

fisehara force-pushed the fisehara/decode-uri-before-parsing branch 3 times, most recently from 38816c1 to 721f272 Compare January 4, 2024 14:18

Page- reviewed Jan 4, 2024

Decode escaped URI

6870d86

Parser is not parsing percent-encoded characters like %27 (apostrophe), %28 '(' or %29 ')'
Change-type: patch
Signed-off-by: fisehara <harald@balena.io>

fisehara force-pushed the fisehara/decode-uri-before-parsing branch 2 times, most recently from 5b7895c to 6f762e8 Compare January 5, 2024 10:25

fisehara disabled auto-merge January 5, 2024 10:28

fisehara requested a review from Page- January 5, 2024 10:28

Page- reviewed Jan 5, 2024

fisehara added 2 commits January 5, 2024 15:02

Optimize Parser Syntax

084193a

Signed-off-by: fisehara <harald@balena.io>

Manual Percent Decoding

cdc901d

Signed-off-by: fisehara <harald@balena.io>

fisehara force-pushed the fisehara/decode-uri-before-parsing branch from 6f762e8 to cdc901d Compare January 5, 2024 15:56

flowzone-app bot enabled auto-merge January 5, 2024 15:58

fisehara closed this Jan 9, 2024

auto-merge was automatically disabled January 9, 2024 08:37
Pull request was closed
