Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(outputs.iotdb): Handle paths that contain illegal characters #14519

Merged
merged 12 commits into from
Jan 23, 2024

Conversation

giovanni-bellini-argo
Copy link
Contributor

Summary

Support for IoTDB ILLEGAL_PATH error when tags contains an illegal character, explained more in detail in issue.

Checklist

  • No AI generated code was used in this PR

Related issues

resolves #14518

@telegraf-tiger
Copy link
Contributor

telegraf-tiger bot commented Jan 2, 2024

Thanks so much for the pull request!
🤝 ✒️ Just a reminder that the CLA has not yet been signed, and we'll need it before merging. Please sign the CLA when you get a chance, then post a comment here saying !signed-cla

Copy link
Contributor

@powersj powersj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Looking at the IoTBD this does appear to be the suggested way of handling special characters in the path.

Can you break out the regex checking into a new function and please add tests for that function with both cases including special characters? My only concern with these types of changes is future changes down the road or corner cases :)

Thanks!

@powersj powersj changed the title Update iotdb.go to support paths that contain illegal characters fix(outputs.iotdb): Handle paths that contain illegal characters Jan 2, 2024
@powersj powersj added the waiting for response waiting for response from contributor label Jan 2, 2024
@srebhan
Copy link
Member

srebhan commented Jan 3, 2024

@giovanni-bellini-argo please also sign the CLA!

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jan 3, 2024
@srebhan srebhan added fix pr to fix corresponding bug area/iot New plugins or features relating to IoT monitoring plugin/output 1. Request for new output plugins 2. Issues/PRs that are related to out plugins labels Jan 3, 2024
@giovanni-bellini-argo
Copy link
Contributor Author

!signed-cla

Copy link
Contributor

@powersj powersj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you confirm my understanding after playing with this a bit more: A path is made up of three elements:

  • root - reserved word
  • storage group
  • child - one or more items separated by periods

The storage group can only have:

The storage group name can only be characters, numbers and underscores.

I think this is what you are currently testing for. However, the child elements seem to allow a greater set of elements. This means the following are valid without escaping them.

root.foo.3
root.foo.()

Is that your understanding as well? I've been trying this with version 0.13.0, so that may play into this as well.

@giovanni-bellini-argo
Copy link
Contributor Author

giovanni-bellini-argo commented Jan 3, 2024

We are testing on V1.2.2 and could not create child nodes with any other characters outside of underscores and alphanumeric characters.

even tho the documentation states otherwise

@powersj
Copy link
Contributor

powersj commented Jan 3, 2024

Thanks!

We are testing on V1.2.2 and could not create child nodes with any other characters outside of underscores and alphanumeric characters.

I am glad we are trying on different versions. Unfortunately, having different behavior on different versions does mean we would need to make this opt-in as well. We cannot be breaking existing users on versions that do not run into this.

How we typically handle this is add a config option, something like:

# Mode to sanitize tags
# By default, tags are not sanitized. Possible options:
#   1.2 - only allows alpha/numeric and underscores, and non-numeric values. Otherwise, 
#         backticks are applied
sanitize_tag = ""

Then if that value is set to "1.2", you can apply your sanitization function.

I would still like to see a specific test as well that only tests the sanitize function with a various test cases.

Does that make sense?

@powersj
Copy link
Contributor

powersj commented Jan 3, 2024

I've also asked over in the iotdb repo for guidance on valid characters:

apache/iotdb#11840

Co-authored-by: SeanGaluzzi <SeanGaluzzi@users.noreply.github.com>
@giovanni-bellini-argo
Copy link
Contributor Author

a list of updates:

  1. added the option as per your request and made some researches: from version 1.x.x they completely changed the handling of nodes names which can apparently take only letters, numbers and underscores but can't be composed of only numbers.
  2. made some more checks to throw an error if root (which is a reserved kayword) is used and if ` is misused
  3. added the specific tests cases.

@powersj
Copy link
Contributor

powersj commented Jan 5, 2024

Thanks for the updates and doing that research. Almost there!

  1. I do not see your name in our CLA signed sheet. Can you please sign it?
  2. Looks like there a couple in lint issues:
plugins/outputs/iotdb/iotdb.go:277:28  gosimple  S1007: should use raw string (`...`) with regexp.Compile to avoid having to escape twice
plugins/outputs/iotdb/iotdb.go:310:5   revive    var-naming: don't use underscores in Go names; var tag_value should be tagValue
  1. And a test issue with your new tests, specifically: TestTagsHandling/treat_tags_as_device_IDs

  2. Please update the sample config and README with the new config option for sanitize_tag.

Happy to help with any of these updates, let me know if you do want assistence.

@giovanni-bellini-argo
Copy link
Contributor Author

giovanni-bellini-argo commented Jan 7, 2024

as per the CLA i did sign it on google (first time doing this so i cheched out online and found most use the google db) i did also read your guide about contributions and followed the link to sign the CLA on your site but the form has some iessues:

image

much appreciated your help.

- fixed lint issues
- fixed failing tests
- added documentation
@powersj
Copy link
Contributor

powersj commented Jan 11, 2024

Hey @giovanni-bellini-argo,

That seems to be your browser saying it is blocked by your content security policy. Can you please try a different browser?

Thanks!

@giovanni-bellini-argo
Copy link
Contributor Author

@powersj u seem to be right XD.
signed it

Copy link
Contributor

@powersj powersj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, thanks again for the re-work, I think we are nearly there!

plugins/outputs/iotdb/README.md Outdated Show resolved Hide resolved
plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
plugins/outputs/iotdb/sample.conf Outdated Show resolved Hide resolved
plugins/outputs/iotdb/iotdb_test.go Outdated Show resolved Hide resolved
Copy link
Contributor

@powersj powersj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working through this!

@powersj powersj added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Jan 18, 2024
@powersj powersj assigned srebhan and unassigned powersj Jan 18, 2024
Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this nice PR @giovanni-bellini-argo! I have some small comments, which moving the regexp-compilation out of the Write() path being the most important one...

plugins/outputs/iotdb/README.md Outdated Show resolved Hide resolved
plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
Comment on lines 269 to 277
matchUnsopportedCharacter, _ := regexp.Compile("[^0-9a-zA-Z_:@#${}\x60]")

regex := []*regexp.Regexp{matchUnsopportedCharacter}
regexArray = append(regexArray, regex...)

// from version 1.x.x IoTDB changed the allowed keys in nodes
case "1.0", "1.1", "1.2", "1.3":
matchUnsopportedCharacter, _ := regexp.Compile("[^0-9a-zA-Z_\x60]")
matchNumericString, _ := regexp.Compile(`^\d+$`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, use MustCompile and initialize regexArray in Init()!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why initializing regexArray in Init()?
u think it would be better to include it in the struct for future changes?

and i would also delete those useless variables and directly append the regex to the array.

regexArray := []*regexp.Regexp{} // array of compiled regex patterns

	switch s.SanitizeTags {
	case "0.13":
		regex := []*regexp.Regexp{
			regexp.MustCompile("[^0-9a-zA-Z_:@#${}\x60]"),
		}
		regexArray = append(regexArray, regex...)

	// from version 1.x.x IoTDB changed the allowed keys in nodes
	case "1.0", "1.1", "1.2", "1.3":
		regex := []*regexp.Regexp{
			regexp.MustCompile("[^0-9a-zA-Z_\x60]"),
			regexp.MustCompile(`^\d+$`),
		}

		regexArray = append(regexArray, regex...)
	default:
		return tag, nil
	}

the idea is to make it easy to undertand and modify

plugins/outputs/iotdb/sample.conf Outdated Show resolved Hide resolved
Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your update @giovanni-bellini-argo! Please try to compile the regular-expressions only once as suggested in the code! If you need help, I can provide a patch...

plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good @giovanni-bellini-argo! I would just un-export the member variable and fix the typos and we are good to go. Tried to do it in my suggestions to ease your life but no sure I got all of them... :-)

plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
plugins/outputs/iotdb/iotdb.go Outdated Show resolved Hide resolved
@giovanni-bellini-argo
Copy link
Contributor Author

done everything, thanks for the patience and help 😎

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your work and patience with my requests @giovanni-bellini-argo! :-)

@telegraf-tiger
Copy link
Contributor

@srebhan srebhan merged commit 4c1d8e3 into influxdata:master Jan 23, 2024
26 checks passed
@github-actions github-actions bot added this to the v1.29.3 milestone Jan 23, 2024
powersj pushed a commit that referenced this pull request Jan 29, 2024
)

Co-authored-by: SeanGaluzzi <SeanGaluzzi@users.noreply.github.com>
Co-authored-by: SeanGaluzzi <sean.galuzzi@argo.consulting>
(cherry picked from commit 4c1d8e3)
hhiroshell pushed a commit to hhiroshell/telegraf that referenced this pull request Feb 1, 2024
…luxdata#14519)

Co-authored-by: SeanGaluzzi <SeanGaluzzi@users.noreply.github.com>
Co-authored-by: SeanGaluzzi <sean.galuzzi@argo.consulting>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/iot New plugins or features relating to IoT monitoring fix pr to fix corresponding bug plugin/output 1. Request for new output plugins 2. Issues/PRs that are related to out plugins ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IoTDB device_id [ILLEGAL_PATH(509)] Exception occurred: insertRecord failed
4 participants