Add Merge function type and sort enumeration #177

Merged
merged 5 commits into from
Oct 12, 2024

Conversation

vanderpol
Member

Adding a merge function which joins together multiple elements into a single variable, with an optional delimiter, and optional sorting. Had to add a new sort enumeration which may need to be discussed.

@vanderpol vanderpol requested a review from solind October 10, 2024 17:02
@@ -938,6 +939,16 @@
<xsd:group ref="oval-def:ComponentGroup"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="MergeFunctionType">
<xsd:annotation>
<xsd:documentation>The merge function takes one or more components and merges them together into a single string, with an optional delimiter. For example if data from one registry multi-size value contains values of "abc" and "def". The merge function will resolve to a local_variable with the value of 'abcdef'. If an optional delimiter of ',' was used, the merge function would resolve to a local_variable with the value of 'abc,'def'</xsd:documentation>

Edited slightly:

The merge function takes one or more components and merges them together into a single string value, optionally including a delimiter string. For example, if data from one registry multi-size value contains values of "abc" and "def", the merge function operated on those values would resolve to a string with the value of 'abcdef'. If an optional delimiter of ',' was used, the merge function would resolve to a value of 'abc,def'.
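
A minimal Python sketch of the behavior described in that documentation text, for illustration only. The function name `merge` and its `delimiter` parameter are hypothetical stand-ins, not the schema's attribute names or any interpreter's API.

```python
from typing import Iterable


def merge(values: Iterable[str], delimiter: str = "") -> str:
    """Join component values into one string, keeping the order given."""
    return delimiter.join(values)


# Values as they might come from one multi-string registry value.
print(merge(["abc", "def"]))                 # abcdef
print(merge(["abc", "def"], delimiter=","))  # abc,def
```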

<xsd:documentation>The SortEnumeration simple type defines basic sorting operations. Currently 'none', 'alphabetical' and 'alphanumerical' are defined.</xsd:documentation>
</xsd:annotation>
<xsd:restriction base="xsd:string">
<xsd:enumeration value="none">

I wouldn't call this "none", I would call this "document_order". After all, two interpreters using the same system-characteristics file as input should both generate identical results.

Member Author

I wasn't entirely sure what you meant by "document_order". As long as we explain it as the order of the items/values in the OVAL system characteristics, I would agree.

Document order means... this is an XML document. The order is the order in which the associated elements appear in the document.

Member Author

In my brain, this all happens in memory structures, long before we end up writing it to XML, so that was my disconnect.

<xsd:documentation>No sorting is performed, the values will remain in the order they came in as.</xsd:documentation>
</xsd:annotation>
</xsd:enumeration>
<xsd:enumeration value="alphabetical">

I'd call this "lexical", which I believe is more standard terminology, and typically stands in contrast with "natural" ordering.

For example, given the strings {"a1", "a2", "a12", "1", "2", "12"}, in ascending order*, lexical sorting yields:

"1", "12", "2", "a1", "a12", "a2"

And natural sorting yields:

"1", "2", "12", "a1", "a2", "a12"

* This reminds me: there is a difference between the sort type and the order (ascending vs. descending).
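
The two orderings can be reproduced with a short Python sketch. This is only an illustration of the concept: `natural_key` is a hypothetical helper, and treating digit runs as integers that sort ahead of text runs is one common reading of "natural" order, not something the schema specifies.

```python
import re


def natural_key(value):
    """Sort key: digit runs compare as integers, other runs as text."""
    key = []
    for digits, text in re.findall(r"([0-9]+)|([^0-9]+)", value):
        if digits:
            key.append((0, int(digits), ""))  # numeric chunk
        else:
            key.append((1, 0, text))          # text chunk
    return key


values = ["a1", "a2", "a12", "1", "2", "12"]
lexical = sorted(values)                   # plain character-by-character comparison
natural = sorted(values, key=natural_key)  # digit runs compared by numeric value
print(lexical)  # ['1', '12', '2', 'a1', 'a12', 'a2']
print(natural)  # ['1', '2', '12', 'a1', 'a2', 'a12']
```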

Member Author

I'm fine with that, will update

<xsd:documentation>Sort alphabetical, useful for pure string lists</xsd:documentation>
</xsd:annotation>
</xsd:enumeration>
<xsd:enumeration value="alphanumerical">

How is this different from "natural"?

@solind solind left a comment

I think maybe you need a SortEnumeration (lexical, natural, numeric) and an OrderEnumeration (ascending/descending)

Where numeric would generate an error if it encounters a string value that cannot be converted to a base-10 number (float).
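
As a rough sketch of that numeric behavior in Python: the function name `numeric_sort` is hypothetical, and using `float()` as the conversion is an assumption. `float()` is more permissive than a strict base-10 parse (it also accepts forms such as "1e3", "inf" and "nan"), so a real interpreter would probably want a stricter check.

```python
def numeric_sort(values):
    """Sort string values by numeric value; fail on anything non-numeric."""
    keyed = []
    for value in values:
        try:
            keyed.append((float(value), value))
        except ValueError as exc:
            # Surface the offending value instead of silently skipping it.
            raise ValueError(f"numeric sort: {value!r} is not a number") from exc
    return [value for _, value in sorted(keyed, key=lambda pair: pair[0])]


print(numeric_sort(["10", "9", "2.5"]))  # ['2.5', '9', '10']
```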

@vanderpol
Member Author

I think maybe you need a SortEnumeration (lexical, natural, numeric) and an OrderEnumeration (ascending/descending)

Where numeric would generate an error if it encounters a string value that cannot be converted to a base-10 number (float).

I wasn't sure about a numeric sort, as accidental string data could cause all kinds of errors, but an alphanumeric (or natural) sort would handle them both. I'm not 100% certain on the need for alphanumeric vs. natural; likely just natural. I can easily add it, and content authors beware.

Agree on OrderEnumeration, will add.

solind commented Oct 10, 2024

As for numerical vs. natural, it's just a thought. You could effectively achieve the same thing using a variable datatype, like you did. 😀

@vanderpol
Member Author

So for the OrderEnumeration, I'm pondering the interaction between that and "document" sort? I guess we can just document that the OrderEnumeration has no influence on the 'document' sort, as 'document' sort, isn't sorting at all? Default OrderEnumeration should be set to "ascending".

@vanderpol
Member Author

Here's some updated sample results, I've added a recursive Registry test, a numeric sort test using variables, and I threw in a WMI57 test to show it works with records.
SCC-5.10.1_Beta1_2024-10-11_084515_OVAL-Results_merge_test_content.xml.txt

@vanderpol vanderpol requested a review from solind October 11, 2024 12:49

solind commented Oct 11, 2024

So for the OrderEnumeration, I'm pondering the interaction between that and "document" sort? I guess we can just document that the OrderEnumeration has no influence on the 'document' sort, as 'document' sort, isn't sorting at all? Default OrderEnumeration should be set to "ascending".

OrderEnumeration values would be "ascending" and "descending". I guess in the case of a "document_order" sort value, this would be ignored.

ETA: which I guess is what you're proposing, so we agree! Also yes to the default.
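
To make the agreed interaction concrete, here is a small Python sketch. The function `apply_sort` and its parameter names are hypothetical, the `"document_order"` default for `sort` is an assumption (only the "ascending" default for order is settled in this thread), and only the lexical sort is implemented to keep it short.

```python
def apply_sort(values, sort="document_order", order="ascending"):
    """Order values per a sort type and direction; direction is ignored for document order."""
    if order not in ("ascending", "descending"):
        raise ValueError(f"unknown order: {order!r}")
    if sort == "document_order":
        # Not a sort at all: keep the values as they appear, whatever the order is.
        return list(values)
    if sort == "lexical":
        return sorted(values, reverse=(order == "descending"))
    raise ValueError(f"sort type not covered by this sketch: {sort!r}")


print(apply_sort(["b", "a", "c"], "document_order", "descending"))  # ['b', 'a', 'c']
print(apply_sort(["b", "a", "c"], "lexical"))                       # ['a', 'b', 'c']
print(apply_sort(["b", "a", "c"], "lexical", "descending"))         # ['c', 'b', 'a']
```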

solind commented Oct 11, 2024

Here's some updated sample results, I've added a recursive Registry test, a numeric sort test using variables, and I threw in a WMI57 test to show it works with records. SCC-5.10.1_Beta1_2024-10-11_084515_OVAL-Results_merge_test_content.xml.txt

Naturally, a little digression here... I noticed this item in your results:

<win-sc:registry_item id="10" status="exists">
  <win-sc:hive>HKEY_LOCAL_MACHINE</win-sc:hive>
  <win-sc:key>SYSTEM\CurrentControlSet\Control\Session Manager</win-sc:key>
  <win-sc:name>ExcludeFromKnownDlls</win-sc:name>
  <win-sc:last_write_time datatype="int">133730601810000000</win-sc:last_write_time>
  <win-sc:type>reg_multi_sz</win-sc:type>
  <win-sc:value></win-sc:value>
  <win-sc:value></win-sc:value>
  <win-sc:windows_view>64_bit</win-sc:windows_view>
</win-sc:registry_item>

I see two empty values, which in XML parlance means strings of zero length. I assume this happened because a REG_SZ is terminated by a pair of NULL (0x00) bytes. IIRC, Joval would interpret this as a non-valued key, and represent that by creating a single item value like this:

  <win-sc:value status="does not exist"/>

This is important, because empty strings are still string values that exist.

@vanderpol
Member Author

vanderpol commented Oct 11, 2024

Very astute observation in the results, and given the lack of multi-sz content in the wild we may have overlooked this. Below is an export from my registry, so just to confirm you would say that this should be a DNE on the value for ExcludeFromKnownDlls?

In looking back in our code revision history, a former developer decided at least 12 years ago that we should intentionally mark this scenario as exists. Not saying it's 'right', but it was intentionally done, if misguided. Shows how frequently multi-sz keys are used in content more than anything I'd guess.

Windows Registry Editor Version 5.00;

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager]
"AutoChkTimeout"=dword:00000008
"BootExecute"=hex(7):61,00,75,00,74,00,6f,00,63,00,68,00,65,00,63,00,6b,00,20,
00,61,00,75,00,74,00,6f,00,63,00,68,00,6b,00,20,00,2a,00,00,00,00,00
"BootShell"=hex(2):25,00,53,00,79,00,73,00,74,00,65,00,6d,00,52,00,6f,00,6f,00,
74,00,25,00,5c,00,73,00,79,00,73,00,74,00,65,00,6d,00,33,00,32,00,5c,00,62,
00,6f,00,6f,00,74,00,69,00,6d,00,2e,00,65,00,78,00,65,00,00,00
"CriticalSectionTimeout"=dword:00278d00
"ExcludeFromKnownDlls"=hex(7):00,00

solind commented Oct 11, 2024

so just to confirm you would say that this should be a DNE on the value for ExcludeFromKnownDlls?

Yes, I would. I believe we even had a discussion about this on the old OVAL mailing list.
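
For reference, a minimal Python sketch of how the two hex(7) values in the export above decode, assuming REG_MULTI_SZ data is UTF-16LE text in which each string ends with a null character and an empty string closes the list. The names `reg_hex_to_bytes` and `decode_multi_sz` are made up for illustration; this is not SCC or Joval code.

```python
from typing import List


def reg_hex_to_bytes(hex_list: str) -> bytes:
    """Convert the comma-separated hex from a .reg export into raw bytes."""
    return bytes.fromhex(hex_list.replace(",", ""))


def decode_multi_sz(raw: bytes) -> List[str]:
    """Return the strings stored in REG_MULTI_SZ data (may be an empty list)."""
    strings = []
    for part in raw.decode("utf-16-le").split("\x00"):
        if part == "":
            break  # an empty string marks the end of the list
        strings.append(part)
    return strings


boot_execute = reg_hex_to_bytes(
    "61,00,75,00,74,00,6f,00,63,00,68,00,65,00,63,00,6b,00,20,"
    "00,61,00,75,00,74,00,6f,00,63,00,68,00,6b,00,20,00,2a,00,00,00,00,00"
)
exclude_from_known_dlls = reg_hex_to_bytes("00,00")

print(decode_multi_sz(boot_execute))             # ['autocheck autochk *']
print(decode_multi_sz(exclude_from_known_dlls))  # []
```

Under these assumptions ExcludeFromKnownDlls holds zero strings, which fits reporting the value as "does not exist". A plain split on the null character would instead give two empty strings for `00,00`; the thread does not confirm whether that is how the two empty `win-sc:value` elements came about.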

@vanderpol
Member Author

@solind any other thoughts on the PR? I feel like (after a lot of improvements based on your feedback, which I greatly appreciate) that we are pretty much good to go on this?

@@ -938,6 +939,17 @@
<xsd:group ref="oval-def:ComponentGroup"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="MergeFunctionType">
<xsd:annotation>
<xsd:documentation>The merge function takes one or more components and merges them together into a single string value, optionally including a delimiter string. For example, if data from one registry multi-size value contains values of "abc" and "def", the merge function operated on those values would resolve to a string with the value of 'abcdef'. If an optional delimiter of ',' was used, the merge function would resolve to a value of 'abc,def'.</xsd:documentation>

REG_MULTI_SZ doesn't actually mean "multi-size", it means multiple null-terminated strings.

Member Author

I swear I know what I'm doing from time to time... probably would have been faster to have you write this spec from the start.

</xsd:enumeration>
<xsd:enumeration value="natural">
<xsd:annotation>
<xsd:documentation>Sort like PHP's natsort https://www.php.net/manual/en/function.natsort.php</xsd:documentation>

@solind solind Oct 11, 2024

Natural sort order is a pretty established computing concept, with implementations in most programming languages and on most operating systems. So, I think this might be a better external URL to reference: https://en.wikipedia.org/wiki/Natural_sort_order

Member Author

Sounds good.

@solind solind left a comment

My only comments are documentation-related. LMK if you agree with them.

@vanderpol vanderpol requested a review from solind October 11, 2024 23:12
@vanderpol
Member Author

Fixed documentation per your suggestions. Hoping this is now done.

@solind solind left a comment

Looks good to me!

@vanderpol vanderpol merged commit 83ef99f into OVAL-Community:develop Oct 12, 2024