Adding documentation about the rest of the classes involved on generating the CSharpAPI #529
Changes from 12 commits
074b800
8b7f416
266952c
0b17a0b
98ffc79
6854c9a
5f52db8
c6ee265
dc01215
781293d
db2592d
311eb2a
8e91165
6524fd3
6f45e69
7f5ac2b
43f6540
6484255
cb96a7f
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -26,6 +26,7 @@ | |
|
||
namespace Microsoft.ML.Runtime.Data | ||
{ | ||
/// <include file='doc.xml' path='doc/members/member[@name="NAFilter"]'/> | ||
💭 Seems strange that these are in an external file instead of defined here in code... #Resolved |
||
public sealed class NAFilter : FilterBase | ||
{ | ||
private static class Defaults | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -29,14 +29,14 @@ | |
|
||
namespace Microsoft.ML.Runtime.Data | ||
{ | ||
/// <summary> | ||
/// TermTransform builds up term vocabularies (dictionaries). | ||
/// Notes: | ||
/// * Each column builds/uses exactly one "vocabulary" (dictionary). | ||
/// * Output columns are KeyType-valued. | ||
/// * The Key value is the one-based index of the item in the dictionary. | ||
/// * Not found is assigned the value zero. | ||
/// </summary> | ||
|
||
// TermTransform builds up term vocabularies (dictionaries). | ||
// Notes: | ||
// * Each column builds/uses exactly one "vocabulary" (dictionary). | ||
// * Output columns are KeyType-valued. | ||
// * The Key value is the one-based index of the item in the dictionary. | ||
// * Not found is assigned the value zero. | ||
We'll have to be careful here. Conceptually, key values logically start at 0, but physically valid values start at 1. I feel like this might not be the best place to talk about key values unless you're really going to go into them, since otherwise it may be confusing. #Pending
Leaving it as is, since it is just a code comment for us. In reply to: 202454960 |
||
/// <include file='doc.xml' path='doc/members/member[@name="TextToKey"]/*' /> | ||
public sealed partial class TermTransform : OneToOneTransformBase, ITransformTemplate | ||
{ | ||
public abstract class ColumnBase : OneToOneColumn | ||
|
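The key assignment described in the TermTransform comment above can be sketched in a few lines. This is only an illustration of the stated convention (one-based index for known terms, zero for "not found"). `TermVocabularySketch`, `BuildVocabulary`, and `ToKey` are hypothetical names, not ML.NET APIs, and the sketch assumes terms are numbered in order of first appearance.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of ML.NET: mimics the key convention
// described in the TermTransform comment.
public static class TermVocabularySketch
{
    // Assumption: each distinct term gets a one-based index in order of first appearance.
    public static Dictionary<string, uint> BuildVocabulary(IEnumerable<string> terms)
    {
        var vocabulary = new Dictionary<string, uint>();
        foreach (var term in terms)
        {
            if (!vocabulary.ContainsKey(term))
                vocabulary[term] = (uint)vocabulary.Count + 1;
        }
        return vocabulary;
    }

    // Returns the one-based key of a term, or 0 when the term is not in the vocabulary.
    public static uint ToKey(Dictionary<string, uint> vocabulary, string term)
    {
        return vocabulary.TryGetValue(term, out uint key) ? key : 0u;
    }

    public static void Main()
    {
        var vocabulary = BuildVocabulary(new[] { "cat", "dog", "cat", "bird" });
        Console.WriteLine(ToKey(vocabulary, "cat"));  // 1
        Console.WriteLine(ToKey(vocabulary, "bird")); // 3
        Console.WriteLine(ToKey(vocabulary, "fish")); // 0 (not found)
    }
}
```

Reserving zero for "not found" is what makes the physical values start at 1, which is the distinction the review comment above is pointing at.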
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
<?xml version="1.0" encoding="utf-8" ?> | ||
<doc> | ||
<members> | ||
<member name="NAFilter"> | ||
<summary> | ||
Removes missing values from vector type columns. | ||
</summary> | ||
<remarks> | ||
This transform removes the entire row if any of the input columns have a missing value in that row. | |
Typo #Closed |
||
This preprocessing is required for many ML algorithms that cannot work with missing values. | ||
Useful if any missing entry invalidates the entire row. | ||
If <see cref="Microsoft.ML.Runtime.Data.NAFilter.Defaults.Complement"/> is set to true, this transform does the exact opposite: | |
it keeps only the rows that have missing values. | |
</remarks> | ||
<seealso cref="Microsoft.ML.Runtime.Data.MetadataUtils.Kinds.HasMissingValues"></seealso> | ||
<example> | ||
<code> | ||
pipeline.Add(new MissingValuesRowDropper("Column1")); | ||
</code> | ||
</example> | ||
</member> | ||
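The row-dropping behaviour described in the NAFilter remarks, including the complement option, can be sketched as follows. This is not the NAFilter implementation. `MissingRowFilterSketch` and `Filter` are hypothetical names, and the sketch assumes rows are plain `float[]` arrays with missing values represented as `NaN`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of ML.NET.
public static class MissingRowFilterSketch
{
    // Keeps the rows that have no NaN in any of the selected columns.
    // With complement = true, keeps only the rows that do have a NaN in one of them.
    public static List<float[]> Filter(IEnumerable<float[]> rows, int[] columns, bool complement = false)
    {
        return rows
            .Where(row => columns.Any(c => float.IsNaN(row[c])) == complement)
            .ToList();
    }

    public static void Main()
    {
        var rows = new List<float[]>
        {
            new[] { 1f, 2f },
            new[] { float.NaN, 3f },
            new[] { 4f, float.NaN },
        };

        // Only the first row has no missing value in columns 0 and 1.
        Console.WriteLine(Filter(rows, new[] { 0, 1 }).Count);                   // 1
        // The complement keeps the two rows that contain a missing value.
        Console.WriteLine(Filter(rows, new[] { 0, 1 }, complement: true).Count); // 2
    }
}
```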
|
||
<member name="NAHandle"> | ||
<summary> | ||
Handle missing values by replacing them with either the default value or the indicated value. | ||
</summary> | ||
<remarks> | ||
This transform handles missing values in the input columns. For each input column, it creates an output column | ||
where the missing values are replaced by one of these specified values: | ||
<list type="bullet"> | ||
<item><description>The default value of the appropriate type.</description></item> | ||
<item><description>The mean value of the appropriate type.</description></item> | ||
<item><description>The max value of the appropriate type.</description></item> | ||
<item><description>The min value of the appropriate type.</description></item> | ||
</list> | ||
<para>The last three work only for numeric/TimeSpan/DateTime kind columns.</para> | ||
<para> The output column can also optionally include an indicator vector for which slots were missing in the input column. | ||
This can be done only when the indicator vector type can be converted to the input column type, i.e. only for numeric columns. | ||
</para> | ||
<para> | ||
When computing the mean/max/min value, there is also an option to compute it over the whole column instead of per slot. | ||
This option has a default value of true for variable length vectors, and false for known length vectors. | ||
It can be changed to true for known length vectors, but it results in an error if changed to false for variable length vectors. | ||
</para> | ||
</remarks> | ||
<seealso cref="Microsoft.ML.Runtime.Data.MetadataUtils.Kinds.HasMissingValues"/> | |
<seealso cref="Microsoft.ML.Data.DataKind"/> | ||
<example> | ||
<code> | ||
pipeline.Add(new MissingValueHandler("FeatureCol", "CleanFeatureCol") { ReplaceWith = NAHandleTransformReplacementKind.Mean }); | ||
</code> | ||
</example> | ||
</member> | ||
|
||
</members> | ||
</doc> |
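To make the NAHandle remarks above concrete, here is a minimal sketch of mean replacement computed per slot versus over the whole column, plus the optional missing-value indicator. It is not the NAHandle implementation. `MissingValueReplaceSketch`, `ReplaceWithMean`, `MissingIndicator`, and the `perSlot` parameter are hypothetical names. The sketch assumes a known-length `float` vector column with `NaN` marking a missing slot, and it falls back to 0 when no value is observed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of ML.NET.
public static class MissingValueReplaceSketch
{
    // Replaces NaN slots with a mean of the observed values.
    // perSlot = true: each slot uses the mean of that slot across all rows.
    // perSlot = false: every slot uses one mean computed over the whole column.
    public static float[][] ReplaceWithMean(float[][] rows, bool perSlot)
    {
        int length = rows[0].Length;
        var means = new float[length];
        float overall = MeanOfObserved(rows.SelectMany(r => r));
        for (int slot = 0; slot < length; slot++)
        {
            int s = slot;
            means[slot] = perSlot ? MeanOfObserved(rows.Select(r => r[s])) : overall;
        }
        return rows
            .Select(r => r.Select((v, i) => float.IsNaN(v) ? means[i] : v).ToArray())
            .ToArray();
    }

    // Indicator vector: 1 where the input slot was missing, 0 otherwise.
    public static float[] MissingIndicator(float[] row)
    {
        return row.Select(v => float.IsNaN(v) ? 1f : 0f).ToArray();
    }

    // Assumption: an all-missing input yields 0 as the replacement value.
    private static float MeanOfObserved(IEnumerable<float> values)
    {
        var observed = values.Where(v => !float.IsNaN(v)).ToArray();
        return observed.Length == 0 ? 0f : observed.Average();
    }

    public static void Main()
    {
        var rows = new[]
        {
            new[] { 1f, float.NaN },
            new[] { 3f, 4f },
            new[] { float.NaN, 8f },
        };

        // Per slot: slot 0 mean is 2, slot 1 mean is 6.
        Console.WriteLine(ReplaceWithMean(rows, perSlot: true)[0][1]);  // 6
        // Whole column: the mean of 1, 3, 4, 8 is 4.
        Console.WriteLine(ReplaceWithMean(rows, perSlot: false)[0][1]); // 4
        Console.WriteLine(string.Join(",", MissingIndicator(rows[0]))); // 0,1
    }
}
```

A per-slot mean needs every row to have the same slots, which is presumably why the remarks say the per-slot setting is an error for variable-length vectors.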
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -20,7 +20,7 @@ public interface IFastTreeTrainerFactory : IComponentFactory<ITrainer> | |
{ | ||
} | ||
|
||
/// <include file='./doc.xml' path='docs/members/member[@name="FastTree"]/*' /> | ||
/// <include file='doc.xml' path='doc/members/member[@name="FastTree"]/*' /> | ||
❓ Why did this one lose the |
||
public sealed partial class FastTreeBinaryClassificationTrainer | ||
{ | ||
[TlcModule.Component(Name = LoadNameValue, FriendlyName = UserNameValue, Desc = Summary)] | ||
|
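The `path` attribute of an `<include>` tag is an XPath expression evaluated against the referenced XML file, so its first step has to match that file's root element. The sketch below checks this with `System.Xml.Linq` and `System.Xml.XPath`. `IncludePathSketch` and `CountMatches` are hypothetical names. The sample XML copies the `<doc><members>` shape of the doc.xml shown earlier, and whether the FastTree doc.xml has the same root is an assumption. The compiler's own include processing is not invoked here.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical helper: counts the elements an include-style XPath selects.
public static class IncludePathSketch
{
    public static int CountMatches(string xml, string xpath)
    {
        return XDocument.Parse(xml).XPathSelectElements(xpath).Count();
    }

    public static void Main()
    {
        // Assumed shape, modelled on the doc.xml in this PR.
        string xml =
            "<doc><members><member name='NAFilter'>" +
            "<summary>s</summary><remarks>r</remarks>" +
            "</member></members></doc>";

        // Root step matches <doc>; "/*" selects the member's child elements.
        Console.WriteLine(CountMatches(xml, "doc/members/member[@name='NAFilter']/*"));  // 2
        // Without "/*" the <member> element itself is selected.
        Console.WriteLine(CountMatches(xml, "doc/members/member[@name='NAFilter']"));    // 1
        // A root step that does not match the file selects nothing.
        Console.WriteLine(CountMatches(xml, "docs/members/member[@name='NAFilter']/*")); // 0
    }
}
```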
❓ Why make this change? I would expect one of the following:
(href specified but no explicit text)
<see href="http://epubs.siam.org/doi/pdf/10.1137/1.9781611972740.53">Reservoir-based Random Sampling with Replacement from Data Stream</see>
<see href="http://epubs.siam.org/doi/pdf/10.1137/1.9781611972740.53">Reservoir-based Random Sampling with Replacement from Data Stream (PDF, Proceedings of the 2004 SIAM International Conference on Data Mining)</see>
📝 <see href is a well-supported form for external links in documentation comments, more so than <a href. #Resolved
It doesn't seem that <see> has an href attribute:
https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/xmldoc/see
And this was a bit unsettling:
https://stackoverflow.com/questions/6960426/c-sharp-xml-documentation-website-link
In reply to: 202773275