OneHotEncoding sample #2779
@@ -0,0 +1,62 @@
using System;
using System.Collections.Generic;
using Microsoft.ML.Data;
using static Microsoft.ML.Transforms.OneHotEncodingTransformer;

namespace Microsoft.ML.Samples.Dynamic
{
    public static class OneHotEncodingTransform
    {
        public static void Example()
Review comment: Does it come from an existing test?
Review comment: Please link it to the extension it documents through the node. See the XML docs of the other extension methods.
        {
            // Create a new ML context, for ML.NET operations. It can be used for exception tracking and logging,
            // as well as the source of randomness.
            var ml = new MLContext();

            // Get a small dataset as an IEnumerable and convert it to an IDataView.
            IEnumerable<SamplesUtils.DatasetUtils.SampleInfertData> data = SamplesUtils.DatasetUtils.GetInfertData();
Review comment: var
Review comment: I think we want to deprecate the infert datasets from the samples, as it is a sensitive one. Is it possible to use one of the other dataset snippets?
Review comment: You can use the Adult dataset that exists in DatasetUtils. (In reply to: 261049116)
            var trainData = ml.Data.LoadFromEnumerable(data);

            // Preview of the data.
            //
            // Age  Case  Education  Induced  Parity  PooledStratum  RowNum  ...
            // 26   1     0-5yrs     1        6       3              1       ...
            // 42   1     0-5yrs     1        1       1              2       ...
            // 39   1     0-5yrs     2        6       4              3       ...
            // 34   1     0-5yrs     2        4       2              4       ...
            // 35   1     6-11yrs    1        3       32             5       ...

            // A pipeline for one hot encoding the Education column.
            var pipeline = ml.Transforms.Categorical.OneHotEncoding("EducationOneHotEncoded", "Education", OutputKind.Bag);
Review comment: Need an empty line here.
            // Fit to data.
            var transformer = pipeline.Fit(trainData);

            // Get transformed data
            var transformedData = transformer.Transform(trainData);

            // Getting the data of the newly created column, so we can preview it.
            var encodedColumn = transformedData.GetColumn<float[]>(ml, "EducationOneHotEncoded");

            // A small printing utility.
            Action<string, IEnumerable<float[]>> printHelper = (colName, column) =>
            {
                foreach (var row in column)
                {
                    for (var i = 0; i < row.Length; i++)
                        Console.Write($"{row[i]} ");
                    Console.WriteLine();
                }
            };
Review comment: Consider moving this to SampleUtils.ConsoleUtils.

            printHelper("Education", encodedColumn);

            // data column obtained post-transformation.
            // 1 0 0 0 ...
            // 1 0 0 0 ...
            // 1 0 0 0 ...
            // 1 0 0 0 ...
            // 0 1 0 0 ...
            // ....
        }
    }
}
Review comment: Please also add a sample for the overload version with ColumnOptions and call it OneHotEncodingWithOptions.cs.
Review comment: Please rename this to OneHotEncoding (identical to the API extension method that it corresponds to).
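
Note on the expected output in the sample's closing comment: the rows of 0s and 1s can be hard to relate to the Education values without running the pipeline. The sketch below is a plain C# illustration of the idea and does not call ML.NET at all. `OneHotSketch` and `Encode` are hypothetical names introduced here, and the sketch assumes that slots are assigned in order of first appearance and that each row holds a single value (in which case a bag vector and an indicator vector should coincide). It is not how the library implements the transform.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, NOT part of ML.NET: a plain illustration of one-hot encoding.
public static class OneHotSketch
{
    // Assigns each distinct value a slot in order of first appearance (an assumption),
    // then emits one float[] per row with a single 1 in the slot of that row's value.
    public static float[][] Encode(IReadOnlyList<string> values)
    {
        var slots = new Dictionary<string, int>();
        foreach (var value in values)
        {
            if (!slots.ContainsKey(value))
                slots.Add(value, slots.Count);
        }

        var encoded = new float[values.Count][];
        for (var row = 0; row < values.Count; row++)
        {
            encoded[row] = new float[slots.Count];
            encoded[row][slots[values[row]]] = 1f;
        }
        return encoded;
    }

    public static void Example()
    {
        // The five Education values shown in the sample's data preview.
        var education = new[] { "0-5yrs", "0-5yrs", "0-5yrs", "0-5yrs", "6-11yrs" };

        foreach (var row in Encode(education))
            Console.WriteLine(string.Join(" ", row));
    }
}
```

With only the five preview rows there are two distinct values, so each vector has two slots; the sample's output shows more slots, presumably because the full dataset contains further Education categories.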