LightGBM #392

codemzs · 2018-06-22T07:17:40Z

LightGBM integration. This change adds API for LightGBM binary and multiclass classifier.

fixes #391

…eeded by lightgbm native code.

TomFinley · 2018-06-22T14:15:05Z

This is pretty fantastic @codemzs, and I see the test even on Mac is working! That's great. #Closed

TomFinley · 2018-06-22T14:15:18Z

build.sh

@@ -2,6 +2,12 @@

 set -e

+if [ "$(uname)" == "Darwin" ]; then
+    #Mac OS X platform


Mac OS X platform [](start = 5, length = 17)

Very minor note, I think they've deprecated the name "Mac OS X" in favor of "macOS". #Resolved

TomFinley · 2018-06-22T14:22:33Z

build.sh

@@ -2,6 +2,12 @@

 set -e

+if [ "$(uname)" == "Darwin" ]; then
+    #Mac OS X platform
+	yes | /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"


yes | /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" [](start = 1, length = 104)

What happens with this, if Homebrew is already installed? Homebrew is pretty popular -- I expect the proportion of people on Mac that are trying to build us that also already have homebrew to be pretty high.

Now, let's consider the set of people that do not have it installed. The potential installation of a package management system on someone's machine as part of the simple build system, I wonder if that's perhaps going a bit too far. This is actually changing the user's system... and indeed with this yes command we've actually explicitly taken away that choice. This concerns me. #Resolved

I think this should be added to the machine setup doc instead. We shouldn’t be installing things during the build scripts. #Resolved

TomFinley · 2018-06-22T14:23:07Z

test/Microsoft.ML.Predictor.Tests/TestPredictors.cs

+        [TestCategory("LightGBM")]
+        public void LightGBMClassificationTest()
+        {
+            RunMTAThread(() =>


RunMTAThread [](start = 12, length = 12)

What is the purpose of this? #Resolved

Helper to create a background thread and wait for it to finish.

In reply to: 197461317 [](ancestors = 197461317)

Thanks. What I mean is, what is the purpose of your call of it? I was not asking what the method does. I am asking why you are calling it. What goes wrong if you just run it in the current thread?

In reply to: 197570408 [](ancestors = 197570408,197461317)

We could do that but I think this function also catches any exception thrown by the test and using this function to run tests seems to be the pattern.

In reply to: 197575746 [](ancestors = 197575746,197570408,197461317)

TomFinley · 2018-06-22T14:23:59Z

test/Microsoft.ML.Predictor.Tests/TestPredictors.cs

+        [Fact]
+        [TestCategory("Binary")]
+        [TestCategory("LightGBM")]
+        public void LightGBMClassificationTest()


LightGBMClassificationTest [](start = 20, length = 26)

Should we port over more tests? I'm pretty sure we had more than this in the repo we're migrating from. #Resolved

I can try but my concern is some of them might depend on datasets that we may not have in this repo for legal reasons.

In reply to: 197461623 [](ancestors = 197461623)

I see. That seems like a very valid concern, but rather than leaving the code untested, should we not write tests for things like regression, ranking, multiclass that we currently are using for the other tests that exist in ML.NET?

In reply to: 197570510 [](ancestors = 197570510,197461623)

We certainly should and I'm going to try. If the dataset is not available I will try to use an existing one and change the pipeline.

In reply to: 197575933 [](ancestors = 197575933,197570510,197461623)

Got everything except ranking for which we need to find a new dataset. I will open an issue and skip the test for ranking. I did swap housing dataset for wine for regression tests.

In reply to: 197577162 [](ancestors = 197577162,197575933,197570510,197461623)

eerhardt · 2018-06-22T16:30:14Z

src/Microsoft.ML.LightGBM/Microsoft.ML.LightGBM.csproj

+  </ItemGroup>
+
+  <ItemGroup>
+    <PackageReference Include="LightGBM" Version="2.1.2.2" />


You’ll need to add a new nupkg project for this. Not all users will need/want LightGBM, so we shouldn’t require everyone to depend on/deploy it. #Resolved

Can you please clarify? My understanding is LightGBM will always be part of ML.NET

In reply to: 197500282 [](ancestors = 197500282)

Part of ML.NET, but in a separate nuget so those that don't need it don't get it. There would be the ML.NET nuget containing the core components... but then also and the ML.NET LightGBM nuget, the ML.NET OnnxExport nuget, the ML.NET FastTree nuget, the ML.NET CNTK nuget, the ML.NET torch nuget, etc.

In reply to: 197566747 [](ancestors = 197566747,197500282)

Ivanidzo4ka · 2018-06-22T19:24:54Z

src/Microsoft.ML.LightGBM/Microsoft.ML.LightGBM.csproj

+  </ItemGroup>
+
+  <ItemGroup>
+    <PackageReference Include="LightGBM" Version="2.1.2.2" />


2.1.2.2 [](start = 50, length = 7)

should we put this into build\Dependencies.props ? #Resolved

Ivanidzo4ka · 2018-06-22T19:25:13Z

src/Microsoft.ML.LightGBM/LightGbmArguments.cs

+using Microsoft.ML.Runtime.EntryPoints;
+using Microsoft.ML.Runtime.Internal.Internallearn;
+using Microsoft.ML.Runtime.LightGBM;
+using System.Collections.Generic;


order #Resolved

Ivanidzo4ka · 2018-06-22T19:26:27Z

src/Microsoft.ML/CSharpApi.cs

+
+    namespace Trainers
+    {
+        public enum LightGbmArgumentsEvalMetricType


LightGbmArgumentsEvalMetricType [](start = 20, length = 31)

No changes in core-ep.json? #Pending

apparently not but should investigate why

In reply to: 197546404 [](ancestors = 197546404)

Ivanidzo4ka · 2018-06-22T19:27:39Z

src/Microsoft.ML.LightGBM/LightGbmArguments.cs

+                public double RegAlpha = 0;
+
+                [Argument(ArgumentType.AtMostOnce,
+                    HelpText = "Control the balance of positive and negative weights, useful for unbalanced classes. A typical value to consider: sum(negative cases) / sum(positive cases).")]


can you break this sentence into two lines? it's super long #Resolved

Ivanidzo4ka · 2018-06-22T19:31:24Z

build.sh

@@ -2,6 +2,12 @@

 set -e

+if [ "$(uname)" == "Darwin" ]; then
+    #Mac OS X platform
+	yes | /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"


yes | /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" [](start = 1, length = 104)

Can we have comment why we doing this?
I would assume this is LightGBM dependency, but I would prefer to make this statement here, instead of going through file history #Resolved

Ivanidzo4ka · 2018-06-22T19:37:03Z

src/Microsoft.ML.LightGBM/LightGbmArguments.cs

+
+        public sealed class TreeBooster : BoosterParameter<TreeBooster.Arguments>
+        {
+            [TlcModule.Component(Name = "gbdt", FriendlyName = "Tree Booster", Desc = "Traditional Gradient Boosting Decision Tree.")]


"gbdt" [](start = 40, length = 6)

We use it 3 times, any chance you want to make it public const string? Same for FriendlyName, accross all boosters #Resolved

Ivanidzo4ka · 2018-06-22T19:38:42Z

src/Microsoft.ML.LightGBM/LightGbmBinaryTrainer.cs

+    /// </summary>
+    public static partial class LightGbm
+    {
+        [TlcModule.EntryPoint(Name = "Trainers.LightGbmBinaryClassifier", Desc = "Train an LightGBM binary class model", UserName = LightGbmBinaryTrainer.Summary, ShortName = LightGbmBinaryTrainer.ShortName)]


long line, can we split it? #Resolved

Ivanidzo4ka · 2018-06-22T19:40:05Z

src/Microsoft.ML.LightGBM/LightGbmBinaryTrainer.cs

+            return LearnerEntryPointsUtils.Train<LightGbmArguments, CommonOutputs.BinaryClassificationOutput>(host, input,
+                () => new LightGbmBinaryTrainer(host, input),
+                getLabel: () => LearnerEntryPointsUtils.FindColumn(host, input.TrainingData.Schema, input.LabelColumn),
+                getWeight: () => LearnerEntryPointsUtils.FindColumn(host, input.TrainingData.Schema, input.WeightColumn));


getWeight: () => LearnerEntryPointsUtils.FindColumn(host, input.TrainingData.Schema, input.WeightColumn) [](start = 16, length = 104)

LightGbmArguments is LearnerInputBaseWithGroupId, but we don't pass getGroup function... #Resolved

Not sure what you mean but this is really a port of LightGBM so please keep comments within the scope of the PR>

In reply to: 197549696 [](ancestors = 197549696)

Ivanidzo4ka · 2018-06-22T19:40:36Z

src/Microsoft.ML.LightGBM/LightGbmBinaryTrainer.cs

+        public const string RegistrationName = "LightGBMBinaryPredictor";
+        private static VersionInfo GetVersionInfo()
+        {
+            // REVIEW tfinley(guoke): can we decouple the version from FastTree predictor version ?


tfinley(guoke) [](start = 22, length = 14)

clean review comments from internal aliases #Resolved

Ivanidzo4ka · 2018-06-22T19:41:57Z

src/Microsoft.ML.LightGBM/LightGbmMulticlassTrainer.cs

+    /// </summary>
+    public static partial class LightGbm
+    {
+        [TlcModule.EntryPoint(Name = "Trainers.LightGbmClassifier", Desc = "Train an LightGBM multi class model", UserName = LightGbmMulticlassTrainer.Summary, ShortName = LightGbmMulticlassTrainer.ShortName)]


[TlcModule.EntryPoint(Name = "Trainers.LightGbmClassifier", Desc = "Train an LightGBM multi class model", UserName = LightGbmMulticlassTrainer.Summary, ShortName = [](start = 0, length = 172)

really long lane #Resolved

Ivanidzo4ka · 2018-06-22T19:42:41Z

src/Microsoft.ML.LightGBM/LightGbmMulticlassTrainer.cs

+            return LearnerEntryPointsUtils.Train<LightGbmArguments, CommonOutputs.MulticlassClassificationOutput>(host, input,
+                () => new LightGbmMulticlassTrainer(host, input),
+                getLabel: () => LearnerEntryPointsUtils.FindColumn(host, input.TrainingData.Schema, input.LabelColumn),
+                getWeight: () => LearnerEntryPointsUtils.FindColumn(host, input.TrainingData.Schema, input.WeightColumn));


getWeight [](start = 16, length = 9)

add getGroup: #Resolved

I'm not sure if there is a need to.

In reply to: 197550320 [](ancestors = 197550320)

Ivanidzo4ka · 2018-06-22T19:43:39Z

src/Microsoft.ML.LightGBM/LightGbmRankingTrainer.cs

+        public static CommonOutputs.RankingOutput TrainRanking(IHostEnvironment env, LightGbmArguments input)
+        {
+            Contracts.CheckValue(env, nameof(env));
+            var host = env.Register("TrainLightGBM");


TrainLightGBM [](start = 37, length = 13)

would be nice to have specific to kind channel, other than same string across all 4 trainers. #WontFix

eerhardt · 2018-06-25T15:04:57Z

src/Microsoft.ML.LightGBM/Microsoft.ML.LightGBM.csproj

+  </PropertyGroup>
+
+  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|AnyCPU'">
+    <AllowUnsafeBlocks>true</AllowUnsafeBlocks>


The AllowUnsafeBlocks property can be moved into the unconditional PropertyGroup above. It keeps the .csproj simple. #Resolved

eerhardt · 2018-06-25T15:49:29Z

src/Microsoft.ML.LightGBM/WrappedLightGbmDataset.cs

+            {
+                double*[] ptrArrayValues = new double*[numCol];
+                int*[] ptrArrayIndices = new int*[numCol];
+                Parallel.For(0, numCol, i =>


There appears to be a few misuses of Parallel.For in this code.

Usually if the operation inside the loop is very small, the overhead of doing the Thread management and communication outweighs any parallel savings. In this case, it is over 2x slower using Parallel.For than just a simple for loop. I write a simple benchmark of the two: https://gist.github.com/eerhardt/99071a3b6b00ddfe5ae02d9f85d4f9d6.

Results:

BenchmarkDotNet=v0.10.14, OS=Windows 10.0.17134 Intel Core i7-6700 CPU 3.40GHz (Skylake), 1 CPU, 8 logical and 4 physical cores .NET Core SDK=2.1.300 [Host] : .NET Core 2.1.0 (CoreCLR 4.6.26515.07, CoreFX 4.6.26515.06), 64bit RyuJIT DefaultJob : .NET Core 2.1.0 (CoreCLR 4.6.26515.07, CoreFX 4.6.26515.06), 64bit RyuJIT

Method Mean Error StdDev

Original 62.12 us 1.2207 us 1.199 us

NoParallel 27.06 us 0.5554 us 1.196 us

This is great! Thank you @eerhardt for going an extra mile to validate parallel for loop.

In reply to: 197847135 [](ancestors = 197847135)

eerhardt · 2018-06-25T15:49:55Z

src/Microsoft.ML.LightGBM/WrappedLightGbmDataset.cs

+            }
+            finally
+            {
+                Parallel.For(0, numCol, i =>


This use of Parallel.For should probably be removed. #Resolved

eerhardt · 2018-06-25T15:51:59Z

src/Microsoft.ML.LightGBM/LightGbmTrainerBase.cs

+            int[] nonZeroCntPerColumn = new int[catMetaData.NumCol];
+            int estimateNonZeroCnt = (int)(numSampleRow * density);
+            estimateNonZeroCnt = Math.Max(1, estimateNonZeroCnt);
+            Parallel.For(0, catMetaData.NumCol, i =>


Here's another case where Parallel.For is probably slowing the operation down. #Resolved

TomFinley · 2018-06-25T17:14:43Z

build.sh

@@ -2,6 +2,12 @@

 set -e

+if [ "$(uname)" == "Darwin" ]; then
+    #Mac OS X platform
+	yes | /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"


[](start = 0, length = 1)

Consistency with tabs spaces, and general indentation. Note that below you have two spaces for indentation. Here you have either four spaces (on one of the new lines) or one tab (on two of the new lines). #Resolved

glebuk · 2018-06-25T17:57:07Z

src/Microsoft.ML.LightGBM/WrappedLightGbmBooster.cs

+                        var leafOutput = Str2DoubleArray(kvPairs["leaf_value"], ' ');
+                        if (leafOutput[0] != 0)
+                        {
+                            // Convert Constant tree to Two-leaf tree, avoid being filter by TLC.


TLC [](start = 93, length = 3)

another mention of this. #Resolved

There are plenty of mentions of TLC across ML.NET codebase and we should open a separate PR and address this issue.

In reply to: 197886955 [](ancestors = 197886955)

glebuk · 2018-06-25T17:57:48Z

    [Fact(Skip = "Need CoreTLC specific baseline update")]

TLC - should we update throughout? #WontFix

Refers to: test/Microsoft.ML.Predictor.Tests/TestPredictors.cs:101 in f35aee4. [](commit_id = f35aee4, deletion_comment = False)

glebuk · 2018-06-25T17:58:15Z

src/Microsoft.ML.LightGBM/LightGbmArguments.cs

+
+        public sealed class TreeBooster : BoosterParameter<TreeBooster.Arguments>
+        {
+            [TlcModule.Component(Name = "gbdt", FriendlyName = "Tree Booster", Desc = "Traditional Gradient Boosting Decision Tree.")]


TlcModule [](start = 13, length = 9)

Another vestigial name, perhaps rename in different PR. #Resolved

codemzs · 2018-06-25T18:11:02Z

    [Fact(Skip = "Need CoreTLC specific baseline update")]

Sure but it would be its own change. It is not in the scope of this change which is merely porting LGBM to ML.NET.

In reply to: 400040559 [](ancestors = 400040559)

Refers to: test/Microsoft.ML.Predictor.Tests/TestPredictors.cs:101 in f35aee4. [](commit_id = f35aee4, deletion_comment = False)

…work.

eerhardt · 2018-06-26T14:27:16Z

test/Microsoft.ML.Predictor.Tests/TestAutoInference.cs

@@ -25,6 +26,11 @@ public TestAutoInference(ITestOutputHelper helper)
        [TestCategory("EntryPoints")]
        public void TestLearn()
        {
+            //Skip this test for macOS until engineering system installs OpenMP dependency for 


In my experience, it is best to log an issue for things like this, and then add a link to the issue in the code. That way we can track removing it.
And when someone stumbles across this test being disabled, they can go find out the status by looking at the issue. #Resolved

@codemzs mentioned at the end of yesterday he would, so he probably will today. #Resolved

Done. Refer to #417 #Resolved

Thanks sir

In reply to: 198242943 [](ancestors = 198242943)

eerhardt · 2018-06-26T14:29:46Z

src/Microsoft.ML.LightGBM/Microsoft.ML.LightGBM.csproj

+  <ItemGroup>
+    <ProjectReference Include="..\Microsoft.ML.Core\Microsoft.ML.Core.csproj" />
+    <ProjectReference Include="..\Microsoft.ML.Data\Microsoft.ML.Data.csproj" />
+    <ProjectReference Include="..\Microsoft.ML.FastTree\Microsoft.ML.FastTree.csproj" />


Why does LightGBM have a dependency on FastTree? #Resolved

As you might imagine, there were a number of existing facilities for FastTree predictors (both in this code, and elsewhere). So when Goulin and Taifeng were first coming up with this, they chose to actually return an actual FastTree predictor, rather than invent their own. The shared format between the two learners has many benefits, but it does result in this dependency unfortunately. A cleaner solution might have been to factor out the code so that it is outside the FastTree project, but that code is so awful (still 99% of it in its original MSR state) that possibly they found this a daunting prospect. (Certainly I would. :) )

In reply to: 198167265 [](ancestors = 198167265)

That is right, the predictor is shared between FastTree and LightGBM, refer to regressiontree.cs in FastTree project.

In reply to: 198213703 [](ancestors = 198213703,198167265)

TomFinley · 2018-06-26T16:33:15Z

test/Microsoft.ML.Predictor.Tests/TestAutoInference.cs

@@ -11,6 +11,7 @@
 using Microsoft.ML.Runtime.PipelineInference;
 using Xunit;
 using Xunit.Abstractions;
+using System.Runtime.InteropServices;



Keep sorted if you don't mind. #Resolved

TomFinley

Thanks @codemzs

…to lightgbm

eerhardt

The general infrastructure changes look good from my side.

* Bump master to v0.3 (#269) * RocketEngine fix for selecting top learners (#270) * Changes to RocketEngine to fix take top k logic. * Add namespace information to allow file to reference correct version of Formatting object. * small code cleanup (#271) * Preparation for syncing sources with internal repo (#275) * make class partial so I can add constuctor in separate file. add constructros for testing * formatting * Changes to use evaluator metrics names in PipelineSweeperSupportedMetrics. Made the private const strings in two classes public. (#276) * add missing subcomponents to sweepers (#278) * add missing subcomponents * right one * more cleanup * remove lotus references. (#252) * Random seed and concurrency for tests (#277) * first attempt * add comments * specify seed for random. make constructor internal. * Fix SupportedMetric.ByName() method (#280) * Fix for SupportedMetric.ByName() method. Include new unit test for function. * Fix for SupportedMetric.ByName() method. Include new unit test for function. * Fix for SupportedMetric.ByName() method. Include new unit test for function. * Removed unnecessary field filter, per review comment. * ML.NET-242: FastTreeRanking per-iteration loss metrics are empty (#289) When training a FastTreeRanker using the `testFrequency` parameter, it is expected that NDCG is prented every testFrequency iterations. However, instead of NDCG, only empty strings are printed. The root cause was that the MaxDCG property of the dataset was never calculated, so the NDCG calculation is aborted, leaving an empty string as a result. This PR fixes the problem by computing the MaxDCG for the dataset when the Tests are defined (so that if the tests are not defined, the MaxDCG will never be calculated). Closes #242 * Fixed typo in the method summary (#296) * Remove stale line of code from test. (#297) * Update release notes link to use aka.ms. (#294) Our release notes link is broken because the `Documentation` was renamed to `docs`. Fix this for the future to use a redirection link. * Add release notes for ML.NET 0.2 (#301) * Add release notes for ML.NET 0.2 * Adding release note about TextLoader changes and additional issue/PR references * Addressing comments: fixing typos, changing formatting, and adding references * Get the cross validation macro to work with non-default column names (#291) * Add label/grou/weight column name arguments to CV and train-test macros * Fix unit test. * Merge. * Update CSharp API. * Fix EntryPointCatalog test. * Address PR comments. * update sample in README.MD with 0.2 features. (#304) * update sample with new text loader API. * update with 0.2 stuff. * OVA should respect normalization in underlying learner (#310) * Respect normalization in OVA. * some cleanup * fix copypaste issues * Export to ONNX and cross-platform command-line tool to script ML.NET training and inference (#248) * Export to ONNX and Maml cross-platform executable. * Add Cluster evaluator (#316) * Add Cluster evaluator * fix copypaste * address comments * formatting * Fixes locale dependent test output comparisons (#109) The tests do not pass on systems with locale other than en-US. The error happens since the results are written to files and the contents of the files are compared to set of correct results produced under en-US locale. The fix is to imbue en-US culture to the test thread so that results will be output in format that is comparable with the test format. This patch fixes only tests, but do not guarantee calculation will be correct in production systems using a locale different than en-US. In particular, there can be problems in reading data and then conversing data from characters to numeric format. Fixes #74 * Add PartitionedFileLoader (#61) * Remove unexisting project from solution (#335) * GetSummaryDataView/Row implementation for Pca and Linear Predictors (#185) * Implement `ICanGetSummaryAsIDataView` on `PcaPredictor` class * Implement `ICanGetSummaryAsIRow` on `LinearPredictor` class * Disable ordinary least squares by removing the entry point (#286) * Disable ols by temporarily removing the entry point. It may be added again once we figure out how to ship MKL as part of this project. * add append function to pipeline (#284) Add `Append` function to pipeline for more fluent API than that allowed by `Add` * Removed field/column name checking of input type in TextLoader. (#327) * fix namespace issue in CSharpGenerator and some refactoring (#339) fix namespace issue and refactoring * Using named-tuple in OneToOneTransforms' constructor to make API more readable. (#324) * Minor formatting in CollectionDataSourceTests.cs (#348) * Create CalibratedPredictor instead of SchemaBindableCalibratedPredictor (#338) `CalibratorUtils.TrainCalibrator` and `TrainCalibratorIfNeeded` now creates `CalibratedPredictor` instead of `SchemaBindableCalibratedPredictor` whenever the predictor implements `IValueMapper`. * Remove reference and dependency on System.ValueTuple (#351) * Add link to samples (#355) * Use HideEnumValueAttribute for both manifest and C# API generation. (#356) * Use HideEnumValueAttribute for both manifest and C# API generation. * Unhide NAReplaceTransform.ReplacementKind.SpecifiedValue. This may require some other PR to resolve the corresponding issues. * Move the NuGet package build files into a TFM specific directory. (#370) When installing Microsoft.ML on an unsupported framework (like net452), it is currently getting installed successfully. However, users should be getting an error stating that net452 is not supported by this package. The cause is the build files exist for any TFM, which NuGet interprets as this package supports any TFM. Moving the build files to be consistent with the 'lib' folder support. Fix #357 * `Stream` subclasses now have `Close` call `base.Close` to ensure disposal. (#369) * Subclasses of `Stream` now have `Close` call `base.Close` to ensure disposal. * Add DeleteOnClose to File opening. * Remove explicit delete of file. * Remove explicit close of substream. * Since no longer deleting explicitly, no longer need `_overflowPath` member. * Return distinct array of ParameterSet when ProposeSweep is called (#368) * Changed List to HashSet to ensure that there are no duplicates * Update fast tree argument help text (#372) * Update fast tree argument help text * Update wording * Update API to fix test * Update core manifest JSON to update help text * Combine multiple tree ensemble models into a single tree ensemble (#364) * Add a way to create a single tree ensemble model from multiple tree ensemble models. * Address PR comments, and fix bugs in serializing/deserializing RegressionTrees. * Address PR comments. * add pipelineitem for Ova (#363) add pipelineitem for Ova * Fix CV macro to output the warnings data view properly. (#385) * Link to an example on using converting ML.NET model to ONNX. (#386) * Adding documentation about entry points, and entry points graphs: EntryPoints.md and GraphRunner.md (#295) * Adding EntryPoints.md and GraphRunner.md * addressing PR feedback * Updating the title of the GraphRunner.md file * adressing Tom's feedback * adressing feedback * code formatting for class names * Addressing Gal's comments * Adding an example of an entry point. Fixing casing on ML.NET * fixing link * Adding LDA Transform (#377) * Revert to using the native code (#413) Corrects an unintentional "typo" in FastTreeRanking.cs where there was mistakenly a USE_FASTTREENATIVE2 instead of USE_FASTTREENATIVE. This resulted in some obscure hidden ranking options (distance weighting, normalize query lambdas, and a few others) being unavailable. These are important for some applications. * LightGBM (#392) * LightGBM and test. * add test baselines and nuget source for lightGBM binaries. * Add entrypoint for lightGBM. * add unsafe flag for release build. * update nuget version. * make lightgbm test single threaded. * install gcc on OS machines to resolve dependencies on openmp thatis needed by lightgbm native code. * PR comments. Leave BREW and GCC in bash script to verify macOS tests work. * remove brew and gcc from build script. * PR feedback. * disable test on macOS. * disable test on macOS. * PR feedback. * Adding Factorization Machines (#383) * Adding Factorization Machines * ONNX API documentation. (#419) * ONNX API documentation. * Bring ensembles into codebase (#379) Introduce Ensemble codebase * enable macOS tests for LightGBM. (#422) * Create a shorter temp file name for model loading. (#397) Create a shorter temp file name for model loading, as well as remove the potential for a race condition among multiple openings by using the creation of a lock file. * removing extraneous character that broke the linux build, and with it unecessary cmake version requirement (#425) * EvaluatorUtils to handle label column of type key without text key values (#394) * Fix EvaluatorUtils to handle label column of type key without text key values. * Removing non source files from solution (#362)

* LightGBM and test. * add test baselines and nuget source for lightGBM binaries. * Add entrypoint for lightGBM. * add unsafe flag for release build. * update nuget version. * make lightgbm test single threaded. * install gcc on OS machines to resolve dependencies on openmp thatis needed by lightgbm native code. * PR comments. Leave BREW and GCC in bash script to verify macOS tests work. * remove brew and gcc from build script. * PR feedback. * disable test on macOS. * disable test on macOS. * PR feedback.

codemzs added 4 commits June 21, 2018 23:36

LightGBM and test.

be4c61d

add test baselines and nuget source for lightGBM binaries.

38531ba

Add entrypoint for lightGBM.

d4afe1c

add unsafe flag for release build.

1a5f61d

This was referenced Jun 22, 2018

C# .Net interface microsoft/LightGBM#763

Closed

[Feature]C#/.NET Bindings and Nuget package? microsoft/LightGBM#74

Closed

Nuget package for dotnet Core 2.0 microsoft/LightGBM#1405

Closed

codemzs added 3 commits June 22, 2018 00:46

update nuget version.

58fe9d1

make lightgbm test single threaded.

eb9cefb

install gcc on OS machines to resolve dependencies on openmp thatis n…

f35aee4

…eeded by lightgbm native code.

codemzs requested review from Ivanidzo4ka and zeahmed June 22, 2018 10:16

TomFinley reviewed Jun 22, 2018

View reviewed changes

eerhardt reviewed Jun 22, 2018

View reviewed changes

Ivanidzo4ka reviewed Jun 22, 2018

View reviewed changes

eerhardt reviewed Jun 25, 2018

View reviewed changes

TomFinley reviewed Jun 25, 2018

View reviewed changes

glebuk reviewed Jun 25, 2018

View reviewed changes

codemzs added 2 commits June 25, 2018 17:11

PR comments. Leave BREW and GCC in bash script to verify macOS tests …

e593935

…work.

remove brew and gcc from build script.

fd003b8

TomFinley mentioned this pull request Jun 26, 2018

Revert to using native code for FastTree ranking gradient #413

Merged

codemzs added 3 commits June 25, 2018 21:34

PR feedback.

4051788

disable test on macOS.

7f613be

disable test on macOS.

ec326dd

eerhardt reviewed Jun 26, 2018

View reviewed changes

TomFinley reviewed Jun 26, 2018

View reviewed changes

codemzs mentioned this pull request Jun 26, 2018

Enable LightGBM macOS tests #417

Closed

PR feedback.

793e7f5

TomFinley approved these changes Jun 26, 2018

View reviewed changes

Merge branch 'master' of https://github.com/dotnet/machinelearning in…

695cea4

…to lightgbm

eerhardt approved these changes Jun 26, 2018

View reviewed changes

codemzs merged commit 0a349f8 into dotnet:master Jun 26, 2018

codemzs deleted the lightgbm branch June 26, 2018 20:50

codemzs restored the lightgbm branch June 26, 2018 22:08

codemzs deleted the lightgbm branch June 26, 2018 22:14

TomFinley mentioned this pull request Jun 30, 2018

Update documentation for LightGBM and add missing binary references to console app. #452

Merged

eerhardt mentioned this pull request Jul 2, 2018

Isolate ONNX implementations in separate DLL #462

Merged

ghost locked as resolved and limited conversation to collaborators Mar 30, 2022

Method	Mean	Error	StdDev
Original	62.12 us	1.2207 us	1.199 us
NoParallel	27.06 us	0.5554 us	1.196 us

LightGBM #392

LightGBM #392

Conversation

codemzs commented Jun 22, 2018

TomFinley commented Jun 22, 2018 • edited Loading

TomFinley Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

TomFinley Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

eerhardt Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

TomFinley Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codemzs Jun 25, 2018 • edited Loading

Choose a reason for hiding this comment

TomFinley Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TomFinley Jun 22, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

eerhardt Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TomFinley Jun 22, 2018 • edited Loading

Choose a reason for hiding this comment

Ivanidzo4ka Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Ivanidzo4ka Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Ivanidzo4ka Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Ivanidzo4ka Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Ivanidzo4ka Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Ivanidzo4ka Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Ivanidzo4ka Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Ivanidzo4ka Jun 22, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Ivanidzo4ka Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Ivanidzo4ka Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Ivanidzo4ka Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Ivanidzo4ka Jun 22, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

eerhardt Jun 25, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

eerhardt Jun 25, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

codemzs Jun 25, 2018 • edited Loading

Choose a reason for hiding this comment

eerhardt Jun 25, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

eerhardt Jun 25, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

TomFinley Jun 25, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

glebuk Jun 25, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

glebuk commented Jun 25, 2018 • edited by codemzs Loading

glebuk Jun 25, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

codemzs commented Jun 25, 2018

eerhardt Jun 26, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

TomFinley Jun 26, 2018 • edited by codemzs Loading

Choose a reason for hiding this comment

codemzs Jun 26, 2018 • edited Loading

TomFinley commented Jun 22, 2018 •

edited

Loading

TomFinley Jun 22, 2018 •

edited by codemzs

Loading

TomFinley Jun 22, 2018 •

edited by codemzs

Loading

eerhardt Jun 22, 2018 •

edited by codemzs

Loading

TomFinley Jun 22, 2018 •

edited by codemzs

Loading

codemzs Jun 25, 2018 •

edited

Loading

TomFinley Jun 22, 2018 •

edited by codemzs

Loading

TomFinley Jun 22, 2018 •

edited

Loading

eerhardt Jun 22, 2018 •

edited by codemzs

Loading

TomFinley Jun 22, 2018 •

edited

Loading

Ivanidzo4ka Jun 22, 2018 •

edited by codemzs

Loading

Ivanidzo4ka Jun 22, 2018 •

edited by codemzs

Loading

Ivanidzo4ka Jun 22, 2018 •

edited by codemzs

Loading

Ivanidzo4ka Jun 22, 2018 •

edited by codemzs

Loading

Ivanidzo4ka Jun 22, 2018 •

edited by codemzs

Loading

Ivanidzo4ka Jun 22, 2018 •

edited by codemzs

Loading

Ivanidzo4ka Jun 22, 2018 •

edited by codemzs

Loading

Ivanidzo4ka Jun 22, 2018 •

edited

Loading

Ivanidzo4ka Jun 22, 2018 •

edited by codemzs

Loading

Ivanidzo4ka Jun 22, 2018 •

edited by codemzs

Loading

Ivanidzo4ka Jun 22, 2018 •

edited by codemzs

Loading

Ivanidzo4ka Jun 22, 2018 •

edited by codemzs

Loading

eerhardt Jun 25, 2018 •

edited by codemzs

Loading

eerhardt Jun 25, 2018 •

edited by codemzs

Loading

codemzs Jun 25, 2018 •

edited

Loading

eerhardt Jun 25, 2018 •

edited by codemzs

Loading

eerhardt Jun 25, 2018 •

edited by codemzs

Loading

TomFinley Jun 25, 2018 •

edited by codemzs

Loading

glebuk Jun 25, 2018 •

edited by codemzs

Loading

glebuk commented Jun 25, 2018 •

edited by codemzs

Loading

glebuk Jun 25, 2018 •

edited by codemzs

Loading

eerhardt Jun 26, 2018 •

edited by codemzs

Loading

TomFinley Jun 26, 2018 •

edited by codemzs

Loading

codemzs Jun 26, 2018 •

edited

Loading

eerhardt Jun 26, 2018 •

edited by codemzs

Loading

TomFinley Jun 26, 2018 •

edited by codemzs

Loading