This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Fix missing images in NNI_AutoFeatureEng.md #1979

Merged 1 commit on Jan 22, 2020
8 changes: 4 additions & 4 deletions docs/en_US/CommunitySharings/NNI_AutoFeatureEng.md
@@ -30,7 +30,7 @@ with a very reasonable underlying framework design.

New users can do AutoFeatureENG with NNI easily and efficiently. To explore the AutoFeatureENG capability, download the required files below, then install NNI through pip.

- ![](https://github.com/JSong-Jia/Pic/blob/master/images/pic%201.jpg)
+ ![](https://pic3.zhimg.com/v2-8886eea730cad25f5ac06ef1897cd7e4_r.jpg)
NNI treats AutoFeatureENG as a two-step task: feature generation exploration and feature selection. Feature generation exploration is mainly about feature derivation and high-order feature combination.

## 04 Feature Exploration
@@ -53,7 +53,7 @@ For feature derivation, NNI offers many operations which can automatically generate new features.

The search space can be defined in a **JSON file**: it specifies how specific features intersect, which two columns are combined, and how new features are generated from the corresponding columns.
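As a purely hypothetical illustration of such a file (the key names and column names below are invented placeholders, not NNI's documented search-space schema), a search space pairing 1-order and 2-order operations with columns might look like:

```json
{
    "count": ["city", "device"],
    "crosscount": [["city"], ["device"]],
    "aggregate": [["price"], ["city"]]
}
```

Each key would name an operation, and its value would list the column (or column pairs) the operation is applied to.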

- ![](https://github.com/JSong-Jia/Pic/blob/master/images/pic%202.jpg)
+ ![](https://pic1.zhimg.com/v2-3c3eeec6eea9821e067412725e5d2317_r.jpg)

The picture shows the procedure of defining the search space. NNI provides count encoding as a 1-order operation, and cross-count encoding as well as aggregate statistics (min, max, var, mean, median, nunique) as 2-order operations.
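To make the two operations concrete, here is a minimal self-contained sketch (not NNI's actual implementation): count encoding replaces each category with its frequency, and cross-count encoding does the same for the joint value of two columns.

```python
# Illustrative only: simplified count / cross-count encoding.
from collections import Counter

def count_encode(column):
    # Replace each categorical value with how often it occurs.
    freq = Counter(column)
    return [freq[v] for v in column]

def cross_count_encode(col_a, col_b):
    # Count occurrences of each (a, b) pair across the two columns.
    pairs = list(zip(col_a, col_b))
    freq = Counter(pairs)
    return [freq[p] for p in pairs]

city = ["NY", "LA", "NY", "SF", "NY"]
device = ["ios", "ios", "android", "ios", "ios"]
print(count_encode(city))                # [3, 1, 3, 1, 3]
print(cross_count_encode(city, device))  # [2, 1, 1, 1, 2]
```

The cross-count column captures interaction information that neither single-column count contains.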

@@ -73,13 +73,13 @@ The purpose of Exploration is to generate new features. You can use **get_next_parameter** to receive the feature candidates for each trial.
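The trial side of this loop can be sketched as follows. `nni.get_next_parameter` and `nni.report_final_result` are real NNI trial APIs, but the `"sample_feature"` key and `train_and_evaluate` are illustrative placeholders, not NNI's actual example code; a stub stands in when NNI is not installed so the sketch runs standalone.

```python
# Hedged sketch of one NNI trial: fetch suggested features, evaluate, report.
try:
    import nni
except ImportError:
    class nni:  # stand-in so the sketch runs outside an NNI experiment
        @staticmethod
        def get_next_parameter():
            return {"sample_feature": ["city_count", "city_device_crosscount"]}
        @staticmethod
        def report_final_result(metric):
            print("reported:", metric)

def train_and_evaluate(feature_names):
    # Placeholder: build the suggested features, train a model, return a metric.
    return 0.5 + 0.01 * len(feature_names)

params = nni.get_next_parameter() or {}          # tuner's suggested features
features = params.get("sample_feature", [])
score = train_and_evaluate(features)
nni.report_final_result(score)                   # feed the metric back to the tuner
```

The tuner uses the reported metric to decide which feature combinations to explore next.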

To avoid feature explosion and overfitting, feature selection is necessary. For feature selection, NNI-AutoFeatureENG mainly promotes LightGBM (Light Gradient Boosting Machine), a gradient boosting framework developed by Microsoft.

- ![](https://github.com/JSong-Jia/Pic/blob/master/images/pic%205.jpg)
+ ![](https://pic2.zhimg.com/v2-7bf9c6ae1303692101a911def478a172_r.jpg)

If you have used **XGBoost** or **GBDT**, you will know that tree-based algorithms can easily calculate the importance of each feature to the result, which lets LightGBM perform feature selection naturally.
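The selection step itself is simple once importances are available. The sketch below is illustrative, not NNI's code: given per-feature importance scores (e.g. the gains a trained booster reports), keep the top-k features; the feature names and gain values here are invented for the example.

```python
# Illustrative only: rank features by importance and keep the top k.
def select_top_k(feature_names, importances, k):
    ranked = sorted(zip(feature_names, importances),
                    key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]

names = ["age", "city_count", "city_device_crosscount", "noise"]
gains = [120.0, 300.0, 85.0, 1.5]
print(select_top_k(names, gains, 2))  # ['city_count', 'age']
```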

The issue is that the selected features might be well suited to *GBDT* (Gradient Boosting Decision Tree) but not to linear algorithms like *LR* (Logistic Regression).

- ![](https://github.com/JSong-Jia/Pic/blob/master/images/pic%206.jpg)
+ ![](https://pic4.zhimg.com/v2-d2f919497b0ed937acad0577f7a8df83_r.jpg)

## 06 Summary
