Logistic Regression is a model that applies the linear regression approach to classification. It is simple, has relatively few parameters, and makes predictions quickly. Example: predicting whether a parking lot has an empty space.
Logistic regression uses the sigmoid function to predict the probability that a sample belongs to the target class.
It shares several characteristics with the perceptron:
- In addition to the predicted class, it can compute the probability that a sample belongs to each class. Thanks to this, it is also used in areas such as ad click-through prediction.
- Both online learning and batch learning can be applied.
- Its predictive performance is moderate, but training is fast.
- A regularization term can be added to prevent overfitting.
- Because it separates linearly separable targets, its decision boundary is a straight line (a hyperplane in higher dimensions).
- The activation function is the sigmoid (logistic sigmoid) function.
- The loss function is the cross-entropy error function.
Logistic Regression works much like Linear Regression: for each feature it finds an appropriate weight (coefficient) that maximizes the model's accuracy. But instead of outputting the weighted sum of the terms directly, as linear regression does, logistic regression passes that sum through the sigmoid function.
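The following is a minimal sketch of this idea (the weights, bias, and sample values below are made up for illustration): the weighted sum z is computed as in linear regression and then squeezed through the sigmoid to yield a probability.
import numpy as np

def sigmoid(z):
    # logistic sigmoid: maps any real-valued z into (0, 1)
    return 1 / (1 + np.exp(-z))

# hypothetical weights (a, b, c) and bias (f), for illustration only
weights = np.array([0.5, -0.2, 0.1])
bias = -1.0
x = np.array([2.0, 1.5, 3.0])  # one sample with three features

z = np.dot(weights, x) + bias  # linear score, as in linear regression
print(sigmoid(z))              # probability of the positive class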
Linear regression gives you a continuous output, but logistic regression provides a discrete output. Examples of continuous outputs are house prices and stock prices; examples of discrete outputs are predicting whether a patient has cancer or whether a customer will churn. Linear regression is estimated using Ordinary Least Squares (OLS), while logistic regression is estimated using the Maximum Likelihood Estimation (MLE) approach.
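To make the MLE point concrete, the quantity being minimized is the negative log-likelihood of the data, i.e., the binary cross-entropy. A minimal sketch with made-up labels and predicted probabilities:
import numpy as np

# hypothetical true labels and predicted probabilities, for illustration
y_true = np.array([1, 0, 1, 1])
y_prob = np.array([0.9, 0.2, 0.7, 0.6])

# binary cross-entropy = average negative log-likelihood of the data
loss = -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
print(loss)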
The notebook logistic_regression.ipynb is explained below.
Load the fish data with pandas as shown below. The fish_csv_data comes from 혼자 공부하는 머신러닝+딥러닝.
import pandas as pd
fish = pd.read_csv('https://bit.ly/fish_csv_data')
fish.head()
The fish data is loaded as above. The score z is expressed as "z = a * Weight + b * Length + c * Diagonal + d * Height + e * Width + f". The Species column of the loaded data contains "Bream", "Roach", "Whitefish", "Parkki", "Perch", "Pike", and "Smelt".
Convert the data to NumPy format.
fish_input = fish[['Weight','Length','Diagonal','Height','Width']].to_numpy()
print(fish_input[:5])
[[242. 25.4 30. 11.52 4.02 ]
[290. 26.3 31.2 12.48 4.3056]
[340. 26.5 31.1 12.3778 4.6961]
[363. 29. 33.5 12.73 4.4555]
[430. 29. 34. 12.444 5.134 ]]
fish_target = fish['Species'].to_numpy()
print(fish_target)
['Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream'
'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream'
'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream'
'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Bream' 'Roach'
'Roach' 'Roach' 'Roach' 'Roach' 'Roach' 'Roach' 'Roach' 'Roach' 'Roach'
'Roach' 'Roach' 'Roach' 'Roach' 'Roach' 'Roach' 'Roach' 'Roach' 'Roach'
Prepare the train and test sets.
from sklearn.model_selection import train_test_split
train_input, test_input, train_target, test_target = train_test_split(
fish_input, fish_target, random_state=42)
Standardize the features using StandardScaler (fit on the training set, then transform both sets).
from sklearn.preprocessing import StandardScaler
ss = StandardScaler()
ss.fit(train_input)
train_scaled = ss.transform(train_input)
test_scaled = ss.transform(test_input)
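As a quick sanity check (not part of the original notebook), each column of the scaled training set should now have mean close to 0 and standard deviation close to 1:
print(train_scaled.mean(axis=0))  # per-feature means, all close to 0
print(train_scaled.std(axis=0))   # per-feature standard deviations, all close to 1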
To compare against logistic regression, K-nearest neighbors classification is performed first.
from sklearn.neighbors import KNeighborsClassifier
kn = KNeighborsClassifier(n_neighbors=3)
kn.fit(train_scaled, train_target)
print(kn.score(train_scaled, train_target))
print(kn.score(test_scaled, test_target))
0.8907563025210085
0.85
In binary classification, the sigmoid function converts the score z into a probability.
Prepare the data (rows whose target is Bream or Smelt) and run binary logistic regression.
bream_smelt_indexes = (train_target == 'Bream') | (train_target == 'Smelt')
train_bream_smelt = train_scaled[bream_smelt_indexes]
target_bream_smelt = train_target[bream_smelt_indexes]
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()
lr.fit(train_bream_smelt, target_bream_smelt)
print(kn.score(train_bream_smelt, target_bream_smelt))  # score of the earlier KNN model on the Bream/Smelt subset, for comparison
0.9696969696969697
As shown below, the class labels can be checked with classes_, and the coefficients and the intercept with coef_ and intercept_. Printing predict_proba for the first five Bream/Smelt samples shows probabilities between 0 and 1; the columns follow the order of lr.classes_. Since the score is expressed as "z = a * Weight + b * Length + c * Diagonal + d * Height + e * Width + f", the coefficients a, b, c, d, e appear in "lr.coef_" and the intercept f in "lr.intercept_", as shown below.
print(lr.classes_)
print(lr.coef_, lr.intercept_)
print(lr.predict(train_bream_smelt[:5]))
print(lr.predict_proba(train_bream_smelt[:5]))
['Bream' 'Smelt']
[[-0.4037798 -0.57620209 -0.66280298 -1.01290277 -0.73168947]] [-2.16155132]
['Bream' 'Smelt' 'Bream' 'Bream' 'Bream']
[[0.99759855 0.00240145]
[0.02735183 0.97264817]
[0.99486072 0.00513928]
[0.98584202 0.01415798]
[0.99767269 0.00232731]]
Compute the scores with decision_function and pass them through expit(), which implements the sigmoid function; the values match the probabilities shown above.
decisions = lr.decision_function(train_bream_smelt[:5])
print(decisions)
from scipy.special import expit
print(expit(decisions))
[-6.02927744 3.57123907 -5.26568906 -4.24321775 -6.0607117 ]
[0.00240145 0.97264817 0.00513928 0.01415798 0.00232731]
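To tie this back to the z formula above, the same scores can also be computed by hand from lr.coef_ and lr.intercept_ (a sketch using the fitted binary model from above):
# z = a*Weight + b*Length + c*Diagonal + d*Height + e*Width + f, row by row
z_manual = train_bream_smelt[:5] @ lr.coef_[0] + lr.intercept_[0]
print(z_manual)         # should match the decision_function output above
print(expit(z_manual))  # and the sigmoid reproduces the 'Smelt' probabilities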
In multiclass classification, the Softmax function converts the scores z into probabilities. Multiclass logistic regression is regularized through C (L2 regularization is applied by default), and the larger C is, the weaker the regularization. Multiclass logistic regression is performed as below, and it gives better results than K-nearest neighbors.
lr = LogisticRegression(C=20, max_iter=1000)
lr.fit(train_scaled, train_target)
print(lr.score(train_scaled, train_target))
print(lr.score(test_scaled, test_target))
0.9327731092436975
0.925
The values computed internally can be checked with predict and predict_proba, as shown below.
print(lr.classes_)
print(lr.coef_.shape, lr.intercept_.shape)
proba = lr.predict_proba(test_scaled[:5])
import numpy as np
print(np.round(proba, decimals=3))
['Bream' 'Parkki' 'Perch' 'Pike' 'Roach' 'Smelt' 'Whitefish']
(7, 5) (7,)
[[0. 0.014 0.841 0. 0.136 0.007 0.003]
[0. 0.003 0.044 0. 0.007 0.946 0. ]
[0. 0. 0.034 0.935 0.015 0.016 0. ]
[0.011 0.034 0.306 0.007 0.567 0. 0.076]
[0. 0. 0.904 0.002 0.089 0.002 0.001]]
Likewise, computing the scores with decision_function and converting them to probabilities with Softmax gives the same result as predict_proba.
decision = lr.decision_function(test_scaled[:5])
print(np.round(decision, decimals=2))
from scipy.special import softmax
proba = softmax(decision, axis=1)
print(np.round(proba, decimals=3))
[[ -6.5 1.03 5.16 -2.73 3.34 0.33 -0.63]
[-10.86 1.93 4.77 -2.4 2.98 7.84 -4.26]
[ -4.34 -6.23 3.17 6.49 2.36 2.42 -3.87]
[ -0.68 0.45 2.65 -1.19 3.26 -5.75 1.26]
[ -6.4 -1.99 5.82 -0.11 3.5 -0.11 -0.71]]
[[0. 0.014 0.841 0. 0.136 0.007 0.003]
[0. 0.003 0.044 0. 0.007 0.946 0. ]
[0. 0. 0.034 0.935 0.015 0.016 0. ]
[0.011 0.034 0.306 0.007 0.567 0. 0.076]
[0. 0. 0.904 0.002 0.089 0.002 0.001]]
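scipy's softmax simply exponentiates each score and normalizes each row to sum to 1; a minimal hand-rolled version for comparison (a numerically robust implementation would subtract the row maximum before exponentiating):
exp_z = np.exp(decision)
proba_manual = exp_z / exp_z.sum(axis=1, keepdims=True)
print(np.round(proba_manual, decimals=3))  # matches softmax(decision, axis=1)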
Linear classifiers can also be implemented as linear models trained with Stochastic Gradient Descent (SGD). The notebook SGDClassifier.ipynb is explained below.
from sklearn.model_selection import cross_validate
from sklearn.linear_model import SGDClassifier
sc = SGDClassifier(loss='log', max_iter=5, random_state=42)  # 'log' is the logistic loss; scikit-learn >= 1.1 renames it to 'log_loss'
scores = cross_validate(sc, train_scaled, train_target, n_jobs=-1)
import numpy as np
print(np.mean(scores['test_score']))
Because the least populated class in this sample has only 3 members, fewer than n_splits=5, the result is not good.
/home/ec2-user/anaconda3/envs/python3/lib/python3.8/site-packages/sklearn/model_selection/_split.py:676: UserWarning: The least populated class in y has only 3 members, which is less than n_splits=5.
warnings.warn(
0.7144927536231884
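Because SGD updates the model one sample (or mini-batch) at a time, SGDClassifier also supports online learning through partial_fit. A minimal sketch, reusing train_scaled and train_target from above (the batch size and loop are illustrative):
from sklearn.linear_model import SGDClassifier
import numpy as np

sc_online = SGDClassifier(loss='log', random_state=42)
classes = np.unique(train_target)  # partial_fit needs all classes up front
for start in range(0, len(train_scaled), 32):
    # feed one mini-batch at a time, as in an online / streaming setting
    sc_online.partial_fit(train_scaled[start:start + 32],
                          train_target[start:start + 32], classes=classes)
print(sc_online.score(train_scaled, train_target))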
A case where logistic regression's accuracy drops is explained below.
- Prepare the data.
Load the data with pandas. This wine data is based on Kaggle's Red Wine Quality dataset.
import pandas as pd
wine = pd.read_csv('https://bit.ly/wine_csv_data')
wine.head()
The data contains columns such as "alcohol", "sugar", and "pH". Its structure is as follows:
wine.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6497 entries, 0 to 6496
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 alcohol 6497 non-null float64
1 sugar 6497 non-null float64
2 pH 6497 non-null float64
3 class 6497 non-null float64
dtypes: float64(4)
memory usage: 203.2 KB
Prepare the train and test sets as below.
data = wine[['alcohol', 'sugar', 'pH']].to_numpy()
target = wine['class'].to_numpy()
from sklearn.model_selection import train_test_split
train_input, test_input, train_target, test_target = train_test_split(
data, target, test_size=0.2, random_state=42)
print(train_input.shape, test_input.shape)
(5197, 3) (1300, 3)
Standardize the features as below.
from sklearn.preprocessing import StandardScaler
ss = StandardScaler()
ss.fit(train_input)
train_scaled = ss.transform(train_input)
test_scaled = ss.transform(test_input)
- Fit Logistic Regression and evaluate its score (accuracy).
As shown below, the result is underfitting: both the train and test scores are low.
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()
lr.fit(train_scaled, train_target)
print(lr.score(train_scaled, train_target))
print(lr.score(test_scaled, test_target))
0.7808350971714451
0.7776923076923077
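Since C controls the strength of the default L2 regularization (larger C = weaker regularization), one way to check whether the low scores come from over-regularization is to sweep C and compare the train/test scores; if both stay low even for large C, the model itself is too simple for this data. A sketch (using a separate variable so the fitted lr above is untouched):
# sweep the regularization strength; larger C means weaker regularization
for C in [0.01, 0.1, 1, 10, 100]:
    lr_c = LogisticRegression(C=C, max_iter=1000)
    lr_c.fit(train_scaled, train_target)
    print(C, lr_c.score(train_scaled, train_target),
          lr_c.score(test_scaled, test_target))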
์ด๋, "alcohol", "sugar", "pH"์ ๋ํ ๊ธฐ์ธ๊ธฐ๋ ์๋์ ๊ฐ์ต๋๋ค.
print(lr.coef_, lr.intercept_)
[[ 0.51270274 1.6733911 -0.68767781]] [1.81777902]
- 혼자 공부하는 머신러닝+딥러닝
- sklearn.linear_model.SGDClassifier
- Machine Learning at Work (한빛미디어)