bro's coding

sklearn.svm.SVC and normalization (breast cancer)


givemebro 2020. 4. 16. 16:29
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

 

from sklearn.datasets import load_breast_cancer
cancer =load_breast_cancer()
X=cancer.data
y=cancer.target

from sklearn.model_selection import train_test_split
# No random_state is set, so the split and the scores below vary from run to run.
X_train,X_test,y_train,y_test=train_test_split(X,y)

 

from sklearn.svm import SVC
model=SVC()
model.fit(X_train,y_train)

model.score(X_test,y_test)
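# 0.6083916083916084
# Without normalization the score is poor: the features span very different scales.
# (This score likely reflects the older default gamma='auto'; scikit-learn 0.22+
# defaults to gamma='scale', which usually does much better on unscaled data.)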
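
To see why the unscaled features hurt an RBF-kernel SVC, it can help to compare the spread of each feature. The following is a minimal sketch, not part of the original post; the names `data`, `features`, and `stds` are chosen here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
features = data.data

# Standard deviation of each of the 30 features across all samples.
stds = features.std(axis=0)

# Features with the largest and smallest spread.
widest = data.feature_names[np.argmax(stds)]
narrowest = data.feature_names[np.argmin(stds)]

print(f"largest std:  {widest} = {stds.max():.4g}")
print(f"smallest std: {narrowest} = {stds.min():.4g}")
print(f"ratio of largest to smallest std: {stds.max() / stds.min():.3g}")
```

The area-type features vary over hundreds of units while some others vary by only a few thousandths, so distances in the kernel are dominated by a handful of columns.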


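# The class balance of the target also matters [0: 37%, 1: 63%]:
# always predicting class 1 would already score about 0.63,
# so the 0.61 above is no better than that trivial baseline.
cancer.target.sum()/len(cancer.target)
# 0.6274165202108963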
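
The majority-class baseline can also be measured directly. This is a hedged sketch using scikit-learn's `DummyClassifier`; the fixed `random_state=0`, the `stratify` argument, and the variable names are assumptions added here for reproducibility and are not in the original post.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split

data = load_breast_cancer()

# Stratify so the test split keeps roughly the same 37% / 63% class balance.
X_tr, X_te, y_tr, y_te = train_test_split(
    data.data, data.target, random_state=0, stratify=data.target
)

# Always predicts the most frequent class seen in the training labels.
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_tr, y_tr)

baseline_acc = baseline.score(X_te, y_te)
print(baseline_acc)
```

Any real model should clearly beat this number before its accuracy is taken seriously.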

 

# Normalization (standardization): zero mean and unit variance per feature

X=(X-X.mean(axis=0))/X.std(axis=0)

 

X_train,X_test,y_train,y_test=train_test_split(X,y)

 

model=SVC()
model.fit(X_train,y_train)
model.score(X_test,y_test)

# 0.965034965034965

 

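# The approach above is flawed: the mean and std were computed on the full dataset,
# so the test data influenced the preprocessing (data leakage).
# Only the mean and std of the training data should be used.

# Split the raw data again; X was overwritten with normalized values above.
X_train,X_test,y_train,y_test=train_test_split(cancer.data,y)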
m=X_train.mean(axis=0)
s=X_train.std(axis=0)

X_train_norm=(X_train-m)/s
X_test_norm=(X_test-m)/s

model=SVC()
model.fit(X_train_norm,y_train)

model.score(X_test_norm,y_test)

# 0.965034965034965
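
The same "fit on train only" rule can be expressed with scikit-learn's own tools. The sketch below is one possible way to do it, not the original author's code: `StandardScaler` inside a pipeline learns the mean and std from the training data only, and `cross_val_score` repeats that per fold. The `random_state=0` and the variable names are assumptions added for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

data = load_breast_cancer()
X_tr, X_te, y_tr, y_te = train_test_split(data.data, data.target, random_state=0)

# The scaler is fitted on X_tr only; X_te is transformed with the training statistics.
pipe = make_pipeline(StandardScaler(), SVC())
pipe.fit(X_tr, y_tr)
test_acc = pipe.score(X_te, y_te)

# The fitted scaler holds the same statistics as the manual version (np.std uses ddof=0).
scaler = pipe.named_steps["standardscaler"]
same_stats = np.allclose(scaler.mean_, X_tr.mean(axis=0)) and np.allclose(
    scaler.scale_, X_tr.std(axis=0)
)

# Cross-validation refits the scaler inside every fold, so no fold leaks into another.
cv_scores = cross_val_score(
    make_pipeline(StandardScaler(), SVC()), data.data, data.target, cv=5
)

print(test_acc, same_stats, cv_scores.mean())
```

Wrapping the scaler and the model in one pipeline makes it hard to accidentally normalize with test-set statistics, and the cross-validated mean is less sensitive to one lucky split.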