
bro's coding

sklearn: applying LogisticRegression to text data



givemebro 2020. 4. 27. 17:31

Related post: sklearn.feature_extraction.text.CountVectorizer (BOW, Bag of Words: building a vocabulary)
https://broscoding.tistory.com/203

import numpy as np

# load the data file (it stores pickled Python objects, so recent NumPy versions need allow_pickle=True)
imdb_train,imdb_test=np.load('imdb.npy',allow_pickle=True)

# decode bytes -> str and remove the '<br />' tags
text_train=[s.decode().replace('<br />','') for s in imdb_train.data]
text_test=[s.decode().replace('<br />','') for s in imdb_test.data]

y_train=imdb_train.target
y_test=imdb_test.target

from sklearn.feature_extraction.text import CountVectorizer

vect=CountVectorizer()
# learn the vocabulary from the training text (CountVectorizer ignores labels)
vect.fit(text_train)

# turn both splits into bag-of-words count matrices
X_train=vect.transform(text_train)# sparse matrix
X_test=vect.transform(text_test)
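
To make the `fit` / `transform` step above concrete, here is a minimal pure-Python sketch of the bag-of-words counting that `CountVectorizer` performs with its default settings, applied to the two sample sentences from the related post. The helper names (`tokenize`, `fit_vocabulary`, `transform`) are made up for this illustration and are not part of scikit-learn. The real class returns a sparse matrix, not nested lists.

```python
import re
from collections import Counter

# Same token rule as CountVectorizer's default token_pattern:
# lowercase the text, then keep runs of two or more word characters.
TOKEN_RE = re.compile(r"(?u)\b\w\w+\b")

def tokenize(doc):
    """Hypothetical helper: split one document into lowercase tokens."""
    return TOKEN_RE.findall(doc.lower())

def fit_vocabulary(docs):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: idx for idx, tok in enumerate(tokens)}

def transform(docs, vocabulary):
    """Return one dense row of counts per document; unseen tokens are skipped."""
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        row = [0] * len(vocabulary)
        for tok, n in counts.items():
            if tok in vocabulary:
                row[vocabulary[tok]] = n
        rows.append(row)
    return rows

docs = ['I am Tom. Tom is me!', 'He is Tom. He is a man']
vocab = fit_vocabulary(docs)
matrix = transform(docs, vocab)
print(vocab)   # {'am': 0, 'he': 1, 'is': 2, 'man': 3, 'me': 4, 'tom': 5}
print(matrix)  # [[1, 0, 1, 0, 1, 2], [0, 2, 2, 1, 0, 1]]
```

Single-character tokens such as "I" and "a" are dropped by the default pattern, which is why they do not appear in the vocabulary. Words that were not seen during `fit` are ignored at `transform` time, which is why the vocabulary is learned from the training text only.
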
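
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

# cross-validated accuracy of LogisticRegression on the bag-of-words features
# note: the folds here are drawn from the test split (X_test, y_test);
# pass X_train, y_train instead to cross-validate on the training split
scores=cross_val_score(LogisticRegression(C=1),X_test,y_test)
print(scores.mean())
# 0.8831598032887086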
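
As a rough picture of what `cross_val_score` computes, the sketch below splits the samples into k folds, trains on k-1 of them, scores on the held-out fold, and averages the per-fold scores. It is a simplification under stated assumptions: the folds are plain contiguous blocks, whereas scikit-learn uses stratified folds for classifiers (recent versions default to 5 folds), and the majority-class "model" is a hypothetical stand-in for `LogisticRegression`. All function names here are invented for the illustration.

```python
from collections import Counter

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_scores(fit, score, X, y, k=5):
    """Train on k-1 folds, score on the held-out fold, return per-fold scores."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(score(model, [X[i] for i in test_idx],
                            [y[i] for i in test_idx]))
    return scores

# Hypothetical stand-in model: always predict the most common training label.
def fit_majority(X_part, y_part):
    return Counter(y_part).most_common(1)[0][0]

def accuracy(model, X_part, y_part):
    return sum(1 for label in y_part if label == model) / len(y_part)

X_toy = list(range(10))                    # features are unused by this toy model
y_toy = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
fold_scores = cross_val_scores(fit_majority, accuracy, X_toy, y_toy, k=5)
mean_score = sum(fold_scores) / len(fold_scores)
print(fold_scores)  # [1.0, 0.5, 1.0, 0.5, 1.0]
print(mean_score)   # 0.8
```

The single number printed by the post is the same kind of mean over per-fold accuracies.
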

 
