sklearn.feature_extraction.text.CountVectorizer.ngram.LogisticRegression. Showing only the two-word features

bro's coding

givemebro 2020. 4. 28. 16:24

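import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

# text_train (raw documents) and y_train (labels) are assumed to be defined already.
# ngram_range=(1,2) builds a vocabulary of single words and two-word sequences.
vect=CountVectorizer(ngram_range=(1,2))
X_train=vect.fit_transform(text_train)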

from sklearn.linear_model import LogisticRegression

model=LogisticRegression()
model.fit(X_train,y_train)

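# Extract the features made of two words (their names contain a space)
# get_feature_names_out() requires scikit-learn >= 1.0 (get_feature_names() was removed in 1.2)
fn=np.array(vect.get_feature_names_out())
mask=np.array([' ' in s for s in fn])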

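# Coefficients of the two-word features only
w=model.coef_[0][mask]

# Indices of the 20 most negative and 20 most positive weights
index_sorted_w=np.argsort(w)
index_small=index_sorted_w[:20]
index_big=index_sorted_w[-20:]
index_small_big=np.r_[index_small,index_big]
# These indices refer to the masked array w, so mask fn the same way before indexing
small_big_name=fn[mask][index_small_big]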

# visualization
import matplotlib.pyplot as plt
plt.figure(figsize=[10,10])
plt.title('ngram_range=(1,2)')
n=len(index_small_big)  # 40 when there are at least 20 two-word features
plt.bar(range(n),w[index_small_big])
plt.xticks(range(n),small_big_name,rotation=90)
plt.show()
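
The selection step pairs each weight with a feature name by position, so names and weights have to come from the same filtered list. Below is a minimal, dependency-free sketch of that idea. The function name `extreme_bigrams`, the vocabulary, and the weights are all hypothetical and made up for illustration; they are not output from the model above.

```python
def extreme_bigrams(feature_names, coefs, k=2):
    """Return (name, weight) pairs for the k most negative and
    k most positive weights among two-word features."""
    # Keep only features whose name contains a space, i.e. two-word n-grams.
    bigrams = [(name, w) for name, w in zip(feature_names, coefs) if " " in name]
    # Sort name and weight together so they can never get out of step.
    bigrams.sort(key=lambda pair: pair[1])
    # Cap k so the low end and the high end do not overlap.
    k = min(k, len(bigrams) // 2)
    return bigrams[:k] + bigrams[len(bigrams) - k:]


# Hypothetical vocabulary and coefficients, for illustration only.
names = ["bad", "not bad", "not good", "good", "very good", "very bad"]
weights = [-1.0, 0.8, -0.9, 1.1, 1.5, -1.7]

result = extreme_bigrams(names, weights, k=1)
print(result)
```

Keeping each name and its weight in one tuple avoids the index bookkeeping entirely, at the cost of NumPy's speed on a large vocabulary.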
