bro's coding
(Reducing the number of features (words)) Remove from the vocabulary the words that appear in fewer than min_df documents.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB
num_of_words=[]
scores_BernoulliNB=[]
min_df=range(1,10)
for df in min_df:
    vect=CountVectorizer(min_df=df)
    vect.fit(text_train)
    num_of_words.append(len(vect.get_feature_names()))
    X_train=vect.transform(text_train)
    X_test=vect.transform(text_test) ..
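The loop above is truncated; a minimal sketch of how it might continue, assuming text_train, text_test, y_train, and y_test are already in memory (they come from the IMDB loading excerpts further down the page), and using len(vect.vocabulary_) in place of get_feature_names(), which newer scikit-learn releases have renamed:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

num_of_words = []
scores_BernoulliNB = []
for df in range(1, 10):
    vect = CountVectorizer(min_df=df)           # ignore words seen in fewer than df documents
    X_train = vect.fit_transform(text_train)    # build the vocabulary and vectorize in one step
    X_test = vect.transform(text_test)
    num_of_words.append(len(vect.vocabulary_))  # vocabulary size after pruning
    model = BernoulliNB().fit(X_train, y_train)
    scores_BernoulliNB.append(model.score(X_test, y_test))

print(num_of_words)
print(scores_BernoulliNB)
```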
import numpy as np
# load the saved data file
imdb_train,imdb_test=np.load('imdb.npy')
# decode the bytes and remove the <br /> tags
text_train=[s.decode().replace('<br />',' ') for s in imdb_train.data]
text_test=[s.decode().replace('<br />',' ') for s in imdb_test.data]
y_train=imdb_train.target
y_test=imdb_test.target
from sklearn.feature_extraction.text import CountVectorizer
vect=CountVectorizer()
# fit on the training text
vect.fit(text_train,..
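The snippet cuts off at vect.fit(...); finished out, the same pipeline roughly looks like this. It is a sketch, not the post's full code: allow_pickle=True is assumed because recent NumPy refuses to load object arrays without it, and the <br /> replacement follows the comment about removing markup:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

# load the cached Bunch objects (an object array needs allow_pickle on newer NumPy)
imdb_train, imdb_test = np.load('imdb.npy', allow_pickle=True)

# decode the byte strings and strip the HTML line breaks left in the reviews
text_train = [s.decode().replace('<br />', ' ') for s in imdb_train.data]
text_test = [s.decode().replace('<br />', ' ') for s in imdb_test.data]
y_train, y_test = imdb_train.target, imdb_test.target

vect = CountVectorizer()
vect.fit(text_train)                  # build the vocabulary from the training text only
X_train = vect.transform(text_train)  # sparse document-term count matrix
X_test = vect.transform(text_test)
print(repr(X_train))                  # shows n_documents x n_features and the nonzero count
```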
Link: https://broscoding.tistory.com/203 (sklearn.feature_extraction.text.CountVectorizer, BOW (bag of words): building a vocabulary)
import numpy as np
# load the saved data file
imdb_train,imdb_test=np.load('imdb.npy')
# decode and remove..
# Comparing the vocabulary against one document
# for i in range(X_train[0].shape[1]):
#     if X_train[0,i]>0:
#         print(i,vect.get_feature_names()[i],X_train[0,i])
# calling get_feature_names() inside the loop is slow, so assign it to a variable first
feature_name=vect.get_feature_names()
for i in range(X_train[0].shape[1]):
    if X_train[0,i]>0:
        print(i,feature_name[i],X_train[0,i])
'''
1723 actions 1
1741 actors 1
2880 almost 1
3375 and 2
3859 anything 1
4269 are 1
6512..
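Iterating over every column of the row, as above, still scans tens of thousands of zero entries. Since the document-term matrix is a SciPy CSR matrix, only the nonzero columns of the row need to be visited; a small sketch assuming vect and X_train from the surrounding excerpts:

```python
import numpy as np

# index the names by position; newer scikit-learn renames this to get_feature_names_out()
feature_names = np.array(vect.get_feature_names())

row = X_train[0]        # first document as a 1 x n_features sparse row (CSR)
cols = row.indices      # column indices of the nonzero entries
counts = row.data       # the matching word counts

for col, count in sorted(zip(cols, counts)):
    print(col, feature_names[col], count)
```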
BOW (bag of words): building a vocabulary
from sklearn.feature_extraction.text import CountVectorizer
ss=['I am Tom. Tom is me!','He is Tom. He is a man']
vect=CountVectorizer()
vect.fit(ss)
'''
CountVectorizer(analyzer='word', binary=False, decode_error='strict',
                dtype=<class 'numpy.int64'>, encoding='utf-8', input='content',
                lowercase=True, max_df=1.0, max_features=None, min_df=1,
                ngram_range=(1, 1), preprocessor=None, stop_wor..
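Finishing the toy example shows what CountVectorizer actually builds: vocabulary_ maps each word to a column index (assigned alphabetically), and transform() turns each sentence into a row of word counts. A minimal sketch:

```python
from sklearn.feature_extraction.text import CountVectorizer

ss = ['I am Tom. Tom is me!', 'He is Tom. He is a man']
vect = CountVectorizer()
vect.fit(ss)

# word -> column index; single-letter tokens like 'I' and 'a' are dropped by the default tokenizer
print(vect.vocabulary_)   # {'am': 0, 'he': 1, 'is': 2, 'man': 3, 'me': 4, 'tom': 5} (print order may vary)

# encode both sentences as count vectors (dense view just for readability)
bow = vect.transform(ss)
print(bow.toarray())      # [[1 0 1 0 1 2]
                          #  [0 2 2 1 0 1]]
```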
imdb_train.data[6]
b"This movie has a special way of telling the story, at first i found it rather odd as it jumped through time and I had no idea whats happening. Anyway the story line was although simple, but still very real and touching. You met someone the first time, you fell in love completely, but broke up at last and promoted a deadly agony. Who hasn't go through this? but we will never ..
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_files
# read the raw review files
imdb_train=load_files('data/aclImdb/train')
imdb_test=load_files('data/aclImdb/test')
# save with numpy so the data can be reloaded quickly later
np.save('imdb.npy',[imdb_train,imdb_test])
# read the numpy file back
imdb_train,imdb_test=np.load('imdb.npy')
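load_files returns a Bunch whose .data holds the raw documents and .target holds labels derived from the subfolder names, and the np.save call caches both objects so the slow directory scan only happens once. A sketch of the same caching idea, assuming the data/aclImdb layout used above (note the allow_pickle flag required by recent NumPy when reloading object arrays):

```python
import os
import numpy as np
from sklearn.datasets import load_files

if not os.path.exists('imdb.npy'):
    # each subfolder of train/ and test/ (neg, pos, ...) becomes one class label
    imdb_train = load_files('data/aclImdb/train')
    imdb_test = load_files('data/aclImdb/test')
    np.save('imdb.npy', [imdb_train, imdb_test])

# the saved array holds Python objects, so allow_pickle=True is required to read it back
imdb_train, imdb_test = np.load('imdb.npy', allow_pickle=True)

print(imdb_train.target_names)   # subfolder names used as class labels
print(len(imdb_train.data))      # number of documents read
```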
from sklearn.base import BaseEstimator, TransformerMixin
class Myclassifier(BaseEstimator,TransformerMixin):
    def __init__(self,first_parameter=1,second_parameter=2):
        # list every parameter the __init__ method needs
        self.first_parameter=first_parameter
        self.second_parameter=second_parameter
    def fit(self,X,y=None):
        # the fit method takes only the X and y parameters
        # even an unsupervised model has to accept y
        self.result=None  # model training happens here
        print('Starting model training!')
        return self
    def predict(self,X):
        # takes only X ..
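Written out as a transformer that actually runs, the same skeleton looks roughly like this; it is a sketch of the BaseEstimator/TransformerMixin pattern (the trivial "add a constant" transform is made up for illustration), not the original post's model:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class MyTransformer(BaseEstimator, TransformerMixin):
    def __init__(self, first_parameter=1, second_parameter=2):
        # store each constructor argument under the same name so get_params()/clone() work
        self.first_parameter = first_parameter
        self.second_parameter = second_parameter

    def fit(self, X, y=None):
        # fit takes only X and y; y must be accepted even though nothing is learned here
        return self

    def transform(self, X):
        # illustrative transformation: shift every value by first_parameter
        return np.asarray(X) + self.first_parameter

# TransformerMixin supplies fit_transform() for free
trans = MyTransformer(first_parameter=10)
print(trans.fit_transform([[1, 2], [3, 4]]))   # [[11 12] [13 14]]
```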
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
class Myclassifier(BaseEstimator,ClassifierMixin):
    def __init__(self):
        # list every parameter the __init__ method needs
        self.result=0
    def fit(self,X,y):
        # the fit method takes only the X and y parameters
        # model training happens here
        return self
    def predict(self,X):
        # takes only X
        pred_y=np.zeros(len(X))+self.result
        return pred_y
It can be used without writing a score method and the like yourself.
from sklearn.base import BaseEstimator, TransformerMixin ,..
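Filled out end to end, the dummy classifier works through the usual fit/predict/score interface; score() does not have to be written because ClassifierMixin supplies mean accuracy. A minimal sketch with made-up toy data:

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class MyClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, result=0):
        # the constant label this dummy model always predicts
        self.result = result

    def fit(self, X, y):
        # a real model would learn from X and y here; this one learns nothing
        return self

    def predict(self, X):
        # predict the same constant label for every sample
        return np.zeros(len(X)) + self.result

X = [[0], [1], [2], [3]]
y = [0, 0, 1, 0]
clf = MyClassifier().fit(X, y)
print(clf.predict(X))    # [0. 0. 0. 0.]
print(clf.score(X, y))   # 0.75, accuracy supplied by ClassifierMixin
```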