bro's coding
(Reducing the number of features (words)) Remove from the vocabulary the words that appear in fewer than min_df documents.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB
num_of_words=[]
scores_BernoulliNB=[]
min_df=range(1,10)
for df in min_df:
    vect=CountVectorizer(min_df=df)
    vect.fit(text_train)
    num_of_words.append(len(vect.get_feature_names()))
    X_train=vect.transform(text_train)
    X_test=vect.transform(text_test) ..
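The loop above is truncated; a minimal sketch of how it might continue, assuming text_train, text_test, y_train, and y_test are already in memory (they come from the IMDB loading excerpts further down the page), and using len(vect.vocabulary_) in place of get_feature_names(), which newer scikit-learn releases have renamed:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

num_of_words = []
scores_BernoulliNB = []
for df in range(1, 10):
    vect = CountVectorizer(min_df=df)           # ignore words seen in fewer than df documents
    X_train = vect.fit_transform(text_train)    # build the vocabulary and vectorize in one step
    X_test = vect.transform(text_test)
    num_of_words.append(len(vect.vocabulary_))  # vocabulary size after pruning
    model = BernoulliNB().fit(X_train, y_train)
    scores_BernoulliNB.append(model.score(X_test, y_test))

print(num_of_words)
print(scores_BernoulliNB)
```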
import numpy as np
# load the saved data file
imdb_train,imdb_test=np.load('imdb.npy')
# decode the bytes and remove the <br /> tags
text_train=[s.decode().replace('<br />',' ') for s in imdb_train.data]
text_test=[s.decode().replace('<br />',' ') for s in imdb_test.data]
y_train=imdb_train.target
y_test=imdb_test.target
from sklearn.feature_extraction.text import CountVectorizer
vect=CountVectorizer()
# fit on the training text
vect.fit(text_train,..
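The snippet cuts off at vect.fit(...); finished out, the same pipeline roughly looks like this. It is a sketch, not the post's full code: allow_pickle=True is assumed because recent NumPy refuses to load object arrays without it, and the <br /> replacement follows the comment about removing markup:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

# load the cached Bunch objects (an object array needs allow_pickle on newer NumPy)
imdb_train, imdb_test = np.load('imdb.npy', allow_pickle=True)

# decode the byte strings and strip the HTML line breaks left in the reviews
text_train = [s.decode().replace('<br />', ' ') for s in imdb_train.data]
text_test = [s.decode().replace('<br />', ' ') for s in imdb_test.data]
y_train, y_test = imdb_train.target, imdb_test.target

vect = CountVectorizer()
vect.fit(text_train)                  # build the vocabulary from the training text only
X_train = vect.transform(text_train)  # sparse document-term count matrix
X_test = vect.transform(text_test)
print(repr(X_train))                  # shows n_documents x n_features and the nonzero count
```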
Link: https://broscoding.tistory.com/203 (sklearn.feature_extraction.text.CountVectorizer, BOW (bag of words): building a vocabulary)
import numpy as np
# load the saved data file
imdb_train,imdb_test=np.load('imdb.npy')
# decode and remove..
# Comparing the vocabulary against one document
# for i in range(X_train[0].shape[1]):
#     if X_train[0,i]>0:
#         print(i,vect.get_feature_names()[i],X_train[0,i])
# calling get_feature_names() inside the loop is slow, so assign it to a variable first
feature_name=vect.get_feature_names()
for i in range(X_train[0].shape[1]):
    if X_train[0,i]>0:
        print(i,feature_name[i],X_train[0,i])
'''
1723 actions 1
1741 actors 1
2880 almost 1
3375 and 2
3859 anything 1
4269 are 1
6512..
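Iterating over every column of the row, as above, still scans tens of thousands of zero entries. Since the document-term matrix is a SciPy CSR matrix, only the nonzero columns of the row need to be visited; a small sketch assuming vect and X_train from the surrounding excerpts:

```python
import numpy as np

# index the names by position; newer scikit-learn renames this to get_feature_names_out()
feature_names = np.array(vect.get_feature_names())

row = X_train[0]        # first document as a 1 x n_features sparse row (CSR)
cols = row.indices      # column indices of the nonzero entries
counts = row.data       # the matching word counts

for col, count in sorted(zip(cols, counts)):
    print(col, feature_names[col], count)
```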
BOW (bag of words): building a vocabulary
from sklearn.feature_extraction.text import CountVectorizer
ss=['I am Tom. Tom is me!','He is Tom. He is a man']
vect=CountVectorizer()
vect.fit(ss)
'''
CountVectorizer(analyzer='word', binary=False, decode_error='strict',
                dtype=<class 'numpy.int64'>, encoding='utf-8', input='content',
                lowercase=True, max_df=1.0, max_features=None, min_df=1,
                ngram_range=(1, 1), preprocessor=None, stop_wor..
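Finishing the toy example shows what CountVectorizer actually builds: vocabulary_ maps each word to a column index (assigned alphabetically), and transform() turns each sentence into a row of word counts. A minimal sketch:

```python
from sklearn.feature_extraction.text import CountVectorizer

ss = ['I am Tom. Tom is me!', 'He is Tom. He is a man']
vect = CountVectorizer()
vect.fit(ss)

# word -> column index; single-letter tokens like 'I' and 'a' are dropped by the default tokenizer
print(vect.vocabulary_)   # {'am': 0, 'he': 1, 'is': 2, 'man': 3, 'me': 4, 'tom': 5} (print order may vary)

# encode both sentences as count vectors (dense view just for readability)
bow = vect.transform(ss)
print(bow.toarray())      # [[1 0 1 0 1 2]
                          #  [0 2 2 1 0 1]]
```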
imdb_train.data[6]
b"This movie has a special way of telling the story, at first i found it rather odd as it jumped through time and I had no idea whats happening. Anyway the story line was although simple, but still very real and touching. You met someone the first time, you fell in love completely, but broke up at last and promoted a deadly agony. Who hasn't go through this? but we will never ..
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_files
# read the raw review files
imdb_train=load_files('data/aclImdb/train')
imdb_test=load_files('data/aclImdb/test')
# save with numpy so the data can be reloaded quickly later
np.save('imdb.npy',[imdb_train,imdb_test])
# read the numpy file back
imdb_train,imdb_test=np.load('imdb.npy')
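load_files returns a Bunch whose .data holds the raw documents and .target holds labels derived from the subfolder names, and the np.save call caches both objects so the slow directory scan only happens once. A sketch of the same caching idea, assuming the data/aclImdb layout used above (note the allow_pickle flag required by recent NumPy when reloading object arrays):

```python
import os
import numpy as np
from sklearn.datasets import load_files

if not os.path.exists('imdb.npy'):
    # each subfolder of train/ and test/ (neg, pos, ...) becomes one class label
    imdb_train = load_files('data/aclImdb/train')
    imdb_test = load_files('data/aclImdb/test')
    np.save('imdb.npy', [imdb_train, imdb_test])

# the saved array holds Python objects, so allow_pickle=True is required to read it back
imdb_train, imdb_test = np.load('imdb.npy', allow_pickle=True)

print(imdb_train.target_names)   # subfolder names used as class labels
print(len(imdb_train.data))      # number of documents read
```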
from sklearn.base import BaseEstimator, TransformerMixin
class Myclassifier(BaseEstimator,TransformerMixin):
    def __init__(self,first_parameter=1,second_parameter=2):
        # list every parameter the __init__ method needs
        self.first_parameter=first_parameter
        self.second_parameter=second_parameter
    def fit(self,X,y=None):
        # the fit method takes only the X and y parameters
        # even an unsupervised model has to accept y
        self.result=None  # model training happens here
        print('Starting model training!')
        return self
    def predict(self,X):
        # takes only X ..
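Written out as a transformer that actually runs, the same skeleton looks roughly like this; it is a sketch of the BaseEstimator/TransformerMixin pattern (the trivial "add a constant" transform is made up for illustration), not the original post's model:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class MyTransformer(BaseEstimator, TransformerMixin):
    def __init__(self, first_parameter=1, second_parameter=2):
        # store each constructor argument under the same name so get_params()/clone() work
        self.first_parameter = first_parameter
        self.second_parameter = second_parameter

    def fit(self, X, y=None):
        # fit takes only X and y; y must be accepted even though nothing is learned here
        return self

    def transform(self, X):
        # illustrative transformation: shift every value by first_parameter
        return np.asarray(X) + self.first_parameter

# TransformerMixin supplies fit_transform() for free
trans = MyTransformer(first_parameter=10)
print(trans.fit_transform([[1, 2], [3, 4]]))   # [[11 12] [13 14]]
```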
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
class Myclassifier(BaseEstimator,ClassifierMixin):
    def __init__(self):
        # list every parameter the __init__ method needs
        self.result=0
    def fit(self,X,y):
        # the fit method takes only the X and y parameters
        # model training happens here
        return self
    def predict(self,X):
        # takes only X
        pred_y=np.zeros(len(X))+self.result
        return pred_y
It can be used without writing a score method and the like yourself.
from sklearn.base import BaseEstimator, TransformerMixin ,..
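Filled out end to end, the dummy classifier works through the usual fit/predict/score interface; score() does not have to be written because ClassifierMixin supplies mean accuracy. A minimal sketch with made-up toy data:

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class MyClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, result=0):
        # the constant label this dummy model always predicts
        self.result = result

    def fit(self, X, y):
        # a real model would learn from X and y here; this one learns nothing
        return self

    def predict(self, X):
        # predict the same constant label for every sample
        return np.zeros(len(X)) + self.result

X = [[0], [1], [2], [3]]
y = [0, 0, 1, 0]
clf = MyClassifier().fit(X, y)
print(clf.predict(X))    # [0. 0. 0. 0.]
print(clf.score(X, y))   # 0.75, accuracy supplied by ClassifierMixin
```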