Posts: [IT] (431)
bro's coding
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/cqIwtL/btqDf1fO3VD/zRA1wNCg3mNG1HEGP1Qtb0/img.png)
data[['역ID','역명','노선명']].drop_duplicates()  # drop_duplicates(): removes rows in which all three columns are duplicated
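To make the behaviour concrete, here is a minimal sketch on a toy frame. The rows and station IDs are made up for illustration; only the column names come from the excerpt above.

```python
import pandas as pd

# Hypothetical toy data; the real `data` frame is not shown in the excerpt.
toy = pd.DataFrame({
    '역ID': [150, 150, 151, 151],
    '역명': ['서울역', '서울역', '시청', '시청'],
    '노선명': ['1호선', '1호선', '1호선', '2호선'],
    '승차총승객수': [100, 120, 80, 90],
})

# A row is dropped only when all three selected columns repeat an earlier row.
unique_stations = toy[['역ID', '역명', '노선명']].drop_duplicates()
print(unique_stations)
```

The second row is removed because it repeats the first in all three columns, while the two 시청 rows both survive because they differ in 노선명.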
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/HGkPc/btqDdIVS5NK/nBPXkSDEOyKCv5EACXQBCK/img.png)
s=data[['노선명','역ID']].drop_duplicates().노선명.value_counts()
line_code['역수']=line_code.노선명.map(s)
line_code
[[day, data[data.사용일자==day].shape[0]] for day in range(20190501,20190532)]
# Output:
[[20190501, 593], [20190502, 593], [20190503, 593], [20190504, 591], [20190505, 591], [20190506, 592], [20190507, 590], [20190508, 592], [20190509, 591], [20190510, 591], [20190511, 592], [20190512, 590], [2019051..
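The station-count step can be sketched on toy data as follows. The values are hypothetical, and `line_code` is rebuilt by hand here because its original definition is not part of this excerpt.

```python
import pandas as pd

# Hypothetical ridership rows: one station can appear on several days.
toy = pd.DataFrame({
    '노선명': ['1호선', '1호선', '1호선', '2호선', '2호선', '2호선'],
    '역ID': [150, 150, 151, 201, 202, 203],
})

# Count distinct stations per line: de-duplicate (line, station) pairs first,
# then count how often each line name remains.
s = toy[['노선명', '역ID']].drop_duplicates()['노선명'].value_counts()

# Assumed shape of the lookup table: one row per line name.
line_code = pd.DataFrame({'노선코드': [0, 1], '노선명': ['1호선', '2호선']})

# Series.map with a Series argument looks each line name up in the index of `s`.
line_code['역수'] = line_code['노선명'].map(s)
print(line_code)
```

Without the `drop_duplicates()` call, `value_counts()` would count rows rather than stations, so 1호선 would come out as 3 instead of 2.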
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/BMFqa/btqDcx8ogxo/IcSqCdnl9dkQKRGMUKZZMK/img.png)
# Encode line names as integer codes
line_name=np.sort(data.노선명.unique())
line_code=pd.DataFrame(list(enumerate(line_name)),columns=['노선코드','노선명'])
line_code
dict(enumerate(np.sort(data.노선명.unique())))
# Invert the mapping
c1=dict(enumerate(np.sort(data.노선명.unique())))
c2={v:k for k,v in c1.items()}
c2
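A small self-contained sketch of the same encoding, using a made-up Series of line names. The `pd.factorize` call at the end is an alternative not used in the original post; it is shown only because it yields the same codes in one step.

```python
import numpy as np
import pandas as pd

# Hypothetical line names standing in for data.노선명
names = pd.Series(['2호선', '1호선', '2호선', '분당선'])

# Sort the unique names so that the codes are stable and ordered.
line_name = np.sort(names.unique())

# code -> name, then the inverted name -> code mapping
code_to_name = dict(enumerate(line_name))
name_to_code = {v: k for k, v in code_to_name.items()}

# Alternative (an assumption, not the post's method): factorize with sort=True
codes, uniques = pd.factorize(names, sort=True)

print(name_to_code)
print(codes)
```

The inverted dictionary is what you would pass to `Series.map` to turn the text column into integer codes.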
# Quick copy-paste snippet
from matplotlib import font_manager, rc
font_name=font_manager.FontProperties(fname="C:/Windows/Fonts/HMFMPYUN.TTF").get_name()
# Change fname="C:/Windows/Fonts/HMFMPYUN.TTF" to the path of the font you want
rc('font',family=font_name)
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/b3W0x0/btqDcyM1Uz8/VJ1VZ61eBXVhwPoUtSadEk/img.png)
p=pd.pivot_table(data,values='승차총승객수',index='사용일자',columns='노선명', aggfunc= np.mean).reset_index()
wday={0:'월',1:'화',2:'수',3:'목',4:'금',5:'토',6:'일'}
p.insert(1,'요일',pd.to_datetime(p.사용일자,format='%Y%m%d').dt.dayofweek.map(wday))
p
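The pivot-then-label pattern can be sketched on toy data like this. The passenger counts are invented, the weekday labels are written in English, and the dates are cast to strings before parsing as a defensive choice that the original does not make.

```python
import pandas as pd

# Hypothetical daily boardings per line
toy = pd.DataFrame({
    '사용일자': [20190501, 20190501, 20190502, 20190502],
    '노선명': ['1호선', '2호선', '1호선', '2호선'],
    '승차총승객수': [100, 200, 300, 400],
})

# One row per date, one column per line, cells hold the mean boardings.
p = pd.pivot_table(toy, values='승차총승객수', index='사용일자',
                   columns='노선명', aggfunc='mean').reset_index()

# dayofweek is 0 for Monday through 6 for Sunday.
wday = {0: 'Mon', 1: 'Tue', 2: 'Wed', 3: 'Thu', 4: 'Fri', 5: 'Sat', 6: 'Sun'}
dates = pd.to_datetime(p['사용일자'].astype(str), format='%Y%m%d')

# insert(1, ...) places the weekday column right after the date column.
p.insert(1, 'weekday', dates.dt.dayofweek.map(wday))
print(p)
```

Passing the string `'mean'` as `aggfunc` computes the same aggregate as `np.mean` in the excerpt.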
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/UUdSG/btqDbEGSvPa/kKxK2kKQa2NE7qX8CT4XZ0/img.png)
Correlation analysis: data3.corr()
Covariance: data3.cov()
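As a quick illustration of how the two matrices relate, the sketch below uses a made-up two-column frame in which the second column is exactly twice the first. It also checks that the correlation equals the covariance divided by the product of the standard deviations.

```python
import pandas as pd

# Hypothetical stand-in for data3 (boardings and alightings)
toy = pd.DataFrame({
    '승차총승객수': [1, 2, 3, 4],
    '하차총승객수': [2, 4, 6, 8],
})

corr = toy.corr()  # Pearson correlation matrix by default
cov = toy.cov()    # sample covariance matrix (divides by n - 1)

# Pearson r = cov(x, y) / (std(x) * std(y))
x, y = toy['승차총승객수'], toy['하차총승객수']
r_manual = cov.loc['승차총승객수', '하차총승객수'] / (x.std() * y.std())

print(corr)
print(cov)
print(r_manual)
```

Because the columns are perfectly linearly related, the off-diagonal correlation is 1, while the covariance still depends on the scale of the data.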
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/C137x/btqDech46fB/AMPEGCvhnD3y4aO4EkuKMK/img.png)
data.iloc[:10,[0,3]]
data3=data2[['승차총승객수','하차총승객수']]
data3[:3]
data3.apply(lambda ser: ser.max()-ser.min())
data3.apply(lambda ser: pd.Series([ser.max(),ser.min(),ser.mean(),ser.std()]))
data3.apply(lambda ser: pd.Series([ser.max(),ser.min(),ser.mean(),ser.std()], index=['max','min','mean','std']))
data3.applymap(lambda x: x//10000) # = data3//10000
# applymap: applies the function to each element
data3.applymap(lambda x: x..
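The difference between column-wise `apply` and element-wise `applymap` can be sketched as follows, on invented numbers. Note that recent pandas versions (2.1 and later) deprecate `DataFrame.applymap` in favour of `DataFrame.map`, so the sketch picks whichever is available.

```python
import pandas as pd

# Hypothetical stand-in for data3
toy = pd.DataFrame({
    '승차총승객수': [10000, 20000, 30000],
    '하차총승객수': [15000, 25000, 45000],
})

# apply: the lambda receives one whole column (a Series) at a time.
col_range = toy.apply(lambda ser: ser.max() - ser.min())

# Returning a labelled Series per column builds a small summary table.
stats = toy.apply(lambda ser: pd.Series(
    [ser.max(), ser.min(), ser.mean(), ser.std()],
    index=['max', 'min', 'mean', 'std']))

# Element-wise: the lambda receives one cell at a time.
elementwise = toy.map if hasattr(toy, 'map') else toy.applymap
in_ten_thousands = elementwise(lambda x: x // 10000)

print(col_range)
print(stats)
print(in_ten_thousands)
```

For simple arithmetic such as this, the vectorised `toy // 10000` gives the same result and is usually faster than an element-wise lambda.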
![](http://i1.daumcdn.net/thumb/C150x150/?fname=https://blog.kakaocdn.net/dn/Br0MX/btqDfAP4p89/WA2VT4X6R6JNBfCh2owrLK/img.png)
data2=data
data2['year']=data2.사용일자//10000  # integer division, so the year stays an integer
data2['month']=(data2.사용일자%10000)//100
data2['day']=data2.사용일자%100
theday=pd.to_datetime(data2.사용일자,format='%Y%m%d') # textbook p. 426 (format)
ser=pd.Series(['2020-4-1','2020-4-2'])
ser=pd.to_datetime(ser)
ser.dt.year
data2['year']=theday.dt.year
data2['month']=theday.dt.month
data2['day']=theday.dt.day
wday={0:'월',1:'화',2:'수',3:'목',4:'금',5:'토',6:'일'}
data2['wday..
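The two ways of splitting a `YYYYMMDD` integer, by arithmetic and by parsing into datetimes, can be compared on a couple of sample dates. The Series below is made up, and the cast to `str` before parsing is a defensive choice rather than something the original does.

```python
import pandas as pd

# Hypothetical dates stored as YYYYMMDD integers, like data.사용일자
dates = pd.Series([20190501, 20191231])

# Approach 1: integer arithmetic
year_int = dates // 10000
month_int = (dates % 10000) // 100
day_int = dates % 100

# Approach 2: parse to datetime, then use the .dt accessor
theday = pd.to_datetime(dates.astype(str), format='%Y%m%d')
year_dt = theday.dt.year
month_dt = theday.dt.month
day_dt = theday.dt.day

# Only the datetime route gives calendar information such as the weekday
# (0 = Monday ... 6 = Sunday).
weekday = theday.dt.dayofweek

print(year_int.tolist(), month_int.tolist(), day_int.tolist())
print(weekday.tolist())
```

Both routes agree on year, month, and day. Using `/` instead of `//` in the arithmetic version would produce floats such as 2019.0501 rather than the integer year.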