Posts in the 'Certificate/data analytics-Google' category

[XGBoost] Python

Contents
1. Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pickle
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn import metrics
from xgboost import XGBClassifier
from xgboost import plot_importance
2. Data
3. Feature engineering
airline_data_dummies = pd.get_dummies(airline_data, columns=['satisfaction','C..

Certificate/data analytics-Google 2023.08.08

[Boosting] Python, AdaBoost, Gradient Boosting Machine, XGBoost, learning rate, min_child_weight

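Contents
Boosting
Several classifiers are trained sequentially: the samples that an earlier classifier predicted incorrectly are given more weight, so the next classifier concentrates on them as training and prediction proceed. Boosting performs very well in practice, is used across many industries, and has achieved strong results in Kaggle competitions.
Technique that builds an ensemble of weak learners sequentially, with each consecutive base learner trying to correct the errors of the one before.
1. Similarities
- ensembling technique
- aggregates weak learners
2. Differences
- lea..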
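
The reweighting idea described in this excerpt can be sketched in a few lines of plain Python. This is a minimal AdaBoost-style illustration, not the code from the post: the one-dimensional toy data, the decision-stump weak learner, and the names `fit_stump`, `stump_predict`, `adaboost`, and `predict` are all assumptions made for the example.

```python
import math

def stump_predict(x, threshold, polarity):
    # A decision stump: one threshold on a single feature.
    return polarity if x <= threshold else -polarity

def fit_stump(xs, ys, weights):
    # Choose the threshold/polarity pair with the lowest weighted error.
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(xs, ys, n_rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n          # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        err, threshold, polarity = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # this learner's say in the vote
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples (y * prediction == -1) are scaled up,
        # correctly classified samples are scaled down.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    # Weighted vote of all weak learners.
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: no single threshold separates the two classes.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost(xs, ys, n_rounds=3)
print([predict(ensemble, x) for x in xs])
```

On this toy set a single stump misclassifies three points, while three boosted stumps together reproduce every label, which is the "each learner corrects the one before" behavior the excerpt describes.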

Certificate/data analytics-Google 2023.08.07

[RandomForest] Python, random forest model tuning

Contents
1. Libraries
import numpy as np
import pandas as pd
import pickle as pkl
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, PredefinedSplit, GridSearchCV
from sklearn.metrics import f1_score, precision_score, recall_score, accuracy_score
2. Data
3. Handling null values
air_data_subset = air_data.dropna(axis=0)
4. Encoding
air_data_subset_dummies = pd.get_d..

Certificate/data analytics-Google 2023.08.07

[ensemble] Python, ensemble, voting, pickle, bootstrapping, randomforest, hyperparameter tuning

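Contents
ensemble learning
Building multiple models and aggregating their outputs to make a prediction. Several classifiers are created and their predictions are combined to produce a more accurate final prediction. Combining the predictions of diverse classifiers gives a more reliable result than a single classifier.
1. Voting
Combines classifiers that use different algorithms.
1) Hard Voting
Follows majority rule: the label predicted by the majority of the classifiers becomes the final voting result.
2) Soft Voting
The classifiers' predicted probabilities for each label are summed and averaged, and the label with the highest average probability becomes the final voting result.
2. bagging = bootstrap aggre..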
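
The difference between hard and soft voting in this excerpt can be shown with a small plain-Python sketch. The three probability dictionaries below stand in for the outputs of three hypothetical classifiers, and the labels and function names `hard_vote` and `soft_vote` are made up for illustration; none of this comes from the post itself.

```python
from collections import Counter

def hard_vote(probas):
    # Each classifier votes for its most probable label; the majority wins.
    votes = [max(p, key=p.get) for p in probas]
    return Counter(votes).most_common(1)[0][0]

def soft_vote(probas):
    # Average each label's probability across classifiers; highest average wins.
    labels = probas[0].keys()
    averages = {label: sum(p[label] for p in probas) / len(probas)
                for label in labels}
    return max(averages, key=averages.get)

# Hypothetical class probabilities from three classifiers for one sample.
probas = [
    {"satisfied": 0.9, "dissatisfied": 0.1},
    {"satisfied": 0.4, "dissatisfied": 0.6},
    {"satisfied": 0.4, "dissatisfied": 0.6},
]

print(hard_vote(probas))  # → dissatisfied (two of three votes)
print(soft_vote(probas))  # → satisfied (average 0.57 vs 0.43)
```

The two rules disagree here because one confident classifier outweighs two lukewarm ones once probabilities are averaged, which is the case where choosing between hard and soft voting matters.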

Certificate/data analytics-Google 2023.08.06

[Decision tree] Python, feature_importances_, tree plot, hyperparameter tuning

Contents
1. Libraries
# Standard operational package imports
import numpy as np
import pandas as pd
# Important imports for modeling and evaluation
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import plot_tree
import sklearn.metrics as metrics
# Visualization package imports
import ma..

Certificate/data analytics-Google 2023.08.06

Olivia Coding School