Data Science

Posted on April 2, 2019
Tags: probability

1 Common tactics

Linear Regression, Logistic Regression, Decision Tree, SVM, Naive Bayes, kNN, K-Means, Random Forest, Dimensionality Reduction Algorithms, Gradient Boosting algorithms (GBM, XGBoost, LightGBM, CatBoost)

2 Cross Validation

Estimate how accurately a model performs on data it was not trained on, which helps detect overfitting.

folds: Partitions of the dataset; each fold takes a turn as the validation set while the remaining folds are used for training.

2.1 Non-Exhaustive

2.1.1 Hold-out
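
Hold-out splits the data once into a training part and a test part. A minimal sketch using only the standard library; the function name `hold_out_split` and its parameters `test_fraction` and `seed` are illustrative assumptions, not part of any library.

```python
import random


def hold_out_split(data, test_fraction=0.2, seed=0):
    """Shuffle a copy of `data` and split it once into (train, test)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(data)
    # A seeded generator keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    # Keep at least one sample in the test part.
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]


train, test = hold_out_split(range(10), test_fraction=0.3)
print(len(train), len(test))
```

Because the model is evaluated on a single split, the estimate can vary noticeably with the seed, especially on small datasets.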

2.1.2 K-fold cross-validation
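
K-fold splits the data into k folds and uses each fold once for validation while training on the other k-1. A minimal sketch, assuming contiguous, unshuffled folds; `k_fold_indices` is a hypothetical helper name.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # The first `extra` folds get one additional sample each.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


for train_idx, val_idx in k_fold_indices(10, 3):
    print(val_idx)
```

Every sample appears in exactly one validation fold, so the k scores can be averaged into a single estimate. In practice the data is usually shuffled first unless its order matters.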

2.1.3 Stratified K-fold cross-validation
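
The stratified variant keeps the class proportions in each fold close to those of the whole dataset. A minimal sketch that deals the samples of each class to the folds in round-robin order; `stratified_k_fold_indices` is a hypothetical name, and real implementations balance fold sizes more carefully.

```python
from collections import defaultdict


def stratified_k_fold_indices(labels, k):
    """Yield (train_idx, val_idx) pairs whose folds preserve class ratios."""
    if k < 2:
        raise ValueError("k must be at least 2")
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    # Deal each class's samples to the folds like cards.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)
    all_idx = set(range(len(labels)))
    for val_idx in folds:
        yield sorted(all_idx - set(val_idx)), sorted(val_idx)


labels = [0, 0, 0, 0, 0, 0, 1, 1, 1]
for train_idx, val_idx in stratified_k_fold_indices(labels, 3):
    print([labels[i] for i in val_idx])
```

This matters most for imbalanced classification, where a plain k-fold split could leave a fold with few or no samples of the minority class.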

2.2 Exhaustive

2.2.1 Leave-p-out cross-validation
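
Leave-p-out validates on every possible subset of p samples and trains on the rest, giving C(n, p) splits. A minimal sketch built on `itertools.combinations`; `leave_p_out_indices` is a hypothetical name.

```python
from itertools import combinations


def leave_p_out_indices(n_samples, p):
    """Yield (train_idx, val_idx) for every size-p validation subset."""
    if not 1 <= p < n_samples:
        raise ValueError("p must satisfy 1 <= p < n_samples")
    for val in combinations(range(n_samples), p):
        held_out = set(val)
        train_idx = [i for i in range(n_samples) if i not in held_out]
        yield train_idx, list(val)


splits = list(leave_p_out_indices(4, 2))
print(len(splits))
```

The number of splits grows combinatorially with n and p, so this is only practical for small datasets.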

2.2.2 Leave-one-out cross-validation
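
Leave-one-out is the p = 1 case: each sample is held out once, giving n splits. A minimal sketch that scores a deliberately trivial "predict the training mean" model; the helper name and the toy data are illustrative assumptions.

```python
def leave_one_out_indices(n_samples):
    """Yield (train_idx, val_idx) with exactly one held-out sample per split."""
    for i in range(n_samples):
        yield [j for j in range(n_samples) if j != i], [i]


# Toy data, made up for illustration only.
values = [1.0, 2.0, 3.0, 4.0]

squared_errors = []
for train_idx, val_idx in leave_one_out_indices(len(values)):
    # "Model": predict the mean of the training values.
    prediction = sum(values[j] for j in train_idx) / len(train_idx)
    squared_errors.append((values[val_idx[0]] - prediction) ** 2)

mse = sum(squared_errors) / len(squared_errors)
print(mse)
```

The model is refit n times, so the cost scales with the dataset size.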

2.2.3 Time-Series Rolling cross-validation
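
For time series the validation data must come after the training data, so the split point rolls forward instead of the samples being shuffled. A minimal sketch of an expanding-window scheme; `rolling_origin_splits`, `initial_train` and `horizon` are hypothetical names, and a fixed-size sliding window is an equally common variant.

```python
def rolling_origin_splits(n_samples, initial_train, horizon=1):
    """Yield (train_idx, val_idx) where training always precedes validation."""
    if initial_train < 1 or horizon < 1:
        raise ValueError("initial_train and horizon must be positive")
    if initial_train + horizon > n_samples:
        raise ValueError("not enough samples for a single split")
    end = initial_train
    while end + horizon <= n_samples:
        # Train on everything seen so far, validate on the next `horizon` steps.
        yield list(range(end)), list(range(end, end + horizon))
        end += horizon


for train_idx, val_idx in rolling_origin_splits(6, initial_train=3):
    print(train_idx, val_idx)
```

No future observation ever leaks into training, which is the property ordinary k-fold does not guarantee on ordered data.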