Partools (5 months ago)
Problems with P-values (9 months ago)
Law School Admissions Data | Wealth Bias in the LSAT? | But Aren't There Settings in Which Significance Testing Is of Value? | What Is the Underlying Problem, and Its Implications?
Overfitting (9 months ago)
Clearing the Confusion: a Closer Look at Overfitting | Preparation | Goals | Required Background | Setting and Notation | The "True" Relation between Y and X | Example: estimating the ρ function via a linear model | Example: estimating the ρ function via a k-NN model | Bias and Variance of WHAT | Bias in prediction | Variance in prediction | What about k-NN? | Where Is the "Goldilocks" Level of Complexity? | Dependency on n | A U-shaped curve | Cross-validation | Overfitting with Impunity--and Even Gain? | Baffling behavior--drastic overfitting | How could it be possible? | Baffling behavior--"double descent"
Machine Learning Overview (3 years ago)
The 10-Page Machine Learning Book | Contents | Notation | Running example | The qe-series functions | Example | Hyperparameters | Mean functions | ML predictive methods | A note on prediction | k-nearest neighbors | K-NN edge bias | Random forests | Decision trees | Tree and RF edge bias | Boosting | Linear model | Logistic model | Polynomial-linear models | Shrinkage methods | Overview | LASSO for feature selection | Amount of shrinkage | Support Vector Machines | Neural networks | Structure | Estimation, role of weights | More on the estimation process | Learning rates | Overfitting | Which ML method to use? | What the specialists say | Also consider | Well, then, what algorithm?
Unbalanced Classes (3 years ago)
Clearing the Confusion: A Closer Look at the Issue of Unbalanced Training Data | Outline | Introduction | Motivating examples | Credit card fraud data | Missed appointments data | Optical letter recognition data | Mt. Sinai Hospital X-ray study | Cell phone fraud | Terminology | Notation | Key issue: How were the data generated? | What your ML algorithm is thinking | Artificial balance will not achieve our goals | So, what SHOULD be done? | Approach 1: use the ROC curve | Approach 2: informal, nonmechanical consideration of r (favored choice) | Adjusting for incorrect/changed pi | The adjustment formula | Summary | Appendix A: derivation of the unequal-loss rule | Appendix B: derivation of the adjustment formula | Appendix C: What is really happening if you use equal class probabilities?
PCA and UMAP (3 years ago)
Clearing the Confusion: PCA and UMAP | Needed background | PCA | Example: mlb data | Apply PCA | Key properties | Practical importance of (a) and (b) | Example: fiftyksongs data | Dimension reduction | And What about UMAP?
Feature_Selection (3 years ago)
Why Should We Consider Using Just a Few of Our Features? And How Can We Do This? | Which Method to Use? | How Many Is Too Many? | General principles | Example: NYC Taxi Data | Desiderata | Feature selection methods should produce an ordered sequence of candidate models | Feature Selection Methodology Overview | Methods based on p-values | The LASSO | Methods based on measures of feature importance | Feature Ordering by Conditional Independence (FOCI) | Direct dimension reduction for categorical data
Function List (3 years ago)
Quick Start (3 years ago)
The qeML Package: "Quick and Easy" Machine Learning | "Easy for learners, powerful for advanced users" | What this package is about | Easy model fit--first examples | Prediction | Holdout sets | Tutorials | Full function list, by category | Package author: Norm Matloff, UC Davis
regtools (4 years ago)
Novel tools for linear, nonlinear and nonparametric regression. | FEATURES: | EXAMPLE: PARAMETRIC MODEL FIT ASSESSMENT | EXAMPLE: OVA VS. AVA IN MULTICLASS PROBLEMS | EXAMPLE: ADJUSTMENT OF CLASS PROBABILITIES IN CLASSIFICATION PROBLEMS | MULTICLASS CLASSIFICATION WITH k-NN | EXAMPLE: RECTANGULARIZATION OF TIME SERIES
polyreg (5 years ago)
polyreg, an Alternative to Machine Learning Methods | Motivation | Usage | Example