ML algorithms 1.03: K Nearest Neighbors

Anupam Misra · Published in Geek Culture · 3 min read · Jun 17, 2021



Introduction

K Nearest Neighbors (KNN) is a lazy learner: it does not build a model as such, but stores the training data and, at prediction time, looks at the K training points closest to a new observation. Because predictions depend on distances, features must be scaled before distances are calculated whenever they have different ranges; otherwise the feature with the largest range dominates the distance.
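As a rough sketch of that scaling step, assuming scikit-learn is available and X_train / X_test are numeric feature arrays (the names are illustrative):

from sklearn.preprocessing import MinMaxScaler

# Fit the scaler on the training data only, then reuse it on the test data
# so that both are rescaled to [0, 1] with the same parameters.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)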

Advantages

  1. No training required.
  2. Very easy to interpret.

Disadvantages

  1. We need to find the optimum K (no. of neighbors), and this might change as new data is added.
  2. Prediction becomes slow when the dataset is very large or the no. of dimensions is high, because distances to the training points must be computed for every query. This can be mitigated by using kd-tree and ball tree algorithms.
  3. It is sensitive to noisy data and outliers.

Distance metrics
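Commonly used metrics include Euclidean, Manhattan, Minkowski, Hamming and Mahalanobis distance. A minimal sketch of computing a few of them, assuming scipy is installed (the example vectors are made up):

import numpy as np
from scipy.spatial import distance

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])

print(distance.euclidean(a, b))        # square root of the sum of squared differences
print(distance.cityblock(a, b))        # Manhattan: sum of absolute differences
print(distance.minkowski(a, b, p=3))   # p=1 gives Manhattan, p=2 gives Euclidean
print(distance.hamming([1, 0, 1], [1, 1, 1]))  # fraction of mismatched positions (categorical data)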

Model building

Important points to note:

  1. We need to normalize/standardize the numerical features and encode the categorical features as dummy variables.
  2. If the predictors are a mix of numeric and categorical features, MinMax scaling of the numerical data is preferred. This keeps the values bounded within [0, 1], the same range as the dummy-encoded categorical variables.
  3. If the predictors are all numeric, we can use Euclidean, Manhattan, Minkowski or Mahalanobis distance.
  4. If the predictors are all categorical, we can use Hamming distance.
  5. If the predictors are a mix of numeric and categorical, we have to choose the distance metric carefully.
  6. When the data has very many dimensions, we suffer from the curse of dimensionality. In this case we restrict ourselves to lower powers of p in the Minkowski distance. We can also resort to dimensionality reduction techniques and feature selection.
  7. The best value of K can be found by plotting the error against K, as in the elbow plot and sketch below.
[Figure: error vs. K elbow plot (image by author)]
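A minimal sketch of that search for K, assuming scikit-learn and matplotlib are installed and using the iris dataset purely for illustration:

import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

ks = range(1, 21)
errors = []
for k in ks:
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    errors.append(1 - knn.score(X_test, y_test))  # misclassification error on the test set

plt.plot(ks, errors, marker="o")
plt.xlabel("K (no. of neighbors)")
plt.ylabel("Test error")
plt.show()

The K at the elbow of this curve, where the error stops improving, is a reasonable choice.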

Prediction

The new observation is placed in the feature space of the training data, and its K nearest training points are retrieved.

Classification

The output class is the mode (most frequent class) among the K nearest neighbors.

Regression

The output estimate is the mean of the target values of the K nearest neighbors.

To avoid ties in classification, keep the no. of neighbors odd when the no. of classes is even (for example, an odd K for binary classification); more generally, avoid values of K that are a multiple of the no. of classes.
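As a rough sketch of what both prediction modes do under the hood, using brute-force distances on a tiny made-up dataset (real implementations use kd-trees or ball trees for speed):

import numpy as np

X_train = np.array([[1.0], [2.0], [3.0], [10.0], [11.0]])
y_class = np.array([0, 0, 0, 1, 1])              # class labels
y_value = np.array([1.2, 1.9, 3.1, 9.8, 11.2])   # regression targets
x_new = np.array([2.5])
k = 3

dists = np.linalg.norm(X_train - x_new, axis=1)  # distance to every training point
nearest = np.argsort(dists)[:k]                  # indices of the K closest points

pred_class = np.bincount(y_class[nearest]).argmax()  # modal class among the neighbors
pred_value = y_value[nearest].mean()                 # mean target of the neighbors
print(pred_class, pred_value)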

Hyperparameter tuning

  • n_neighbors: the no. of neighbors (K)
  • weights: 'uniform' or 'distance'
    uniform: all neighbors get equal weight
    distance: nearer neighbors get more weight, inversely proportional to their distance
  • algorithm: how the nearest-neighbor search is carried out
    kd_tree: a K-dimensional tree is built (axis-parallel hyperplane splits)
    ball_tree: the space is partitioned into nested hyperspheres
    brute: all pairwise distances are computed
  • leaf_size: the no. of points in a kd-tree or ball-tree leaf
  • p: power parameter for the Minkowski metric (p=1 is Manhattan, p=2 is Euclidean)
  • metric: the distance metric used by the algorithm
  • metric_params: additional metric parameters, e.g. the covariance matrix for Mahalanobis distance
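A minimal sketch of tuning these hyperparameters with a grid search, assuming scikit-learn and using the iris dataset for illustration (the parameter grid is only an example):

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

param_grid = {
    "n_neighbors": list(range(1, 21)),
    "weights": ["uniform", "distance"],
    "p": [1, 2],                       # Manhattan vs. Euclidean
}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)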

Example code:

from sklearn.neighbors import KNeighborsRegressor  # or KNeighborsClassifier for classification

knn = KNeighborsRegressor(n_neighbors=5)

knn.fit(X_train, y_train)      # "training" only stores the data

y_hat = knn.predict(X_test)    # distances are computed here, at prediction time
