In the classification phase, k is a user-defined constant, and an unlabeled vector (a query or test point) is classified by assigning the label that is most frequent among the k training samples nearest to that query point. A drawback of this basic "majority voting" classification occurs when the class distribution is skewed: examples of the more frequent class tend to dominate the prediction, simply because they are common among the k nearest neighbors. Despite its simplicity, the method has a strong theoretical guarantee: as the amount of data approaches infinity, the two-class k-NN algorithm is guaranteed to yield an error rate no worse than twice the Bayes error rate (the minimum achievable error rate given the distribution of the data). Using an approximate nearest neighbor search algorithm makes k-NN computationally tractable even for large data sets.
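To make the classification phase concrete, here is a minimal brute-force sketch of majority-vote k-NN in Python. The function name `knn_classify` and its parameters are illustrative choices, not part of any particular library; a production system would typically replace the exhaustive distance computation with an approximate nearest-neighbor index.

```python
import numpy as np
from collections import Counter

def knn_classify(query, train_X, train_y, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    A minimal sketch using Euclidean distance and brute-force search;
    names and defaults here are illustrative assumptions.
    """
    # Euclidean distance from the query to every training point.
    dists = np.linalg.norm(train_X - query, axis=1)
    # Indices of the k nearest neighbors. On large data sets, an
    # approximate nearest-neighbor index would replace this step.
    nearest = np.argsort(dists)[:k]
    # Majority vote among the neighbors' labels. With a skewed class
    # distribution, the more frequent class tends to dominate this vote.
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Tiny two-class usage example.
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y = np.array([0, 0, 1, 1, 1])
print(knn_classify(np.array([1.0, 0.9]), X, y, k=3))  # expected: 1
```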