Faculty Advisor or Committee Member
Elke A. Rundensteiner, Advisor
In traditional classification problems, all classes in the test set are assumed to also occur in the training set; this is referred to as the closed-set assumption. In practice, however, new classes may occur in the test set, which reduces the performance of machine learning models trained under the closed-set assumption. Machine learning models should be able to accurately classify instances of classes known during training while concurrently recognizing instances of previously unseen classes (the open-set assumption). This open-set assumption is motivated by real-world applications of classifiers, in which it is improbable that sufficient data can be collected a priori on all possible classes to reliably train for them. For example, in a scenario motivated by the DARPA WASH project at WPI, a disease classifier trained on data collected prior to the outbreak of COVID-19 might erroneously diagnose patients with the flu rather than the novel coronavirus. State-of-the-art open-set methods based on Extreme Value Theory (EVT) fail to adequately model class distributions with unequal variances. We propose the Variational Open-Set Recognition (VOSR) model, which leverages all class-belongingness probabilities to reject unknown instances. To realize the VOSR model, we design a novel Multi-Modal Variational Autoencoder (MMVAE) that learns well-separated Gaussian Mixture distributions with equal variances in its latent representation. During training, VOSR maps instances of known classes to high-probability regions of class-specific components. By enforcing a large distance between these latent components during training, VOSR then assumes that unknown data lies in the low-probability space between components and uses a multivariate form of Extreme Value Theory to reject unknown instances. Our VOSR framework outperforms state-of-the-art open-set classification methods with a 15% increase in F1 score on a variety of benchmark datasets.
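The rejection idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes the latent class components are already known isotropic Gaussians with a shared variance, and it swaps in a plain log-density threshold for the multivariate EVT step. The names `log_gaussian`, `classify_open_set`, `class_means`, and `log_threshold`, along with all numeric values, are hypothetical.

```python
import math

def log_gaussian(z, mean, var):
    """Log density of an isotropic Gaussian N(mean, var * I) at point z."""
    d = len(z)
    sq_dist = sum((zi - mi) ** 2 for zi, mi in zip(z, mean))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq_dist / var)

def classify_open_set(z, class_means, var, log_threshold):
    """Return the index of the most likely known class, or -1 for 'unknown'.

    Assumption: a fixed log-density cutoff stands in for the EVT-based
    rejection rule described in the abstract.
    """
    scores = [log_gaussian(z, m, var) for m in class_means]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] >= log_threshold else -1

# Hypothetical 2-D latent space with two well-separated class components.
class_means = [(0.0, 0.0), (10.0, 0.0)]
var = 1.0              # shared (equal) variance for every component
log_threshold = -6.0   # assumed cutoff; it would need calibration in practice

near_class_0 = classify_open_set((0.5, -0.3), class_means, var, log_threshold)
near_class_1 = classify_open_set((9.6, 0.4), class_means, var, log_threshold)
between = classify_open_set((5.0, 0.0), class_means, var, log_threshold)
```

Under these assumed values, the two points near a component mean are assigned to that class, while the point midway between the components falls below the cutoff for both and is rejected as unknown. Widening the gap between component means enlarges the low-probability region in which rejection happens.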
Worcester Polytechnic Institute
All authors have granted to WPI a nonexclusive royalty-free license to distribute copies of the work. Copyright is held by the author or authors, with all rights reserved, unless otherwise noted. If you have any questions, please contact firstname.lastname@example.org.
Buquicchio, Luke J., "Variational Open Set Recognition" (2020). Masters Theses (All Theses, All Years). 1377.
Open Set Classification, Variational Autoencoder, Extreme Value Theory
Available for download on Saturday, May 08, 2021