
Training Data Generation Framework For Machine-Learning Based Classifiers

In this thesis, we propose a new framework for generating training data for machine learning classifiers used in communications applications. Machine learning-based signal classifiers do not generalize well when the training data does not describe the underlying probability distribution of real signals. The simplest way to achieve statistical similarity between training and testing data is to synthesize training data and pass it through permutations of plausible forms of noise. To that end, the proposed framework implements arbitrary channel conditions and baseband signals. A dataset generated with the framework is examined and shown to be appropriately sized, having $11\%$ lower entropy than state-of-the-art datasets. Furthermore, unsupervised domain adaptation enables powerful generalized training via deep feature transforms on unlabeled evaluation-time signals. A novel application of the Deep Reconstruction-Classification Network (DRCN) is introduced, which attempts to maintain near-peak signal classification accuracy despite dataset bias, or perturbations of the testing data that were not foreseen in training. Together, feature transforms and diverse training data generated by the proposed framework, covering a range of plausible noise, can train a deep neural network to classify signals well in many real-world scenarios despite unforeseen perturbations.
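
As a rough illustration of the idea in the abstract (synthesizing baseband signals and passing them through a grid of channel conditions), the following minimal Python sketch may help. It is not the thesis framework: the two modulations, the phase-offset and additive white Gaussian noise channel model, and every function and parameter name (`make_baseband`, `apply_channel`, `generate_dataset`, `snrs_db`, `phase_offsets`) are assumptions chosen only for the example.

```python
import cmath
import math
import random

# Hypothetical constellation table; the abstract does not list the
# modulations the thesis framework actually supports.
CONSTELLATIONS = {
    "BPSK": [1 + 0j, -1 + 0j],
    "QPSK": [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)],
}


def make_baseband(modulation, num_symbols, rng):
    """Draw random unit-energy symbols from the named constellation."""
    points = CONSTELLATIONS[modulation]
    return [rng.choice(points) for _ in range(num_symbols)]


def apply_channel(symbols, snr_db, phase_offset, rng):
    """Rotate by a fixed phase offset, then add complex white Gaussian noise."""
    noise_power = 10 ** (-snr_db / 10)  # signal power is 1 by construction
    sigma = math.sqrt(noise_power / 2)  # std. dev. per real/imaginary part
    rotation = cmath.exp(1j * phase_offset)
    return [
        s * rotation + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        for s in symbols
    ]


def generate_dataset(snrs_db, phase_offsets, examples_per_condition,
                     num_symbols, seed=0):
    """Return (samples, label) pairs for every modulation and channel condition."""
    rng = random.Random(seed)
    dataset = []
    for modulation in CONSTELLATIONS:
        for snr_db in snrs_db:
            for phase in phase_offsets:
                for _ in range(examples_per_condition):
                    clean = make_baseband(modulation, num_symbols, rng)
                    noisy = apply_channel(clean, snr_db, phase, rng)
                    dataset.append((noisy, modulation))
    return dataset


# Assumed example grid: 2 modulations x 3 SNRs x 2 phase offsets x 5 examples.
dataset = generate_dataset(
    snrs_db=[0, 10, 20],
    phase_offsets=[0.0, math.pi / 8],
    examples_per_condition=5,
    num_symbols=128,
)
```

Sweeping every modulation against every channel condition is what gives a classifier exposure to a range of plausible noise. A fuller generator would presumably add further impairments (for example frequency offset or fading), which this sketch omits.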

Language
  • English
Identifier
  • etd-121818-231026
Year
  • 2018
Date created
  • 2018-12-18
Last modified
  • 2023-08-10

Permanent link to this page: https://digital.wpi.edu/show/ms35t877m