Faculty Advisor or Committee Member

Jacob R. Whitehill, Advisor

Identifier

etd-3771

Abstract

Semantic segmentation methods using deep neural networks typically require huge volumes of annotated data to train properly. Due to the expense of collecting these pixel-level dataset annotations, the problem of semantic segmentation without ground-truth labels has been recently proposed. Many current approaches to unsupervised semantic segmentation frame the problem as a pixel clustering task, and in particular focus heavily on color differences between image regions. In this paper, we explore a weakness to this approach: By focusing on color, these approaches do not adequately capture relationships between similar objects across images. We present a new approach to the problem, and propose a novel architecture that captures the characteristic similarities of objects between images directly. We design a synthetic dataset to illustrate this flaw in an existing model. Experiments on this synthetic dataset show that our method can succeed where the pixel color clustering approach fails. Further, we show that plain autoencoder models can implicitly capture these cross-instance object relationships. This suggests that some generative model architectures may be viable candidates for unsupervised semantic segmentation even with no additional loss terms.

Publisher

Worcester Polytechnic Institute

Degree Name

MS

Department

Computer Science

Project Type

Thesis

Date Accepted

2020-05-13

Accessibility

Unrestricted

Subjects

neural networks

Share

COinS