Department of Computer Science at UH

University of Houston

Department of Computer Science

In Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy

Francisco Ocegueda-Hernandez

Will defend his dissertation


An Empirical Study of the Suitability of Class Decomposition for Linear Classifiers

Abstract

The presence of subclasses within a data sample suggests a class decomposition approach to classification, in which each subclass is treated as a new class. Class decomposition can be carried out with multiple linear classifiers in an attempt to outperform a single global linear classifier; the goal is to gain model complexity while keeping error variance low. In this dissertation, we propose a study aimed at understanding the conditions behind the success or failure of class decomposition when combined with linear classifiers. We identify two relevant data properties as indicators of the suitability of class decomposition: 1) linear separability; and 2) class overlap. We use well-known data complexity measures to evaluate the presence of these properties in a data sample. Our methodology indicates when to avoid performing class decomposition based on such data properties. In addition, we conduct a similar analysis at a more granular level for data samples marked as suitable for class decomposition. This extra analysis shows how to improve efficiency during class decomposition. From an empirical standpoint, we test our technique on several real-world classification problems; the results validate our methodology.
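As a rough illustration of the idea described in the abstract, the sketch below contrasts a single global linear classifier with class decomposition on a toy problem. It is not the dissertation's method: the XOR-style data, the small k-means routine used to find subclasses, the choice of two subclasses per class, and the nearest-centroid rule used as the linear classifier are all assumptions made for the example, and no data complexity measures are computed.

```python
# Illustrative sketch only: toy data, k-means, and nearest-centroid linear
# discriminants are assumptions, not the procedure used in the dissertation.

def mean(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k=2, iters=10):
    """Tiny deterministic k-means: farthest-point seeding, then Lloyd updates."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sqdist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda j: sqdist(p, centers[j]))
            groups[j].append(p)
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

def fit_linear(centroids):
    """Nearest-centroid rule written as linear discriminants g_k(x) = w_k.x + b_k."""
    return [(c, -0.5 * sum(v * v for v in c)) for c in centroids]

def predict(model, labels, x):
    scores = [sum(wi * xi for wi, xi in zip(w, x)) + b for w, b in model]
    return labels[scores.index(max(scores))]

# Toy XOR-style sample: each class is made of two well-separated subclasses.
data = {
    0: [(0, 0), (0, 1), (1, 0), (4, 4), (4, 5), (5, 4)],
    1: [(0, 4), (0, 5), (1, 4), (4, 0), (5, 0), (4, 1)],
}
classes = sorted(data)

def accuracy(model, labels):
    pairs = [(x, y) for y, pts in data.items() for x in pts]
    return sum(predict(model, labels, x) == y for x, y in pairs) / len(pairs)

# Baseline: one linear discriminant per original class.
baseline = fit_linear([mean(data[y]) for y in classes])
baseline_acc = accuracy(baseline, classes)

# Class decomposition: cluster each class, treat every cluster as a new class,
# then map each predicted subclass back to its parent class.
sub_centroids, sub_to_class = [], []
for y in classes:
    for c in kmeans(data[y], k=2):
        sub_centroids.append(c)
        sub_to_class.append(y)
decomposed = fit_linear(sub_centroids)
decomposed_acc = accuracy(decomposed, sub_to_class)

print("global linear classifier accuracy:", baseline_acc)
print("class decomposition accuracy:", decomposed_acc)
```

On this toy sample the two original classes are not linearly separable, so the single global model cannot fit them, while the four subclasses are separable and the decomposed model fits the training points. Real data would not necessarily behave this way, which is the question the dissertation studies.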

 

Date: Thursday, November 15, 2012
Time: 2:30 PM
Place: 218-PGH

Faculty, students, and the general public are invited.
Advisor: Dr. Ricardo Vilalta