
University of Houston

Department of Computer Science

In Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy

Bassam Almogahed

Will defend his PhD dissertation proposal


Classification of Imbalanced Data

Abstract

An unprecedented amount of data is now available, and it continues to grow by the second from sources such as the internet, surveillance cameras, and financial transactions. As a result, the field of knowledge discovery has attracted considerable attention in recent years from both researchers and industry, and many successful algorithms and tools have been developed. However, many real-world datasets are imbalanced. Learning from imbalanced data still poses a major challenge and has been recognized as a significant research problem.

The problem with imbalanced data centers on the performance of learning algorithms in the presence of underrepresented data and severely skewed class distributions. Models trained on imbalanced datasets strongly favor the majority class and largely ignore the minority class. In recent years, several solutions have been proposed at both the algorithmic and data levels. Nevertheless, both kinds of approach have been criticized for lack of generalization, loss of important information, or overfitting.
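As a rough illustration of the majority-class bias described above, the following Python sketch uses a hypothetical 95/5 label split (not data from this research). It shows that a model which always predicts the majority class scores high accuracy while never detecting the minority class. It also shows plain random oversampling, a generic data-level baseline that is assumed here for illustration only and is not the sampling method proposed in this dissertation. All function names are illustrative.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true `positive` examples that were predicted as `positive`."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


def random_oversample(labels, minority=1, seed=0):
    """Return example indices with minority indices duplicated at random
    until the minority class is as large as the rest of the data.

    Generic baseline only; duplicating examples is one known source of
    the overfitting criticism mentioned in the text.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    if not minority_idx:
        raise ValueError("no minority examples to oversample")
    shortfall = max(0, len(majority_idx) - len(minority_idx))
    extra = [rng.choice(minority_idx) for _ in range(shortfall)]
    return majority_idx + minority_idx + extra


# Hypothetical dataset: 95 majority (0) and 5 minority (1) labels.
labels = [0] * 95 + [1] * 5

# A degenerate "classifier" that always predicts the majority class.
always_majority = [0] * len(labels)

majority_accuracy = accuracy(labels, always_majority)
minority_recall = recall(labels, always_majority)

# Indices of a balanced resample of the same data.
balanced_idx = random_oversample(labels)
balanced_labels = [labels[i] for i in balanced_idx]
```

Here the always-majority predictor is correct on 95 of 100 examples yet finds none of the 5 minority examples, which is why accuracy alone is a poor measure on skewed class distributions.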

The goal of this research is to develop algorithms that balance imbalanced datasets so that a classifier can reach optimal predictions. The specific objectives are to: (i) develop a sampling method for imbalanced data, (ii) automatically estimate the sample size increase needed for the classifier to provide optimal predictions, (iii) evaluate the performance of this method on a variety of imbalanced datasets, and (iv) develop a new risk prediction framework for cardiovascular disease using the methods developed in objectives (i) and (ii).

 

Date: Tuesday, May 7, 2013
Time: 1:00 PM
Place: HBS 350

Faculty, students, and the general public are invited.
Advisor: Prof. Ioannis Kakadiaris
Committee: Profs. Christoph Eick, Shishir Shah, Panagiotis Tsiamyrtzis, and Ricardo Vilalta