[Defense] Visual Summarization of Lecture Videos to Enhance Navigation
Friday, May 7, 2021
3:00 pm - 5:00 pm CT
Mohammad Rajiur Rahman
will defend his dissertation
Visual Summarization of Lecture Videos to Enhance Navigation
Recorded lecture video is a popular and essential learning resource. A fundamental limitation of lecture video is the inability to quickly access any content of interest. Several lecture video management portals have introduced additional navigation features such as indexing, captioning, and search. Lecture video indexing is the automatic partitioning of a video into smaller segments, each discussing a particular topic. My goal is to create a visual summary containing a subset of images extracted from a lecture video segment to enhance navigation. The quality of a visual summary depends on the uniqueness and importance of the images. Uniqueness is achieved by ensuring a diverse set of images with low pairwise similarity. Importance is the desirability of including an image in the summary. I explored different methods for measuring similarity and found that a combination of keypoint matching and color histograms works best to identify unique objects for a visual summary. The importance of an image can be captured from attributes such as its size, on-screen duration, entropy, and number of keypoints. Experimental results indicate that a combination of image size and number of keypoints can closely approximate the desirability or importance of an image for inclusion in the summary. With similarity and importance values calculated for all images, the next problem is to develop a method to select a set of visually different and relatively important images.
This dissertation presents a graph-based algorithm to solve this problem. It takes all images of a segment, along with their similarity and importance values, as inputs and selects a subset of unique and important images for a visual summary. The results from this research are implemented in a real-world lecture video management portal called Videopoints. The evaluation is based on summaries provided by Videopoints users on a dataset of 120 video segments. The graph-based algorithm for identifying summary images achieves a 66% F1-measure with frequently selected images as the ground truth and a 74% F1-measure with the union of all user-selected images as the ground truth. For 93.8% of algorithm-selected visual summary images, at least one user also selected that image for their summary or considered it similar to another image they selected. Over 70% of automatically generated summaries were rated as good or very good by the users on a 4-point scale from poor to very good. Overall, the results establish that the methodology introduced in this dissertation produces good quality visual summaries that are practically useful for lecture video navigation.
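One plausible form for a graph-based selection of this kind is a weighted independent-set heuristic: treat images as nodes, connect pairs whose similarity exceeds a threshold, and greedily pick the most important image whose neighbors have not already ruled it out. This sketch is an assumption about the general approach, not the dissertation's actual algorithm; the threshold, summary size, and all names are illustrative.

```python
def select_summary(importance, similarity, threshold=0.5, k=4):
    # importance: list of per-image importance scores
    # similarity: n x n matrix of pairwise similarity in [0, 1]
    # threshold, k: illustrative assumptions, not the dissertation's values
    n = len(importance)

    # Build the graph: an edge joins two images that are too similar
    # to both appear in the summary.
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] > threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)

    selected, excluded = [], set()
    # Visit images from most to least important; skip any image that
    # is a near-duplicate of one already chosen.
    for i in sorted(range(n), key=lambda idx: -importance[idx]):
        if i in excluded:
            continue
        selected.append(i)
        excluded |= neighbors[i]
        if len(selected) == k:
            break
    return sorted(selected)
```

For example, if images 0 and 1 are near-duplicates, the heuristic keeps whichever of the two is more important and fills the rest of the summary from the remaining images.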
Online via MS Teams
Drs. Jaspal Subhlok and Shishir Shah, dissertation advisors
Faculty, students, and the general public are invited.