What We Do
Consumer-produced video is the fastest-growing type of content on the Internet. For example, YouTube claims that 100 hours of video are uploaded to its website every minute. At this rate, the amount of available consumer-produced video will grow by 53 million hours in the next year on YouTube alone — and that is only one video-hosting site among many.
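The projection above follows from simple arithmetic, which a short sketch can verify (the 100 hours/minute rate is YouTube's own claim, as stated above):

```python
# Back-of-the-envelope check of the upload-rate projection above.
hours_per_minute = 100            # YouTube's claimed upload rate
minutes_per_year = 60 * 24 * 365  # minutes in a non-leap year

hours_per_year = hours_per_minute * minutes_per_year
print(f"{hours_per_year:,} hours/year")  # 52,560,000 -> roughly 53 million
```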
In addition to entertainment, these videos provide a wealth of information about the world. Their subject matter includes personal history, instructions, and snapshots of life in the place and time where the video was recorded. Beyond the content their producers intended to capture, they also include incidental information in the form of background sounds, background visuals, and contextual artifacts. Individually, these videos are direct records of the world. As a collection, they represent a compendium of emergent information that goes far beyond what was captured in the individual recordings. They provide data on trends and societal dynamics and evidence of events and phenomena. As a result, they are useful for qualitative and quantitative empirical research on a larger scale than has ever been possible before — if we can automatically analyze the content of the recordings.
Multimedia research focuses on the scientific problems that arise from the complementary nature of the different data sources in a recorded document, each of which captures only partial information. Our Audio and Multimedia group at ICSI is developing computational algorithms, systems, and methods to handle content that is composed of multiple types of data, such as consumer-produced videos. We take a top-down viewpoint, studying how computer systems should be designed to analyze and integrate the different types of information in a document, or across documents. These include not only the audio and/or visual content of the recording but also metadata, such as geo-tags, and information about the context in which the content is presented, such as the producer's goals and the content's social implications.
Research in the Audio and Multimedia group grew out of work in ICSI’s Speech group, and our approach often focuses on audio analysis. Audio content is frequently complementary to visual content, as in videos, and its lower data rates can make processing more tractable. Other work in Audio and Multimedia includes experimenting with targeted crowdsourcing for research and exploring the privacy implications of multimedia retrieval techniques.
This site archives our older projects and some of our current ones, but our newest projects are not included. Email Gerald Friedland at fractor[chez]icsi[stop]berkeley[stop]edu to find out what we’re currently up to.