(archive site)

Demos



Crowdsourcing Location Estimation Demo

A Location Estimation Tutorial

The following scenario is based on the tutorial we developed for crowd workers on Amazon Mechanical Turk.

This very brief video seems difficult to place at first, as there are no tags or other obvious indicators of location associated with it, and there is no overlaid text informing the viewer where it was filmed.

However, there is a neon sign in the background saying “City Market Building”.

A Google search on ‘City Market Building’ returns The City Market in downtown Roanoke, VA as its first result. The search results also include images that clearly depict the building we saw in the clip, with titles like “Roanoke City Market Building”, confirming that Roanoke’s City Market is the one we are looking for.

Conveniently, when given the name of a recognized landmark, Google also provides a map with a pointer. We have successfully found the location of the video!



Joke-O-Mat Demos

Joke-O-Mat v1 Demos (2009)

To contrast human and machine performance, we created two demos: One where the acoustic-event identification and speaker identification were based on manual annotation (only scene segmentation was automatic), and one where all of the narrative elements were identified using a fully automated process. These Java applets were developed under Linux with Firefox.

Automatic Seinfeld:

The Turing Test — Human vs. Machine:

Which one do you like better?

It’s Not Just About Seinfeld:

These applets require a Java-compatible browser with Java 1.4 or higher — which unfortunately does not include Chrome for the Mac. Please be patient; the Joke-O-Mat experience starts with downloading the video content (about 30MB), and sometimes the navigation bar takes a few moments to load all of its many useful features.

Joke-O-Mat HD Demos (2010)

Continuing our comparison of human, machine, and human-and-machine performance, we created three demos for the upgraded Joke-O-Mat 2010 with keyword filtering. The first was generated by using human-derived textual data — i.e., fan-sourced transcripts and closed captioning — aligned with automatically identified narrative elements. The second was generated fully automatically, including using automatic speech recognition (ASR) to create a transcript, and the third was generated using expert human annotations of acoustic events and speaker identifications (but no word-by-word transcription).

Using Humans as Data Sources:

Tangled Turing — Human or Machine vs. Human-and-Machine:

Emergency Joke-O-Mat:

Notifications

Disclaimer: The software is provided as is; use it at your own risk.

Copyright Information: The Joke-O-Mat Browsers and associated explanatory text are copyright (c) 2009, 2010, 2013 by the International Computer Science Institute. The episodes of Seinfeld, I Love Lucy, and The Big Bang Theory used in our demos are copyrighted by the respective copyright holders. The excerpts are used here solely for the purpose of demonstrating nonprofit scientific research. Please do not illegally use or distribute copyrighted content.



Meeting Diarist Demo

The Diarist Demo shows how you can use the Meeting Diarist tool to navigate a meeting, in this case one recorded using a smartphone.

Unfortunately, the Meeting Diarist demo is not compatible with Chrome for the Mac, which does not support Java.



Meeting Dominance Estimation Demo

This video compares different methods for estimating the total speaking time for each speaker, in order to estimate who is the most dominant person in a meeting.



The topmost graph shows the speaker segmentation and speaking-time measurement using a single microphone in the middle of the table, with the segmentation performed by our speaker diarization system on that single recording. The middle graph shows the segmentation and speaking-time measurement using threshold activation on each person’s individual microphone.

The bar graphs on the bottom left show the cumulative estimates of the most dominant speaker using different experimental strategies: Experiments 1 and 2 used two different variants of the speaker diarization system on the single audio recording alone, while Experiment 3 combined diarization with activation data from the individual microphones. The bar graphs on the bottom right show the ground-truth speaking time for each speaker according to the reference segmentation.
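The cumulative-speaking-time idea behind these dominance estimates can be sketched in a few lines. The segment format and speaker labels below are illustrative, not the actual output of our diarization system:

```python
# Estimate the most dominant meeting participant from speaker segments.
# Each segment is (speaker, start_seconds, end_seconds) -- illustrative data.
from collections import defaultdict

def speaking_time(segments):
    """Accumulate total speaking time per speaker."""
    totals = defaultdict(float)
    for speaker, start, end in segments:
        totals[speaker] += end - start
    return dict(totals)

def most_dominant(segments):
    """Return the speaker with the largest cumulative speaking time."""
    totals = speaking_time(segments)
    return max(totals, key=totals.get)

segments = [
    ("A", 0.0, 12.5), ("B", 12.5, 15.0),
    ("A", 15.0, 30.0), ("C", 30.0, 34.0),
    ("B", 34.0, 40.0),
]
print(most_dominant(segments))  # "A": 27.5 s of the 40 s total
```

The same tally works whether the segments come from diarization on a single recording or from per-microphone threshold activation; only the source of the segments differs.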



PyCASP Demos

Music Recommendation Demo: Pardora

One of the components of PyCASP (Python-based Content Analysis using SPecialization) allows fast training of Gaussian Mixture Models (GMMs) on a GPU. We used this component to build a music recommendation engine called Pardora, demonstrated in this video. Pardora quick-trains a model from a song entered as a query — or a set of songs with some feature, such as genre, entered as a query — and uses it to identify the nearest neighbors to that song or songs in the Million Song Dataset, based on acoustic features of the recordings.
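The quick-train-and-rank idea can be sketched as follows, with scikit-learn's GaussianMixture standing in for PyCASP's GPU-accelerated GMM component. The feature vectors and song names are synthetic stand-ins, not real acoustic features from the Million Song Dataset:

```python
# Sketch: train a GMM on a query song's frame-level acoustic features,
# then rank candidate songs by average log-likelihood under that model.
# scikit-learn stands in for PyCASP's GPU GMM trainer (assumption).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic "acoustic features": one (frames x dimensions) array per song.
query_frames = rng.normal(loc=0.0, scale=1.0, size=(200, 12))
candidates = {
    "similar_song": rng.normal(0.0, 1.0, size=(150, 12)),
    "different_song": rng.normal(5.0, 1.0, size=(150, 12)),
}

# Quick-train a small mixture model on the query's frames.
gmm = GaussianMixture(n_components=4, random_state=0).fit(query_frames)

# Score each candidate: a higher mean log-likelihood means the candidate's
# frames look more like the query's, i.e., it is a nearer neighbor.
scores = {name: gmm.score(frames) for name, frames in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0])  # the candidate drawn from the query's distribution ranks first
```

For a set of query songs sharing some feature (e.g., a genre), the same sketch applies with the frames of all query songs stacked into one training matrix.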

Speaker Diarization Demo: Experimenting with Energy

This video compares the time and energy required to execute the speaker-identification application in our Joke-O-Mat video browser on a multicore CPU when we vary the number of threads in the PyCASP-based parallel implementation.
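A generic harness for this kind of thread-count sweep might look like the sketch below. It is not the actual PyCASP benchmark; the workload function is illustrative, and ThreadPoolExecutor stands in for the PyCASP parallel backend. For CPU-bound pure-Python code the GIL limits real speedup, so this only illustrates the measurement setup:

```python
# Sketch: time the same workload at several thread-pool sizes.
import time
from concurrent.futures import ThreadPoolExecutor

def chunk_work(n):
    """Stand-in for per-chunk audio-analysis work."""
    return sum(i * i for i in range(n))

def run_benchmark(thread_counts, chunks):
    """Return {thread_count: wall-clock seconds} for the full workload."""
    timings = {}
    for n in thread_counts:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=n) as pool:
            list(pool.map(chunk_work, chunks))
        timings[n] = time.perf_counter() - start
    return timings

timings = run_benchmark([1, 2, 4], [50_000] * 8)
for n, t in sorted(timings.items()):
    print(f"{n} threads: {t:.3f} s")
```

Energy measurement would additionally require reading hardware counters (e.g., RAPL on Intel CPUs), which is outside the scope of this sketch.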




SMASH Demo

A MED-ium Cup of audioCaffe

We brewed up this demonstration experiment for the audioCaffe content-analysis tool using preliminary data from the YLI Multimedia Event Detection (MED) subcorpus.

Sample the audioCaffe + YLI demo here



Teaching Privacy Demos

Educational Site

The Teaching Privacy Website presents ten principles to describe how social-media privacy works, with detailed explanations, practical guidance, and classroom resources.

Hands-On Learning Tool

Ready or Not? is our first educational app, demonstrating how geotagged social-media posts can be used to predict someone’s daily routine.
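The underlying idea can be sketched simply: bin a person's geotagged posts by weekday and hour, then predict the most frequent location for each bin as their routine. The post format and location labels below are hypothetical, not Ready or Not?'s actual data model:

```python
# Sketch: predict a "routine" from geotagged posts by finding, for each
# (weekday, hour) bin, the location a person posts from most often.
from collections import Counter, defaultdict
from datetime import datetime

def build_routine(posts):
    """posts: list of (iso_timestamp, location_label) pairs.
    Returns {(weekday, hour): most_common_location}."""
    bins = defaultdict(Counter)
    for timestamp, location in posts:
        t = datetime.fromisoformat(timestamp)
        bins[(t.weekday(), t.hour)][location] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in bins.items()}

# 2013-04-01 and 2013-04-08 are both Mondays (weekday 0).
posts = [
    ("2013-04-01T09:05:00", "campus cafe"),
    ("2013-04-08T09:20:00", "campus cafe"),
    ("2013-04-08T09:40:00", "library"),
    ("2013-04-01T18:30:00", "gym"),
]
routine = build_routine(posts)
print(routine[(0, 9)])   # "campus cafe" -- the modal Monday-9am location
print(routine[(0, 18)])  # "gym"
```

Even this toy version shows why geotags are sensitive: a handful of posts already suggests where someone is likely to be at a given time of week.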