


PyCASP (Python-based Content Analysis using SPecialization) is designed to offer a single software development environment for audio content-analysis applications. It is a pattern-oriented, application-specific specialization framework that uses a tiered approach to parallel programming to automatically generate optimized parallel implementations of audio content-analysis algorithms from Python code.

The framework comprises several components that automatically map computations typical of content-analysis applications onto parallel platforms. Using PyCASP, applications can be prototyped in a couple of hundred lines of Python code and automatically scaled to modern parallel processors such as multi-core CPUs (central processing units) and GPUs (graphics processing units). Applications written with PyCASP are portable to a variety of hardware platforms and scale efficiently from a single desktop GPU to an entire cluster with only a small change to application code.
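As a rough illustration of the idea in the paragraph above, the following sketch shows a component whose application-level call stays the same while the implementation behind it is swapped by name. It is not PyCASP's API: `FrameEnergy`, `register_backend`, and the backend names `serial` and `threads` are hypothetical, and a thread pool stands in for the generated multi-core or GPU code that a real specializer would produce.

```python
# Illustrative sketch only: every name below is hypothetical, not PyCASP's API.
from concurrent.futures import ThreadPoolExecutor

_BACKENDS = {}  # backend name -> implementation of the same computation


def register_backend(name):
    """Decorator that records one implementation under a backend name."""
    def decorator(fn):
        _BACKENDS[name] = fn
        return fn
    return decorator


def _split(samples, frame_len):
    """Cut the signal into non-overlapping frames, dropping any remainder."""
    last = len(samples) - frame_len + 1
    return [samples[i:i + frame_len] for i in range(0, last, frame_len)]


def _mean_square(frame):
    """Average energy of a single frame."""
    total = 0.0
    for x in frame:
        total += x * x
    return total / len(frame)


@register_backend("serial")
def _serial(samples, frame_len):
    # Reference version: process frames one after another.
    return [_mean_square(f) for f in _split(samples, frame_len)]


@register_backend("threads")
def _threads(samples, frame_len):
    # Stand-in for an optimized backend: frames are independent, so the
    # same per-frame function can be mapped over a pool of workers.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(_mean_square, _split(samples, frame_len)))


class FrameEnergy:
    """Component whose call site stays the same whichever backend runs it."""

    def __init__(self, frame_len, backend="serial"):
        if backend not in _BACKENDS:
            raise ValueError(f"unknown backend: {backend!r}")
        self.frame_len = frame_len
        self._impl = _BACKENDS[backend]

    def __call__(self, samples):
        return self._impl(samples, self.frame_len)


samples = [1.0, 2.0, 3.0, 4.0]
print(FrameEnergy(2, backend="serial")(samples))   # [2.5, 12.5]
print(FrameEnergy(2, backend="threads")(samples))  # [2.5, 12.5]
```

Only the `backend` argument changes between the two calls, which mirrors the "small change to application code" described above. The thread pool here is purely a placeholder for dispatch; it is not expected to speed up pure-Python arithmetic.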

Caption: The components covered by PyCASP; based on the taxonomy in Keutzer and Mattson 2010.

Common application patterns determine which algorithms the components implement; each algorithm is classified by the kind of computation it involves, according to a typology of computational patterns. The framework also draws on a set of structural patterns to guide how components are composed. Because of this systematic, pattern-oriented approach, PyCASP is modular and tractable in scope, yet flexible enough to apply to a wide variety of applications.

The PyCASP framework has been used for a variety of audio content-analysis applications, including a state-of-the-art speaker diarization application, a content-based music-recommendation system based on the Million Song Dataset, and a video event-detection system for consumer-produced videos. Across this wide range of applications, PyCASP combines easy prototyping in a high-level language with the performance of optimized low-level code.

Project Results

Source Code:

The source code and documentation for PyCASP are available on GitHub.

PyCASP Publications


PyCASP is a collaboration between researchers at ICSI and the Parallel Computing Laboratory (ParLab) at the University of California, Berkeley.

Researchers @ ICSI:

  • Gerald Friedland

Collaborators @ ParLab:

  • Eric Battenberg
  • Henry Cook
  • Michael Driscoll
  • Armando Fox
  • Evangelos Georganas
  • Ekaterina Gonina
  • Shoaib Kamil
  • Kurt Keutzer
  • Penporn Koanantakool


Research on PyCASP was supported by Microsoft (Award #024263) and Intel (Award #024894) funding, with matching funding from UC Discovery (Award #DIG07-10227); by ParLab affiliates National Instruments, Nokia, NVIDIA, Oracle, and Samsung; and by the Intelligence Advanced Research Projects Activity (IARPA) (Contract #D11PC20066). The opinions, findings, and conclusions described on this website are those of the researchers and do not necessarily reflect the views of the funders.