Associate Professor of Computer Science Charless Fowlkes is helping advance biomedical image analytics with a course on Big Data Image Processing and Analysis (BigDIPA). The intensive weeklong course is part of a three-year NIH-sponsored project he’s leading with Assistant Professor of Biomedical Engineering Michelle Dignam. The goal is to train researchers to work with complex big data. Fowlkes explains that “people discover they’ve filled up their hard drive with all this beautiful image data, but they don’t quite know what to do next.”
Teaching High-Level Analytics
“We want to take people from this point of collecting data on a microscope, to doing some basic analysis of the images and producing some intermediate results, to conducting a higher-level statistical analysis of the data,” says Fowlkes. The course was first offered last fall and was revised, based on attendee feedback, before the recent Fall 2017 session. Material is now better integrated across the lectures, so students can collect data using a microscope one day and then analyze that same data later in the week during a lab session.
“We have them do lots of different hands-on activities, so suddenly something like machine learning that’s very complicated becomes less mysterious and foreboding,” notes Fowlkes. And while he doesn’t expect the students to walk away as experts in big data analytics after one week, he hopes that they gain a broad understanding of cutting-edge techniques that will help jump-start collaborations with other experts.
In fact, multidisciplinary collaboration is the high-level challenge. According to Fowlkes, “You can’t solve all of the problems yourself. To make progress, you need these bigger, interdisciplinary teams.” He adds that researchers in the natural sciences understand this because they see the need for computation. “They’re banging on my door, asking for help,” he says, which has opened his eyes to new research questions in computer science.
“Unless your door is being banged on, the interdisciplinary need isn’t as obvious. But then you realize there are all sorts of problems of interest on the computing side,” notes Fowlkes. For example, how do you store and organize big data, scale up computation on a cluster of computers for processing, and visualize the data? In particular, how do you let students access 300 gigabytes of image data using their laptops? “We’re trying to teach them about analyzing big data sets that require lots of processing and storage that can’t be run on laptops,” explains Fowlkes. So this year, course tutorials involved writing and running code through a web interface, while the computation itself ran remotely “in the cloud” on Cal IT2 servers at UC Irvine and San Diego. “In the long run, it’s not just us handing tools to someone. It feeds back into our discipline as well.”
Sharing Course Material
Feedback from attendees has thus far been very positive. “They’ve all been thrilled,” says Fowlkes. Furthermore, he says they’re trying to post some of the material online. “One thing we’re trying to do in the final year is to make this material more accessible to people outside the course. The goal, at least for some of the modules, is to encapsulate them in a way that they can be standalone online.” They’re in the process of editing down videotaped lectures and planning for the final course, which will be held Sept. 17-21, 2018. For more information, visit http://bigdipa.ccbs.uci.edu/.
— Shani Murray