There’s a new kind of researcher on campus, one who doesn’t fit into the usual nooks and crannies at a university.
They are data scientists (students, faculty members and staff) who are building the tools and crafting the methods to help scientists analyze the vast amounts of data now abundant in every field, from the physical and social sciences to the humanities, natural sciences and engineering. Their skill set is inherently interdisciplinary, but the university system doesn’t always reward them for the time they spend developing techniques and software to advance science.
These data scientists are sought after by industry, whether to mine customers’ preferences for more targeted advertising or to analyze traffic patterns to build more sensible roadways, and they are also needed in academia to process gene sequences or astronomical amounts of star data. But traditional university career paths can be a poor fit for these experts.
This dilemma, and what universities can do to change it, is the topic of a symposium Feb. 15 at the American Association for the Advancement of Science annual meeting in San Jose, California. The session, “Advancing University Career Paths in Interdisciplinary Data-Intensive Science,” is led by University of Washington faculty members and brings together experts from the University of California, Berkeley, and New York University.
At the UW, an interdisciplinary organization called the eScience Institute, which recently was awarded several prestigious grants, is advancing the research and practice of data-intensive discovery across campus. It does so in part by attracting data scientists to explore new career paths that blend independent research, interdisciplinary consulting and teaching, and the development of new software and methods.
Bill Howe, associate director of the eScience Institute and co-organizer of the conference symposium, will talk about how the UW’s programs are designed to help researchers interact with industry partners, particularly to make big-data analysis techniques and methods easier for everyone to use.
“We are trying to centralize these data-scientist roles at universities and give them the prestige and autonomy they would receive in similar industry jobs,” Howe said. “This could ultimately attract more early career researchers and practitioners to the field.”
The eScience Institute also has established a new postdoctoral fellow program to explicitly identify and reward young researchers who operate at the intersection of their own domain and data science. By building a community of these rising stars and helping to position them for prestigious faculty positions, UW eScience aims to promote a model of interdisciplinary data-intensive science as the norm rather than the exception, Howe added.
The UW’s presenters will talk about their early successes in bringing data science to campus, including:
– A new data science studio: A physical space on campus open to anyone who needs help with big data or wants to exchange ideas and techniques for working with large datasets. The UW’s studio opened in January and has been busy, Howe said, adding that the in-person “water-cooler effect” has been important for the collaborations taking place there.
– The data science incubator program: Research labs from across campus each send one person to work side by side with data scientists two days a week for an academic quarter. The goal is to train researchers to tackle their big-data projects and then bring those skills back to their respective labs. The studio also hosts more informal office hours for researchers to ask for guidance on smaller projects.
– A data science seminar series: Brings together thought leaders from universities and industry to talk about topics related to data analysis, visualization and applications to other fields.
– A new doctoral track in big data: Graduate students in a number of participating departments take courses and focus a portion of their research on methods in data-intensive science.
The symposium’s presentations and speakers are:
– “The Moore/Sloan Data Science Environments: Advancing Data-Intensive Discovery,” Ed Lazowska, University of Washington
– “Future Career Paths for Data Scientists in Academia,” Cecilia Aragon, University of Washington
– “Computational and Data Literacy for Domain Scientists,” Joshua Bloom, University of California, Berkeley
– “Managing and Reusing Provenance as a Critical Capability for (Data) Scientists,” Juliana Freire, New York University
– “IPython: From Interactive Computing to Computational Narratives,” Fernando Perez, University of California, Berkeley
– “Industry Partnerships in Data Science to Advance Scientific Research,” Bill Howe, University of Washington