Mark Whitehorn, professor at the university, said the course will overlap with and extend an existing master’s course in business intelligence (BI).
One source of inspiration for the new course is work on proteomics data carried out by Professor Angus Lamond’s team at the Wellcome Trust Centre for Gene Regulation and Expression at the university.
But the widely canvassed notion that the UK and US are confronting a stark lack of data scientists and data-savvy managers was the main driver for the course's creation. "We saw the term 'data scientist' coming up again and again, but no good definition of it," Whitehorn said.
The first cohort of students will start in January 2013. The course will run for one year full-time or two years part-time, with two intensive weeks at Dundee and the rest of the time spent in distance learning.
The data science master’s course will extend the school's BI teachings with modules on "non-tabular data," or "big data," and another on technologies for manipulating large data stores. Technologies to be covered include Hadoop, MapReduce and the R programming language for statistical computing.
"Data scientists will need more advanced statistics," Whitehorn said. "I'm also interested in [teaching people how to] design algorithmic solutions to business problems."
Dundee is also interested in teaching students the difference between a BI professional and a data scientist.
"A good BI architect will look more at 'what does this mean for the company?' " Whitehorn explained. "A data scientist confronts specific blocs of data. But both are interested in the meaning of the data, and both need good interpersonal skills."
To make room for the data science blocks, Dundee had to eliminate two sections of its BI course: one on transactional databases, "which breaks my heart," Whitehorn said, and the ETL [extract, transform and load] module.
"A data scientist will not be building large, highly sustainable, carefully error-trapped ETL routines to shift the same data day after day," he said.
The BI course that the data science course will overlap with has been running for three years. It now has 21 alumni and 40 current students, and "they won't go away!" Whitehorn said.
The course "filled the gaps in my theoretical knowledge. I'm a more rounded business intelligence professional, having come from a strong Microsoft background," said recent graduate Jon Reade, a BI and database consultant. "It has made me explore new avenues and products in the world of BI, particularly data mining and big data, opening my eyes to new and better ways of analysis and data exploration."