Colleges incorporate data science into curriculums

The program is divided into thirds and teaches the IT, science and business aspects of analytics, says Klabjan. Topics covered in the IT portion include data warehousing and workflow management. Science courses will look at machine learning and data mining, among other topics. The business curriculum covers communication, conveying information to business users and project leadership, which gets a full course of its own. Elective courses based on vertical industries like marketing and health care are also offered. Students will spend their summer completing a mandatory off-campus internship before returning to campus for the final quarter.

To learn the software behind big data, students attend a boot camp a few days before the program starts, where they are trained in IBM's SPSS and Cognos, Tableau, Hadoop and SAS products.

Beyond the classroom learning, students will serve as de facto consultants and work on business-sponsored data projects. Students start an eight-month project in the first quarter and complete a capstone project in the final quarter. Divided into teams of four, they will interact directly with participating companies during weekly meetings and be responsible for delivering a completed project.

"These are projects that the companies have on their to-do lists," says Klabjan. "Some don't have analytical solutions in-house. Or they're interested in starting [an] analytics practice within their organization. The deliverable of this project is always going to be the implementation, so companies will use these projects."

University of Washington

An immediate and strong demand for a new kind of data analyst led the University of Washington in Seattle to introduce a data science certificate program this October. These analysts use their strong math and technology backgrounds to build data analysis tools and then use that data to make business decisions, says Ed Lazowska, the director of the University of Washington eScience Institute, in an email.

Focusing on training IT professionals instead of undergraduate or graduate students helps meet market needs faster, he says.

Applicants should be proficient in programming, databases and basic statistics, Lazowska says. To test these skills, they must take an online quiz. Their score, along with their resume, is considered when determining admission.

The nine-month, three-course program "filled up faster than any new certificate program in recent memory," writes Lazowska. An August information session for the program, which is offered through the university's Professional and Continuing Education department, reached its 175-person capacity the week it was announced.

The first course teaches students about the different tools used in data management, storage and manipulation, including instruction in and hands-on experience with Hadoop and MapReduce. The second course deals with understanding core statistical and machine learning techniques, including deploying and maintaining a Hadoop cluster. The final course will combine the lessons from the earlier modules and teach the technology that data workers use to find value in the data sets. Each course will address interpreting and communicating statistical results, says Lazowska.

"This area is not typically emphasized in technology-focused curricula, but is critical in data science," he says. He adds that each course also addresses data visualization, especially the first one, which has many lectures on the topic.

Three big data experts, one from the University of Washington's faculty and two Microsoft employees, will teach the course.

In the spring, the course will go online when it is added to the curriculum of Web-learning company Coursera. Lazowska hopes to eventually offer the course to students in the College of Engineering. In the on-campus version, database instructors from the computer science and engineering department will teach the course.

Data science has appeared in the undergraduate and graduate introductory database curriculum as well. The courses now include subject matter on NoSQL and scalable data processing, writes Lazowska. The university is looking into developing a full-time, on-campus master's program after a "banner year in recruiting" gave it the necessary faculty, including four prominent professors in the big data and machine learning fields.

Another certificate program is being developed around courses that are relevant to data science and already taught at the university. Examples include courses on scalable systems, databases, statistics and visualization, Lazowska says.


Copyright © 2012 IDG Communications, Inc.
