University of Ulster Adopts Data Mining for Computer Science Students

Situation

The multi-campus University of Ulster is the largest in Ireland - North or South - and its Faculty of Informatics is one of the most substantial providers of undergraduate Computing Science courses in the UK.

The Faculty also has a large Data Mining Research Group. There is considerable collaboration with blue chip companies as well as with other universities in Europe, government agencies and hospitals. The Group is a member of the newly-formed European Network of Excellence in Knowledge Discovery from Databases, KDDNet. Members of the group developed mining software that has now been incorporated by SPSS into the Clementine Data Mining Workbench.

Critical Issue

Throughout the 1990s, limited aspects of Machine Learning and Data Mining were taught to undergraduates as part of their work on Artificial Intelligence. However, there was still a substantial shortfall in student knowledge and skill in the area of Data Mining. Ray Hickey, Lecturer in Computing Science at the University of Ulster, explains, “It was decided in 1998 that, with the much increased interest in Data Mining in the business community, the likelihood was that graduates everywhere would encounter it at some point in their future careers, whether as software engineers required to build systems or to evaluate a package, or as managers required to implement data mining for more improved business financial performance.”

Solution

The committee at the University of Ulster decided it would be appropriate, and beneficial to student skills development, to devote a full module to the subject of Data Mining at the final year level. The module would have to be pitched in a way that made it accessible and rewarding for a typical undergraduate majoring in Computing Science. Such students vary considerably in their mathematical and statistical backgrounds, as well as in the Business options they may have taken.

The overriding aims were that students should be able to assess the suitability of a Data Mining solution for a business problem; that they should have a good understanding of a range of commonly used techniques; and that they should be able to interpret and evaluate the results of mining activity.

In addition, it was felt to be important that they should have practical experience of Data Mining, gained through use of industry standard software. Various staff in the faculty were already using Clementine and had been undertaking development work aimed at enhancing its capabilities, so Clementine seemed a natural choice. A further major requirement was that the software would run on the standard lab machines used by undergraduates. Clementine met this requirement as well.

Results

The module, called Machine Learning and Data Mining, began in the autumn of 1998 as an option for students on the main Computing Science degree at the Coleraine campus of the university. “It was an immediate success with about 60 students enrolling - about 80% of the year group - and has remained so since then”, says Mr Hickey. Right from the start, many students found it fascinating that, in a variety of application areas, a computer could discover or manufacture knowledge that seemed impossibly hidden in data - even from the sight of human experts in those fields.

Students found the Clementine interface very easy to use and quickly became quite independent and adventurous. They were encouraged to find data sets in application areas that interested them and to experiment with them in Clementine. A major additional benefit was that Clementine helped the students to understand the mining process and to get a feel for how the main routines work.

In the last academic year, the CRISP-DM (CRoss-Industry Standard Process for Data Mining) methodology was introduced as a major feature of the module. For coursework, students undertake a complete mining task on a very substantial data set, following the guidelines provided by CRISP-DM, right through from the formulation of business goals to suggestions for deployment of the results of mining. Mr Hickey concludes, “This use of CRISP has been very successful. Students are now much more confident about where they are in the mining process and what is required of them at any stage. Yet CRISP, while providing this support, does not stifle creativity - as some methodologies in other areas do. Many students reported that they found this coursework very stimulating and rewarding.”

 
