Concordia's Thursday Report

May 28, 1998

Networks of Centres of Excellence project attracts industry giants

Data warehousing provides a challenge

by Eve Krakow

Professor Laks V.S. Lakshmanan will be coordinating the sub-project that deals with online analytical processing in Web-based data warehouses.

It's a paradox all Internet users face: You want to do a search on the Internet. You call up your favourite search engine and input a few key words.

Hundreds of entries pop up on screen. As you scroll through them and click on a few, you find that at least half are irrelevant.

What is even more frustrating is that you know the information you're seeking is out there. And so the tedious, time-consuming search begins...

Might there one day be a way to explain to your computer exactly what you're looking for, and have it conduct the search intelligently?

This is, in a simplified sense, one of the goals that Laks V.S. Lakshmanan, Associate Professor of Computer Science, hopes to achieve with his current research project.

The project, titled "Building, Querying, Analyzing, and Mining Data Warehouses on the Internet," is a collaboration among database researchers at the University of Toronto, the University of British Columbia, Simon Fraser University, and Concordia, together with IBM Canada's Centre for Advanced Studies.

It has received major funding from the Networks of Centres of Excellence/Institute for Robotics and Intelligent Systems (NCE/IRIS), as well as graduate fellowship awards from IBM.

"The goal," Lakshmanan explained, "is to come up with a uniform way of accessing information and a means of assimilating it so that what you end up with is not raw data, but information at a higher level -- what's referred to as "knowledge": patterns, trends and rules, information that is useful to the end user, whether for business, industry, health care, or a specific field."

The four-year project has four components: constructing data warehouses on the Internet, querying data warehouses on the Internet, online analytical processing (OLAP) in Web-based data warehouses, and data mining in Web-based data warehouses.

Lakshmanan illustrates these concepts using the example of a food retail store.

Each day, massive amounts of data are entered into the store computer: what items were sold, to whom, and so on. Company decision-makers then pool the data from all their outlets across the province. They assimilate and integrate this data in order to extract useful information, such as the aggregate sales per month in each region, or which brand of yoghurt is the most popular. This collection and assimilation of data forms a core component of data warehousing.
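To make the retail example concrete, here is a minimal sketch of that pooling step in Python. It is only an illustration and not the project's system: the record layout, the sample figures, and the function names `monthly_sales_by_region` and `most_popular_brand` are all assumptions invented for this example.

```python
from collections import defaultdict

# Hypothetical sales records pooled from several outlets (made-up data).
SALES = [
    {"region": "Montreal", "month": "1998-01", "item": "yoghurt", "brand": "BrandA", "amount": 3.5},
    {"region": "Montreal", "month": "1998-01", "item": "yoghurt", "brand": "BrandB", "amount": 2.25},
    {"region": "Montreal", "month": "1998-02", "item": "yoghurt", "brand": "BrandA", "amount": 3.5},
    {"region": "Quebec City", "month": "1998-01", "item": "bread", "brand": "BrandC", "amount": 4.0},
]


def monthly_sales_by_region(records):
    """Sum the sale amounts for each (region, month) pair."""
    totals = defaultdict(float)
    for record in records:
        totals[(record["region"], record["month"])] += record["amount"]
    return dict(totals)


def most_popular_brand(records, item):
    """Return the brand of `item` that appears in the most sales, or None."""
    counts = defaultdict(int)
    for record in records:
        if record["item"] == item:
            counts[record["brand"]] += 1
    return max(counts, key=counts.get) if counts else None


if __name__ == "__main__":
    print(monthly_sales_by_region(SALES))
    print(most_popular_brand(SALES, "yoghurt"))
```

A real warehouse would hold far more data and would also have to reconcile differences between the outlets' record formats, which this sketch ignores.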

Suppose the decision-makers suspect a certain trend, and want to verify it. They input an analysis query; for example, how are the top 10 products of last year's first quarter performing in the first quarter of this year? This is called online analytical processing (OLAP).
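The quarter-over-quarter question above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `(product, year, quarter, revenue)` record layout, the sample figures, and the function names are hypothetical, and an actual OLAP engine would answer such queries over precomputed aggregates rather than by rescanning raw records.

```python
from collections import defaultdict

# Hypothetical records: (product, year, quarter, revenue). Made-up data.
RECORDS = [
    ("milk", 1997, 1, 100.0),
    ("bread", 1997, 1, 80.0),
    ("eggs", 1997, 1, 50.0),
    ("milk", 1998, 1, 90.0),
    ("bread", 1998, 1, 120.0),
    ("eggs", 1998, 2, 70.0),
]


def quarterly_totals(records, year, quarter):
    """Total revenue per product for one quarter of one year."""
    totals = defaultdict(float)
    for product, rec_year, rec_quarter, revenue in records:
        if rec_year == year and rec_quarter == quarter:
            totals[product] += revenue
    return totals


def compare_top_products(records, last_year, this_year, quarter=1, n=10):
    """Take last year's top-n products for a quarter and pair each with
    its revenue in the same quarter of this year (0.0 if it had no sales)."""
    before = quarterly_totals(records, last_year, quarter)
    after = quarterly_totals(records, this_year, quarter)
    # Sort by revenue, highest first; break ties alphabetically.
    top = sorted(before, key=lambda p: (-before[p], p))[:n]
    return [(p, before[p], after.get(p, 0.0)) for p in top]


if __name__ == "__main__":
    for product, then, now in compare_top_products(RECORDS, 1997, 1998, n=2):
        print(product, then, now)
```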

In other cases, however, the decision-makers don't know what they're looking for. Data mining is therefore used to find patterns and trends, and to sort through the data and come up with meaningful information. An example of a mining query could be, "What are the expensive food items that are often bought together with cheap food items?"
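One simple way to approach the mining query above is to count how often each expensive item shares a shopping basket with each cheap item, and keep the pairs seen often enough. The Python sketch below does just that. It is an assumption-laden illustration, not the researchers' method: the price threshold, the minimum count, the sample baskets, and the name `expensive_cheap_pairs` are all invented here.

```python
from collections import Counter
from itertools import product

# Hypothetical prices and shopping baskets (made-up data).
PRICES = {"caviar": 50.0, "steak": 20.0, "crackers": 2.0, "milk": 1.5}
BASKETS = [
    ["caviar", "crackers"],
    ["caviar", "crackers", "milk"],
    ["steak", "milk"],
    ["caviar", "milk"],
]


def expensive_cheap_pairs(baskets, prices, threshold, min_count):
    """Count (expensive, cheap) item pairs bought in the same basket and
    keep those seen in at least `min_count` baskets.

    An item counts as expensive if its price is at or above `threshold`.
    """
    counts = Counter()
    for basket in baskets:
        items = set(basket)  # ignore repeated items within one basket
        expensive = [i for i in items if prices[i] >= threshold]
        cheap = [i for i in items if prices[i] < threshold]
        counts.update(product(expensive, cheap))
    return {pair: c for pair, c in counts.items() if c >= min_count}


if __name__ == "__main__":
    print(expensive_cheap_pairs(BASKETS, PRICES, threshold=10.0, min_count=2))
```

With a large catalogue, counting every pair in this way becomes costly, which is part of what makes mining queries a research problem.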

"Data warehousing, data mining, and Internet information technology are three emerging technologies with wide industrial and commercial applications and significant economic benefits," Lakshmanan said. "The proposed work is an integration of these technologies based on our R&D experience and close industry collaborations."

Concordia stands to benefit greatly from the project. It is a way to strengthen industry ties and gain international visibility in the field.

"It provides tremendous opportunities for graduate students," Lakshmanan said. "Already, one of our PhD students is working at the IBM Almaden Research Center in California, while one of our Master's students recently visited the AT&T Labs research facility in New Jersey for three months. A third student is set to go to IBM Almaden on a six-month visiting appointment. Typically, such leading research centres hire graduates almost exclusively from top schools like Stanford, Wisconsin, Maryland and Duke. To my knowledge, this is the first time they have taken people from Concordia's Computer Science Department."

It is also a chance for Concordia to contribute to the high-tech industry in Canada. "This technology may attract other Canadian high-tech companies to work with us and benefit from technology transfer," Lakshmanan said. "Because of the high application potential of the proposed work, several companies, such as B.C. Hydro, Boeing Aircraft, and J.C. Penney, have contacted us for potential test, use, or purchase of our system, once it is well developed."

Copyright 1998 Thursday Report.