Motivation
Early work in predictive data mining did not address the complex circumstances
in which models are built and applied. It was assumed that a fixed amount
of training data was available and only simple objectives, namely
predictive accuracy, were considered. Over time, it became clear that these
assumptions were unrealistic and that the economic utility of
acquiring training data, building a model, and applying the model had to
be considered. The machine learning and data mining communities responded
with research on active learning,
which focused on methods for cost-effective acquisition of information for
the training data, and research on cost-sensitive learning, which considered
the costs and benefits associated with using the learned knowledge and how
these costs and benefits should be factored into the data mining process.
Every stage of the data mining process is affected by economic
utility. In the data acquisition phase, we have to consider the costs of
obtaining training data, such as the cost of labelling additional examples
or acquiring new feature values.
In applying the data mining algorithm, we have to consider the running time
of the algorithm and the costs and benefits associated with cleaning the data,
transforming the data and constructing new features. Economic utility also
impacts the assessment of the decisions made based on the learned knowledge.
Simple assessment measures like predictive accuracy have given way to more
complex economic measures, including measures of profitability. These
considerations can in turn affect policies for model induction, a topic that
has received more attention in the context of cost-sensitive learning.
Goals
Almost all work that considers the impact of economic utility on data mining
focuses exclusively on one of the stages in the data mining process. Thus,
economic factors have been studied in isolation, without much attention to
how they interact. This workshop will begin to remedy this deficiency by
bringing together researchers who currently consider different economic
aspects in data mining, and by promoting an examination of the impact of
economic utility throughout the entire data mining process. This workshop
will attempt to encourage the field to go beyond what has been accomplished
individually in the areas of active learning and cost-sensitive learning
(although both of these areas are within the scope of this workshop). In
addition, existing research that has addressed the role of economic
utility in data mining has focused on predictive data
mining tasks. This workshop will begin to explore methods for incorporating
economic utility considerations into both predictive and descriptive data
mining tasks.
This workshop will be geared toward researchers with an interest in how
economic factors affect data mining (e.g., researchers in active learning
and in cost-sensitive learning and evaluation) and practitioners who
have real-world experience with how these factors influence data mining.
Attendance is not limited to paper authors, and we strongly encourage
interested researchers from related areas to attend the workshop.
This will be a full-day workshop and will include invited talks, paper
presentations, short position statements and two panel discussions.