Decision-Theoretic Rough Sets (DTRS)
What is a Decision-Theoretic Rough Set?
Ever since the introduction of rough set theory by Pawlak in 1982, many
proposals have been made to incorporate probabilistic approaches into the
theory. They include, for example, rough-set-based probabilistic
classification, the 0.5 probabilistic rough set model, decision-theoretic rough
set models, variable precision rough set models, rough membership functions,
parameterized rough set models, and Bayesian rough set models. The results
of these studies increase our understanding of rough set theory and its
domain of applications.
The decision-theoretic rough set models and the
variable precision rough set models were proposed in the early 1990s. The
two models are formulated differently in order to generalize the 0.5
probabilistic rough set model. In fact, they produce the same rough set
approximations. Their main differences lie in their respective treatment of
the required parameters used in defining the lower and upper probabilistic
approximations. The decision-theoretic models systematically calculate the
parameters based on a loss function through the Bayesian decision procedure.
The physical meaning of the loss function can be interpreted based on more
practical notions of costs and risks. In contrast, the variable precision
models regard the parameters as primitive notions and a user must supply
those parameters. The lack of a systematic method for parameter estimation has
led researchers to use many ad hoc methods based on trial and error.
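As a minimal sketch of how such parameters can follow from a loss function, the Python below computes the two thresholds (conventionally written alpha and beta) for a two-class problem. The names `Losses` and `thresholds`, the field naming, and the numbers are assumptions made for illustration; the formulas are the ones commonly given for the two-class model and hold only under the ordering of losses checked in the code.

```python
from typing import NamedTuple, Tuple


class Losses(NamedTuple):
    """Hypothetical container for the losses of a two-class problem.

    Naming assumption: the first letter is the action (p = positive,
    b = boundary, n = negative region); the second letter is the true
    state (p = object is in the set, n = object is in the complement).
    """
    pp: float
    bp: float
    np_: float
    pn: float
    bn: float
    nn: float


def thresholds(loss: Losses) -> Tuple[float, float]:
    """Return (alpha, beta) derived from the losses.

    Assumes pp <= bp < np_ and nn <= bn < pn: a wrong decision costs
    more than deferring, which costs at least as much as a right one.
    """
    if not (loss.pp <= loss.bp < loss.np_ and loss.nn <= loss.bn < loss.pn):
        raise ValueError("losses violate the assumed ordering")
    alpha = (loss.pn - loss.bn) / ((loss.pn - loss.bn) + (loss.bp - loss.pp))
    beta = (loss.bn - loss.nn) / ((loss.bn - loss.nn) + (loss.np_ - loss.bp))
    return alpha, beta


# Illustrative numbers only, not taken from any study.
example = Losses(pp=0.0, bp=2.0, np_=6.0, pn=8.0, bn=2.0, nn=0.0)
alpha, beta = thresholds(example)
print(alpha, beta)
```

With these illustrative losses the sketch gives alpha = 0.75 and beta = 1/3. In a variable precision model the same two numbers would have to be supplied directly by the user.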
The results and ideas of the decision-theoretic model, based on the
well-established and semantically sound Bayesian decision procedure, have been
successfully applied to many fields, such as data analysis and data mining,
information retrieval, feature selection, web-based support systems, and
intelligent agents. Some authors have generalized the decision-theoretic
model to multiple regions.
Theoretical foundation of DTRS
The DTRS model uses ideas from Bayesian decision theory to provide a principled method of
calculating the probabilistic parameters that define rough regions. Using the notion of expected loss
(conditional risk), the model enables the user to depend solely on their notions of cost in classifying
an object into a region.
The user's cost for classifying an object is measured through loss functions. These loss
functions are obtained through domain-specific investigations of the true cost of performing an action,
in this case, the classification of an object. In a two-class classification problem, the loss
functions measure the cost of classifying an object either into a set or into the set's complement.
The loss functions consider three types of actions: classification into the positive region,
classification into the negative region, and classification into the boundary region. Each action is
taken in one of two states: the object belongs to the set in question, or it belongs to the set's
complement. In all, 6 loss functions (three actions in each of two states) are used
in a two-class classification problem.
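The expected loss (conditional risk) of each action can be written out as follows. The notation is an assumption introduced here and does not appear in the text above: [x] is the equivalence class of an object x, C is the set in question, a_P, a_B and a_N are the three actions, and each lambda is the loss for one action-state pair (first index the action, second index the state).

```latex
% Assumed notation: \lambda_{PN} is the loss of taking action a_P
% (classify into the positive region) when x actually lies in the complement of C.
\[
\begin{aligned}
R(a_P \mid [x]) &= \lambda_{PP}\,P(C \mid [x]) + \lambda_{PN}\,P(\neg C \mid [x]),\\
R(a_B \mid [x]) &= \lambda_{BP}\,P(C \mid [x]) + \lambda_{BN}\,P(\neg C \mid [x]),\\
R(a_N \mid [x]) &= \lambda_{NP}\,P(C \mid [x]) + \lambda_{NN}\,P(\neg C \mid [x]),
\end{aligned}
\qquad P(\neg C \mid [x]) = 1 - P(C \mid [x]).
\]
```

The Bayesian decision procedure then selects, for each [x], the action with the smallest expected loss.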
The combination of loss functions results in values for probabilistic parameters that determine
inclusion of an object into a rough region. The argument is that these parameters are calculated using
the concrete notions of expected loss within the domain instead of being arbitrarily decided by a domain
expert, as in the case of other probabilistic approximation models.
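To illustrate how regions follow from losses rather than from hand-picked parameters, the sketch below assigns a region by picking the action with the smallest expected loss for a given conditional probability. The loss table `LOSS`, the function names, and the numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict

# Hypothetical loss table: LOSS[action][state].
# Actions: classify into the positive (POS), boundary (BND) or negative (NEG) region.
# States: the object is in the set ("in_set") or in its complement ("in_complement").
LOSS: Dict[str, Dict[str, float]] = {
    "POS": {"in_set": 0.0, "in_complement": 8.0},
    "BND": {"in_set": 2.0, "in_complement": 2.0},
    "NEG": {"in_set": 6.0, "in_complement": 0.0},
}


def expected_loss(action: str, p_in_set: float) -> float:
    """Expected loss of an action, given the probability that the object is in the set."""
    row = LOSS[action]
    return row["in_set"] * p_in_set + row["in_complement"] * (1.0 - p_in_set)


def decide(p_in_set: float) -> str:
    """Pick the region with minimum expected loss.

    Ties go to the action listed first, so POS and NEG win over BND.
    """
    return min(("POS", "NEG", "BND"), key=lambda a: expected_loss(a, p_in_set))


for p in (0.1, 0.5, 0.9):
    print(p, decide(p))
```

Sweeping the probability from 0 to 1 with this table, the chosen region changes from NEG to BND to POS. The two switch points are the probabilistic parameters the text refers to, here fixed entirely by the losses.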
For a multi-class classification problem, each state on which actions are performed
corresponds to a partition in the universe. The number of actions performed on these partitions remains
the same (classifying objects into regions corresponding to each partition).
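One possible reading of the multi-class case, sketched below, applies the same three-way decision to each class in turn. This is an assumption for illustration, not necessarily the formulation used in the DTRS literature; the function names, class labels, and threshold values are hypothetical.

```python
from typing import Dict, Tuple


def three_way(p: float, alpha: float, beta: float) -> str:
    """Three-way decision for one class, given thresholds with beta < alpha."""
    if p >= alpha:
        return "POS"
    if p <= beta:
        return "NEG"
    return "BND"


def classify_multiclass(
    probs: Dict[str, float],
    thresholds: Dict[str, Tuple[float, float]],
) -> Dict[str, str]:
    """Apply the same three actions to every class.

    probs maps each class label to the conditional probability of that class;
    thresholds maps each label to its own (alpha, beta) pair, which would be
    derived from class-specific losses.
    """
    return {
        label: three_way(p, *thresholds[label])
        for label, p in probs.items()
    }


# Illustrative values only.
same_thresholds = {label: (0.75, 0.3) for label in ("a", "b", "c")}
print(classify_multiclass({"a": 0.8, "b": 0.15, "c": 0.05}, same_thresholds))
print(classify_multiclass({"a": 0.5, "b": 0.4, "c": 0.1}, same_thresholds))
```

Each class keeps the same three actions; only the probabilities and, where losses differ by class, the thresholds change.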
For further information, please consult the decision-theoretic rough set white paper, available
Development of DTRS
Development of the DTRS model began in the early 1990s. The start of the 21st century saw
an increased interest in the model.
Application of DTRS
The prospects of the application of the DTRS model are broad and exciting for the future. Given
the ability to use domain-specific notions of loss in taking a classification action, the application of
this model is limited only by the availability of such loss data.
Possible applications include data analysis within the financial, medical, and geological
domains, as well as Web intelligence, including Web-based support systems and information retrieval.