# Decision-Theoretic Rough Sets (DTRS)

## Introductions

• ### What is Decision-Theoretic Rough Set?

Ever since the introduction of rough set theory by Pawlak in 1982, many proposals have been made to incorporate probabilistic approaches into the theory. They include, for example, rough set based probabilistic classification, 0.5 probabilistic rough set model, decision-theoretic rough set models, variable precision rough set models, rough membership functions, parameterized rough set models, and bayesian rough set models. The results of these studies increase our understanding of the rough set theory and its domain of applications.

The decision-theoretic rough set models and the variable precision rough set models were proposed in the early 1990’s. The two models are formulated differently in order to generalize the 0.5 probabilistic rough set model. In fact, they produce the same rough set approximations. Their main differences lie in their respective treatment of the required parameters used in defining the lower and upper probabilistic approximations.

The decision-theoretic models systematically calculate the parameters based on a loss function through the Bayesian decision procedure. The physical meaning of the loss function can be interpreted based on more practical notions of costs and risks. In contrast, the variable precision models regard the parameters as primitive notions and a user must supply those parameters. A lack of a systematic method for parameter estimation has led researchers to use many ad hoc methods based on trial and error.

The results and ideas of the decision-theoretic model, based on the well established and semantically sound Bayesian decision procedure, have been successfully applied to many fields, such as data analysis and data mining, information retrieval, feature selection, web-based support systems, and intelligent agents. Some authors have generalized the decision-theoretic model to multiple regions.
• ### Theoretic foundation of DTRS

The DTRS model utilizes ideas from Bayes Decision Theory to form a scientific method of calculating probabilistic parameters to define rough regions. Using the notion of expected loss (conditional risk), the model enables the user to depend solely on their notions of cost in classifying an object into a region.

The user's cost for classifying an object is measured through loss functions. These loss functions are obtained through domain-specific investigations of the true cost of performing an action, in this case, the classification of an object. In a two-class classification problem, the loss functions measure the cost of an action (classifying an object) either into a set or the set's complement.

The loss functions consider three types of actions: classification into the positive region, classification into the negative region, and classification into the boundary region. These regions can correspond to either the set in question or the set's complement. All-in-all, 9 loss functions are used in a two-class classification problem.

The combination of loss functions result in values for probabilistic parameters that determine inclusing of an object into a rough region. The arguement is that these parameters are calculated using the concrete notions of expected loss within the domain instead of being arbitrarily decided by a domain expert, as in the case of other probabilistic approximation models.

For a multi-class classification problem, each state in which actions are performed on correspond to a partition in the universe. The number of actions perfomed on these partitions remain the same (classifying objects into regions corresponding to each partition).

For further information, please consult the decision-theoretic rough set white paper, available here.

• ### Development of DTRS

Development of the DTRS model began in the early nineties. The start of the 21-st century saw an increased interest in the model.

• ### Application of DTRS

The prospects of the application of the DTRS model are broad and exciting for the future. Given the ability to use domain-specific notions of loss in taking a classification action, the application of this model is only limited to the availablity of this data.

Some possible applications may include data analysis within the financial, medical, geological domains as well as Web intelligence, including Web-based support systems and information retrieval.

Home