
An insight into some aspects of rough neurocomputing
SPEAKER: Marcin Szczuka, PhD
Warsaw University, Poland
CS Dept. Visiting Scholar
DATE: Wednesday, February 15, 2006
TIME: 2:30pm
PLACE: CL 408
Abstract
The presentation is aimed at bringing together several ideas that have
emerged on the boundary between the theories of rough sets and
neurocomputing. Starting with the first attempts that coupled rough
set methods with ANNs for classification purposes, we will then
present ideas that strive to incorporate rough-set-specific
notions, such as approximation, into the fabric of neural
networks. Finally, we will present our recent findings, which try to
apply the overall neurocomputing paradigm to the
construction of extended classification systems. Such systems make use
of extensions of notions that originated in rough set theory.

Rough Set based 1v1 and 1vr
Approaches to Support Vector Machine
Multiclassification
SPEAKER: Pawan Lingras
Department of Math and Computer Science
Saint Mary's University
DATE: Monday, June 20, 2005
TIME: 11:00am–12:00pm
PLACE: CL 435
ABSTRACT
Support vector machines (SVMs) are essentially
binary classifiers. To improve their applicability,
several methods have been suggested for extending
SVMs for multiclassification, including one versus
one (1v1), one versus rest (1vr) and DAGSVM.
In this seminar, we first describe how binary
classification with SVMs can be interpreted using
rough sets. A rough set approach to SVM
classification removes the necessity of exact
classification and is especially useful when dealing
with noisy data. Next, by utilizing the boundary
region in rough sets, we suggest two new approaches,
extensions of 1vr and 1v1, to SVM
multiclassification that allow for an error rate.
We explicitly demonstrate how our extended 1vr may
shorten the training time of the conventional 1vr
approach. In addition, we show that our 1v1
approach may have reduced storage requirements
compared to the conventional 1v1 and DAGSVM
techniques. Our techniques also provide better
semantic interpretations of the classification
process.
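The boundary-region idea above can be sketched in a few lines of Python; the margin threshold and the toy score dictionary below are illustrative assumptions, not the speaker's actual formulation.

```python
# Illustrative sketch (not the speaker's algorithm): a 1vr scheme where
# each class has a binary SVM-style decision score, and scores falling
# inside a margin band land in a rough boundary region instead of being
# forced into an exact class.

def one_vs_rest_rough(scores, margin=0.5):
    """scores: class -> decision value; returns (label, region)."""
    best = max(scores, key=scores.get)
    if scores[best] >= margin:
        return best, "positive"    # confident 1vr decision
    if scores[best] <= -margin:
        return None, "negative"    # no classifier claims the sample
    return best, "boundary"        # defer: exact classification not required

print(one_vs_rest_rough({"a": 1.2, "b": -0.8}))  # ('a', 'positive')
print(one_vs_rest_rough({"a": 0.1, "b": -0.2}))  # ('a', 'boundary')
```

Samples in the boundary region can be deferred or tolerated as errors, which is the "allow for an error rate" idea in the abstract.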

From Business Objectives to Data Mining:
Towards a Systematic Way of Data Mining Project Development
SPEAKER: Ernestina Menasalvas
Facultad de Informática
Universidad Politecnica de Madrid
Spain
emenasalvas@fi.upm.es
DATE: Tuesday, Nov. 30, 2004
TIME: 4:00–5:00pm
PLACE: CL 418
ABSTRACT
Confronted with a confusing set of techniques and ways to
transform business problems into data mining problems, data
miners need proof that a particular technique is better
than another. Despite the existence of data mining
standards such as CRISP-DM, SEMMA, and PMML, to date data
mining projects have been developed more as an art than as
a science.
The process depends completely on the expertise of the data
miner since no method is available to make the process
systematic and automatic. This is due to a lack of data
mining problem conceptualization. In this sense, a deep
understanding of both the data to be analyzed and the
application domain of the results, as well as of the data
mining functions, is needed. Knowing the meaning of the data
to be analyzed (the facts they represent, the constraints and
context under which they were captured) and the constraints
underlying the data mining functions to be applied will
make it possible to find out whether the business goals to be
achieved are feasible. However, to date, there is no
formal method to describe these elements in such a way that
the quality of the results can be assured. On the other hand,
this setting is a step towards a methodology for data
mining project development that will be, in itself, the
main basis for automating the process.
There is also a need to prove that the improvements are really
due to the actions taken after a data mining discovery and
not to any other factor or action carried out in the
company. In particular, results are expected to improve
profits. It is surprising, though, that none of the
obvious claims that are taken for granted as the starting
point of a data mining project have ever been
systematically tested.
This is where experimentation plays its role. Experiments
are crucial to establish whether the impact of the deployment is
positive or negative. Experimentation will also
reveal whether this impact is a side effect of actions taken
as a result of the data mining project or merely of other
environmental factors.
In this talk, we present the approach of the Data Mining
group at Facultad de Informática, Universidad Politecnica
de Madrid, to the data mining project development.
 The Rough Set Exploration System
SPEAKER: Dr. Marcin Szczuka
Assistant Professor
The University of Warsaw
Warsaw Poland
DATE: Thursday Sept 2, 2004
TIME: 13:00–14:00
PLACE: ED 122
ABSTRACT
The Rough Set Exploration System (RSES 2.1)
is a free software tool developed at Warsaw
University. The tool's main purpose is to provide
users with the ability to apply advanced data analysis
methods based on the results of our research, in particular in the
field of rough sets. The tool has been developed over
several years and has reached a certain level of maturity. The
presentation will cover general
information about the usage of RSES 2.1, possible
fields for its application, and the
inventory of data analysis methods that are provided
within this software system.
 Current research at Group of Logic,
Warsaw University, Poland
(Prof. A. Skowron's group)
SPEAKER: Dr. Marcin Szczuka
Assistant Professor
The University of Warsaw
Warsaw Poland
DATE: Wednesday, Sept 1, 2004
TIME: 10:00–11:00
PLACE: ED 122
ABSTRACT
The current major research directions of the
researchers associated with the Group of Logic, Warsaw
University, will be sketched. In particular, we will
show the current activities that take place
within the framework of the ongoing national research grant
"Classifier networks". The presentation will focus on
perspectives and research fields that we perceive as
promising in the longer term.
 Feedforward classifier networks
SPEAKER: Dr. Marcin Szczuka
Assistant Professor
The University of Warsaw
Warsaw Poland
DATE: Tuesday, August 31, 2004
TIME: 10:00–11:00
PLACE: ED 122
ABSTRACT
The problem of approximating compound concepts
and making use of them through composition, classification
and comparison is addressed. To capture the
complex and multifaceted nature of such compound
constructs, we propose an approach based on so-called
concept networks. These networks represent
the simple-to-compound construction of the
final concept from simpler, more basic
ones. To achieve these goals, several techniques from
probabilistic reasoning, rough set theory and neural
networks are employed. The presentation will outline the
problem we start with and present our views on possible
solutions, including some proposals for algorithms
and solution methods.
 Rough Set based Initiative Data Mining
SPEAKER: Prof. Guoyin Wang
DATE: Friday, August 13, 2004
TIME: 11:00 a.m.
PLACE: CL 418
ABSTRACT
Rough set theory is emerging as a new tool for dealing with fuzzy and
uncertain data. In this paper, a theory is developed to express, measure and
process uncertain information and uncertain knowledge based on our result about
the uncertainty measure of decision tables and decision rule systems. Based on
Skowron's propositional default rule generation algorithm, we develop an
initiative learning model with a rough set based initiative rule generation
algorithm. Simulation results illustrate its efficiency.
2003
 Using LERS for Knowledge Discovery From Real-Life Data
SPEAKER: Professor Jerzy W. Grzymala-Busse
DATE: Tuesday, November 25, 2003
TIME: 1:00 p.m.
PLACE: Screening Room "C"
ABSTRACT
The data mining system LERS (Learning from Examples based on Rough
Sets), developed at the University of Kansas, induces a set of rules
from examples and classifies new, unseen examples using the induced set
of rules. LERS is equipped with a number of tools. First, a family of
programs may be used to preprocess data with errors, with missing
attribute values, and with numerical attributes. If the input data file
is inconsistent, LERS computes lower and upper approximations of all
concepts.
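The lower and upper approximations mentioned above can be computed directly from an inconsistent decision table. The following minimal sketch (not LERS itself, whose internals are not described here) groups rows into indiscernibility classes on the condition attributes and collects the classes that lie fully or partially inside a concept.

```python
# Sketch: lower/upper approximations of a concept in a decision table.
from collections import defaultdict

def approximations(rows, condition_attrs, concept):
    """rows: list of dicts; concept: set of row indices; returns (lower, upper)."""
    blocks = defaultdict(set)  # indiscernibility classes on condition attributes
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in condition_attrs)].add(i)
    lower, upper = set(), set()
    for block in blocks.values():
        if block <= concept:      # class lies entirely inside the concept
            lower |= block
        if block & concept:       # class at least touches the concept
            upper |= block
    return lower, upper

# Inconsistent table: rows 0 and 1 agree on 'temp' but belong to
# different decision concepts, so the concept {0} has an empty lower
# approximation and a two-element upper approximation.
rows = [{"temp": "high"}, {"temp": "high"}, {"temp": "low"}]
print(approximations(rows, ["temp"], concept={0}))  # (set(), {0, 1})
```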
The classification system of LERS is a modification of the bucket
brigade algorithm. The decision to which concept an example belongs is
made on the basis of four factors: strength, specificity, matching
factor, and support. LERS is also equipped with a tool for
multiple-fold cross-validation. The system has been used in the medical
area, nursing, global warming, environmental protection, natural
language, data transmission, etc. LERS may process big data sets and
frequently outperforms not only other data mining systems but also human
experts.
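A hedged sketch of a voting scheme in the spirit of the four factors named above: each rule that fully matches the example votes with strength times specificity, votes are summed per concept as its support, and the concept with the largest support wins. The actual LERS formulas (including partial matching via the matching factor) differ in detail; the rules and attribute names here are made up for illustration.

```python
# Illustrative rule-voting classifier, not the exact LERS algorithm.

def classify(rules, example):
    support = {}
    for r in rules:
        # a rule matches only if all of its conditions hold for the example
        if all(example.get(a) == v for a, v in r["conditions"].items()):
            vote = r["strength"] * len(r["conditions"])  # strength * specificity
            support[r["concept"]] = support.get(r["concept"], 0) + vote
    return max(support, key=support.get) if support else None

rules = [
    {"conditions": {"fever": "yes"}, "concept": "flu", "strength": 3},
    {"conditions": {"fever": "yes", "cough": "no"},
     "concept": "cold", "strength": 2},
]
print(classify(rules, {"fever": "yes", "cough": "no"}))  # 'cold' (2*2 > 3*1)
```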
Biography of Dr. Grzymala-Busse:
Dr. Jerzy W. Grzymala-Busse has been a Professor of Electrical Engineering and
Computer Science at the University of Kansas since August 1993. His
research interests include data mining, knowledge discovery from
databases, machine learning, expert systems, reasoning under uncertainty and
rough set theory. Recently he participated, as a co-principal
investigator, in the project "Informatics Techniques for Medical
Knowledge Building", together with the Medical Center of Duke
University. The project was funded by the National Institutes of
Health. Currently he is a co-investigator in the project "CISE
Research Infrastructure: Ambient Computational Environments", funded by
the National Science Foundation. He has published three books and over
180 articles in the above areas, mostly in data mining. Dr. Jerzy W.
Grzymala-Busse received his M.S. in Electrical Engineering from the
Technical University of Poznan, Poland, in 1964; his M.S. in Mathematics
from the University of Wroclaw, Poland, in 1967; his Ph.D. in Engineering
from the Technical University of Poznan, Poland, in 1969; and his Doctor
Habilitatus in Engineering from the Technical University of Warsaw,
Poland, in 1972.
 Rough Set Model Based on Database Operations (CS836)
SPEAKER: Waqar Ahsan
DATE: Thursday, Nov 13, 2003
TIME: 2:30 pm
PLACE: CL 251
ABSTRACT
In this presentation I discuss a rough set model based on
database theory that takes advantage of efficient set-oriented database
operations. A drawback of rough set theory is its inefficiency in
computation, which limits its suitability for large data sets. In
order to find the reducts, core and dispensable attributes, the rough set
approach needs to construct all the equivalence classes based on the
values of the condition and decision attributes. This is a very
time-consuming process and does not scale to large data sets, which are
common in data mining applications. I discuss a new set of algorithms
for calculating the core and reducts based on the database-based rough
set model.
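The set-oriented idea can be illustrated with a single GROUP BY query; the toy table and attribute names below are assumptions for illustration, not the algorithms discussed in the talk.

```python
# Sketch: one set-oriented GROUP BY pass yields every equivalence class
# of the condition attributes (a, b) together with its size and the
# number of distinct decision values; classes with exactly one distinct
# decision are consistent, i.e. they contribute to a lower approximation.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a TEXT, b TEXT, d TEXT)")
con.executemany("INSERT INTO t VALUES (?,?,?)",
                [("x", "1", "yes"), ("x", "1", "no"), ("y", "2", "no")])
classes = con.execute(
    "SELECT a, b, COUNT(*), COUNT(DISTINCT d) "
    "FROM t GROUP BY a, b ORDER BY a, b"
).fetchall()
print(classes)  # [('x', '1', 2, 2), ('y', '2', 1, 1)]
```

The point of the database-based formulation is exactly this: the DBMS computes all equivalence classes in one pass instead of pairwise row comparisons.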
 Models of Concurrency
SPEAKER: Prof. Ryszard Janicki,
Dept. Computer Science,
McMaster University
DATE: Tuesday, September 16, 2003
TIME: 2:30 pm
PLACE: CL 312
ABSTRACT
Concurrent systems are abundant in human experience but their full
conceptualization and understanding still elude us. Concurrency
theory is more than 30 years old, and although many
problems are still far from a satisfactory solution,
a lot of formal techniques have been developed, including a sophisticated use of partial orders, automata, composition and decomposition operators, etc.
There are two major, different (and often incompatible) attitudes towards abstracting non-sequential behaviour: one based on interleaving abstraction, the other on partially ordered causality. The interleaving models, mainly in the form of process algebras, are very structured and compositional, but have difficulty dealing with topics like fairness, confusion, etc. The partial order models, like Petri nets, handle these problems better but are less compositional, although recent results make that distance much smaller. Nevertheless, some aspects of concurrent behaviour are difficult or almost impossible to tackle with either process algebras or models based on partially ordered causality; e.g., the specification of priorities, error recovery, time testing, and the proper treatment of simultaneity are in some circumstances problematic. New models, where causality is represented by two relations, can handle those problems but are often too complex for practical use. Temporal Logic, invented long before the first computers, is the logic of choice for formulating many problems occurring in concurrent systems. The talk will cover all the issues mentioned above.
 Feature and Concept Extraction in KDD
SPEAKER: Dr. Jakub Wroblewski
Polish-Japanese Institute of Information Technology
DATE: September 5, 2003
TIME: 4:00 pm – 5:00 pm
PLACE: ED 621
ABSTRACT
One of the most important parts of the KDD (Knowledge Discovery
in Databases) process is feature extraction and selection.
This step is usually done after preprocessing of the working
data set (or is considered part of it). It precedes the DM
(Data Mining) step. During the seminar, I will present some
theoretical issues and practical examples (incl.
algorithms) of the feature extraction process, treated as a
data-based induction of new concepts.
In particular, I would like to address the following:
- How to evaluate the quality of a new concept;
- How to construct new attributes in the case of numerical
(continuous) values;
- How to construct new attributes as discrete concepts
based on continuous values;
- How to construct attributes for complex structures, in
relational and temporal databases.
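As a toy illustration of constructing discrete concepts from continuous values, candidate cut points can be placed wherever the decision class changes between neighbouring attribute values. The midpoint heuristic below is an assumption for illustration, not necessarily the method presented in the seminar.

```python
# Sketch: candidate cut points for discretizing a continuous attribute,
# placed at midpoints between neighbouring values whose decision
# classes differ.

def candidate_cuts(values, decisions):
    pairs = sorted(zip(values, decisions))
    cuts = []
    for (v1, d1), (v2, d2) in zip(pairs, pairs[1:]):
        if d1 != d2 and v1 != v2:       # decision changes between values
            cuts.append((v1 + v2) / 2)  # midpoint as a candidate cut
    return cuts

print(candidate_cuts([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"]))  # [2.5]
```

Each cut induces a new binary attribute ("value below/above the cut"), whose quality can then be evaluated against the decision.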
 Evolutionary Computation in Rough Sets and KDD
SPEAKER: Dr. Jakub Wroblewski
Polish-Japanese Institute of Information Technology
DATE: August 29, 2003
TIME: 2:00 pm – 3:00 pm
PLACE: CL 408
ABSTRACT
During the seminar I would like to address several issues
concerning Evolutionary Computation (EC) methods in some data
analysis problems and applications. The approximate plan of my talk
is as follows:
- Evolutionary methods: from classical genetic algorithms to
genetic programming;
- Optimization problems, especially those concerning rough
set based methods;
- Hybrid EC solutions in KDD (examples).
I will start with a short presentation of EC for
participants less familiar with the topic.
 Variable Precision Rough Set Inductive Logic Programming and
Statistical Relational Learning
SPEAKER: Arul Siromoney
School of Computer Science and Engineering
Anna University, Chennai 600 025, India
DATE: Wednesday, August 6, 2003
TIME: 11:00 a.m. – 12:00 p.m.
PLACE: CL 408
ABSTRACT
The generic Rough Set Inductive Logic Programming (gRSILP)
model combines Rough Set Theory (RST) and Inductive Logic
Programming (ILP). Variable Precision Rough Set theory is
used to extend gRSILP to the Variable Precision Rough Set
Inductive Logic Programming (VPRSILP) model.
Statistical Relational Learning (SRL) explores approaches
to learning statistical models from relational data. One of
the approaches in SRL is stochastic logic programs. A
stochastic logic program (SLP) is a probabilistic extension
of a normal logic program that has been proposed as a
flexible way of representing complex probabilistic
knowledge.
The VPRSILP model is presented in the context of SRL and
SLP. A preliminary experiment using the Predictive
Toxicology Evaluation (PTE) Challenge dataset is presented.
The PTE Challenge dataset is based on the rodent
carcinogenicity tests conducted within the US National
Toxicology Program.
 Rough Set Approach to Approximating Compound Decisions
SPEAKER: Dr. Dominik Slezak
DATE: May 8, 2003
TIME: 1:30 PM
PLACE: ED 621
ABSTRACT
The theory of rough sets provides tools for extracting knowledge from incomplete, data-based information. Rough set approximations make it possible to describe the decision classes, regarded as sets of objects satisfying some predefined conditions, by means of indiscernibility relations that group into classes the objects with the same (or similar) values of the considered attributes. Moreover, rough set reduction algorithms make it possible to approximate the decision classes using possibly large and simplified patterns. This corresponds to the well-known Ockham's Razor principle, as well as, e.g., to the statistical Minimum Description Length principle.
In many approaches, especially those dedicated to (strongly) inconsistent data tables, where the decision class approximations cannot be determined to a satisfactory degree, decisions can take more complex forms, e.g. probabilistic distributions (rough membership functions and distributions) of the original decision values. In the same way, one could consider, e.g., statistical estimates, plots, etc., definable using the original attributes, in a way appropriate for a particular decision problem. One should then develop methods aiming at optimal approximation of such decision structures, possibly similar to the classical reduction techniques.
Complex values occur very often in the medical
domain, while analyzing heterogeneous data
gathering series of measurements, images, texts,
etc. In this paper, we analyze data about the medical
treatment of patients with head and neck
cancer. The data table, collected over the years
by the Medical Center of Postgraduate Education in
Warsaw, Poland, consists of 557 patient records
described by 29 attributes. The most important
attributes are well-defined qualitative features.
The decision problem, however, requires
approximation of a specially designed complex
decision attribute, corresponding to the needs of
survival analysis. It seems to be a perfect case
study for learning how complex decision semantics
can influence the algorithmic framework and
the results of its performance. It illustrates that
even quite unusual structures can still be handled
using only slightly modified rough set algorithms.
Therefore, one may conclude that the proposed
methodology is applicable not only to the
presented case study but also to other medical, as
well as, e.g., multimedia or robotics problems.
 Bayesian extension of VPRS
SPEAKER: Dr. Dominik Slezak
DATE: Monday, May 5
TIME: 1:30PM
PLACE: ED 621
ABSTRACT
The variable precision rough set (VPRS) model
introduces definitions of the approximate positive
and negative regions of a set, which depend on the
settings of the model parameters defining the
permissible levels of uncertainty associated with
each of the rough regions.
Using model parameters to define the
approximation regions is often not required. For
that reason, we introduce a nonparametric
modification of the VPRS model, where the prior
probability of the event is used as a benchmark
value against which the quality of the available
information about objects of the universe of
interest can be measured. We consider three
possible scenarios in that respect:
1. The acquired information increases our
perception of the likelihood that the event of
interest would happen
2. The acquired information increases the
assessment of the probability that the event would
not happen
3. The acquired information has no effect at
all
Such a categorization of the universe leads to the
Bayesian Rough Set (BRS) model, which seems to be
more appropriate for application problems concerned
with achieving any certainty gain in decision
making or prediction processes rather than meeting
specific certainty goals.
The next step is to think of the BRS model as a
special case of some more general, parametric
approach, just as the classical rough
set model is a special case of the VPRS model.
We present the Variable Precision Bayesian Rough
Set (VPBRS) model, where the set approximations
correspond to the following situations:
1. The acquired information sufficiently
increases our perception of the likelihood that
the event of interest would happen
2. The acquired information sufficiently
increases the assessment of the probability that
the event would not happen
3. The acquired information has almost no
effect at all
The qualifiers ''sufficiently'' and ''almost'' are
expressed in terms of mathematical constraints
parameterized by appropriately tuned thresholds.
Besides the development of the theoretical foundations of
VPBRS, an important issue is to investigate its
properties in terms of being able to capture the
quality of attributes (columns, features) in the
analysis of real-life data, and hence its
applicability to feature selection, extraction
and reduction problems. We concentrate on the
issue of attribute reduction, which is
addressed in the theory of rough sets in terms of
(approximate) decision reducts. We adapt the
relative probabilistic gain function to evaluate
the global average information gain associated
with a subset of features. We also formulate
criteria for maintaining the level of the
probabilistic gain during the process of attribute
reduction. Finally, we draw a connection between
those criteria and the reduction principles, based
on discernibility between the set approximation
regions.
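The three VPBRS scenarios can be sketched as a comparison of an equivalence class's conditional probability against the prior; the epsilon threshold standing in for ''sufficiently'' and ''almost'' is an assumed parameter, not a value from the talk.

```python
# Sketch: Bayesian-rough-set-style categorization of an equivalence
# class by comparing P(event | class) with the prior P(event); eps is
# the assumed threshold behind "sufficiently" and "almost".

def vpbrs_region(p_cond, prior, eps=0.1):
    if p_cond >= prior + eps:
        return "positive"   # information sufficiently raises the likelihood
    if p_cond <= prior - eps:
        return "negative"   # information sufficiently lowers it
    return "boundary"       # (almost) no certainty gain either way

print(vpbrs_region(0.80, prior=0.5))  # positive
print(vpbrs_region(0.52, prior=0.5))  # boundary
```

With eps = 0 this reduces to the nonparametric BRS categorization, where any certainty gain relative to the prior counts.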
 A Better Driver Assessment using the Variable Precision Rough
Set Methodology
SPEAKER: Kwei Aryeetey
DATE: March 31, 2003
TIME: 1:00 PM
PLACE: 285 Riddell Centre
ABSTRACT
In most jurisdictions in North America, the traffic violation and accident
records of drivers are closely monitored by jurisdictional licensing
agencies. Studies conducted so far on driver behaviour have also
concluded that past accidents are better predictors of future accidents
than past violations.
Accidents are enormously costly, both socially and economically, to
drivers and insurance companies. To avoid these costs,
jurisdictions must be able to identify potentially unsafe drivers and
intervene through education or remedial actions. These measures could
take the form of warnings, stiff penalties, suspensions from driving for
specific periods of time or the imposition of high annual insurance
premiums for these drivers.
This presentation focuses on a different approach to analyzing
the driving behaviour of drivers, using variable precision rough sets
to model the relationship between a driver's socioeconomic,
demographic, traffic conviction and accident
history, other characteristics, and the future probability of being in
an at-fault car accident.
 Acquisition of Control Algorithms Using Rough Sets
SPEAKERS: Fulian Shang and Peng Yao
Department of Computer Science
DATE: March 24, 2003
TIME: 1:00 PM
PLACE: Riddell Centre 285
ABSTRACT
The seminar will include two short presentations
of two projects concerned with applications of
rough set theory to control.
First, a multi-input multi-output (MIMO) data-acquired
controller using a system of hierarchical
decision tables for a simulated vehicle driving
control problem will be presented. The simulator
incorporates a dynamic mathematical model of a
vehicle driving on a track. Sensor readings and
expert driver control actions are accumulated to
derive the vehicle control model. Sensor readings
include random error to reflect realistic data
acquisition conditions. The methodology of rough
sets is used to process the data and to
automatically derive the control algorithm.
The second project investigates the
application of rough set theory to the automatic
acquisition of control algorithms from operation
log files acquired when concurrent
processes are controlled by human operators. Our objective is to
develop a methodology for eliminating the
mathematical modeling and manual programming
stages in the development of a control system for
such systems. The approach includes a training
stage followed by automatic generation of a
decision algorithm. The research will use rough
set based data modelling techniques to obtain a
control algorithm from an operator's history log
and to apply this control algorithm in controlling
concurrently moving objects in real time.
 Variable Precision Rough Sets in Modeling from Data (2)
SPEAKER: Dr. Wojciech Ziarko
Department of Computer Science
DATE: March 17, 2003
TIME: 1:00 PM
PLACE: 285 Riddell Centre
ABSTRACT
The presentation is a continuation of the
introduction to the Variable Precision Rough Set
Model (VPRSM). Two main topics will be discussed.
The first is the derivation of hierarchical
structures of decision tables in the context of
VPRSM, which will include presentation of two
methods for forming such structures. The
structures constitute predictive models derived
from data, which are applicable to a variety of
problems in data mining, pattern
recognition, control, etc.
The second is the discussion of different
evaluative measures to assess the quality of the
decision-table-based models. A number of measures
will be presented, some of which are
generalizations of original measures introduced by
Pawlak.
Finally, work in progress on the application of
the presented methodologies to the analysis of a car
insurance company database will be discussed.
 An Introduction to the Variable Precision Rough Set Model
SPEAKER: Dr. Wojciech Ziarko
Department of Computer Science
DATE: March 10, 2003
TIME: 1:00 PM
PLACE: 285 Riddell Centre
ABSTRACT
The Variable Precision Rough Set Model (VPRSM) was
introduced in the early nineties as a probabilistic
extension of the original Rough Set Theory (RST).
The VPRSM generalizes RST by adopting a partial
inclusion relation for the definitions of the rough
approximation regions, that is, the positive region,
boundary region and negative region of a set.
The generalization is motivated by the frequent
absence of positive or negative regions in
application problems dealt with by the original RST.
In other words, in many practical
problems existing in data mining, machine
learning, pattern classification etc., it is not
possible to identify deterministic rules or
patterns from data but probabilistic patterns can
be found. The VPRSM helps in applying the results
and methodologies of RST to the analysis,
identification and optimization of probabilistic
regularities in data. The presentation will
introduce the basics and motivations behind the
VPRSM and its relationship to the original RST.
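The partial inclusion idea can be sketched numerically: each indiscernibility class is placed in a region according to the fraction of its objects that belong to the target set, with the precision thresholds as assumed parameters rather than values from the talk.

```python
# Sketch: VPRSM-style region assignment by partial inclusion. p is the
# fraction of an indiscernibility class contained in the target set;
# low and high are assumed precision parameters.

def vprs_region(p, low=0.25, high=0.75):
    if p >= high:
        return "positive"   # class is (almost) included in the set
    if p <= low:
        return "negative"   # class is (almost) disjoint from the set
    return "boundary"       # too uncertain either way

print(vprs_region(1.0))  # positive: classical, fully certain inclusion
print(vprs_region(0.6))  # boundary: only a probabilistic pattern
```

Setting low = 0 and high = 1 recovers the original RST regions, which is the sense in which VPRSM generalizes RST.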
 On Generalizing Rough Set Theory (II)
SPEAKER: Dr. Yiyu Yao
Department of Computer Science
DATE: Monday, March 3, 2003
TIME: 1:00 PM
PLACE: Riddell Centre 285
ABSTRACT
I will continue with last week's talk.
 On Generalizing Rough Set Theory (I)
SPEAKER: Dr. Yiyu Yao
Department of Computer Science
DATE: Monday, February 24, 2003
TIME: 1:00 PM
PLACE: Riddell Centre 285
ABSTRACT
This talk summarizes various formulations of the standard rough
set theory. It demonstrates how those formulations can be
adapted to develop different generalized rough set theories. The
relationships between rough set theory and other theories are
discussed.
 Toposes and rough set theory (III)
SPEAKER: Dr. Jonathon Funk
Department of Mathematics & Statistics
DATE: Monday, February 10, 2003
TIME: 1:00 PM
PLACE: Riddell Centre 285
ABSTRACT
We continue our analysis of the category K of
generalized upper approximations associated with a
Pawlak relational system. We discuss the Yoneda
embedding, constant objects, global sections,
finite limits, exponential objects, and the
subobject classifier.
 Toposes and rough set theory (II)
SPEAKER: Dr. Jonathon Funk
Department of Mathematics & Statistics
DATE: Monday, February 3, 2003
TIME: 1:00 PM
PLACE: Riddell Centre 285
ABSTRACT
Previously, we defined the category K (of
generalized upper approximations) associated with
a Pawlak relational system. We next explain the
properties of K (K is a topos), and consider other
examples of objects that live in this category.
 Toposes and rough set theory
SPEAKER: Dr. Jonathon Funk
Department of Mathematics & Statistics
DATE: Monday, Jan. 27, 2003
TIME: 1:00 PM
PLACE: Riddell Centre 285
ABSTRACT
A natural connection between topos theory and rough set theory
can be found by considering a variation over equivalence relations.
I will begin with some basic category theory needed to explain this
connection.
2002
 LOGIC: An Ex-logician perspective (III)
SPEAKER: Dr. Anita Wasilewska
DATE: September 23, 2002
TIME: 10:30 AM
PLACE: LB 235
ABSTRACT
An algebraic approach to classical and non-classical logics. History, techniques and results. Logics that are defined only algebraically. Case studies and connections with Rough Sets.
 LOGIC: An Ex-logician perspective (II)
SPEAKER: Dr. Anita Wasilewska
DATE: September 20, 2002
TIME: 1:30 PM
PLACE: LB 268
ABSTRACT
Automated proof systems: definition, types, history. Constructive and
non-constructive proofs of the completeness theorem. Gentzen and Resolution.
Case study: the first constructive proof of the completeness theorem for
classical predicate logic.
 LOGIC: An Ex-logician perspective (I)
SPEAKER: Dr. Anita Wasilewska
Computer Science Department
State University of New York
Stony Brook, NY
DATE: September 19, 2002
TIME: 1:00 PM
PLACE: CL 232
ABSTRACT
What is LOGIC: distinctions, classifications and
definition. Distinctions: philosophical logic,
mathematical logic, and logics for computer
science. Classifications (some): propositional,
predicate, classical, non-classical, logics that
extend to and from classical logic, logics that
extend to and from intuitionistic logic, and
others. Examples (and history). Syntax and
semantics: general definition and distinctions.
Techniques for a proof of the completeness theorem.
Theories based on a (given) logic: consistency, the
incompleteness theorem. Foundations of mathematics;
logic as a foundation of AI.
 Entropy Based Approximate Reducts and Networks
SPEAKER: Dr. Dominik Slezak
DATE: Sept 17, 2002
TIME: 1:00 PM
PLACE: CL 232
ABSTRACT
Information entropy measures are widely applied to
evaluate the degree of probabilistic dependencies
between random variables. At the level of data
analysis, one can efficiently apply entropy to the
selection, extraction and reduction of features
that provide optimal data models. We discuss two
applications of information entropy to modeling
data dependencies:
The first application is related to the rough set
approach to the construction of classification
models. It is based on the paradigm of reducing
attributes irrelevant with respect to determining a
distinguished decision attribute. The degree of
such irrelevance can be expressed in probabilistic
terms, by using entropy. Hence, one can consider a
kind of approximate reduction principle, claiming
that attributes should be removed during the
reduction process if and only if the entropy of the
model remains approximately at the same level. We
discuss various specifications of this principle,
as well as computational complexity of optimization
problems concerning the search for entropy based
approximate decision reducts.
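The approximate reduction principle above can be sketched with a conditional entropy computation; the toy decision table and the idea of comparing entropies before and after dropping an attribute are illustrative assumptions, not the optimization algorithms discussed in the talk.

```python
# Sketch: conditional entropy H(d | B) of the decision d given an
# attribute subset B. An attribute is removable if dropping it keeps
# the entropy approximately at the same level.
import math
from collections import Counter

def cond_entropy(rows, attrs, dec):
    n = len(rows)
    blocks = Counter(tuple(r[a] for a in attrs) for r in rows)
    joint = Counter((tuple(r[a] for a in attrs), r[dec]) for r in rows)
    h = 0.0
    for (key, _), c in joint.items():
        h -= (c / n) * math.log2(c / blocks[key])
    return h

rows = [{"a": 0, "b": 0, "d": "x"}, {"a": 0, "b": 1, "d": "x"},
        {"a": 1, "b": 0, "d": "y"}, {"a": 1, "b": 1, "d": "y"}]
print(cond_entropy(rows, ["a", "b"], "d"))  # 0.0
print(cond_entropy(rows, ["a"], "d"))       # 0.0: b is removable, {a} suffices
print(cond_entropy(rows, ["b"], "d"))       # 1.0: b alone determines nothing
```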
The second application is related to the notion of
an approximate Bayesian network, capable of encoding
statements about approximate conditional
independence between random variables. The use of
information entropy to approximate the notion of
probabilistic independence is a natural consequence
of its fundamental properties. The notion of an
entropy based approximate decision reduct comes
to correspond to the notion of an approximate
Markov boundary: an irreducible subset of random
variables making a distinguished variable
approximately independent of the rest. We
discuss the advantages of dealing with approximate
conditional independence statements and approximate
Bayesian networks while analyzing real-life data.
We also present the mathematical foundations for
generalizing results concerning classical Bayesian
networks to the entropy based approximate case.
 Data Based Approximation of Complex Concepts
SPEAKER: Dr. Dominik Slezak
DATE: Sept 10, 2002
TIME: 1:00 PM
PLACE: CL 232
ABSTRACT
The theory of rough sets provides a clear and
efficient approach to concept approximation tasks.
The most common application is the data-based
construction of decision models, where the concepts
correspond to the values of a distinguished
decision feature. Many decision problems involve
data inconsistency, which makes the construction of
deterministic models impossible. This problem is
addressed by introducing, e.g., set approximations,
generalized decision functions, and rough
membership functions. In specific applications, a
decision can be expressed as a continuous value, a
function plot, a probability distribution, etc.
Then there is a need to measure how close two
decision values are. Such measures may be devised
in a manner supporting the particular goal we want
to achieve.
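One of the tools mentioned above, the rough membership function, admits a compact sketch (illustrative Python with hypothetical names; the constructions in the talk are more general): the membership of an object in a concept is the fraction of objects indiscernible from it, with respect to the chosen attributes, that belong to the concept.

```python
from collections import defaultdict

def rough_membership(rows, attrs, concept):
    # mu(x) = |[x]_B intersect X| / |[x]_B|: among the objects
    # indiscernible from x w.r.t. attributes B, the fraction that
    # satisfy the concept predicate.
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in attrs)].append(row)
    def mu(x):
        block = classes[tuple(x[a] for a in attrs)]
        return sum(concept(r) for r in block) / len(block)
    return mu
```

A value strictly between 0 and 1 signals exactly the data inconsistency discussed above: indiscernible objects with different decisions.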
The above issues are illustrated by the application
of rough-set-based tools to post-surgery survival
analysis. The decision problem is in this case
defined over data related to head and neck cancer
cases, for two types of medical surgeries. The task
is to express the differences between the expected
results of these surgeries and to search for rules
discerning different survival tendencies. The need
to consider complex decision values, such as the
plots of Kaplan-Meier product-limit estimates and
contingency-like cross distributions of the success
ratio against the type of surgery, is discussed.
Data-based concept approximation may also be
applied to the construction of decision models in a
hierarchical way. In many situations decision
states cannot be expressed by means of simple
decision rules. A possible approach is to follow
the layered learning paradigm. It corresponds to
the theory of rough mereology, where at each level
of the learning hierarchy one tries to approximate
more complex concepts based on those from the
previous levels. This principle is illustrated by
applications to the synthesis of decision models,
multilayered feature extraction, and others.
 AUTOMATIC CLASSIFICATION OF MUSICAL INSTRUMENT SOUNDS BASED ON WAVELETS AND NEURAL NETWORKS
SPEAKER: Dr. Bozena Kostek
DATE: July 25, 2002
TIME: 2:00 PM
PLACE: CL 345
A study on the classification of musical instruments by means of wavelet analysis and artificial neural networks will be shown. A short discussion of pitch detection methods for musical sounds will be presented, followed by some details of the engineered pitch detection method. Several analyses exemplifying problems related to the automatic pitch tracking process are included. Principles of the wavelet-based parameterization of musical instrument sounds will be presented, along with the set of parameters resulting from the parameterization process. Artificial neural networks were used for classification purposes. Representative results obtained in the investigations will be presented and discussed.
 NEURAL NETWORK-BASED IDENTIFICATION OF SOUND SOURCE POSITION AND MUSICAL SOUND PITCH
SPEAKER: Dr. Andrzej Czyzewski
DATE: July 25, 2002
TIME: 3:00 PM
PLACE: CL 345
Sound source position identification systems are
used in many areas of telecommunications, and
numerous approaches to this task have been
developed. Usually such systems are based on
digital signal processing technology and are
computationally intensive. This paper presents an
alternative method, implementing an intelligent
neural-network-based decision module. The method's
effectiveness was tested with various types and
structures of multilayer neural networks. The
obtained results will be presented and discussed.
A new approach to musical signal pitch prediction
based on musical knowledge modeling will be
discussed in the second part of the presentation.
First, the signal is partitioned into segments
roughly analogous to consecutive notes. Thereafter,
for each segment an autocorrelation function is
calculated. The autocorrelation function values are
then altered using the pitch predictor's output. A
music predictor based on artificial neural networks
was introduced for this task. A description of the
proposed pitch estimation enhancement method is
included, and some details concerning music
prediction and recognition are discussed in the
paper.
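The autocorrelation step described above can be sketched as follows (a toy illustration with assumed parameter names, not the engineered method from the talk): the pitch is estimated as the sampling rate divided by the lag at which the autocorrelation function peaks, searched over a plausible lag range.

```python
def autocorr_pitch(signal, sample_rate, min_lag=20, max_lag=400):
    # Estimate pitch (Hz) as sample_rate / best_lag, where best_lag
    # maximizes the raw autocorrelation sum within [min_lag, max_lag).
    # min_lag > 0 excludes the trivial zero-lag peak.
    n = len(signal)
    best_lag, best_val = min_lag, float('-inf')
    for lag in range(min_lag, min(max_lag, n - 1)):
        val = sum(signal[i] * signal[i + lag] for i in range(n - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag
```

A real system would window the segment, normalize the autocorrelation, and (as in the talk) adjust its values with the predictor's output before picking the peak.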
 Studies in Rough Sets and Variable Precision Rough Sets
SPEAKER: Dr. Arul Siromoney
DATE: July 22
TIME: 1:30 PM
PLACE: CL 305
ABSTRACT
This talk presents a survey of the author's
research studies in Rough Set Theory (RST) and
Variable Precision Rough Sets (VPRS).
One area of research is the intersection of Rough
Set Theory and Inductive Logic Programming. ILP
uses positive examples, negative examples and
background knowledge to induce a hypothesis that,
along with the background knowledge, describes the
positive examples (completeness), without
describing the negative examples (consistency).
The examples, background knowledge and hypothesis
are all expressed as Prolog clauses. The notions of
consistency and completeness in ILP are studied in
RST in a finite universe and in an extension to
future test cases. RST is extended to ILP, and
elementary sets are defined such that any induced
logic program, for that background knowledge and
declarative bias, cannot distinguish between the
elements of an elementary set. VPRS is then
extended to ILP (VPRSILP). The VPRS and VPRSILP
models are also defined for universes that include
future test cases.
VPRS is also used after a domain-relevant
preprocessing stage. Domain-relevant techniques are
used to generate a small number of attributes,
which are then used in VPRS for classification.
Illustrative examples include the identification of
transmembrane domains and the classification of web
usage sessions.
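The β-region idea underlying VPRS classification can be sketched as follows (a minimal illustration with hypothetical names, assuming a precision threshold β in (0.5, 1]; not the VPRSILP model itself): each equivalence class is assigned to the β-positive region if at least a fraction β of its objects belong to the concept, to the negative region if at most 1 - β do, and to the boundary region otherwise.

```python
from collections import defaultdict

def vprs_regions(rows, attrs, concept, beta=0.8):
    # Variable Precision Rough Sets: classify each equivalence class
    # (objects sharing the same attribute values) by the fraction of
    # its members satisfying the concept, relative to threshold beta.
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in attrs)].append(row)
    regions = {}
    for key, block in classes.items():
        ratio = sum(concept(r) for r in block) / len(block)
        if ratio >= beta:
            regions[key] = 'positive'
        elif ratio <= 1 - beta:
            regions[key] = 'negative'
        else:
            regions[key] = 'boundary'
    return regions
```

Relaxing β below 1 is what "removes the necessity of exact classification": classes with a small admixture of counterexamples still land in the positive or negative region rather than the boundary.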