Research Assistant at the University of Regina

I have been working on machine learning and data mining topics under the supervision of Dr. Sandra Zilles, Canada Research Chair in Computational Learning Theory, since August 2014.

Research interests: Artificial Intelligence, Machine Learning, Data Mining, Ensemble-based Learning Methods, Computational Learning Theory, Pattern Languages

Education

MSc in Computer Science

2014-2016

Supervisor: Dr. Sandra Zilles

Thesis: Precision-based selection criteria for classification with ensembles of learners

GPA: 95 (out of 100)

Machine Learning
Rough Sets and Applications
Artificial Intelligence
Computational Learning Theory
Knowledge Discovery in Databases

BSc in Electrical Engineering (Specialized in Telecommunications)

Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

2009-2014

Supervisor: Dr. Yaser Norouzi

Thesis: Pulse-Amplitude and Time-of-Arrival Based Pulse De-interleaving

GPA: 16.9 (out of 20) - Last-year GPA (38 credits): 17.51 (out of 20)

Achievements and Awards

• 2016
• International Experience Travel Fund, International Study Abroad and Mobility, University of Regina, Canada
• Dr. Paul W. Riegert Memorial Scholarship in Graduate Studies, Faculty of Graduate Studies and Research, University of Regina, Canada
• 2015
• 2014
• 2013
• Rated "Excellent Level" for MATLAB programming by the National Technical and Vocational Training Organization, Iran
• 2009
• Top 0.1% of the Iranian university entrance exam: Ranked 321st among more than 300,000 participants.

Selected Projects

Ensemble-based methods

Human beings tend to seek the opinions of multiple experts and combine them to make a wise decision. For example, several reviewers judge a paper and recommend accepting it (with or without revision) or rejecting it. Intuitively, this reduces the risk of a wrong decision compared to relying on the information provided by a single expert.

A core problem in machine learning is to learn a classifier that categorizes data instances into two or more classes. Such a classifier can be regarded as an artificial expert. As in the reviewing example above, combining multiple experts is a very popular approach in machine learning. An ensemble of classifiers integrates a set of base classifiers into a single predictive model, an idea that was formally introduced by Hansen and Salamon in 1990.
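The intuition that combining experts reduces risk can be illustrated with a small calculation. The sketch below is not taken from the project itself: it assumes a binary decision, an odd number of voters, and voters that err independently with the same accuracy `p`, an idealization that real base classifiers rarely satisfy. The function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb


def majority_vote_accuracy(n: int, p: float) -> float:
    """Probability that a strict majority of n voters is correct.

    Assumptions (illustrative only): binary decision, n is odd, and each
    voter is correct independently with the same probability p.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # Sum the binomial probabilities of k correct voters for every k
    # that forms a strict majority (k > n / 2).
    return sum(
        comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, majority_vote_accuracy(n, 0.7))
```

Under these assumptions, with `p = 0.7` a single voter is right 70% of the time, three voters about 78.4%, and five voters about 83.7%. When the voters' errors are correlated the gain shrinks, which is one reason the way base classifiers are built and selected matters.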

In this project, we introduce a boosting framework that generalizes AdaBoost.M1 and, based on a new approach within that framework, derive several variants of AdaBoost.M1.

Relational pattern languages

Angluin's pattern languages are built on patterns consisting of terminal symbols and variables. A string in the language of a pattern is obtained by replacing every variable with a finite nonempty string of terminal symbols. For example, let $\Sigma = \{a, b, c\}$ be the set of terminal symbols and $X = \{x_1, x_2, x_3\}$ the set of variables. Then $p_1 = a x_1 c x_2$ is a pattern and $w_1 = abccc$ is a string in the language of $p_1$ (take $x_1 = b$ and $x_2 = cc$).

In Angluin's patterns, if a variable occurs more than once, each of its occurrences is replaced by the same string. For example, if $p_2 = x_1 a x_1$, then $w_2 = bcaabca$ is a string in the language of $p_2$, but $w'_2 = cbabc$ is not.

Michael Geilke and Sandra Zilles introduced relational patterns, which allow other relations between the variables in a pattern. Let $R$ be a set of relations among variables. For example, consider $p_3 = x_1 a x_2$ and $R_3 = \{x_1 = x_2^r\}$ (the substitution for $x_1$ must be the reverse of the substitution for $x_2$). Then $w_3 = cbabc$ is a string in the language of $p_3$ under the relation $R_3$, but $w'_3 = cbacb$ is not.
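The membership examples above can be checked mechanically. The following is a minimal brute-force sketch, not the method studied in the project: it tries every nonempty substitution by backtracking, so it takes exponential time in the worst case. The names `in_language` and `reversed_pairs`, and the encoding of a pattern as a list of symbols with variables written as `"x1"`, `"x2"`, are assumptions made for illustration. Only the reversal relation from the example is supported.

```python
from typing import Dict, Iterable, List, Tuple

TERMINALS = set("abc")  # the alphabet Sigma from the examples


def in_language(
    pattern: List[str],
    word: str,
    reversed_pairs: Iterable[Tuple[str, str]] = (),
) -> bool:
    """Decide whether `word` is in the language of `pattern`.

    `pattern` is a list of symbols: single characters from TERMINALS are
    terminals, anything else (e.g. "x1") is a variable. Each pair (u, v)
    in `reversed_pairs` requires the substitution for u to be the reverse
    of the substitution for v. (Hypothetical encoding, for illustration.)
    """
    pairs = list(reversed_pairs)
    if any(ch not in TERMINALS for ch in word):
        return False

    def consistent(subst: Dict[str, str]) -> bool:
        # Check only relations whose variables are both already assigned.
        return all(
            subst[u] == subst[v][::-1]
            for u, v in pairs
            if u in subst and v in subst
        )

    def match(i: int, pos: int, subst: Dict[str, str]) -> bool:
        if i == len(pattern):
            return pos == len(word)
        sym = pattern[i]
        if sym in TERMINALS:
            return word.startswith(sym, pos) and match(i + 1, pos + 1, subst)
        if sym in subst:
            # A repeated variable must be replaced by the same string.
            s = subst[sym]
            return word.startswith(s, pos) and match(i + 1, pos + len(s), subst)
        # First occurrence: try every nonempty substring starting at pos.
        for end in range(pos + 1, len(word) + 1):
            subst[sym] = word[pos:end]
            if consistent(subst) and match(i + 1, end, subst):
                return True
            del subst[sym]
        return False

    return match(0, 0, {})


if __name__ == "__main__":
    print(in_language(["a", "x1", "c", "x2"], "abccc"))
    print(in_language(["x1", "a", "x1"], "bcaabca"))
    print(in_language(["x1", "a", "x2"], "cbabc", [("x1", "x2")]))
    print(in_language(["x1", "a", "x2"], "cbacb", [("x1", "x2")]))
```

Note that without the relation, `cbacb` does belong to the language of the pattern `x1 a x2`; it is excluded only once the reversal constraint is imposed.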

In this project, we studied a specific type of relation and considered several fundamental problems, including decision problems, classical learnability questions, the properties of tell-tale sets, and the design of subclasses that can be learnt efficiently with membership queries.

Mining amino acid sequences for protein classification

Transmembrane (TM) proteins are proteins that span a cell membrane; their segments crossing the membrane are called TM domains. TM domain and TM protein detection are important problems in computational biology, but typical machine learning approaches yield classifiers that are difficult to interpret and hence yield no biological insight.

We study both TM domain and TM protein detection with easy-to-interpret decision trees. For TM domain detection, the use of decision trees has already been reported in the literature, but we provide a critical study of the existing approach, resulting in improved feature sets as well as observations on how to avoid biased training and test sets. In particular, we discover a motif known to be common to TM domains that was not discovered in previous research using machine learning. For TM protein detection, we propose a 2-layer learning method. This method can be generalized to deal with a large class of string classification problems. The method achieves sensitivity and specificity values of up to 92% on the settings we experimented with, while providing intuitive classifiers that are easy for the domain expert to interpret.

The results of this project have been published and are available here.

Teaching Experience

• 2016 Winter
• Lab instructor, Introduction to Digital Systems, University of Regina, Canada
• 2015 Fall
• Teaching assistant, Introduction to Digital Systems-Lab, University of Regina, Canada
• 2015 Fall
• Teaching assistant, Introduction to Digital Systems, University of Regina, Canada
• 2015 Winter
• Teaching assistant, Computer Audio, University of Regina, Canada
• 2013 Winter
• Teaching assistant, Telecommunication I, Amirkabir University of Technology, Iran
• 2009-2014
• Tutor in mathematics for students preparing for the Iranian university entrance exam.

Contact Me

Email: nikravam(at)uregina(dot)ca