
Xiaoli Z. Fern, Ph.D.
Assistant Professor
To contact me:
Office: Kelly 3073
Phone: (541)737-2557
e-mail: xfern AT eecs.oregonstate.edu
Education:
Ph.D., Computer Engineering, ECE, Purdue University, Indiana, USA,
2005
M.S., Institute of Image Processing and Pattern Recognition, Shanghai
Jiao Tong University, Shanghai, China, 2000
B.S., Automation, Shanghai Jiao Tong University, Shanghai, China, 2000
Short Biography:
Dr. Xiaoli Fern has been an assistant professor at the School of
Electrical Engineering and Computer Science, Oregon State University,
Corvallis, OR, since 2005. She received her Ph.D. degree in Computer
Engineering from Purdue University, West Lafayette, IN, in 2005 and her
M.S. degree from Shanghai Jiao Tong University (SJTU), Shanghai, China,
in 2000. Her Ph.D. work focused on high-dimensional data clustering and
correlation analysis applied to remote sensing data and environmental
science applications. She was the publicity chair for the International
Conference on Machine Learning (ICML) in 2007 and has served regularly
on the program committees of a number of international conferences,
such as ICML, AAAI, and KDD. She was awarded ACM 2005-2006 Professor of
the Year (School of EECS).
Research Interests:
My general research interests are in the area of data mining and
machine learning. I am particularly interested in the following areas:
- Ensemble methods for unsupervised learning
- Multi-view clustering of data
- Interactive unsupervised learning for self-trained systems
- Core cluster identification
- Data mining interpretability issues
- Active learning
I also work in a variety of application areas:
- Ecological and environmental science
- Biological data analysis: gene sequence and expression data analysis
- Human-computer interaction data analysis
- Bioacoustics
NEWS
- Our journal article "Mining Problem-solving Strategies from HCI
log data" has been accepted by ACM Transactions on Computer-Human
Interaction (TOCHI), special issue on "Data mining for understanding
user behavior".
- The journal version of our paper "Cluster Ensemble Selection" will
appear in Statistical Analysis and Data Mining, special issue on
"Best of SDM08".
- Our paper "Cluster Ensemble Selection" has been accepted by SIAM
Data Mining (SDM08).
Teaching
Research and Publications
Exploratory data clustering groups objects into clusters so that
similar objects fall in the same cluster. My research attempts to
advance the field of unsupervised clustering in a number of directions.
First, objects in a data set may be similar to each other in multiple
different ways, so different clustering structures may exist in the
same data. I am therefore interested in examining data in different
ways to produce different clusterings. Such clusterings can sometimes
be combined, via cluster ensemble methods, to provide a more reliable
view of the structure of the data; at other times they are examined
individually because each may provide different insights (multi-view
clustering).
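As a rough illustration of the cluster ensemble idea, the sketch below
combines several base clusterings through a co-association matrix (the
fraction of runs in which two points share a cluster). It is a generic
textbook-style scheme, not the algorithm from any publication listed on
this page. The function names, the use of k-means on random
projections to create diverse base clusterings, the synthetic data, and
the 0.5 threshold are all assumptions made for the example.

```python
import numpy as np


def kmeans_labels(X, k, rng, n_iter=20):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # an empty cluster keeps its old center
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def co_association(labelings):
    """Fraction of base clusterings that place each pair of points together."""
    labelings = np.asarray(labelings)
    same = labelings[:, :, None] == labelings[:, None, :]
    return same.mean(axis=0)


def consensus_clusters(coassoc, threshold=0.5):
    """Connected components of the graph linking pairs above the threshold."""
    n = coassoc.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(coassoc[i] > threshold):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def ensemble_demo(n_runs=10, seed=0):
    """Hypothetical demo: two synthetic 10-D blobs, random 3-D projections."""
    rng = np.random.default_rng(seed)
    blob_a = rng.normal(0.0, 0.5, size=(20, 10))
    blob_b = rng.normal(5.0, 0.5, size=(20, 10))
    X = np.vstack([blob_a, blob_b])
    labelings = []
    for _ in range(n_runs):
        R = rng.normal(size=(10, 3))  # a different random projection per run
        labelings.append(kmeans_labels(X @ R, k=2, rng=rng))
    return consensus_clusters(co_association(labelings))
```

Calling `ensemble_demo()` returns one consensus label per point. A
single poor projection can mislead one base clustering, but it has
little effect on the co-association matrix averaged over all runs,
which is the sense in which the combined view is more reliable.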
I am also interested in developing techniques that can efficiently and
effectively identify coherent clusters within data without partitioning
all data points into clusters. In such a setting, only a portion of the
data gets clustered, and the same data point may belong to multiple
clusters. In some sense this is related to outlier detection, except
that the "outliers" are clusters and may not truly be outliers.
Related publications:
Xiaoli Z. Fern and Wei Lin, "Cluster Ensemble Selection", to appear in
the journal Statistical Analysis and Data Mining (2008).
Xiaoli Z. Fern and Wei Lin, "Cluster Ensemble Selection", to appear in
Proceedings of the 2008 SIAM International Conference on Data Mining
(SDM08).
Ying Cui, Xiaoli Z. Fern and Jennifer Dy, "Non-redundant multi-view
clustering via orthogonalization", in Proceedings of the 7th IEEE
International Conference on Data Mining (ICDM07). PDF.
Xiaoli Z. Fern and Carla E. Brodley, "Cluster ensembles for high
dimensional data clustering: An empirical study", Technical report
CS06-30-02.
Xiaoli Z. Fern and Carla E. Brodley, "Solving cluster ensemble problems
by bipartite graph partitioning", in Proceedings of the 21st
International Conference on Machine Learning (ICML 2004). PDF file;
Matlab implementation of the algorithm. (Note: this code is provided on
an "as is" basis for research use only.)
Xiaoli Z. Fern and Carla E. Brodley, "Random Projection for High
Dimensional Data Clustering: A Cluster Ensemble Approach", in
Proceedings of the 20th International Conference on Machine Learning
(ICML 2003). PDF file.
- Mining gene sequence data and expression data. This is a joint
project with Todd Mockler and Wengkeen Wong at the center for genome
research and computing. In this project, we are interested in jointly
analyzing the expression data and the promoter sequences of genes in
order to more reliably identify co-regulated gene groups and their
transcription factor binding sites.
- Mining human strategies (behavioral patterns) from human-computer
interaction data. In this research, we apply data mining to HCI log
data with the goal of finding interesting behavioral patterns that shed
light on the strategies users employ while using software for problem
solving. We collaborate with the HCI researchers at the EUSES
consortium on this project.
- Xiaoli Fern, Chaitanya Komireddy, Valentina Grigoreanu, and
Margaret Burnett, "Mining Problem-Solving Strategies from HCI Data",
accepted by ACM Transactions on Computer-Human Interaction (TOCHI),
under final revision.
- Neeraja Subrahmaniyan, Laura Beckwith, Valentina Grigoreanu,
Margaret Burnett, Susan Wiedenbeck, Vaishnavi Narayanan, Karin Bucht,
Russell Drummond, and Xiaoli Fern, “Testing vs. Code Inspection vs. ...
What Else? Male and Female End Users’ Debugging Strategies”, ACM
Conference on Human Factors in Computing Systems (CHI 2008), April 2008.
- Xiaoli Fern, Chaitanya Komireddy, and Margaret Burnett, "Mining
Interpretable Human Strategies: A Case Study", to appear in Proceedings
of the 7th IEEE International Conference on Data Mining (ICDM07). PDF.
A longer tech report version is also available as a PDF.
- Grigoreanu, V., Beckwith, L., Fern, X., Yang, S., Komireddy, C.,
Narayanan, V., Cook, C., Burnett, M., "Gender Differences in End-User
Debugging, Revisited: What the Miners Found", in Proceedings of the
IEEE Symposium on Visual Languages and Human-Centric Computing,
Brighton, England, September 2006. PDF.
- Correlation analysis on Earth science data. In this project, we
are interested in finding out how one environmental factor affects
another, for example sea surface temperature versus sea surface
pressure, and vegetation index versus precipitation. We developed a
method that builds a mixture of linear canonical correlation analysis
(CCA) models to explain correlation patterns between different domains.
- Xiaoli Z. Fern, Carla E. Brodley and Mark A. Friedl, "Correlation
clustering for learning mixture of canonical correlation models",
accepted for the SIAM International Conference on Data Mining
(SDM2005). PDF file; Matlab code for generating synthetic data with a
prespecified correlation structure.
- Xiaoli Z. Fern and Carla E. Brodley, "Boosting Lazy Decision Trees",
Proceedings of the 20th International Conference on Machine Learning
(ICML 2003). PDF file.
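
The Earth science project above builds on linear canonical correlation
analysis (CCA). For readers unfamiliar with CCA, the following is a
minimal sketch of ordinary single-model CCA using NumPy. It is not the
mixture-of-CCA method from the SDM2005 paper; the function names, the
small regularization term `reg`, and the synthetic data are assumptions
made for the example.

```python
import numpy as np


def cca(X, Y, reg=1e-8):
    """Linear CCA via whitening and SVD.

    Returns the canonical correlations s (descending) and projection
    matrices A, B whose columns map centered X and Y to canonical variates.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariances; reg keeps the Cholesky factorization stable.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    Lx = np.linalg.cholesky(Sxx)  # Sxx = Lx @ Lx.T
    Ly = np.linalg.cholesky(Syy)  # Syy = Ly @ Ly.T
    # Whitened cross-covariance M = Lx^{-1} Sxy Ly^{-T}.
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    A = np.linalg.solve(Lx.T, U)     # A = Lx^{-T} U
    B = np.linalg.solve(Ly.T, Vt.T)  # B = Ly^{-T} V
    return s, A, B


def cca_demo(n=500, seed=0):
    """Hypothetical demo: X and Y share one latent signal z."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    X = np.column_stack([z + 0.1 * rng.normal(size=n), rng.normal(size=n)])
    Y = np.column_stack([rng.normal(size=n), -z + 0.1 * rng.normal(size=n)])
    return X, Y, cca(X, Y)
```

In the demo, the first canonical pair recovers the shared signal `z`
even though it appears in different columns and with opposite signs in
`X` and `Y`. A mixture model would fit several such linear models to
different subsets of the data.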