Jan Hendrik Metzen's Publications


Online Skill Discovery using Graph-based Clustering

Jan Hendrik Metzen. Online Skill Discovery using Graph-based Clustering. Journal of Machine Learning Research, W&CP 24:77–88, 2012.

Download

[PDF] 338.7 kB

Abstract

We introduce a new online skill discovery method for reinforcement learning in discrete domains. The method is based on the bottleneck principle and identifies skills using a bottom-up hierarchical clustering of the estimated transition graph. In contrast to prior clustering approaches, it can be used incrementally and thus several times during the learning process. Our empirical evaluation shows that “assuming high connectivity in the face of uncertainty” can prevent premature identification of skills. Furthermore, we show that the choice of the linkage criterion is crucial for dealing with non-random sampling policies and stochastic environments.

BibTeX

@article{Metzen:JMLRWC:2013:OGAHC,
	title = {Online Skill Discovery using Graph-based Clustering},
	author = {Jan Hendrik Metzen},
	volume = {W\&CP 24},
	journal = {Journal of Machine Learning Research},
	editor = {Marc Peter Deisenroth and Csaba Szepesvari and Jan Peters},
	pages = {77--88},
	year = {2012},
        abstract = {We introduce a new online skill discovery method for reinforcement learning in
discrete domains. The method is based on the bottleneck principle and identifies
skills using a bottom-up hierarchical clustering of the estimated transition
graph. In contrast to prior clustering approaches, it can be used
incrementally and thus several times during the learning process. Our empirical
evaluation shows that ``assuming high connectivity in the face of uncertainty''
can prevent premature identification of skills. Furthermore, we show that the
choice of the linkage criterion is crucial for dealing with non-random sampling
policies and stochastic environments.},
	Local-Url = "../files/jmlr_wcp24_ogahc.pdf",
	bib2html_pubtype = {Journal},
	bib2html_rescat = {Reinforcement Learning}
}

Generated by bib2html.pl (written by Patrick Riley) on Thu May 23, 2013 11:36:00