A note on phase transitions and computational pitfalls of learning from sequences

Abstract

An ever greater range of applications calls for learning from sequences. Grammar induction is one prominent tool for sequence learning, so it is important to know its properties and limits. This paper presents a new type of analysis for inductive learning. A few years ago, the discovery of a phase transition phenomenon in inductive logic programming showed that fundamental characteristics of a learning problem may affect the very possibility of learning under very general conditions. We show that, in the case of grammatical inference, while there is no phase transition over the whole hypothesis space, a much more severe “gap” phenomenon affects the effective search space of standard grammatical induction algorithms for deterministic finite automata (DFAs). Focusing on standard search heuristics, we show that they overcome this difficulty to some extent, but are subject to overgeneralization. The paper finally suggests some directions to alleviate this problem.

Notes

  1. In this paper, following standard Machine Learning terminology, a string is said to be covered by an FSA iff it belongs to the language of that automaton (see the first sketch after these notes).

  2. More precisely, the Prefix Tree Acceptor (PTA) is obtained by merging the states that share the same prefix in the Maximal Canonical Automaton (MCA), which represents the whole positive learning set as an automaton. A PTA is therefore a DFA with a tree-like structure (see the second sketch after these notes).

  3. The difference from the DFA case is due to the determinisation process, which forces further state merging when needed (see the third sketch after these notes). A diffusion-like analytical model was devised that predicts the observed onset of the gap to within 15%.

  4. The datasets and more detail are available at http://www.lri.fr/~antoine/www/pt-gi/.
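To make Note 1 concrete, the coverage test amounts to running the automaton on the string. Below is a minimal Python sketch; the dictionary-based DFA encoding and the name covers are illustrative assumptions, not the paper's implementation.

    def covers(transitions, start, accepting, string):
        """Return True iff the DFA accepts (i.e. covers) the string.

        transitions: dict mapping state -> {symbol -> next state}
        (an assumed encoding, chosen for brevity).
        """
        state = start
        for symbol in string:
            state = transitions.get(state, {}).get(symbol)
            if state is None:  # missing transition: the string is rejected
                return False
        return state in accepting

    # Example: a DFA over {a, b} that accepts strings ending in 'b'.
    dfa = {0: {'a': 0, 'b': 1}, 1: {'a': 0, 'b': 1}}
    assert covers(dfa, 0, {1}, "aab")
    assert not covers(dfa, 0, {1}, "aba")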
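Note 2's construction can be sketched directly from the positive examples: spelling each example out from the root makes shared prefixes share states, which is exactly what merging same-prefix MCA states achieves, and the result is a tree-shaped DFA. Names and encoding below are illustrative assumptions.

    def build_pta(positive_examples):
        """Build a Prefix Tree Acceptor; state 0 is the root."""
        transitions = {0: {}}
        accepting = set()
        next_state = 1
        for word in positive_examples:
            state = 0
            for symbol in word:
                if symbol not in transitions[state]:
                    transitions[state][symbol] = next_state
                    transitions[next_state] = {}
                    next_state += 1
                state = transitions[state][symbol]
            accepting.add(state)  # the end of each positive example is accepting
        return transitions, accepting

    # Example: "ab" and "ac" share the state reached by the prefix "a".
    pta, acc = build_pta(["ab", "ac"])
    assert pta[0]['a'] == 1 and set(pta[1]) == {'b', 'c'}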
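For Note 3, the determinisation step can be illustrated with the classical subset construction, in which every DFA state is a set of NFA states; NFA states that are always reached together thereby end up merged. This textbook algorithm is offered as a plausible sketch of the process, not as the authors' code; the NFA encoding (state -> symbol -> set of states) is an assumption.

    from collections import deque

    def determinise(nfa, start, accepting):
        """Subset construction: return DFA transitions, start and accepting sets."""
        dfa_start = frozenset([start])
        dfa_trans = {}
        dfa_accepting = set()
        queue = deque([dfa_start])
        while queue:
            states = queue.popleft()
            if states in dfa_trans:  # already expanded
                continue
            dfa_trans[states] = {}
            if states & accepting:   # any accepting NFA state makes the set accepting
                dfa_accepting.add(states)
            for symbol in {s for q in states for s in nfa.get(q, {})}:
                target = frozenset(t for q in states
                                   for t in nfa.get(q, {}).get(symbol, ()))
                dfa_trans[states][symbol] = target
                queue.append(target)
        return dfa_trans, dfa_start, dfa_accepting

    # Example: NFA states 1 and 2 are always reached together on 'a',
    # so the subset construction merges them into one DFA state.
    nfa = {0: {'a': {1, 2}}, 1: {'b': {3}}, 2: {'b': {3}}}
    trans, start, acc = determinise(nfa, 0, {3})
    assert trans[frozenset({1, 2})]['b'] == frozenset({3})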

Acknowledgements

The authors are partially supported by the Pascal Network of Excellence IST-2002-506 778. They gratefully acknowledge the help of Nicolas Pernot with the work done at the beginning of this project.

Author information

Correspondence to Antoine Cornuéjols.

About this article

Cite this article

Cornuéjols, A., Sebag, M. A note on phase transitions and computational pitfalls of learning from sequences. J Intell Inf Syst 31, 177–189 (2008). https://doi.org/10.1007/s10844-008-0063-6
