ABL : Alignment-Based Learning
Introduction
Alignment-Based Learning (ABL), introduced in [1], is a symbolic grammar inference framework that has successfully been applied for several unsupervised machine learning tasks in Natural Language Processing (NLP). Given sequences of symbols only, a system that implements ABL induces structure by aligning and comparing the input sequences. As a result, the input sequences are augmented with the induced structure.The demo on this page allows you to apply ABL on unstructured text. For a stand-alone implementation for non-commercial educational purposes and research, see the software release announcement or go to this page
.Alignment learning
The first phase in ABL, alignment learning, builds a search space of possible structures, called hypotheses, by comparing the sequences and clustering sub-sequences that share similar context. E.g., when we have the following input sentences:
she is smart she is tired |
(0 she is (1 smart )1 )0 |
she runs from CBD to Bondi he drives from CBD to Bondi he walks terribly slow |
Selection learning
In the second phase, selection learning, the most probable combination of hypothesized structures are selected in an Expectation-Maximalization search and overlapping constituents are eliminated according to a heuristic. The final induced structure thus becomes:
0 ->
1 from CBD to Bondi 0 -> he 2 1 -> she runs 1 -> he drives 2 -> walks terribly slow |
If you have any questions or comments, please do not hesitate to contact me.
References
| [1] | Menno van Zaanen (2001) Bootstrapping Structure into
Language: Alignment-Based Learning. PhD Thesis, School of Computing,
University of Leeds, UK. [ PDF ] |
| [2] | Jeroen Geertzen and Menno van Zaanen (2004). Grammatical Inference using Suffix Trees. In: Proceedings of the 7th International Colloquium on Grammatical Inference (ICGI) Athens, Greece, October 11-13. [ PDF ] |
