Q-systems
Q-systems are a method for transforming directed graphs according to given grammar rules, developed at the Université de Montréal by Alain Colmerauer in 1967–70 for use in natural language processing. The Université de Montréal's machine translation system, TAUM-73, used Q-systems as its language formalism.
The data structure manipulated by a Q-system is a Q-graph, which is a directed acyclic graph with one entry node and one exit node, where each arc bears a labelled ordered tree. An input sentence is usually represented by a linear Q-graph where each arc bears a word (a tree reduced to a single node labelled with that word). After analysis, the Q-graph is usually a bundle of 1-arc paths, each arc bearing a possible analysis tree. After generation, the goal is usually to produce as many paths as there are desired outputs, again with one word per arc.
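As an illustration of the data structure described above, here is a minimal Python sketch of a Q-graph. The names `Tree`, `Arc`, `QGraph` and `linear_qgraph` are hypothetical and chosen for this example only; they do not come from any historical or current Q-systems implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tree:
    """A labelled ordered tree: a label plus an ordered tuple of subtrees."""
    label: str
    children: tuple = ()

    def __str__(self):
        if not self.children:
            return self.label
        return f"{self.label}({', '.join(str(c) for c in self.children)})"


@dataclass(frozen=True)
class Arc:
    """An arc from node `src` to node `dst` bearing one tree."""
    src: int
    dst: int
    tree: Tree


@dataclass
class QGraph:
    """A directed acyclic graph with a single entry node and a single exit node."""
    entry: int
    exit: int
    arcs: list = field(default_factory=list)

    def paths(self):
        """Yield every entry-to-exit path as a list of arcs (assumes no cycles)."""
        def walk(node, prefix):
            if node == self.exit:
                yield list(prefix)
                return
            for arc in self.arcs:
                if arc.src == node:
                    yield from walk(arc.dst, prefix + [arc])
        yield from walk(self.entry, [])


def linear_qgraph(words):
    """Build the linear Q-graph of a sentence: one single-node tree per arc."""
    arcs = [Arc(i, i + 1, Tree(w)) for i, w in enumerate(words)]
    return QGraph(entry=0, exit=len(words), arcs=arcs)


g = linear_qgraph(["the", "cat", "sleeps"])
for path in g.paths():
    print(" + ".join(str(arc.tree) for arc in path))
```

A linear Q-graph has exactly one entry-to-exit path. An analysed sentence would instead typically consist of several single-arc paths from entry to exit, one per candidate analysis tree.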
A Q-System consists of a sequence of Q-treatments, each being a set of Q-rules, of the form <matched_path> == <added_path> [<condition>]. The Q-treatments are applied in sequence, unless one of them produces the empty Q-graph, in which case the result is the last Q-graph obtained. The three parts of a rule can contain variables for labels, trees, and forests. All variables after "==" must appear in the <matched_path> part. Variables are local to rules.
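The way a single rule is applied by instantiation can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the original Q-language syntax: trees are nested tuples `(label, child, ...)`, only tree variables are supported (label and forest variables are omitted), variables are written as strings starting with `$`, and the function names are hypothetical.

```python
def is_var(x):
    # Assumption of this sketch: a variable is a string starting with "$".
    return isinstance(x, str) and x.startswith("$")


def match(pattern, tree, env):
    """One-way matching: bind variables of `pattern` to parts of the
    variable-free `tree`. Return the extended bindings, or None on failure."""
    if is_var(pattern):
        if pattern in env:
            return env if env[pattern] == tree else None
        return {**env, pattern: tree}
    if len(pattern) != len(tree) or pattern[0] != tree[0]:
        return None
    for sub_pattern, sub_tree in zip(pattern[1:], tree[1:]):
        env = match(sub_pattern, sub_tree, env)
        if env is None:
            return None
    return env


def substitute(pattern, env):
    """Replace every variable in `pattern` by the tree bound to it."""
    if is_var(pattern):
        return env[pattern]
    return (pattern[0],) + tuple(substitute(c, env) for c in pattern[1:])


def variables(pattern):
    """Collect the set of variables occurring in `pattern`."""
    if is_var(pattern):
        return {pattern}
    return set().union(*(variables(c) for c in pattern[1:]))


def apply_rule(matched_path, added_path, trees, condition=None):
    """Apply `matched_path == added_path [condition]` to a path of trees.
    Return the list of trees of the added path, or None if the rule fails."""
    lhs_vars = set().union(*(variables(p) for p in matched_path))
    rhs_vars = set().union(*(variables(p) for p in added_path))
    if not rhs_vars <= lhs_vars:
        raise ValueError("right-hand side uses a variable not bound on the left")
    if len(trees) != len(matched_path):
        return None
    env = {}  # variables are local to this rule application
    for pattern, tree in zip(matched_path, trees):
        env = match(pattern, tree, env)
        if env is None:
            return None
    if condition is not None and not condition(env):
        return None
    return [substitute(p, env) for p in added_path]


# Illustrative rule: DET($d) + N($n) == NP($d, $n)
matched = [("DET", "$d"), ("N", "$n")]
added = [("NP", "$d", "$n")]
path = [("DET", ("the",)), ("N", ("cat",))]
result = apply_rule(matched, added, path)
print(result)
```

Because the matched trees contain no variables, bindings flow in one direction only, from the graph into the rule. That is what distinguishes instantiation from full unification.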
A Q-treatment works in two steps, addition and cleaning. It first applies all its rules exhaustively, using instantiation (one-way unification), thereby adding new paths to the current Q-graph (added arcs and their trees can be used to produce new paths). If and when this addition process halts, all arcs used in some successful rule application are erased, as well as all unused arcs that are no longer on any path from the entry node to the exit node. Hence, the result, if any (if the addition step terminates), is again a Q-graph. That allows several Q-systems to be chained, each of them performing a specialized task, together forming a complex system. For example, TAUM-73 consisted of fifteen chained Q-systems.
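The two steps of a treatment can be sketched in Python as below. This is an illustrative simplification, not the algorithm of any actual implementation: arcs are `(src, dst, tree)` tuples, trees are nested tuples `(label, child, ...)`, and each rule is modelled as a hypothetical pair `(n, fn)` where `fn` receives the trees of `n` consecutive arcs and returns a non-empty list of new trees, or `None` if the rule does not apply.

```python
from itertools import count


def paths_of_length(arcs, n):
    """All sequences of n consecutive arcs."""
    seqs = [(a,) for a in arcs]
    for _ in range(n - 1):
        seqs = [s + (a,) for s in seqs for a in arcs if a[0] == s[-1][1]]
    return seqs


def reachable(arcs, start, forward=True):
    """Nodes reachable from `start`, following arcs forward or backward."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for src, dst, _ in arcs:
            a, b = (src, dst) if forward else (dst, src)
            if a == node and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def apply_treatment(arcs, entry, exit_, rules):
    arcs = set(arcs)
    used, done = set(), set()
    top = max((max(a[0], a[1]) for a in arcs), default=max(entry, exit_))
    fresh = count(top + 1)  # ids for intermediate nodes of added paths
    # Addition: apply every rule to every path until nothing new appears.
    # As in the text, this loop may never halt for some rule sets.
    changed = True
    while changed:
        changed = False
        for idx, (n, fn) in enumerate(rules):
            for seq in paths_of_length(arcs, n):
                if (idx, seq) in done:
                    continue
                done.add((idx, seq))
                out = fn(*[a[2] for a in seq])
                if out is None:
                    continue
                used.update(seq)
                nodes = [seq[0][0]] + [next(fresh) for _ in out[1:]] + [seq[-1][1]]
                for i, tree in enumerate(out):
                    arcs.add((nodes[i], nodes[i + 1], tree))
                changed = True
    # Cleaning: erase used arcs, then arcs off every entry-to-exit path.
    arcs -= used
    fwd = reachable(arcs, entry, forward=True)
    bwd = reachable(arcs, exit_, forward=False)
    return {a for a in arcs if a[0] in fwd and a[1] in bwd}


# Toy grammar (hypothetical), applied to the linear Q-graph of "the cat sleeps".
rules = [
    (1, lambda t: [("DET", t)] if t == ("the",) else None),
    (1, lambda t: [("N", t)] if t == ("cat",) else None),
    (1, lambda t: [("V", t)] if t == ("sleeps",) else None),
    (2, lambda a, b: [("NP", a, b)] if a[0] == "DET" and b[0] == "N" else None),
    (2, lambda a, b: [("S", a, b)] if a[0] == "NP" and b[0] == "V" else None),
]
sentence = [(0, 1, ("the",)), (1, 2, ("cat",)), (2, 3, ("sleeps",))]
analysis = apply_treatment(sentence, 0, 3, rules)
print(analysis)
```

In this toy run every word arc and every intermediate constituent arc is used by some rule, so cleaning leaves a single arc from entry to exit bearing the complete `S` tree. Because that result is again a Q-graph, it could be passed directly to a further treatment.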
An extension of the basic idea of Q-systems, namely replacing instantiation by unification (put simply, allowing "new" variables in the right-hand side of a rule, and replacing parametrized labelled trees by logical terms), led to Prolog, designed by Alain Colmerauer and Philippe Roussel in 1972. Refinements in the other direction (reducing non-determinism and introducing typed labels) by John Chandioux led to GramR, used for programming METEO from 1985 onward.
In 2009, Hong Thai Nguyen of GETALP,[1] Laboratoire d'Informatique de Grenoble,[2] reimplemented the Q-language in C, using ANTLR to compile the Q-systems and the Q-graphs, and an algorithm proposed by Christian Boitet, as none had been published and the sources of the previous Fortran implementation had been lost. In 2010–11, David Cattanéo corrected, completed and extended that implementation so that labels can use Unicode characters, not only the printable characters of the CDC 6600 used by the historical version.
References
[1] "Groupe d'Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole" (in French).
[2] "LIG" (in French).
Further reading
- Colmerauer, A: Les systèmes Q ou un formalisme pour analyser et synthétiser des phrases sur ordinateur. Mimeo, Montréal, 1969.
- Nguyen, H-T: Des systèmes de TA homogènes aux systèmes de TAO hétérogènes. Thesis, UJF, Grenoble, 2009.