Definite Clause Grammar Translation



Definite clause grammar processing is based on difference lists. Difference lists are used to avoid incessantly appending lists. For instance with a production A --> BC, to test if a list is derived from A, we must test if it is the concatenation of a list derived from B with a list derived from C. This would mean every derivation step requires a concatenation (several for longer right-hand sides), typically of long lists. With difference list processing, we require no list concatenation at all!


A difference list in Prolog represents a list by using two lists. The idea is, for instance, to use lists [1,2 | Xs] and Xs to represent the list [1,2] = [1,2 | Xs] - Xs, the “difference”. And Xs can be any other list allowing it to change dynamically. The advantage is in concatenating difference lists since to append difference lists As-Bs and Bs-Cs, the result is simply As-Cs -- no concatenation, just another pairing.


When a Prolog program is consulted, a DCG rule such as

p(X, Y) --> q(X), r(X, Y), s(Y).

is translated into the normal Prolog clause

p(X, Y, L0, L) :- q(X, L0, L1), r(X, Y, L1, L2), s(Y, L2, L).


Supplied parameters (which are optional) X and Y are retained, and two additional difference list parameters are added at the end. Thus the interpretation is that p(X,Y) derives the initial portion of list L0 leaving L -- that is, p(X,Y) derives L0-L. The above clause is read as: list L0-L is derived by p(X,Y) if q(X) derives list L0 leaving L1, r(X,Y) derives L1 leaving L2, and s(Y) derives L2 leaving L. Then a query to check if some specific list As is derived using parameters 'a' and 'b' is entered as

?- p(a,b,As,[ ]).


Deriving literals is translated using the built-in predicate 'C'(L1, X, L2), read as “'C' derives list L1 leaving list L2 if L1 = [X | L2]”, and processing the single token X. It is defined (only internally) by the single clause

'C'([X|S], X, S).

The 'C' predicate is motivated by a rare technical situation but is used for all terminals to replace the use of an append operation;  it has been given the name upper-case C to avoid accidentally clashing with any ordinary or internal Prolog predicate name.


So, for instance, DCG

cmds(X) --> [go,to], label(X), [stop].

label(X) --> [X].

is translated as

cmds(X, S0, S) :- 'C'(S0, go, S1), 'C'(S1, to, S2),

                   label(X, S2, S3), 'C'(S3, stop, S).

label(X, S0, S) :- 'C'(S0,X,S).

and a corresponding query that succeeds is

?- cmds(L,[go,to,abc,stop],[ ]).

yes L = abc