Using a Prolog DCG to split a string

I'm trying to use a DCG to split a string into two parts separated by spaces. For example, 'abc def' should give me "abc" & "def". The program & DCG are below.

main:-
    prompt(_, ''),
    repeat,
    read_line_to_codes(current_input, Codes),
    (
        Codes = end_of_file
    ->
        true
    ;
        processData(Codes),
        fail
    ).

processData(Codes):-
    (
        phrase(data(Part1, Part2), Codes)
    ->
        format('~s, ~s\n', [ Part1, Part2 ])
    ;
        format('Didn''t recognize data.\n')
    ).

data([ P1 | Part1 ], [ P2 | Part2 ]) --> [ P1 | Part1 ], spaces(_), [ P2 | Part2 ].
spaces([ S | S1 ]) --> [ S ], { code_type(S, space) }, (spaces(S1); "").

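This works. But I find that having to type [ P1 | Part1 ] & [ P2 | Part2 ] is really verbose. So I tried replacing [ P1 | Part1 ] with Part1 and, similarly, [ P2 | Part2 ] with Part2 in the definition of data, i.e. the following.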

data(Part1, Part2) --> Part1, spaces(_), Part2.

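This is easier to type, but it gives me an "Arguments are not sufficiently instantiated" error. So it looks like an unbound variable isn't automatically interpreted as a list of codes in a DCG. Is there any other way to make this less verbose? My intent is to use DCGs where I would use regular expressions in other programming languages.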

Best answer

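Your intuition is correct. The term-expansion procedure for DCGs (at least in SWI-Prolog, but it should apply to other systems) translates your modified version of data into the following: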

?- listing(data). 

data(A, D, B, F) :-
    phrase(A, B, C),
    spaces(_, C, E),
    phrase(D, E, F).

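As you can see, the variable parts Part1 and Part2 of the DCG rule have been interpreted as further calls to phrase/3, not as lists. Since phrase/3 needs its first argument to be bound, calling it with an unbound Part1 raises the instantiation error you saw. You need to explicitly specify that they are lists for them to be treated as such.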
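One way to say "this is a list" without spelling out [ P1 | Part1 ] each time is to wrap the variable in a small helper nonterminal. The following is only a minimal sketch: the name code_list//1 is made up for this illustration, and spaces//1 is a slightly restated version of the rule from the question.

```prolog
% code_list(Cs) describes exactly the codes in Cs.
% Hypothetical helper name; it is what lets Cs be unbound at call time.
code_list([]) --> [].
code_list([C|Cs]) --> [C], code_list(Cs).

% One or more whitespace codes, as in the question.
spaces([S|Ss]) --> [S], { code_type(S, space) }, ( spaces(Ss) ; { Ss = [] } ).

% Both parts must be non-empty, which mirrors the [P|Ps] patterns of the
% original rule.
data(Part1, Part2) -->
    code_list(Part1), { Part1 = [_|_] },
    spaces(_),
    code_list(Part2), { Part2 = [_|_] }.
```

With this loaded, processData/1 from the question can stay unchanged. Note that code_list//1 places no restriction on the codes it accepts, so for input with more than two words the first solution splits at the first run of spaces and leaves the rest, spaces included, in Part2.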

I can suggest an alternative version that is more general. Consider the following set of DCG rules:

data([A|As]) --> 
    spaces(_), 
    chars([X|Xs]), 
    {atom_codes(A, [X|Xs])}, 
    spaces(_), 
    data(As).
data([]) --> [].

chars([X|Xs]) --> char(X), !, chars(Xs).
chars([]) --> [].

spaces([X|Xs]) --> space(X), !, spaces(Xs).
spaces([]) --> [].

space(X) --> [X], {code_type(X, space)}. 
char(X) --> [X], {\+ code_type(X, space)}.

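Look at the first clause at the top: the data rule now attempts to match zero or more spaces (as many as possible, because of the cut), then one or more non-space characters to construct an atom (A) from the codes, then zero or more spaces again, and then recurses to find more atoms in the string (As). What you end up with is a list of the atoms that appeared in the input string, without any spaces. You can incorporate this version into your code as follows: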

processData(Codes) :-
    % convert the list of codes to a list of atoms, one per word
    (phrase(data(AtomList), Codes) ->
        % concatenate the atoms into a single one delimited by commas
        % (atomic_list_concat/3 is the current name for this in SWI-Prolog)
        concat_atom(AtomList, ', ', Atoms),
        write_ln(Atoms)
    ;
        format('Didn''t recognize data.\n')
    ).

This version splits the string on any number of spaces between words, even if spaces also appear at the start and end of the string.
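To try that behaviour at the toplevel, the grammar can be loaded together with a small wrapper. This is a sketch under assumptions: the grammar is repeated from above so the file loads on its own, and split_words/2 is a hypothetical helper name that is not part of the answer.

```prolog
% Grammar from the answer, repeated so this example is self-contained.
data([A|As]) -->
    spaces(_),
    chars([X|Xs]),
    { atom_codes(A, [X|Xs]) },
    spaces(_),
    data(As).
data([]) --> [].

chars([X|Xs]) --> char(X), !, chars(Xs).
chars([]) --> [].

spaces([X|Xs]) --> space(X), !, spaces(Xs).
spaces([]) --> [].

space(X) --> [X], { code_type(X, space) }.
char(X)  --> [X], { \+ code_type(X, space) }.

% split_words(+Text, -Atoms): hypothetical wrapper that turns an atom
% into its list of whitespace-separated words.
split_words(Text, Atoms) :-
    atom_codes(Text, Codes),
    phrase(data(Atoms), Codes).
```

A query such as `?- split_words('  abc   def ', Words).` should bind Words to [abc, def], with the leading, repeated, and trailing spaces all discarded.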
