[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Literal Syntax

To: cofi-language@brics.dk
Subject: Re: Literal Syntax
From: Markus Roggenbach <roba@informatik.uni-bremen.de>
Date: Wed, 7 Jul 1999 17:23:06 +0200 (MET DST)
Delivered-To: cofi-language-archive@daimi.au.dk
Reply-To: cofi-language@brics.dk
Sender: owner-cofi-language@brics.dk
Dear Peter,

many thanks for your detailed comments to our proposal.

> > may I remind you that there is a proposal for the concrete syntax
> > of literals (numbers, string, lists, ...) in CASL, available under
> >
> >
>
http://www.informatik.uni-bremen.de/~cofi/LiteralExtension/literal.[tex,dvi,ps,ps.gz]

>
> The title, date, and label of the above document are the same as for
> the original Summary-Changes document, which is a bit confusing.

I have changed the subtitle now.

> The full text of the proposal for literals is reproduced below, to
> facilitate citing parts of it in any further e-mail discussion.
>
> > The deadline for comments is (as for the list of changes to the
summary)
> >
> >         Monday, July the 5th
> >
> > So far we have not received any comments, but we would like
> > to have a discussion on literal syntax: it is not likely
> > that the our proposal cannot be improved.
>
> Sorry for the lateness of this response...
>
> In summary: I'm sympathetic towards most of the proposed syntax for
> literals themselves, but I'd much prefer to use annotations (rather
> than attributes and new forms of datatype declarations) for linking
> the literal syntax to declarations of operations.

OK, this would be more uniform with the other syntax annotations
(precedences and associativities). However, note that the syntax
annotations have an impact on the way how concrete syntax is
translated to abstract syntax, and thus also indirectly
influence the semantics of a given piece of concrete syntax.
A similar argument holds for the semantic annotations,
which can generate proof obligations, like e.g. instantiations
of generic specifications also do. Dieter Hutter told me that he
found this confusing and would prefer more uniformity.
This would mean to turn the semantic and parsing annotations into
constructs of the language, while annotations are used
for more special things like simplicifation orderings.
Anyway, I agree that the literal syntax declarations
should have the same status as the parsing annotations
(regardless whether they are language constructs or annotations).


> In more detail (my comments are delimited by solid lines like this):
>
________________________________________________________________________

>
>                                      CASL
>                    The Common Algebraic Specification Language
>                                      Summary
>                       -- DRAFT LIST OF CHANGES: LITERALS --
>
>                                by Till Mossakowski
>
>                                     May 1999
>
>   Changes
>
>   App. C.2.1, p. C-4 and p. C-6
>
>   Add:
>
>             TERM   ::= ... | LITERAL
>
>             TOKEN  ::= ... | DIGIT | ""
>
>             NUMBER ::= DIGIT | NNUMBER
>
>   App. C.4
>
>   Replace:
>
>        A WORD consists of a sequence of upper- and/or lower- case
letters
>        (A,...,Z and all the ISO Latin-1 national and accented letters
>        except for the Icelandic `eth' and `thorn'), digits (0,...,9),
and
>        primes (').
>
>   by:
>
>        A WORD consists of a sequence of upper- and/or lower- case
letters
>        (A,...,Z and all the ISO Latin-1 national and accented letters
>        except for the Icelandic `eth' and `thorn'), digits (0,...,9),
and
>        primes ('). It may not entirely consist of digits, nor may it
>        contain an `E' preceding or following a digit. This special
syntax
>        allows to use the binary operation E within floating point
numbers
>        without the need to insert spaces between the E and the digits.

>
________________________________________________________________________

>
> I agree that it would be nice to allow standard notation like 2.7E6 to

> be written without spaces.  But I don't see that this requires banning

> the use of `E' following or preceding a digit in all words.  My
> suggestion is to replace the last two sentences above by:
>
>        It may not start with a digit (various sequences of characters
>        starting with a digit are reserved for literal numbers).
>
> This should simplify the grammar for WORD below considerably.  Perhaps

> NNUMBER can then be eliminated altogether too, with single digits
> being reserved for literal numbers?
>
> One may then add something like:
>
>             FLOATING       ::= NUMBER . NUMBER 'E' OPT-SIGN NUMBER
>             OPT-SIGN       ::= + | - |
>
> Use of this notation would require that an operation for E has been
> declared.  (Without the above addition of FLOATING, 2.7E6 would have
> to be written as `2.7E 6' or `2.7E+6'.)

That's why I wanted to treat 'E' as single token, used for a binary
infix operation.
But I fully agree with your propsal, it is much simpler.

>
> It might be good to cater also for binary, octal, and hexadecimal
> notation for numbers.  Isn't there a rather standard notation for
> indicating the base, by prefixing a sequence of digits by `2#', `8#',
> or `16#'?  It should be easy to add this (now or in some later
> extension), provided that words are not allowed to start with a digit;

> otherwise, one might need to remove further patterns of letters and
> digits from WORD...
OK. In C, hexadecimal numbers are prefixed with 0x.

 ________________________________________________________________________

>
>
>        This is specified by the grammar:
>
>             WORD ::= E NON-DIGIT-WORD
>                    | NON-E-WORD
>
>             NON-E-WORD ::= NON-DIGIT-E-WORD
>                          | DIGIT NON-E-WORD
>
>             NON-DIGIT-E-WORD ::= NON-E-LETTER NNUMBER
>                                | NON-E-LETTER DIGIT
>                                | NON-E-LETTER WORD
>                                | NON-E-LETTER
>
>             NON-DIGIT-WORD ::= NON-DIGIT-E-WORD
>                              | E NON-DIGIT-WORD
>
>
>   Furthermore, add:
>
>             FRACTION       ::= NUMBER . NUMBER
>             NNUMBER ::= DIGIT DIGIT | DIGIT NNUMBER
>             NUMBER ::= DIGIT | NNUMBER
>
>             CHAR ::= " " | ! | '\"' | # | $ | ... | '\'' | ... | '\\'
| ...
>                    | \n | \t | \r | ...
>                    | \000 ... | \255
>             QUOTED-CHAR ::= ' CHAR '
>             STRING ::= '"' CLOSE-STRING
>             CLOSE-STRING ::= CHAR '"'
>                            | CHAR CLOSE-STRING
>
>        Only WORD, FRACTION, NNUMBER, QUOTED-CHAR, STRING are the
>        non-terminals of the lexical syntax. The other non- terminals
are
>        just auxiliary.
>
>        (N.B. There is a difficulty in using double quotation marks in
the
>        grammar: sometimes, they have to stand for themselves. In this
>        case, I enclosed them into single quotes. Till)
>
>   The following section is meant to be added as a separate section to
>   appendix C.
>
>
>   SYNTAX FOR LITERALS
>
>   In this section, several attributes for operations are introduced
that can
>   be used to provide a literal syntax for numbers, strings and lists.
>
>
>   Literal syntax for numbers
>
>        LITERAL ::= NNUMBER
>
________________________________________________________________________

>
> Each single digit could be reserved too - or is it important to allow
> the use of the constants `0' and `1' in non-numerical specifications?

Exactly, 0 and 1 could, for example, be used in

free type  bit ::= 0 | 1

>
________________________________________________________________________

>
>        OP-ATTR ::= ... | concatdigits
>
>   The attribute for declaring an operation to be used for
concatenation of
>   digits within a number is written `concatdigits'. Only one binary
operation
>   f within a specification SPEC is allowed to have this attribute,
otherwise,
>   the specification is ill-formed.
>
>   The attribute has the effect that an NNUMBER of form d1 ...dn (where

> n>1 and
>   each di is a DIGIT) is translated to the (abstract syntax of) the
term
>   f(f(...f(t_1,t_2) ...,t_n-1),t_n) where ti is the abstract syntax
tree for
>   di.
>
>   Vice versa, an abstract syntax tree corresponding to a term of the
above
>   form which is maximal (i.e., it is not a subterm of a larger term of
the
>   same form) is expected to be printed as d1 ...dn.
>
________________________________________________________________________

>
> It seems that the proposed attribute cannot be merely part of concrete

> syntax, but must be added to the abstract syntax of CASL too.  Thus
> corresponding changes would be needed in Apps. A and B, and in the
> text of the body of the summary.  This should perhaps have been stated

> explicitly in the proposal.

Yes.

>
> An alternative might be to regard the proposed attribute as an
> ANNOTATION.  Parsers and formatters could take account of it -
> complaining if they found more than one such annotation - and there
> would be no need to let it affect well-formedness: the abstract syntax

> tree produced by the above translation would determine the entire
> semantics of the specification.
>
> If there are vital reasons for preferring attributes to annotations
> here, kindly explain them.

See above.

> Similar objections apply to the other parts of the proposal below.
>
________________________________________________________________________

>
>   If there is no operation with a `concatdigits' attribute, then an
> NNUMBER is
>   not recognized as a well-formed LITERAL.
>
>        LITERAL ::= ... | FRACTION
>        OP-ATTR ::= ... | decimalpoint
>
>   The attribute for declaring an operation to be used for evaluating
the
>   decimal point within a sequence of digits within a number is written

>   `decimalpoint'. Only one binary operation f within a specification
> SPEC is
>   allowed to have this attribute, otherwise, the specification is
ill-formed.
>
>   The attribute has the effect that a FRACTION of form n1.n2 (where
each
> ni is
>   a NUMBER) is translated to the (abstract syntax of) the term
f(t_1,t_2)
>   where ti is the abstract syntax tree for ni, i=1,2.
>
>   Vice versa, an abstract syntax tree corresponding to a term of the
above
>   form which is maximal (i.e., it is not a subterm of a larger term of
the
>   same form) is expected to be printed as n1.n2.
>
>   If there is no operation with a `decimalpoint' attribute, then a
> FRACTION is
>   not recognized as a well-formed LITERAL.
>
>   Due to the special status of the character E in the lexical syntax
of WORDS,
>   it is possible without introducing further syntax to write
> exponentations of
>   form FRACTION E NUMBER in the case that there is a binary infix
> operation E
>   in the signature.
>
________________________________________________________________________

>
> The point above is that you are wanting to avoid the need for spaces
> around the E.
>
> If E is also used as a variable (as in E=mc^2 :-), the declaration of
> E as an infix operation may give ambiguities in the presence of an
> invisible `__ __' operation.  Since many users may want to import the
> basic datatype for numbers, one should be careful to avoid making
> assumptions about their general style of notation.
>
> If there is no special status of E in the lexical syntax for words,
> one could identify the binary operation to use to interpret E in
> FLOATING (see above) in the same way as for the dot in FRACTION.
> This would give extra uniformity to the proposal.

OK.

>
________________________________________________________________________

>
>   Literal syntax for strings
>
>        LITERAL ::= ... | STRING
>        OP-ATTR ::= ... | concatchars
>
>   The attribute for declaring an operation to be used for
concatenation of
>   digits within a number is written `concatchars'. Only one binary
> operation f
>   within a specification SPEC is allowed to have this attribute,
otherwise,
>   the specification is ill-formed.
>
>   The attribute has the effect that an STRING of form "c1 ...cn"
(where n>1
>   and each ci is a CHAR) is translated to the (abstract syntax of) the
term
>   f(t_1,f(t_2, ...f(t_n,t)...)) where ti is the abstract syntax tree
for ci,
>   and t is the abstract syntax tree for "".
>
________________________________________________________________________

>
> Is "" supposed to be declared as a constant?  It would seem more
> uniform to let the declared constant have an arbitrary name, and
> identify it in the same way as the concatenation operation.

OK.

>
________________________________________________________________________

>
>   Vice versa, an abstract syntax tree corresponding to a term of the
above
>   form which is maximal (i.e., it is not a subterm of a larger term of
the
>   same form) is expected to be printed as "c1 ...cn".
>
>   If there is no operation with a `concatchars' attribute, then a
STRING is
>   not recognized as a well-formed LITERAL.
>
>
>   Literal syntax for lists
>
>        DATATYPE-DECL   ::= SORT "::=" ALTERNATIVE "|"..."|"
>                                       ALTERNATIVE , LIST-BRACKETS
>        LIST-BRACKETS ::= brackets SIGNS-BACKETS PLACE SIGNS-BRACKETS
>
>   The attribute for declaring a list-like syntax for a datatype is
written
>   `brackets b1 __ b2'. Thus enclosing datatype declaration declaring
the sort
>   s must consists of excatly two ALTERNATIVEs: the first has to be a
constant
>   c of arbitrary argument type, while the second is a constructor f
with
>   exactly two argument sorts, the second of which has to be s.
>
________________________________________________________________________

>
> Although the above proposal allows the introduction of neat notation
> for literals of various sorts of collections (e.g. {e1,...,en} for
> sets, as well as [e1,...,en] for lists), I find the means used to
> determine the translation quite ugly.  In particular, it seems strange

> to REQUIRE the use of a DATATYPE-DECL; what about sub-languages that
> don't support this construct?
>
> Couldn't one instead use annotations for indicating that any outfix
> operation `b1 __ b2' is a unary bracketing constructor, determining at

> the same time the intended `nil' and `cons' (or `conc') operations
> used in translating non-unary applications?  How the latter
> constructors have been declared is then irrelevant.

OK, Bernd proposed a similar thing.
This would mean something like

free type List[Elem] ::= nil | cons (Elem; List[Elem])
op [__] : Elem -> List[Elem] %bracket-ops nil, cons


>
________________________________________________________________________

>
>   The attribute leads to an extension of the syntax for LITERALs:
>
>        LITERAL ::= ... | b1 b2
>                        | b1 TERM , ... , TERM b2
>
>   A list of form b1 t1,...,tn b2 is (where n>=0 and each ti is a TERM)
is
>   translated to the (abstract syntax of) the term f(u_1,f(u_2,
>   ...f(u_n,d)...)) where ui is the abstract syntax tree for ci, and d
is the
>   abstract syntax tree for c.
>
>   Vice versa, an abstract syntax tree corresponding to a term of the
above
>   form which is maximal (i.e., it is not a subterm of a larger term of
the
>   same form) is expected to be printed as "b1 t1,...,tn b2".
>
------------------------------------------------------------------------

>
> I hope that the above may help you clarify your proposal.  I'm
> intending to finalize the new release of the Summary by this coming
> weekend, and announce it early next week.  I'll ask Bernd to decide
> whether your (perhaps revised) proposal for literals should be
> included in the new release or not.  In the meantime, I look forward
> to your reactions to my comments.

I will try to revise the proposal on Friday.

Greetings,
Till

--
--------------------------------------------------------------------------
Markus Roggenbach                Phone +49-421-218-4683
Dept. of Computer Science        Fax +49-421-218-3054
University of Bremen             roba@informatik.uni-bremen.de
P.O.Box 330440, D-28334 Bremen   http://www.informatik.uni-bremen.de/~roba
--------------------------------------------------------------------------
Prev by Date: Re: Literal syntax
Next by Date: Re: Literal Syntax
Prev by thread: Re: Literal syntax
Next by thread: Re: Literal Syntax
Index(es):
- Date
- Thread