[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Position of Comments and Annotations



Dear friends,
some time ago I wrote a longer message to Peter Mosses to which he
replied only now; just in time, I hope, for a discussion about these
issues at the Language Design or Tools meeting coming up.
Based on the exchange I wrote a kind of Study Note which I am including
below. As I agree with most of Peter's remarks, you can consider this as
a joint proposal by me and him; it is based on an original proposal by 
Till Mossakowski who suggested two kinds of annotations: bracketed and unbracketed.

The proposal differs from the present solutions. Note that the issue is
really underspecified / unresolved at present and needs a solution for
tool builders. Of course, any change we make does not affect the
semantics (and hardly the syntax). 

If we adopt these (or other) changes, we should move to v1.0.1 of
CASL. 
After the discussion at ETAPS, a definite proposal will be made with a
3 weeks discussion period.

best regards
BKB

------  Summary of proposed changes ------------------------------------

						present					future
- end-of-line comment	%%...eol				%%...eol
- bracketed comment		<none>					%{...maybe 
													 several lines}%

- commented-out text	<none>					%[text, not displayed]%

- general annotation	<non-uniform>			%annoId ...eol
- bracketed annotation	<none>					%annoId(...)%

- label annotation		%[labelId] text eol		text %(labelId)% eol

- display annotation old	
		op div: Int * Int -> Int %!<latex>$\div$<html>&divide;%!

- display annotation new [independent of declaration, as for %prec etc.]	
		%display(div %latex \div 
					 %html &divide;)%


------  Proposal for (changed) syntax of annotations and comments ------

The Problem
-----------
The problem to solve is to have a well-defined Position of Comments and
Annotations for parsing, but more so for tools that handle the
specification later, e.g. by printing it in a pretty way, by adding
information from tools (annotations), by transforming it (e.g. in a
program transformation system), in general by possibly moving pieces of
"text" around. Such tools should always be working on the abstract
syntax trees and not themselves have to worry about parsing or display
of specs.

Thus the following questions arise: 
- what node of the abstract syntax tree is a comment or annotation
  associated to?
- where are comments and annotations allowed to occur syntactically?
- how does a parser or reader recognize their start and finish?

Motivation
----------
My [BKB's] motivation is to have clear and simple rules to be easily
understood by users, but also to make life for tool builders easy.

My general assumptions are 
(and these may prove to be oversimplified or simply wrong): 

(1) There should be a general, easily-parsable syntax for
annotations/comments, preferably allowing new annotations to be added
without changes to parsers
(2) There should be as few positions allowed as possible. [see
discussion below]

General rules
=============
Let us treat comments as a special kind of annotation below.

- An annotation starts with "%<a>" where "<a>" is a specific token,
  followed by a space character. It is terminated by the end of a line.
  [this means that %left assoc or %implies must be last on the line] 
  Examples are:

  o comment :                  "%% ..."
  o precedence annotation:     "%prec ..."
  o associativity annotation:  "%left assoc"
  o number annotation:         "%number ..."

  To make things uniform, the token should always be an identifier 
  (except for comments); thus

  o display annotation:        "%display ..."  [instead of "%! ..." ]

- A bracketed annotation may run over several lines; 
  it is terminated by the ending bracket:

  o "%{     .. }%" : comment             
  o "%[     .. ]%" : comment-out (in input, but never displayed!)      
  o "%(     .. )%" : label annotation
  o "%<id>( .. )%" : general annotation  [no space required before ( ]

  Bracketed annotations are important for long annotations that do not
  fit on a line, e.g. generated by a tool.

  Note that nesting is not required (this relegates the analysis to regular
  lexical analysis), except that commenting-out may enclose the others.

General positions of annotations
--------------------------------
A commnt-out may appear anywhere (text is deleted at the lexical level
and does not appear in the abstract syntax tree).

In Basic Specifications, an annotation may appear by itself at the top
of a list of BASIC-ITEMS: it then annotates the BASIC-SPEC (i.e. it
applies to the whole list). 

An annotation may appear at the end of an ALTERNATIVE in a
DATATAYPE-DECL; then it applies to this preceding ALTERNATIVE; if it is
placed at the end of a DATATAYPE-DECL, after the ";", it applies to the
whole DATATAYPE-DECL.
An annotation after a SORT-ITEM, OP-ITEM or PRED-ITEM applies to this
preceding ITEM, analogously for an AXIOM.

In Structured Specifications, an annotation may appear by itself at the
beginning of a SPEC (e.g. at the top of a list of BASIC-ITEMS, see
above): it then annotates the whole SPEC.

[Eventually, one would like to formulate the above in a more
syntax-oriented way.]

Display Annotations
-------------------
Display annotations should be divorced from declarations altogether, in
the same way as for syntax annotations (such as %prec). Anyway, a
display annotation applies to all occurrences of the token in the
scope. Example:

    %display( div <latex> \div
                  <html>  &divide;
                  <rtf>   somethingelsethatisquitelong )%

It would be even better to also use the %-style inside, thus

    %display( div %latex \div
                  %html &divide;
                  %rtf somethingelsethatisquitelong )%

Can the "formatting instructions" inside comments 
(previously e.g. <CASL>...</CASL>) now also use an annotation style,
nested inside comments? e.g. %casl(...)%

Discussion
==========

Label Annotations
-----------------
[A label annotation now FOLLOWS an axiom. This is very much intended
since it probably solves the nasty formatting problem in TEX (labels
could be right-aligned at the end of the line). Also: in mathematics,
labels for equations etc. usually also FOLLOW on the same line in
parentheses. Consider the examples below:
  
	vars x, y, z : Elem
	 .  x <= x                              %(reflexivity)%
	 .  x <= y /\ y <= z => x <= z         %(transitivity)%

BTW: why several labels in a label annotation, i.e. a "syntax" inside?
This complicates matters unnecessarily and is only rarely used. Use
several label annotations instead (syntactically)]

Comments
--------
[BKB wonders whether the comment with a ":" at the end is really
necessary and whether anybody has been able to implement it
successfully. There is a general ambiguity here: should it apply to the
subsequent item or the subsequent list? Is it worth the trouble? The
examples in the BasicDatatypes to date reveal, that it would have been
better to make general comments about a spec inside it (i.e. after the
"=" on the rhs) rather than preceding it in a rather forlorn position in
the library.
PM is not so sure.]

Position of Annotations
-----------------------
[PM: It is difficult to foresee exactly which constructs may need to be
annotated or commented upon.  We could start with few positions, and
add them as needed, but then parsers would need to be changed each
time a new one is added.  We could start with a longer list, and hope
that it will be adequate for a long time.  Or we could adopt the only
uniform rule, which is to allow them to follow ANY construct...]

[BKB: The more positions are allowed, the more complicated it becomes to
remember how to associate them, and more importantly, to move them
around in a meaningful way in transformations. If you think annotations
on TERMS are required (I wonder seriously whether this is necessary)
(not preferred):
Analogously, an annotation applies to the preceding complete (sub)TERM
or (sub)FORMULA" (e.g. to the preceding complete TERM or FORMULA in (the
branch of) a CONDITIONAL), thus to a complete AXIOM (note that
parentheses may be used). 

If you think this useful (not preferred):
An annotation may appear after a SYMB (but only the second one after
the 
"|->" in a SYMB-MAP) and annotates this SYMB or the preceding RENAMING
or RESTRICTION, resp.
This is useful for syntactic annotations, but not strictly necessary as
re-declarations can be made: perhaps that is clearer]

[BKB: I would also do away with comments before LIB-ITEMs (pertaining
to the following one); these are a little forlorn in the library; put
them inside the spec. If you want a comment pertaining to a whole
library, it should be at the head of a list of LIB-ITEMS, just as for
BASIC-ITEMS (see above)]

------ Some of Peter's comments and BKB's reply -------------------

[PM]
> I think it is helpful to make a sharp distinction between parsing and
> display tools, and other tools.  The former must indeed be able to
> deal with all the annotations, whereas the latter should work directly
> on the interchange format and thus need only look for or insert those
> annotations with which they are themselves concerned.
> 
> Your proposal seems to be concerned only with making life easier for
> builders of parsers and displayers.  That is still worthwhile, but
> please avoid giving the impression that those building other tools
> need worry about the issues - except of course that the possible
> placement of annotations in trees should be sufficiently general for
> their purposes (or extensible in a straightforward way).
> 
[BKB]
You are right, of course, except that all tools displaying information
have to be concerned with generated texts, and trafo tools (etc.) too;
there the position in the abstract syntax tree matters, if the generated
annotation should persist as a displayed text (an example is the
information for rewriting, e.g. an order that has been interactively
input with a tool but that should persist in the text from then on).
Those tools that only generate internal annotations (never to be
displayed) are of no real concernd here. There seems to be an additional
category of tools, e.g. those that display certain information on demand
(but not to be carried along with the text), e.g. the type of a term or
the list of all visible tiems from a spec, or so. These seem not to
matter either, except that they have to interact with display tools somehow.

------ Till's comments -------------------

Let me just mention some other questions concerning annotations:
- Which annotations should be translated along renamings and the like?
   (Currently, precedence and associativity annotations are translated.
     What about display annotations? Should the same display be kept
     for the renamed symbol, or should the annotation vanish?)
- Should the LaTeX display annotations be interpreted in math mode
  or not? (On p. C-14, one example contains $...$, another one
  does not). I wouls suggest to omit the $...$.
  
These questions are relevant for the implementation of the static checker.