Prev Up Next
Go backward to 4 Tool interoperability
Go up to Top
Go forward to 6 Encoding CASL into HOL

5 Parsing and static semantic analysis

Apart from having a relatively complex grammar, CASL has several features that cause some difficulties for parsing and static analysis:

  1. CASL's syntax allows user-defined mixfix syntax,
  2. CASL allows mutually recursive subsort definitions, causing loops within a naive subsorting analysis, and
  3. CASL allows overloading, and formulas which have a unique overload resolution up to semantical equivalence.

Concerning mixfix syntax, we separate parsing into two steps: The first pass of parsing can be done using standard technology (we use MLyacc), following a grammar provided in the CASL summary [CoF98]. An abstract syntax tree is produced, where formulas and terms (i.e. those parts of the specifications that may contain mixfix symbols) remain in their unparsed textual form.

Mixfix grouping analysis can be done only after a first phase of static semantic analysis has collected the operation and predicate symbols (among them the mixfix symbols). The CASL grammar is then extended dynamically according to the mixfix declarations, and formulas and terms are parsed with the generic Isabelle parser, which uses the well-known Cocke-Younger-Kasami algorithm for context-free recognition [HU79]. This grammar-parameterised algorithm has a complexity of O(n3), which is quite acceptable, since formulas and terms in CASL specifications are not that long (however, it would be much too slow to parse whole CASL specifications with this approach).

After having done the parsing of terms and formulas, those resulting parse trees are selected that are precedence correct with respect to the user-specified precedence relations. If more than one parse tree remains, the corresponding term or formula is ambiguous, and the possible disambiguations are output to the user. To obtain a concise output, not all pretty-printed forms of the parse trees are shown, but only the local places at which they actually differ.

The definition of precedence correctness follows the one of [Aas95], generalized to CASL's pre-order based precedences ([Aas95] uses number based precedences).

Concerning static semantic analysis, the treatment of subsorts and overload resolution needs a careful algorithmic design in order not to run into an exponential time trap. The details of this have already been worked out in [MKK98].


CoFI Note: T-10 -- Version: v1.0 -- 10 Dec 1999.
Comments to till@informatik.uni-bremen.de

Prev Up Next