GFCC Syntax 


==Syntax of GFCC files==

The syntax accepted by the parser is very simple. It is defined by the
following grammar, written in the labelled BNF notation of the BNF Converter:
```
  Grm. Grammar ::= [RExp] ;

  App.  RExp ::= "(" CId [RExp] ")" ;
  AId.  RExp ::= CId ;
  AInt. RExp ::= Integer ;
  AStr. RExp ::= String ;
  AFlt. RExp ::= Double ;
  AMet. RExp ::= "?" ;

  terminator RExp "" ;

  token CId (('_' | letter) (letter | digit | '\'' | '_')*) ;
```
While a parser and a printer can be generated for many languages
from this grammar by using the BNF Converter, a parser is also
easy to write by hand using recursive descent.
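
As an illustration of the hand-written alternative, here is a minimal
recursive-descent sketch in Haskell. It is not the parser used by GF.
The ``Tok`` type and the functions ``tokens``, ``rexp``, ``rexps``, and
``parseGrammar`` are hypothetical names introduced for this example.
The lexer is deliberately simplified: it handles no escape sequences in strings
and no exponents or signs in numbers, and ``isAlpha`` accepts more letters than
the BNF Converter's ``letter``.
```
import Data.Char (isAlpha, isAlphaNum, isDigit, isSpace)

-- Raw expressions; constructor names follow the labels of the BNF rules.
data RExp
  = App String [RExp]
  | AId String
  | AInt Integer
  | AStr String
  | AFlt Double
  | AMet
  deriving (Eq, Show)

-- Hypothetical token type (an assumption, not part of the GFCC definition).
data Tok = TOpen | TClose | TMeta
         | TId String | TInt Integer | TFlt Double | TStr String
  deriving (Eq, Show)

-- Simplified lexer; returns Nothing on an unexpected character
-- or an unterminated string.
tokens :: String -> Maybe [Tok]
tokens [] = Just []
tokens (c:cs)
  | isSpace c = tokens cs
  | c == '('  = (TOpen :)  <$> tokens cs
  | c == ')'  = (TClose :) <$> tokens cs
  | c == '?'  = (TMeta :)  <$> tokens cs
  | c == '"'  = case break (== '"') cs of
      (s, _ : rest) -> (TStr s :) <$> tokens rest
      _             -> Nothing
  | isDigit c = case span isDigit (c:cs) of
      (ds, '.' : r@(d : _)) | isDigit d ->
        let (fs, rest) = span isDigit r
        in (TFlt (read (ds ++ "." ++ fs)) :) <$> tokens rest
      (ds, rest) -> (TInt (read ds) :) <$> tokens rest
  | c == '_' || isAlpha c =
      let (s, rest) = span (\x -> isAlphaNum x || x == '_' || x == '\'') (c:cs)
      in (TId s :) <$> tokens rest
  | otherwise = Nothing

-- One function per nonterminal: each consumes a prefix of the token list
-- and returns the parsed value together with the remaining tokens.
rexp :: [Tok] -> Maybe (RExp, [Tok])
rexp (TOpen : TId f : ts) = do
  (args, rest) <- rexps ts
  case rest of
    TClose : rest' -> Just (App f args, rest')
    _              -> Nothing            -- missing ")"
rexp (TId x  : ts) = Just (AId x, ts)
rexp (TInt n : ts) = Just (AInt n, ts)
rexp (TStr s : ts) = Just (AStr s, ts)
rexp (TFlt d : ts) = Just (AFlt d, ts)
rexp (TMeta  : ts) = Just (AMet, ts)
rexp _             = Nothing

-- [RExp]: zero or more expressions, stopping at ")" or at the end of input.
rexps :: [Tok] -> Maybe ([RExp], [Tok])
rexps ts = case ts of
  []         -> Just ([], ts)
  TClose : _ -> Just ([], ts)
  _          -> do
    (e, rest)   <- rexp ts
    (es, rest') <- rexps rest
    Just (e : es, rest')

-- Grammar ::= [RExp]; the whole input must be consumed.
parseGrammar :: String -> Maybe [RExp]
parseGrammar s = do
  ts <- tokens s
  (es, rest) <- rexps ts
  if null rest then Just es else Nothing
```
For example, ``parseGrammar "(f 1 2.5 ?)"`` yields
``Just [App "f" [AInt 1,AFlt 2.5,AMet]]``, while an unbalanced input
such as ``"(f"`` yields ``Nothing``.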


==Syntax of well-formed GFCC code==

The following summary lists the well-formed constructions,
with a comment on the semantics of each.
```
  Grammar ::= 
    "(" "grammar" CId CId* ")"   -- abstract syntax name and concrete syntax names
    "(" "flags"    Flag* ")"     -- global and abstract flags
    "(" "abstract" Abstract ")"  -- abstract syntax
    "(" "concrete" Concrete* ")" -- concrete syntaxes

  Abstract ::= 
    "(" "fun"   FunDef* ")"      -- function definitions
    "(" "cat"   CatDef* ")"      -- category definitions

  Concrete ::= 
    "(" CId                      -- language name
      "flags"     Flag*          -- concrete flags
      "lin"       LinDef*        -- linearization rules
      "oper"      LinDef*        -- operations (macros)
      "lincat"    LinDef*        -- linearization type definitions
      "lindef"    LinDef*        -- linearization default definitions
      "printname" LinDef*        -- printname definitions
      "param"     LinDef*        -- lincats with labels and parameter value names
    ")" 

  Flag   ::= "(" CId String ")"   -- flag and value
  FunDef ::= "(" CId Type Exp ")" -- function, type, and definition
  CatDef ::= "(" CId Hypo* ")"    -- category and context
  LinDef ::= "(" CId Term ")"     -- function and definition

  Type ::= 
    "(" CId                 -- value category
      "(" "H" Hypo* ")"     --   argument context
      "(" "X" Exp* ")" ")"  --   arguments (of dependent value type)

  Exp ::=
     "(" CId                -- function
       "(" "B" CId* ")"     --   bindings
       "(" "X" Exp* ")" ")" --   arguments
   | CId                    -- variable
   | "?"                    -- metavariable
   | "(" "Eq" Equation* ")" -- group of pattern equations
   | Integer                -- integer literal (non-negative)
   | Float                  -- floating-point literal (non-negative)
   | String                 -- string literal (in double quotes)

  Hypo ::= "(" CId Type ")" -- variable and type

  Equation ::= "(" "E" Exp Exp* ")" -- value and pattern list

  Term ::= 
     "(" "R"  Term* ")"       -- array (record or table)
   | "(" "S"  Term* ")"       -- concatenated sequence
   | "(" "FV" Term* ")"       -- free variant list
   | "(" "P"  Term Term ")"   -- access to index (projection or selection)
   | "(" "W"  String Term ")" -- token prefix with suffix list
   | "(" "A"  Integer ")"     -- pointer to subtree
   | String                   -- token (in double quotes)
   | Integer                  -- index in array
   | CId                      -- macro constant
   | "?"                      -- metavariable
```


==GFCC interpreter==

The first phase in interpreting GFCC is to parse a GFCC file and
build an internal abstract syntax representation, as specified
in the previous section.
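
One way to build this representation is to parse the file into raw expressions
first and then check them against the well-formed syntax. The sketch below does
this for ``Term`` only, and it rests on several assumptions. The ``Term``
declaration is inferred from the constructors used in the reference code in this
section, with tokens simplified to plain strings. The mapping of ``(A n)`` to
``V``, of integers to ``C``, and of identifiers to ``F`` is likewise inferred.
``toTerm`` is a hypothetical name.
```
-- Raw expressions as produced by a parser for the file syntax.
data RExp
  = App String [RExp] | AId String | AInt Integer
  | AStr String | AFlt Double | AMet
  deriving (Eq, Show)

-- Assumed shape of the Term type (tokens simplified to plain strings).
data Term
  = R [Term]        -- array (record or table)
  | S [Term]        -- concatenated sequence
  | FV [Term]       -- free variant list
  | P Term Term     -- access to index (projection or selection)
  | W String Term   -- token prefix with suffix list
  | V Integer       -- pointer to subtree, written (A n) in the file
  | K String        -- token
  | C Integer       -- index in array
  | F String        -- macro constant
  | TM              -- metavariable
  deriving (Eq, Show)

-- Convert a raw expression to a Term, or return Nothing if it is not
-- well-formed as a Term.
toTerm :: RExp -> Maybe Term
toTerm e = case e of
  App "R"  ts          -> R  <$> mapM toTerm ts
  App "S"  ts          -> S  <$> mapM toTerm ts
  App "FV" ts          -> FV <$> mapM toTerm ts
  App "P"  [r, p]      -> P  <$> toTerm r <*> toTerm p
  App "W"  [AStr s, t] -> W s <$> toTerm t
  App "A"  [AInt i]    -> Just (V i)
  AStr s               -> Just (K s)
  AInt i               -> Just (C i)
  AId c                -> Just (F c)
  AMet                 -> Just TM
  _                    -> Nothing
```
With these definitions, the raw form of ``(P (R "a" "b") 1)`` converts to
``P (R [K "a",K "b"]) (C 1)``, while a projection with the wrong number of
arguments, such as ``(P 1)``, is rejected.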

With this representation, linearization can be performed by
a straightforward function from expressions (``Exp``) to terms
(``Term``). All expressions except groups of pattern equations
can be linearized.

Here is a reference Haskell implementation of linearization:
```
  linExp :: GFCC -> CId -> Exp -> Term
  linExp gfcc lang tree@(DTr _ at trees) = case at of
    AC fun -> comp (map lin trees) $ look fun
    AS s   -> R [K (show s)] -- quoted
    AI i   -> R [K (show i)]
    AF d   -> R [K (show d)]
    AM     -> TM
   where
     lin  = linExp gfcc lang
     comp = compute gfcc lang
     look = lookLin gfcc lang
```
TODO: ``linExp`` does not yet handle bindings; they must be supported.

Terms resulting from linearization are evaluated in
call-by-value order. Evaluation needs two environments:
- the grammar (a concrete syntax) to give the global constants
- an array of terms to give the subtree linearizations


The Haskell implementation works as follows:
```
compute :: GFCC -> CId -> [Term] -> Term -> Term
compute gfcc lang args = comp where
  comp trm = case trm of
    P r p  -> proj (comp r) (comp p)
    W s t  -> W s (comp t)
    R ts   -> R $ map comp ts
    V i    -> idx args (fromInteger i)  -- already computed
    F c    -> comp $ look c             -- not computed (if contains V)
    FV ts  -> FV $ map comp ts
    S ts   -> S $ filter (/= S []) $ map comp ts
    _ -> trm

  look = lookOper gfcc lang

  idx xs i = xs !! i

  proj r p = case (r,p) of
    (_,     FV ts) -> FV $ map (proj r) ts           -- distribute over variants
    (FV ts, _    ) -> FV $ map (\t -> proj t p) ts
    (W s t, _)     -> kks (s ++ getString (proj t p)) -- kks: string to token term
    _              -> comp $ getField r (getIndex p)

  getString t = case t of
    K (KS s) -> s
    _ -> trace ("ERROR in grammar compiler: string from "++ show t) "ERR"

  getIndex t =  case t of
    C i    -> fromInteger i
    RP p _ -> getIndex p
    TM     -> 0  -- default value for parameter
    _ -> trace ("ERROR in grammar compiler: index from " ++ show t) 0

  getField t i = case t of
    R rs   -> idx rs i
    RP _ r -> getField r i
    TM     -> TM
    _ -> trace ("ERROR in grammar compiler: field from " ++ show t) t
```
The result of linearization is usually a record, which is realized as
a string using the following algorithm:
```
  realize :: Term -> String
  realize trm = case trm of
    R (t:_)  -> realize t
    S ss     -> unwords $ map realize ss
    K s      -> s
    W s t    -> s ++ realize t
    FV (t:_) -> realize t  -- TODO: all variants
    TM       -> "?"
```
Note that realization always picks the first field of a record
and the first variant of a free variant list.
If a linearization type has more than one field, the first field
does not necessarily contain the desired string.
Also note that the order of record fields in GFCC is not necessarily
the same as in the GF source.
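
The first-field behaviour can be checked with a small self-contained version
of ``realize``. This is a sketch. The ``Term`` declaration is assumed and
restricted to the constructors that ``realize`` matches on, the final fallback
case is an addition that makes the function total, and the sample terms
``twoFields`` and ``phrase`` are hypothetical.
```
data Term = R [Term] | S [Term] | K String | W String Term | FV [Term] | TM
  deriving (Eq, Show)

realize :: Term -> String
realize trm = case trm of
  R (t:_)  -> realize t               -- first field only
  S ss     -> unwords (map realize ss)
  K s      -> s
  W s t    -> s ++ realize t
  FV (t:_) -> realize t               -- first variant only
  TM       -> "?"
  _        -> ""                      -- assumption: empty R or FV yields ""

-- A record with two string fields: only the first one is realized.
twoFields :: Term
twoFields = R [K "up", K "give"]
-- realize twoFields == "up"

-- A sequence containing a prefix token with a variant list of suffixes.
phrase :: Term
phrase = R [S [K "the", W "cat" (FV [K "", K "s"])]]
-- realize phrase == "the cat"
```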
