Feb 15

Using Custom Symbols in CUP

Category: Java, Programming

CUP advertises support for your own Symbol class as a big selling point of the latest version, but the documentation doesn’t say much about how to do it. There is also a fairly annoying bug in the code. Of course you may ask “Why not use JavaCC?” Good question. Anyway, without further ado…

Why use Custom Symbol Objects?

The default symbol objects don’t provide a whole lot of context. Even the symbols produced by ComplexSymbolFactory in the latest CUP version don’t necessarily provide everything one would want.

For the purposes of this article I’m going to address annotating each symbol with Position objects. This is useful because Position interfaces with other Java components (not to mention jEdit) much better than ComplexSymbolFactory’s Location objects, which give line and column information rather than an absolute offset.
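To make the difference concrete: a line/column pair can only be turned into an absolute offset if you still have the text at hand, whereas an offset can be handed to an editor component directly. Here is a minimal sketch of that conversion, assuming 1-based lines and columns and '\n' line endings. The class and method names are hypothetical and not part of CUP.

```java
// Hypothetical helper, not part of CUP: converts a 1-based (line, column)
// pair into a 0-based absolute character offset within the given text.
// Assumes lines are separated by '\n' only.
final class OffsetDemo {
    static int toOffset(String text, int line, int column) {
        int offset = 0;
        // Skip over the (line - 1) complete lines that precede the target line
        for (int i = 1; i < line; i++) {
            int newline = text.indexOf('\n', offset);
            if (newline < 0) {
                throw new IllegalArgumentException("text has fewer than " + line + " lines");
            }
            offset = newline + 1; // first character after the line break
        }
        return offset + (column - 1);
    }

    public static void main(String[] args) {
        String text = "int a;\nint b;\n";
        // Line 2, column 5 is the 'b' in the second declaration
        System.out.println(toOffset(text, 2, 5));
    }
}
```

Storing the offset in the symbol up front avoids having to redo this walk over the text every time a position is needed.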

Creating A Custom Symbol Object

The first step to using custom symbol objects is to create a custom symbol class. Your class must extend java_cup.runtime.Symbol. Ours is going to look something like this:

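import java_cup.runtime.Symbol;
import javax.swing.text.Position; // assumption: this is the Position interface meant here

public class CustomSymbol extends Symbol {
    // Wraps an absolute character offset as a Position
    private class Location implements Position {
        private int position;
        public Location(int position) {
            this.position = position;
        }
        // javax.swing.text.Position declares getOffset(), so that is the method to implement
        public int getOffset() {
            return position;
        }
    }
    Location left, right; // these hide the int left/right fields inherited from Symbol
    String name;
    ...
}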

We’ll leave the constructor definitions until after talking about the symbol factory.

Creating a Custom Symbol Factory

The second step to using custom symbols is creating your own symbol factory. This factory is used by both the lexical analyzer (in my case JFlex) and the grammar parser generated by CUP to create symbol objects.

A good baseline reference for doing this is the java_cup.runtime.ComplexSymbolFactory class used by CUP. Your symbol factory must implement the java_cup.runtime.SymbolFactory interface. Basically every newSymbol method will go directly to a constructor for your custom symbol class:

public class CustomSymbolFactory implements SymbolFactory{
    //Used by the lexical analyzer to create a new symbol with a value object
    public Symbol newSymbol(String name, int id, Location left, 
                                       Location right, Object value){
        return new CustomSymbol(name,id,left,right,value);
    }
    //Used by the lexical analyzer to create a new symbol without a value object
    public Symbol newSymbol(String name, int id, Location left, 
                                       Location right){
        return new CustomSymbol(name,id,left,right);
    }
    //Used by CUP parser to create a symbol for non-terminal nodes 
    //(like expression or statement)
    public Symbol newSymbol(String name, int id, Symbol left,
                                       Symbol right, Object value){
        return new CustomSymbol(name,id,left,right,value);
    }
    //Same as previous
    public Symbol newSymbol(String name, int id, Symbol left, 
                                       Symbol right){
        return new CustomSymbol(name,id,left,right);
    }
    //Not used!
    public Symbol newSymbol(String name, int id){
        return new CustomSymbol(name,id);
    }
    //Not used!
    public Symbol newSymbol(String name, int id, Object value){
        return new CustomSymbol(name,id,value);
    }
    //Used to create a symbol in a specified state.
    public Symbol startSymbol(String name, int id, int state){
        return new CustomSymbol(name,id,state);
    }
}

Creating the Symbol Constructors

Most of the symbol constructors are actually pretty straightforward if you take a look at java_cup.runtime.ComplexSymbolFactory, but there is one hitch. Symbol(int id, int state) has not been made public (I assume by accident), so if you write your own symbol constructor and call super(id, state), Java boxes state into an Integer and you actually end up calling Symbol(int id, Object value). Fortunately this isn’t difficult to fix:

CustomSymbol(String name, int id, int state){
    super(id);
    this.parse_state = state; //sets Symbol.parse_state, the field that holds the parser state
    this.name = name;
}
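If the hitch described above seems surprising, it can be reproduced without CUP at all. The sketch below uses a stand-in base class (BaseSymbol and its field names are hypothetical, reduced to the constructors that matter) whose two-int constructor is inaccessible to subclasses. The compiler then quietly picks the (int, Object) overload by boxing the second argument.

```java
// Runs both variants so the difference is visible
class OverloadDemo {
    public static void main(String[] args) {
        WrongSymbol wrong = new WrongSymbol(3, 7);
        System.out.println("wrong: state=" + wrong.parseState + " value=" + wrong.value);
        FixedSymbol fixed = new FixedSymbol(3, 7);
        System.out.println("fixed: state=" + fixed.parseState + " value=" + fixed.value);
    }
}

// Hypothetical stand-in for java_cup.runtime.Symbol, NOT the real class
class BaseSymbol {
    int sym;
    int parseState = -1; // -1 marks "never set" for this demo
    Object value;

    // Inaccessible to subclasses, like the (id, state) constructor in Symbol
    private BaseSymbol(int id, int state) {
        this.sym = id;
        this.parseState = state;
    }

    public BaseSymbol(int id, Object value) {
        this.sym = id;
        this.value = value;
    }

    public BaseSymbol(int id) {
        this.sym = id;
    }
}

// Looks right, but 'state' is boxed and stored as the symbol's value
class WrongSymbol extends BaseSymbol {
    WrongSymbol(int id, int state) {
        super(id, state); // resolves to BaseSymbol(int, Object)
    }
}

// The workaround: call the one-argument constructor, then set the state
class FixedSymbol extends BaseSymbol {
    FixedSymbol(int id, int state) {
        super(id);
        this.parseState = state;
    }
}
```

In the wrong variant the state never reaches the state field and ends up in value instead, which is why the bug compiles cleanly and only shows up at parse time.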

Associating the SymbolFactory With the Parser

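The only thing left to do is associate your custom symbol factory with CUP’s autogenerated grammar parser. To do this, pass it explicitly to the generated parser’s constructor, which takes the scanner as its first argument and the symbol factory as its second, for example new parser(lexer, new CustomSymbolFactory()). Here parser is CUP’s default name for the generated class and lexer stands for your scanner instance.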

The Author

Michael Smit is a software engineer in Seattle, Washington who works for Amazon.
