Example

HISTOGRAM PROBLEM

To illustrate Panini's new features in practice, consider the classic histogram problem. The goal of this problem is to count the number of times each ASCII character occurs on a page of text. The input to this problem is the ASCII text and its output is a histogram with 128 buckets, one for each ASCII character, where each entry stores the number of occurrences of the corresponding ASCII character on the page.

ARCHITECTURE and DESIGN

In capsule-oriented programming better design leads to better implicit concurrency, i.e. better designed programs often run faster, so it is valuable to start off with the architecture and design.

  1. Divide the problems into subproblems. For our problem, the subproblems are:

    • read the ASCII text,
    • sort characters into bit bucket, and
    • output the bit bucket
    The Panini programmer specifies a system as a collection of capsules and ordinary object-oriented classes. A capsule is an abstraction for decomposing a software into its parts and a system is a mechanism for composing these parts together. So the first order of business is to come up with this capsule-oriented design. This involves creating capsules and assigning subtasks to these capsules.

  2. Create capsules and assign responsibilities to capsules. In assigning responsibility follow the information-hiding principle. There are three design decisions that are likely to change: input format, output format, and how we represent buckets in the histogram. Therefore, we must hide these responsibility behind interfaces of separate capsules.

    This suggests three capsules: Reader, Bucket, and Printer.

         capsule Reader() { }
         capsule Bucket() { }
         capsule Printer() { }
    	

    We do not yet know the interconnection between these three capsules, but it seems to be the case that Reader ought to read characters from the ASCII text and invoke Buckets to put characters in the bucket. Finally, when characters are sorted, Bucket could call the Printer to print count. This seems to suggest that the Reader capsule ought to be able to reach Buckets and Bucket capsule ought to be able to reach a Printer. We can use this knowledge to refine our architecture and design.

         capsule Reader(Bucket[] buckets) { }
         capsule Bucket(Printer p) { }
         capsule Printer() { }
    	

    The first line says that the Reader capsule could be connected with a set of Bucket capsules. The second line says that the every Bucket could be connected with a printer capsule, and the third line says that the Printer capsule may not be connected to any other capsule. Alternatively, we could read line 1 as: "the Reader capsule requires an array of Bucket capsules."

  3. Integrate capsules to form a complete program. We now know that this program would require one reader, one printer, and a bunch of buckets. At this time, we can make a decision about how many buckets do we want in this program. One simple design would be to have one Bucket capsule per ASCII character, but other choices are certainly feasible.

    We will use another capsule as a container for these other capsules. This capsule that we will call Histogram is essentially a complete program. The listing below shows a design of the Histogram capsule.

         capsule Histogram (String args) { 
         	design {
         		Reader r; 
         		Bucket buckets[128];
    			Printer p;
    	
    			r(args, buckets);
    			wireall(buckets, p);
    		}
         }
    	

    This declarative design on lines 2-9 says that this capsule Histogram would have an instance of capsule Reader (line 2), a set of 128 instances of capsule Bucket (line 3), and an instance of capsule Printer (line 4).

    Lines 7-8 define interconnections between capsule instances. Line 7 says that the capsule instance r would be connected with the set of capsule instances buckets and would also receive the command-line arguments of the program. Line 8 says that each capsule instance in the set of buckets would be connected with the capsule instance p.

IMPLEMENTATION

  1. Capsule Printer. We can now start specifying behavior of each of these capsules. The behavior of capsule Printer is fairly straightforward, given a string it ought to display that string on Console.

        capsule Printer () {
        	void print(String output) {
        		System.out.println(output); 
        	}
        }
    	
  2. Capsule Bucket. The behavior of the capsule Bucket requires keeping track of the the number of items in the bucket (since all items are the same). In Panini, a capsule can declare states to keep track of such pieces of information. A state declaration is syntactically the same as a field declaration in object-oriented languages, however, it differs semantically in two ways: first, a state is private to a a capsule (there are no public modifiers.), second, all the memory locations that can be reached via this state are uniquely owned by the containing capsule instance. Other capsule instances may not access it.

        capsule Bucket(Printer p) {
        	long count = 0;
        }
    	

    To allow other capsules to change its state, a capsule can provide capsule procedures, procedures for short. A capsule procedure is syntactically similar to methods in object-oriented languages, however, they are different semantically in two ways: first, a capsule procedures is by default public (although private helper procedures can be declared using the private keyword), and second a procedure invocation is guaranteed to be logically synchronous. In some cases, Panini may be able to run procedures in parallel to improve parallelism in Panini programs. Two example procedures of the capsule Bucket are shown below.

        capsule Bucket(Printer p) {
        	long count = 0;
        	void bump() { 
        		count++; 
        	}
        	void finish(int index) { 
        		p.print(" "+ index + ":" + count); 
        	}
        }
    	
  3. Capsule Reader. The Reader capsule declares a procedure read that reads the input array and sorts them into buckets.

    	capsule Reader(String[] args, Bucket[] buckets) {
    		void read() {
    			for(String fileName : args)
    				process(fileName);
    		}
    		private void process(String fileName) {
    			try {
    				FileInputStream stream = new FileInputStream(new File(fileName));
    				System.out.println("READER: input file " + fileName + " successfully opened.");
    				int r;
    				while ((r = stream.read()) != -1) {
    					buckets[(char) r].bump();
    				}  
    				System.out.println("READER: Reading complete. Asking buckets to print count.");
    			} catch (IOException e) { System.out.println(e); }
    			for(int i = 0; i < buckets.length; i++) 
    				buckets[i].finish(i); 
    			System.out.println("READER: work complete.");
    		}
    	}
    	

    On lines 3-4 every file that is named on the command-line is processed using the helper procedure process. A helper procedure in Panini is only accessible with the Capsule, therefore it ought to be declared with a modifier private, e.g. the process procedure on line 6.

    On lines 11-13 each character in the input file is read, the value of the character is converted to an ASCII range, and the procedure bump is called on capsule instance corresponding to that character. So for instance, if the character 'a' (ASCII value: 97) was read then the procedure bump of the bucket capsule instance buckets[97] would be called.

    On lines 17-18, the procedure finish is called on each capsule instance that signals each of these capsule instances to print their value on the printer.

  4. Capsule Histogram. Finally, the Histogram capsule declares a procedure run that invokes the procedure read on the Reader capsule.

         capsule Histogram (String args) { 
         	design {
    			Reader r; 
    			Bucket buckets[128];
    			Printer p;
    	
    			r(args, buckets);
    			wireall(buckets, p);
    		}
    		void run () {
    			r.read();
    		}
         }
    	

    The execution of this program begins by allocating memory for all capsule instances, and connecting them together as specified in the design declaration on lines 2-9. Recall that capsule parameters define the other capsule instances required for a capsule to function. A capsule listed in another capsule's parameter list can be sent messages from that capsule. Design declarations allow a programmer to define the connections between individual capsule instances. These connections are established before execution of any capsule instance begins.

    Next, any capsule with a run procedure begins executing independently as soon as the initialization and interconnection of all capsules is complete and may generate calls to the procedures of other capsules. For example, referring to the code above, capsule Histogram will run code on 10-12. Capsules without a run procedure, such as Reader, perform computation only when their procedures are invoked.

IMPLICIT CONCURRENCY

The implicit concurrency in this example is on lines 11-13 in the capsule Reader, where calling an external capsule procedure immediately returns, so that the reader capsule instance can continue reading on lines 11-13 while the bucket capsule instance increments the count.

Similarly, on lines 17-18 in the capsule Reader the reader can continue calling the next bucket instance's capsule procedure without waiting for the previous bucket instance to finish printing.

Another source of implicit concurrency is on line 7 in the implementation of the capsule Bucket. This procedure can return while the printer instance p prints the argument String.

When it is safe to exploit these sources of implicit concurrency, Panini compiler will automatically introduce parallelism to speedup this program without intervention from the programmer.

Page last modified on $Date: 2013/08/02 03:42:27 $