Building Scalable Software Systems in the Multicore Era

A position paper by Hridesh Rajan. This work was originally presented at the 2010 FoSER workshop.

Cite as: Hridesh Rajan, "Building Scalable Software Systems in the Multicore Era," 2010 FSE/SDP Workshop on the Future of Software Engineering, Santa Fe, NM, November 2010.

Revised on 2012/05/22.

Abstract

Software systems face two challenges today: growing complexity and increasing parallelism in the underlying computational models. The problem of increased complexity is often tackled by dividing a system into modules in a way that permits analysis of these modules in isolation. The problem of utilizing parallelism is often tackled by dividing a system's execution into tasks in a way that permits execution of these tasks in isolation.

The key challenge in software design is to manage the explicit and implicit dependence between modules that decreases modularity. The key challenge for concurrency is to manage the explicit and implicit dependence between tasks that decreases parallelism. Even though these challenges appear to be strikingly similar, current software design practices and languages do not take advantage of this similarity.

The net effect is that modularity and concurrency goals are often pursued independently of each other: making progress toward one does not naturally contribute toward the other.

My position is that, for programmers who are not formally and rigorously trained in the discipline of concurrency, the safest and most productive way to achieve scalability is to improve the modularity of their software using programming language features and design practices that reconcile modularity and concurrency goals. I briefly discuss the preliminary efforts of my group, but we have only touched the tip of the iceberg.

Problems and Their Importance

Scalability of software in the next decade crucially depends on its ability to effectively utilize multicore platforms. For scientific applications such scalability generally comes from the invention and refinement of better algorithms and data structures, but that is not the case for non-scientific software, which often exhibits irregular, fine-grained parallelism. Yet the scalability of these applications is an equally important concern for society, defense, and the individual.

Scalability of these applications faces two major hurdles. The first and well-known hurdle is that writing correct and efficient concurrent software using concurrency-unsafe programming language features has remained a challenge. A language feature is concurrency-unsafe if its usage may give rise to program execution sequences containing two or more memory accesses to the same location that are not ordered by a happens-before relation, at least one of which is a write. Threads and processes are examples of such features, as are Future and FutureTask as embodied in the 1.5 edition of the Java programming language. Without strict design and implementation disciplines they are concurrency-unsafe. Many of the features planned for the 1.7 edition of the Java programming language are similarly concurrency-unsafe.
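For concreteness, here is a minimal sketch (the class and identifiers are mine, not from the paper) of such a concurrency-unsafe use of threads: two threads perform unordered writes to the same memory location.

public class RaceExample {
  static int counter = 0; // shared memory location

  public static void main(String[] args) throws InterruptedException {
    Runnable increment = new Runnable() {
      public void run() {
        for (int i = 0; i < 100000; i++)
          counter++; // unsynchronized read-modify-write
      }
    };
    Thread t1 = new Thread(increment);
    Thread t2 = new Thread(increment);
    t1.start(); t2.start();
    t1.join(); t2.join();
    // Typically prints less than 200000: the conflicting writes are not
    // ordered by a happens-before relation, so updates are silently lost.
    System.out.println(counter);
  }
}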

A second and less explored hurdle is that, unlike in scientific applications, in general-purpose programs potential concurrency is not always obvious. Scientific applications are typically data-parallel, whereas general-purpose programs exhibit irregular parallelism. As a result, techniques that have been remarkably successful in scientific domains have seen only modest success for general-purpose programs.

I believe that both these hurdles persist, in part, because of a significant shortcoming of current software design practices. The basic problem is that modularity and concurrency are treated as two separate and orthogonal goals. As a result, concurrency goals are often tackled at a level of abstraction lower than modularity goals. Synchronization defects arise when developers work at low abstraction levels and are not aware of the behavior at a higher level of abstraction. This lack of awareness also limits the discovery of potentially available concurrency in the resulting systems.

All of this is complicated by the fact that our current software development workforce is vastly under-prepared to develop correct, efficient and fair software systems for the emerging multicore hardware platforms using concurrency-unsafe features that are currently available in languages and libraries.

How to Solve It?

Look at the unknown! And try to think of a familiar problem having the same or a similar unknown. --- George Pólya, 1945.

Are better software designs inherently more concurrent? In the following, I argue for, and present some evidence supporting, my hypothesis that modularity and concurrency goals are intertwined, and that through advances in programming language design and software design practices it may be possible to achieve mutualism between them!

To motivate this, consider the simple example shown in the listing below, the first of three versions of part of a telecommunications application. The class Call models a typical connection in such a setting. It models the state of a phone call using the enumeration State, and the caller and the receiver with the fields caller and receiver, respectively. It also contains a timer object to monitor the duration of a call. This class provides two methods, complete and drop, that serve to connect and disconnect a call, respectively.

class Call {
  enum State { PENDING, COMPLETE, DROPPED }
  Customer caller, receiver;
  State state = State.PENDING;
  Timer timer = new Timer();
  Call(Customer a, Customer b) {
    caller = a; receiver = b;
  }
  void complete() {
    state = State.COMPLETE;
    timer.start();
  }
  void drop() {
    state = State.DROPPED;
    timer.stop();
    // Billing logic, tangled with the call logic:
    long time = timer.getTime();
    double cost = 0.07 * time;
    caller.addCharge(cost);
  }
}

An example requirement for such an application would be to bill customers for the duration of a conversation. A simple implementation of this requirement adds its logic directly to the code of the drop method (the last three statements in the listing above).

This solution works; however, it has several software engineering problems. For example, since the billing code is mixed with the call logic, it is harder to implement changes to either requirement: a developer changing one must also understand the other to ensure correctness. In addition, neither requirement's implementation is reusable. Last but not least, understanding the billing and call logic in isolation is impossible because their implementations are intermixed.

This example demonstrates, at a small scale, the modularity problems faced by developers in building large software systems.

To modularize the implementation of the billing requirement, a good software engineer would separate it out into a new module, while ensuring that this new module communicates with the class Call via a well-defined interface (and vice versa). The listing below shows the modularized version of the same system, where the implementation of the billing requirement is separated out into the class Billing using the Observer design pattern.

interface CallEndObserver {
  void notify(Customer c, Timer t);
}
class Call {
  enum State { PENDING, COMPLETE, DROPPED }
  Customer caller, receiver;
  State state = State.PENDING;
  Timer timer = new Timer();
  List<CallEndObserver> observers = new ArrayList<CallEndObserver>();
  Call(Customer a, Customer b) {
    caller = a; receiver = b;
  }
  void complete() {
    state = State.COMPLETE;
    timer.start();
  }
  void drop() {
    state = State.DROPPED;
    timer.stop();
    // Notify observers without knowing who they are:
    for (CallEndObserver o : observers)
      o.notify(caller, timer);
  }
  void addObserver(CallEndObserver o) {
    observers.add(o);
  }
  void removeObserver(CallEndObserver o) {
    observers.remove(o);
  }
}
class Billing implements CallEndObserver {
  public void notify(Customer c, Timer t) {
    long time = t.getTime();
    double cost = 0.07 * time;
    c.addCharge(cost);
  }
}

This alternative solution is modular and solves all the problems pointed out for the first version. Here, the billing code and the call logic are separated by a well-defined interface (the observer interface CallEndObserver). This makes it easier to change them independently, reuse them, and understand them. The design thus breaks the dependencies between the implementations of the two requirements.
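For illustration, the wiring between the two modules might look as follows (this client code is mine; a and b are assumed to be existing Customer instances):

Call call = new Call(a, b);
call.addObserver(new Billing()); // billing subscribes; Call never names it
call.complete();
// ... conversation ...
call.drop(); // billing runs via the observer interface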

Quite interestingly, this design can also facilitate concurrent execution of the billing logic. For example, we could encapsulate the method notify in the second listing so that it runs as a concurrent task. In other words, the modularization transformation from the first version to the second could also serve as an effective parallelization transformation. The key research question is whether the observation made in the context of this example holds for a large class of modularization transformations, i.e., do many techniques for improving modularity also increase the potential concurrency in a program's design?
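As a rough illustration of this point (the code below is my sketch, not Panini's generated code, and it assumes the billing logic shares no mutable state with the caller), drop could submit each observer notification to a thread pool instead of invoking observers synchronously:

import java.util.List;
import java.util.ArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ConcurrentCall {
  enum State { PENDING, COMPLETE, DROPPED }
  Customer caller, receiver;
  State state = State.PENDING;
  Timer timer = new Timer();
  List<CallEndObserver> observers = new ArrayList<CallEndObserver>();
  static final ExecutorService POOL = Executors.newCachedThreadPool();

  ConcurrentCall(Customer a, Customer b) {
    caller = a; receiver = b;
  }
  void drop() {
    state = State.DROPPED;
    timer.stop();
    for (final CallEndObserver o : observers) {
      POOL.submit(new Runnable() {      // each notification becomes a task
        public void run() { o.notify(caller, timer); }
      });
    }
  }
}

The sketch also makes the cost of doing this by hand visible: the developer must introduce the pool and convince themselves that the submitted tasks are free of races, which is exactly what the language support described next automates.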

This question rests on the observation that, from the point of view of both concurrency and modularity, the challenges are similar. For example, for modular reasoning about the class Billing to succeed, it is important to understand the explicit and implicit dependencies of the billing concern. Similarly, for concurrent processing of billing to succeed, one must understand these same dependencies to avoid data races and deadlocks, which can potentially decrease parallelism. It is thus intuitive that a lack of modularity in design has direct ramifications on the available concurrency. However, it is not clear at this moment whether improved modularity in a software design helps with concurrency in general. I now briefly discuss our preliminary efforts to further understand this duality. A detailed description of these ideas appears in our early papers, e.g., our GPCE 2010 paper entitled Implicit Invocation Meets Safe, Implicit Concurrency, and our SPLASH/Onward 2010 paper entitled Concurrency by Modularity: Design Patterns, a Case in Point.

Asynchronous, Typed Events

Along one direction, my students Yuheng Long, Sean Mooney, Tyler Sondag, and I have developed the notion of asynchronous, typed events in our language Panini, which reconciles the modularity goal promoted by the implicit invocation design style with concurrency goals. Panini's design is inspired by my previous work on the Ptolemy and Eos languages.

In the implicit invocation design style, some modules (called subjects or publishers) signal events, e.g., reaching a program point or a condition becoming true. Other modules (called observers or subscribers) express interest in receiving notifications when an event is signaled. The key advantage is that subjects can notify such observers without knowing about them (implicitly). The implicit invocation style thus decouples subjects from observers.

Asynchronous, typed events provide implicit concurrency in program designs: events are signaled and consumed without the need for explicit locking of shared state. The semantics is similar to other proposals based on message-based communication between concurrent tasks, such as Erlang's; however, unlike these actor-based, message-based languages, Panini does not require complete isolation of such tasks. Furthermore, the communication between implicitly concurrent tasks is not limited to value types or records of value types.

The listing below shows an implementation of our billing example in Panini.

event CallEnd {
  Customer c; Timer t;
}
class Call {
  enum State { PENDING, COMPLETE, DROPPED }
  Customer caller, receiver;
  State state = State.PENDING;
  Timer timer = new Timer();
  Call(Customer a, Customer b) {
    caller = a; receiver = b;
  }
  void complete() {
    state = State.COMPLETE;
    timer.start();
  }
  void drop() {
    state = State.DROPPED;
    timer.stop();
    announce CallEnd(caller, timer);
  }
}
class Billing {
  when CallEnd do update;
  void update(Customer c, Timer t) {
    long time = t.getTime();
    double cost = 0.07 * time;
    c.addCharge(cost);
  }
}

The implementation in the listing above uses the features of the Panini language. The method drop in this implementation announces an event of type CallEnd. The declaration of this event type appears at the top of the listing. It says that events of this type make two pieces of contextual information available: a Customer c making the call and a Timer t specifying the duration of the call. Both pieces of information are pertinent to the event "a telephone call just ended."

The class Billing features a new construct in Panini called a binding (when CallEnd do update). This construct says to run the method update whenever an event of type CallEnd is announced in any class (for instance, such an event is announced in the class Call). As a result, whenever a call ends, the billing information is computed and updated concurrently. Note that the developers of the classes Call and Billing did not need to write a single line of code to expose this concurrency. Design effort was needed, however, to manage the implicit and explicit dependence between these two classes so as to maximize modularity.

The language semantics of Panini provides race and deadlock freedom and a sequential semantics while exposing the potential implicit concurrency in a program's design. We have implemented a compiler and runtime system for Panini, available for general distribution from the download links on this web-site. We have tried several programs in which asynchronous, typed events improve both the modularity of the program design and the potentially available concurrency. Our results show that Panini programs perform as well as their hand-tuned concurrent implementations.

Implicitly Concurrent GOF Design Patterns

Along another direction, in collaboration with Dr. Steven M. Kautz, Wayne Rowcliffe, and Sean Mooney, I have developed a framework that provides implicitly concurrent versions of the standard Gang-of-Four (GOF) object-oriented design patterns. This framework attempts to reconcile modularity and concurrency goals by enhancing the GOF patterns, which are commonly used to improve the modularity of object-oriented software. These patterns describe strategies for decoupling components in design space and specify how these components should interact.

We have enhanced these patterns to also decouple components in execution space, so applying them concomitantly improves the design and potentially available concurrency in software systems. For example, the decorator pattern organizes components in a chain in which each component adds behavior to the previous component in the chain. If the added behavior of components in this chain is independent or can be split into dependent and independent parts, our implicitly concurrent decorator pattern implementation allows processing of independent added behaviors to be performed concurrently. Similarly, the builder pattern decouples the creation logic of complex products from their usage. If object creation is expensive, e.g. creating an object based on the contents of a file on disk, our implicitly concurrent builder pattern implementation allows object creation to interleave with other computation in the program.
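As a hedged sketch of the builder case (the class and method names below are mine, not the framework's actual API), construction can be started on a background thread while the client continues, synchronizing only at the first use of the product:

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch of an implicitly concurrent builder. Construction of
// an expensive product starts in the background; the client blocks only
// when it first asks for the product.
class AsyncBuilder<T> {
  private static final ExecutorService POOL = Executors.newCachedThreadPool();
  private final Future<T> product;

  AsyncBuilder(Callable<T> constructionLogic) {
    product = POOL.submit(constructionLogic); // creation interleaves with caller
  }

  T getProduct() throws Exception {
    return product.get(); // synchronizes only at first use
  }
}

A client constructs the builder with its creation logic and later calls getProduct; all thread creation and synchronization remain inside the library, which is the property the enhanced patterns aim for.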

For 18 out of the 23 GOF patterns, we have determined that, subject to appropriate usage, our hypothesis is true. For each of these 18 patterns we have created an enhanced version of the design pattern in which use of the pattern increases potential concurrency without additional, explicit effort on the part of the developer to do so. In every case but one, the concurrency-related concerns (such as thread creation and synchronization) are fully encapsulated in a library that we provide, and in no case is the developer ever required to explicitly create a thread or acquire a synchronization lock.

This pattern framework is available for general distribution from the download menu link on this web-page. A paper on this subject, which appeared at the 2010 SPLASH/Onward Conference, contains more details.

Conclusion

Introducing concurrency has become important for the scalability of today's software systems; however, writing correct, efficient, and fair concurrent programs remains hard. In this work, I have taken the position that programming language features and design practices that reconcile concurrency and modularity have the potential to solve this problem. Our initial work on the Panini language and on the implicitly concurrent GOF design patterns has demonstrated the feasibility of the basic ideas, but much work remains to be done. We believe that investigating these ideas further is important for the software engineering of correct, efficient, and scalable software systems in the multicore era.

Acknowledgments

This work was supported in part by the US National Science Foundation under grant CCF-08-46059. I am thankful to my students Yuheng Long, Sean Mooney, Tyler N. Sondag, Robert E. Dyer, Wayne Rowcliffe and collaborator Steven M. Kautz for countless discussions on this topic.
