Papers
Conference
Journal
Workshop
PhD and MS Theses
Technical Reports
Other
Articles and Papers
PhD and MS Theses
Technical Reports
Panini
Boa
Ptolemy
Eos
Nu
Sapha
Slede
Tisa
Osiris
Other
Improved separation of concern is important for dealing with increasing complexity of today’s software systems. A number of language designs have been proposed in the last decade with the common goal to improve the separation of concerns by providing better modularization mechanisms e.g. mixins, units, roles, layers, hyperspaces, events, aspects,...
Actor frameworks running on Java Virtual Machine (JVM) platform face two main challenges in utilizing multi-core architectures, i) efficiently mapping actors to JVM threads, and ii) scheduling JVM threads on multi-core. JVM-based actor frameworks allow fine tuning of actors to threads mapping, however scheduling of threads on multi-core is left...
As of today, it is unclear whether aspect-oriented modeling can benefit the model-driven development of software product lines. Although some preliminary studies exist at the requirements and implementation level that investigate the interaction of crosscutting behaviors and product-line variabilities, to the best of our knowledge these interactions at the modeling...
As aspect-oriented programming techniques become more widely used, their use in critical systems, including safety-critical systems such as aircraft and mission-critical systems such as telephone networks, will become more widespread. However, careful reasoning about aspect-oriented code seems difficult with standard specification techniques. The difficulty stems from the difficulty of understanding...
A variety of language features to modularize crosscutting concerns have recently been discussed, e.g. open modules, annotation-based pointcuts, explicit join points, and quantified-typed events. All of these ideas are essentially a form of aspect-oriented interface between object-oriented and crosscutting modules, but the representation of this interface differs. While previous works...
Modular understanding of behaviors and flows of exceptions may help in their better use and handling. Such reasoning tasks about exceptions face unique challenges in event-based implicit invocation (II) languages that allow subjects to implicitly invoke observers, and run the observers in a chain. In this work, we illustrate these...
Aspect-oriented programming (AOP) is a popular technique for modularizing crosscutting concerns. In this context, researchers found that the realization of the design by contract (DbC) is cross- cutting and fares better when modularized by AOP. However, previous efforts aimed at supporting crosscutting contract modularly instead hindered it. For example, in...
In program profiling to assess test set adequacy, a challenge is to select code to be included in the assessment. Current mechanisms are coarse-grained; biased to dominant modularizations; require tedious, error-prone manual selection; and leave tester intent implicit in inputs to testing tools. Aspect-oriented constructs promise to improving testing in...
A variety of dynamic aspect-oriented language constructs are proposed in recent literature with corresponding, compelling use cases. Such constructs, as well as other existing constructs such as cflow, demonstrate the need to dynamically adapt the set of join points intercepted at a fine-grained level. The notion of morphing aspects and...
Aspect-oriented (AO) design and programming methods promise to improve the modularity properties of software-intensive systems. However, AO is also seen as violating fundamental design principles; and we lack a theory to guide its appropriate use. Our work rests on the idea that successful AO techniques have deep roots in implicit...
In this paper, we present a study of repetitiveness of code changes in software evolution. Repetitiveness is defined as the ratio of repeated changes over total changes. Focusing on fine-grained code changes, we model a change as a pair of old and new AST sub-trees within a method. A change...
Deep neural networks (DNNs) are susceptible to bugs, just like other types of software systems. A significant uptick in using DNN, and its applications in wide-ranging areas, including safety-critical systems, warrant extensive research on software engineering tools for improving the reliability of DNN-based systems. One such tool that has gained...
Background: Creating a scalable computational infrastructure to analyze the wealth of information contained in data repositories is difficult due to significant barriers in organizing, extracting and analyzing relevant data. Shared data science infrastructures like Boag is needed to efficiently process and parse data contained in large data repositories. The main...
Background: Creating a scalable computational infrastructure to analyze the wealth of information contained in data repositories is difficult due to significant barriers in organizing, extracting and analyzing relevant data. Shared data science infrastructures like Boag is needed to efficiently process and parse data contained in large data repositories. The main...
In today’s software-centric world, ultra-large-scale software repositories, such as SourceForge, GitHub, and Google Code, are the new library of Alexandria. They contain an enormous corpus of software and related information. Scientists and engineers alike are interested in analyzing this wealth of information. However, systematic extraction and analysis of relevant data...
Central to computing is machine code generation. Upper level undergraduate students studying computing are quite familiar with high-level languages. Most undergraduate programs in computing begin with a course involving computer programming in a high- level language and as students progress through their studies they gain more experience with high-level languages....
Integral to computer science education are computer organization and architecture courses. We present Frances-A, an engaging investigative tool that provides an environment for studying real assembly languages and architectures. Frances-A includes several features that enhance its usefulness in the classroom such as graphical relationships between high-level code and machine code,...
The latest trend towards performance asymmetry among cores on a single chip of a multicore processor is posing new challenges. For effective utilization of these performance-asymmetric multicore processors, code sections of a program must be assigned to cores such that the resource needs of code sections closely matches resource availability...
Type-and-effect systems are a powerful tool for program construction and verification. We describe intensional effect polymorphism, a new foundation for effect systems that integrates static and dynamic effect checking. Our system allows the effect of polymorphic code to be intensionally inspected through a lightweight notion of dynamic typing. When coupled...
Implicit invocation and aspect-oriented languages provide related but distinct mechanisms for separation of concerns. Implicit invocation languages have explicitly announced events, which runs registered observer methods. Aspect-oriented languages have implicitly announced events, called "join points," which run method-like but more powerful advice. A limitation of implicit invocation languages is their...
Recent work has shown that Machine Learning (ML) programs are error-prone and called for contracts for ML code. Contracts, as in the design by contract methodology, help document APIs and aid API users in writing correct code. The question is: what kinds of contracts would provide the most help to...
In previous work, Rajan and Leavens presented the design of Ptolemy, a language which incorporates the notion of quantified, typed events for improved separation of concerns. In this work, we present an empirical study to evaluate the effectiveness of Ptolemy’s design by applying it to a series of architectural releases...
Verifying correctness properties of parameterized systems is a long-standing problem. The challenge lies in the lack of guarantee that the property is satisfied for all instances of the parameterized system. Existing work on addressing this challenge aims to reduce this problem to checking the properties on smaller systems with a...
Deep learning has gained substantial popularity in recent years. Developers mainly rely on libraries and tools to add deep learning capabilities to their software. What kinds of bugs are frequently found in such software? What are the root causes of such bugs? What impacts do such bugs have? Which stages...
This paper makes two contributions: a generalization of AspectJ-like languages with first-class aspect instances and instance-level advising, and a mapping of the mediator style for integrated system design into this space. We present Eos as a prototype language design and implementation. It extends C# with AspectJ-like constructs, first-class aspect instances...
The growing popularity of aspect-oriented languages, such as AspectJ, and of corresponding design approaches, makes it important to learn how best to modularize programs in which aspect-oriented composition mechanisms are used. We contribute an approach to information hiding modularity in programs that use quantified advising as a module composition mechanism....
Machine learning models are increasingly being used in important decision-making software such as approving bank loans, recommending criminal sentencing, hiring employees, and so on. It is important to ensure the fairness of these models so that no discrimination is made based on protected attribute (e.g., race, sex, age) while decision...
Deep learning is being incorporated in many modern software systems. Deep learning approaches train a deep neural network (DNN) model using training examples, and then use the DNN model for prediction. While the structure of a DNN model as layers is observable, the model is treated in its entirety as...
In recent years, many incidents have been reported where machine learning models exhibited discrimination among people based on race, sex, age, etc. Research has been conducted to measure and mitigate unfairness in machine learning models. For a machine learning task, it is a common practice to build a pipeline that...
In software development, the term “technical debt” (TD) is used to characterize short-term solutions and workarounds implemented in source code which may incur a long-term cost. Technical debt has a variety of forms and can thus affect multiple qualities of software including but not limited to its legibility, performance, and...
Deep Learning (DL) techniques are increasingly being incorporated in critical software systems today. DL software is buggy too. Recent work in SE has characterized these bugs, studied fix patterns, and proposed detection and localization strategies. In this work, we introduce a preventative measure. We propose design by contract for DL...
Machine learning (ML) is increasingly being used in critical decision making software, but incidents have raised questions about the fairness of ML predictions. To address this issue, new tools and methods are needed to mitigate bias in ML-based software. Previous studies have proposed bias mitigation algorithms that only work in...
Web services are distributed software components, that are decoupled from each other using interfaces with specified functional behaviors. However, such behavioral specifications are insufficient to demonstrate compliance with certain temporal non-functional policies. An example is demonstrating that a patient’s health-related query sent to a health care service is answered only...
The ability to adapt software systems to fix bugs, add/change features without restarting it is becoming important for many domains including but not limited to finance, social networking, control systems, etc. Fortunately, many ideas have begun to emerge under the umbrella term “dyanamic updating" to solve this problem. Dynamic updating...
There is some consensus in the aspect-oriented community that a notion of interface between joinpoints and advice may be necessary for improved modularity of aspect-oriented programs, for modular reasoning, and for overcoming pointcut fragility. Different approaches for adding such interfaces, such as aspect-aware interfaces, pointcut interfaces, crosscutting interfaces, explicit joinpoints,...
The Implicit Invocation (II) architectural style improves modularity and is promoted by aspect-oriented (AO) languages and design patterns like Observer. However, it makes modular reasoning difficult, especially when reasoning about control effects of the advised code (subject). Our language Ptolemy, which was inspired by II languages, uses translucid contracts for...
Subtype polymorphism is an important feature available in most modern type systems which makes code reuse and specialization possible. Recent works on separation of crosscutting concerns have created event interfaces (types) to decouple subjects from handlers. Extending the notion of subtyping to these event interfaces is a logical step. In...
Modern software relies on existing application programming in- terfaces (APIs) from libraries. Formal specifications for the APIs enable many software engineering tasks as well as help developers correctly use them. In this work, we mine large-scale repositories of existing open-source software to derive potential preconditions for API methods. Our key...
A majority of modern software is constructed using languages that compute by producing side-effects such as reading/writing from/to files, throwing exceptions, acquiring locks, etc. To understand a piece of software, e.g. a class, it is important for a developer to understand its side-effects. Similarly, to replace a class with another,...
For a number of reasons, such as to generate object code that is compliant with the existing virtual machines (VM), current compilers for aspect-oriented languages sacrifice design modularity when transforming source to object code by losing textual locality and intermingling concerns in the object code. Sacrificing design modularity has significant...
Software systems must face two challenges today: growing complexity and increasing parallelism in the underlying computational models. The problem of increased complexity is often solved by dividing systems into modules in a way that permits analysis of these modules in isolation. The problem of lack of concurrency is often tackled...
Writing correct and efficient concurrent programs still remains a challenge. Explicit concurrency is difficult, error prone, and creates code which is hard to maintain and debug. This type of concurrency also treats modular program design and concurrency as separate goals, where modularity often suffers. To solve these problems, we are...
Software repositories contain a vast wealth of information about software development. Mining these repositories has proven useful for detecting patterns in software development, testing hypotheses for new software engineering approaches, etc. Specifically, mining source code has yielded significant insights into software development artifacts and processes. Unfortunately, mining source code at...
Verifying that a parameterized system satisfies certain desired properties amounts to verifying an infinite family of the system instances. This problem is undecidable in general, and as such a number of sound and incomplete techniques have been proposed to address it. Existing techniques typically focus on parameterized systems with a...
Implicit deep learning has received increasing attention recently, since it generalizes the recursive prediction rules of many commonly used neural network architectures. Its prediction rule is provided implicitly based on the solution of an equilibrium equation. Although many recent studies have experimentally demonstrates its superior performances, the theoretical understanding of...
In today’s software-centric world, ultra-large-scale software repositories, e.g. SourceForge (350,000+ projects), GitHub (250,000+ projects), and Google Code (250,000+ projects) are the new library of Alexandria. They contain an enormous corpus of software and information about software. Scientists and engineers alike are interested in analyzing this wealth of information both for...
Programming languages evolve over time, adding additional language features to simplify common tasks and make the language easier to use. For example, the Java Language Specification has four editions and is currently drafting a fifth. While the addition of language features is driven by an assumed need by the community...
"Explicit concurrency should be abolished from all higher-level programming languages (i.e. everything except -perhaps- plain machine code.). Dijkstra [1] (paraphrased)." A promising class of concurrency abstractions replaces explicit concurrency mechanisms with a single linguistic mechanism that combines state and control and uses asynchronous messages for communications, e.g. active objects or...
Despite[1] their proven benefits, useful, comprehen- sible, and efficiently checkable specifications are not widely available. This is primarily because writing useful, non-trivial specifications from scratch is too hard, time consuming, and requires expertise that is not broadly available. Furthermore, the lack of specifications for widely-used libraries and frameworks, caused by...
Popularity of data-driven software engineering has led to an increasing demand on the infrastructures to support efficient execution of tasks that require deeper source code analysis. While task optimization and parallelization are the adopted solutions, other research directions are less explored. We present collective program analysis (CPA), a technique for...
Programmers often consult an online Q&A forum such as Stack Overflow to learn new APIs. This paper presents an empirical study on the prevalence and severity of API misuse on Stack Overflow. To reduce manual assessment effort, we design Maple, an API usage mining approach that extracts patterns from over...
Component integration creates value by automating the costly and error-prone task of imposing desired behavioral relationships on components manually. Requirements for component integration, however, complicate software design and evolution in several ways: first, they lead to coupling among components; second, the code that implements various integration concerns in a system...
The contribution of this work is the design, implementation, and early evaluation of a programming language that unifies classes and aspects. We call our new module construct the classpect. We make three basic claims. First, we can realize a unified design without significantly compromising the expressiveness of current aspect languages....
Significant interest in applying Deep Neural Network (DNN) has fueled the need to support engineering of software that uses DNNs. Repairing software that uses DNNs is one such unmistakable SE need where automated tools could be very helpful; however, we do not fully understand challenges to repairing and patterns that...
Many data-driven software engineering tasks such as discovering programming patterns, mining API specifications, etc., perform source code analysis over control flow graphs (CFGs) at scale. Analyzing millions of CFGs can be expensive and performance of the analysis heavily depends on the underlying CFG traversal strategy. State-of-the-art analysis frameworks use a...
Deep neural networks (DNNs) are becoming an integral part of most software systems. Previous work has shown that DNNs have bugs. Unfortunately, existing debugging techniques don’t support localizing DNN bugs because of the lack of understanding of model behaviors. The entire DNN model appears as a black box. To address...
Increasingly larger number of software systems today are including data science components for descriptive, predictive, and prescriptive analytics. The collection of data science stages from acquisition, to cleaning/curation, to modeling, and so on are referred to as data science pipelines. To facilitate research and practice on data science pipelines, it...
Training from scratch is the most common way to build a Convolutional Neural Network (CNN) based model. What if we can build new CNN models by reusing parts from previously build CNN models? What if we can improve a CNN model by replacing (possibly faulty) parts with other parts? In...
Deep Neural Networks (DNNs) are used in a wide variety of applications. However, as in any software application, DNN-based apps are afflicted with bugs. Previous work observed that DNN bug fix patterns are different from traditional bug fix patterns. Furthermore, those buggy models are non-trivial to diagnose and fix due...
Today deep learning is widely used for building software. A software engineering problem with deep learning is that finding an appropriate convolutional neural network (CNN) model for the task can be a challenge for developers. Recent work on AutoML, more precisely neural architecture search (NAS), embodied by tools like Auto-Keras...
Fairness of machine learning (ML) software has become a major concern in the recent past. Although recent research on testing and improving fairness have demonstrated impact on real-world software, providing fairness guarantee in practice is still lacking. Certification of ML models is challenging because of the complex decision-making process of...
Can we take a recurrent neural network (RNN) trained to translate between languages and augment it to support a new natural language without retraining the model from scratch? Can we fix the faulty behavior of the RNN by replacing portions associated with the faulty behavior? Recent works on decomposing a...
Machine Learning (ML) software has been widely adopted in modern society, with reported fairness implications for minority groups based on race, sex, age, etc. Many recent works have proposed methods to measure and mitigate algorithmic bias in ML models. The existing approaches focus on single classifier-based ML models. However, real-world...
Deep learning models are trained with certain assumptions about the data during the development stage and then used for prediction in the deployment stage. It is important to reason about the trustworthiness of the model’s predictions with unseen data during deployment. Existing methods for specifying and verifying traditional software are...
Code intelligence tools such as GitHub Copilot have begun to bridge the gap between natural language and programming language. A frequent software development task is the management of technical debts, which are suboptimal solutions or unaddressed issues which hinder future software development. Developers have been found to “self-admit” technical debts...
Programming languages are essential tools for developers, and their evolution plays a crucial role in supporting the activities of developers. One instance of programming language evolution is the introduction of syntactic sugars, which are additional syntax elements that provide alternative, more readable code constructs. However, the process of designing and...
Ultra-large-scale mining has been shown to be useful for a number of software engineering tasks e.g. mining specifications, defect prediction. We propose a new research direction for accelerating ultra-large-scale mining that goes beyond parallelization. Our key idea is to analyze the interaction pattern between the mining task and the artifact...
API documentation is useful for developers to better understand how to correctly use the libraries. However, not all libraries provide good documentation on API usages. To provide better documentation, existing techniques have been proposed including program analysis-based and data mining-based approaches. In this work, instead of mining, we aim to...
CPU vendors are starting to explore trade offs between die size, number of cores on a die, and power consump- tion leading to performance asymmetry among cores on a single chip. For efficient utilization of these performance- asymmetric multi-core processors, application threads must be assigned to cores such that the...
The key notion in service-oriented architecture is decoupling clients and providers of a service based on an abstract service description, which is used by the service broker to point clients to a suitable service implementation. A client then sends service requests directly to the service implementation. A problem with the...
Modular reasoning about concurrent programs is complicated by the possibility of interferences happening between any two instructions of a task (pervasive interference), and these interferences not giving out any information about the behaviors of potentially interfering concurrent tasks (oblivious interference). Reasoning about a concurrent program would be easier if a...
Separating crosscutting concerns while preserving modular reasoning is challenging. Type-based interfaces (event types) separate modularized crosscutting concerns (observers) and traditional object-oriented concerns (subjects). Event types paired with event specifications were shown to be effective in enabling modular reasoning about subjects and observers. Similar to class subtyping, organizing event types into...
The need for concurrency in modern software is increasingly fulfilled by utilizing the message passing paradigm because of its modularity and scalability. In the message passing paradigm, concurrently running processes communicate by sending and receiving messages. Asynchronous messaging introduces the possibility of message ordering problems: two messages with a specific...
Implicit concurrency between handlers is important for event driven systems because it helps simultaneously promote modularity and scalability. Knowing the side-effect of the handlers is critical in these systems to avoid concurrency hazards such as data races. As event systems are dynamic, because statically known and unknown handlers can register...
We propose Candoia, a novel platform and ecosys- tem for building and sharing Mining Software Repositories (MSR) tools. Using Candoia, MSR tools are built as apps, and Candoia ecosystem, acting as an appstore, allows effective sharing. Candoia platform provides, data extraction tools for curating custom datasets for user projects, and...
The popularity of Python programming language has surged in recent years due to its increasing usage in Data Science. The availability of Python repositories in Github presents an opportunity for mining software repository research, e.g., suggesting the best practices in developing Data Science applications, identifying bug-patterns, recommending code enhancements, etc....
Powered by advances in deep learning (DL) techniques, machine learning and artificial intelligence have achieved astonishing successes. However, the rapidly growing needs for DL also led to communication- and resource-intensive distributed training jobs for large-scale DL training, which are typically deployed over GPU clusters. To sustain the ever-increasing demand for...
In a service oriented architecture, certain requirements can be tested by observing the interface of the service whereas other requirements such as data privacy, confidentiality and integrity cannot be tested in this way. After deployment, a requirements monitor is used to analyze the conformance of a web service to such...
General purpose object-oriented programs typically aren’t embarrassingly parallel. For these applications, finding enough concurrency remains a challenge in program design. To address this challenge, in the Panini project we are looking at reconciling concurrent program design goals with modular program design goals. The main idea is that if programmers improve...
Efficient mapping of message passing concurrency (MPC) abstractions to Java Virtual Machine (JVM) threads is critical for performance, scalability, and CPU utilization; but tedious and time consuming to perform manually. In general, this mapping cannot be found in polynomial time, but we show that by exploiting the local characteristics of...
Frameworks and libraries provide application programming interfaces (APIs) that serve as building blocks in modern software development. As APIs present the opportunity of increased productivity, it also calls for correct use to avoid buggy code. The usage-based specification mining technique has shown great promise in solving this problem through a...
This paper introduces a novel type-and-effect calculus, first-class effects, where the computational effect of an expression can be programmatically reflected, passed around as values, and analyzed at run time. A broad range of designs "hard-coded" in existing effect-guided analyses — from thread scheduling, version-consistent software updating, to data zeroing —...
Monitoring or profiling programs provides us with an understanding for its further improvement and analysis. Typically, for monitoring or profiling, the program is instrumented to execute additional code that collects necessary data. A problem is that program instrumentation is often reported to cause between 10% and 390% time and space...
In earlier work, we showed that the AspectJ notions of aspect and class can be unified in a new module construct that we called the classpect, and that this new model is simpler and able to accommodate a broader set of requirements for modular solutions to complex integration problems. We...
As multi-core processors are becoming common, vendors are starting to explore trade offs between the die size and the number of cores on a die, leading to heterogeneity among cores on a single chip. For efficient utilization of these processors, application threads must be assigned to cores such that the...
Many program analyses and optimizations rely on knowledge of cache behavior. The precision of the underlying cache model is increasingly important with the recent uptake of multi-core and many-core architectures for two reasons. First, per-core cache sizes generally decrease as the number of cores becomes large resulting in more cache...
Compiler and programming language implementation courses are integral parts of many computer science curricula. However, the range of topics necessary to teach in such a course are difficult for students to understand and time consuming to cover. In particular, code generation is a confusing topic for students unfamiliar with low...
The semantic gap between specification and implementation languages for sensor networks security protocols impedes the specification and verification of the protocols. In this work, we present SLEDE, an event-based specification language and its verifying compiler that address this semantic gap. We demonstrate the features of SLEDE through an example specification...
Verifying whether a service implementation is conforming to its service-level agreements is important to inspire confidence in services in a service-oriented architecture. A part of these agreements, in particular those that are functional in nature, can be checked by observing the published interface of the service, but other agreements that...
Software repositories contain an enormous amount of information such as
revisions and bugs. Analyzing this data requires knowledge in mining software
repositories and a large amount of infrastructure. We present our
infrastructure to ease such analyses. Our results show writing analyses with
our framework is simpler and executes faster.
Researchers use shared computing clusters to ask interesting questions and
wish to maximize their utilization. Currently, optimizations focus on
individual programs. We present task fusion to automatically merge multiple
tasks into a single task. An example implementation shows fused tasks take
14-90% less time than running the tasks individually.
Today’s aspect-oriented programming (AOP) languages provide software engineers with new possibilities for keeping conceptual concerns separate at the source code level. For a number of reasons, aspect weavers sacrifice this separation in transforming source to object code (and thus the very term weaving). In this paper, we argue that sacrificing...
Aspect-oriented programming languages such as AspectJ offer new mechanisms for decomposing systems into modules and composing modules into systems. Common ways of using these mechanisms couple aspects to complex, changeable implementation details, which can compromise modularity. The crosscut programming interface (XPI) can significantly improve modularity in the design of programs...
A variety of language features to modularize crosscutting concerns have recently been discussed, e.g. open modules, annotation-based pointcuts, explicit join points, and quantified-typed events. All of these ideas are essentially a form of aspect-oriented interface between object-oriented and crosscutting modules, but the representation of this interface differs. Previous works have...
We propose PASS, a O(n) algorithm for data reduction that is specifically aimed at preserving the semantics of time series data visualization in the form of line chart. Visualization of large trend line data is a challenge and current sampling approaches do produce reduction but result in loss of semantics...
Students in all areas of computing require knowledge of the computing device including software implementation at the machine level. Several courses in computer science curricula address these low-level details such as computer architecture and assembly languages. For such courses, there are advantages to studying real architectures instead of simplified examples....
Separating crosscutting concerns while preserving modular reasoning is challenging. Type-based interfaces (event types) separate modularized crosscutting concerns (observers) and traditional object-oriented concerns (subjects). Event types paired with event specifications were shown to be effective in enabling modular reasoning about subjects and observers. Similar to class subtyping, organizing event types into...
The contribution of this work is the design and evaluation of a programming language model that unifies aspects and classes, as they appear in AspectJ-like languages. We show that our model preserves the capabilities of AspectJ-like languages, while improving the conceptual integrity of the language model and the compositionality of...
Dynamic aspect-oriented (AO) features have important software engineering benefits such as allowing unanticipated software evolution and maintenance. It is thus important to efficiently support these features in language implementations. Current implementations incur unnecessary design-time and runtime overhead due to the lack of support in underlying intermediate language (IL) models. To...
The emergence of aspect-oriented programming (AOP) languages has provided software designers with new mechanisms and strategies for decomposing programs into modules and composing modules into systems. What we do not yet fully understand is how best to use such mechanisms consistent with common modularization objectives such as the comprehensibility of...
In today’s software-centric world, ultra-large-scale software repositories, such as SourceForge, GitHub, and Google Code, are the new library of Alexandria. They contain an enormous corpus of software and related information. Scientists and engineers alike are interested in analyzing this wealth of information. However, systematic extraction and analysis of relevant data...
In our previous work, we presented an aspect-oriented intermediate language, named Nu, to preserve design modularity in object code. Nu is based on two primitives: bind and remove. We showed that maintaining modularity in object code significantly improved the incremental compilation time of aspect-oriented programs. The key contribution of this...
Quantification is a distinguishing characteristic of AspectJ-like aspect-oriented languages. Such languages use advice constructs to modify the behavior of execution points. In this work, we contribute an approach and a language de- sign for quantification based on type hierarchies that we call type-based quantification. The key idea is to superimpose...
The contribution of this work is the design, implementation and evaluation of a new aspect-oriented intermediate language model that we call Nu. The primary motivation behind the design of the Nu model is to maintain the aspect-oriented design modularity in the intermediate code for the responsiveness of incremental compilers and...
Sensor networks are often deployed in hostile situations. A number of protocols are being developed to secure these networks. Current means to verify these protocols include simulation, manual inspection, and running them on sensor network testbeds. These techniques leave room for subtle errors in protocol implementations that can be exploited...
The deployment scenarios for sensor networks are often hostile. These networks also have to operate unattended for long periods. Therefore, fault-tolerance mechanisms are needed to protect these networks from various faults such as node failure due to loss of power, compromise, etc and link failure due to network intrusion, etc....
Implicit invocation and aspect-oriented languages provide related but distinct mechanisms for separation of concerns. Implicit invocation languages have explicitly announced events, which runs registered observer methods. Aspect-oriented languages have implicitly announced events, called “join points,” which run method-like but more powerful advice. A limitation of implicit invocation languages is their...
The choice to use static or load-time weaving techniques in the development cycle of large AspectJ programs is not clear. It is a common practice to iteratively remove errors from programs by making small changes, recompiling, and testing the change. Previous research has shown that incremental compilation of aspect-oriented programs...
Web services are distributed software components, that are decoupled from each other using interfaces with specified functional behaviors. However, such behavioral specifications are insufficient to demonstrate compliance with certain temporal non-functional policies. We show an example demonstrating that a patient’s health-related query sent to a health care service is answered...
Integrated systems are those where components must behave together in order to fulfill overall requirements. In such systems, modularization of integration relationships is important for enabling separate component compilation, testing, and debugging, and for enhanced reuse. Existing languages and approaches for modularizing integration relationships work, but do not solve all...
Verifying correctness properties of parameterized systems is a long-standing problem. The challenge lies in the lack of guarantee that the property is satisfied for all instances of the parameterized system. Existing work on addressing this challenge aims to reduce this problem to checking the properties on smaller systems with a...
Dynamic deployment is an important feature of an aspect-oriented language design that has many applications, e.g. in runtime monitoring, runtime adaptation to fix bugs or add features to long running applications, runtime update of dynamic policy changes, etc. Many recently proposed language designs support these use cases. In previous work,...
Dynamic software updating provides many benefits, e.g. in runtime monitoring, runtime adaptation to fix bugs in long running applications, etc. Although it has several advantages, no quantitative analysis of its costs and revenue are available to show its benefits or limitations especially in comparison with other software updating schemes. To...
The notion of events in distributed publish-subscribe systems implies safe concurrency. However, that implication does not hold in object-oriented (OO) programs that utilize events for modularity. This is because unlike the distributed setting, where publisher and subscriber do not share state and only communicate via messages, additional communication between publisher...
Writing correct and efficient concurrent programs still remains a challenge. Explicit concurrency is difficult, error prone, and creates code which is hard to maintain and debug. This type of concurrency also treats modular program design and concurrency as separate goals, where modularity often suffers. To solve these problems, we are...
Many programmers find writing and reasoning about concurrent programs difficult and can benefit from better abstractions for concurrency. A promising class of such concurrency abstractions combines state and control within a single linguistic mechanism and uses asynchronous messages for communications, e.g. active objects or actors. One hurdle is the need...
Programming languages evolve over time, adding additional language features to simplify common tasks and make the language easier to use. For example, the Java Language Specification has four editions and is currently drafting a fifth. While the addition of language features is driven by an assumed need by the community...
The open world assumption makes the design of a type-and-effect system challenging, especially in concurrent object-oriented languages. The main problem is in the computation of the effects of a dynamically dispatched method invocation, because all possible dynamic types of its receiver are not known statically. Previous work proposes effect annotations...
Futures offer a convenient abstraction for encapsulating delayed computation. It is a mechanism to introduce concurrency through a rewrite of the sequential program. However, managing futures is tedious and requires knowledge of concurrency and its concerns. The notion of transparent futures is used to hide the complexity of futures from...
Emerging trends towards performance-asymmetric multicore processors (AMPs) are posing new challenges, because for effective utilization of AMPs, code sections of a program must be assigned to cores such that the resource needs of the code sections closely match the resources available at the assigned core. Computing this assignment can be...
In earlier work, we showed that the AspectJ notions of aspect and class can be unified in a new module construct that we called the classpect, and that this new model is simpler and able to accommodate a broader set of requirements for modular solutions to complex integration problems. We...
In today’s software-centric world, ultra-large-scale software repositories, e.g. SourceForge, GitHub, and Google Code, are the new library of Alexandria. They contain an enormous corpus of software and related information. Scientists and engineers alike are interested in analyzing this wealth of information. However, systematic extraction and analysis of relevant data from...
Implicit invocation (II) and aspect-oriented (AO) languages provide software designers with related but distinct mechanisms and strategies for decomposing programs into modules and composing modules into systems. II languages have explicitly announced events that run registered observer methods. AO languages have implicitly announced events that run method-like but more powerful...
As concurrent software becomes more pervasive, models that provide both safe concurrency and modular reasoning become more important. Panini is one such model, and provides both sparse and cognizant interference based around the concept of capsules. Additionally, web frameworks, Graphical User Interface (GUI) libraries, and other projects are event-driven in...
Current aspect-oriented (AO) compilation techniques fail to preserve the separation of concerns for post-compilation phases. At the minimum, it makes efficient incremental compilation and unit testing of AO programs challenging. The contribution of this work is an improved approach for aspect-oriented compilation. Our approach rests on a new interface between...
In earlier work, we showed that the AspectJ notions of aspect and class can be unified in a new module construct that we called the classpect, and that this new model is simpler and able to accommodate a broader set of requirements for modular solutions to complex integration problems. We...
The contribution of this work is the design, implementation and evaluation of a new aspect-oriented invocation mechanism for preserving design modularity in object code. We call our mechanism Bind. We make three basic claims. First, it is feasible to realize a programming model that supports Bind to preserve design modularity...
Encouraged by the success of data-driven software engineering (SE) techniques that have found numerous applications e.g. in defect prediction, specification inference, the demand for mining and analyzing source code repositories at scale has significantly increased. However, analyzing source code at scale remains expensive to the extent that data-driven solutions to...
Aspect-oriented languages mostly employ implicit language-defined join point models, where well-defined points in the program are called join points and declarative predicates are used to quantify them. The primary motivation for using an implicit join point model is obliviousness and ease of quantification. A design choice for aspect-oriented intermediate languages...
Constructs of dynamic nature, e.g., history-based pointcuts and control-flow based pointcuts, have received significant attention in recent aspect-oriented literature. A variety of compelling use cases are presented that motivate the need for efficiently supporting such constructs in language implementations. The key challenge in implementing dynamic constructs is to efficiently support...
Finding flaws in security protocol implementations is hard. Finding flaws in the implementations of sensor network security protocols is even harder because they are designed to protect against more system failures compared to traditional protocols. Formal verification techniques such as model checking, theorem proving, etc, have been very successful in...
Modern software systems are increasingly including machine learning (ML) as an integral component. However, we do not yet understand the difficulties faced by software developers when learning about ML libraries and using them within their systems. To that end, this work reports on a detailed (manual) examination of 3,280 highly-rated...
As aspect-oriented (AO) programming techniques become more widely used, their use in critical systems such as aircraft and telephone networks, will become more widespread. However, careful reasoning about AO code seems difficult because: (1) advice may apply in too many places, and (2) standard specification techniques do not limit the...
Modular reasoning and concurrent programming are both necessary for scalable development of performant software. Modular reasoning improves scalability by allowing a program to be understood one module at a time. Concurrent programming improves the performance by allowing simultaneous executions of multiple computations in a single program. However, modular reasoning about...
Aspect-oriented programming techniques extend object-oriented programming with new methods to modularize concerns that otherwise would be non-modular. For example, logging concerns are typically scattered across a system but using aspect-oriented techniques they can be localized into a single high-level module. These techniques typically take modular high-level code and statically transform...
Mining software repositories provides developers and researchers a chance to learn from previous development activities and apply that knowledge to the future. Ultra-large-scale open source repositories (e.g., SourceForge with 350,000+ projects, GitHub with 250,000+ projects, and Google Code with 250,000+ projects) provide an extremely large corpus to perform such mining...
Verifying sensor network security protocol implementations using testing/simulation might leave some flaws undetected. Formal verification techniques have been very successful in detecting faults in security protocol specifications; however, they generally require building a formal description (model) of the protocol. Building accurate models is hard, thus hindering the application of formal...
Services in a service-oriented architecture are designed to meet desired functional and non-functional requirements. Conformance of a service implementation to its functional requirements can be tested by observing the interface of the service but it is hard to enforce non-functional requirements such as data privacy and safety properties by monitoring...
Steganalysis deals with identifying the instances of medium(s) which carry a message for communication by concealing their exisitence. This research focuses on steganalysis of JPEG images, because of its ubiquitous nature and low bandwidth requirement for storage and transmission. JPEG image steganalysis is generally addressed by representing an image with...
Increasing the speed of single-core processors has been facing practical challenges. Instead, multi- core architecture has been ascending for the past decade as the dominant architecture. To gain full advantage of multi-core processors, it is unavoidable for programmers to write concurrent programs. However, writing and reasoning about concurrent programs is...
Writing correct and efficient concurrent programs still remains a challenge. Explicit concurrency is difficult, error prone, and creates code which is hard to maintain and debug. This type of concurrency also treats modular program design and concurrency as separate goals, where modularity often suffers. To solve these problems, we are...
Type-and-effect systems are a powerful tool for program construction and verification. Type-and- effect systems are useful because it help reduce bugs in computer programs, enable compiler optimiza- tions and provide program documentation. As software systems increasingly embrace dynamic features and complex modes of compilation, static effect systems have to reconcile...
The process of reading, writing, and reasoning about concurrent programs benefits from better abstractions for concurrency than what many common languages, such as Java, offer. Capsule-oriented programming and the Panini language utilize the idea of combining state and control within a linguistic mechanism along with asynchronous message passing to provide...
Monitoring or profiling programs provides us with an understanding for its further improvement and analysis. Typically, for monitoring or profiling, the program is instrumented to execute additional code that collects necessary data. A problem is that program instrumentation is often reported to cause between 10% and 390% time and space...
Emerging modularization techniques such as aspects and their precursors such as events in implicit invocation languages aim to provide a software engineer with better facilities to separate conceptual concerns in software systems. To facilitate adoption of these techniques in real world software projects, seamless integration into well-accepted practices such as...
The latest trend towards performance asymmetry among cores on a single chip of a multicore processor is posing new software engineering challenges for developers. A key challenge is that for effective utilization of these performance-asymmetric multicore processors, application threads must be assigned to cores such that the resource needs of...
The latest trend towards performance asymmetry among cores on a single chip of a multicore processor is posing new challenges. For effective utilization of these performance-asymmetric multicore processors, code sections of a program must be assigned to cores such that the resource needs of code sections closely matches resource availability...
We propose Candoia, a novel platform and ecosys- tem for building and sharing Mining Software Repositories (MSR) tools. Using Candoia, MSR tools are built as apps, and Candoia ecosystem, acting as an appstore, allows effective sharing. Candoia platform provides, data extraction tools for curating custom datasets for user projects, and...
Performance tuning is the leading justification for breaking abstraction boundaries. We target this problem for message passing concurrency (MPC) abstractions on the Java Virtual Machine (JVM). Efficient mapping of MPC abstractions to threads is critical for performance, scalability, and CPU uti- lization; but tedious and time consuming to perform manually....
Encouraged by the success of data-driven software engineering (SE) techniques that have found numerous applications e.g. in defect prediction, specification inference, etc, the demand for mining and analyzing source code repositories at scale has significantly increased. However, analyzing source code at scale remains expensive to the extent that data-driven solutions...