Software Patterns Survey

In the patterns & practices (p&p) group at Microsoft we have been using software patterns for several years. I’m working with a couple of colleagues on gauging how the folks employing the guidance coming out of p&p (such as guides, application blocks, software factories and reference implementations) have used, are using, and will use software patterns.

To do that we assembled a short survey. If you’ve used patterns I encourage you to take it. If you haven’t used patterns yet but your development tools use patterns one way or another then you could also take it since I’m also interested in your perspective.

You can find the survey Software Patterns: Past, Present, Future here. Thank you in advance for your answers.

A Pattern Language for Versioning

Global Context: Software Change

You are releasing a piece of software (i.e., a software artifact) for others to (re)use as a building block in their software. Reusable code fragments could be delivered as libraries, software components, applications, services, etc. Others find your artifact useful and begin using it. They could use the functions or classes within your library; they assemble applications out of your component(s); they compose your application in their workflows; they build business processes that invoke your service(s).

Once others start reusing your software you discover that you need to modify it. There are instances where modification is not possible. For example, software controlling deep sea sensors or satellite equipment may be out of reach and thus impossible to change, even when its users would like to do so. However nowadays that is the exception rather than the norm. But why change software artifacts that others are using?

One common reason for software change is a bug fix. In spite of stringent testing procedures the probability of finding errors after the software passes quality assurance and ships is not zero. Consequently you (or those using your software) may discover an error in released software. After understanding what is causing the error you modify the code to correct it.

Another driver for change is evolution. The requirements change. For example, new legislation may mandate compliance with a standard. Alternatively there could be a change in the environment. For instance, one of the systems your software is integrating with is being phased out and replaced with a new one. The new systems have slightly different interfaces and your software must change to accommodate them.

Regardless of the reason, changing software could cause failures (an undesired effect), even when the change fixes an existing error. Consequently the value of the change is context-dependent. For some users the change is critical: they cannot operate without it. For others the cost of accommodating the change may be higher than the cost of compensating for the original behavior: their software may be counting on the bugs being there, and removing them would entail significant work.

Software change and its impact on those using the software provide the shared context for this pattern language.

Non-breaking Change

Not every software change causes failure or has an undesired effect on users. Sometimes nothing breaks following a change. Ideally all changes would be non-breaking. In reality this is the exception rather than the rule, which makes this case uninteresting.

Breaking Change

You released a software artifact. After the release you change it to accommodate a bug fix or a new requirement. You release a new artifact that replaces the old one. However its users start having problems; clearly the shift from the old artifact to the new one is not a NON-BREAKING CHANGE.

Why would a software change cause failure?

A change breaks software when the new artifact (i.e., the software with the change applied) violates one or more explicit or implicit expectations that its users, or other software that depends on it, have about the old artifact.

Therefore:

A change that violates an expectation about a software artifact impacts its dependents. Users interacting with the artifact may experience unexpected behavior. Likewise, other software using the artifact may fail to integrate. These violations have the potential to cause failure, thus making the change a BREAKING CHANGE.

The expectations could be embodied as contracts or assumptions.

The contracts subject to change when a new artifact replaces an old one could be:

  • Syntactic. For example, the number of arguments required by a method call changes from 2 to 3.
  • Semantic. For example, the amount returned by a service changes from comprising no taxes to including the sales tax.
  • Behavioral. For example, a function call changes from being side-effect free to having side-effects.
  • Quality of service (QoS). For example, a credit check service’s availability changes from 6AM-6PM to 2AM-midnight.
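The first two kinds of contract change can be illustrated with a minimal sketch. All function names and the 10% tax rate below are hypothetical, chosen only to mirror the examples in the list: a client written against the old contract either fails outright (syntactic) or silently receives a different value (semantic).

```python
# Hypothetical old and new versions of a pricing artifact (illustrative names).

def price_v1(amount, quantity):
    """Old contract: two arguments, result excludes sales tax."""
    return amount * quantity


def price_v2_syntactic(amount, quantity, currency):
    """Syntactic change: a third required argument."""
    return amount * quantity


def price_v2_semantic(amount, quantity):
    """Semantic change: same signature, result now includes an assumed 10% tax."""
    return amount * quantity * 110 // 100


def client(price):
    """A dependent written against the old contract."""
    return price(100, 2)


def breaks_client(price):
    """Return how the client breaks ('syntactic' or 'semantic'), or None."""
    try:
        result = client(price)
    except TypeError:
        # The call no longer matches the signature.
        return "syntactic"
    if result != client(price_v1):
        # The call succeeds but the meaning of the result changed.
        return "semantic"
    return None
```

Note that the semantic break raises no error at all, which is what makes this kind of change harder to detect than the syntactic one.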

Assumptions can be violated as well. Garlan [ref] discusses the consequences of dealing with implicit assumptions in the context of composite applications.

BREAKING CHANGEs may keep the users of your software artifact from replacing the old with the new. This hurts maintainability, so releasing a BREAKING CHANGE may not be feasible.

Substitutability Check

People use software that is subject to change. Since their correct operation depends on this software they want to assess whether a software update (i.e., replacing an old piece of software such as a library with a new one) would cause their system to break.

Can I substitute the old with the new?

Answering this question entails checking the effect of replacing the old with the new.

The easiest way is to replace the old with the new and see whether the system still works as it should. However, if there are problems, doing so may have irreversible consequences (e.g., in a life support system). Even if you can do it, what guarantees do you have that your observations cover all the possible effects of the changes? You need a means of testing for breaking changes that has low risk (it doesn’t impact the existing system) and offers great accuracy (i.e., no misses).

Therefore:

Perform a substitutability check to determine whether a software artifact (library, component, service, etc.) can be replaced with a new one without causing a breaking change.

No side-effects translates into performing the check outside the live environment.

There are two options: analytical and empirical. High accuracy cannot be achieved through an empirical check unless the artifact is small and simple. An analytical check requires iterating through all changes and assessing their potential impact. This is impossible without a list of what has changed (i.e., a diff) and an understanding of the changes and their potential interactions and side effects. It is feasible only if you have visibility inside the artifact and understand how it works.

There are several dimensions to performing a SUBSTITUTABILITY CHECK.

One dimension covers who performs the check:

  • Person (manual)
    • Read the changes file and assess impact
    • Perform visual inspection
    • Try it out
    • Perform formal analysis
    • Run test suite
  • Software (automatic)
    • Manager component

Another dimension covers when the SUBSTITUTABILITY CHECK is performed:

  • At evaluation time. You receive an update and decide whether to substitute (upgrade)
  • At build time. The IDE/compiler must pick between the old and the new library to link against
  • At run time. The runtime must pick between the old and the new dynamic library to load. As with other late binding mechanisms, going this route requires accounting for a failed check at run time.
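An automated, empirical check performed outside the live environment could look like the following minimal sketch. The function names and the idea of replaying recorded inputs are assumptions made for illustration. As noted above, an empty result only means that no break was observed on these inputs, not that none exists.

```python
def _outcome(func, args):
    """Capture either the return value or the type of exception raised."""
    try:
        return ("ok", func(*args))
    except Exception as exc:  # deliberately broad: any failure is an observation
        return ("error", type(exc).__name__)


def substitutability_check(old, new, recorded_inputs):
    """Run both artifacts on the same inputs and report where they differ.

    Returns a list of (args, old_outcome, new_outcome) tuples.
    """
    mismatches = []
    for args in recorded_inputs:
        expected = _outcome(old, args)
        actual = _outcome(new, args)
        if expected != actual:
            mismatches.append((args, expected, actual))
    return mismatches


# Hypothetical old and new versions of a small library function.
def divide_old(a, b):
    return a // b


def divide_new(a, b):
    # The "fix" returns 0 instead of raising; dependents may rely on the error.
    return a // b if b else 0


report = substitutability_check(divide_old, divide_new, [(4, 2), (1, 0)])
```

Here the report contains a single entry for the input `(1, 0)`: the old version raises while the new one returns a value, which a dependent relying on the exception would experience as a BREAKING CHANGE.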

The SUBSTITUTABILITY CHECK is a procedure that determines whether replacing an old software artifact with a new one will cause BREAKING CHANGEs. The next pattern answers the same question without requiring complex analysis.

[More coming soon so stay tuned]

Process Models for the Masses

A decade ago few people dealing with code were interested in business processes. That’s no longer the case. For a variety of reasons many developers discuss processes in one form or another. Some want to understand the business process that the application they’re building implements. Others want to grok the coordination of the services comprising their SOA. Yet others leverage workflow engines to implement the flow of pages in web applications (i.e., page flow).

A large fraction of these folks assume that processes always involve activities and their sequencing. Right? Wrong! Just because the workflow/orchestration/coordination/etc. engine they picked to implement their process revolves around activities doesn’t mean that that’s the only way to think about a process. There are many instances when representing a process as a set of activities is infeasible. Consider for example a web-based shopping site: at any point you can perform one out of many activities: add something to your shopping cart; remove something from your cart; check out; update your profile; save the cart for another visit; and so on. How would you represent something like this as a sequence of activities?

What’s missing is the process model, something that not many newcomers are aware of. In a nutshell the process model provides the set of abstractions, relationships, and constraints that allows people to define processes. Several process models are available, each of which is suitable for a particular class of problems.

The most popular process model is activity-based. Its abstractions represent the activities performed as part of the process (or the states between these activities). This model is suitable when the process can be described through activities. It is also easy to understand because the activities resemble the elements of structured programming. However, as the web-based shopping example illustrates, sometimes it’s cumbersome to express a process as a set of activities.

Another process model uses conversations among participants as the fundamental abstraction. This model has its roots in social sciences and was pioneered by Terry Winograd (a computer scientist) and Fernando Flores (a philosopher). Together they founded Action Technologies, a company that in the mid-1980s built the first workflow system employing a conversation-based process model. Winograd and Flores describe the ideas behind their Business Interaction Model and ActionWorks in Understanding Computers and Cognition. Anyone serious about workflow ought to read their book.

A third process model uses the artifact produced as the process unfolds as its key abstraction. Though not as popular as the activity-based process model, the artifact-based model is well suited for representing processes where at any point one out of many actions is possible, such as an e-commerce Web site. The work of Richard Hull and Jianwen Su (among others) focuses on this process model. AT&T’s Vortex workflow system represents a research prototype built around the artifact-based process model.
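To make the artifact-based idea concrete, here is a minimal sketch of the shopping example. It is not the formalism of Hull and Su or of Vortex; the class and action names are hypothetical. The point is that the set of enabled actions is derived from the state of the artifact (the cart) rather than from a predefined sequence of activities.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """The artifact: its state, not a fixed activity sequence, drives the process."""
    items: dict = field(default_factory=dict)
    checked_out: bool = False

    def enabled_actions(self):
        """Actions currently permitted, derived from the artifact's state."""
        if self.checked_out:
            return set()
        actions = {"add", "update_profile"}
        if self.items:
            actions |= {"remove", "checkout", "save"}
        return actions

    def apply(self, action, item=None):
        """Apply an enabled action; reject anything the current state forbids."""
        if action not in self.enabled_actions():
            raise ValueError(f"{action!r} is not enabled in the current state")
        if action == "add":
            self.items[item] = self.items.get(item, 0) + 1
        elif action == "remove":
            self.items.pop(item, None)
        elif action == "checkout":
            self.checked_out = True
        # "save" and "update_profile" leave the cart unchanged in this sketch.
```

An empty cart enables only `add` and `update_profile`; once an item is added, `remove`, `checkout`, and `save` become available, in any order the shopper chooses.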

I’m sure there are other process models, including hybrid ones that combine some of the above. Unfortunately the literature is scarce in this area. In addition, the increasing popularity of the activity-based process model (mostly because people don’t know any better) will only bury the other models even deeper.

Feature Extraction Revisited

A visit to pandora.com prompted me to revisit the topic of feature extraction. Tim Westergren’s Music Genome Project is probably one of the coolest ways of exploring feature extraction and relevance feedback:

  • The feature extraction part extracts the “phenotypes” from a piece of music you like and uses those features to find similar tunes.
  • The relevance feedback part uses your input (thumbs up/down) to refine the search.
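The two parts can be sketched in a few lines. This is not the Music Genome Project’s algorithm, which the post does not describe; it is a generic illustration that assumes each track is already reduced to a numeric feature vector, with hypothetical track names and a made-up update rate.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(profile, catalog):
    """Return the catalog title whose features are closest to the profile."""
    return max(catalog, key=lambda title: cosine(profile, catalog[title]))


def feedback(profile, features, liked, rate=0.5):
    """Move the profile toward a liked track, or away from a disliked one."""
    sign = 1.0 if liked else -1.0
    return [p + sign * rate * f for p, f in zip(profile, features)]


# Hypothetical two-feature catalog and listener profile.
catalog = {"track_a": [1.0, 0.0], "track_b": [0.0, 1.0]}
profile = [0.4, 0.6]

first = recommend(profile, catalog)
# Thumbs down on the first suggestion shifts the profile away from it.
profile = feedback(profile, catalog[first], liked=False)
second = recommend(profile, catalog)
```

With these numbers the first suggestion is `track_b`; after the thumbs down the profile moves away from it and the next suggestion is `track_a`, which is the kind of course correction described below.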

So starting from “Jimmy Smith” and after a few course corrections the suggestions (e.g., Lou Donaldson’s Funky Mama) started to sound like what I was after. It’s great to see feature extraction and relevance feedback demonstrated in such an intuitive way. It’s also great to see that the Music Genome Project got it right. Others are still having problems employing these technologies well. For example, Amazon’s recommendations insist on using items that I bought but not for myself. I bet they’d get more mileage (read: sales) if their recommendation algorithms discriminated between an item’s intended recipient and the person buying it. Are you listening?

Read this book: Linked

Albert-László Barabási’s book Linked: The New Science of Networks is probably one of the books that stand out from the ones I read in 2004. The author does a marvelous job of pointing out that many networks we know of (including social networks such as St. Paul’s) follow a power law. While reading about DoS attacks, six degrees of separation, Pareto’s law, Google, the Faloutsos brothers, and other interesting stories, keep in mind that this work comes from a group of statistical physicists rather than computer scientists (though the references to Bose-Einstein condensation would give that away). Good to see that they’re at it again!

Distributed Systems are Hard

Working with distributed systems is hard. Many programmers who otherwise do a very good job writing application code do not fully grasp the challenges of distributed computing. One of these challenges stems from having to deal with errors that you’d not encounter when writing application code. For example, Apple’s weather dashboard widget provides a quick view of the 7-day forecast. The forecast data comes from some system(s) across the Internet. In other words, the widget is in fact a small distributed application. However, sometimes pulling the forecast data catches it with its pants down, and you’ll end up seeing 7 consecutive Not a Number values.
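How the widget actually processes its data is not known here, so the following is only a sketch of the general defensive technique: treat every value arriving from a remote system as possibly missing or malformed, and render a placeholder instead of a raw Not a Number. The function name and the `--` placeholder are assumptions.

```python
import math


def format_forecast(raw_values):
    """Render temperatures, showing a placeholder when remote data is unusable."""
    rendered = []
    for raw in raw_values:
        try:
            value = float(raw)
        except (TypeError, ValueError):
            # Missing (None) or malformed payload from the remote system.
            value = math.nan
        rendered.append("--" if math.isnan(value) else f"{value:.0f}")
    return rendered


# One good reading followed by three kinds of bad remote data.
display = format_forecast(["21.4", None, "nan", "oops"])
```

In this example only the first reading is rendered as a number; the other three entries come out as the placeholder.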

CIO Got It Right…

My co-author Boris Lublinsky sent a link to an article titled The truth about SOA. It reminded me of several evaluation projects where I recommended against continuing down the SOA path because the organization didn’t need (or was not ready for) it. Glad to see some of my rationale and recommendations reinforced by others!

Sound Engineering and Premature Extrapolation

In a recent post Patrick Logan has a few pointers and quotes about Premature Extrapolation. Henry Petroski’s Design Paradigms: Case Histories of Error and Judgment in Engineering provides a similar perspective, albeit from an angle that has nothing to do with computers.

One of the key messages that Petroski’s book is trying to get across revolves around what Patrick calls premature extrapolations. Paraphrasing Petroski, mediocre engineers scale up designs that worked in the past, hoping that they will also work at a larger scale. Good engineers don’t blindly scale; they start from the other end, analyzing potential failures and then designing to prevent them. One of the many examples he uses to illustrate this point is John A. Roebling’s Brooklyn Bridge (which belongs in the latter category) and Clark Eldridge’s Tacoma Narrows Bridge (which belongs in the former). The book offers many other examples.

I’ve been a fan of Alan Kay’s “Good ideas do not always scale” for many years. I’ve even used it in Chapter 18 of PLoPD4. After reading Petroski’s book I thought that any computer scientist going through it ought to be able to come up with that line.

Silicon Valley’s Secret Sauce

In his latest essay Paul Graham dissects what makes Silicon Valley “the” Silicon Valley. As someone who lived and worked in two (self-proclaimed) Silicon Valley-like technology parks (the Silicon Alps and the Silicon Prairie) I found Graham’s discussion of the key ingredients interesting. (BTW, Wired sheds additional light on the Silicon envy.) While Grenoble has great schools and location (with ski slopes a 45-minute bus ride from the campus), and Chambana great schools and cornfields (with tall, thick corn next to the movie theater’s parking lot), neither achieved the Silicon Valley critical mass while I lived there. I also have the benefit of having read an excellent book on the topic, and am waiting for a second one to be published:

  • The Man Behind the Microchip: Robert Noyce and the Invention of Silicon Valley, a great book in at least 3 ways: the story of Robert Noyce; the history of semiconductors (from Ge transistors to Intel’s 4004); and the history of the Valley.

  • Broken Genius: The Rise and Fall of William Shockley, Creator of the Electronic Age is not out yet, but I’ve added it to my wishlist after hearing an NPR interview with the author.

Pattern Languages of Program Design

After a long gestation the fifth volume of Pattern Languages of Program Design (PLoPD) has been published.

Markus, James, and I selected among patterns workshopped at PLoP conferences from 1998 through 2004. We structured the book in six parts. Part I focuses on design and contains patterns aimed at people designing object systems. As the Internet and embedded systems continue to expand their reach they bring with them concurrency and resource management problems; Part II contains patterns on these topics. Part III continues the shift from one to many applications and contains patterns for distributed systems. The domain-specific patterns from Part IV focus on mobile telephony and Web-based applications. Part V shifts gears to architecture and comprises patterns that tackle composition, extensibility, and reuse. Finally, Part VI offers a smorgasbord of meta-patterns for improving the quality of pattern papers and helping their authors.

Here’s what you can find in each chapter:

  1. The Dynamic Object Model pattern combines ideas from class-based (like Java) and prototype-based (like Self) languages to address elaborate flexibility requirements.
  2. The Domain Object Manager allows application code to handle transient and persistent domain objects while supporting multiple data stores or application servers, keeping the domain objects independent of the persistence or middleware APIs.
  3. Encapsulate Context provides value to developers who are seeking ways to lower the ripple effects of code changes regardless of programming language, allowing them to manage an increasing number of call parameters without introducing global variables.
  4. A Pattern Language for Efficient, Predictable, Scalable, and Flexible Dispatching Components addresses the challenges associated with developing dispatching components, providing one of the building blocks of a handbook for distributed real-time and embedded middleware.
  5. Writing software for real-time processing is hard and poses unique challenges. Triple-T covers five patterns harvested from time-triggered bus architectures for safety-critical real-time systems, such as the ones employed by Airbus airplanes or BMW and DaimlerChrysler automobiles.
  6. Real Time and Resource Overload Language presents a pattern language for designing reactive systems that gracefully accommodate load bursts, applicable to any systems that process incoming requests, such as web servers, middleware, OLTP, and so on.
  7. Drawing upon examples from several systems that deal with distribution, De-Centralized Locking covers a pattern for managing locks in the context of distributed systems.
  8. The Comparand Pattern focuses on dealing with identity when working with objects from different hosts or processes within a distributed system.
  9. Service Discovery tackles a common problem in practical ways, distilling solutions from well-established examples such as SLP, JXTA, UPnP, LDAP, DNS, and the now ubiquitous IEEE 802.11.
  10. MoRaR is a pattern language focused on mobility and radio resource management.
  11. Content Conversion and Generation on the Web is a pattern language aimed at people building applications that deal with dynamic HTML generation.
  12. Plug-ins represent a popular technique for extending applications, allowing users to late-bind new functionality. The Patterns for Plug-Ins pattern language covers techniques mined from a wide body of software, including operating systems, web browsers, graphics programs, and development environments.
  13. The Grid Middleware Architectural Pattern covers the architectural elements of grid middleware as well as guidelines for implementing and deploying them.
  14. Targeting application developers integrating components written in different languages or built with heterogeneous component concepts, the Patterns of Component and Language Integration distill insight from systems that include Apache Axis and the Simplified Wrapper Interface Generator (SWIG).
  15. Patterns for Successful Framework Development covers a set of patterns for mitigating the mismatch between the recommended practice of building frameworks and the reality of many software development projects.
  16. Distilling from 10 years of writing and reviewing patterns, Advanced Pattern Writing provides advice for improving pattern writing.
  17. A Language Designer’s Pattern Language explains how one generates a pattern language from a system of forces.
  18. The Language of Shepherding focuses on analyzing and providing feedback about patterns.
  19. Patterns of the Prairie Houses uses the design themes of Frank Lloyd Wright’s prairie houses as an exploratory vehicle for showing how people should approach pattern mining and writing.