One aspect of the development of Web-enabled systems that has received increasing attention is information modeling, particularly navigation models and their relationships to the underlying content. These models have, however, typically focused on modeling at a relatively low level and have failed to address higher-level aspects, such as architectural and even business process modeling. In this paper we introduce a set of formal extensions to an existing modeling language - WebML - that facilitate information modeling at this higher level of abstraction. We argue that these extensions will provide a clearer connection between an understanding of business models and processes, and the lower-level designs typically represented in existing models.
Web, Information Architecture, Design, WebML, Model
Since its introduction in the early 1990s, the Web has continued to evolve at a fast pace. Web-enabled systems are becoming increasingly crucial to almost all sectors of society. This has been accompanied by a rapid increase in the complexity of these applications.
Various approaches have been developed or adapted for representing this complexity. At lower levels we have models of the detailed system design. Functional design models are relatively well-established, with the dominant model now (arguably) being UML [2, 9]. Information design modeling is somewhat less mature. Typically we wish to model not only the information itself, but also the relationship between the underlying content and the user-perceived views of that content, and the interactions with those views (such as navigational aspects). Whilst existing modeling languages (such as UML) can be used to represent the functional aspects, they are not as effective at representing these informational aspects. The result has been the emergence of a number of informational modeling approaches specifically developed for Web (or hypermedia) applications. Example approaches, such as RMM and OOHDM, and more recently WebML and adaptations of UML [1, 4, 8], have provided the ability to model the underpinning content and (to a limited extent) the way in which we interact with this information. However, these approaches still have various limitations. For instance, all current approaches lack the ability to model Web systems at higher levels of abstraction and to link effectively with models of the business model and processes (a good example of the latter that is relevant to the Web is the e3-value business modeling method).
Work on information architectures addresses these issues to a limited extent, though the notations and models used are rarely consistent with those used for lower-level information modeling, and as a result the two are rarely integrated effectively. Similarly, whilst information architectures often address the development of an understanding of user interactions and engagement with a site, and the way in which this influences the information organization, the nature of the information exchange and the inter-relationships between these information domains are often overlooked or not modeled explicitly. Nor do these models usually give effective consideration to the information environment in which the system exists.
We propose an extension to an existing modeling language - WebML - that addresses these limitations. We refer to this extended WebML model as WebML+. A key point of these extensions is that the WebML+ approach is built around the notion of information flows at the level of business processes. This enables the models to form a link between higher-level models (specifically, business models) and lower-level design models - a characteristic that is crucial in Web development, where the systems under development often lead to fundamental changes in business processes and models.
WebML+ enables developers to express the core features of a system at a higher level, without committing to detailed architectural designs. It can be considered as an extension to WebML (see www.webml.org). The purpose of WebML+ modeling is to define both the internal and the external information flows of a Web system. As with WebML, we have defined both a graphical notation and an XML-based formal notation for representing WebML+ models (though we do not show the formal XML DTD here). The graphical notation is designed so that models can be communicated effectively to non-technical members of development teams.
Figure 1 shows a WebML+ model for a hypothetical example: FreeMail is a provider of free web-based e-mail that allows users to send and receive messages through a Web interface. In this example we have three participating actors: the FreeMail organisation itself as an internal actor, and two external actors, users and advertising companies. Internal actors such as FreeMail manually provide information directly to the system, in this case a set of user policies and rules. Of the external actors, advertising companies provide advertisements to the system and receive invoice information from it (which is, in turn, derived from the advertisements themselves and the number of impressions of each advertisement). Users provide outgoing emails as well as user information to the system, and receive incoming emails, storage usage information, and user information and profiles.
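To make the flows in this example concrete, the following is a minimal Python sketch that records them as plain data. It is an illustration only and is not the WebML+ graphical or XML notation; the class names (`Actor`, `Flow`, `ActorKind`, `Direction`) and the helper `flows_for` are hypothetical names introduced here, and only the actors and flows themselves are taken from the FreeMail description.

```python
from dataclasses import dataclass
from enum import Enum


class ActorKind(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"


class Direction(Enum):
    PROVIDES = "provides"  # information flows from the actor to the system
    RECEIVES = "receives"  # information flows from the system to the actor


@dataclass(frozen=True)
class Actor:
    name: str
    kind: ActorKind


@dataclass(frozen=True)
class Flow:
    actor: Actor
    direction: Direction
    information: str


# Actors named in the FreeMail example.
freemail = Actor("FreeMail", ActorKind.INTERNAL)
advertisers = Actor("Advertising companies", ActorKind.EXTERNAL)
users = Actor("Users", ActorKind.EXTERNAL)

# Flows as described in the text (hypothetical encoding, not WebML+ syntax).
flows = [
    Flow(freemail, Direction.PROVIDES, "user policies and rules"),
    Flow(advertisers, Direction.PROVIDES, "advertisements"),
    Flow(advertisers, Direction.RECEIVES, "invoice information"),
    Flow(users, Direction.PROVIDES, "outgoing emails"),
    Flow(users, Direction.PROVIDES, "user information"),
    Flow(users, Direction.RECEIVES, "incoming emails"),
    Flow(users, Direction.RECEIVES, "storage usage information"),
    Flow(users, Direction.RECEIVES, "user information and profiles"),
]


def flows_for(actor, direction):
    """Return the information items an actor provides or receives."""
    return [
        f.information
        for f in flows
        if f.actor == actor and f.direction is direction
    ]
```

Calling `flows_for(users, Direction.RECEIVES)` lists what users receive from the system. Keeping the flows as data, separate from any page structure, mirrors the emphasis on information flows rather than on detailed design.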
Let us consider what is represented in the model. The large dashed polygon represents the organisation (i.e. it does not differentiate between those elements that are internal to the system and those elements that are outside the scope of the system but managed by internal business processes). This encloses a set of information units, which are coherent and cohesive domains of information that are managed by, or related to, the system. All information within a single unit shares a common context and a common derivation (this is explained shortly). Information units do not map directly to pages or sets of pages; a single web page may contain partial information from multiple units. Similarly, an information unit may be distributed over multiple pages. Different types of information units are represented graphically using different types of icons.
Some information units are provided directly by actors. For example, advertisements are provided by advertising companies, and user policy and rules are provided by FreeMail. However, many information units are derived from other units rather than being provided explicitly. These derivations (shown as triangles with incoming and outgoing arrows) capture the inter-relationships between the information units. For example, presented emails are derived from the incoming email database, the user information database and the advertisement database.
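The derivation relationships described above can be sketched as a simple mapping from each derived unit to the units it is derived from. This is a hedged illustration, not the formal WebML+ model: the dictionary `DERIVATIONS` and the function `underlying_units` are hypothetical names, the unit label "advertisement impressions" paraphrases the "number of impressions" mentioned in the example, and the sketch assumes that derivations contain no cycles.

```python
from typing import Dict, List, Set, Tuple

# Derived unit -> the units it is derived from, as stated in the FreeMail
# example. Units that do not appear as keys are treated as not derived.
DERIVATIONS: Dict[str, List[str]] = {
    "presented emails": [
        "incoming email database",
        "user information database",
        "advertisement database",
    ],
    "invoice information": [
        "advertisements",
        "advertisement impressions",
    ],
}


def underlying_units(
    unit: str,
    derivations: Dict[str, List[str]] = DERIVATIONS,
    _path: Tuple[str, ...] = (),
) -> Set[str]:
    """Return the non-derived units that a unit ultimately depends on.

    A unit with no recorded derivation is its own underlying unit.
    """
    if unit in _path:
        # Assumption: derivations are acyclic; fail loudly otherwise.
        raise ValueError(f"cyclic derivation involving {unit!r}")
    inputs = derivations.get(unit)
    if inputs is None:
        return {unit}
    result: Set[str] = set()
    for source in inputs:
        result |= underlying_units(source, derivations, _path + (unit,))
    return result
```

For instance, `underlying_units("invoice information")` returns the two units named in the example as its inputs. Walking derivations in this way is one possible means of tracing a derived unit back to the information it depends on.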
WebML+ is an architectural-level specification language for defining Web systems, based on an extension to WebML. WebML+ stresses the definition of flows of information, which the developer can decompose into WebML models, and hence provides a mechanism for linking abstract system modeling to the detailed design models in WebML. We argue that our approach also provides a clearer view of an information architecture than the site maps that are often adopted. In particular, this approach clarifies the information environment within which the system exists and the inter-relationships between the various sources and sinks of information. This is also an important step in providing a clearer connection between the information and functional perspectives of a system. Ongoing work is focusing on refining the model (including understanding different types of information provision and derivation) and clarifying the relationships with business models and processes. We have also undertaken experiments aimed at evaluating the extent to which users are able to understand a system from a WebML+ model as compared with a purely textual description. Full details of the formal WebML+ model and the outcomes of the evaluation will be reported elsewhere.