Fundamental Flaws in the "Anchor Point-Anchor Section" Model and Potential Fixes

By
Edward F. Granzow
23 October 1996

Introduction

A well-documented interest of government transportation agencies and related groups is the development of workable and flexible methodologies and procedures for overlaying transportation data on digital maps and related GIS products. To a certain extent, this need has been addressed by the major software suppliers in the GIS industry. However, vendor-implemented methods have proven to have only limited applicability to effectively describing the elements of the transportation system.

The fundamental reason it is so difficult to describe transportation system features with the available tools is that the primary means of description is through linear measures and references, not Cartesian references. Although this problem is not unique to transportation, the variety and sophistication of methods and models used by transportation generate the most extensive requirements for linear referencing methods.

This problem is quite well recognized in the transportation community, and a number of attempts have been made to address the issues through gatherings of specialist representatives from the industry. Three notable efforts in this area were the Workshop on a Generic Data Model for Linear Referencing Systems (August 1994); the Linear Referencing and the Spatial Data Transfer Workshop (January 1996); and the Enterprise Location Referencing Workshop (July 1996). While all of these have contributed to some extent to the development of a standardized and comprehensive solution, no such solution has yet been found.

The product of the Workshop on a Generic Data Model for Linear Referencing Systems, held in Milwaukee, WI in August 1994, was a consensus data model described as having the capability to effectively describe linear features and the routes they lie on; economically link those descriptions to a corresponding two dimensional map representation; and allow transformation between the different overlaying systems being used to reference linear information. When the Linear Referencing and the Spatial Data Transfer Workshop was held as part of the 1996 Annual Transportation Research Board meeting in Washington, D.C. in January 1996, certain problems and issues with the Milwaukee model were illustrated. These problems were discovered during a pragmatic prototype implementation of the model, when an attempt was made to incorporate it into the AASHTO GIS-T Pooled Fund Study, then being conducted to establish a GIS based approach to the design of ISTEA Management Systems.

The purpose of this paper is to explore the flaws discovered in the process of attempting to apply the model (AASHTO GIS-T) and to examine some changes in the data model which could potentially address those flaws. It should be stated from the outset that the ideas expressed in this paper are provisional at best and that the goal is to explore their value, not to indicate that they are a solution.

The paper is divided into five sections. This is the first section. It is followed by a short summary of the salient characteristics of the Milwaukee model; a section describing my understanding of the problems which have been identified; a generic discussion of the data model concepts which I believe may potentially be useful in addressing the identified problems; a more directed discourse on how these concepts can be practically applied to the basic problem of linear referencing; and a discussion of the impacts of the proposed model modifications on its practical implementation. I should also be clear that it is the opinion of this author that while the Milwaukee model does offer the possibility of extending the state of the art and could potentially be a building block for addressing the full range of linear referencing needs and issues, in its current form and state of development it is only a component of the solution.

The Milwaukee Model

The Milwaukee model was developed through the joint efforts of about fifty industry professionals at a specialty National Cooperative Highway Research Program conference workshop held over two days in August 1994. The model represents the best efforts of this group to design a data modelling approach to linear referencing which is both flexible and practical.

The primary and innovative feature of the model is an underlying layer of points and connecting segments which provides a kind of Rosetta Stone to translate between any different layers superimposed on it. These points and segments were named Anchor Points and Anchor Sections to denote their fundamental role in understanding all other data maintained as part of the model. A second unique and powerful element of the model is the decoupling of the references defined in the linear system from the connected two dimensional cartographic map representation. This separation was achieved by linking points in the cartographic database to the anchor point-anchor section layer. This approach has highly significant positive implications for integrated maintenance of cartographic and linear referenced data.

The anchor point-anchor section base layer supports a series of parallel layers which describe the building blocks for designating specific paths in the linear reference system and their associated metric as Traversals. (The more intuitively recognizable name for these paths is routes. This was not used because of the confusing number of meanings it already has.) These parallel layers which support traversals consist of Nodes and Links. The set of anchor points is a subset of the set of nodes, but not every node is an anchor point. A single anchor section can have many nodes and links lying along its length. Typically, each linear referencing system will have its own node-link layer. These layers then in turn support the traversals associated with that linear referencing system.

Two additional data elements are incorporated into the Milwaukee model. These are Traversal Reference Points and Events. Traversal reference points provide for recalibration of measurement of distance along specific traversals. In the real world these are stationary reference markers such as mileposts. Events are references to occurrences along traversals, which can be point or linear phenomena. This completes the vocabulary required to describe the full range of data infrastructure required to indicate a linear referenced element and its characteristics.
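The vocabulary above can be summarized as a set of record types. The following is only a minimal sketch in Python: the class names follow the terms of the Milwaukee model, but every field name (for example `offset`, `measure`, `start_offset`) is an assumption made for illustration and is not taken from the workshop specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AnchorPoint:
    point_id: str


@dataclass(frozen=True)
class AnchorSection:
    section_id: str
    from_point: str   # AnchorPoint id at one end
    to_point: str     # AnchorPoint id at the other end
    length: float     # distance in an assumed "universal" unit


@dataclass(frozen=True)
class Node:
    node_id: str
    section_id: str                        # anchor section the node lies on
    offset: float                          # distance from the section's from_point
    anchor_point_id: Optional[str] = None  # set when the node is also an anchor point

    @property
    def is_anchor_point(self) -> bool:
        return self.anchor_point_id is not None


@dataclass(frozen=True)
class Link:
    link_id: str
    from_node: str
    to_node: str
    length: float     # in the metric of the owning linear referencing system


@dataclass
class Traversal:
    traversal_id: str
    link_ids: List[str]   # ordered links making up the path


@dataclass(frozen=True)
class TraversalReferencePoint:
    trp_id: str
    traversal_id: str
    measure: float    # e.g. the value posted on a milepost


@dataclass(frozen=True)
class Event:
    event_id: str
    traversal_id: str
    trp_id: str
    start_offset: float                 # offset from the reference point
    end_offset: Optional[float] = None  # None for a point event

    @property
    def is_linear(self) -> bool:
        return self.end_offset is not None
```

Each linear referencing system would hold its own collection of `Node`, `Link` and `Traversal` records, all referring back to one shared collection of `AnchorPoint` and `AnchorSection` records.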

As will be seen in the next section, the problems arising with the Milwaukee model have to do with practical application, not its theoretical soundness. I submit that the fundamental data structure as developed does adequately address the design specification that was developed as part of the workshop's agenda.

Practical Issues with the Milwaukee Model

The results of applying the Milwaukee model to the practical linear referencing requirements of developing ISTEA management systems in the AASHTO GIS-T Pooled Fund Study were disclosed at the workshop at TRB in January of this year. The fundamental problem which was identified was the need to have an anchor point-anchor section template which incorporates all of the nodes and links referenced at the anchor layer for any dependent linear referencing system. In other words, the anchor point-anchor section model used must be the superset of all anchor point-anchor section combinations used to denote all dependent linear referencing systems. This requirement results primarily from the need to map from one linear referencing system to another and the dependence of the mapping operation on establishment of commonalities between the systems.

While this limitation does not hamper the theoretical workability of the Milwaukee model, it does significantly increase the density of segmentation of the anchor point-anchor section base layer and create a major data maintenance bottleneck for the implementor of the model. It is relatively safe to say that in its basic configuration the Milwaukee model would not represent a practical option based on anticipated cost-benefit tradeoffs for a typical user.

The case can be made that the problems of the Milwaukee model stem in large part from the design of the anchor point-anchor section component as the static element. It will be advanced in the next section of this paper that it may be possible to adopt a more dynamic model of anchor points and, specifically, anchor sections which can assume operationally dependent states. This is an exploratory concept, but it will be reviewed as far as practical from a conceptual model standpoint below. A more definitive test would be incorporation of the approach into the same or similar prototypical application as the one used to test the basic Milwaukee model.

The Dynamic Anchor Point-Anchor Section Model

The concept of a dynamic anchor point-anchor section component of the Milwaukee model relies upon the layering of the specific node-link systems on the anchor point-anchor section base layer. All linear referenced node locations can be unambiguously identified along the defined anchor sections. Theoretically, because the model is a one and a half dimensional model, a single anchor point can be used to create an entire contiguous linear system reference when all other points are mapped from a commonly referenced node-link layer.

The problem of proliferation of references in the anchor point-anchor section layer is a matter of degree and of storage and maintenance requirements. Since the model specifies that anchor points are the end points of anchor sections, any new anchor point occurring on a previously referenced anchor section requires splitting that anchor section into two new anchor sections. When considering the superset required to generate correspondences between all linear referencing systems, this results in a massive base reference layer.
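The splitting rule can be illustrated with a minimal sketch. The record layout, the `split_section` function and the identifier scheme for the two new sections are assumptions made for illustration only; the model itself does not prescribe them.

```python
from typing import NamedTuple, Tuple


class AnchorSection(NamedTuple):
    section_id: str
    from_point: str
    to_point: str
    length: float


def split_section(
    section: AnchorSection, new_point: str, offset: float
) -> Tuple[AnchorSection, AnchorSection]:
    """Replace one anchor section by two that meet at new_point.

    offset is the distance of new_point from section.from_point and must
    fall strictly inside the section.
    """
    if not 0 < offset < section.length:
        raise ValueError("new anchor point must lie strictly inside the section")
    first = AnchorSection(
        f"{section.section_id}.1", section.from_point, new_point, offset
    )
    second = AnchorSection(
        f"{section.section_id}.2", new_point, section.to_point, section.length - offset
    )
    return first, second


# Hypothetical data: one 10-unit section split 4 units from its start.
first, second = split_section(AnchorSection("S1", "P1", "P2", 10.0), "P3", 4.0)
```

Every additional anchor point on an existing section adds one more section to the static base layer, which is why the superset needed to serve all linear referencing systems grows so quickly.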

To a large extent, the problem above can be traced to the model assumption that anchor points are of a single type and class. If we regard the issue of connectivity between the anchor point-anchor section layer and dependent node-link layers to be dynamic and virtual rather than static, the anchor point-anchor section system becomes configurable to address the requirements of the task at hand. Anchor points are classified in three types to allow their selective incorporation. The three types suggested are:

Linear Reference-Cartographic Reference Links

These support the link between the linear referenced data and a two dimensional map representation. Typically, these anchor sections will remain stable and point density will be relatively low.

Base Points

These represent universal references in use by many or all dependent systems. The major difference between these and classified anchor points is that these do not include a system reference list to indicate when they should be used to define a dynamic anchor point-anchor section configuration. They will always be incorporated into any dynamic configuration being built. It should be noted that their static nature only applies to anchor points and not the anchor sections. The anchor sections will always be a dynamic product of the selected collection of anchor points.

Classified Points

Classified points are those which are defined in relationship to one or more particular node-link layers. They are actively inserted into the dynamic configuration based on the actual requirements of the mapping operation being performed. A unique set of anchor sections is dynamically generated based on the selected collection of classified and base points.
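The selective incorporation of the three point types can be sketched as a simple filter. This is an illustration under stated assumptions: the `PointClass` enumeration, the `systems` field standing in for the "system reference list", and the `include_cartographic` switch are all hypothetical names, and the paper does not say whether cartographic link points take part in a translation task, so they are left out by default.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Iterable, List


class PointClass(Enum):
    CARTOGRAPHIC = "linear reference-cartographic reference link"
    BASE = "base"
    CLASSIFIED = "classified"


@dataclass(frozen=True)
class AnchorPoint:
    point_id: str
    point_class: PointClass
    # System reference list; only meaningful for classified points.
    systems: FrozenSet[str] = frozenset()


def select_points(
    points: Iterable[AnchorPoint],
    active_systems: Iterable[str],
    include_cartographic: bool = False,
) -> List[AnchorPoint]:
    """Pick the anchor points for one dynamic configuration."""
    active = set(active_systems)
    selected = []
    for point in points:
        if point.point_class is PointClass.BASE:
            selected.append(point)   # base points are always incorporated
        elif point.point_class is PointClass.CLASSIFIED:
            if point.systems & active:
                selected.append(point)   # referenced by an involved system
        elif include_cartographic:
            selected.append(point)
    return selected


# Hypothetical data for a translation between systems "A" and "B".
points = [
    AnchorPoint("P1", PointClass.BASE),
    AnchorPoint("P2", PointClass.CLASSIFIED, frozenset({"A"})),
    AnchorPoint("P3", PointClass.CLASSIFIED, frozenset({"C"})),
    AnchorPoint("P4", PointClass.CARTOGRAPHIC),
]
chosen = [p.point_id for p in select_points(points, ["A", "B"])]
```

The anchor sections are not stored in this sketch; they would be generated afterwards from whichever points were selected.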

Effective use of a dynamic model of anchor points and anchor sections requires the facility to dynamically generate anchor sections and their related properties. The role of the anchor section is to carry a universal measure of linear distance which can be used to translate different metrics used by different sets of links. Further, in a system such as the Milwaukee model which lacks two dimensionality, the anchor point-anchor section layer also removes ambiguity regarding the linear routes in the system being represented. In essence, the anchor point-anchor section base layer indicates when two node-link systems are overlaying the same physical segment. The model proposed requires some type of rule base to determine the connections between the anchor points in an unambiguous manner.

This role is performed by adopting a three dimensional model of network connectivity. The roots of this approach originate in work now being done to develop databases capable of supporting real time vehicle location and routing as part of the Intelligent Transportation Systems initiative. In that effort, the concept of nodes or points being objects specific to routes was developed. For purposes of developing a dynamic anchor point-anchor section model, we take a similar approach. In this model, separation between what we are calling classified points is measured in two dimensions. These are distance, which must be derived from the dynamic construction of anchor sections, and connectivity. Where two points are coincident, they have separate identifiers, but the connectivity measure between them is zero. This provides a means to establish coincidence without using the same reference.

At least two methods can be used to represent these interrelationships. First, a table can be maintained which lists sets of coincident nodes. Second, "connectivity link" records could be defined between connected pairs of nodes. This latter method would have the added overhead of a requirement to navigate through connected node linkages to establish those connections which are indirect. Its advantage is that for any other potential application of the methodology, it is a more flexible data structure and is structurally more consistent with the data structures used in the rest of the Milwaukee model. In reality, the difference is an implementation detail regarding the proximity of storage of related information.
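The relationship between the two representations can be shown with a short sketch that derives the table of coincident sets from pairwise "connectivity link" records, including coincidences that are only indirect. The function name, the identifier format (`"A:1"` for point 1 of system A) and the choice of a union-find structure are assumptions made for illustration; the paper does not prescribe an algorithm.

```python
from typing import Dict, Iterable, List, Set, Tuple


def coincidence_sets(pairs: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group point identifiers joined by zero-separation connectivity links.

    Each pair states that two identifiers denote coincident points.
    Indirect coincidence (A with B, B with C) is resolved by union-find.
    """
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups: Dict[str, Set[str]] = {}
    for identifier in list(parent):
        groups.setdefault(find(identifier), set()).add(identifier)
    return list(groups.values())


# Hypothetical link records between points of systems A, B and C.
links = [("A:1", "B:7"), ("B:7", "C:3"), ("A:2", "B:9")]
table = coincidence_sets(links)
```

Here `A:1` and `C:3` end up in the same set although no record links them directly, which is the navigation overhead described above, paid once when the table is built.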

Generation of a Dynamic Anchor Point-Anchor Section Model

Having defined the components of a dynamic anchor point-anchor section layer, the next step is to explore and delineate the use of these components in interrelating linear referenced data. We begin by describing the process of creating a layer resulting from a set of operational requirements.

In the process of translating linear references linked to one node-link system to another, we must create a common vocabulary to allow such a mapping to occur. Creating this vocabulary basically involves creating a new link-node network representing the superset of the anchor point-anchor section references from two (or more) link-node layers. This is a problem of: first, determining whether we have adequate information available to make such a translation; second, determining the series of procedures and formulae needed to perform the translation; and, third, as our context is pragmatic application, ensuring that such formulae are computationally tractable and practical.

To address the first question, we can examine the elements of the final product, that is the temporary state of an anchor point-anchor section layer to meet these immediate requirements. As stated above, this is the superset of the anchor point-anchor section layers of the two involved linear referencing systems. Using the above defined classes of anchor points, this would involve merging the base points with the combined classified points for both systems and generating a set of anchor sections to connect them. Since anchor points are also nodes in the link-node layer of a linear referencing system, this would give us a dynamic configuration mirroring the static configuration described in the original Milwaukee model specification.

To do this requires three types of information. First, we must know where the anchor points are located in a common one and a half dimensional space. Second, we must know how the anchor points are (and are not) connected to one another. And, finally, we must have a measure of absolute distance between them. All these questions are interrelated. Since in one and a half dimensional space the location of an anchor point is determined by its relative connected location to other anchor points (via anchor sections) and its distance from those other connected anchor points, the information we ultimately must be able to generate is a connectivity mapping of anchor points with a measure of linear separation. This brings us to question two: how can we generate this information?

Although the base anchor point-anchor section layer provides links to two dimensional representations and useful economies in data representation, in theory, no base anchor points would be needed in mapping from one node-link layer to another. To simplify the representation of our problem, we will assume that this is the case. In such a case, as further informed by our operational requirements, we can delineate three basic categories of "anchor points". Calling our two linear referencing systems System A and System B, we have anchor points common to both A and B. Such points are denoted by separate identifiers but have a connectivity separation of zero. We may also have anchor points which only exist in A. There may also be anchor points which only exist in B. (Note: For "anchor points which only exist in A or B", these points may still be nodes in the other system. However, in that system either they are not classified as anchor points or there is no connectivity reference between them and their counterpart.)
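The three categories can be derived mechanically from the two sets of classified anchor points and the zero-separation connectivity references between them. The sketch below assumes that each such reference is stored as a pair (identifier in A, identifier in B); the function name and the data are hypothetical.

```python
from typing import Iterable, Set, Tuple


def categorize(
    a_points: Iterable[str],
    b_points: Iterable[str],
    zero_links: Iterable[Tuple[str, str]],
) -> Tuple[Set[Tuple[str, str]], Set[str], Set[str]]:
    """Split anchor points into common pairs, A-only points and B-only points.

    zero_links holds (a_id, b_id) pairs whose connectivity separation is
    zero, i.e. the two identifiers denote the same physical location.
    """
    a_set, b_set = set(a_points), set(b_points)
    common = {(a, b) for a, b in zero_links if a in a_set and b in b_set}
    only_a = a_set - {a for a, _ in common}
    only_b = b_set - {b for _, b in common}
    return common, only_a, only_b


# Hypothetical classified anchor points of two systems.
common, only_a, only_b = categorize(
    ["A1", "A2", "A3"], ["B1", "B2"], [("A1", "B2")]
)
```

A point that lacks a connectivity reference falls into the "only" category of its own system even if it is physically present in the other, which matches the note above.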

The basis of the Milwaukee model is common references between the different node-link layers, so the points held in common anchor the mapping process. It is beyond the scope of this paper, but it should be noted that density of such anchors may largely determine data resolution. However, from a data modelling perspective, if we can unambiguously determine a path from any anchor point to any other anchor point, we can generate the required mapping of connectivity. Since links in the node-link layer already include a distance measure, as long as a measure is available which increases (or decreases) as a linear function of separation, we can also generate a measure of length of the dynamically generated anchor sections.

Since links are coincident with anchor sections in the static model, a series of links can be accumulated in what we will call an anchor segment. For our purposes, an anchor segment is defined as a set of contiguous links between two classified anchor points represented by a single node-link layer. Since anchor segments must represent unambiguous paths (they have the same configuration as anchor sections), any junction in a node-link layer must be a classified anchor point. With these rules, we can now reduce the node-link layer to a connected system of classified anchor points and segments. (Note: in the full model, we would include base anchor points in generating the anchor segments.) Both of our prototypical linear referencing systems, A and B, can be abstracted in a similar manner. Now, we must translate our two node-link layer dependent anchor point-anchor segment systems to a single common system.
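The reduction of a node-link layer to anchor segments can be sketched as a walk along chains of links. This is one possible reading of the rules above, not a specified procedure: links are assumed to be undirected tuples of (link id, from node, to node, length), a junction is taken to be any node touched by more than two links, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Set, Tuple

Link = Tuple[str, str, str, float]                 # (link_id, from_node, to_node, length)
Segment = Tuple[str, str, Tuple[str, ...], float]  # (start, end, link_ids, length)


def build_anchor_segments(links: Iterable[Link], classified: Set[str]) -> List[Segment]:
    """Accumulate contiguous links between classified anchor points."""
    adjacency = defaultdict(list)
    for link_id, u, v, length in links:
        adjacency[u].append((link_id, v, length))
        adjacency[v].append((link_id, u, length))

    # Rule from the text: every junction must be a classified anchor point.
    for node, edges in adjacency.items():
        if len(edges) > 2 and node not in classified:
            raise ValueError(f"junction {node!r} is not a classified anchor point")

    segments: List[Segment] = []
    used: Set[str] = set()
    for start in sorted(classified):
        for first_link, node, length in adjacency.get(start, []):
            if first_link in used:
                continue
            used.add(first_link)
            chain, total = [first_link], length
            # Follow the non-branching chain until the next classified point.
            while node not in classified:
                onward = [e for e in adjacency[node] if e[0] not in used]
                if len(onward) != 1:
                    raise ValueError(f"chain ends at unclassified node {node!r}")
                next_link, next_node, next_length = onward[0]
                used.add(next_link)
                chain.append(next_link)
                total += next_length
                node = next_node
            segments.append((start, node, tuple(chain), total))
    return segments


# Hypothetical layer: A2 is a junction; n1, n2 and n3 are ordinary nodes.
layer = [
    ("L1", "A1", "n1", 1.0), ("L2", "n1", "n2", 2.0), ("L3", "n2", "A2", 1.5),
    ("L4", "A2", "n3", 0.5), ("L5", "n3", "A3", 0.5), ("L6", "A2", "A4", 3.0),
]
segments = build_anchor_segments(layer, {"A1", "A2", "A3", "A4"})
```

Six links collapse into three anchor segments in this example, and the segment length is simply the sum of the link lengths in the layer's own metric.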

The problem of performing such a mapping is similar but not the same. Up to now, we have had the luxury of knowing that all abstracted segments were aggregations of their derivatives. However, segments and junctions may exist in system A which are not present in system B or vice versa. We must now compare representations and choose the most aggregated one. This is necessary to enforce the rule of determining an unambiguous path. If we follow an anchor segment in one system that does not have a counterpart in the other system, we may return to a common anchor point, creating a parallel path and thereby a data model ambiguity. Also, from the standpoint of minimizing computational overhead, we wish to minimize the number of points and sections to those required for the translation process.

Examined from this standpoint, it must be possible to create an unambiguous path between any two common (to the two linear referencing systems) anchor points using the anchor segments derived from one of the node-link layers. On the surface, this requirement may seem to negate most of the advantages of the dynamic model. However, this is not likely to be the case. First, since the set of common anchor points is only related to the linear reference systems being used in a specific translation exercise, the operational superset of derived anchor points and sections is still likely to be much smaller than that required to support universal references to the set of all node-link layers. Second, a relatively simple user intervention process could be used to address such ambiguities by identifying additional common anchor points. The new common point references could become part of the permanent database, alleviating the need for repeated intervention in the future. This is particularly attractive considering recent innovations in computer mapping and graphics and the ease of implementing such procedures.

Once a connectivity mapping is defined for both systems A and B subject to the above requirements, any classified anchor point in either system can be mapped to the other system. Operationally, one system could be a "donor" system and the other a "recipient" system. The recipient system would have all classified anchor points from the donor system mapped onto its anchor segments. Simultaneously, new anchor segments would be created based on distance and direction to the defined set of common points. Information regarding membership of classified points in one, the other or both linear reference systems would be kept as part of the derivative database. Theoretically, this "anchor point-anchor section" layer would provide all necessary information to map between the now merged layers.
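The donor-recipient step can be sketched for a single recipient anchor segment bounded by two common points. The sketch assumes that each donor point's position is already known as a "universal" distance from the segment's starting common point; the function name and the data are hypothetical, and membership bookkeeping is omitted.

```python
from typing import Dict, List, Tuple


def merge_donor_points(
    start: str, end: str, length: float, donor_points: Dict[str, float]
) -> List[Tuple[str, str, float]]:
    """Cut one recipient anchor segment at the donor system's anchor points.

    donor_points maps a donor point identifier to its universal distance
    from start. Returns the dynamically generated anchor sections as
    (from_point, to_point, length) in order along the segment.
    """
    sections = []
    previous_id, previous_distance = start, 0.0
    for point_id, distance in sorted(donor_points.items(), key=lambda item: item[1]):
        if not 0 < distance < length:
            raise ValueError(f"{point_id!r} does not lie inside the segment")
        sections.append((previous_id, point_id, distance - previous_distance))
        previous_id, previous_distance = point_id, distance
    sections.append((previous_id, end, length - previous_distance))
    return sections


# Hypothetical segment of length 10 between common points C1 and C2.
sections = merge_donor_points("C1", "C2", 10.0, {"A7": 6.0, "A3": 2.5})
```

One recipient segment becomes three anchor sections here, which is the situation in which a recipient link can span more than one dynamically generated anchor section.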

While, in theory, we have created a foundation for translation between two linear referencing systems, we have not actually reproduced a dynamic version of the anchor point-anchor section model. Our model differs in that a link from a given node-link layer may actually be made up of two or more anchor sections. (In the Milwaukee model, links were always a subset of anchor sections.) We will explore the operational implications of this data structure difference below.

Application of a Dynamic Anchor Point-Anchor Section Model

In the last section we addressed the processes and steps involved in generating a dynamic anchor point-anchor section base layer based on the operational requirements of a particular linear reference translation task. In the last section of the paper, we will explore the additional utility this approach may have in expanded uses of linear referenced data. The purpose of this section is to describe how the generated dynamic layer could actually be utilized in the translation process. The need to expand on this subject is due to the difference in the characteristics of the layer it is possible to generate using the dynamic methodology versus the specified characteristics of the anchor point-anchor section layer developed as part of the Milwaukee model. As stated in the last section, this difference is that the dynamic model permits a link in a node-link layer to be composed of multiple anchor sections.

The highest level of data in the Milwaukee model is the event. Events are measured from traversal reference points, which are in turn based on traversals, which are based on the linear referencing system's node-link layer, which is then overlaid on the anchor point-anchor section base layer. This section will attempt to illustrate the process of navigating through the interrelationships of these elements, translating an event being maintained in one linear referencing system to an event in another system. An a priori assumption being made is that the two systems share "linear space" so that the event can exist in both systems.

The process of defining references for a given event for each data class is the same as the one used in the Milwaukee model in the context of a given node-link layer. First, the event reference is translated from a relationship to the traversal into a relationship to the underlying node-link layer. The route of the event is identified along with the denoted traversal reference point. Then, using the sequence of links (including their flow direction) and measuring linear reference metric offsets along the route, we find the link location of the traversal reference point and, from there, the link location of the event (for linear events, the start and end locations). This locates the event as a "link event" with a linear reference metric measure from the starting or ending node defining the link. The link must then be placed in the context of the classified anchor point. This or the reverse process gives us an event location in the context of the node-link layer referenced to a classified anchor point. Translation to the destination linear referencing system requires two pieces of information. First, we must establish the linear location of the event relative to the identifiers (node-link layer) used to describe the destination system; second, we must translate the metric to that of the destination system.
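The first step, turning an event referenced to a traversal reference point into a "link event", can be sketched as follows. The sketch makes several simplifying assumptions that are not in the text: the traversal is given as an ordered list of (link id, length) pairs, all links are traversed in their stored direction, and the reference point's position is known as a distance from the start of the traversal. The function names are hypothetical.

```python
from typing import List, Tuple


def locate_on_links(links: List[Tuple[str, float]], distance: float) -> Tuple[str, float]:
    """Return (link_id, offset from the link's starting node) for a
    distance measured from the start of the traversal."""
    if distance < 0:
        raise ValueError("distance must not be negative")
    travelled = 0.0
    for link_id, length in links:
        if distance <= travelled + length:
            return link_id, distance - travelled
        travelled += length
    raise ValueError("distance lies beyond the end of the traversal")


def event_link_location(
    links: List[Tuple[str, float]], trp_distance: float, event_offset: float
) -> Tuple[str, float]:
    """Locate a point event given as an offset from a traversal reference point."""
    return locate_on_links(links, trp_distance + event_offset)


# Hypothetical traversal of three links; the reference point sits 2.5 units
# from the traversal start and the event 1.0 unit beyond it.
traversal = [("L1", 2.0), ("L2", 3.0), ("L3", 1.0)]
link_id, offset = event_link_location(traversal, 2.5, 1.0)
```

A linear event would be handled by calling `event_link_location` once for its start and once for its end.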

Assuming that 1) we have a rule database describing the algorithmic relationship between a base linear measure used to describe the length of anchor sections and the metric used for a given node-link layer and its traversals; or 2) all measurement systems are linearly correlated with distance, we can translate the metric distance from the classified anchor point to the event to a "universal" distance by applying the formula in the first case or by developing a scaling factor in the second. Following the nonbranching linear route in either direction will lead us to either a common anchor point or a classified anchor point of the destination node-link layer. We can, while finding this point, accumulate the "universal" distance from the event. Again, using the inverse of the above procedure, we can locate the event's link offset in the destination metric. Once this is done, the process described in the paragraph above is used to relate the event to first a traversal and then a traversal reference point. This would complete the operation of mapping the event to another linear reference system.
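For the second case, where every metric is linearly related to distance, the translation reduces to two scaling factors. The sketch below is an illustration under stated assumptions: the scale is defined as universal units per metric unit, the event lies on the nonbranching route between the source anchor point and the destination anchor point, and the units (miles, kilometres, metres) are chosen only for the example.

```python
def to_universal(metric_distance: float, scale: float) -> float:
    """Convert a distance in a system's metric to the universal measure."""
    return metric_distance * scale


def from_universal(universal_distance: float, scale: float) -> float:
    """Convert a universal distance back into a system's metric."""
    return universal_distance / scale


def translate_offset(
    source_offset: float,
    source_scale: float,
    universal_gap: float,
    destination_scale: float,
) -> float:
    """Distance of the event from the destination anchor point, expressed
    in the destination metric.

    source_offset: event distance from the source classified anchor point,
        in the source metric.
    universal_gap: universal distance accumulated along the nonbranching
        route from the source anchor point to the destination anchor point.
    """
    event_universal = to_universal(source_offset, source_scale)
    return from_universal(abs(universal_gap - event_universal), destination_scale)


# Hypothetical example: system A measures in miles, system B in kilometres,
# and the universal measure is metres.
METRES_PER_MILE = 1609.344
METRES_PER_KM = 1000.0
offset_in_b = translate_offset(2.0, METRES_PER_MILE, 5000.0, METRES_PER_KM)
```

In the first case, the two scaling functions would be replaced by whatever formula the rule database holds for each node-link layer; the accumulation of universal distance along the route is unchanged.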

Summary and Conclusions

We have set out to explore the possibility of mitigating the operational problems of the linear referencing and cross referencing model developed by discipline specialists in Milwaukee in 1994 with a modified conception of what is called the anchor point-anchor section layer. The indication is that redefining this layer as dynamic and transient data may offer a way to reduce what are probably unreasonable and unrealistic data input and maintenance requirements associated with the original model. We explored this approach to redefinition of the layer both in terms of its theoretical workability and the nature and scale of the process which would be required to support such an implementation.

From this, we can conclude that this is potentially a workable approach and warrants further investigation. We can also see that implementation probably requires that certain sets of rules be obeyed in defining information dependent on the anchor point-anchor section references; notably, all paths between common reference points must be nonbranching. From this brief investigation, the added overhead of a dynamic model seems manageable from the standpoint of production application. However, the specifics of how implementation could be accomplished definitely require more complete review.

The approach taken, of using the node-link layer classified anchor points and explicitly creating data links to describe coincidence to other points associated with other node-link layers, has potential significance beyond the problem of mapping linear referenced information from one system to another. The ability to dynamically link two such systems may offer a starting point for addressing issues such as dependencies of traversals on other traversals, which occur in subject areas such as defining complex traffic flow relationships. In any case, a model such as the Milwaukee model which can be feasibly implemented is a necessary component in the development of a comprehensive linear data relationship model for transportation systems.