`e'Properties for Agent-Oriented Information Systems
Amy Unruh, Malcolm Taylor, Marian Nodine,
Gale Martin, Jerry Fowler and Bogdan Czejdo
MCC
3500 W. Balcones Center Drive, Austin, TX 78759
1. Problem Statement
Modern industry requires reliable information access, monitoring, analysis and processing in a varied and changing environment. To achieve adaptability in this environment, agent-oriented information systems need to be able to recombine and adapt easily and in many ways to keep up with changing demands. To achieve reliability and longevity in this environment, dynamic heterogeneous information systems require greater tolerance for operating faults and user mistakes than has been demonstrated in the current research prototypes of agent systems. The lack of this tolerance also impacts the scalability, extensibility and customizability of the agent system.

2. Position
2.1 Background
Agent systems are ideal venues for enabling the integration and interoperation of diverse information sources, as evidenced by our work in InfoSleuth. InfoSleuth [1] has been successfully used for integration of heterogeneous databases. Its dynamic architecture, based on semantic brokering and a global ontology, is well-suited to this type of application, even when the component databases may leave the system at unpredictable times. InfoSleuth has also been successfully deployed for applications such as business intelligence and patent validation which require analysis of large volumes of text data. However, as InfoSleuth has matured, we have migrated the InfoSleuth technology to support increasingly long-lived and demanding information-oriented tasks, with mixed success. These applications are typically less tolerant of occasional system failures.

InfoSleuth is being used as a significant component of EDEN, a collaborative effort of several government and non-government agencies to allow common access to their combined store of environmental data related to remediation and control techniques. This problem is cast as a typical multidatabase problem. The InfoSleuth agents use focused ontologies as a semantic framework for integrating information from multiple sources. Useful general-purpose agents include those that extract information from individual databases, relational query processors, and domain-specific value mappers. The current EDEN demonstration enables multiple environmental databases throughout the US and Europe to be accessed via a web browser.

Under a joint project with the US Department of Agriculture (USDA), MCC has developed an application that supports the laboratory protocols required to interpret genetic material taken from livestock. DNA sequencing machines produce imaged sequence files that must undergo a series of analysis steps, such as conversion to a sequence of ATGC bases, extraction of component sequences (vectors), and comparison with other sequences or genes from livestock and other species that have been entered into genomic databases by the worldwide genetic research community. InfoSleuth agents automate this process. This application includes a "workflow" or "planning" component not yet found in the EDEN application described above, which places stronger requirements on the longevity and cooperative behaviors of agents.

A third application of InfoSleuth has been that of acquiring, integrating, and monitoring technical competitive intelligence (CI) information from open sources. A primary activity in the CI domain is to correlate information from open sources, discover trends and associations across these sources, and detect significant shifts in trends over time. Thus, this application focuses more on analysis and extraction of information.

2.2 Dynamic Systems and Instability
Dynamic agent systems, whether information-oriented or not, operate in a very unstable environment. On the agent side, we have agents with varying capabilities entering and leaving the agent community, potentially during the execution of some relevant tasks. These arrivals and departures may be deliberate, but there may also be unexpected events or faults. For instance, the hardware, operating system or virtual machine may fail or operate incorrectly. Alternatively, an agent that implements a new or improved service may enter the system, and that service may be useful in improving the quality of the current tasks. Also, a remote agent may respond with some unexpected or incorrect result. On the user side, the users are not only prone to their usual inconsistency, but the agent system also compounds this by hiding from them, to some degree, the actual capabilities of the system and its internal operation. For example, users may under-specify requests, resulting in long processing times. Alternatively, users may specify inappropriate requests, due to a lack of understanding of the current capabilities of the agent system. Also, users may change their minds about exactly what they want. The agent system may not give the users adequate feedback to be able to correct these problems.

2.3 `e'Properties for Agent-Based Systems
We present several design considerations with respect to the reliability issues discussed above. An agent-oriented information system should be eclectic, in that different pieces must be able to be put together in different ways to satisfy different needs. It must be ergonomic in that the adjustment of the agent-based system to fit different needs should be easy and comfortable. It should be exposed in that the internal operation of the agent-based system should be explainable to the users of the agent-based system as needed.
Eclectic: A typical agent-oriented information system should be able to service a wide variety of information tasks, including short-term queries over multiple and diverse information sources, subscriptions to classes of information with personalized information filtering, and ongoing comparisons and trend analysis. As many information-type functions are usable across one or more of these areas, agents should be able to fit themselves together in different ways to satisfy different needs as specified by different user requests. One issue to be addressed is that tasks follow dynamic patterns of interaction throughout an agent system -- the system must be able to reassemble itself in different ways to satisfy different needs. This requires some level of planning and/or process enactment within the agent system, either explicit or implied. The second issue is that, since different agents will fill similar tasks at different times, the agent-based system must provide a fairly sophisticated ability to match agents to required tasks. This may be combined with an ability within certain agents to negotiate over the terms for executing specific tasks.
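The matching of agents to tasks described above can be illustrated with a minimal sketch. It assumes a simplified broker in which each agent advertises one service over a named ontology and a set of concepts, and a request matches when the service and ontology agree and the advertised concepts cover the requested ones. The names Advertisement, Broker, advertise, withdraw and match are hypothetical and are not taken from InfoSleuth; real semantic brokering would also reason over the ontology rather than compare concept names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    """What one agent claims it can do (hypothetical structure)."""
    agent: str
    service: str          # e.g. "query" or "subscribe"
    ontology: str         # name of the ontology the agent understands
    concepts: frozenset   # ontology concepts the agent can serve


class Broker:
    """Keeps current advertisements and matches requests against them."""

    def __init__(self):
        self._ads = {}

    def advertise(self, ad):
        # An agent entering the community registers its capabilities.
        self._ads[ad.agent] = ad

    def withdraw(self, agent):
        # An agent leaving the community, deliberately or through a fault.
        self._ads.pop(agent, None)

    def match(self, service, ontology, concepts):
        # An agent qualifies if it offers the service over the same
        # ontology and covers every requested concept.
        needed = frozenset(concepts)
        return sorted(
            ad.agent
            for ad in self._ads.values()
            if ad.service == service
            and ad.ontology == ontology
            and needed <= ad.concepts
        )


broker = Broker()
broker.advertise(Advertisement(
    "db-agent-1", "query", "environment", frozenset({"site", "contaminant"})))
broker.advertise(Advertisement(
    "db-agent-2", "query", "environment", frozenset({"site"})))

print(broker.match("query", "environment", {"site", "contaminant"}))
broker.withdraw("db-agent-1")
print(broker.match("query", "environment", {"site"}))
```

Because matching is done at request time against whatever is currently advertised, the same request can be served by different agents at different times, which is the behavior the paragraph above asks for.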
Ergonomic: For an agent system to be agile, it must allow for easy adjustment of the agents to fit the individualized needs of the user. This in turn means that agent systems must conform to common paradigms of interaction so that they are `good fits' with respect to each other, and do not cause unnecessary stress on individual agents. This fit must hold both when the agents interact without faults and when faults occur. At least two issues need to be addressed consistently by the agents in the agent community. One is when agents keep their results in memory, and when they persistently store intermediate results. Where agents operate primarily in memory, both the agent that generates a result and the agent that uses that result as input must be up at the time the result is transmitted, and it makes sense to consider various wait-and-retry strategies. These issues do not affect more distributed systems that make the intermediate results persistent, e.g. by posting them to a virtual warehouse that other agents can access. The second issue is that there can be `transactional groups' of activities. If there is no semantic requirement that a set of operations be atomic, then it is relatively easy to design a model in which the task can be restarted to pick up where it left off by looking at the warehouse contents. Conversely, if there are transactional requirements (e.g. several groups of inserts must all be successfully completed) then the agents' control model will also require transaction management, if only to deal with a situation where an agent goes down in the middle of a transaction.
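The restart-from-warehouse model and the wait-and-retry strategy mentioned above can be sketched as follows, for the case with no atomicity requirement. This is an illustration under stated assumptions, not a description of InfoSleuth: the warehouse is modeled as a plain dictionary standing in for a persistent store, an unavailable agent is modeled as a ConnectionError, and the function run_resumable and its parameters are hypothetical names.

```python
import time


def run_resumable(steps, warehouse, retries=3, delay=0.0):
    """Run ordered (name, func) steps, skipping any already in the warehouse.

    Each func receives the previous step's result. `warehouse` is a dict
    standing in for a persistent store shared by the agents (assumption).
    """
    previous = None
    for name, func in steps:
        if name in warehouse:
            # Result was posted before a failure: pick up where we left off.
            previous = warehouse[name]
            continue
        for attempt in range(1, retries + 1):
            try:
                result = func(previous)
                break
            except ConnectionError:
                # The agent for this step is unavailable: wait and retry,
                # giving up after the last attempt.
                if attempt == retries:
                    raise
                time.sleep(delay)
        warehouse[name] = result   # make the intermediate result persistent
        previous = result
    return previous
```

A caller that catches the final ConnectionError can invoke run_resumable again later with the same warehouse; completed steps are not re-executed. Transactional groups would need more than this, for example staging a group's results and posting them to the warehouse only once every member has succeeded.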
Exposed: Users must be exposed at an appropriate level to what functionality is currently available to them before, during and after the execution of their tasks. Here, too, there are two issues that need to be addressed. One is that users may be unaware of the exact nature of the currently and potentially available agents in the agent community, and may need this information before they can specify tasks to the system in a meaningful way. This awareness is crucial, especially in a dynamic agent system, to keep the users from under-specifying or inappropriately specifying their requests. Also, it serves as a good venue for notifying the user of meaningful "improvements" to the system. The second issue is that the user may receive a response concerning some task, but that response may make no sense to them. Dealing with this in an agent system is complicated because the system itself has managed the whens, wheres and hows of the actual processing of the request, leaving the user in the dark. If the agent system can explain what happened, this facilitates both the ability of the users to make intelligent use of the system, and the users' comfort level with the system.
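One simple way to support such explanations is to record, during task execution, which agent did what, and to render that record for the user afterwards. The sketch below assumes a flat, ordered list of events per task; the class TaskTrace and its methods are hypothetical, and deciding what level of detail is actually helpful to the user remains an open question.

```python
from dataclasses import dataclass, field


@dataclass
class TaskTrace:
    """Ordered record of what happened while a task was processed."""
    task: str
    events: list = field(default_factory=list)

    def record(self, agent, action, detail=""):
        # Called by the agent system at each processing step.
        self.events.append((agent, action, detail))

    def explain(self):
        # Render the recorded steps as numbered, user-readable lines.
        lines = [f"Task: {self.task}"]
        for i, (agent, action, detail) in enumerate(self.events, start=1):
            suffix = f" ({detail})" if detail else ""
            lines.append(f"{i}. {agent} {action}{suffix}")
        return "\n".join(lines)


trace = TaskTrace("find remediation data for site X")
trace.record("broker", "selected db-agent-1", "only agent covering 'contaminant'")
trace.record("db-agent-1", "returned 0 rows")
print(trace.explain())
```

With such a record, an empty or surprising answer can be traced to a specific step, for example an agent that was unavailable or a source that held no matching data.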

3. Research Questions
In light of the previous discussion, we propose the following research questions:
Eclectic: What are the best methods for planning and/or process enactment? What are good paradigms for describing and matching information processing services? What impact does the need to fit agents together have on agent communication languages and conversations?
Ergonomic: How do you handle issues involved with longer-running tasks that may outlive any specific agent involved in the task? How do you deal gracefully with failure? What types of transaction/recovery paradigms work in this environment? Under what circumstances is it best to communicate intermediate results directly between agents, as opposed to making the results persistent?
Exposed: At what level is providing the user an understanding of the agent system behavior helpful, and how is this information best presented to the user? What information needs to be maintained during task execution to provide the user with adequate explanation of what happened?

References
[1] M. Nodine et al., Active information gathering in InfoSleuth, International Journal of Cooperative Information Systems 9(1/2):3-28, 2000.