The major goal of the MEMESE project is to provide an infrastructure for the rapid creation and the easy maintenance and adaptation of arbitrary domain-dependent meta search engines. A meta search engine is, in essence, a piece of software that does not perform a search itself but instead forwards a search request to a number of mutually independent information providers. It then collects the individual results and merges them into a single result set, possibly adding value to the response through ranking and/or other kinds of decision support.
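The dispatch, collect, and merge steps described above can be sketched in a few lines of Python. This is only a minimal illustration, not the MEMESE implementation: the names `make_provider` and `meta_search`, the in-memory catalogues, and the price-based ranking are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# A provider is modelled as a plain function from a query to a list of hits.
Provider = Callable[[str], List[dict]]


def make_provider(catalogue: List[dict]) -> Provider:
    """Build a hypothetical provider that searches a fixed in-memory catalogue."""
    def search(query: str) -> List[dict]:
        return [hit for hit in catalogue if query.lower() in hit["title"].lower()]
    return search


def meta_search(query: str, providers: Dict[str, Provider]) -> List[dict]:
    """Forward the query to every provider, then merge and rank the hits."""
    # Fan out: each provider is queried independently and concurrently.
    with ThreadPoolExecutor(max_workers=max(1, len(providers))) as pool:
        futures = {name: pool.submit(search, query)
                   for name, search in providers.items()}
    # Collect and merge: tag each hit with the provider it came from.
    merged = [{**hit, "source": name}
              for name, future in futures.items()
              for hit in future.result()]
    # Add value: a trivial ranking by ascending price (an assumed criterion).
    return sorted(merged, key=lambda hit: hit["price"])


# Two hypothetical providers with made-up data.
providers = {
    "A": make_provider([{"title": "Vienna to Munich", "price": 420.0},
                        {"title": "Vienna to Prague", "price": 310.0}]),
    "B": make_provider([{"title": "Graz to Vienna", "price": 150.0},
                        {"title": "Linz to Salzburg", "price": 200.0}]),
}
results = meta_search("vienna", providers)
for hit in results:
    print(hit["source"], hit["title"], hit["price"])
```

The engine itself holds no searchable data; it only knows how to reach the providers and how to combine what they return.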
The capability to search the data space of numerous information providers is an integral feature of networked enterprises. In particular, whenever independent organizations are willing to participate in some kind of Virtual (intercompany) Organization, such a feature offers the opportunity to develop new access channels to customers and suppliers. The example in Figure 1 illustrates the core concept. Motivations for such networking include, for instance, that a number of small organizations can combine their capacities to increase their market share by attracting more customers (such as P3 and P4 in the example), or that some large organizations intend to work together to create additional access channels to specific customers or to respond to the requirements of a specific business scope. In the example, P1 represents a large organization that replaces its single access point with two specialized channels: a stand-alone channel and one that is part of a relationship with other providers.
Figure 1. Virtual Organizations are creating additional access channels
The principal problem when exploring a distributed data space is the complexity stemming from heterogeneous data formats, mutually deviating semantics, various kinds of interfaces and access protocols, and varying control and information flows of the individual information providers. Additionally, requesters use various access devices, which raises the need to maintain adequate interfaces for handling requests and delivering search results in the appropriate format. As a consequence, the search process has to be aligned with each information provider individually. Moreover, whenever an information provider changes its process, its business rules, or its interface, the search process has to be checked to determine whether re-alignment is required. Typically, all alignment and harmonization has to be carried out manually by skilled personnel, which is often time-consuming and costly.
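To make the alignment effort concrete, the following sketch shows what a hand-written harmonization layer might look like for two providers that describe the same freight offer differently. The provider names, field names, and units are purely hypothetical; the point is that each provider needs its own adapter, and each adapter has to be revisited whenever that provider changes its format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreightOffer:
    """Common record that the meta search engine works with internally."""
    origin: str
    destination: str
    weight_kg: float


def from_provider_x(raw: dict) -> FreightOffer:
    # Hypothetical provider X: English keys, weight given in tonnes.
    return FreightOffer(raw["from"], raw["to"], raw["weight_t"] * 1000.0)


def from_provider_y(raw: dict) -> FreightOffer:
    # Hypothetical provider Y: German keys, weight given in kilograms as text.
    return FreightOffer(raw["von"], raw["nach"], float(raw["gewicht_kg"]))


# One adapter per provider; this table is what has to be maintained by hand.
ADAPTERS = {"x": from_provider_x, "y": from_provider_y}


def harmonise(source: str, raw: dict) -> FreightOffer:
    """Translate a provider-specific record into the common format."""
    return ADAPTERS[source](raw)


offer_x = harmonise("x", {"from": "Wien", "to": "Graz", "weight_t": 1.5})
offer_y = harmonise("y", {"von": "Wien", "nach": "Graz", "gewicht_kg": "1500"})
print(offer_x == offer_y)
```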
The MEMESE project follows up on the promise to ease the creation and adaptation by utilizing a Service-oriented Architecture (SOA) together with means that support (semi-)automatic configuration, specifically expert systems based on logic programming (LP) techniques. The SOA helps to maintain the whole lifecycle of various components (pieces of software, service interfaces, business rules, ontologies, etc.), while utilizing LP gives the opportunity to (semi-)automatically create valid configurations.
The MEMESE project consists of two major parts:
The first part encompasses the creation of an infrastructure and the development and integration of the components needed for a domain-independent meta search engine. The infrastructure is based on the Service-oriented Architecture (SOA) paradigm. The most important benefit of SOA is reusability: services can be shared by different applications running on different platforms, which facilitates enterprise integration. Distributed business processes can be created and maintained more easily, and applications, whether bought or self-developed, can be deployed more easily in different contexts.
The following components have been identified as the basic building blocks of a SOA for domain-independent search engines:
- Registry/Repository: The core of a SOA is a component for registering services and related artefacts, called a registry; if it can also store the artefacts themselves, it is called a repository. This component is used to manage the lifecycle of the various artefacts and provides search functionality.
- Authentication/Authorization: Authentication is the process of verifying that a potential partner in a conversation actually represents the person or organization it claims to represent. Authorization evaluates access control information to determine whether an agent is allowed to have the specified types of access to a particular resource.
- Trust: A generic component that allows maintaining trust using references, remarks, and other common means in order to derive qualitative or quantitative statements about system users.
- User preferences: A component that allows modeling preferences of a particular user or certain user types, i.e. to define a user profile. A user profile is used to guide the search process and is also an important means for decision support. The ranking component may utilize this component in order to arrange the result with respect to particular preferences derived from the user profile.
- Ontology: This component provides typical functionality such as reasoning over and querying of ontologies. Domain-dependent ontologies are plugged into this component as needed.
- Ranking: Generic component that provides functionality such as multi-criteria decision making, which helps to rank candidates. The actual ranking function is usually domain-dependent (price, distance, weight, etc.) and is plugged in when a particular meta search engine is created.
- User Interface: Interfaces for a number of possible access devices or means, such as Web Service, Web browser and mobile phone.
- Workflow: A component that is able to manage the correct execution order of applications and services.
- Reporting: Generic component that allows inspecting the current configuration and deriving qualitative or quantitative statements about the system.
- Administration: Component to manage the configuration and various aspects, such as user rights, ranking function adjustment, maximum number of results, etc.
- Configuration support: An expert system that helps to configure new meta search engines or to adapt existing ones.
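As an illustration of how the ranking and user preferences components in the list above might interact, the sketch below ranks candidates by a weighted sum, one simple form of multi-criteria decision making. It is an assumed example, not the project's ranking function: the function name `rank`, the criteria, the weights, and the data are hypothetical, and all criteria are treated as costs (lower is better).

```python
from typing import Dict, List


def rank(candidates: List[dict], weights: Dict[str, float]) -> List[dict]:
    """Order candidates by a weighted sum of min-max normalised cost criteria."""
    def utility(candidate: dict) -> float:
        total = 0.0
        for criterion, weight in weights.items():
            values = [c[criterion] for c in candidates]
            low, high = min(values), max(values)
            # Scale to [0, 1]; if all candidates are equal, the criterion is neutral.
            scaled = (candidate[criterion] - low) / (high - low) if high > low else 0.0
            # Lower cost is better, so invert before weighting.
            total += weight * (1.0 - scaled)
        return total
    return sorted(candidates, key=utility, reverse=True)


# Hypothetical candidates with two domain-dependent criteria.
offers = [
    {"id": "o1", "price": 400.0, "distance": 50.0},
    {"id": "o2", "price": 300.0, "distance": 120.0},
    {"id": "o3", "price": 500.0, "distance": 10.0},
]

# Two hypothetical user profiles expressed as criterion weights.
price_sensitive = {"price": 0.7, "distance": 0.3}
distance_sensitive = {"price": 0.2, "distance": 0.8}

by_price = [o["id"] for o in rank(offers, price_sensitive)]
by_distance = [o["id"] for o in rank(offers, distance_sensitive)]
print(by_price, by_distance)
```

Keeping the weights outside the ranking function mirrors the separation described above: the generic component stays unchanged while the domain-dependent criteria and the user profile are plugged in.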
While the first part mainly requires software engineering skills, the second part calls for accurate knowledge of the domain. This mandatory domain knowledge comprises not only the terminology of the particular domain but also a profound knowledge of the business models of the industry under consideration and of the rules that govern its business. For this purpose, having industrial partners operating in this specific domain is quite helpful. Consequently, the set-up for the second part of the project (the proof of concept) is as follows:
- Domain is the transportation industry, in particular, freight advertisers and freight requesters.
- Competence Center EC3 develops the technical infrastructure of the meta search engine.
- FREECOM internet services GmbH, with its sound knowledge of the logistics industry, is the provider of domain knowledge and also the prospective operator of the domain-dependent vertical meta search engine.
- Freight exchanges and companies intending to offer freight directly, without the help of third parties such as brokers or exchanges, are the information providers.
- Independent freight carriers are the users of the system.
Companies such as importers, exporters, wholesalers, and manufacturers place freight offers on freight exchanges so that transport companies can search for appropriate adverts and bid to move their goods. Transport companies, which are often independent carriers with a single truck, use such platforms to create or optimise a tour. In the most basic case, a carrier begins a tour at a start location, picks up the freight at a certain location, and delivers the goods to a destination. From this point on the truck is empty while the carrier returns home to the start location. In this basic case, the truck is empty for at least half of the tour, because the empty legs to the pick-up location and back from the destination are together at least as long as the loaded leg. To avoid this situation, carriers look for suitable new freight orders, called “back loads”, at freight exchanges. The meta search capabilities developed in the MEMESE project help to create a single access point for the carrier. Moreover, companies may provide an interface to their databases of freight offers and thereby bypass freight exchanges.
Additionally, value creating features such as appropriate ranking, detection of duplicates, tour planning, Geographical Information System (GIS) support, or load optimisation can be bundled at this access point, too.
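The effect of a back load on the basic tour described above can be worked through with a small calculation. The sketch below is an assumption-laden illustration: it uses straight-line distances between made-up planar coordinates rather than real road distances, and the function name `empty_share` is hypothetical.

```python
from math import dist
from typing import List, Set, Tuple

Point = Tuple[float, float]


def empty_share(stops: List[Point], loaded_legs: Set[int]) -> float:
    """Fraction of the tour distance that is driven without freight.

    stops       -- locations visited in order, starting and ending at home
    loaded_legs -- indices i of legs (stop i to stop i + 1) driven with freight
    """
    legs = [dist(a, b) for a, b in zip(stops, stops[1:])]
    empty = sum(length for i, length in enumerate(legs) if i not in loaded_legs)
    return empty / sum(legs)


home, pickup, delivery = (0.0, 0.0), (10.0, 0.0), (100.0, 0.0)

# Basic tour: home -> pick-up (empty) -> delivery (loaded) -> home (empty).
basic = empty_share([home, pickup, delivery, home], loaded_legs={1})

# Same tour with a hypothetical back load picked up near the first destination.
back_pickup, back_delivery = (90.0, 0.0), (20.0, 0.0)
with_back_load = empty_share(
    [home, pickup, delivery, back_pickup, back_delivery, home],
    loaded_legs={1, 3},
)
print(round(basic, 2), round(with_back_load, 2))
```

With these made-up coordinates the empty share drops from 55 percent to 20 percent of the distance driven, which is the kind of improvement a carrier is looking for when searching for back loads.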
Dorn, J., Hrastnik, P., Rainer, A. and Starzacher, P., Web Service based Meta-search for Accommodations, Information Technology & Tourism, Volume 10, Number 2, 2008, pp. 147-159
Dorn, J. and Naz, T., Structuring Meta-search Research by Design Patterns, in Proceedings of International Computer Science and Technology Conference, San Diego, 2008