Advanced Uses of FIRST
The General Searching problem
The Thesaurus

The Thesaurus, one of the central milestones of FIRST, works in tandem with the Logical Tree and is linked to it by a Basic Keyword Set common to both. Subjects and keywords are the two faces of the IR coin, with the basic set acting as the gateway between them.

As our maps will be cloned and hosted on different Websites, the Thesaurus for each Major Subject will be set up and customized for each of them. By default we deliver each map in a particular Website, consisting of a set of attractions and a search engine. The attractions will take the form of pairs (, URL) that, from the point of view of satisfying user curiosity, will behave as keywords, each presenting a collection of pages hosted at that URL. So our Thesaurus will have two main sections: the real keywords section and the attractions section, which will depend on the design of the Website that eventually hosts the map. Both “objects” will trigger the users’ curiosity.

Within the keywords section we will have single and compound keywords. Some compound keywords should point directly towards subjects and sub-subjects; these are written without grammatical connectors such as the, and, or, what, which, at, on, etc. Others should point indirectly, via a logical operator (for instance AND) linking their component keywords.

Notes: compound keywords will be processed as unordered. The design should mark those single or compound keywords that correspond to subjects. For instance, a query on a compound keyword (x, y) that points towards a subject should return all the i-URL’s belonging to that subject level and, optionally, those matching (x AND y).
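As an illustration of the notes above, here is a minimal sketch in Python. It is not part of FIRST itself: the names key, SUBJECT_URLS, KEYWORD_URLS and query, as well as the example.org addresses, are hypothetical and chosen only for this example. The sketch assumes that a compound keyword can be stored as an unordered set of words, and that a compound not marked as a subject is resolved through AND.

```python
from typing import Dict, FrozenSet, List, Set


def key(*words: str) -> FrozenSet[str]:
    """Turn a single or compound keyword into an order-independent key."""
    return frozenset(w.strip().lower() for w in words)


# Hypothetical sample data: keywords marked as subjects, with the i-URLs
# that belong to that subject level.
SUBJECT_URLS: Dict[FrozenSet[str], List[str]] = {
    key("computer", "networking"): [
        "http://example.org/net-a",
        "http://example.org/net-b",
    ],
}

# Hypothetical sample data: i-URLs indexed under each single keyword,
# used to evaluate (x AND y).
KEYWORD_URLS: Dict[str, Set[str]] = {
    "computer": {"http://example.org/net-a", "http://example.org/net-c"},
    "networking": {"http://example.org/net-c", "http://example.org/lan"},
}


def query(*words: str, include_and: bool = False) -> List[str]:
    """Answer a single or compound keyword query.

    A keyword marked as a subject returns the i-URLs of that subject level,
    optionally followed by the (x AND y) matches. Any other compound is
    resolved through AND alone (an assumption of this sketch).
    """
    k = key(*words)
    if not k:
        return []
    # AND of the components: i-URLs indexed under every word of the compound.
    and_hits = sorted(set.intersection(*(KEYWORD_URLS.get(w, set()) for w in k)))
    subject_hits = SUBJECT_URLS.get(k)
    if subject_hits is None:
        return and_hits
    results = list(subject_hits)
    if include_and:
        results += [u for u in and_hits if u not in results]
    return results
```

Because the key is a set, query("networking", "computer") and query("computer", "networking") give the same answer, which is what processing compound keywords as unordered amounts to here.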

Another important characteristic of FIRST, our Thesaurus-oriented methodology, is that the retrieval process works “by level of specificity”, meaning that the whole collection of i-URL’s is evenly distributed across the levels of specificity. Let’s explain this characteristic a little further. For each level of the tree we define the basic keywords that identify the scope and the widest spectrum of that level. Starting at the root and going downwards, we first find the more general keywords, each pointing to i-URL’s that are authoritative on matters at that specific level. For instance, “computing” at the first level will point towards i-URL’s dealing with computing at a general level. Going downwards we may find the keyword “networking”, pointing directly to i-URL’s dealing with all matters concerning networking at its widest spectrum. Further down we may find the keyword “tcp/ip”, pointing directly to authorities that deal extensively and intensively with that specific subject. A keyword therefore does not inherit the i-URL’s of the levels below it. Taking advantage of this no-inheritance characteristic, the retrieval process will present users with an even number of i-URL’s, let’s say from 10 to 20 documents for any keyword, instead of a cumulative amount that grows as we go upwards.
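The difference between retrieval by level of specificity and cumulative retrieval can be sketched as follows. This is only an illustration under stated assumptions: the Node class, the functions index_by_keyword, retrieve and retrieve_cumulative, and the example.org addresses are hypothetical, and the default limit of 20 simply reuses the upper bound mentioned above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One level of the tree: a basic keyword and its own i-URLs."""
    keyword: str
    urls: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def index_by_keyword(root: Node) -> Dict[str, Node]:
    """Walk the tree once and map every keyword to its node."""
    index: Dict[str, Node] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        index[node.keyword] = node
        stack.extend(node.children)
    return index


def retrieve(index: Dict[str, Node], keyword: str, limit: int = 20) -> List[str]:
    """No inheritance: return only the i-URLs attached to the keyword's own level."""
    node = index.get(keyword.lower())
    return [] if node is None else node.urls[:limit]


def retrieve_cumulative(node: Node) -> List[str]:
    """For contrast only: the cumulative answer, which grows towards the root."""
    urls = list(node.urls)
    for child in node.children:
        urls.extend(retrieve_cumulative(child))
    return urls


# Hypothetical sample tree following the example in the text.
tcpip = Node("tcp/ip", ["http://example.org/tcpip-1"])
networking = Node(
    "networking",
    ["http://example.org/net-1", "http://example.org/net-2"],
    [tcpip],
)
computing = Node(
    "computing",
    ["http://example.org/comp-1", "http://example.org/comp-2"],
    [networking],
)
index = index_by_keyword(computing)
```

With this sample tree, retrieve(index, "computing") returns only the two i-URL’s attached to the first level, whereas retrieve_cumulative(computing) returns all five, including those of “networking” and “tcp/ip”.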
