7. Exercise answers

[other answers are possible]In this section answers to some of the exercises in the book are provided. The exercises themselves appear at the end of Chapters 1 to 6. Exercises give one the opportunity both to apply, and also to advance one's knowledge. The exercises in this book are intended to provoke thoughtto complete some of them requires a major mental effort. In no case is the answer given here the only correct answer. Answers by readers should, in general, be more detailed and show other avenues of approach.

7.1 Microtext answers

[layout]Answer "Layout": Part of the difficulty in answering this question is that standards for hypertext are lacking. If one assumes that a Guide-like hypertext is the target, then one answer to the question might be that the Language or Architecture would need added to it: an outline facility for displaying selected levels of the outline, pop-up displays for the temporary display of additional information, folding to hide sections behind a button, and a linkage facility to enable users to follow links automatically.

[independent versus embedded]Answer "Independent versus Embedded": If the author of the independent semantic net microtext assures that the term to label a node also occurs exactly once within the associated text block, then translation from the independent-net representation to the embedded-net representation is straightforward. In the translation of independent to embedded, the node is simply shown within the text itself, rather than independently. In general, an embedded semantic net microtext can not be translated automatically into a meaningful independent semantic net microtext.

[natural language]Answer "Natural Language": A system relying upon device-dependent syntactic knowledge needs to be regularly used to prevent that knowledge from fading from memory. An infrequent user would prefer to converse in a method that is stable in memory; such a method would involve semantics rather than syntactics. Natural language interaction is the semantically richest interaction style for people.

[museum layout]Answer "Museum Layout": To find a certain guidebook in a museum one might follow these steps: look at the map of the museum and find the visitor information desk, move to the visitor information store using the mouse and directional commands, select the guidebook section, from the menu of available guidebooks choose one.

[visual formalisms]Answer "Visual Formalisms": In the traditional layout the child x of y appears beneath y and to the right of y. An alternative layout would place children within their parent (see Figure "Visualization determined by Relationships").


Figure 99: "Visualization determined by Relationships" Another way to represent an outline that the computer generates based on the outline relationships.

[meaning of previous]Answer "Meaning of Previous": The Previous link may go from a node to the sequentially preceding node. For instance, in a book with a node for page i, the Previous link may point to page i-1. Alternatively, a Previous link may point to the beginning of a section or to the outline. For instance, a Previous link at a node within Section 2.3 might point to the beginning of Section 2. Last, but most interesting from a microtext perspective, a Previous link may point to the node which the user last visited. This is a dynamic link, since where it points depends on the latest history.

[depth versus breadth]Answer "Depth versus Breadth": The breadth-first traversal gives a broad overview from one topic and then gives a broad overview about each subtopic. The depth-first traversal goes deeply into the topic's first subtopic before later proceeding to another subtopic.

[depth-first algorithm]Answer "Depth-first Algorithm": The algorithm to traverse the graph and print the document can be sketched in this pseudo-code form: traverse ( edge) to-visit-list $<-$ visit ( edge ) mark edge as visited while to-visit-list is non-empty choose edge {if edge deadend value is yes then do not visit else if edge has already been visited then do not visit else traverse ( edge )} remove edge from to-visit-list visit ( edge ) insert edge and its paragraph(s) in the document identify and return edges whose source equals target of edge

[pronoun references]Answer "Pronoun References": The `he' in the second sentence refers to Bluto and not to Popeye. This use of a pronoun binds the two sentences together. There is a further tie inferred by the two names. Though not explicitly mentioned, Popeye and Bluto are, within some cultures, well-known cartoon enemies, and readers from those cultures mentally create the tie because of a global, mental model of the characters.

[path costs]Answer "Path Costs": The depth-first searches are consistently less costly than the breadth-first searches in this graph.

7.2 Macrotext answers

[set and logical operators in queries]Answer "Set and Logical Operators in Queries" : A query for `boy dog' retrieves exactly the document {boy, dog}, while a query for `boy dog' returns all three documents. The analog of the set operations `intersection', `union', and `complement', are the Boolean operators `and', `or', and `not', respectively.

[a relevance formula]Answer "A Relevance Formula": One popular relevance measure between a document and a query is the cosine function: [Salt83] All documents whose cosine(document,query) value are greater than some threshold are considered relevant to the query and retrieved.

[discrimination analysis]Answer "Discrimination Analysis": Say that the distance between two vectors $V sub 1$ and $V sub 2$ is given by distance($V sub 1 , V sub 2$). The document space will be maximally separated, when the average distance between each document and the centroid is maximized, that is when is maximum.

[hierarchy search benefit]Answer "Hierarchy Search Benefit": For a collection of n documents about n comparisons are required to determine the documents closest to a query. If hierarchies are determined, then the number of comparisons required is the logarithm of n. In other words, the unordered collection requires exponentially more search effort than the hierarchically ordered collection. The search over the hierarchy concludes with a cluster of documents which are close to the query.

[validity of indexing]Answer "Validity of Indexing": For each document there is an test set of descriptors and an ideal set of descriptors The following steps produce a numerical measure of the validity of the indexing for a given document: [Wess79] Pair identical terms and remove them from the test set and the ideal set. For each pair score one point. If that pairing produces p pairs, then p is added to the validity of the indexing. Pair off remaining terms (in the test set) one level apart from any terms in the original ideal set and remove them from the test set. For each pair score one half point and if there are q such pairs, then add $q over 2$ to the validity score. If there are remaining terms in the test set that are one level apart from any terms in the original ideal set remove them from the test set and add 0 points to the final tally. Score $ -1 over 2$ for all remaining terms in the indexer's set. If there are r such terms in the indexer set, then subtract $r over 2$ from the validity score. In the last step the preceding terms are summed and divided by the number m of terms in the ideal set: This measure has intuitively appealing characteristics to it. It takes account of the similarity between terms rather than just the absolute agreement.

[searching for paragraphs]Answer "Searching for Paragraphs": The word frequency formulas are amenable to many equally valid variations. The logarithm (log) is frequently used to dampen the effect of a variable, as is the case in this answer. where $FREQ sub ij$ is the number of occurrences of term j in block i, n is the number of blocks in the collection, and $BLOCKFREQ sub j$ is the number of blocks containing j. In order to incorporate the intrinsic and extrinsic weights TOTALWEIGHT is proposed: where y is the number of immediate descendants of block i, j is a query term in i, and d is an immediate descendant of block i [Fris88]. This propagation function can be called recursively from the leaf blocks to the root block of the document to determine the most relevant block to a given query.

[translations]Answer "Translations": In addition to the ways described in the exercise, the group language and the common language may be related by the superset and the intersection relation.

[inconsistency]Answer "Inconsistency": One could change one semantic net so that both have `hand injury' under `arm injury'. More careful examination reveals, however, another problem. In one anatomy section `arm' is broader-than `hand'. A pattern has been maintained between the disease and anatomy sections. Thus to change the disease section without also changing the anatomy section, would introduce a second-order inconsistency [Rada87a].

7.3 Grouptext answers

[node types]Answer "Node Types": One might agree that node names should all be nouns and in singular form. In a hierarchy, if one term is of a certain type, then its descendant terms should be of the same type.

[versions]Answer "Versions": A reference to an entity may refer to a specific version of that entity along a specific branch of the version graph or to the latest version of the entity that matches some particular description.

[cooperative database]Answer "Cooperative Database": A group of users are going to be identified to a database program (called $database sub program$) Each user has a copy of the database on his workstation and initiates a program on his workstation (called $workstation sub program$) that notifies the file server when a file has been changed. The $database sub program$ will be told when the user started editing the file. When the user requests through $workstation sub program$ that he wants to save changes, the $database sub program$ will check whether anyone has written the file since he started editing. If yes, then his file will be stored under a temporary name, else $database sub program$ overwrites the original file with his new file. The $database sub program$ and the $workstation sub program$ may send notice to the users involved in a conflict. In other words, if $user sub 1$ started work on $file sub permanent$ at time $time sub 1$, $user sub 2$ started work on a copy of the same file at time $t sub 1+delta$, $user sub 2$ saved changes to the file at time $t sub 1+delta+epsilon$, and $user sub 1$ tried to write to the file at time $t sub 1+delta+epsilon+gamma$, then $user sub 1$ and $user sub 2$ should be told that $user sub 1$ has his file in $file sub temporary$ on the file server, while $user sub 2$'s changes are in $file sub permanent$.

[acetate mode]Answer "Acetate Mode": An appropriate decomposition of a document into objects and harmonization with the special annotation operations would allow a form of acetate annotation that did not require freezing of the source document. In this case, the users of the system could decide whether or not to incorporate the annotations in the document and whether to return to authoring mode.

[capitalist versus communist]Answer "Capitalist versus Communist": The advantages of collaborative authoring are concerned with using resources efficiently. A group of people has varying skills. This is analogous to division of labour when making anything in a `production line' environment. The disadvantages of collaborative authoring include the administrative overhead and that specialization results in boredom. In money terms the profits are split equally between the participants, and one person's share may be quite small. One can compensate for this by charging a higher price to represent the increase in quality. In so doing, however, one would reduce the quantity sold and the profits, unless the demand curve is `inelastic', i.e., does not respond to price changes. Forced collaboration may also discriminate against the person who could write a document alone and thus gain all the benefits.

[student collaboration]Answer "Student Collaboration": Collaborative writing is impeded by the relative lack of a mechanism in the school for students to interact and work together. Writing tools in a school may be difficult to share and products may be difficult to reproduce. Teachers aren't trained in instilling collaboration in students. When the process can be managed it has the advantages that the teacher can supervise others (namely, the students themselves) in the process of teaching one another. For short essays about personal experiences, a student may be best writing alone and then doing revisions with the help of feedback from a peer. For large, factual documents which can be prepared over a long period of time, collaboration may motivate and guide people.

[scientist evaluation]Answer "Scientist Evaluation": In the first instance the Dean might promote the academic who had authored the most papers in the database. But this might give preference to quantity over quality. One person may have written 100 papers which hardly anyone else has read, whereas another person may have written only one paper but it may be referenced by every other paper in the database. Surely, the second person has a legitimate claim to be the one to be promoted? In this style, each academic is assigned a score equal to the number of times that his papers are cited in the database. This method is, however, flawed in several ways, one of which is similar to the flaw of simply counting publications, namely, the references may come from papers which themselves are important or from ones which are unimportant. (N.B. the above strategy for evaluating people is actually used in some institutions of higher learning).

[efficiency exceeds one]Answer "Efficiency Exceeds One": A task specification must match with the memory of the authors in order that the writing task can be performed. Given writing $task sub X$ and $memory sub X$, then the writing task can be finished in one step, but given writing $task sub Y$ and $memory sub X$, writing $task sub Y$ can not possibly be done with $memory sub X$. When two people collaborate, they must communicate, which takes time. For $task sub Y$ with two authors each having complementary halves of $memory sub Y$, the writing can now be done, if and only if collaboration occurs. If writing now takes one-half a step, then if the cost of communication is less than one-half a step, the two writers have an efficiency greater than one.

7.4 Expertext answers

[isomorphism]Answer "Isomorphism": An example of an isomorphism is given for a semantic net about viral infections, but many other examples can be readily imagined. cause(flu) = virus, cause(infection) = infectious agent, parent(flu) = infection, and parent(virus) = infectious agent. then the isomorphism is parent(cause(flu)) = cause(parent(flu)).

[outline]Answer "Outline": Analogical inheritance of various kinds is present in the outline. In particular: $Inh sub x,y,z G$, where x=subtopic, y=part-of, and z=located-on. $Inh sub x,y,z G$, where x=subtopic, y=$(subtopic sup 2 ) sup -1$, and z=implies. $Inh sub x,y,z G$, where x=subtopic, y=$(subtopic sup 2 ) sup -1$, and z=$implies sup -1$.

[metric]Answer "Metric": The definition is Here m is the cardinality of X and n is the cardinality of Y. (See [Rada89b]) for proof that this is a metric)

[resolution algorithm]Answer "Resolution Algorithm": No initial marking is associated with the graph, except that `source' transitions (which are always enabled) will deposit tokens in their destination places. These source transitions represent the initial facts. The full firing algorithm follows (by Paul Dunne in [Rada90]): Fire all source transitions. For all places, p: If p contains a tuple $alpha$ and there is an edge (t,p) labeled $beta$, where $beta$ negates $alpha$, replace this edge by an edge (p,t) labeled $beta$. Fire every enabled transition with exactly one output place. A transition, t, is enabled, if and only if, for all places p which are inputs of t should (p,t) be labeled $beta$ there is a token $alpha$ which matches $beta$ resident in p. Firing of such a transition does not remove tokens from places and results in the corresponding output token being placed in the unique output for t. Repeat from step(2) until: some place contains identical tuples of opposite polarity (in which case the set of premises encoded are inconsistent); or every transition has fired at least once; or no transition can be enabled. In the last two cases the set of premises encoded are consistent.

[messaging]Answer "Messaging": Messages from the supervisor must be given high priority for processing. For example, if the supervisor has sent a message which requires action in a fixed time, then those who received the message might be reminded before the deadline about the requirement.

7.5 Conclusion answer

[influence]Answer "Influence": Suppose that the universe consists of three people, and each month each individual produces a document of the same length (i.e., the same number of bits). It is these bits from these documents on which influence will be calculated. Documents A and B are ignored after having been written, and document C is read by all three people. The three people now write documents and base them on the document C. These three documents face a similar fate in that Two of the three are ignored, while the third again is incorporated by all three people. The following situation obtains: At any time t there exists a document C that represents one-third of the system's bits. For all times after t, the descendants of C occupy all of the population. This process repeats indefinitely. In this closed world, perpetual influence has been achieved through document C.