Dictionary Definition Link
Try It Out
- Set up the Plugin
- Start WBI. Set up your web browser to use WBI as a proxy.
- Register the Definition Link plugin.
At the WBI console, type (on one line)
register com/ibm/wbi/projects/dictionary/dictionary.reg
- Check to see whether the plugin is registered and enabled. Go to the
WBI Setup page. The Definition Link plugin
should be listed in the table with a checkmark next to its name. If the plugin
is not listed, try registering it again. If the checkmark is not there, click
on the box to the left of the plugin name.
- Open another browser window. Use that window to try out the plugin, and use
this window to display the documentation. (To open another window using
Microsoft Internet Explorer, go to File -> New -> Window. To open a window
using Netscape Navigator, go to File -> New -> Navigator Window.)
- Choose which dictionaries to load.
- Visit Some Web Pages
- Having Trouble?
What It Does
The Dictionary Definition Link plugin scans each document for words that appear in
one of its dictionaries and edits the document to add a link from each such
word (or phrase) to its definition on the web. To distinguish these added links from
ones that appear in the original document, the plugin italicizes
the anchor text of the links it adds.
How It Works
Architecture
In general, the Dictionary Definition Link plugin performs three functions:
- editing the HTTP stream to make sure all returned content-type fields
are correct (circumventing bugs in some Web servers)
- editing pages to insert links for known terms
- generating pages that allow the user to load or unload dictionaries, or
to choose among dictionaries for a word that appears in several of them.
MEG Model
The AddLinksEditor calls a method in the DefinitionLinkPlugin
to find each word's definition URL. In the special case that a
word appears in more than one dictionary, that method returns a URL that
triggers a DictionaryChoiceGenerator, which provides a list of the
dictionaries that define the word, with links to the definitions
and to full descriptions of each dictionary. If the user
selects the link for a dictionary's description, the
DictionaryChoiceGenerator also parses the resulting query and
marks up the description.
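
The lookup described above might be sketched as follows. This is a minimal illustration, not the plugin's actual code: the class name DefinitionLookupSketch, the method names, the example.org URLs, and the "choice?word=" path are all assumptions. Only the special host http://_dictionary/ comes from this documentation.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DefinitionLookupSketch {
    // Hypothetical URL that would trigger the choice page; the path and
    // query parameter are assumptions made for this sketch.
    static final String CHOICE_URL = "http://_dictionary/choice?word=";

    // Dictionary name -> (word -> definition URL), kept in load order.
    private final Map<String, Map<String, String>> dictionaries =
            new LinkedHashMap<String, Map<String, String>>();

    void load(String name, Map<String, String> entries) {
        dictionaries.put(name, entries);
    }

    /** Returns a definition URL, a choice URL, or null if no dictionary has the word. */
    String findDefinitionUrl(String word) {
        List<String> hits = new ArrayList<String>();
        for (Map<String, String> dict : dictionaries.values()) {
            String url = dict.get(word);
            if (url != null) {
                hits.add(url);
            }
        }
        if (hits.isEmpty()) {
            return null;
        }
        if (hits.size() == 1) {
            return hits.get(0);
        }
        // More than one dictionary defines the word: let the user choose.
        try {
            return CHOICE_URL + URLEncoder.encode(word, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> medical = new HashMap<String, String>();
        medical.put("battery", "http://example.org/omd/battery");
        Map<String, String> law = new HashMap<String, String>();
        law.put("battery", "http://example.org/law/battery");
        law.put("tort", "http://example.org/law/tort");

        DefinitionLookupSketch lookup = new DefinitionLookupSketch();
        lookup.load("medical", medical);
        lookup.load("law", law);
        System.out.println(lookup.findDefinitionUrl("tort"));
        System.out.println(lookup.findDefinitionUrl("battery"));
    }
}
```

Returning a URL in every case keeps the editor simple: it wraps the same kind of link around the word whether the target is a definition or the choice page.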
If the user goes to the special URL, http://_dictionary/setup, the
ControlPanelGenerator creates a form that shows which
dictionaries are available, and which of those are loaded,
allowing the user to load or unload them by changing and
submitting the form. The same generator accepts query data from
that form and performs the requested action.
Scenario: Accessing an External Web Page
Typically, then, the processing path is as follows: The user
enters a URL or clicks on a link to a Web page. A request is
generated and sent to WBI. A WBI generator (possibly the
HttpDefaultGenerator) fetches the page from the server,
producing an HTTP response. An object representing the response
passes through the FixContentTypeEditor, which might correct the
content type for the page, and then to the
AddLinksEditor, which changes the contents of the page by
adding links from words to their definitions.
Implementation Details
- The dictionary
The dictionary is a hashtable that maps words to definition
URLs. We represent these on disk as serialized sequences of key-value
pairs (strings) rather than as serialized hashtables so that we
don't run into hashtable version conflicts between JDK versions.
These dictionaries were created by crawling the definition
sites, such as the On-Line Medical Dictionary and Duhaime's Law Dictionary,
picking out words and recording their links.
Because many words found in these dictionaries were
common English words, we removed those
that also appeared in a non-specialized dictionary.
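
One way to store a dictionary as a serialized sequence of string pairs, rather than as a serialized hashtable, is sketched below. The class and method names are hypothetical, and the leading pair count is an assumption made for the sketch; the plugin's actual file layout is not specified here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.Hashtable;
import java.util.Map;

class DictionaryPairsSketch {
    /** Writes a pair count (an assumption of this sketch), then each word and URL as a String. */
    static void save(Map<String, String> dict, OutputStream out) throws IOException {
        ObjectOutputStream oos = new ObjectOutputStream(out);
        oos.writeInt(dict.size());
        for (Map.Entry<String, String> entry : dict.entrySet()) {
            oos.writeObject(entry.getKey());
            oos.writeObject(entry.getValue());
        }
        oos.flush();
    }

    /** Reads the pairs back and rebuilds the hashtable in memory. */
    static Hashtable<String, String> load(InputStream in)
            throws IOException, ClassNotFoundException {
        ObjectInputStream ois = new ObjectInputStream(in);
        int count = ois.readInt();
        Hashtable<String, String> dict = new Hashtable<String, String>(Math.max(16, count * 2));
        for (int i = 0; i < count; i++) {
            String word = (String) ois.readObject();
            String url = (String) ois.readObject();
            dict.put(word, url);
        }
        return dict;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> dict = new Hashtable<String, String>();
        dict.put("tort", "http://example.org/law/tort");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        save(dict, bytes);
        System.out.println(load(new ByteArrayInputStream(bytes.toByteArray())));
    }
}
```

Only String objects reach the stream, so the file does not depend on the serialized form of the hashtable class itself. The cost is that the table has to be rebuilt entry by entry on every load.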
- Editing pages
The AddLinksEditor parses each page with an HtmlEditor.
For each chunk of text between tags (except for the text of
existing links), it picks out words using whitespace and
punctuation as delimiters. If a one- or two-word phrase appears
in the dictionary, the editor wraps a link around the phrase,
generating the new block of hypertext with an HtmlHelper (from
the PersonalHistoryPlugin). Before the link, the editor adds an
<i> tag, and after the link, it adds an
</i> tag to try to distinguish these inserted
links from ones that appear in the original document.
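
The word-matching step for a single chunk of text might look like the sketch below. It is an illustration under stated assumptions, not the AddLinksEditor itself: it does not use the WBI HtmlEditor or HtmlHelper classes, it builds the markup by plain string concatenation, it assumes dictionary keys are lowercase, and it prefers a two-word match over a one-word match. The class name and the example.org URLs are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AddLinksSketch {
    // A word is a run of letters or digits; whitespace and punctuation delimit words.
    private static final Pattern WORD = Pattern.compile("[\\p{L}\\p{N}]+");

    /** Wraps known one- or two-word phrases in an italicized link. */
    static String addLinks(String text, Map<String, String> dict) {
        // Record the start and end offsets of every word in the chunk.
        List<int[]> words = new ArrayList<int[]>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            words.add(new int[] { m.start(), m.end() });
        }

        StringBuilder out = new StringBuilder();
        int pos = 0; // first character of text not yet copied to out
        int i = 0;
        while (i < words.size()) {
            int start = words.get(i)[0];
            int end = words.get(i)[1];
            int used = 1;
            String url = null;
            // Try a two-word phrase when only whitespace separates the words.
            if (i + 1 < words.size()) {
                int[] next = words.get(i + 1);
                if (text.substring(end, next[0]).trim().isEmpty()) {
                    String phrase = text.substring(start, end) + " "
                            + text.substring(next[0], next[1]);
                    url = dict.get(phrase.toLowerCase(Locale.ROOT));
                    if (url != null) {
                        end = next[1];
                        used = 2;
                    }
                }
            }
            if (url == null) {
                url = dict.get(text.substring(start, end).toLowerCase(Locale.ROOT));
            }
            if (url != null) {
                out.append(text, pos, start);
                out.append("<i><a href=\"").append(url).append("\">")
                   .append(text, start, end).append("</a></i>");
                pos = end;
            }
            i += used;
        }
        out.append(text, pos, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> dict = new HashMap<String, String>();
        dict.put("bone marrow", "http://example.org/omd/bone_marrow");
        dict.put("tort", "http://example.org/law/tort");
        System.out.println(addLinks("A Bone marrow biopsy, or a tort.", dict));
    }
}
```

A real editor would apply this only to text between tags, skipping the text of existing links, and would escape the URL before placing it in the href attribute.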
- Some key WBI classes that were used:
Known Problems
- On some systems, the On-Line Medical Dictionary takes a long time
to load. The delay comes from the computational overhead of deserializing
the strings in the data file to build the hashtable.
- Under Linux, the Java interpreter crashes with an OutOfMemoryError.
The dictionaries take up a good chunk of memory, and the operation of
deserializing them takes even more. As of Java 1.2, Linux versions of
the JVM do not properly allocate heap memory. Instead, the
heap size must be set explicitly using the -mx flag. For example,
java -mx50M Run
works without problems.
(On machines with less than 50 MB of physical memory, the operating system
will provide virtual memory as needed.)
Source Files
- dictionary.reg
- Contains the information necessary to register the plugin.
- dictionary.ini
- Contains information about the available dictionaries.
- AddLinksEditor.java
- Contains the class definition for AddLinksEditor, which scans documents
for terms that appear in the loaded dictionaries and adds links to their definitions.
- DefinitionLinkPlugin.java
- Contains the class definition for DefinitionLinkPlugin, the plugin
itself.
- ControlPanelGenerator.java
- Contains the class definition for ControlPanelGenerator, which
creates the form for choosing which dictionaries to load.
- DictionaryData.java
- Contains the class definition for DictionaryData, which represents
important data for a dictionary--its name, an HTML description of its contents,
and a reference to the dictionary itself (a hashtable mapping words to URLs).
It also contains methods for loading and unloading serialized dictionaries.
- DictionaryChoiceGenerator.java
- Contains the class definition for DictionaryChoiceGenerator, which
generates pages for choosing among dictionaries
in the case that a word appears in more than one.
- MakeChangesGenerator.java
- Contains the class definition for MakeChangesGenerator, which takes the
query data from ControlPanelGenerator, compares it against the
current load status of all the dictionaries, and decides which
dictionaries to load and unload. While the expensive part of
the loading process, namely the deserialization, is running,
the MakeChangesGenerator marks up a web page asking the user to
wait. Once loading has finished, the generator adds a button
that lets the user return to the control panel.
- omd.data
- Contains the data for the On-line Medical
Dictionary, represented as a sequence of words that map to
definition URLs. This is the full version of the dictionary,
minus certain common, non-medical words.
- duhaime.data
- Contains the data for Duhaime's Law Dictionary.
Acknowledgments
- The On-line Medical Dictionary is provided as a public service by the
CancerWEB Project. We thank Dr. Graham Dark for granting us permission to
link to the dictionary. The CancerWEB project allows access to the
dictionary free of charge in the hope that it will be useful, but without
any guarantees of accuracy; for further information on the OMD's use,
please read its terms of use.
- Duhaime's Law Dictionary provides definitions of basic law terms in plain
language. We thank Lloyd Duhaime, the lawyer who wrote the dictionary and
who publishes it on his Web site as a public service. Duhaime's Law
Dictionary is provided in the hope that it will be useful, but without any
guarantee of accuracy of its contents. For further information about
Duhaime's law firm and its activities, visit its site.