PROPOSAL NO.: 0414504
INSTITUTION: LexiClone
NSF PROGRAM: INFORMATION & DATA MANAGEMENT
PRINCIPAL INVESTIGATOR: Geller, Ilya S
TITLE: Lexical Cloning: the novel approach for textual information
processing and knowledge flow control
RATING: Poor
REVIEW:
What is the intellectual merit of the proposed activity?
There are weaknesses both in the description of the project and in the
demonstration of the proposer's ability to carry it out.
The qualifications of the proposer to conduct this research are not well
documented. His CV shows three associate degrees, and his experience
includes programming for a securities firm and 5 years as principal scientist
and executive officer of LexiClone, a technology startup. He shows 3
publications, one a 2004 TREC report (a general article, not an experimental
report), one a Russian internet article, and one a US Patent for the LexiClone
software. He reports TREC results in which "the superiority of this
technology was demonstrated" (according to the website, 3rd in QA and 9th
in Novelty Tracks). The citations to this work point to a participants-only
website, so further information is not available.
The reports of previous work are very limited; there are 4 citations: one to
the LexiClone web page, one to the LexiClone patent, one to the TREC results (a
closed site as noted above), and one to a philosophy text. There is no
evidence of grounding in prior research.
As noted in the summary statement, the goals of the project are very broad and
ambitious, yet there is no evidence that the work will build on prior
knowledge, nor is there a clear indication of the underlying mechanisms which
will be explored. Nor is there any indication of how difficult the goals are
in reality; one of them, for instance, seems to be the creation of text in an
author's voice, a very difficult problem.
There is no indication of how the project will be evaluated, which in IR terms
is a serious omission.
What are the broader impacts of the proposed activity?
The goals of the project are the goals of much IR research: to create a
retrieval mechanism which is effective and which is personalizable to the user.
However, it is questionable how well these goals can be met, particularly
since the proposed methodologies are not documented in detail. The
proposer is affiliated with a technology startup company whose product,
LexiClone, is the basis for this research. As it is the subject of a US
patent, it is not clear how this will impact the dissemination of results and
the contribution to research in the IR field.
Summary Statement
The "lexical clone" in the title appears to be a computer-generated
summary of all the text an individual has written/read/queried, based on
"triads" which are equated with "meaningful key phrases from a
sentence". It is proposed that the lexical clone will act as a
filter to obtain useful information for the users. The proposed work
involves building an English part-of-speech dictionary, creating a
summary/profile, using the summary to filter text, adding AI (learning)
techniques to the clones, creating new texts on demand, and developing
principles to automatically structure text.