NSF PROGRAM: INFORMATION & DATA MANAGEMENT
PRINCIPAL INVESTIGATOR: Geller, Ilya S
TITLE: Lexical Cloning: the novel approach for textual information processing and knowledge flow control
What is the intellectual merit of the proposed activity?
The main goal of this proposal is to use lexical-cloning-based technology for adaptive, user-oriented, personalized search of textual information.
The PI has identified the following research tasks:
1) Development of an electronic dictionary.
2) Development of a program to summarize textual data and lexical profiles.
3) Development of a program to perform user-oriented search.
4) Investigation of an "education" process.
5) On-demand new text creation using user profiles.
6) Development of a technique to structure data dynamically based on lexical profiles.
"Lexical Cloning" concepts, previously developed and patented by the PI, will be used to accomplish the above tasks. These concepts include data summarization and personalized search.
1) Data summarization: This technology involves:
a. Breaking the document into sentences.
b. Identifying the nouns, verbs, adjectives, etc. in each sentence.
c. Creating several combinations of the identified components; each combination is referred to as a "triad".
d. Representing the document by a list of triads and their counts.
2) Personalized search: This technology is based on the above summarization technique. The user profile is created by collecting and summarizing a list of documents that characterize the user's behavior. The search would involve comparing:
a. Summaries of documents
b. Summary of query terms
c. Summary of user's documents (user profile)
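For illustration, the summarization and comparison steps listed above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the PI's patented method: the proposal does not define precisely how triads are formed or how summaries are compared, so the sketch assumes a triad is any (noun, verb, adjective) combination within one sentence and compares summaries by cosine similarity over triad counts. The names `TOY_LEXICON`, `summarize`, `similarity`, `score`, and the `weight` parameter are all hypothetical.

```python
import math
import re
from collections import Counter
from itertools import product

# Hypothetical toy part-of-speech lexicon (an assumption for this sketch);
# a real system would use a proper tagger or dictionary.
TOY_LEXICON = {
    "engine": "noun", "user": "noun", "documents": "noun",
    "ranks": "verb", "reads": "verb",
    "relevant": "adj", "long": "adj",
}

def split_sentences(text):
    # Step a: naive split after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tag_words(sentence):
    # Step b: keep only words found in the toy lexicon, with their tags.
    words = re.findall(r"[a-z]+", sentence.lower())
    return [(w, TOY_LEXICON[w]) for w in words if w in TOY_LEXICON]

def triads(sentence):
    # Step c (assumed definition): every (noun, verb, adjective)
    # combination that can be formed within one sentence.
    tagged = tag_words(sentence)
    nouns = [w for w, tag in tagged if tag == "noun"]
    verbs = [w for w, tag in tagged if tag == "verb"]
    adjectives = [w for w, tag in tagged if tag == "adj"]
    return list(product(nouns, verbs, adjectives))

def summarize(text):
    # Step d: represent the text by its triads and their counts.
    counts = Counter()
    for sentence in split_sentences(text):
        counts.update(triads(sentence))
    return counts

def similarity(a, b):
    # Assumed comparison: cosine similarity between two triad-count summaries.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_sq = sum(v * v for v in a.values()) * sum(v * v for v in b.values())
    return dot / math.sqrt(norm_sq) if norm_sq else 0.0

def score(doc_summary, query_summary, profile_summary, weight=0.5):
    # Assumed ranking: blend the document's match to the query with its
    # match to the user profile; `weight` is a hypothetical tuning knob.
    return (weight * similarity(doc_summary, query_summary)
            + (1 - weight) * similarity(doc_summary, profile_summary))

if __name__ == "__main__":
    doc = summarize("The engine ranks relevant documents.")
    query = summarize("engine ranks relevant")
    profile = summarize("The user reads long documents.")
    print(doc)
    print(score(doc, query, profile))
```

In this sketch the user profile is simply the summary of the user's own documents, so the same `summarize` function serves all three inputs to the comparison.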
The proposal is not well written, especially the first part, which describes the proposed research tasks. There are several errors and vague sentences. There are practically no references to related research, and there is no evaluation plan.
Research tasks 2, 3, and 6 (above) are basically the core of the patented technology. The PI does not mention any additional work beyond what has already been done.
For research task 1, development of an electronic dictionary, it is not clear why the PI wants to develop one. Several dictionaries have already been developed and are available (e.g., WordNet).
The objective of task 5 is not clear. What is the benefit of creating new text using users' profiles? The PI claims that it will "enable better understanding of the human creativity and ability to communicate".
The objective of task 4 is to make the profiles "learn" and adapt, and to provide them with "Artificial Intelligence" capability. The PI does not describe a way to achieve this, or even show that it is possible.
The user profile, created from a collection of documents provided by the user, may provide hints about the user's education and preferences. However, I don't see how it could capture the user's cultural and social background or psychological profile, as the PI claims.
What are the broader impacts of the proposed activity?
The proposed work may lead to efficient personalized search engines. However, the core technology has already been developed and patented.
I rate this proposal as Fair. The proposal is not well written and has several grammatical errors. There are no references to related literature, no evaluation plan, and no education plan. The contribution of the proposed work beyond what has already been done and patented is not clear.