Document Creation

Document Part
There are 23 buttons for creating documents. Each button represents one term that can be added to a document. Three of these terms are stop words, which are ignored when the vector values are calculated, so 20 terms contribute to the vector. In the document part, users can create at most 16 documents. When the user clicks a term button, the term is shown in the first text box. While creating documents, users also need the following function buttons.

  1. "Reset" clears all the content in the document part.
  2. "Delete" erases the content of the first text box.
  3. "Add", the key button, puts the content into the document list and computes the frequency of each term in the document. In my JavaScript code, I created a set of calculate functions to make sure that the frequency of each term is counted correctly.

The calculate() function finds the term "we" and counts how many times it occurs in one document. It then calls calculate2() to get the frequency of the next term, and so on. Therefore, the Adddocuments() function only needs to call calculate() to obtain all the frequencies. In this function, the co-ordinates of the vector are calculated by the formula:
         CVEC[i] = freq[i] / sqrt( freq[0]^2 + freq[1]^2 + ... + freq[19]^2 ),
where freq[i] is the frequency of term i and i = 0, 1, ..., 19, because there are 20 terms.
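
The counting and normalisation steps described above can be sketched roughly as follows. This is only an illustration, not the page's actual calculate() code: the function names termFrequencies and normalize, the three-term vocabulary, and the stop word list are all assumptions made for the example (only the term "we" is named in the text).

```javascript
// Assumed stop words; the original page has 3, but their identity is not stated.
const STOP_WORDS = new Set(["the", "of", "a"]);

// Count how often each vocabulary term occurs in a document,
// where the document is given as an array of clicked terms.
function termFrequencies(docTerms, vocabulary) {
  const counts = new Array(vocabulary.length).fill(0);
  for (const term of docTerms) {
    if (STOP_WORDS.has(term)) continue; // stop words do not contribute
    const i = vocabulary.indexOf(term);
    if (i !== -1) counts[i] += 1;
  }
  return counts;
}

// Divide each frequency by the Euclidean length of the frequency vector,
// so that the resulting co-ordinates form a unit vector.
function normalize(freqs) {
  const length = Math.sqrt(freqs.reduce((sum, f) => sum + f * f, 0));
  if (length === 0) return freqs.map(() => 0); // empty document: avoid 0/0
  return freqs.map((f) => f / length);
}

// Hypothetical three-term vocabulary standing in for the real 20 terms.
const exampleVocabulary = ["we", "like", "music"];
const exampleFreqs = termFrequencies(
  ["we", "like", "the", "music", "we"],
  exampleVocabulary
);
console.log(exampleFreqs);            // [ 2, 1, 1 ]
console.log(normalize(exampleFreqs)); // each count divided by sqrt(6)
```

For this example document the co-ordinates come out at roughly 0.816, 0.408 and 0.408. A single loop over the vocabulary is used here in place of the chained calculate(), calculate2(), ... calls, which is a design choice of the sketch rather than something the original code is known to do.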

Query Part
In the query part, users can create queries and then obtain the similarity between a query and the documents. Users pick any term from the "Query Term" list and a weight from the "Weight" list, then add the term and weight to the text box. "Submit" calculates the co-ordinates of the query with the same formula as above, since a query can be treated as a document. At the same time, all the co-ordinates are sent to the FastMap algorithm calculation part, which calculates the similarities by Euclidean Distance and Cosine Measure.
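
As a rough illustration of the query step, the sketch below builds a query vector from (term, weight) pairs and compares it with a document vector using both measures. It assumes that a term's weight plays the role of its frequency, which the text implies but does not state outright. The function names and the three-term vocabulary are hypothetical, and the FastMap projection itself is not shown.

```javascript
// Build a unit-length query vector from [term, weight] pairs.
// Assumption: the chosen weight is used where a document would use frequency.
function queryVector(weightedTerms, vocabulary) {
  const weights = new Array(vocabulary.length).fill(0);
  for (const [term, weight] of weightedTerms) {
    const i = vocabulary.indexOf(term);
    if (i !== -1) weights[i] += weight;
  }
  const length = Math.sqrt(weights.reduce((s, w) => s + w * w, 0));
  return length === 0 ? weights : weights.map((w) => w / length);
}

// Euclidean distance: smaller means more similar.
function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((s, x, i) => s + (x - b[i]) ** 2, 0));
}

// Cosine measure: 1 means same direction, 0 means no shared terms.
function cosineSimilarity(a, b) {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const normA = Math.hypot(...a);
  const normB = Math.hypot(...b);
  return normA === 0 || normB === 0 ? 0 : dot / (normA * normB);
}

// Hypothetical vocabulary and vectors, for illustration only.
const queryVocabulary = ["we", "like", "music"];
const query = queryVector([["we", 3], ["music", 4]], queryVocabulary);
const documentVector = [0, 1, 0]; // a document containing only "like"
console.log(euclideanDistance(query, documentVector));
console.log(cosineSimilarity(query, documentVector));
```

Because the query and this example document share no terms, the cosine measure is 0. For unit vectors the two measures always agree on ranking, since the squared Euclidean distance equals 2 minus twice the cosine.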

 
Created By: Shruti Parikh, Sueyeon Syn, Kittipong Techapanichgul, Zhiwen Yu

December 16, 2004