Document Creation
Document Part
There are 23 buttons for creating documents. Each
button represents one term that can be added to a document.
Three of these terms are stop words, which are ignored when
the vector values are calculated, so 20 terms count. In the
document part, users can create at most 16 documents. When a
user clicks a document term button, the term is shown in the
first text box. While creating documents, users also need the
following functional buttons.
- "Reset" clears all the content in the document part.
- "Delete" erases the content of the first text box.
- "Add", the key button, puts the content into the document
list and computes the frequency of each term in the document.
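
As a rough sketch of how these three buttons could behave, the
state of the document part can be modelled as a draft (the first
text box) plus a document list. All names below are hypothetical
and are not taken from the original script. Clearing the draft
after "Add" is also an assumption.

```javascript
// Hypothetical model of the document part; not the original code.
const MAX_DOCUMENTS = 16; // limit stated in the text

function createDocumentPart() {
  return { draft: [], documents: [] };
}

// Clicking a term button appends the term to the first text box.
function clickTerm(state, term) {
  state.draft.push(term);
}

// "Delete": erase the content of the first text box only.
function deleteDraft(state) {
  state.draft = [];
}

// "Reset": clear everything in the document part.
function resetAll(state) {
  state.draft = [];
  state.documents = [];
}

// "Add": move the draft into the document list, up to the limit.
// Returns false if the draft is empty or the list is already full.
function addDocument(state) {
  if (state.draft.length === 0 || state.documents.length >= MAX_DOCUMENTS) {
    return false;
  }
  state.documents.push(state.draft.slice());
  state.draft = []; // assumption: the text box is emptied after "Add"
  return true;
}
```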
In my JavaScript code, to make sure each term gets the correct
frequency, I created a set of calculate functions.
The calculate() function finds the term
"we" and counts how many times it occurs in one document. It
then calls calculate2() to get the frequency
of the next term, and so on. Therefore, the Adddocuments() function
only needs to call calculate() to get all the frequencies.
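
The same counting can be sketched with a single loop over the
term list instead of one chained function per term. This is an
alternative illustration, not the calculate() chain itself. The
vocabulary and stop words below are placeholders, since only the
term "we" is named in the text, and the function name
termFrequencies is hypothetical.

```javascript
// Placeholder vocabulary and stop words (assumptions for illustration).
const TERMS = ["we", "data", "model"];
const STOP_WORDS = new Set(["the", "a", "of"]);

// Count how often each vocabulary term occurs in one document.
// Stop words and unknown words are skipped.
function termFrequencies(documentText, terms = TERMS) {
  const counts = new Map(terms.map((t) => [t, 0]));
  for (const word of documentText.trim().split(/\s+/)) {
    if (STOP_WORDS.has(word)) continue;
    if (counts.has(word)) counts.set(word, counts.get(word) + 1);
  }
  // Return the frequencies in the same order as the term list.
  return terms.map((t) => counts.get(t));
}

console.log(termFrequencies("we the data we")); // [ 2, 1, 0 ]
```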
In this function, the co-ordinates of the vector are calculated
by the formula:

$$
\mathrm{CVEC}[i] = \frac{f_i}{\sqrt{f_0^2 + f_1^2 + \cdots + f_{19}^2}},
\qquad 0 \le i < 20,
$$

where f_i is the frequency of term i in the document and the sum
in the denominator runs over all 20 terms.
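
A minimal sketch of this normalization follows. The function name
normalize is hypothetical, and returning an all-zero vector for an
empty document is an assumption, since the text does not say how
that case is handled.

```javascript
// Divide each term frequency by the Euclidean length of the
// frequency vector, so the resulting vector has length 1.
function normalize(freqs) {
  const norm = Math.sqrt(freqs.reduce((sum, f) => sum + f * f, 0));
  if (norm === 0) {
    // Assumption: a document with no counted terms maps to the zero vector.
    return freqs.map(() => 0);
  }
  return freqs.map((f) => f / norm);
}

console.log(normalize([3, 4])); // [ 0.6, 0.8 ]
```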
Query Part
In the query part, users can create queries and then get the
similarity between a query and the documents. Users can pick any
term from the "Query Term" list and choose a weight from the
"Weight" list, then add the term and weight to the text box.
"Submit" calculates the co-ordinates of the query with the same
formula as above, since a query can be treated as a document. At
the same time, all the co-ordinates are sent to the FastMap
algorithm calculation part, which calculates the similarities by
Euclidean distance and cosine measure.
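
The two measures named here can be sketched as follows. The sketch
assumes that a query's weights play the role of term frequencies,
and it uses a placeholder vocabulary. It does not reproduce the
FastMap part, and all function names are hypothetical.

```javascript
// Placeholder vocabulary (assumption; only "we" is named in the text).
const VOCAB = ["we", "data", "model"];

// Build a normalized query vector from { term: weight } pairs,
// treating each weight like a term frequency.
function queryVector(weightedTerms, vocab = VOCAB) {
  const raw = vocab.map((t) => weightedTerms[t] || 0);
  const norm = Math.hypot(...raw);
  return norm === 0 ? raw : raw.map((w) => w / norm);
}

// Euclidean distance: smaller means more similar.
function euclideanDistance(a, b) {
  return Math.hypot(...a.map((x, i) => x - b[i]));
}

// Cosine measure: closer to 1 means more similar.
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const na = Math.hypot(...a);
  const nb = Math.hypot(...b);
  return na === 0 || nb === 0 ? 0 : dot / (na * nb);
}

const query = queryVector({ we: 3, model: 4 });
const doc = [1, 0, 0]; // an already normalized document vector
console.log(euclideanDistance(query, doc), cosineSimilarity(query, doc));
```

For vectors that are already normalized to length 1, the two
measures rank documents in the same order, because the squared
Euclidean distance equals 2 minus twice the cosine.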