
 

Shortcut to the topics

  • How to control the space?

The space displayed by the tool is controlled mainly with the mouse. The table below lists the mouse actions and the results they produce in the space.

    Mouse Actions                                         Results
    Drag Mouse Pointer Horizontally                       Rotate the documents along X Axis
    Drag Mouse Pointer Vertically                         Rotate the documents along Y Axis
    Click Left-Button once on a Document                  Show Distance between the selected document and the other documents
    Click Left-Button on Documents                        Show Distance among the selected documents
    Click Left-Button on the Space Area (not Documents)   Show Document Information (only ID and Title are shown)
  • How to interpret the information from the space representation?

Each icon (image) represents a document. Documents of certain file types are shown with a file-type icon so that users can recognize the type at a glance. The applet includes a set of common application icons, listed in the following table.

    Specific File Type (Extension)    Description
    .doc                              Word Document
    .fla                              Macromedia Flash
    .htm or .html                     HTML Files
    .mdb                              Microsoft Access
    .mov                              Movie File
    net                               Internet Resources
    office                            Microsoft Office
    .ppt                              Microsoft PowerPoint
    .psd                              Photoshop File
    .rar                              RAR File
    .txt                              Text File
    .wma                              Windows Media Player File
    .xls                              Microsoft Excel File
    .zip                              ZIP File

Another important piece of information that can be read from the space is the "similarity distance" among the documents. A document that is close to another document has a small similarity distance to it, which means that the contents of the two documents are very similar. In contrast, if two documents are far from each other, their contents are quite different.

In the image above, document no. 1 is closer to document no. 2 than to document no. 3. Also, the size at which a document is drawn depends on its distance from the point in the space closest to the user's eyes.

Furthermore, to help users perceive distance in the space, the applet draws connections in colors of different intensity. The color bar at the bottom left of the space shows which color is used for each distance level. Light colors indicate a tight connection between two documents, while dark colors indicate a loose connection between two documents.

In the image above, the brightest yellow marks connections among documents that are close to each other, meaning that their contents are similar, while the darker connections lead to faraway documents.
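The applet's actual palette and number of distance levels are not documented here. As a rough illustration of the idea only, the Python sketch below assumes five levels, a linear mapping from distance to level, and shades of yellow; the function names `distance_to_level` and `level_to_rgb` and the brightness values are hypothetical.

```python
def distance_to_level(distance, max_distance, levels=5):
    """Map a distance to a level from 0 (closest) to levels - 1 (farthest)."""
    if max_distance <= 0:
        return 0
    # Clamp the ratio to [0, 1] so out-of-range distances still get a level.
    ratio = min(max(distance / max_distance, 0.0), 1.0)
    return min(int(ratio * levels), levels - 1)


def level_to_rgb(level, levels=5):
    """Return a yellow shade: level 0 is the brightest, the last level the darkest."""
    if levels <= 1:
        return (255, 255, 0)
    # Assumed range: brightness falls linearly from 255 down to 55.
    brightness = 255 - int(200 * level / (levels - 1))
    return (brightness, brightness, 0)


for d in (10.0, 100.0, 200.0):
    level = distance_to_level(d, max_distance=200.0)
    print(d, level, level_to_rgb(level))
```

With these assumed values, a distance of 10 falls in the brightest level and a distance of 200 in the darkest, which matches the light-means-close reading described above.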

  • How to see the actual distance values of the network?

Users can see the similarity distance between two documents by clicking on both documents. Users may also select more than two documents to see the similarity distances within that set (a subset of the documents in the space).

To see the distance between one document and all other documents in the space, simply click on that document once.

 

In addition, to see all similarity distance values in the space in a readable format, select "show similarity distance table" in the View menu tab. All distance values are then laid out in a matrix to improve readability.
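The layout of the applet's own table is not reproduced here. As a sketch of what such a matrix view amounts to, the following Python snippet prints a set of pairwise distances as an aligned text table; the function name `format_distance_table` and the sample values are assumptions made for illustration.

```python
def format_distance_table(names, distances):
    """Return the distances as an aligned text table, one row per document."""
    # Use one column width that fits every name and every value.
    width = max(len(str(name)) for name in names)
    for row in distances:
        for value in row:
            width = max(width, len(f"{value:g}"))

    header = " " * width + " " + " ".join(f"{name:>{width}}" for name in names)
    lines = [header]
    for name, row in zip(names, distances):
        cells = " ".join(f"{value:>{width}g}" for value in row)
        lines.append(f"{name:>{width}} {cells}")
    return "\n".join(lines)


names = ["A", "B", "C"]
distances = [
    [0, 10, 20],
    [10, 0, 10],
    [20, 10, 0],
]
print(format_distance_table(names, distances))
```

Each row and column is labeled with a document name, so the distance between two documents can be read at the intersection of their row and column.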

  • How to import similarity distance data to the applet?

There are two ways to import similarity distance data into the tool for display. The first is to give the URL of a data source containing an XML data file as an applet parameter. In this case, the tool opens in service mode. Adding a "param" tag inside the "applet" tag is enough to load the data into the tool. If the file has an incorrect format, the tool opens in standalone mode instead. An example of how to specify the data source is shown below.

<applet code="SpaceDisplay.class" codebase = "bin/" name="3dspace" width="700" height="500">
<param name="datasource" value="http://www.sis.pitt.edu/~ktech/XML/dist1.xml">
</applet>

The other way is to run the tool standalone and then open the similarity distance data file; only text (.txt) and XML (.xml) files are accepted. The tool also lets users drag a data file (again, .txt or .xml only) and drop it onto the applet.

[Screenshots: Step 1, Step 2, Step 3]

  • How to create a similarity distance data file used in the tool?

The tool accepts two file formats: text files (.txt) and XML files (.xml). The structure of a text file is shown below.

A,B,C,D,E
0, 10, 20, 30, 40
10, 0, 10, 20, 30
20, 10, 0, 10, 20
30, 20, 10, 0, 10
40, 30, 20, 10, 0

The first row gives the unique name of each document. The second row gives the distance between document A and each document in the space: document A is 0 units from itself, 10 units from document B, 20 units from document C, and so on for the other documents. Because there are 5 documents in the space, the similarity distance part has 5 rows. You can download an example of a text file used as input here: distancetest.txt
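To make the layout concrete, here is a minimal Python sketch that reads this text format into a list of names and a distance matrix. It is not the applet's own parser; the function name `parse_distance_text` and the checks it performs are assumptions based only on the description above.

```python
def parse_distance_text(text):
    """Parse the comma-separated text format into (names, matrix)."""
    rows = [line.strip() for line in text.splitlines() if line.strip()]
    if not rows:
        raise ValueError("the file is empty")

    # First row: one unique name per document.
    names = [name.strip() for name in rows[0].split(",")]
    if len(set(names)) != len(names):
        raise ValueError("document names must be unique")

    # Remaining rows: one row of distances per document.
    matrix = [[float(cell) for cell in row.split(",")] for row in rows[1:]]
    if len(matrix) != len(names) or any(len(r) != len(names) for r in matrix):
        raise ValueError("expected one row and one column per document")
    return names, matrix


SAMPLE = """A,B,C,D,E
0, 10, 20, 30, 40
10, 0, 10, 20, 30
20, 10, 0, 10, 20
30, 20, 10, 0, 10
40, 30, 20, 10, 0
"""

names, matrix = parse_distance_text(SAMPLE)
a, c = names.index("A"), names.index("C")
print(matrix[a][c])  # 20.0, the distance between documents A and C
```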

However, I recommend creating the similarity distance data file in XML format because it allows more expressive input. For the XML format, the tool requires two main parts of document information for the space display.

The first part is the individual information about a document in the space. It must contain the id, title, size, content and type of the document.

<document id="1" title="Document Title A" sizeMB="1.0" content="Test Content" fileType="txt">

The second part is called the "similaritydistance" part, which defines the similarity distance between a document and every other document. It must be placed inside the document tag.

<similaritydistance id="2" distance="10.0" />
<similaritydistance id="3" distance="100.0" />
<similaritydistance id="4" distance="30.0" />
<similaritydistance id="5" distance="100.0" />
<similaritydistance id="6" distance="200.0" />
<similaritydistance id="7" distance="20.0" />
<similaritydistance id="8" distance="40.0" />
<similaritydistance id="9" distance="50.0" />
<similaritydistance id="10" distance="20.0" />

Ids are assigned by users and must be unique to each document. Keep in mind that ids are used to tell documents apart, so please ensure that no id is duplicated; otherwise the space may be presented incorrectly. Each document can be described in the XML file as shown below.

<document id="1" title="Document Title A" sizeMB="1.0" content="Test Content" fileType="txt">
<similaritydistance id="2" distance="10.0" />
<similaritydistance id="3" distance="100.0" />
<similaritydistance id="4" distance="30.0" />
<similaritydistance id="5" distance="100.0" />
<similaritydistance id="6" distance="200.0" />
<similaritydistance id="7" distance="20.0" />
<similaritydistance id="8" distance="40.0" />
<similaritydistance id="9" distance="50.0" />
<similaritydistance id="10" distance="20.0" />
</document>

According to the expression above, the distance between document no. 1 and document no. 2 is 10 units, and likewise for the other pairs. A complete example XML file with 10 documents in the space is provided as follows:

<?xml version="1.0" encoding="UTF-8" ?>
<documents documentcount="10">
<document id="1" title="Document Title A" sizeMB="1.0" content="Test Content" fileType="txt">
<similaritydistance id="2" distance="10.0" />
<similaritydistance id="3" distance="100.0" />
<similaritydistance id="4" distance="30.0" />
<similaritydistance id="5" distance="100.0" />
<similaritydistance id="6" distance="200.0" />
<similaritydistance id="7" distance="20.0" />
<similaritydistance id="8" distance="40.0" />
<similaritydistance id="9" distance="50.0" />
<similaritydistance id="10" distance="20.0" />
</document>
<document id="2" title="Document Title B" sizeMB="1.0" content="Test Content" fileType="doc">
<similaritydistance id="1" distance="10.0" />
<similaritydistance id="3" distance="75.0" />
<similaritydistance id="4" distance="85.0" />
<similaritydistance id="5" distance="15.0" />
<similaritydistance id="6" distance="150.0" />
<similaritydistance id="7" distance="20.0" />
<similaritydistance id="8" distance="40.0" />
<similaritydistance id="9" distance="50.0" />
<similaritydistance id="10" distance="20.0" />
</document>
<document id="3" title="Document Title C" sizeMB="1.0" content="Test Content" fileType="htm">
<similaritydistance id="1" distance="100.0" />
<similaritydistance id="2" distance="75.0" />
<similaritydistance id="4" distance="60.0" />
<similaritydistance id="5" distance="100.0" />
<similaritydistance id="6" distance="200.0" />
<similaritydistance id="7" distance="15.0" />
<similaritydistance id="8" distance="10.0" />
<similaritydistance id="9" distance="5.0" />
<similaritydistance id="10" distance="40.0" />
</document>
<document id="4" title="Document Title D" sizeMB="1.0" content="Test Content" fileType="txt">
<similaritydistance id="1" distance="30.0" />
<similaritydistance id="2" distance="85.0" />
<similaritydistance id="3" distance="60.0" />
<similaritydistance id="5" distance="20.0" />
<similaritydistance id="6" distance="20.0" />
<similaritydistance id="7" distance="20.0" />
<similaritydistance id="8" distance="40.0" />
<similaritydistance id="9" distance="140.0" />
<similaritydistance id="10" distance="40.0" />
</document>
<document id="5" title="Document Title E" sizeMB="1.0" content="Test Content" fileType="html">
<similaritydistance id="1" distance="100.0" />
<similaritydistance id="2" distance="15.0" />
<similaritydistance id="3" distance="100.0" />
<similaritydistance id="4" distance="20.0" />
<similaritydistance id="6" distance="200.0" />
<similaritydistance id="7" distance="100.0" />
<similaritydistance id="8" distance="40.0" />
<similaritydistance id="9" distance="40.0" />
<similaritydistance id="10" distance="60.0" />
</document>
<document id="6" title="Document Title F" sizeMB="1.0" content="Test Content" fileType="N/A">
<similaritydistance id="1" distance="200.0" />
<similaritydistance id="2" distance="150.0" />
<similaritydistance id="3" distance="200.0" />
<similaritydistance id="4" distance="20.0" />
<similaritydistance id="5" distance="200.0" />
<similaritydistance id="7" distance="40.0" />
<similaritydistance id="8" distance="50.0" />
<similaritydistance id="9" distance="60.0" />
<similaritydistance id="10" distance="200.0" />
</document>
<document id="7" title="Document Title G" sizeMB="1.0" content="Test Content" fileType="zip">
<similaritydistance id="1" distance="20.0" />
<similaritydistance id="2" distance="20.0" />
<similaritydistance id="3" distance="15.0" />
<similaritydistance id="4" distance="20.0" />
<similaritydistance id="5" distance="100.0" />
<similaritydistance id="6" distance="40.0" />
<similaritydistance id="8" distance="100.0" />
<similaritydistance id="9" distance="50.0" />
<similaritydistance id="10" distance="20.0" />
</document>
<document id="8" title="Document Title H" sizeMB="1.0" content="Test Content" fileType="office">
<similaritydistance id="1" distance="40.0" />
<similaritydistance id="2" distance="40.0" />
<similaritydistance id="3" distance="10.0" />
<similaritydistance id="4" distance="40.0" />
<similaritydistance id="5" distance="40.0" />
<similaritydistance id="6" distance="50.0" />
<similaritydistance id="7" distance="100.0" />
<similaritydistance id="9" distance="50.0" />
<similaritydistance id="10" distance="15.0" />
</document>
<document id="9" title="Document Title I" sizeMB="1.0" content="Test Content" fileType="xls">
<similaritydistance id="1" distance="50.0" />
<similaritydistance id="2" distance="50.0" />
<similaritydistance id="3" distance="5.0" />
<similaritydistance id="4" distance="140.0" />
<similaritydistance id="5" distance="40.0" />
<similaritydistance id="6" distance="60.0" />
<similaritydistance id="7" distance="50.0" />
<similaritydistance id="8" distance="50.0" />
<similaritydistance id="10" distance="50.0" />
</document>
<document id="10" title="Document Title J" sizeMB="1.0" content="Test Content" fileType="N/A">
<similaritydistance id="1" distance="20.0" />
<similaritydistance id="2" distance="20.0" />
<similaritydistance id="3" distance="40.0" />
<similaritydistance id="4" distance="40.0" />
<similaritydistance id="5" distance="60.0" />
<similaritydistance id="6" distance="200.0" />
<similaritydistance id="7" distance="20.0" />
<similaritydistance id="8" distance="15.0" />
<similaritydistance id="9" distance="50.0" />
</document>
</documents>

You can download examples of XML files used as input here: distancetest.xml or icontest.xml
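Before loading a hand-written XML file into the tool, it may help to check it for the mistakes mentioned earlier, such as duplicate ids. The Python sketch below uses the standard `xml.etree.ElementTree` module and is not part of the tool. The function names `load_distances` and `find_problems` are hypothetical, and the check that every distance is mirrored (1 to 2 equals 2 to 1) is an assumption drawn from the example file, not a documented requirement.

```python
import xml.etree.ElementTree as ET


def load_distances(xml_text):
    """Return {document id: {other id: distance}} read from the XML format."""
    root = ET.fromstring(xml_text)
    table = {}
    for doc in root.findall("document"):
        doc_id = doc.get("id")
        if doc_id in table:
            raise ValueError(f"duplicate document id: {doc_id}")
        table[doc_id] = {
            sd.get("id"): float(sd.get("distance"))
            for sd in doc.findall("similaritydistance")
        }
    return table


def find_problems(table):
    """List references to unknown ids and distances that are not mirrored."""
    problems = []
    for doc_id, others in table.items():
        for other_id, distance in others.items():
            if other_id not in table:
                problems.append(f"{doc_id} refers to unknown id {other_id}")
            elif table[other_id].get(doc_id) != distance:
                problems.append(f"{doc_id}-{other_id} distance is not mirrored")
    return problems


# A three-document excerpt using values from the example file.
SAMPLE = """<documents documentcount="3">
<document id="1" title="Document Title A" sizeMB="1.0" content="Test Content" fileType="txt">
<similaritydistance id="2" distance="10.0" />
<similaritydistance id="3" distance="100.0" />
</document>
<document id="2" title="Document Title B" sizeMB="1.0" content="Test Content" fileType="doc">
<similaritydistance id="1" distance="10.0" />
<similaritydistance id="3" distance="75.0" />
</document>
<document id="3" title="Document Title C" sizeMB="1.0" content="Test Content" fileType="htm">
<similaritydistance id="1" distance="100.0" />
<similaritydistance id="2" distance="75.0" />
</document>
</documents>"""

table = load_distances(SAMPLE)
print(table["1"]["2"])       # 10.0
print(find_problems(table))  # []
```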

Copyright 2006 ADVisE 3D. All rights reserved.
If you have any questions, comments or suggestions, please contact kittipong@alum.mit.edu