Computer-aided Interactive Classification: Applications of VIBE

Robert R. Korfhage, David S. Dubin


Tools like the VIBE visualization system let human analysts draw on both an understanding of a data set's content and a recognition of the structure that the visualization reveals. But what happens when a database's semantics are hidden from the analyst? What guidelines or heuristics can he or she use to reveal the "correct" underlying structure? Results of two experiments conducted at the University of Pittsburgh support the claim that VIBE analysts can uncover a meaningful clustering even without semantic clues. In the first experiment, artificial data sets were created in which half of the variables discriminate one or more clusters and the other half contribute only random noise. Variable selection guidelines based on computed discrimination value were used in an attempt to distinguish the signal variables from the noise variables. In the second experiment, a human analyst's assignment of 714 short phrases to 23 overlapping and interrelated categories was stripped of meaningful titles and relabeled with integers. A VIBE analyst was able to highlight relationships among the 23 categories solely on the basis of co-assignment of the phrases.
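
The idea of relating categories purely through co-assignment can be illustrated with a minimal sketch. It is not the procedure used in the experiments: the function name `co_assignment_counts` and the toy data are assumptions made for illustration only. It counts, for every pair of integer category labels, how many phrases were assigned to both.

```python
from collections import Counter
from itertools import combinations


def co_assignment_counts(assignments):
    """Count, for each pair of category labels, how many items carry both.

    `assignments` is a list in which each entry holds the integer category
    labels given to one phrase (hypothetical input format).
    """
    counts = Counter()
    for labels in assignments:
        # Sort and deduplicate so each pair is keyed consistently as (low, high).
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
    return counts


# Toy data invented for illustration; not taken from the 714-phrase set.
toy = [[1, 2], [1, 2, 3], [3], [2, 3]]
pairs = co_assignment_counts(toy)
```

Pairs with high counts are candidates for related categories. No category titles are needed, only the integer labels.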
