Abstract
This paper introduces a locality discriminating indexing (LDI) algorithm for text categorization. The LDI algorithm offers a manifold-based approach to discriminant analysis. Based on the hypothesis that samples from different classes reside in class-specific manifold structures, the algorithm depicts these structures with a nearest-native graph and an invader graph. A new locality discriminant criterion is then proposed that best preserves the within-class local structures while suppressing the between-class overlap. Using the notion of the graph Laplacian, the LDI algorithm finds the optimal linear transformation by solving a generalized eigenvalue problem. The feasibility of the LDI algorithm has been tested on text categorization using the 20 Newsgroups (20NG) and Reuters-21578 data sets. Experimental results show that LDI is an effective technique for document modeling and representation for classification.