Efficient data mining from large text databases

Arimura Hiroki; Sakamoto Hiroshi; Arikawa Setsuo

(2002) Progress in discovery science, Dordrecht, Springer.

pp. 123-139

In this paper, we consider the problem of discovering a simple class of combinatorial patterns from a large collection of unstructured text data. As a data mining framework, we adopt optimized pattern discovery, in which a mining algorithm finds the patterns that optimize a given statistical measure within a class of hypothesis patterns on a given data set. We present efficient algorithms for classes of proximity word association patterns and report experiments on keyword discovery from Web data.

Publication details

DOI: 10.1007/3-540-45884-0_6

Full citation:

Arimura, H., Sakamoto, H., Arikawa, S. (2002). Efficient data mining from large text databases, in S. Arikawa & A. Shinohara (eds.), Progress in discovery science, Dordrecht, Springer, pp. 123-139.
