Research Prerequisites:
In [1] it uses tree edit distance for measuring the similarity between xml files without mentioning the term frequent trees. Their algorithm is based on the minimum operations needed for transforming one tree to another.
[3] Recommends that xml file be presented with a tag tree pattern and then try to find these patterns as much as possible. This way of presentations takes a lot of space for large xml files. In [4] lee defined the similarity as the common paths, but this definitions is not precise. Although [5] has a strong base of theory, but it seems impossible to implement. In [6] Mr. Ling presents a model for finding the association rules of xml. In [7] Mr. Zaki has presented an algorithm named tree miner which finds the frequent patterns in a data structure called scope-list. This one is one of the most effective algorithms for finding frequent patterns semi-structured data which can be expanded to trees. In [8] a way is recommended for solving the problem of Apriori. And recently [9] showed that when there are a lot of association rules to be found, XAR-Miner method works better that Mine Rule.
[1] C.H. Moh, E.P. Lim and W.K. Ng, DTD-Miner: A tool for mining DTD from XML documents, Proceedings of the 2nd Int. Workshop on Advance Issues of E-Commerce and Web-Based Information Systems, Milpitas, California, pp.144-151, June, 2000.
[2] J.W. Lee, K. Lee and W. Kim, Preparations for semantics-based XML mining, Proceedings of the 2001 IEEE International Conference on Data Mining, pp.345-352, San Jose, California, December, 2001.
[3] T. Miyahara, T. Shoudai, T. Uchida, K. Takahashi and H. Ueda, Discovery of frequent tree structured patterns in semistructuredWeb documents, Proceedings of the Fifth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), Hong Kong, China, pp.47-52, April 2001.
[4] S. Nestorov, S. Abiteboul and R. Motwani, Extracting schema from semi-structured data, Proceedings of ACM SIGMOD International Conference on Management of Data, Seattle, Washington, pp.295-306, June 1998.
[5] Feng, L., Dillon, T. S., Weigand, H., Chang, E., An XML-Enabled Association Rule Framework. In Proceedings of DEXA 2003, pp 88-97, Prague, Czech Republic, 2003.
[6] Feng, L. & Dillon, T. S., Mining XML-Enabled Association Rule with Templates. In Proceedings of KDID 04, 2004.
[7] Zaki, M. J., Efficient Mining of Trees in the Forest. SIGKDD '02, Edmonton, Alberta, Canada, ACM, 2002.
[8] Park, J. S., Chen, M.-S., et al., Using a Hash-Based Method with Transaction Trimming for Mining Association Rules. IEEE Transactions on Knowledge and Data Engineering 9(5): 813-825, 1997.
[9] Zhang, J., et al. On Efficient and Effective Association Rule Mining from XML Data. In DEXA 2004. 2004. Zaragosa, Spain: LNCS.
|