Journal of the Information Science Society D: Databases
Korean Title |
연관 단어 마이닝을 사용한 웹문서의 특징 추출 |
English Title |
Feature Extraction of Web Document using Association Word Mining |
Author |
고수정
최준혁
이정현
|
Citation |
Vol. 30, No. 4, pp. 351-361 (2003. 08) |
Korean Abstract |
Existing methods that extract document features from word association have several problems: profiles must be updated periodically, noun phrases must be handled, and probabilities must be computed for index terms. This paper proposes a method that extracts document features efficiently using association word mining. The proposed method uses the Apriori algorithm to represent the features of a document as association word vectors rather than single words. The association words extracted from a document by the Apriori algorithm differ according to the number of words that compose them and to the confidence and support. To improve document classification performance, this paper therefore proposes an efficient method for determining the number of words composing an association word, the confidence, and the support. Because feature extraction by association word mining does not use a profile, no profile update is needed, and because it generates noun phrases automatically from the confidence and support of the Apriori algorithm without computing probabilities for index terms, it resolves the problems of the existing word association methods. To evaluate the proposed method, it is applied to document classification with a Naive Bayes classifier and compared with the information gain and inverse document frequency methods, and also with existing methods that classify documents using index term association and using word association based on a probability model. |
English Abstract |
Previous approaches that extract document features through word association must update profiles periodically, handle noun phrases, and calculate probabilities for index terms. We propose a more effective feature extraction method based on association word mining. Using the Apriori algorithm, the method represents the features of a document not as single words but as association word vectors. The association words that Apriori extracts from a document depend on the confidence, the support, and the number of words that compose them, so this paper proposes an effective method for determining these three values. Because the method does not use a profile, no profile needs to be updated, and it generates noun phrases automatically from the confidence and support of the Apriori algorithm without calculating probabilities for index terms. We apply the proposed method to document classification with a Naive Bayes classifier and compare it with the information gain and TF-IDF methods. We also compare it with existing document classification methods that use index term association and with those that use word association based on a probability model. |
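The abstract describes mining association words with Apriori under support and confidence thresholds and then representing a document as an association word vector. The following is a minimal sketch of that general idea, not the authors' implementation: the function names (`apriori`, `association_words`, `to_vector`), the toy documents, and the threshold values are all assumptions made for illustration, and the paper's procedure for choosing the thresholds is not reproduced here.

```python
from itertools import combinations


def apriori(transactions, min_support, max_size):
    """Return {itemset: support} for every word set with support >= min_support."""
    n = len(transactions)

    def support(itemset):
        # Fraction of documents that contain every word of the itemset.
        return sum(itemset <= t for t in transactions) / n

    singles = {frozenset([w]) for t in transactions for w in t}
    current = {s: support(s) for s in singles}
    current = {s: v for s, v in current.items() if v >= min_support}
    frequent = dict(current)
    size = 1
    while current and size < max_size:
        size += 1
        # Join step: merge frequent (size-1)-sets into size-word candidates.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == size}
        # Prune step: every (size-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, size - 1))}
        current = {c: support(c) for c in candidates}
        current = {c: v for c, v in current.items() if v >= min_support}
        frequent.update(current)
    return frequent


def association_words(frequent, min_confidence):
    """Keep multi-word sets having at least one rule (rest -> word) that is confident enough."""
    kept = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue
        for word in itemset:
            antecedent = itemset - {word}
            if sup / frequent[antecedent] >= min_confidence:
                kept.append(itemset)
                break
    return sorted(kept, key=sorted)


def to_vector(doc_words, assoc_words):
    """Binary vector: 1 where the document contains every word of an association word."""
    return [1 if aw <= doc_words else 0 for aw in assoc_words]


# Hypothetical toy corpus: each document reduced to its set of nouns.
docs = [
    {"game", "player", "team"},
    {"game", "player", "score"},
    {"game", "team", "score"},
    {"stock", "market", "price"},
    {"stock", "market", "game"},
]
frequent = apriori(docs, min_support=0.3, max_size=3)
assoc = association_words(frequent, min_confidence=0.8)
vector = to_vector({"game", "team", "coach"}, assoc)
```

In this sketch the three knobs named in the abstract appear as `min_support`, `min_confidence`, and `max_size`. The resulting binary vectors could serve as input features for a classifier such as Naive Bayes.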
Keyword |
Feature extraction
Association word mining
|