Korean Title |
온톨로지 기반의 웹 페이지 분류 시스템 |
English Title |
Web Page Classification System based upon Ontology |
Author |
최재혁
서혜성
노상욱
최경희
정기현
|
Citation |
VOL 11-B NO. 06 PP. 0723 ~ 0734 (2004. 10) |
Korean Abstract |
This paper proposes an automated Web page classification system based on an ontology. To classify Web pages, the first step selects terms that can represent the category to which each Web page belongs, by computing the product of term frequency and document frequency. The second step computes the information gain of the terms selected in the first step and gives priority to terms with a high probability of classification. Using the terms selected through these two steps together with the classification information of the Web pages, compiled rules are generated by machine learning. The generated rules make it possible to classify arbitrary Web pages into the categories defined by a domain ontology. In the experiments, 78 terms were ultimately selected from an average of 240 terms per category in the given set of Web pages, and Web page classification rules were generated from them. The average classification accuracy of the proposed system was measured at about 83.52%. |
English Abstract |
In this paper, we present an automated Web page classification system based on an ontology. As a first step, to identify representative terms for each class, we compute the product of term frequency and document frequency. Second, we compute the information gain of each selected term and give priority to the terms most likely to discriminate between classes. We then compile the selected terms, paired with the Web page classifications, into rules using machine learning algorithms. The compiled rules classify any Web page into the categories defined in a domain ontology. In the experiments, 78 terms out of 240 were identified as representative features for a given set of Web pages. The resulting classification accuracy was, on average, 83.52%. |
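The two-step term selection described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`tf_df_scores`, `information_gain`, `select_terms`), the pre-tokenized input format, and the presence/absence split used to compute information gain are all hypothetical choices, and the rule-learning stage is left out.

```python
import math
from collections import Counter


def tf_df_scores(docs):
    """Score each term by (total term frequency) * (document frequency).

    `docs` is a list of token lists (assumed input format).
    """
    tf = Counter()
    df = Counter()
    for tokens in docs:
        tf.update(tokens)        # count every occurrence
        df.update(set(tokens))   # count each document once per term
    return {term: tf[term] * df[term] for term in tf}


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """Entropy reduction from splitting the pages on presence of `term`."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part)
                    for part in (present, absent) if part)
    return entropy(labels) - remainder


def select_terms(docs, labels, per_category, n_final):
    """Step 1: top TF*DF terms per category. Step 2: rank them by gain."""
    candidates = set()
    for category in set(labels):
        in_category = [d for d, y in zip(docs, labels) if y == category]
        scores = tf_df_scores(in_category)
        top = sorted(scores, key=lambda t: (-scores[t], t))[:per_category]
        candidates.update(top)
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(candidates,
                    key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:n_final]
```

A term that appears in every page scores well on the frequency product but has zero information gain, which is why the second step is needed to filter the first step's candidates.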
Keyword |
Web Page Classification
Ontology
Information Gain
Machine Learning
|