Database Research Society Journal (SIGDB)

Korean Title: 대규모 웹 문서의 실시간 자연어 처리를 위한 데이터 수집·저장 시스템 설계 및 구현
English Title: Design and Implementation of Data Collection and Storage System for Real-Time Natural Language Processing of Large-Scale Web Documents
Author: 현일성, 윤재연, 최병서, 이익훈, 이상구 (Richeg Xuan, Jaeyeun Yoon, Byeongseo Choe, Ig-hoon Lee, Sang-goo Lee)
Citation: Vol. 34, No. 2, pp. 59-73 (August 2018)
Korean Abstract
In the big data era, collecting, storing, and processing data is the most basic and also the most essential task in building and using big data systems. Internet text data is a representative form of big data, and demand for large-volume text collection and processing, and for natural language processing, keeps growing. This paper designs and implements a system that collects and stores text data from large-scale web documents. For data collection, we propose a design that can gather text data from a variety of websites that provide no API, along with a parallelization method for collecting data quickly and efficiently. The storage system can be applied to a variety of natural language processing modules and, to support real-time natural language processing, uses an in-memory database management system, which improves execution speed. Experiments that collect and process large-scale text data from real web documents confirm the effectiveness of the system.
English Abstract
In the big data era, collecting and processing data is the most fundamental and central task in building and using big data systems. Internet text is one of the most representative kinds of big data, and demand for collecting such data and applying natural language processing to it is steadily increasing. In this paper, we propose a system for collecting and storing text data from large-scale web documents. The proposed collection system can gather data from a variety of websites that provide no API, and it uses several parallelization methods to collect massive text data quickly and efficiently. The proposed storage system can be applied to various natural language processing modules, and it uses an in-memory DBMS to improve execution speed in support of real-time natural language processing. We verify the validity of the proposed system through experiments that collect large-scale text data from real web documents.
Keyword: Big data, Natural language processing, Data crawling, Real-time processing
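
The abstract describes two parts: parallel collection of text from websites that offer no API, and storage in an in-memory DBMS. The record does not name the tools the authors used, so the following is only a minimal sketch of that general pattern under stated assumptions: a thread pool stands in for the paper's parallelization methods, Python's built-in `sqlite3` with a `:memory:` database stands in for the in-memory DBMS, and every name (`TextExtractor`, `extract_text`, `fetch_html`, `collect`, `store`, the `documents` table) is hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the visible text of an HTML page (no site API required)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def fetch_html(url, timeout=10):
    """Download one page and decode it using the declared charset, if any."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def collect(urls, fetch=fetch_html, workers=8):
    """Fetch and parse pages concurrently; results keep the order of `urls`.

    Assumption: a thread pool is used here only as one simple way to
    parallelize I/O-bound downloads. A failed fetch raises when its
    result is read, so real use would need error handling.
    """
    def job(url):
        return url, extract_text(fetch(url))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(job, urls))


def store(rows):
    """Load (url, body) rows into an in-memory SQLite database.

    Assumption: SQLite's ":memory:" mode is a stand-in for whichever
    in-memory DBMS the paper actually uses.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (url TEXT PRIMARY KEY, body TEXT)")
    conn.executemany("INSERT OR REPLACE INTO documents VALUES (?, ?)", rows)
    conn.commit()
    return conn
```

A downstream natural language processing module could then read text with an ordinary query such as `SELECT body FROM documents WHERE url = ?`. Passing a custom `fetch` function to `collect` makes it possible to try the pipeline on local HTML strings without touching the network.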