정보과학회 컴퓨팅의 실제 논문지 (KIISE Transactions on Computing Practices)
Korean Title |
DramaQA: 계층적 질의응답과 함께하는 등장인물 중심 비디오 스토리 이해 |
English Title |
DramaQA: Character-Centered Video Story Understanding with Hierarchical QA |
Author |
이진우
원정임
윤지희
JinWoo Lee
Jung-Im Won
JeeHee Yoon
최성호
온경운
허유정
장유원
서아정
이승찬
이민수
장병탁
Seongho Choi
Kyoung-Woon On
Yu-Jung Heo
Youwon Jang
Ahjeong Seo
Seungchan Lee
Minsu Lee
Byoung-Tak Zhang
|
Citation |
Vol. 27, No. 1, pp. 1-7 (2021. 01) |
Korean Abstract (English translation) |
This paper proposes DramaQA, a new video question answering dataset for the comprehensive understanding of video stories. The DramaQA dataset aims to provide 1) a hierarchical question answering dataset that serves as an evaluation metric for artificial intelligence systems, based on the cognitive developmental stages of human intelligence, and 2) character-centered video annotations for modeling the local coherence of a story. The DramaQA dataset was built from the TV drama "Another Miss Oh" (또 오해영) and contains 17,983 question-answer pairs, each belonging to one of four difficulty levels, drawn from 23,928 videos of various lengths. The dataset provides 217,308 images with character-centered visual annotations, along with coreference-resolved scripts. In addition, we propose a Dual Matching Multistream model for effectively learning character-centered representations for video question answering, and apply it to the DramaQA dataset to present a character-centered approach to video story understanding.
|
English Abstract |
In this paper, we propose DramaQA, a novel video question answering (Video QA) task for obtaining a comprehensive understanding of a video story. DramaQA focuses on two perspectives: 1) hierarchical QAs as an evaluation metric based on the cognitive developmental stages of human intelligence, and 2) character-centered video annotations to model the local coherence of the story. Our dataset is built upon the TV drama "Another Miss Oh" and contains 16,191 QA pairs from 23,928 video clips of various lengths, with each QA pair belonging to one of four difficulty levels. We provide a total of 217,308 annotated images with rich character-centered visual annotations and coreference-resolved scripts. In addition, we provide analyses of the dataset as well as a Dual Matching Multistream model that effectively learns character-centered representations of the video to answer questions about it.
|
Keyword |
차세대 시퀀싱
변이 분석
Genome Variant Call Format(GVCF) 파일 소트/머지
스파크
분산병렬처리
next-generation sequencing (NGS)
variant analysis
Genome Variant Call Format(GVCF) File Sort/Merge
Spark
parallel/distributed computing
비디오 질의응답
비디오 스토리 이해
질의응답 평가지표
등장인물 중심 비디오 주석
video question and answering
video story understanding
evaluation metric for QA
character-centered video annotation
|