Journal of KIISE
Korean Title |
Design of Hadoop-Based Large-Scale Task Dispatching and Processing Technology |
English Title |
Design of a Large-scale Task Dispatching & Processing System based on Hadoop |
Author |
Jik-Soo Kim
Nguyen Cao
Seoyoung Kim
Soonwook Hwang
|
Citation |
Vol. 43, No. 6, pp. 613-620 (June 2016) |
Korean Abstract |
This paper describes the MOHA (Many-Task Computing on Hadoop) framework, which applies Many-Task Computing (MTC) technology for high-performance processing of large numbers of tasks to Hadoop, an existing Big Data processing platform. Specifically, we present the basic concepts of MOHA, the motivation for its development, and the results of a Proof-of-Concept (PoC) based on a distributed job queue, and we discuss future research directions. In MTC applications, the I/O throughput required by each task is relatively low, but a very large number of tasks must be processed simultaneously at high performance, and these tasks communicate through files. MTC applications can therefore be regarded as a different pattern of data-intensive workload from existing Hadoop applications, which are based on relatively large data block sizes. Through this convergence of MTC and Big Data technologies, MOHA can contribute to the Hadoop ecosystem, which is evolving into a multi-application platform, as a new framework capable of running large-scale computational science applications. |
English Abstract |
This paper presents MOHA (Many-Task Computing on Hadoop), a framework that applies Many-Task Computing (MTC) technologies, originally developed for high-performance processing of very large numbers of tasks, to the existing Big Data processing platform Hadoop. We present the basic concepts and motivation of MOHA, preliminary results from a Proof-of-Concept (PoC) based on a distributed message queue, and future research directions. Each task in an MTC application may have relatively low I/O requirements, but a very large number of tasks must be processed efficiently, with potentially heavy file-based communication among them. MTC applications therefore exhibit a different pattern of data-intensive workload from existing Hadoop applications, which are typically based on relatively large data block sizes. By effectively converging MTC and Big Data technologies, MOHA can be added as a new framework to the Hadoop ecosystem, which is evolving into a multi-application platform, enabling it to run large-scale scientific applications. |
Keyword |
Many-Task Computing
Hadoop
Big Data platform
multi-level scheduling
MOHA
|