Multi-agent Deep Reinforcement Learning for Autonomous Driving

Hongsuk Yi (이홍석); Eunsoo Park (박은수); Seungil Kim (김승일)

KIISE Transactions on Computing Practices (정보과학회 컴퓨팅의 실제 논문지)

Korean Title	자율주행을 위한 멀티에이전트 심화 강화학습
English Title	Multi-agent Deep Reinforcement Learning for Autonomous Driving
Author	Hongsuk Yi (이홍석), Eunsoo Park (박은수), Seungil Kim (김승일)
Citation	Vol. 24, No. 12, pp. 670-674 (December 2018)
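Korean Abstract	Autonomous driving is a multi-agent problem that requires sophisticated situational judgment when changing lanes, overtaking, or yielding on the road. To control the continuous actions of autonomous vehicles, this paper applies the deep deterministic policy gradient (DDPG) reinforcement learning algorithm. To this end, a road environment in which lane changes occur frequently was implemented as a simulator. The reward was designed so that an individual vehicle receives a high reward when it arrives at its destination lane, but receives a penalty when it arrives at a different destination lane or when vehicles collide. Training 16 multi-agent vehicles showed that lane changes could be controlled better as more training time was allowed. However, owing to the limitations of deep reinforcement learning and of the simulator environment, the reward value dropped sharply during training, and as a result the vehicles drove very unstably.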
English Abstract	Autonomous driving is a multi-agent problem in which the host vehicle must apply sophisticated, human-like negotiation skills with other drivers on the road when overtaking or giving way. In this paper we apply deep reinforcement learning to the problem of forming long-term driving strategies. More specifically, we use the deep deterministic policy gradient (DDPG) algorithm, an actor-critic method. A reward function that promotes longitudinal velocity while penalizing transverse velocity and divergence from the track center is used to train multiple agents. The actor-critic algorithm was trained and evaluated in a synthetic environment. The results indicate that our deep reinforcement learning approach can generalize and adapt well to weaving sections on real roads.
Keyword	Artificial intelligence; autonomous driving; deep reinforcement learning; multi-agent
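The two abstracts describe the reward only in words, so the following is a minimal Python sketch of how such a reward might be written, not the authors' implementation. The function names, argument names, reward magnitudes, and the `center_weight` parameter are all assumptions introduced for illustration; the record gives no numeric values. `terminal_reward` follows the Korean abstract (reward for reaching the destination lane, penalty for a wrong lane or a collision), and `shaping_reward` follows the English abstract (reward longitudinal velocity, penalize transverse velocity and distance from the track center).

```python
import math

# Illustrative magnitudes only; the paper's actual values are not given here.
GOAL_REWARD = 1.0
WRONG_LANE_PENALTY = -0.5
COLLISION_PENALTY = -1.0


def terminal_reward(arrived_lane, target_lane, collided):
    """Event-based reward: collision and wrong-lane arrival are penalized,
    arrival at the destination lane is rewarded."""
    if collided:
        return COLLISION_PENALTY
    if arrived_lane is None:
        return 0.0  # vehicle has not reached any exit lane yet
    if arrived_lane == target_lane:
        return GOAL_REWARD
    return WRONG_LANE_PENALTY


def shaping_reward(speed, heading_error, center_offset, center_weight=1.0):
    """Per-step reward: longitudinal speed minus transverse speed minus a
    penalty for distance from the track center.

    speed          vehicle speed (any consistent unit)
    heading_error  angle in radians between heading and track direction
    center_offset  signed distance from the track center
    center_weight  hypothetical weight on the centering penalty
    """
    longitudinal = speed * math.cos(heading_error)
    transverse = abs(speed * math.sin(heading_error))
    return longitudinal - transverse - center_weight * abs(center_offset)
```

In a multi-agent setting such as the 16-vehicle experiment mentioned above, one plausible arrangement is to evaluate these functions once per vehicle per step, so that each agent receives its own reward signal.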