46th Meeting of the AI Challenge Special Interest Group

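Date: Wednesday, November 9, 2016, 10:00-17:15
Venue: Large Conference Room, Raiosha Building, Hiyoshi Campus, Keio University
( http://www.keio.ac.jp/ja/access/hiyoshi.html building no. 10 )
Theme: "Robot Audition"
Outdoor environment understanding, sound understanding in extreme environments, bioacoustic understanding, deployment of robot audition in real environments such as everyday settings and RoboCup,
robot audition and auditory environment understanding (auditory scene analysis),
acoustic technology, speech processing, dialogue processing, and music robots for robot audition functions,
perception and understanding of sound in general, not limited to speech
Participation fee and proceedings: free
As before, the proceedings of this meeting will be distributed on USB memory on the day and will also be published on this website.
Organizers:
Kazuhiro Nakadai (Honda Research Institute Japan Co., Ltd. / Tokyo Institute of Technology)
Makoto Kumon (Kumamoto University)
Keisuke Nakamura (Honda Research Institute Japan Co., Ltd.)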

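10:00-11:00 Invited talk: Unsupervised Lexical Acquisition from Speech Data Based on Nonparametric Bayes and Deep Learning
~ A Symbol Emergence in Robotics Approach to Intelligence and Language ~

○ Tadahiro Taniguchi (Ritsumeikan University)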

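11:00-11:25 Acoustic Model Training Based on a Bounded-Weight Model for Quantized Deep Neural Networks

○ Ryu Takeda (Osaka University), Kazuhiro Nakadai (Honda RI), Kazunori Komatani (Osaka University)

This study aims to realize neural networks with quantized parameters in order to reduce the memory footprint and increase the speed of neural-network-based acoustic models. To suppress quantization error, this paper trains the network with a bounded-weight model that normalizes the weight parameters per node rather than per layer. Experiments confirmed that the weight parameters can be quantized down to 2 bits while maintaining recognition accuracy.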

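11:25-11:50 Training Sound Source Identifiers Using a Partially Shared DNN

○ Takayuki Morito (Tokyo Institute of Technology), Osamu Sugiyama (Kyoto University), Ryosuke Kojima (Tokyo Institute of Technology), Kazuhiro Nakadai (Tokyo Institute of Technology / HRI-JP)

We propose a partially shared DNN as a method for training sound source identifiers used to search for people in need of rescue in disaster areas.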

11:50-13:00 Lunch break

13:00-14:40 [Joint session] Excellence Award talks (Symposium Space)

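14:40-15:05 Analysis of the Directional Distribution of Songs Using a Microphone Array in Playback Experiments on Japanese Bush Warblers

○ Shinji Sumitani (Nagoya University), Shiho Matsubayashi (Nagoya University), Reiji Suzuki (Nagoya University)

As a study of the feasibility of observing wild bird ecology with a microphone array, we report an analysis of the directional distribution of songs in playback experiments on the Japanese bush warbler.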

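15:05-15:30 Bird Song Analysis Using Spatial Information

○ Ryosuke Kojima (Tokyo Institute of Technology), Osamu Sugiyama (Tokyo Institute of Technology), Kotaro Hoshiba (Tokyo Institute of Technology), Reiji Suzuki (Nagoya University), Kazuhiro Nakadai (Tokyo Institute of Technology)

To enable the analysis of bird songs using a microphone array, we propose a model that takes the spatial relationships among birds into account and evaluate it on real data.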

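15:30-15:55 Parallel Optimization of Sound Event Detection and Classification in High-Noise Environments Using a UAV-Mounted Microphone Array

○ Osamu Sugiyama (Kyoto University), Ryosuke Kojima (Tokyo Institute of Technology), Kazuhiro Nakadai (Honda RI / Tokyo Institute of Technology)

A microphone array mounted on an unmanned aerial vehicle (UAV) is constantly exposed to a high-noise environment because the noise-generating rotors are close by. This paper discusses the issues specific to sound source detection and sound source identification with such a UAV-mounted microphone array and proposes a parallel optimization method to address them.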

15:55-16:00 break

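16:00-16:25 Estimation of Discourse Functions Using Linguistic Information and Its Application to Robot Head Motion Generation

○ Chaoran Liu (ATR/HIL), Carlos Ishi (ATR/HIL), Hiroshi Ishiguro (ATR/HIL)

In this paper, we attempt to estimate discourse functions such as utterance phrase boundaries and backchannels from spoken utterances. We also evaluate robot head motion generation based on the estimation results.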

16:25-16:50 Sequential Deep Learning for generating dancing motion

○ Nelson Yalta (Waseda University), Kazuhiro Nakadai (Honda Research Institute Japan) and Tetsuya Ogata (Waseda University)

Dance is a human social activity through which people can share emotion, culture, or entertainment. In recent years, robots have entered our lives in many activities, such as manufacturing processes and social activities like dance. In dancing, robots have shown good performance in following the rhythm of music using beat-tracking algorithms. In this work, we show an implementation of a deep-learning-based sequential learning model for dancing robots. The model is trained without intermediate states, mixing audio information with motion captured from a person, and it can substitute for a beat-tracking algorithm. The model generates quasi-realistic dance motion patterns without constraints from the music information or its start position, while following the rhythm of music that contains multiple sound sources.

16:50-17:15 Using utterance timing to generate gaze pattern

○ Jani Even (ATR/HIL), Carlos Ishi (ATR/HIL) and Hiroshi Ishiguro (ATR/HIL)

This paper presents a method for generating the gaze pattern of a robot while it is talking. The goal is to prevent the robot's conversational partner from interrupting the robot at inappropriate moments. The proposed approach has two steps. First, the robot's utterances are split into meaningful parts. Then, for each of these parts, the robot makes or avoids eye contact with the partner. The generated gaze pattern indicates to the conversational partner whether or not the robot has finished talking. The effectiveness of the approach is illustrated by measuring the reduction in speech overlap during conversations when the proposed gaze pattern is used.

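Date of public disclosure

The date of public disclosure is 2016/11/9.

Links

Japanese Society for Artificial Intelligence, AI Challenge Special Interest Group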