===============================================================
     Announcement: 7th Meeting of the JSAI Special Interest
     Group on AI Challenges (SIG-Challenge)
===============================================================

Theme: "New Developments in CASA (Computational Auditory Scene
       Analysis)"

Date:  Tuesday, November 2, 1999, 9:00-17:30

Venue: Conference Room, Research Institute Building, Aoyama Gakuin
       University (Shibuya-ku, Tokyo), next to the university's main
       gate; a 7-minute walk from Omotesando subway station and a
       15-minute walk from Shibuya Station.

Overview: Research on CASA, which took off after the problems posed
in Prof. Bregman's 1990 book "Auditory Scene Analysis" (MIT Press),
continues to develop, driven by a decade of accumulated knowledge
about hearing and by the shift toward a multimedia society. To gauge
where CASA is heading, the 7th SIG-Challenge meeting features eight
research presentations from a variety of perspectives, together with
a keynote address by Prof. Bregman. We cordially invite everyone to
attend.

Note: Prof. Bregman's keynote is a special lecture co-sponsored by
the Acoustical Society of Japan. Its title and abstract are attached
below.

Inquiries (by e-mail, please):
  Hiroshi G. Okuno
  Science University of Tokyo & JST Kitano Symbiotic Systems Project
  E-mail: okuno@nue.org

Admission:   free
Proceedings: 2,000 yen (only if you want a copy)

==================================================

=======================================================================
Meeting of JSAI Special Interest Group on AI Challenges (SIG-Challenge)
=======================================================================

Theme: ``Computational Auditory Scene Analysis (CASA)''

09:00-10:15  Opening Address and Keynote Address

  09:00-09:15  Opening Address
               Hiroshi G. Okuno (JST/Science Univ. of Tokyo)

  09:15-10:15  Keynote Address: Auditory Scene Analysis by Humans
               and by Computers
               Albert S. Bregman (Department of Psychology,
               McGill University)

                 -- break --

10:30-12:00  Session 1

  Vowel Segregation in Background Noise using the Model of
  Segregating Two Acoustic Sources
    Masashi Unoki (ATR HIP/JAIST) and Masato Akagi (JAIST)

  A Method of Blind Separation for Convolved Speech Signals
    Mitsuru Kawamoto (RIKEN), Kiyotoshi Matsuoka (Kyushu Inst. Tech.),
    and Noboru Ohnishi (Nagoya Univ.)

  Blind Signal Separation Using Directivity Pattern
    Satoshi Kurita*, Hiroshi Saruwatari*, Shoji Kajita**,
    Kazuya Takeda*, and Fumitada Itakura**
      *  Graduate School of Engineering, Nagoya University
      ** Center for Information Media Studies, Nagoya University

                 -- lunch --

13:15-14:45  Session 2

  Search for Auditorily Meaningful Parts using STRAIGHT
    Parham Zolfaghari (CREST/ATR-HIP) and
    Hideki Kawahara (Wakayama Univ./CREST/ATR-HIP)

  An Auditory Strategy for Separating Size and Shape Information
  of Sound Sources
    Toshio Irino (ATR HIP) and Roy D. Patterson (CNBH, Cambridge Univ.)

  Speech Recognition Based on Space Diversity Taking Room Acoustics
  into Account
    Yasuhiro Shimizu*, Shoji Kajita**, Kazuya Takeda*, and
    Fumitada Itakura**
      *  Graduate School of Engineering, Nagoya University
      ** Center for Information Media Studies, Nagoya University

                 -- break --

15:00-16:30  Session 3

  Music Scene Description: A Predominant-F0 Estimation Method for
  Detecting Melody and Bass Lines
    Masataka Goto (Electrotechnical Laboratory)

  A Method of Peak Extraction and Its Evaluation for Humanoid
    Kazuhiro Nakadai (JST ERATO), Hiroshi G. Okuno (JST/Science Univ.
    of Tokyo), and Hiroaki Kitano (JST/Sony CSL)

  Research Issues of Humanoid Audition
    Hiroshi G. Okuno (JST/Science Univ. of Tokyo), Kazuhiro Nakadai
    (JST), and Hiroaki Kitano (JST/Sony CSL)

16:45-17:30  Discussion and Closing Address

  Moderated Discussion: Is CASA-related Research Needed for
  Engineering and Psychology?
    Moderator: Kunio Kashino (NTT Communication Science Laboratories)
    Leading Discussant: Albert S. Bregman (Department of Psychology,
                        McGill University)

  Closing Address
    Hiroshi G. Okuno (JST/Science Univ. of Tokyo)

==================================================

TITLE: Auditory Scene Analysis by Humans and by Computers
ABSTRACT: The paper will present data collected on human auditory
scene analysis (ASA) -- the process of organizing the auditory input
derived from a mixture of sounds into representations of the
individual acoustic sources that contributed to the mixture. Then
recent attempts to achieve ASA by computers -- computational auditory
scene analysis (CASA) -- will be discussed in the light of human
data. The data include the major known cues for the parsing of
auditory sense data and, more importantly, the properties of the ASA
system. The behavior of this system can be described by a number of
statements:

(1) Cues compete with and support one another, probably in
    non-additive ways, in determining the parsing of a signal into
    streams.
(2) Constraints on the parsing may be propagated from one part of
    the frequency-by-time field to other parts.
(3) Grouping tendencies may be subject to consistency requirements.
(4) Biases toward finding a stream of sound with certain properties
    may build up with the accumulation of evidence and then dissipate
    over time.
(5) Sudden changes in the acoustic input play a central role in the
    allocation of computational resources to parts of the signal.
(6) Bottom-up processes of a fairly primitive form interact with
    top-down processes that involve complex schemas, making it
    necessary for a model to provide an interface between the two.

Acoustic demonstrations of these principles will be played and
discussed.

--------------------------------------------------
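For readers who would like to try the kind of acoustic demonstration
mentioned in the abstract, the sketch below synthesizes the classic
ABA "galloping" streaming stimulus, in which the frequency separation
between tones A and B decides whether listeners hear one stream or
two -- a concrete instance of the cue-driven grouping in statement
(1). This is a minimal illustration added to this summary, not
material from the talk; the sampling rate, tone durations, and
frequencies are arbitrary choices.

    # Minimal sketch of the ABA streaming demonstration (assumed
    # parameters; not from the announcement). Requires only numpy
    # and the standard-library wave module.
    import numpy as np
    import wave

    RATE = 16000      # sampling rate in Hz
    TONE_DUR = 0.1    # duration of each tone in seconds
    GAP_DUR = 0.1     # silence where the second B of a gallop would be

    def tone(freq, dur):
        """Sine tone with short raised-cosine ramps to avoid clicks."""
        t = np.arange(int(RATE * dur)) / RATE
        x = np.sin(2 * np.pi * freq * t)
        ramp = int(0.005 * RATE)
        env = np.ones_like(x)
        env[:ramp] = 0.5 * (1 - np.cos(np.pi * np.arange(ramp) / ramp))
        env[-ramp:] = env[:ramp][::-1]
        return x * env

    def aba_sequence(freq_a, freq_b, triplets=10):
        """Repeating A-B-A-(silence) triplets, the standard stimulus."""
        silence = np.zeros(int(RATE * GAP_DUR))
        unit = np.concatenate([tone(freq_a, TONE_DUR),
                               tone(freq_b, TONE_DUR),
                               tone(freq_a, TONE_DUR),
                               silence])
        return np.tile(unit, triplets)

    def write_wav(path, signal):
        """Write a mono 16-bit WAV file."""
        data = (0.8 * signal * 32767).astype(np.int16)
        with wave.open(path, "wb") as f:
            f.setnchannels(1)
            f.setsampwidth(2)
            f.setframerate(RATE)
            f.writeframes(data.tobytes())

    # Small frequency separation: typically heard as one galloping stream.
    write_wav("aba_fused.wav", aba_sequence(500, 550))
    # Large separation: A and B segregate into two parallel streams.
    write_wav("aba_split.wav", aba_sequence(500, 2000))

Playing the two files back to back makes the perceptual flip easy to
hear: the rhythm itself does not change, only the frequency-proximity
cue that governs how the tones are grouped into streams.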