IEEE Robotics and Automation Letters (RAL) paper presented at the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, November 4-8, 2019

Decoding the Perceived Difficulty of Communicated Contents by Older People: Toward Conversational Robot-Assistive Elderly Care

Soheil Keshmiri¹ and Hidenobu Sumioka¹ and Ryuji Yamazaki² and Hiroshi Ishiguro¹,³

*This research was supported by JST CREST Grant Number JPMJCR18A1, JSPS KAKENHI Grant Number JP19K20746, and ImPACT Grant Number 2014-PM11-07-01.
¹Soheil Keshmiri and Hidenobu Sumioka are with Advanced Telecommunications Research Institute International (ATR), Kyoto, Japan; Hiroshi Ishiguro is with the Graduate School of Engineering Science, Osaka University, Japan. {soheil,sumioka}@atr.jp
²Ryuji Yamazaki is with the School of Social Sciences, Waseda University, Japan. rys@aoni.waseda.jp
³Hiroshi Ishiguro is with the Graduate School of Engineering Science, Osaka University, Japan, and is the Visiting Director of Hiroshi Ishiguro Laboratories (HIL) at ATR. ishiguro@sys.es.osaka-u.ac.jp
Copyright 2019 IEEE

Abstract: In this study, we propose a semi-supervised learning model for decoding the perceived difficulty of communicated content by older people. Our model is based on mapping older people's prefrontal cortex (PFC) activity during their verbal communication onto fine-grained cluster spaces of a working memory (WM) task that induces loads on the human PFC through modulation of its difficulty level. This allows for differential quantification of the observed changes in the pattern of PFC activation during verbal communication with respect to the difficulty level of the WM task. We show that such a quantification establishes a reliable basis for categorization and, subsequently, learning of the PFC responses to more naturalistic contents such as story comprehension. Our contribution is to present evidence on the effectiveness of our method for estimating older people's perceived difficulty of the communicated contents during an online storytelling scenario.

I. INTRODUCTION

A distinct attribute of robots in comparison with other media is their physical embodiment, which allows for a sense of togetherness [1]. Research suggests that children who read with a learning-companion robot consider their reading companion to support their reading comprehension and that it motivates a deepening social connection [2]. Along the same direction, Mann et al. [3] find that people are more responsive to robots than to computer-based healthcare systems. Additionally, Keshmiri et al. [4] identify that telecommunicating through a humanoid results in older people's brains exhibiting an activation pattern similar to that of in-person communication.

These findings unanimously identify the potential of robots for improving the accessibility, consistency, and quality of our public and medical care services. At the same time, they also imply the necessity for increased social interaction ability of robots [5] if we are to harness their potential and positive impact on our social lives in earnest. After all, social interaction is a bidirectional communication channel, and interactive media that can comprehend their human companions' expectations and respond accordingly are the minimum requirement if such interactions and relationships are to last long.

However, enabling robots to interact with humans is a complex task, and even more so when it comes to humans' verbal communication: a conversation that resonates with one person may not sound the same to another, people lose their attention at different paces, and individuals perceive the difficulty of a topic in their own ways. Despite substantial advances in facial feature analysis [6], facial expressions may not be as informative in the case of verbal communication. For instance, a frowning face while listening to a conversation might signal attention or difficulty in following a statement rather than discomfort or anger. Such contextual effects during verbal communication are highly subjective (i.e., they vary from individual to individual) and internalized: they may not be immediately available through conventional responses such as facial expression.

The brain, as the basis of behavioural responses, can help alleviate some of these shortcomings. In particular, a brain-based approach to human-robot interaction is well-suited for verbal communication, in which robotic media need to track the complexity of the conversational topic as perceived by their human companions in order to sustain the interaction through modulation of the communicated content. Such an ability can be especially helpful when these agents interact with individuals who struggle with expressing themselves (e.g., overstressed or shy persons and individuals with such conditions as selective mutism).

In this article, we aim at online estimation of older people's perceived difficulty of communicated contents during verbal communication, based on the pattern of their prefrontal cortex (PFC) activation. We focus on storytelling as a first step toward decoding conversational communication, since stories' scripts can be kept intact and repeated to different individuals without any change in their contents, thereby allowing for the control of such confounders as subtle differences in conveyed information. In this context, the core issue is how to evaluate an individual's perceived difficulty of a verbally communicated content, considering the lack of an objective quantification for such perceptions. Here, we hypothesize that the perceived difficulty of a verbal communication is reflected in the cognitive load that a person experiences. In cognitive psychology, cognitive load refers to the effort that is endured by the working memory (WM): the core component of human cognition that includes language comprehension [7]. Previous studies have formulated such simple WM tasks as mental arithmetic (MA) [8] and n-back [9] to quantitatively evaluate the level of cognitive load. Furthermore, functional imaging has provided considerable evidence that the neural correlates of the WM process reside in the PFC [8], [9], [10].

We propose to evaluate the perceived difficulty of communicated contents during verbal communication via cognitive loads that are estimated based on brain activities during simple WM tasks. Specifically, we first organize cluster spaces that are formed through application of the K-means algorithm [11] on the near-infrared spectroscopy (NIRS) time series of older people's PFC activity in response to the cognitive load induced by an n-back (n = 1, 2) auditory task (referred to as NBT hereafter). In this task, participants are required to recall the reoccurrences of sequential (i.e., n = 1) or every-other (i.e., n = 2) occurrence of numerical values (1 through 9). We use NBT since it forms a better basis for quantification of the verbally communicated contents, considering its effect on the PFC [12] and its ability to identify the change in PFC activation in response to individuals' emotions and change in mood [10]. Next, we map older people's PFC activity during an easy/hard listening task (referred to as EHL hereafter) onto the NBT clusters. This mapping serves as a refinement that allows for objective quantification of the brain activation during verbal communication based on well-defined clusters of the n-back, thereby including the PFC information during EHL that is not available in a pure WM task setting. EHL is designed to induce different levels of cognitive load on older people's PFC by modulating its communicated information. This mapping process results in quantification of the frontal activities during EHL according to their proximity to the NBT clusters' centroids (i.e., their centers): a process referred to as cross-labeling (e.g., label 1 if PFC activity is closer to the n = 1 cluster, or 2 otherwise). Last, we use these cross-labeled PFC activities to train a linear supervised classifier for decoding older people's PFC responses to online communicated topics.

We show that our method can capture the cognitive load of older people during a natural storytelling scenario and that its estimation is associated with the older people's self-assessment of the difficulty of the story. Our contribution to human-robot interaction is to form the first (to the best of our knowledge) preliminary step toward conversational robot-assistive elderly care by enabling these media to predict the difficulty of their verbally communicated content (e.g., while telling a story in an elderly care facility) as perceived by the older people.

II. METHODOLOGY

Figure 1 shows an overview of our method. It consists of five steps: A) feature extraction, i.e., calculating the information content of the brain activity, B) cluster formation using the participants' PFC activity in response to the cognitive load induced by NBT, C) NBT-based cross-labeling of older people's PFC activities during EHL, which involves labeling them based on their proximity to the NBT clusters' centroids (e.g., label 1 if PFC activity is closer to the n = 1 cluster, or 2 otherwise), D) training a linear supervised model with cross-labeled EHL data, and E) online estimation of the perceived difficulty of stories. In the following sections, we explain each step in detail.

Fig. 1. Model's schematic diagram. (A) DE feature vectors for PFC activity of the participants in response to one- and two-back WM tasks were calculated using equation (1). (B) These feature vectors were used to form clusters C1 and C2 through application of the K-means algorithm [11]. (C) C1 and C2 clusters were utilized for labeling the DE feature vectors of EHL time series of PFC activity of human subjects by mapping these vectors onto C1 and C2 based on their L2-norm (i.e., Euclidean distance) to the centroids of C1 and C2 (i.e., their respective centers), resulting in formation of EHL-based clusters L (short for lower cognitive load) and H (short for higher cognitive load). (D) This cross-labeled data was further used for training a linear supervised classifier [13] for online classification of PFC activity of older people in response to communicated contents. During training, 80.0% of EHL was used as training data. We used the remaining EHL data for cross-validation (CV). (E) The trained linear supervised model was used for online estimation of the perceived difficulty of communicated contents by older people during conversation. (F) Once the session was over, the model counted the number of DE feature vectors that were classified as members of the L or H clusters. Subsequently, it labeled the session as difficult/easy if the number of DE feature vectors assigned to H/L during the session was larger than the number assigned to L/H, thereby returning this count along with the average of the L2-norms of the DE feature vectors of the selected cluster.

A. Choice of Feature Space

Previous results [13], [14] indicated that differential entropy (DE) (i.e., the average information content of a continuous random variable) significantly outperforms feature spaces that are predominantly used for classification of f/NIRS time series of human subjects' PFC activity. Due to these results, we used a linear estimate of DE for extracting features of the PFC activity.

B. Clustering of NIRS Time Series of n-back WM Task

Figure 1 (A) and (B) show this process. We formed our n-back WM cluster spaces through application of the K-means algorithm [11] with two centroids on DE feature vectors of every five-second-long NIRS time series of PFC activity during one- and two-back WM tasks. This resulted in formation of two clusters (i.e., C1 and C2, Figure 1 (B)). We computed a DE feature vector (i.e., V in Figure 1 (A)) for a given n-back NIRS time series of PFC activity of each participant as [15]:

$$H(X_j) = \frac{1}{2} \log\left(2 \pi e \, \sigma_{X_j}^2\right) \qquad (1)$$

where $\sigma_{X_j}^2$ is the variance of the $j$-th non-overlapping segment $X_j$ of the entire time series $X$ of the participant's PFC activity. It is worth noting that the interpretation of C1 and C2 as representatives of PFC activation in response to easy/difficult communicated contents finds evidence in differential PFC activation in response to one- and two-back WM tasks [9]. In this study, we used data from [13] that pertained to twenty-eight adults' frontal activities (eleven males and seventeen females, M = 30.96, SD = 10.84) who performed one- and two-back tasks.

C. NBT-Based Cross-Labeling of EHL

Figure 1 (C) illustrates this step. We mapped DE feature vectors of participants' NIRS PFC activity during EHL onto the C1 and C2 cluster spaces based on their L2-norm distances (i.e., Euclidean distances) to the centroids of these clusters. We labeled these vectors as easy (i.e., 1) if they were closer to C1's center or difficult (i.e., 2) if they were closer to C2's center. This resulted in formation of clusters L (short for lower cognitive load) and H (short for higher cognitive load) that excluded NBT and were solely based on EHL. As a result, the labeling of EHL with respect to NBT established a correspondence between PFC activity in response to verbal communication and the clusters of NBT.

D. Training a Linear Supervised Model

Figure 1 (D) shows this process. We used 80.0% of the EHL cross-labeled data for training and the remaining 20.0% for cross-validation (CV) to train our linear supervised classifier. We used the linear supervised classifier in [13], which is based on a modified canonical linear regression. We chose this model due to its significantly improved accuracy in comparison with the classifiers predominantly adopted for NIRS-based n-back WM tasks in the literature.

During the training, we adopted a brute-force search that started with a single feature (i.e., a feature vector of length one) and went through ten (i.e., feature vectors of length ten). For each of these lengths, we also checked whether inclusion of polynomial degrees to capture the interaction between the elements of a given feature vector could improve the performance. Therefore, we checked polynomial degrees zero (i.e., no polynomial feature) through seven. We found that feature vectors of length four combined with a polynomial degree of two yielded the highest prediction accuracy. Therefore, we used length-four feature vectors with a polynomial degree of two.

E. Online Estimation of the Perceived Difficulty

We used our trained linear model for online estimation of the perceived difficulty of communicated contents by older people during the storytelling experiment. At every prediction cycle (i.e., every 20 seconds in the current implementation), our model summarized the current PFC activity time series of the older people into its calculated feature vector. Next, it utilized the trained linear model to estimate the correspondence between the feature vector of the current PFC activity and the two clusters. It then returned the magnitude of the induced difficulty of the communicated topic at that prediction cycle (i.e., the L2-norm of its feature vector) along with its estimated label (i.e., whether it was closer to L's or H's centroid). The older people's perceived difficulty was estimated based on the total number of DE feature vectors that were classified as members of the L or H clusters.

III. EXPERIMENTS

We conducted two experiments to verify the ability of our model to capture older people's perceived difficulty of verbal communication. In the first experiment, we verified that the model trained with the data recorded during the EHL task had the ability to classify the NBT. In the second, we verified that the trained model was capable of estimating the perceived difficulty during storytelling (i.e., STE). Considering the two-class labeling in our approach (i.e., L = 1 and H = 2), the chance-level accuracy was 50.0%.

All participants were free of neurological and psychiatric disorders and had no history of hearing impairment. Subjects were seated in an armchair with head support in a sound-attenuated testing chamber, with instructions to fully relax while keeping their eyes closed. All experiments were carried out with written informed consent from all subjects.

We used a minimalist-design humanoid called Telenoid (Figure 2 (b)) in our experiments. Motion of Telenoid was generated based on the voice of the operator, using an online speech-driven head motion system [16]. We placed Telenoid on a stand at a distance of approximately 1.4 meters from the seat of the participant (Figure 2 (a)).

Near-infrared spectroscopy (NIRS) [17] was used to collect PFC activity of the participants. We chose NIRS due to its non-invasive operational setup, portability, and relative immunity to body movement [18]. In our experiments, we acquired NIRS time series data of the participants using a [...]

Fig. 2. (a) Experimenter demonstrates the experimental setup. (b) Telenoid.

[...]ing a one-minute-long resting data that was then followed by its corresponding topic. We kept the communicated contents intact in all the sessions. However, we randomized the order of easy/difficult contents among participants. Every subject participated in all of these settings.

For model verification, we used the original labeling of NBT data for one- and two-back WM tasks (i.e., prior to K-means clustering) from [13]. This enabled us to determine whether the cognitive loads induced during NBT formed a proper basis for quantification of the cognitive demands on the PFC during verbal communication. We considered our model's prediction a true positive (tp) if its estimation and the NBT's original label were both 2 (i.e., difficult, Section II-C). Similarly, we considered it a true negative (tn) if the estimated and original labels were both 1 (i.e., easy, Section II-C). Otherwise, we considered the estimate a false positive (fp) (i.e., predicted label = 2 and original label = 1) or a false negative ([...]
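As a rough illustration of the feature extraction, cross-labeling, session summary, and verification counts described in Sections II and III, the following minimal Python sketch may help. It is not the authors' implementation. All function names are hypothetical. The cluster centroids are assumed to be already available from a separate K-means step, and the linear classifier of [13] is not reproduced. The use of population variance, the tie-breaking rule in the session summary, and the false-negative condition (predicted label 1, original label 2, the one remaining case) are assumptions, since the source text does not state them.

```python
import math
from statistics import pvariance


def de_feature_vector(series, n_segments):
    """DE of each non-overlapping segment, following equation (1).

    Assumes every segment has non-zero variance (log(0) is undefined);
    population variance is an assumption made for this sketch.
    """
    seg_len = len(series) // n_segments
    features = []
    for j in range(n_segments):
        segment = series[j * seg_len:(j + 1) * seg_len]
        variance = pvariance(segment)
        features.append(0.5 * math.log(2 * math.pi * math.e * variance))
    return features


def cross_label(feature_vector, centroid_c1, centroid_c2):
    """Label 1 (easy) if closer to C1's centroid, else 2 (difficult)."""
    d1 = math.dist(feature_vector, centroid_c1)  # Euclidean (L2) distance
    d2 = math.dist(feature_vector, centroid_c2)
    return 1 if d1 <= d2 else 2  # tie goes to label 1 (assumption)


def summarize_session(labels):
    """Label a session 'difficult' if H (2) outnumbers L (1), else 'easy'."""
    n_low = sum(1 for lab in labels if lab == 1)
    n_high = sum(1 for lab in labels if lab == 2)
    verdict = "difficult" if n_high > n_low else "easy"  # tie -> easy (assumption)
    return verdict, n_low, n_high


def confusion_counts(predicted, original):
    """Count tp/tn/fp/fn with label 2 (difficult) as the positive class."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for p, o in zip(predicted, original):
        if p == 2 and o == 2:
            counts["tp"] += 1
        elif p == 1 and o == 1:
            counts["tn"] += 1
        elif p == 2 and o == 1:
            counts["fp"] += 1
        elif p == 1 and o == 2:  # assumed definition of a false negative
            counts["fn"] += 1
    return counts
```

For example, `cross_label(de_feature_vector(signal, 4), c1, c2)` would assign one label to a window of a hypothetical signal `signal`, given length-four centroids `c1` and `c2`. Collecting such labels over a session and passing them to `summarize_session` mirrors step (F) of Fig. 1.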