Department of Informatics, Associate Professor

Last Updated: 2024/07/05

■Researcher basic information


  • Ph.D. (2009/03, Keio University)

Research Field

  • Informatics / Human interfaces and interactions / Human Computer Interface
  • Informatics / Biological, health, and medical informatics

■Research activity information


  • Mizuho Nishio; Takaaki Matsunaga; Hidetoshi Matsuo; Munenobu Nogami; Yasuhisa Kurata; Koji Fujimoto; Osamu Sugiyama; Toshiaki Akashi; Shigeki Aoki; Takamichi Murakami
    Informatics in Medicine Unlocked Elsevier BV 46 101465 - 101465 2352-9148 2024
  • Mizuho Nishio; Hidetoshi Matsuo; Yasuhisa Kurata; Osamu Sugiyama; Koji Fujimoto
    Cancers MDPI AG 15 (5) 1535 - 1535 2023/02 
    We aimed to develop and evaluate an automatic prediction system for grading histopathological images of prostate cancer. A total of 10,616 whole slide images (WSIs) of prostate tissue were used in this study. The WSIs from one institution (5160 WSIs) were used as the development set, while those from the other institution (5456 WSIs) were used as the unseen test set. Label distribution learning (LDL) was used to address a difference in label characteristics between the development and test sets. A combination of EfficientNet (a deep learning model) and LDL was utilized to develop an automatic prediction system. Quadratic weighted kappa (QWK) and accuracy in the test set were used as the evaluation metrics. The QWK and accuracy were compared between systems with and without LDL to evaluate the usefulness of LDL in system development. The QWK and accuracy were 0.364 and 0.407 in the systems with LDL and 0.240 and 0.247 in those without LDL, respectively. Thus, LDL improved the diagnostic performance of the automatic prediction system for the grading of histopathological images for cancer. By handling the difference in label characteristics using LDL, the diagnostic performance of the automatic prediction system could be improved for prostate cancer grading.
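  • Examination of the Effects of Introducing the Vital Data Terminal (VDT): VDT Usage Rate and a Questionnaire Survey of Nurses
    疋田 智子; 黒田 知宏; 杉山 治; 竹村 匡正
    Japan Journal of Medical Informatics Japan Association for Medical Informatics 42 (4) 161 - 171 0289-8055 2023/01
    Transcription of vital sign measurements by nurses is not only a workload burden but also carries the risk of transcription errors and prevents the data from being checked with other professions in real time. Meanwhile, vital sign measurement devices with near-field communication functions are becoming widespread, and Kyoto University Hospital built a Vital Data Terminal (VDT) system that uses such devices to collect data automatically. The system was introduced to 1,000 general ward beds when the hospital information system was updated in 2016. Because nurses also wear personal authentication tags, the system can identify who measured what, on whom, and at what time. This study aimed to verify whether the introduction of the system actually reduced the nursing workload, and a questionnaire survey was conducted. The VDT input rate, which had stayed in the 23% range in 2016, rose to around 45% by March 2021. In a questionnaire administered to 684 nurses who use the VDT, the responses to "Do you think the introduction of the VDT has led to a reduction in workload?" were "Strongly agree" from 94 respondents (19.6%) and "Somewhat agree" from 244 respondents (50.9%). Thus, 70.3% of staff answered that it led to a reduction in workload, suggesting that the introduction of the VDT achieved a certain degree of workload reduction. (Author's abstract)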
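  • Proposal of a Voice Input Interface for Automatic Structuring of Nursing Records
    西田 菜都子; Liu Chang; 杉山 治; 山本 豪志朗; 黒田 知宏
    Proceedings of the Joint Conference on Medical Informatics Japan Association for Medical Informatics 42nd 866 - 869 1347-8508 2022/11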
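  • Examination of the Minimum Microphone Performance Suitable for an Automatic Alarm Sound Detection System
    岸本 和昌; 竹村 匡正; 山本 豪志朗; 杉山 治; 小島 諒介; 黒田 知宏
    Proceedings of the Joint Conference on Medical Informatics Japan Association for Medical Informatics 42nd 841 - 843 1347-8508 2022/11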
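  • Comparison of Named Entity Recognition Methods for Case Reports
    市川 花菜; 藤本 晃司; 杉山 治; 岸本 和昌; 西尾 瑞穂; 山本 豪志朗; 黒田 知宏
    Proceedings of the Joint Conference on Medical Informatics Japan Association for Medical Informatics 42nd 1155 - 1158 1347-8508 2022/11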
  • Manabu Shimoto; Kosai Cho; Masahiro Kurata; Mayu Hitomi; Yoichi Kato; Shinji Aida; Osamu Sugiyama; Norio Maki; Shigeru Ohtsuru
    Disaster Medicine and Public Health Preparedness Cambridge University Press (CUP) 16 (6) 1 - 3 1935-7893 2022/04 
    During the 2016 Kumamoto earthquake, 10 hospitals took responsibility for complete evacuation, in what has become regarded as one of the largest evacuations of patients in 1 seismic disaster. We aimed to examine the reasons for evacuation and to assess hospital vulnerability as well as preparedness for the earthquake. A multidisciplinary team conducted semi-structured interviews with the hospitals 6 months after the earthquake. The primary reasons for the decision to evacuate hospitals were categorized into 3: 1) Concern for structural safety (4 facilities), 2) Damage to the facility water system (7 facilities), and 3) Cessation of regional water supply (5 facilities). All hospitals decided on immediate evacuation within 30 hours and could not wait for structural engineers to inspect the affected buildings. Damage to sprinklers or water facilities caused severe water shortages and flood, thus requiring weeks to resume inpatient care. The earthquake revealed the vulnerability of rapid building-inspection systems, aging buildings, and water infrastructure.
  • KISHIMOTO Kazumasa; TAKEMURA Tadamasa; SUGIYAMA Osamu; KOJIMA Ryosuke; YAKAMI Masahiro; NAMBU Masayuki; FUJII Kiyotaka; KURODA Tomohiro
    Transactions of Japanese Society for Medical and Biological Engineering Japanese Society for Medical and Biological Engineering 60 (1) 8 - 15 1347-443X 2022/03 [Refereed]
    Accidents may occur in hospitals when the medical staff fail to notice the alarm ringing at a distance or in a closed room. In many hospitals, patient monitoring devices are connected to the hospital information system through a network, but some medical devices cannot be connected because they do not produce any external output. If the staff can detect the alarm ringing in a hospital room from some distance, they can provide more efficient and proactive medical care. In this study, alarm sounds were collected using a monaural microphone, and a machine learning classifier was constructed using deep neural networks. The classifier was evaluated using a simulation dataset of polyphonic alarm sounds, superimposed with the environmental sounds of a hospital ward. Data were collected from four devices, and two training datasets were created with a logarithm spectrogram using Mel filter bank (MFB) and custom filter bank (CFB). In addition, two classifiers were developed for 16 classes based on a combination of the four devices. One classifier was trained on MFB and the other on CFB. The classifiers were evaluated on the simulation dataset at signal-to-noise ratios (SNRs) of 30, 20, 10, and 0 dB. The classifier trained on CFB had a micro F1 score of 72.7% and an area under the ROC curve of 0.963 at an SNR of 0 dB. This micro F1 score was 4.5 points higher than that of the classifier trained on MFB. In addition, the misidentification rate of the environmental sounds (class without all devices) was 1.2%. Therefore, the classifier could not reliably distinguish between the alarm sound and environmental sounds, but its potential as a notification system was demonstrated.
  • Shusuke Hiragi; Jun Hatanaka; Osamu Sugiyama; Kenichi Saito; Masayuki Nambu; Tomohiro Kuroda
    JMIR formative research 6 (3) e28877  2022/03 
    BACKGROUND: Hospital bed management is an important resource allocation task in hospital management, but currently, it is a challenging task. However, acquiring an optimal solution is also difficult because intraorganizational information asymmetry exists. Signaling, as defined in the fields of economics, can be used to mitigate this problem. OBJECTIVE: We aimed to develop an assignment process that is based on a token economy as signaling intermediary. METHODS: We implemented a game-like simulation, representing token economy-based bed assignments, in which 3 players act as ward managers of 3 inpatient wards (1 each). As a preliminary evaluation, we recruited 9 nurse managers to play and then participate in a survey about qualitative perceptions for current and proposed methods (7-point Likert scale). We also asked them about preferred rewards for collected tokens. In addition, we quantitatively recorded participant pricing behavior. RESULTS: Participants scored the token economy-method positively in staff satisfaction (3.89 points vs 2.67 points) and patient safety (4.38 points vs 3.50 points) compared to the current method, but they scored the proposed method negatively for managerial rivalry, staff employee development, and benefit for patients. The majority of participants (7 out of 9) listed human resources as the preferred reward for tokens. There were slight associations between workload information and pricing. CONCLUSIONS: Survey results indicate that the proposed method can improve staff satisfaction and patient safety by increasing the decision-making autonomy of staff but may also increase managerial rivalry, as expected from existing criticism for decentralized decision-making. Participant behavior indicated that token-based pricing can act as a signaling intermediary. Given responses related to rewards, a token system that is designed to incorporate human resource allocation is a promising method. 
    Based on the aforementioned discussion, we concluded that a token economy-based bed allocation system has the potential to be an optimal method by mitigating information asymmetry.
  • Keita Fukuyama; Osamu Sugiyama; Kazuo Chin; Susumu Satou; Shigemi Matsumoto; Manabu Muto
    Advanced Biomedical Engineering Japanese Society for Medical and Biological Engineering 11 58 - 67 2022 [Refereed]
  • Koji Yokoyama; Goshiro Yamamoto; Chang Liu; Osamu Sugiyama; Luciano Ho Santos; Tomohiro Kuroda
    Appropriate evaluation of the intraoperative state of a surgical team is essential for the improvement of teamwork and hence a safe surgical environment. Traditional methods to evaluate intraoperative team states such as interview and self-check questionnaire on each surgical team member often require human efforts, which are time-consuming and can be biased by individual recall. One effective solution is to analyze the surgical video and track the important team activities, such as whether the members are complying with the surgical procedure or are being distracted by unexpected events. However, due to the complexity of the situations in an operating room, identifying the team activities without any human effort remains challenging. In this work, we propose a novel approach that automatically recognizes and quantifies intraoperative activities from surgery videos. As a first step, we focus on recognizing two activities that especially involve multiple individuals: (a) passing of clean-packaged surgery instruments which is a representative interaction between the surgical technologists such as the circulating nurse and scrub nurse, and (b) group attention that may be attracted by unexpected events. We record surgical videos as input, and apply pose estimation and particle filters to extract individual's face orientation, body orientation, and arm raise. These results coupled with individual IDs are then sent to an estimation model that provides the probability of each target activity. Simultaneously, a person model is generated and bound to each individual, which describes all the involved activities along the timeline. We tested our method using videos of simulated activities. The results showed that the system was able to recognize instrument passing and group attention with F1 = 0.95 and F1 = 0.66, respectively.
    We also implemented a system with an interface that automatically annotated intraoperative activities along the video timeline, and invited feedback from surgical technologists. The results suggest that the quantified and visualized activities can help improve understanding of the intraoperative state of the surgical team.
  • Ryo Otsuki; Osamu Sugiyama; Yuki Mori; Masahiro Miyake; Shusuke Hiragi; Goshiro Yamamoto; Luciano Santos; Yuta Nakanishi; Yoshikatsu Hosoda; Hiroshi Tamura; Shigemi Matsumoto; Akitaka Tsujikawa; Tomohiro Kuroda
    Advanced Biomedical Engineering Japanese Society for Medical and Biological Engineering 11 16 - 24 2022
  • Luciano Henrique De Oliveira Santos; Kazuya Okamoto; Ryo Otsuki; Shusuke Hiragi; Goshiro Yamamoto; Osamu Sugiyama; Tomoki Aoyama; Tomohiro Kuroda
    JMIR serious games 9 (1) e16458  2021/01 
    BACKGROUND: Pervasive games aim to create more fun and engaging experiences by mixing elements from the real world into the game world. Because they intermingle with players' lives and naturally promote more casual gameplay, they could be a powerful strategy to stimulate physical activity among older adults. However, to use these games more effectively, it is necessary to understand how design elements of the game affect player behavior. OBJECTIVE: The aim of this study was to evaluate how the presence of a specific design element, namely social interaction, would affect levels of physical activity. METHODS: Participants were recruited offline and randomly assigned to control and intervention groups in a single-blind design. Over 4 weeks, two variations of the same pervasive game were compared: with social interaction (intervention group) and with no social interaction (control group). In both versions, players had to walk to physical locations and collect virtual cards, but the social interaction version allowed people to collaborate to obtain more cards. Changes in the weekly step counts were used to evaluate the effect on each group, and the number of places visited was used as an indicator of play activity. RESULTS: A total of 20 participants were recruited (no social interaction group, n=10; social interaction group, n=10); 18 participants remained active until the end of the study (no social interaction group, n=9; social interaction group, n=9). Step counts during the first week were used as the baseline level of physical activity (no social interaction group: mean 46,697.2, SE 7905.4; social interaction group: mean 45,967.3, SE 8260.7). For the subsequent weeks, changes to individual baseline values (absolute/proportional) for the no social interaction group were as follows: 1583.3 (SE 3108.3)/4.6% (SE 7.2%) (week 2), 591.5 (SE 2414.5)/2.4% (SE 4.7%) (week 3), and -1041.8 (SE 1992.7)/0.6% (SE 4.4%) (week 4). 
    For the social interaction group, changes to individual baseline values were as follows: 11520.0 (SE 3941.5)/28.0% (SE 8.7%) (week 2), 9567.3 (SE 2631.5)/23.0% (SE 5.1%) (week 3), and 7648.7 (SE 3900.9)/13.9% (SE 8.0%) (week 4). The result of the analysis of the group effect was significant (absolute change: η2=0.31, P=.04; proportional change: η2=0.30, P=.03). Correlations between both absolute and proportional change and the play activity were significant (absolute change: r=0.59, 95% CI 0.32 to 0.77; proportional change: r=0.39, 95% CI 0.08 to 0.64). CONCLUSIONS: The presence of social interaction design elements in pervasive games appears to have a positive effect on levels of physical activity. TRIAL REGISTRATION: Japan Medical Association Clinical Trial Registration Number JMA-IIA00314.
  • Kento Suzuki; Luciano H.O. Santos; Chang Liu; Hiroaki Ueshima; Goshiro Yamamoto; Sayaka Okahashi; Shusuke Hiragi; Osamu Sugiyama; Kazuya Okamoto; Tomohiro Kuroda
    Transactions of Japanese Society for Medical and Biological Engineering Annual 59 (Proc) 805 - 807 1881-4379 2021 
    Conventional evaluation indices for upper limb function rehabilitation are based on the time to complete a task and the duration of movement. However, these metrics are insufficient to quantify motor performance attributes, such as smoothness of movement and presence of compensatory movements. This study aims to introduce a quantitative index for the evaluation of upper limb functions based on rehabilitation exercises performed by patients. For our initial evaluation, we chose the Grasp movement performed in ARAT (Action Research Arm Test), a conventional evaluation method for upper limb functions in patients with post-stroke syndrome. We use RGB videos of a therapist imitating a patient with posterior syndrome. Machine learning techniques were employed to estimate posture and extract skeletal information. Using time-series analysis, an evaluation model was created to quantify the compensatory movements of post-stroke syndrome and healthy patients.
  • Suzuki Kento; Santos Luciano; Liu Chang; Ueshima Hiroaki; Yamamoto Goshiro; Okahashi Sayaka; Hiragi Shusuke; Sugiyama Osamu; Okamoto Kazuya; Kuroda Tomohiro
    Transactions of Japanese Society for Medical and Biological Engineering Japanese Society for Medical and Biological Engineering Annual 59 (Proc) 805 - 807 1347-443X 2021
    Conventional evaluation indices for upper limb function rehabilitation are based on the time to complete a task and the duration of movement. However, these metrics are insufficient to quantify motor performance attributes, such as smoothness of movement and presence of compensatory movements. This study aims to introduce a quantitative index for the evaluation of upper limb functions based on rehabilitation exercises performed by patients. For our initial evaluation, we chose the Grasp movement performed in ARAT (Action Research Arm Test), a conventional evaluation method for upper limb functions in patients with post-stroke syndrome. We use RGB videos of a therapist imitating a patient with posterior syndrome. Machine learning techniques were employed to estimate posture and extract skeletal information. Using time-series analysis, an evaluation model was created to quantify the compensatory movements of post-stroke syndrome and healthy patients.
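  • Design of Multi-Institutional Federated Learning Considering the Characteristics of Medical Data
    Ma Yunwei; 岡本 和也; 杉山 治; 山本 豪志朗; 佐々木 博史; 南部 雅幸; 黒田 知宏
    Proceedings of the Joint Conference on Medical Informatics Japan Association for Medical Informatics 40th 516 - 520 1347-8508 2020/11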
  • Samar Helou; Victoria Abou-Khalil; Goshiro Yamamoto; Osamu Sugiyama; Tomohiro Kuroda
    Studies in health technology and informatics IOS Press 270 718 - 722 0926-9630 2020/06 [Refereed]
    Electronic Medical Record (EMR) systems are complex systems with interdependent features. Redesigning one feature of the system can create a cascade effect affecting the other features. By calculating the cascade effect, the designers can understand how each individual feature could be affected. This understanding allows them to maximize the positive effects and avoid negative consequences of their redesign activities. To understand the cascade effect, the designers can look at their computations' results; a task that becomes more difficult when the number of features grows. To reduce their task load, we propose a tool for visualizing the cascade effect of redesigning features in an EMR system. Our preliminary evaluation with six graduate students shows that visualizing the cascade effect reduces the task load and slightly improves their performance when analyzing the cascade effect. Ways for improving the tool include (i) showing the computation results within the visualization, and (ii) allowing the designers to compare the cascade effect generated by redesigning different features.
  • Shusuke Hiragi; Osamu Sugiyama; Jun Hatanaka; Shosuke Ohtera; Goshiro Yamamoto; Kazuya Okamoto; Masayuki Nambu; Tomohiro Kuroda
    Studies in health technology and informatics IOS Press 270 1363 - 1364 2020/06 [Refereed]
    Effective bed management is important for hospital management. Until now, the bed allocation process has generally been controlled by administrative staff in a centralized manner, but this is not always effective. In the present study, we proposed and evaluated a new method for bed allocation that applies a market mechanism via tokens. The evaluation was performed with a newly developed game-type simulation. Nurse managers, as research participants, played the simulation and answered a survey. The results showed that the proposed method can be useful with appropriate operational design.
  • Performance Comparison of Deep Learning Models for Fundus Photograph Diagnosis of Exudative Age-Related Macular Degeneration
    中西 悠太; 三宅 正裕; 大槻 涼; 細田 祥勝; 平木 秀輔; 杉山 治; 田村 寛; 黒田 知宏; 辻川 明孝
    眼科臨床紀要 眼科臨床紀要会 13 (2) 145 - 145 1882-5176 2020/02 [Refereed]
  • Morris, Kensuke; Yamamoto, Goshiro; Sugiyama, Osamu; Santos, Luciano H. O.; Tsutsumi, Takahiko; Ohtsuki, Ryo; Kato, Genta; Hiragi, Shusuke; Okamoto, Kazuya; Nambu, Masayuki; Kuroda, Tomohiro
    European Journal for Biomedical Informatics International Journal of Medical Research & Health Sciences 16 (3) 2020 [Refereed]
  • Heryawan, Lukman; Sugiyama, Osamu; Yamamoto, Goshiro; Khotimah, Purnomo Husnul; Santos, Luciano H O; Okamoto, Kazuya; Kuroda, Tomohiro
    European Journal for Biomedical Informatics International Journal of Medical Research & Health Sciences 16 (1) 2020 [Refereed]
  • Lukman Heryawan; Purnomo Husnul Khotimah; Osamu Sugiyama; Goshiro Yamamoto; Luciano Henrique de Oliveira Santos; Angga Eko Pramono; Kazuya Okamoto; Tomohiro Kuroda
    Subjective, objective, assessment, and plan (SOAP) notes are widely used by physicians to document clinical reasoning in assessing, diagnosing, and treating patients. SOAP notes are also used in medical coding tasks for reimbursement of insurance claims. In Indonesia, medical coders who are independent from physicians assess SOAP notes to assign diagnostic codes and medical procedure codes based on the corresponding International Classification of Diseases standards. Discrepancies between physicians who write the SOAP notes and coders who assign diagnoses and treatments, may occur. These discrepancies were assessed by performing a video-based survey to understand the coder's perspective, allowing the development of a writing support system to achieve unproblematic SOAP notes. This survey found that problematic SOAP notes were not caused by a single problem but by multiple problems. Abbreviations used by physicians are the major problem in assigning diagnostic codes, whereas incomplete data are the major problem in determining planning. This survey also showed that problematic SOAP notes may contain helpful keywords for coders that can help in determining diagnosis and treatment. The findings show that the system should be able to recognize separate sections of the SOAP note to provide writing support features and identify helpful keywords to encourage physicians to write unproblematic SOAP notes.
  • Ryo Otsuki; Osamu Sugiyama; Yuki Mori; Masahiro Miyake; Shusuke Hiragi; Goshiro Yamamoto; Luciano Santos; Yuta Nakanishi; Yoshikatsu Hosoda; Hiroshi Tamura; Shigemi Matsumoto; Akitaka Tsujikawa; Tomohiro Kuroda
    Age-related macular degeneration (AMD) causes visual acuity (VA) loss in people aged >= 50 years. Common treatments include intravitreal injection of anti-vascular endothelial growth factor agents such as aflibercept. However, lack of response in some patients makes prediction of posttreatment VA difficult. In this paper, we propose a deep neural network model to predict posttreatment VA using pretreatment medical imaging and patient profile data. The proposed model works with image data (optical coherence tomography and color fundus photograph) and patient profile data including gender, age, affected side and pretreatment decimal visual acuity. The model was tested by comparing mean square errors (MSE) between actual and predicted visual acuity obtained from input of image data alone, input of patient profile data alone, and input of both types of data. When examining the concatenation effectiveness of input of both types of data, the outcomes of concatenation conditions 100:100 and 500:500 were compared. For concatenation condition 100:100, MSE was 0.081 for input of image data alone, 0.052 for input of patient profile data alone, and 0.058 for input of both types of data. For concatenation condition 500:500, the MSE values were 0.081, 0.052, and 0.047, respectively. The model proposed provides highly accurate prediction of posttreatment VA and indication of recovery to physicians and patients. The method can handle incomplete images and patient profile data usually collected from patients before treatment.
  • Samar El Helou; Victoria Abou Khalil; Goshiro Yamamoto; Osamu Sugiyama; Tomohiro Kuroda
    718 - 722 2020 [Refereed]
  • Kensuke Morris; Osamu Sugiyama; Goshiro Yamamoto; Manabu Shimoto; Genta Kato; Shigeru Ohtsuru; Masayuki Nambu; Tomohiro Kuroda
    Advanced Biomedical Engineering 9 35 - 42 2020 [Refereed]
    The use of social network service (SNS) applications for health communication has revolutionized communication between physicians in recent years. We performed an unrestricted retrospective study focusing on emergency physicians (EPs) in Kyoto University Hospital (KUHP) since timely communication is important during emergencies. EPs used LINE, a popular SNS application in Japan. EPs (n = 22) sent 1752 messages from April 2017 to March 2018. Most messages sent contained text data (82.1%), the remaining contained media (17.9%); media included images (72.6%), LINE stamps (22.9%), LINE albums (2.3%) and files (1.6%). Content analysis by two coders produced 13 categories (n = 1438); these were ‘miscellaneous’, ‘patient’, ‘team’, ‘treatment’, ‘event’, ‘situation’, ‘reference’, ‘announcement’, ‘schedule’, ‘resource’, ‘policy’, ‘transport’ and ‘unknown’. The top five message categories were related to miscellaneous chat (22.5%), patient (19.1%), team (14.3%), treatment (11.8%) and event (6.6%). The largest number of messages among EPs were sent on Monday and Friday. The numbers of messages sent among EPs during day-shift and night-shift were similar. The categories identified influenced our proposal of medical oriented SNS platform features: structured tagging system for messages related to relevant categories (F1); inquiry broadcast system for specific inquiries using structured tagging (F2); image tagging system for images shared within groups (F3) and summarized notifications (F4). Features that need consideration are (1) an opt-in location sharing system between physicians and (2) physicians’ access to patient records from the SNS application. In this study, messages discussed by EPs were categorized and the resulting categories influenced our proposal of a physician-centered SNS platform customized to EPs’ roles.
    Since physicians prefer using SNS applications compared to traditional mobile phones, their information needs should be considered. Designing a medical oriented SNS platform that is physician-centered should first include an understanding of topics discussed by physicians. Based on the categories classified, the proposal of physician-centered features for designing a medical oriented SNS platform is also discussed in this paper.
  • Kazuhiro Nakadai; Shungo Masaki; Ryosuke Kojima; Osamu Sugiyama; Katsutoshi Itoyama; Kenji Nishida
    2020 IEEE/SICE International Symposium on System Integration (SII2020) IEEE 658 - 663 2020/01 [Refereed]
  • 下戸 学; 大鶴 繁; 趙 晃済; 堤 貴彦; 相田 伸二; 庵原 美香; 杉山 治; 倉田 真宏; 牧 紀男
    Japanese Journal of Disaster Medicine Japanese Association for Disaster Medicine 24 (3) 230 - 230 2189-4035 2019/12
  • 茅野 宏紀; 野間 春生; 杉山 治; 下戸 学; 大鶴 繁; 黒田 知宏
    Japanese Journal of Disaster Medicine Japanese Association for Disaster Medicine 24 (3) 283 - 283 2189-4035 2019/12
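  • User Evaluation of Willingness to Seek Medical Care and System Usability with an Interactive Symptom Assessment Support System
    山内 翔大; 岡本 和也; 平木 秀輔; 杉山 治; 山本 豪志朗; 佐々木 博史; 南部 雅幸; 黒田 知宏
    Proceedings of the Joint Conference on Medical Informatics Japan Association for Medical Informatics 39th 721 - 724 1347-8508 2019/11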
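  • User Evaluation of Willingness to Seek Medical Care and System Usability with an Interactive Symptom Assessment Support System
    山内 翔大; 岡本 和也; 平木 秀輔; 杉山 治; 山本 豪志朗; 佐々木 博史; 南部 雅幸; 黒田 知宏
    Proceedings of the Joint Conference on Medical Informatics Japan Association for Medical Informatics 39th 389 - 389 1347-8508 2019/11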
  • Purnomo Husnul Khotimah; Masatoshi Yoshikawa; Akihiro Hamasaki; Osamu Sugiyama; Kazuya Okamoto; Tomohiro Kuroda
    2019 International Conference on Computer, Control, Informatics and its Applications: Emerging Trends in Big Data and Artificial Intelligence, IC3INA 2019 140 - 145 2019/10 [Refereed]
    Clinical research has led to the development of new medications, and medication strategies may change to achieve better treatment outcomes. With the increasing attention to evidence-based medical guidelines, it is important for clinicians to study changes in medication strategy. This study evaluates long-term prescriptions for type 2 diabetes patients provided by Kyoto University Hospital following the release of a new medication in 2010. Frequent sequential pattern mining (FSPM) is a prominent tool for extracting frequent patterns. However, the result set can be enormous, which may hinder clinicians in assessing the results. To help clinicians, a medication progression graph (MPG) is constructed using adjacent (1-sequence) frequent patterns produced by singleton pattern mining. Compared to conventional frequent patterns, singleton patterns feature full itemsets and 1-step distance. Hence, each pattern represents a true medication transition event. Using the graph, a clinical physician was able to observe a significant increase in physicians' activity in changing the medication after the release (i.e., in the MPG that starts with Sulfonylurea after 2010, the number of edges increased by 2.83 and the number of nodes increased by 2.27 compared to before 2010). Our preliminary results show that the new visualization method enables a clinician to analyze medication strategy changes after the new medication was released.
  • Samar Helou; Victoria Abou-Khalil; Goshiro Yamamoto; Eiji Kondoh; Hiroshi Tamura; Shusuke Hiragi; Osamu Sugiyama; Kazuya Okamoto; Masayuki Nambu; Tomohiro Kuroda
    Studies in health technology and informatics IOS Press 264 1213 - 1217 0926-9630 2019/08 [Refereed]
    Redesigning Electronic Medical Record (EMR) systems is needed to improve their usefulness and usability. For user-centered redesign, designers should consider which EMR features are the most important to the users. However, prioritizing the EMR features is complicated because: (i) EMR systems involve multiple users with different, and sometimes conflicting, priorities and (ii) targeting one feature will affect other features of the EMR system. In this work, we propose a method for prioritizing the features to target when redesigning an EMR system. The method takes into consideration the different priorities of the users and the relationships between the different features. We illustrate the method through a case study on redesigning EMR systems in Japanese antenatal care settings. Our results show the importance of considering the different types of EMR users and the relationships between different EMR features. Designers could use the proposed method as a decision-aid tool in EMR redesign projects.
  • Kenichiro Fujita; Osamu Sugiyama; Shusuke Hiragi; Kazuya Okamoto; Tadamasa Takemura; Tomohiro Kuroda
    Studies in health technology and informatics IOS Press 264 1662 - 1663 0926-9630 2019/08 [Refereed]
    The amount of text in electronic medical records and how it changes over time are not clear. Predicting the amount of text is important when designing an electronic medical record system. We analyzed the number of characters written in electronic medical records. The analysis showed that the annual text quantity of electronic medical records follows a lognormal distribution and that the amount has been increasing year by year.
  • Yohei Yamasaki; Osamu Sugiyama; Shusuke Hiragi; Shosuke Ohtera; Goshiro Yamamoto; Hiroshi Sasaki; Kazuya Okamoto; Masayuki Nambu; Tomohiro Kuroda
    Studies in health technology and informatics IOS Press 264 1596 - 1597 2019/08 [Refereed]
    Nephrosis is a disease characterized by abnormal protein loss from an impaired kidney. We constructed an early prediction model, using machine learning on clinical time-series data, that can predict the onset of nephrosis more than one month in advance. Long short-term memory, which is capable of recognizing patterns in temporal sequential data, was adopted as the early prediction model for nephrosis. Using 5-fold cross-validation, we verified that our proposed prediction model has higher accuracy than the baseline classifiers.
  • Luciano Henrique De Oliveira Santos; Kazuya Okamoto; Silvana Schwerz Funghetto; Adriana Schüler Cavalli; Shusuke Hiragi; Goshiro Yamamoto; Osamu Sugiyama; Carla Denise Castanho; Tomoki Aoyama; Tomohiro Kuroda
    JMIR serious games 7 (3) e13962  2019/07 [Refereed]
    BACKGROUND: The novel genre of pervasive games, which aim to create more fun and engaging experiences by promoting deeper immersion, could be a powerful strategy to stimulate physical activity among older adults. To use these games more effectively, it is necessary to understand how different design elements affect player behavior. OBJECTIVE: The aim was to vary a specific design element of pervasive games for older adults, namely social interaction, to test the effect on levels of physical activity. METHODS: Over 4 weeks, two variations of the same pervasive game were compared: social interaction for the test group and no social interaction for the control group. In both versions, players had to walk to physical locations and collect virtual cards, but the social interaction version allowed people to collaborate to obtain more cards. Weekly step counts were used to evaluate the effect on each group, and the number of places visited was used as an indicator of play activity. RESULTS: A total of 32 participants were recruited (no social interaction=15, social interaction=17); 18 remained until the end of the study (no social interaction=7, social interaction=11). Step counts during the first week were used as the baseline (no social interaction: mean 17,099.4, SE 3906.5; social interaction: mean 17,981.9, SE 2171.1). For the following weeks, changes to individual baseline were as follows for no social interaction (absolute/proportional): 383.8 (SE 563.8)/1.1% (SE 4.3%), 435.9 (SE 574.5)/2.2% (SE 4.6%), and -106.1 (SE 979.9)/-2.6% (SE 8.1%) for weeks 2, 3, and 4, respectively. For social interaction they were 3841.9 (SE 1425.4)/21.7% (SE 5.1%), 2270.6 (SE 947.1)/16.5% (SE 4.4%), and 2443.4 (SE 982.6)/17.9% (SE 4.7%) for weeks 2, 3, and 4, respectively. Analysis of group effect was significant (absolute change: η2=.19, P=.01; proportional change: η2=.27, P=.009). 
Correlation between the proportional change and the play activity was significant (r=.34, 95% CI 0.08 to 0.56), whereas for absolute change it was not. CONCLUSIONS: Social interaction design elements of the pervasive game may have some positive effects on the promotion of physical activity, although other factors might also have influenced this effect. TRIAL REGISTRATION: Japan Medical Association Clinical Trial Registration Number JMA-IIA00314; (Archived by WebCite at
  • Samar Helou; Victoria Abou-Khalil; Goshiro Yamamoto; Eiji Kondoh; Hiroshi Tamura; Shusuke Hiragi; Osamu Sugiyama; Kazuya Okamoto; Masayuki Nambu; Tomohiro Kuroda
    JMIR human factors 6 (3) e13812  2019/07 [Refereed]
    BACKGROUND: Redesigning electronic medical record (EMR) systems is needed to improve their usability and usefulness. Similar to other artifacts, EMR systems can evolve with time and exhibit situated roles. Situated roles refer to the ways in which a system is appropriated by its users, that is, the unintended ways the users engage with, relate to, and perceive the system in its context of use. These situated roles are usually unknown to the designers as they emerge and evolve as a response by the users to a contextual need or constraint. Understanding the system's situated roles can expose the unarticulated needs of the users and enable redesign opportunities. OBJECTIVE: This study aimed to find EMR redesign opportunities by understanding the situated roles of EMR systems in prenatal care settings. METHODS: We conducted a field-based observational study at a Japanese prenatal care clinic. We observed 3 obstetricians and 6 midwives providing prenatal care to 37 pregnant women. We looked at how the EMR system is used during the checkups. We analyzed the observational data following a thematic analysis approach and identified the situated roles of the EMR system. Finally, we administered a survey to 5 obstetricians and 10 midwives to validate our results and understand the attitudes of the prenatal care staff regarding the situated roles of the EMR system. RESULTS: We identified 10 distinct situated roles that EMR systems play in prenatal care settings. Among them, 4 roles were regarded as favorable as most users wanted to experience them more frequently, and 4 roles were regarded as unfavorable as most users wanted to experience them less frequently; 2 ambivalent roles highlighted the providers' reluctance to document sensitive psychosocial information in the EMR and their use of the EMR system as an accomplice to pause communication during the checkups. 
To improve the usability and usefulness of EMR systems, designers can amplify the favorable roles and minimize the unfavorable roles. Our results also showed that obstetricians and midwives may have different experiences, wants, and priorities regarding the use of the EMR system. CONCLUSIONS: Currently, EMR systems are mainly viewed as tools that support the clinical workflow. Redesigning EMR systems is needed to amplify their roles as communication support tools. Our results provided multiple EMR redesign opportunities to improve the usability and usefulness of EMR systems in prenatal care. Designers can use the results to guide their EMR redesign activities and align them with the users' wants and priorities. The biggest challenge is to redesign EMR systems in a way that amplifies their favorable roles for all the stakeholders concurrently.
  • YAMAUCHI Shota; OKAMOTO Kazuya; HIRAGI Shusuke; SUGIYAMA Osamu; YAMAMOTO Goshiro; SASAKI Hiroshi; NAMBU Masayuki; KURODA Tomohiro
    JSAI Technical Report, Type 2 SIG The Japanese Society for Artificial Intelligence 2019 (AIMED-007) 11  2019/03
  • OTSUKI Ryo; SUGIYAMA Osamu; YANO Shuji; MATSUMURA Kohei; TADA Masahiro; NOMA Haruo; KURODA Tomohiro
    JSAI Technical Report, Type 2 SIG The Japanese Society for Artificial Intelligence 2019 (AIMED-007) 03  2019/03
  • SUGIYAMA Osamu; HOSODA Yoshikatsu; MIYAKE Masahiro; OTSUKI Ryo; HIRAGI Shusuke; YAMAMOTO Goshiro; TAMURA Hiroshi; TSUJIKAWA Akitaka; KURODA Tomohiro
    JSAI Technical Report, Type 2 SIG The Japanese Society for Artificial Intelligence 2019 (AIMED-007) 02  2019/03
  • 下戸 学; 大鶴 繁; 趙 晃済; 堤 貴彦; 小池 薫; 相田 伸二; 杉山 治; 倉田 真宏; 牧 紀男
    Japanese Journal of Disaster Medicine Japanese Association for Disaster Medicine 23 (3) 277 - 277 2189-4035 2019/02
  • 趙 晃済; 大鶴 繁; 下戸 学; 堤 貴彦; 小池 薫; 相田 伸二; 杉山 治; 倉田 真宏; 牧 紀男
    Japanese Journal of Disaster Medicine Japanese Association for Disaster Medicine 23 (3) 358 - 358 2189-4035 2019/02
  • Luciano H O Santos; Kazuya Okamoto; Shusuke Hiragi; Goshiro Yamamoto; Osamu Sugiyama; Tomoki Aoyama; Tomohiro Kuroda
    Journal of Rehabilitation and Assistive Technologies Engineering SAGE Publications 6 205566831984444 - 205566831984444 2019/01 [Refereed]
  • Lukman Heryawan; Purnomo Husnul Khotimah; Goshiro Yamamoto; Osamu Sugiyama; Shusuke Hiragi; Kazuya Okamoto; Tomohiro Kuroda
    Proceedings of the 7th International Conference on Human-Agent Interaction, HAI 2019, Kyoto, Japan, October 06-10, 2019 ACM 244 - 246 2019 [Refereed]
  • Samar El Helou; Victoria Abou Khalil; Goshiro Yamamoto; Eiji Kondoh; Hiroshi Tamura; Shusuke Hiragi; Osamu Sugiyama; Kazuya Okamoto; Masayuki Nambu; Tomohiro Kuroda
    Informatics 6 (2) 15 - 15 2019 [Refereed]
  • Kazuya Okamoto; Karin Goka; Masahiro Hirose; Takashi Yamamoto; Shusuke Hiragi; Goshiro Yamamoto; Osamu Sugiyama; Masayuki Nambu; Tomohiro Kuroda
    Proc. ISPOR Europe 2018 IOS Press 270 1247 - 1248 2018/11 [Refereed]
    The goal of this research was to design a solution to detect non-reported incidents, especially severe incidents. To achieve this goal, we proposed a method to process electronic medical records and automatically extract clinical notes describing severe incidents. To evaluate the proposed method, we implemented it as a system and put the system to use. The system successfully detected an incident that had not been reported to the safety management department.
  • Pervasive game design to evaluate social interaction effects on levels of physical activity among older adults
    Luciano Santos; Kazuya Okamoto; Shusuke Hiragi; Goshiro Yamamoto; Osamu Sugiyama; Tomoki Aoyama; Tomohiro Kuroda
    Proc. 12th ICDVRAT with ITAG 157 - 164 2018/09 [Refereed]
  • OTSUKI Ryo; SUGIYAMA Osamu; MATSUMURA Kohei; TADA Masahiro; NOMA Haruo; KURODA Tomohiro
    JSAI Technical Report, Type 2 SIG The Japanese Society for Artificial Intelligence 2018 (AIMED-005) 01  2018/03
  • Ryosuke Kojima; Reiji Suzuki; Osamu Sugiyama; Kotaro Hoshiba; Kazuhiro Nakadai
    Proceedings - 2017 International Conference on Data Science and Advanced Analytics, DSAA 2017 Institute of Electrical and Electronics Engineers Inc. 2018- 395 - 404 2472-1573 2018/01 [Refereed]
    This paper addresses bird song scene analysis focusing on the location of birds and the acoustic features of bird songs. Such a research area usually requires manual annotation related to positions and/or vocalization types of the target animals for a large amount of observed data. However, this manual annotation has two problems. One is that it is difficult to annotate data observed in real environments because environmental noise exists, sound is reflected by trees and the ground, and several birds at different locations may sing at the same time. The other is that manual annotation inevitably produces inaccurate and inconsistent labels due to human errors and annotators’ individual differences. For the first problem, we propose a Spatial-Cue-Based Probabilistic Model (SCBPM), which is a probabilistic model to estimate the maximum likelihood result for a bird song scene analysis by integrating sound source detection, localization, separation and identification based on spatial information of sound sources. For the second problem, we employ a semiautomatic annotation approach, in which a semi-supervised training method is derived for SCBPM. This method decreases the amount of manual annotation. Preliminary experiments using recorded bird song data from the wild revealed that, in terms of identification accuracy, our system outperformed a conventional bird song scene analysis system that simply connects sound source detection, localization, separation and identification in a cascade.
  • Design and Implementation of a Social Networking Service-Based Application in Supporting Disaster Medical Assistance Teams
    Kawai T; Ohtsuru S
    IPSJ interaction IP10 487 - 489 2018 [Refereed]
  • Kawai T; Kambara H; Matsumura K; Noma H; Sugiyama O; Shimoto M; Ohtsuru S; Kuroda T
    Innovation in Medicine and Healthcare 2017, Smart Innovation, Systems and Technologies 71 165 - 172 2190-3018 2018 [Refereed]
    During the Kumamoto earthquakes in Japan, disaster medical assistance teams (DMATs) were dispatched for emergency support. Communication among DMAT members was primarily done via email and phone; however, during this disaster, some teams also used LINE, a popular social networking service in Japan. Although this tool is simple to use, the teams had problems organizing various topics in a single chat room. In this paper, we propose an application that uses hashtags and consists of two main units: (1) a bot that redirects messages to specific groups according to hashtags input by users; and (2) a logistic-support system for manually applying hashtags to untagged messages and manually editing the hashtags of already-tagged messages. User studies with two Kyoto University Hospital DMAT members were conducted, and through discussion, we found that the generality of the proposed application should be further considered for usage in other activities.
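The first unit described above, a bot that redirects messages according to hashtags, can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the LINE API: the function `route`, the `untagged` queue name, and the sample messages are all hypothetical, chosen only to show the routing idea.

```python
import re
from collections import defaultdict

HASHTAG = re.compile(r"#(\w+)")
UNTAGGED = "untagged"  # hypothetical queue for messages awaiting manual tagging


def route(messages):
    """Group chat messages by the hashtags they contain."""
    groups = defaultdict(list)
    for text in messages:
        # Lower-case the tags and drop repeats while keeping their order.
        tags = list(dict.fromkeys(tag.lower() for tag in HASHTAG.findall(text)))
        if not tags:
            groups[UNTAGGED].append(text)
        for tag in tags:
            groups[tag].append(text)
    return dict(groups)


# Hypothetical sample messages.
messages = [
    "Need two stretchers at the east entrance #logistics",
    "Patient count updated #triage #Logistics",
    "Arrived at the hospital",
]
routed = route(messages)
```

Messages that land in the `untagged` group correspond to the ones the second unit would hand to a logistic-support operator for manual tagging.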
  • Hearing and Analysis of Hospital Evacuation after The 2016 KUMAMOTO Earthquake
    Kurata M; Hitomi M; Shimmoto S; Ohtsuru S; Shimoto M; Cho K; Sugiyama O; Aida S
    European Conference on Earthquake Engineering in press 2018 [Refereed]
  • Yuji Morita; Masatoshi Yoshikawa; Noboru Kada; Akihiro Hamasaki; Osamu Sugiyama; Kazuya Okamoto; Tomohiro Kuroda
    European Journal for Biomedical Informatics 14 (1) 26 - 33 2018/01 [Refereed]
  • Kensuke Morris; Goshiro Yamamoto; Shusuke Hiragi; Shosuke Ohtera; Michi Sakai; Osamu Sugiyama; Kazuya Okamoto; Masayuki Nambu; Tomohiro Kuroda
    Studies in health technology and informatics IOS Press 247 71 - 75 0926-9630 2018 [Refereed]
    High accessibility of Electronic Health Record (EHR) systems can increase usability but simultaneously creates anxiety among patients about privacy. In order to reduce these privacy concerns, we focused on control and awareness, and designed an approach that can make a patient's clinical data available to doctors in two scenarios: (S1) direct control by the patient when they are conscious, and (S2) control by a trusted representative when the patient is unconscious. In this paper, we show further analysis of a survey (n = 310, age range: 19-91) done to test the acceptability of our concept of using a trusted representative and to further understand the concerns of Japanese citizens to improve our system design. The results in S1 suggest that patients concerned about control have a stronger inclination to also choose full awareness. We also found that patients tended to choose the same level of awareness for the representative as they did for themselves in S2. In addition, patients who chose awareness in S1 tended to choose the same for their representative in S2 and for themselves after recovery from unconsciousness. We also discuss the significant differences found between the age groups 20-39 and 60-79. We conclude that the system design of privacy-aware EHR systems must be improved to consider patients who want to preserve their choice of control in the event they become unconscious but do not want to use a representative to maintain control.
  • Ryosuke Kojima; Osamu Sugiyama; Kotaro Hoshiba; Reiji Suzuki; Kazuhiro Nakadai
  • Mizuho Nishio; Mitsuo Nishizawa; Osamu Sugiyama; Ryosuke Kojima; Masahiro Yakami; Tomohiro Kuroda; Kaori Togashi
    PloS one 13 (4) e0195875  2018 [Refereed]
    We aimed to evaluate a computer-aided diagnosis (CADx) system for lung nodule classification focussing on (i) usefulness of the conventional CADx system (hand-crafted imaging feature + machine learning algorithm), (ii) comparison between support vector machine (SVM) and gradient tree boosting (XGBoost) as machine learning algorithms, and (iii) effectiveness of parameter optimization using Bayesian optimization and random search. Data on 99 lung nodules (62 lung cancers and 37 benign lung nodules) were included from public databases of CT images. A variant of the local binary pattern was used for calculating a feature vector. SVM or XGBoost was trained using the feature vector and its corresponding label. Tree Parzen Estimator (TPE) was used as Bayesian optimization for parameters of SVM and XGBoost. Random search was done for comparison with TPE. Leave-one-out cross-validation was used for optimizing and evaluating the performance of our CADx system. Performance was evaluated using area under the curve (AUC) of receiver operating characteristic analysis. AUC was calculated 10 times, and its average was obtained. The best averaged AUC of SVM and XGBoost was 0.850 and 0.896, respectively; both were obtained using TPE. XGBoost was generally superior to SVM. Optimal parameters for achieving high AUC were obtained with fewer numbers of trials when using TPE, compared with random search. Bayesian optimization of SVM and XGBoost parameters was more efficient than random search. Based on observer study, AUC values of two board-certified radiologists were 0.898 and 0.822. The results show that diagnostic accuracy of our CADx system was comparable to that of radiologists with respect to classifying lung nodules.
  • Samar Helou; Goshiro Yamamoto; Eiji Kondoh; Hiroshi Tamura; Shusuke Hiragi; Osamu Sugiyama; Kazuya Okamoto; Masayuki Nambu; Tomohiro Kuroda
    Studies in health technology and informatics IOS Press 251 257 - 260 0926-9630 2018 [Refereed]
    Electronic Medical Records (EMRs) are intrinsic to modern-day clinics. Understanding the roles, i.e., the unintended functions of EMR systems in their context of use, can guide the design of EMR systems and clinics to better integrate them. To understand the roles of EMR systems in antenatal care check-ups, we conducted a field-based observational study at an antenatal care clinic in a Japanese university hospital. We observed 37 antenatal care check-ups where we looked at how the EMR system affects the communication between the involved parties and supports or hinders the clinical process. Our data analysis resulted in 10 EMR roles, namely: the wingman, the third wheel, the accomplice, the bouncer, the messenger, the summarizer, the bureaucrat, the assistant, the gossip, and the alien. Through the roles, this study reveals multiple EMR design considerations and opportunities for improving both the human-EMR and human interactions in antenatal care settings.
  • Purnomo Husnul Khotimah; Yuichi Sugiyama; Masatoshi Yoshikawa; Akihiro Hamasaki; Osamu Sugiyama; Kazuya Okamoto; Tomohiro Kuroda
    IEEE J. Biomed. Health Informatics 22 (6) 1949 - 1959 2168-2194 2018 [Refereed]
    OBJECTIVE: For chronic diseases, medical history reconstruction is essential for retrospective database analyses. One important aspect is determining which prescriptions belong to the same episode. However, a standard framework for this task is still lacking, particularly for multitherapy datasets. This paper presents a medication episode construction framework for the medical history of patients with chronic diseases. METHODS: Allen's relaxed temporal relations (i.e., temporal relations with time constraints relaxed by ) is used to define the consecutive prescription relations considering the patients' behavior. For example, patients occasionally arrive earlier or later than their appointment. RESULTS: influences the generation of stable periods (i.e., periods of time, at least three months, in which a medication is continuously taken by a patient). When using the lowest selected value (7 days), considerably fewer shorter stable periods (for durations less than 300 days) are produced and more longer stable periods are produced compared to cases without using . Furthermore, the results show that by using , regarding the number of events, where a stable period continues the previous stable period, decreases and the number of medication transition events available to be observed increases. CONCLUSION: Using in medication episode construction from multitherapy prescription datasets enables the longer expression of short-duration fragmented prescriptions and pruning repetitive prescriptions. SIGNIFICANCE: Our proposed framework is designed for multitherapy datasets, which has not been addressed by previous studies. The concept of relaxes the prescription relation against noise caused by the patient behavior and consequently provides a compact, but informative search space for observing medication transition events in a longitudinal analysis.
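The effect of relaxing the time constraint between consecutive prescriptions can be illustrated with a small sketch. It is a simplification under stated assumptions, not the framework from the paper: it handles a single medication, reduces the temporal relations to one gap check, and the names `build_episodes` and `tolerance_days` as well as the refill dates are hypothetical.

```python
from datetime import date, timedelta


def build_episodes(prescriptions, tolerance_days=7):
    """Merge (start_date, days_supplied) prescriptions into episodes.

    A prescription continues the current episode when it starts no later
    than tolerance_days after the current supply runs out. Early refills
    (overlaps) are therefore merged as well.
    """
    episodes = []
    for start, days in sorted(prescriptions):
        end = start + timedelta(days=days)
        if episodes and start <= episodes[-1][1] + timedelta(days=tolerance_days):
            prev_start, prev_end = episodes[-1]
            episodes[-1] = (prev_start, max(prev_end, end))
        else:
            episodes.append((start, end))
    return episodes


# Hypothetical refill history for a single medication.
refills = [(date(2016, 1, 1), 30), (date(2016, 2, 3), 30), (date(2016, 3, 20), 30)]
strict = build_episodes(refills, tolerance_days=0)   # both gaps split the history
relaxed = build_episodes(refills, tolerance_days=7)  # the 3-day gap is absorbed
```

With no tolerance the patient's slightly late second visit fragments the history into three short episodes; with a 7-day tolerance the first two prescriptions form one longer episode, which is the kind of compaction the abstract describes.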
  • Mizuho Nishio; Osamu Sugiyama; Masahiro Yakami; Syoko Ueno; Takeshi Kubo; Tomohiro Kuroda; Kaori Togashi
    PloS one 13 (7) e0200721  2018 [Refereed]
    We developed a computer-aided diagnosis (CADx) method for classification between benign nodule, primary lung cancer, and metastatic lung cancer and evaluated the following: (i) the usefulness of the deep convolutional neural network (DCNN) for CADx of the ternary classification, compared with a conventional method (hand-crafted imaging feature plus machine learning), (ii) the effectiveness of transfer learning, and (iii) the effect of image size as the DCNN input. Among 1240 patients of previously-built database, computed tomography images and clinical information of 1236 patients were included. For the conventional method, CADx was performed by using rotation-invariant uniform-pattern local binary pattern on three orthogonal planes with a support vector machine. For the DCNN method, CADx was evaluated using the VGG-16 convolutional neural network with and without transfer learning, and hyperparameter optimization of the DCNN method was performed by random search. The best averaged validation accuracies of CADx were 55.9%, 68.0%, and 62.4% for the conventional method, the DCNN method with transfer learning, and the DCNN method without transfer learning, respectively. For image size of 56, 112, and 224, the best averaged validation accuracy for the DCNN with transfer learning were 60.7%, 64.7%, and 68.0%, respectively. DCNN was better than the conventional method for CADx, and the accuracy of DCNN improved when using transfer learning. Also, we found that larger image sizes as inputs to DCNN improved the accuracy of lung nodule classification.
  • Kazuhiro Nakadai; Makoto Kumon; Hiroshi G. Okuno; Kotaro Hoshiba; Mizuho Wakabayashi; Kai Washizaki; Takahiro Ishiki; Daniel Gabriel; Yoshiaki Bando; Takayuki Morito; Ryosuke Kojima; Osamu Sugiyama
    IEEE International Conference on Intelligent Robots and Systems Institute of Electrical and Electronics Engineers Inc. 2017- 5985 - 5990 2153-0866 2017/12 [Refereed]
    This paper addresses online outdoor sound source localization using a microphone array embedded in an unmanned aerial vehicle (UAV). In addition to sound source localization, sound source enhancement and robust communication method are also described. This system is one instance of deployment of our continuously developing open source software for robot audition called HARK (Honda Research Institute Japan Audition for Robots with Kyoto University). To improve the robustness against outdoor acoustic noise, we propose to combine two sound source localization methods based on MUSIC (multiple signal classification) to cope with trade-off between latency and noise robustness. The standard Eigenvalue decomposition based MUSIC (SEVD-MUSIC) has smaller latency but less noise robustness, whereas the incremental generalized singular value decomposition based MUSIC (iGSVD-MUSIC) has higher noise robustness but larger latency. A UAV operator can use an appropriate method according to the situation. A sound enhancement method called online robust principal component analysis (ORPCA) enables the operator to detect a target sound source more easily. To improve the stability of wireless communication, and robustness of the UAV system against weather changes, we developed data compression based on free lossless audio codec (FLAC) extended to support a 16 ch audio data stream via UDP, and developed a water-resistant microphone array. The resulting system successfully worked in an outdoor search and rescue task in ImPACT Tough Robotics Challenge in November 2016.
  • 杉山 治; 大槻 涼; 鈴木 真生; 松村 耕平; 多田 昌裕; 野間 春生; 黒田 知宏
    JSAI Technical Report, Type 2 SIG The Japanese Society for Artificial Intelligence 2017 (AIMED-004) 14  2017/11
  • 上野 翔子; 杉山 治; 西尾 瑞穂; 八上 全弘; 山本 豪志朗; 岡本 和也; 南部 雅幸; 黒田 知宏
    JSAI Technical Report, Type 2 SIG The Japanese Society for Artificial Intelligence 2017 (AIMED-004) 04  2017/11
  • Reducing Patient Privacy Concerns via Access Control to EHRs
    Kensuke Morris; Goshiro Yamamoto; Shosuke Ohtera; Michi Sakai; Shusuke Hiragi; Kazuya Okamoto; Osamu Sugiyama; Naoto Kume; Masayuki Nambu; Tomohiro Kuroda
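    Proceedings of the 37th Joint Conference on Medical Informatics (18th Annual Meeting of the Japan Association for Medical Informatics) 512 - 517 2017/11 [Refereed]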
  • Ryo Otsuki; Osamu Sugiyama; Kohei Matsumura; Masahiro Tada; Haruo Noma; Tomohiro Kuroda
    HAI 2017 - Proceedings of the 5th International Conference on Human Agent Interaction ACM 469 - 472 2017/10 [Refereed]
    Walking 8,000 steps a day is one of the important criteria for maintaining health. However, we often miss chances to walk because it is difficult to stay motivated about our health in daily life. We propose an algorithm that searches the user's past walking records for an appropriate daily walking pattern. The retrieved walking patterns are used by a health promotion agent to recommend effective times to walk. With this recommendation, users will not miss the times when they can walk. In this study, we focused on designing the algorithm for finding a daily walking pattern that satisfies two conditions: achieving 8,000 steps a day and being similar to the user's current walking record. A pilot performance study revealed that the proposed algorithm can narrow down the walking pattern candidates and properly find a walking pattern similar to the user's current walking record.
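The two search conditions named above (reaching 8,000 steps and resembling the record so far today) can be sketched as follows. This is an assumed formulation for illustration only: the abstract does not state the similarity measure, so Euclidean distance is used here, and `find_pattern`, the four time slots per day, and the step counts are hypothetical.

```python
import math


def find_pattern(history, today, goal=8000):
    """Return the past day that reaches the goal and best matches today so far.

    history: list of past days, each a list of step counts per time slot.
    today:   step counts for the slots that have elapsed so far today.
    """
    k = len(today)
    # Condition 1: keep only past days whose total reaches the step goal.
    candidates = [day for day in history if sum(day) >= goal]
    if not candidates:
        return None
    # Condition 2: pick the candidate closest to today's partial record.
    return min(candidates, key=lambda day: math.dist(day[:k], today))


# Hypothetical records with four time slots per day.
history = [
    [1000, 2000, 3000, 2500],  # 8,500 steps
    [500, 500, 1000, 1000],    # 3,000 steps, below the goal
    [3000, 3000, 1000, 1500],  # 8,500 steps
]
today = [600, 600]
pattern = find_pattern(history, today)
remaining = pattern[len(today):]  # slots an agent could recommend for walking
```

The second day is the closest match to today's record but is filtered out because it never reached the goal, so the search falls back to the nearest day that did.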
  • Ryosuke Kojima; Osamu Sugiyama; Kotaro Hoshiba; Kazuhiro Nakadai; Reiji Suzuki; Charles E. Taylor
    Journal of Robotics and Mechatronics Fuji Technology Press 29 (1) 236 - 246 1883-8049 2017/02 [Refereed]
    This paper addresses bird song scene analysis based on semi-automatic annotation. Research in animal behavior, especially in birds, would be aided by automated or semi-automated systems that can localize sounds, measure their timing, and identify their sources. This is difficult to achieve in real environments, in which several birds at different locations may be singing at the same time. Analysis of recordings from the wild has usually required manual annotation. These annotations may be inaccurate or inconsistent, as they may vary within and between observers. Here we suggest a system that uses automated methods from robot audition, including sound source detection, localization, separation and identification. In robot audition, these technologies are assessed separately, but combining them has often led to poor performance in natural setting. We propose a new Spatial-Cue-Based Probabilistic Model (SCBPM) for their integration focusing on spatial information. A second problem has been that supervised machine learning methods usually require a pre-trained model, which may need a large training set of annotated labels. We have employed a semi-automatic annotation approach, in which a semi-supervised training method is deduced for a new model. This method requires much less pre-annotation. Preliminary experiments with recordings of bird songs from the wild revealed that our system outperformed the identification accuracy of a method based on conventional robot audition.
  • Osamu Sugiyama; Satoshi Uemura; Akihide Nagamine; Ryosuke Kojima; Keisuke Nakamura; Kazuhiro Nakadai
    Journal of Robotics and Mechatronics Fuji Technology Press 29 (1) 188 - 197 1883-8049 2017/02 [Refereed]
    This paper addresses Acoustic Event Identification (AEI) of acoustic signals observed with a microphone array embedded in a quadrotor that is flying in a noisy outdoor environment. In such an environment, noise generated by rotors, wind, and other sound sources is a big problem. To solve this, we propose the use of a combination of two approaches that have recently been introduced: Sound Source Separation (SSS) and Sound Source Identification (SSI). SSS improves the Signal-to-Noise Ratio (SNR) of the input sound, and SSI is then performed on the SNR-improved sound. Two SSS methods are investigated. One is a single-channel algorithm, Robust Principal Component Analysis (RPCA), and the other is Geometric High-order Decorrelation-based Source Separation (GHDSS-AS), known as a multichannel method. For SSI, we investigate two types of deep neural networks, namely Stacked denoising Autoencoder (SdA) and Convolutional Neural Network (CNN), which have been extensively studied as high-performance approaches in the fields of automatic speech recognition and visual object recognition. Preliminary experiments have shown the effectiveness of the proposed approaches, a combination of GHDSS-AS and CNN in particular. This combination correctly identified over 80% of sounds in an 8-class sound classification recorded by a hovering quadrotor. In addition, measurements of the prediction time showed that the implemented CNN identifier can run even on a low-end CPU.
  • Takuma Ohata; Keisuke Nakamura; Akihide Nagamine; Takeshi Mizumoto; Takayuki Ishizaki; Ryosuke Kojima; Osamu Sugiyama; Kazuhiro Nakadai
    Journal of Robotics and Mechatronics Fuji Technology Press 29 (1) 177 - 187 1883-8049 2017/02 [Refereed]
    This paper addresses sound source detection in an outdoor environment using a quadcopter with a microphone array. As the previously reported method has a high computational cost, we proposed a sound source detection algorithm called multiple signal classification based on incremental generalized singular value decomposition (iGSVD-MUSIC) that detects the sound source location and temporal activity at low computational cost. In addition, to relax the estimation error problem of a noise correlation matrix that is used in iGSVD-MUSIC, we proposed correlation matrix scaling (CMS) to achieve soft whitening of noise. As CMS requires a parameter to decide the degree of whitening, we analyzed the optimal value of the parameter by using numerical simulation. The prototype system based on the proposed methods was evaluated with two types of microphone arrays in an outdoor environment. The experimental results showed that the proposed iGSVD-MUSIC-CMS significantly improves sound source detection performance, and the prototype system achieves real-time processing. Moreover, we successfully clarified the behavior of the CMS parameter by using a numerical simulation in which the empirically-obtained optimal value corresponded with the analytical result.
  • Kotaro Hoshiba; Osamu Sugiyama; Akihide Nagamine; Ryosuke Kojima; Makoto Kumon; Kazuhiro Nakadai
    Journal of Robotics and Mechatronics Fuji Technology Press 29 (1) 154 - 167 1883-8049 2017/02 [Refereed]
    We have studied robot-audition-based sound source localization using a microphone array embedded on a UAV (unmanned aerial vehicle) to locate people who need assistance in a disaster-stricken area. A localization method with high robustness against noise and a low calculation cost has been proposed to solve a problem specific to the outdoor sound environment. In this paper, the proposed method is extended for practical use, a system based on the method is designed and implemented, and results of sound source localization conducted in the actual outdoor environment are shown. First, a 2.5-dimensional sound source localization method, which is a two-dimensional sound source localization plus distance estimation, is proposed. Then, the offline sound source localization system is structured using the proposed method, and the accuracy of the localization results is evaluated and discussed. As a result, the usability of the proposed extended method and newly developed three-dimensional visualization tool is confirmed, and a change in the detection accuracy for different types or distances of the sound source is found. Next, the sound source localization is conducted in real time by extending the offline system to an online one, to ensure that the detection performance of the offline system is kept in the online system. Moreover, the relationship between the parameters and detection accuracy is evaluated to localize only a target sound source. As a result, indices to determine an appropriate threshold are obtained and localization of a target sound source is realized at a designated accuracy.
  • Santos Luciano Henrique de Oliveira; Okamoto Kazuya; Yamamoto Goshiro; Sugiyama Osamu; Aoyama Tomoki; Kuroda Tomohiro
    Proceedings of the Entertainment Computing Symposium 2017 Information Processing Society of Japan 2017 (2017) 232 - 236 2017 [Refereed]
  • Mizuho Nishio; Mitsuo Nishizawa; Osamu Sugiyama; Ryosuke Kojima; Masahiro Yakami; Tomohiro Kuroda; Kaori Togashi
    CoRR abs/1708.05897 2017 [Refereed]
  • An attempt to obtain patient waiting times from location information and order information in a general hospital
    福士 雄太; 岡本 和也; 平木 秀輔; 杉山 治; 田村 寛; 南部 雅幸; 黒田 知宏
    Proceedings of the Joint Conference on Medical Informatics Japan Association for Medical Informatics 36th (1) 566 - 569 1347-8508 2016/11
  • Integration of infusion and syringe pumps with the hospital information system to prevent human error
    江指 未紗; 杉山 治; 平木 秀輔; 岡本 和也; 田村 寛; 南部 雅幸; 黒田 知宏
    Proceedings of the Joint Conference on Medical Informatics Japan Association for Medical Informatics 36th (2) 1194 - 1197 1347-8508 2016/11
  • Takayuki Morito; Osamu Sugiyama; Ryosuke Kojima; Kazuhiro Nakadai
    This paper addresses sound source separation and identification for noise-contaminated acoustic signals recorded with a microphone array embedded in an Unmanned Aerial Vehicle (UAV), with the aim of detecting people's voices quickly and over a wide area in a disaster situation. The key approach to achieve this is a Deep Neural Network (DNN), but it is well known that training a DNN needs a huge dataset to improve its performance. In a practical application, building such a dataset is often not realistic owing to the cost of manual data annotation. Therefore, we propose a Partially-Shared Deep Neural Network (PS-DNN) which can learn multiple tasks at the same time with a small amount of annotated data. Preliminary results show that, in terms of identification accuracy, the PS-DNN outperforms conventional DNN-based approaches that require fully annotated data in training. In addition, it maintains performance even when noise-suppressed signals are used for sound source separation training, and partially annotated data is used for sound source identification training.
  • Ryosuke Kojima; Osamu Sugiyama; Reiji Suzuki; Kazuhiro Nakadai; Charles E. Taylor
    This paper addresses bird song analysis based on semi-automatic annotation. Research in animal behavior, especially with birds, would be aided by automated (or semi-automated) systems that can localize sounds, measure their timing, and identify their source. This is difficult to achieve in real environments where several birds may be singing from different locations and at the same time. Analysis of recordings from the wild has in the past typically required manual annotation. Such annotation is not always accurate or even consistent, as it may vary both within and between observers. Here we propose a system that uses automated methods from robot audition, including sound source detection, localization, separation and identification. In robot audition these technologies have typically been studied separately; combining them often leads to poor performance in real-time application from the wild. We suggest that integration is aided by placing a primary focus on spatial cues, then combining other features within a Bayesian framework. A second problem has been that supervised machine learning methods typically require a pre-trained model that may require a large training set of annotated labels. We have employed a semi-automatic annotation approach that requires much less pre-annotation. Preliminary experiments with recordings of bird songs from the wild revealed that for identification accuracy our system outperformed a method based on conventional robot audition.
  • Purnomo Husnul Khotimah; Masatoshi Yoshikawa; Akihiro Hamasaki; Osamu Sugiyama; Kazuya Okamoto; Tomohiro Kuroda
    Frequent sequential pattern (FSP) mining has become an effective tool for exploring the occurrence of sequential patterns in many fields. The methods developed for FSP mining are mainly based on the Apriori algorithm. This algorithm looks for frequent sequences of itemsets that need not be consecutive. In addition, the itemset that supports the cardinality of a frequent sequence can be a partial itemset. However, in the case of medication for type 2 diabetes, the selection of patient medication is considered essential. A combination of medications represents the clinical condition of the patient. Therefore, we considered a medication combination as one full itemset (i.e., a singleton). We are interested in the transition events from one medication episode to the next. As such, we consider consecutive sequences of singletons. This paper studies the characteristics of the results of Apriori-based FSP mining and singleton mining. The results show that the singleton mining result set is a subset of the Apriori-based result set, with a ratio of 0.203. However, the Apriori-based result set contains frequent sequential patterns of medication transition events that are unlikely to occur with high frequency in real clinical conditions. By contrast, the singleton mining result set represents the true medication transition events.
  • Takayuki Morito; Osamu Sugiyama; Satoshi Uemura; Ryosuke Kojima; Kazuhiro Nakadai
    This paper addresses reduction of computational cost in training of a Deep Neural Network (DNN), in particular, for sound identification using highly noise-contaminated sound recorded with a microphone array embedded in an Unmanned Aerial Vehicle (UAV), aiming at people's voice detection quickly and widely in a disaster situation. It is known that a DNN training method called end-to-end training shows high performance, since it uses a huge neural network with high nonlinearity which is trained with a large amount of raw input signals without preprocessing. Its computational cost is, however, expensive due to the high complexity of the neural network. Therefore, we propose two-stage DNN training using two separately-trained networks: one for denoising of sound sources and one for sound source identification. Since the huge network is divided into two smaller networks, the complexity of the networks is expected to decrease and each of them can consider a specific model of denoising and identification. This results in faster convergence and computational cost reduction in DNN training. Preliminary results showed that only 71% of training time was necessary with the proposed two-stage network, while maintaining the accuracy of sound source identification, compared to end-to-end training using noisy acoustic signals recorded with an 8 ch circular microphone array embedded in a UAV.
  • Ryosuke Kojima; Osamu Sugiyama; Kazuhiro Nakadai
    APPLIED ARTIFICIAL INTELLIGENCE TAYLOR & FRANCIS INC 30 (3) 181 - 200 0883-9514 2016 [Refereed]
    We propose a multimodal "scene understanding" framework using sensory and text information. Scene understanding is defined by extracting information such as What, When, Where, Who, Why, and How on the surrounding environment. Although scene understanding has been studied, information on why and how was not considered. We constructed a framework for extracting how information, in addition to the conventional information based on multimodality and background knowledge. This framework was applied to a cooking scene, in which how information was defined as a cooking procedure. This framework was evaluated by constructing an audio-visual multimodal cooking recognition system, utilizing recipes as background knowledge. A Convolutional Neural Network (CNN) and a Hierarchical Hidden Markov Model (HHMM) were adopted in this system. Our experiments showed the robustness of the proposed framework in noisy and/or occluded situations. An interactive cooking support system based on the proposed framework might suggest the next step for cooking procedures via human-robot communications.
  • Uemura Satoshi; Sugiyama Osamu; Kojima Ryosuke; Nakadai Kazuhiro
    The Abstracts of the international conference on advanced mechatronics : toward evolutionary fusion of IT and mechatronics : ICAM The Japan Society of Mechanical Engineers 2015 329 - 330 1348-8961 2015 
    We present acoustic event identification by integration of sound source separation and deep learning based on a convolutional neural network for extremely noisy acoustic signals captured with a 16 ch microphone array embedded in an Unmanned Aerial Vehicle (UAV). We showed that the proposed method can identify over 98% of sound sources correctly for a 10-class classification task using 16 ch sound data recorded with a microphone array embedded in a quadrotor.
  • Multimodal scene understanding using CNN and hierarchical HMM for a cooking support robot
    Ryosuke Kojima; Osamu Sugiyama; Kazuhiro Nakadai
    Machine Learning Summer School 2015
  • Ryosuke Kojima; Osamu Sugiyama; Kazuhiro Nakadai
    2015 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) IEEE 2015-December 4210 - 4215 2153-0858 2015 [Refereed]
    This paper addresses multimodal "scene understanding" for a robot using audio-visual and text information. Scene understanding is defined by extracting six-W information such as What, When, Where, Who, Why, and hoW on the surrounding environment. Although scene understanding for a robot has been studied in the fields of robot vision and audition, only the first four Ws, excluding why and how information, were considered. We, thus, focus on extracting how information, in particular, on cooking scenes. In cooking scenes, we define how information as a cooking procedure, and it is useful for a robot to give appropriate advice for cooking. To realize such cooking support, we propose a multimodal cooking procedure recognition framework consisting of a Convolutional Neural Network (CNN) and a Hierarchical Hidden Markov Model (HHMM). The CNN is known as one of the most advanced classifiers, and it is applied to recognize cooking events from audio and visual information. The HHMM models a cooking procedure represented by a sequence of cooking events, which is defined as a relationship between cooking events using text data obtained from the web, and the cooking events classified with the CNN. Therefore, our proposed framework integrates these three types of modalities. We constructed an interactive cooking support system based on the proposed framework, which advises the next step in the current cooking procedure through human-robot communication. Preliminary results with simulated and real recorded multimodal scenes showed the robustness of the proposed framework in a noisy and/or occluded situation.
  • Ryosuke Kojima; Osamu Sugiyama; Kazuhiro Nakadai
    We address noise-robust "auditory scene understanding" for a robot defined by extracting 6W (What, When, Where, Who, Why, hoW) information on the surrounding environment. Although such a robot has been studied in the field of robot audition, only the first four Ws, excluding "why" and "how", were in scope. Thus, this paper mainly focuses on extracting "how" information, in particular, on cooking scenes to realize a cooking support robot. In this case, "how" information is regarded as a cooking procedure, and we construct sound-based cooking procedure recognition based on two models. One is a conventional statistical model, the Gaussian Mixture Model (GMM), which is used as an acoustic model to recognize a cooking sound event such as stirring or cutting. The other is a Hierarchical Hidden Markov Model (HHMM), which is used as a recipe model to recognize a sequence of cooking events, i.e., a cooking procedure. We constructed a prototype system for cooking recipe and procedure recognition. Preliminary results showed that the proposed GMM-HHMM based system outperformed a conventional GMM-HMM based system in terms of noise-robustness in cooking recipe recognition, and that our system was able to correct misrecognition of cooking sound events using the recipe model in cooking procedure recognition.
  • Osamu Sugiyama; Ryosuke Kojima; Kazuhiro Nakadai
    In this study, we designed and developed an interactive interface to optimize sound source localization with the multi-channel robot audition software, HARK. The developed interface lightens the load of optimizing parameters and helps users handle parameter optimization in sound source localization easily. To handle multi-channel sounds properly, it is better to indicate the parameter dynamically from both temporal and spatial perspectives, though almost all software can only indicate a static threshold. We developed an interactive interface with which the user can create or delete a sound source on the MUSIC spectrum and set up appropriate parameter settings for the environment. We also conducted an evaluation of the software and found that our proposed interface was superior to the current HARK interface in terms of intuitiveness and visibility.
  • Osamu Sugiyama; Ryosuke Kojima; Kazuhiro Nakadai
    2015 IEEE-RAS 15TH INTERNATIONAL CONFERENCE ON HUMANOID ROBOTS (HUMANOIDS) IEEE 2015-December 825 - 830 2164-0572 2015 [Refereed]
    This study describes the design and development of an interactive interface to optimize sound source localization (SSL) based on a microphone array with the multi-channel robot audition software program, HARK. Using this interface, users can optimize the parameters of SSL with coarse-to-fine tuning. That is, users can roughly optimize the system (coarse tuning) and subsequently tune parameters in detail (fine tuning). Experimentally, our proposed interface showed better visibility and flexibility in optimizing SSL parameters than the current HARK interface.
  • Osamu Sugiyama; Katsutoshi Itoyama; Kazuhiro Nakada; Hiroshi G. Okuno
    2014 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC) IEEE 2014-January (January) 2335 - 2340 1062-922X 2014 [Refereed]
    With the rise of inexpensive microphone array products and the robot audition software called HARK, we can record and analyze multidirectional sound sources easily. The combination of a microphone array and the software enables us to separate, localize, and track multidirectional sound sources. Most of the solutions for accessing this separated sound source information provide clients for interpreting simplified information about the separated sources, but not for directly executing semantic annotation. Since multidirectional sound annotation requires simultaneous labeling of separated sound sources and a multidirectional overview of the sources, it is essential to have an efficient way of annotation and an intuitive view of multidirectional sounds. Our proposed sound annotation tool provides drag & drop annotation with a 3D sound source view and also provides annotation autocompletion with an SVM trained with the user's annotation history. The proposed features enable users to do the annotation task intuitively and confirm its result. We also conducted an evaluation demonstrating the efficiency of annotation done using the tool.
  • Takahiro Iyama; Osamu Sugiyama; Takuma Otsuka; Katsutoshi Itoyama; Hiroshi G. Okuno
    We have developed a system for visualizing auditory awareness on the basis of sound source locations estimated using a depth sensor and microphone array. Previous studies on visualizing the acoustic environment overlaid the sound pressure levels directly on the captured image, so the visualization was often based on a mixture of several sound sources. As a result, it was not intuitive which targets to focus on. To help users selectively find the targets and focus on analyzing them, the captured acoustic information should be extracted and selectively presented according to user demand. We have designed a three-layer visualization model for auditory awareness consisting of a sound source distribution layer, a sound location layer, and a sound saliency layer. The model extracts acoustic information by using the depth image and multi-directional sound sources captured with a depth sensor and microphone array. This model is used in the system we developed for visualizing auditory awareness.
  • Osamu Sugiyama; Kazuhiko Shinozawa; Takaaki Akimoto; Norihiro Hagita
    SOCIAL ROBOTICS, ICSR 2010 SPRINGER-VERLAG BERLIN 6414 90 - 99 0302-9743 2010 [Refereed]
    This paper reports the docking and metaphor effects on persuasion among multi-robot healthcare systems. The goal of our research is to make a robot friend that lives with its users and persuades them to make appropriate healthcare decisions. To realize such a robot friend, we propose a physical approach called docking as well as a contextual approach called metaphor to perform relational inheritance among multi-robot systems. We implemented a multi-robot persuasion system based on the two approaches and verified its effectiveness. The experimental results revealed that users emphasize interpersonal relationships to decide whether to follow the robot's advice when utilizing the metaphor approach, and that users emphasize robot aggressiveness when utilizing the docking approach.
  • Osamu Sugiyama; Takayuki Kanda; Michita Imai; Hiroshi Ishiguro; Norihiro Hagita
    A simple view of deictic communication only includes the indication process and recognition process: a person points at an object and says something about it such as "look at this," and then the other person recognizes the pointing gesture and pays attention to the indicated object. However, this simple view lacks three important processes: attention synchronization, context focus, and believability establishment. We refer to these three processes as "facilitation processes" and implement them in a humanoid robot with a motion capturing system. An experiment with 30 subjects revealed that the facilitation processes make deictic communication natural.
  • Osamu Sugiyama; Takayuki Kanda; Michita Imai; Hiroshi Ishiguro; Norihiro Hagita; Yuichiro Anzai
    CONNECTION SCIENCE TAYLOR & FRANCIS LTD 18 (4) 379 - 402 0954-0091 2006/12 [Refereed]
    When describing a physical object, we indicate which object by pointing and using reference terms, such as 'this' and 'that', to inform the listener quickly of an indicated object's location. Therefore, this research proposes using a three-layer attention-drawing model for humanoid robots that incorporates such gestures and verbal cues. The proposed three-layer model consists of three sub-models: the Reference Term Model (RTM); the Limit Distance Model (LDM); and the Object Property Model (OPM). The RTM selects an appropriate reference term for distance, based on a quantitative analysis of human behaviour. The LDM decides whether to use a property of the object, such as colour, as an additional term for distinguishing the object from its neighbours. The OPM determines which property should be used for this additional reference. Based on this concept, an attention-drawing system was developed for a communication robot named 'Robovie', and its effectiveness was tested.
  • Osamu Sugiyama; Takayuki Kanda; Michita Imai; Hiroshi Ishiguro; Norihiro Hagita
    This paper presents a three-layer model for generation and recognition of attention-drawing behavior. The model enables a robot to recognize people's attention-drawing behavior as well as to perform attention-drawing behavior to people. It consists of three layers: the PSM (Pointing Space Model), the RTM (Reference Term Model), and the OPM (Object Property Model). The PSM associates the pointing gesture with a reference term, the RTM associates positional relationships with a reference term, and the OPM associates other supplemental verbal cues with a reference term. We implemented the model in a humanoid robot, Robovie, and verified its effectiveness through an experiment.
  • O Sugiyama; T Kanda; M Imai; H Ishiguro; N Hagita
    2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vols 1-4 IEEE 2140 - 2145 2005 [Refereed]
    When we talk about objects in an environment, we indicate to a listener which object is currently under consideration by using pointing gestures and such reference terms as "this" and "that". Such reference terms play an important role in human interaction by quickly informing the listener of an indicated object's location. In this research, we propose a three-layered draw-attention model for humanoid robots with gestures and verbal cues. Our proposed three-layered model consists of three sub-models: the Reference Term Model (RTM), the Limit Distance Model (LDM) and the Object Property Model (OPM). The RTM decides an appropriate reference term using functions constructed by an analysis of human behavior. The LDM decides whether to use the object's property with a reference term. The OPM decides the appropriate property for indicating the object by comparing object properties with each other. We developed an attention-drawing system in a communication robot named "Robovie" based on the three-layered model. We confirmed its effectiveness through experiments.


Research Themes

  • Japan Society for the Promotion of Science:Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (C)
    Date (from‐to) : 2022/04 -2025/03 
    Author : 西尾 瑞穂; 藤本 晃司; 杉山 治; 倉田 靖桐
  • Japan Society for the Promotion of Science:Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (A)
    Date (from‐to) : 2021/04 -2025/03 
    Author : 倉田 真宏; 趙 晃済; 小島 紘太郎; 杉山 治; 藤田 皓平; 大鶴 繁; 金尾 伊織
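    We are identifying the structural elements, non-structural elements, critical equipment, and facilities required for specific medical procedures, and constructing dependency diagrams. For a neonatal intensive care unit, we drafted a dependency diagram using system dynamics theory, which describes the causal relationships between model elements diagrammatically. We collected data on the seismic vulnerability of each element and constructed damage-rate functions based on the lognormal distribution. We analyzed the management of the unit and modeled the personnel required for medical procedures as well as the personnel able to assemble after an earthquake. We succeeded in outputting the time-series change in the number of available beds after an earthquake. However, provisional model data are used for elements with insufficient data. To evaluate the loss of structural safety and functionality of medical facility buildings caused by earthquakes and other disasters, we used ground motion data anticipated for Kyoto City and proposed a simple method for examining how variability in the input ground motion affects building damage. The input ground motion was replaced by a double impulse input consisting of two impulses, and the influence of the input amplitude and the impulse interval on the maximum ductility factor of the building was evaluated. By evaluating the maximum ductility factor of a single-degree-of-freedom elastoplastic system under double impulse inputs with various amplitudes and impulse intervals, we clarified the effect of input variability on the maximum ductility factor for the anticipated ground motions. To ensure the business continuity of facilities that are critical during disasters, it is essential to establish seismic performance evaluation methods for non-structural components as well. In fiscal year 2021, to clarify seismic damage to rooftop water supply equipment and to collect the data needed to predict such damage, we conducted shaking table tests on specimens of piping laid on support stands and collected data on damage states and on acceleration and displacement responses. Flexible piping, which is used to reduce piping damage, showed larger acceleration and displacement responses than straight piping, and the height of the support stands was shown to affect piping damage. In parallel, we conducted cyclic bending tests on pipes and clarified the strength and deformation capacity of pipes with different diameters and materials.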
  • Japan Society for the Promotion of Science:Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)
    Date (from‐to) : 2020/04 -2023/03 
    Author : 三宅 正裕; 辻川 明孝; 杉山 治; 山田 亮; 長崎 正朗
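    This fiscal year, we focused on examining genes associated with the development of pachychoroid neovasculopathy from central serous chorioretinopathy (CSC). Specifically, in CSC patients without choroidal neovascularization (CNV), we performed a genome-wide survival analysis of the time to CNV onset. As a result, ARMS2, known as a susceptibility gene for age-related macular degeneration (AMD), reached genome-wide significance. Several other candidate single-nucleotide variants were also identified, so we tested their reproducibility in a Kobe University dataset, but no other variant was both replicated and significant at the genome-wide level. We also examined CFH, known as a susceptibility gene for both CSC and AMD, but it did not show a genome-wide significant association. As an additional analysis, we checked whether the incidence of neovascularization differed according to an AMD genetic risk score calculated from AMD susceptibility variants, and confirmed that CSC patient groups with higher AMD genetic risk scores were more likely to develop neovascularization. These results were presented at the 124th Annual Meeting of the Japanese Ophthalmological Society and at the international conference EURETINA 2020, and a manuscript is currently being drafted. In addition, we are continuing the search for CSC susceptibility genes. This involves international collaboration: data from the Netherlands and Hong Kong are being analyzed, and we are currently awaiting data from China.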
  • Japan Society for the Promotion of Science:Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (C)
    Date (from‐to) : 2019/04 -2021/03 
    Author : 岡本 和也; 黒田 知宏; 加藤 源太; 山本 豪志朗; 杉山 治
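    This research is organized into four sub-themes: summarizing medical records and extracting important passages using discharge summaries as training data; extracting important time points using various clinical records; developing a method for linking the various clinical records to the medical record; and developing an effective method for presenting medical records. This fiscal year, we selected the clinical records that can determine the important time points for each disease and extracted those time points by detecting change points in the records. For diabetes, for example, blood test results and prescription data are strong candidates for such records, and we extracted the time points at which a test value exceeded a threshold and at which the prescription was switched. Furthermore, we developed a method for linking the various clinical records to the medical record, and matched the clinical records at a given important time point against the content of the medical record. Because matching simply by date and time links many unrelated medical record entries, we analyzed the features of each clinical record and identified associations between those features and the medical record. Clustering was used for this purpose.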
  • Japan Society for the Promotion of Science:Grants-in-Aid for Scientific Research Grant-in-Aid for Scientific Research (B)
    Date (from‐to) : 2018/04 -2021/03 
    Author : Noma Haruo
    In this study, we constructed a machine haptic recognition model using an ultra-small tactile sensor and a staged neural network that mimics a human tactile recognition model, and aimed to realize human-like machine haptic perception and elucidate human haptic recognition based on the similarity between machine and human haptic perception. Here, we focused on the tracing recognition task, in which eight kinds of materials are identified by tracing them, and evaluated the recognition rate for five DNN models (Dense, 1D-CNN, RNN, LSTM, and GRU) and different numbers of layers (1-4). As a result, the highest identification rate of 97.8% was achieved. In addition, the intermediate features obtained in the course of the identification process were similar to the features that humans show.
  • Japan Society for the Promotion of Science:Grants-in-Aid for Scientific Research Grant-in-Aid for Challenging Research (Exploratory)
    Date (from‐to) : 2017/06 -2019/03 
    Author : Noma Haruo
    DMAT is a professional medical team that can carry out rescue operations in the acute phase of a large-scale disaster or an accident in which many people are injured. DMAT has used traditional communication tools such as telephone and mail to share information between the dispatched team and the rear support members. As smartphones have become popular in recent years, Kyoto University DMAT employed LINE for the first time in a rescue operation during the Kumamoto earthquake that occurred in 2016. LINE is the most familiar tool from daily life and allowed information to be shared in the field. However, as the number of chat groups increased, the SNS-like group chat feature specific to LINE did not work well, and the information overload buried important information among many conversations. In this research, we identified the problems in the means of communication that affect such relief operations and proposed a new design of the communication method for DMAT.