Paper Title
Privacy-Protecting Behaviours of Risk Detection in People with Dementia using Videos
Paper Authors
Paper Abstract
People living with dementia often exhibit behavioural and psychological symptoms of dementia that can put their own and others' safety at risk. Existing video surveillance systems in long-term care facilities can be used to monitor such behaviours of risk and alert staff so that they can prevent potential injuries or, in some cases, death. However, behaviours of risk are heterogeneous and infrequent in comparison to normal events. Moreover, analyzing raw videos can raise privacy concerns. In this paper, we present two novel privacy-protecting video-based anomaly detection approaches to detect behaviours of risk in people with dementia. We either extract body pose information as skeletons or use semantic segmentation masks to replace multiple humans in the scene with their semantic boundaries. Our work differs from most existing approaches for video anomaly detection, which focus on appearance-based features that can put a person's privacy at risk and are also susceptible to pixel-based noise, including illumination and viewing direction. We use anonymized videos of normal activities to train customized spatio-temporal convolutional autoencoders and identify behaviours of risk as anomalies. We show our results on a real-world study conducted in a dementia care unit with patients with dementia, containing approximately 21 hours of normal activities data for training and 9 hours of data containing both normal events and behaviours of risk for testing. We compare our approaches with the original RGB videos and obtain a similar area under the receiver operating characteristic curve: 0.807 for the skeleton-based approach and 0.823 for the segmentation mask-based approach.
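The train-on-normal, score-by-reconstruction-error idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the spatio-temporal convolutional autoencoder is replaced by a hypothetical stand-in (`fit_mean_reconstructor`) that reconstructs every window as the element-wise mean of the normal training windows, and the toy windows are made-up numbers rather than data from the study. Only the overall workflow (fit on normal data, use reconstruction error as the anomaly score, evaluate with the area under the ROC curve) follows the abstract.

```python
from typing import Callable, List, Sequence

Reconstructor = Callable[[Sequence[float]], Sequence[float]]


def fit_mean_reconstructor(normal_windows: Sequence[Sequence[float]]) -> Reconstructor:
    """Hypothetical stand-in for a trained autoencoder (assumption, not the paper's model).

    It "reconstructs" any input as the element-wise mean of the normal
    training windows, so inputs far from normal get a large error.
    """
    n = len(normal_windows)
    template = [sum(column) / n for column in zip(*normal_windows)]
    return lambda window: template


def reconstruction_error(window: Sequence[float], reconstruct: Reconstructor) -> float:
    """Mean squared error between a window and its reconstruction (the anomaly score)."""
    recon = reconstruct(window)
    return sum((x - y) ** 2 for x, y in zip(window, recon)) / len(window)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen anomalous window (label 1)
    scores higher than a randomly chosen normal window (label 0); ties count half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up data: each window is a flattened list of feature values.
train_normal: List[List[float]] = [[0.0, 0.1, 0.0, 0.1], [0.1, 0.0, 0.1, 0.0]]
test_windows: List[List[float]] = [[0.0, 0.1, 0.1, 0.0], [0.9, 0.8, 0.9, 1.0]]
test_labels: List[int] = [0, 1]  # 1 marks a behaviour-of-risk window

model = fit_mean_reconstructor(train_normal)
test_scores = [reconstruction_error(w, model) for w in test_windows]
toy_auc = roc_auc(test_scores, test_labels)
```

In a real pipeline the windows would be skeleton or segmentation-mask frame sequences and the reconstructor a trained autoencoder; the scoring and evaluation steps would keep the same shape.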