{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T17:44:10Z","timestamp":1777657450082,"version":"3.51.4"},"reference-count":52,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2020,5,1]],"date-time":"2020-05-01T00:00:00Z","timestamp":1588291200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/R005273\/1"],"award-info":[{"award-number":["EP\/R005273\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>The use of visual sensors for monitoring people in their living environments is critical in processing more accurate health measurements, but their use is undermined by the issue of privacy. Silhouettes, generated from RGB video, can help towards alleviating the issue of privacy to some considerable degree. However, the use of silhouettes would make it rather complex to discriminate between different subjects, preventing a subject-tailored analysis of the data within a free-living, multi-occupancy home. This limitation can be overcome with a strategic fusion of sensors that involves wearable accelerometer devices, which can be used in conjunction with the silhouette video data, to match video clips to a specific patient being monitored. The proposed method simultaneously solves the problem of Person ReID using silhouettes and enables home monitoring systems to employ sensor fusion techniques for data analysis. We develop a multimodal deep-learning detection framework that maps short video clips and accelerations into a latent space where the Euclidean distance can be measured to match video and acceleration streams. We train our method on the SPHERE Calorie Dataset, for which we show an average area under the ROC curve of 76.3% and an assignment accuracy of 77.4%. In addition, we propose a novel triplet loss for which we demonstrate improving performances and convergence speed.<\/jats:p>","DOI":"10.3390\/s20092576","type":"journal-article","created":{"date-parts":[[2020,5,4]],"date-time":"2020-05-04T14:00:43Z","timestamp":1588600843000},"page":"2576","update-policy":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Person Re-ID by Fusion of Video Silhouettes and Wearable Signals for Home Monitoring Applications"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/orcid.org\/0000-0002-6510-835X","authenticated-orcid":false,"given":"Alessandro","family":"Masullo","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of Bristol, Bristol BS8 1UB, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tilo","family":"Burghardt","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Bristol, Bristol BS8 1UB, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/orcid.org\/0000-0001-8804-6238","authenticated-orcid":false,"given":"Dima","family":"Damen","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Bristol, Bristol BS8 1UB, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/orcid.org\/0000-0002-1676-3729","authenticated-orcid":false,"given":"Toby","family":"Perrett","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Bristol, Bristol BS8 1UB, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/orcid.org\/0000-0002-6478-1403","authenticated-orcid":false,"given":"Majid","family":"Mirmehdi","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Bristol, Bristol BS8 1UB, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,5,1]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Maskeli\u016bnas, R., Dama\u0161evi\u010dius, R., and Segal, S. (2019). A Review of Internet of Things Technologies for Ambient Assisted Living Environments. Future Internet, 11.","DOI":"10.3390\/fi11120259"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1007\/s12652-015-0328-1","article-title":"Vision-based patient monitoring: A comprehensive review of algorithms and technologies","volume":"9","author":"Sathyanarayana","year":"2018","journal-title":"J. Ambient Intell. Humaniz. Comput."},{"key":"ref_3","unstructured":"Zagler, W., Panek, P., and Rauhala, M. (2008). Ambient Assisted Living Systems\u2014The Conflicts between Technology, Acceptance, Ethics and Privacy. Assisted Living Systems\u2014Models, Architectures and Engineering Approaches, Schloss Dagstuhl."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Ziefle, M., Rocker, C., and Holzinger, A. (2011, January 18\u201322). Medical Technology in Smart Homes: Exploring the User\u2019s Perspective on Privacy, Intimacy and Trust. Proceedings of the IEEE Annual Computer Software and Applications Conference Workshops, Munich, Germany.","DOI":"10.1109\/COMPSACW.2011.75"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Birchley, G., Huxtable, R., Murtagh, M., ter Meulen, R., Flach, P., and Gooberman-Hill, R. (2017). Smart homes, private homes? An empirical study of technology researchers\u2019 perceptions of ethical issues in developing smart-home health technologies. BMC Med. Ethics, 18.","DOI":"10.1186\/s12910-017-0183-z"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Hall, J., Hannuna, S., Camplani, M., Mirmehdi, M., Damen, D., Burghardt, T., Tao, L., Paiement, A., and Craddock, I. (2016, January 24\u201325). Designing a Video Monitoring System for AAL applications: The SPHERE Case Study. Proceedings of the IET International Conference on Technologies for Active and Assisted Living, London, UK.","DOI":"10.1049\/ic.2016.0061"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"10873","DOI":"10.1016\/j.eswa.2012.03.005","article-title":"A review on vision techniques applied to Human Behaviour Analysis for Ambient-Assisted Living","volume":"39","author":"Chaaraoui","year":"2012","journal-title":"Expert Syst. Appl."},{"key":"ref_8","unstructured":"Masullo, A., Burghardt, T., Damen, D., Hannuna, S., Ponce-Lopez, V., and Mirmehdi, M. (2018, January 3\u20136). CaloriNet: From silhouettes to calorie estimation in private environments. Proceedings of the British Machine Vision Conference, Newcastle, UK."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Masullo, A., Burghardt, T., Perrett, T., Damen, D., and Mirmehdi, M. (2019). Sit-to-Stand Analysis in the Wild Using Silhouettes for Longitudinal Health Monitoring. Image Analysis and Recognition, Springer Nature Switzerland.","DOI":"10.1007\/978-3-030-27272-2_15"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"756","DOI":"10.1109\/JBHI.2016.2570300","article-title":"Silhouette Orientation Volumes for Efficient Fall Detection in Depth Videos","volume":"21","author":"Aslan","year":"2017","journal-title":"IEEE J. Biomed. Health Inform."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1016\/j.jbi.2016.08.003","article-title":"A vision based proposal for classification of normal and abnormal gait using RGB camera","volume":"63","year":"2016","journal-title":"J. Biomed. Inform."},{"key":"ref_12","unstructured":"Leo, M., and Farinella, G.M. (2018). Chapter 6\u2014Computer Vision for Ambient Assisted Living: Monitoring Systems for Personalized Healthcare and Wellness That Are Robust in the Real World and Accepted by Users, Carers, and Society. Computer Vision for Assistive Healthcare, Academic Press. Computer Vision and Pattern Recognition."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1109\/MIS.2015.57","article-title":"Bridging e-Health and the Internet of Things: The SPHERE Project","volume":"30","author":"Zhu","year":"2015","journal-title":"IEEE Intell. Syst."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"e021862","DOI":"10.1136\/bmjopen-2018-021862","article-title":"Using home sensing technology to assess outcome and recovery after hip and knee replacement in the UK: The HEmiSPHERE study protocol","volume":"8","author":"Grant","year":"2018","journal-title":"BMJ Open"},{"key":"ref_15","unstructured":"Masullo, A., Burghardt, T., Damen, D., Perrett, T., and Mirmehdi, M. (November, January 27). Who Goes There? Exploiting Silhouettes and Wearable Signals for Subject Identification in Multi-Person Environments. Proceedings of the IEEE International Conference on Computer Vision Workshops, Seoul, Korea."},{"key":"ref_16","unstructured":"Tao, L. (2016). SPHERE-Calorie, University of Bristol."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Tao, L., Burghardt, T., Mirmehdi, M., Damen, D., Cooper, A., Hannuna, S., Camplani, M., Paiement, A., and Craddock, I. (2017). Calorie Counter: RGB-Depth Visual Estimation of Energy Expenditure at Home, Springer. Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-319-54407-6_16"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Yao, Z., Wu, X., Xiong, Z., and Ma, Y. (2019). A Dynamic Part-Attention Model for Person Re-Identification. Sensors, 19.","DOI":"10.3390\/s19092080"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Gohar, I., Riaz, Q., Shahzad, M., Ul Hasnain Hashmi, M.Z., Tahir, H., and Ehsan Ul Haq, M. (2020). Person Re-Identification Using Deep Modeling of Temporally Correlated Inertial Motion Patterns. Sensors, 20.","DOI":"10.3390\/s20030949"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Zeng, Z., Wang, Z., Wang, Z., Zheng, Y., Chuang, Y.Y., and Satoh, S. (2020). Illumination-adaptive person re-identification. IEEE Trans. Multimed.","DOI":"10.1109\/TMM.2020.2969782"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"270","DOI":"10.1016\/j.imavis.2014.02.001","article-title":"A survey of approaches and trends in person re-identification","volume":"32","author":"Shah","year":"2014","journal-title":"Image Vis. Comput."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1016\/j.neucom.2019.01.079","article-title":"Deep learning-based methods for person re-identification: A comprehensive review","volume":"337","author":"Wu","year":"2019","journal-title":"Neurocomputing"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Layne, R., Hannuna, S., Camplani, M., Hall, J., Hospedales, T.M., Xiang, T., Mirmehdi, M., and Damen, D. (2017, January 21\u201326). A Dataset for Persistent Multi-target Multi-camera Tracking in RGB-D. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.189"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1007\/978-1-4471-6296-4_8","article-title":"One-Shot Person Re-identification with a Consumer Depth Camera","volume":"Volume 6","author":"Munaro","year":"2014","journal-title":"Person Re-Identification"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3243043","article-title":"Gait-based Person Re-identification","volume":"52","author":"Nambiar","year":"2019","journal-title":"ACM Comput. Surv."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1505","DOI":"10.1109\/TPAMI.2003.1251144","article-title":"Silhouette analysis-based gait recognition for human identification","volume":"25","author":"Wang","year":"2003","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Gou, M., Zhang, X., Rates-Borras, A., Asghari-Esfeden, S., Sznaier, M., and Camps, O. (2016). Person Re-identification in Appearance Impaired Scenarios. arXiv, Available online: https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/arxiv.org\/abs\/1604.00367.","DOI":"10.5244\/C.30.48"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhang, P., Wu, Q., Xu, J., and Zhang, J. (2018, January 12\u201315). Long-Term Person Re-identification Using True Motion from Videos. Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Lake Tahoe, NV, USA.","DOI":"10.1109\/WACV.2018.00060"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"070186","DOI":"10.1155\/2007\/70186","article-title":"Audiovisual Speech Synchrony Measure: Application to Biometrics","volume":"2007","author":"Bredin","year":"2007","journal-title":"EURASIP J. Adv. Signal Process."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Arandjelovic, R., and Zisserman, A. (2017, January 22\u201329). Look, Listen and Learn. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.73"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Roth, J., Chaudhuri, S., Klejch, O., Marvin, R., Gallagher, A., Kaver, L., Ramaswamy, S., Stopczynski, A., Schmid, C., and Xi, Z. (2020, January 4\u20138). AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection. Proceedings of the ICASSP 2020\u20142020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain.","DOI":"10.1109\/ICASSP40776.2020.9053900"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1016\/j.cviu.2018.02.001","article-title":"Learning to lip read words by watching videos","volume":"173","author":"Chung","year":"2018","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_33","unstructured":"Korbar, B., Tran, D., and Torresani, L. (2018, January 3\u20138). Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization. Proceedings of the 2018 Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Teixeira, T., Jung, D., and Savvides, A. (2010, January 26\u201329). Tasking networked CCTV cameras and mobile phones to identify and localize multiple people. Proceedings of the ACM International Conference on Ubiquitous Computing, Copenhagen, Denmark.","DOI":"10.1145\/1864349.1864367"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"419","DOI":"10.1016\/j.jvcir.2017.03.015","article-title":"Combining passive visual cameras and active IMU sensors for persistent pedestrian tracking","volume":"48","author":"Jiang","year":"2017","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Henschel, R., Marcard, T.V., and Rosenhahn, B. (2019, January 16\u201320). Simultaneous Identification and Tracking of Multiple People Using Video and IMUs. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA.","DOI":"10.1109\/CVPRW.2019.00106"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Jimenez, A., Seco, F., Prieto, C., and Guevara, J. (2009, January 26\u201328). A comparison of Pedestrian Dead-Reckoning algorithms using a low-cost MEMS IMU. Proceedings of the IEEE International Symposium on Intelligent Signal Processing, Budapest, Hungary.","DOI":"10.1109\/WISP.2009.5286542"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Shigeta, O., Kagami, S., and Hashimoto, K. (2008, January 22\u201326). Identifying a moving object with an accelerometer in a camera view. Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems, Nice, France.","DOI":"10.1109\/IROS.2008.4651201"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Rofouei, M., Wilson, A., Brush, A., and Tansley, S. (2012, January 5\u201310). Your phone or mine?: Fusing body, touch and device sensing for multi-user device-display interaction. Proceedings of the ACM Annual Conference on Human Factors in Computing Systems, Austin, TX, USA.","DOI":"10.1145\/2207676.2208332"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Wilson, A.D., and Benko, H. (2014, January 12\u201316). Crossmotion: Fusing device and image motion for user identification, tracking and device association. Proceedings of the International Conference on Multimodal Interaction, Istanbul, Turkey.","DOI":"10.1145\/2663204.2663270"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Cabrera-Quiros, L., and Hung, H. (2016, January 15\u201319). Who is where? Matching People in Video to Wearable Acceleration During Crowded Mingling Events. Proceedings of the ACM on Multimedia Conference, Amsterdam, The Netherlands.","DOI":"10.1145\/2964284.2967224"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1867","DOI":"10.1109\/TMM.2018.2888798","article-title":"A Hierarchical Approach for Associating Body-Worn Sensors to Video Regions in Crowded Mingling Scenarios","volume":"21","author":"Hung","year":"2019","journal-title":"IEEE Trans. Multimed."},{"key":"ref_43","unstructured":"(2020, April 30). OpenNI. Available online: https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/structure.io\/openni."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Schroff, F., Kalenichenko, D., and Philbin, J. (2015, January 7\u201312). FaceNet: A unified embedding for face recognition and clustering. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1109\/MSP.2012.2205597","article-title":"Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups","volume":"29","author":"Hinton","year":"2012","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Bredin, H. (2017, January 5\u20139). TristouNet: Triplet loss for speaker turn embedding. Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, New Orleans, LA, USA.","DOI":"10.1109\/ICASSP.2017.7953194"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Torfi, A., Dawson, J., and Nasrabadi, N.M. (2018, January 23\u201327). Text-Independent Speaker Verification Using 3D Convolutional Neural Networks. Proceedings of the IEEE International Conference on Multimedia and Expo, San Diego, CA, USA.","DOI":"10.1109\/ICME.2018.8486441"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"4405","DOI":"10.1007\/s11042-015-3177-1","article-title":"A survey of depth and inertial sensor fusion for human action recognition","volume":"76","author":"Chen","year":"2017","journal-title":"Multimed. Tools Appl."},{"key":"ref_49","unstructured":"Lagadec, R., Pelloni, D., and Weiss, D. (1982, January 3\u20135). A 2-channel, 16-bit digital sampling frequency converter for professional digital audio. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Paris, France."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Cao, Z., Simon, T., Wei, S.E., and Sheikh, Y. (2018). Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. arXiv, Available online: https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/arxiv.org\/abs\/1812.08008.","DOI":"10.1109\/CVPR.2017.143"},{"key":"ref_51","unstructured":"Cabrera-Quiros, L., Demetriou, A., Gedik, E., van der Meij, L., and Hung, H. (2018). The MatchNMingle dataset: A novel multi-sensor resource for the analysis of social interactions and group dynamics in-the-wild during free-standing conversations and speed dates. IEEE Trans. Affect. Comput."},{"key":"ref_52","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv, Available online: https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/arxiv.org\/pdf\/1412.6980.pdf."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/www.mdpi.com\/1424-8220\/20\/9\/2576\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T13:52:16Z","timestamp":1760363536000},"score":1,"resource":{"primary":{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/www.mdpi.com\/1424-8220\/20\/9\/2576"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,1]]},"references-count":52,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2020,5]]}},"alternative-id":["s20092576"],"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/doi.org\/10.3390\/s20092576","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,5,1]]}}}