{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,18]],"date-time":"2026-02-18T23:50:24Z","timestamp":1771458624367,"version":"3.50.1"},"reference-count":42,"publisher":"Institution of Engineering and Technology (IET)","issue":"1","license":[{"start":{"date-parts":[[2022,8,25]],"date-time":"2022-08-25T00:00:00Z","timestamp":1661385600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/http\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62172367"],"award-info":[{"award-number":["62172367"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["ietresearch.onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["IET Image Processing"],"published-print":{"date-parts":[[2023,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Depth maps are acquirable and irreplaceable geometric information that significantly enhances traditional color images. RGB and Depth (RGBD) images have been widely used in various image analysis applications, but they are still very limited due to challenges from different modalities and misalignment between color and depth. In this paper, a Fully Aligned Fusion Network (FAFNet) for RGBD semantic segmentation is presented. To improve cross\u2010modality fusion, a new RGBD fusion block is proposed, features from color images and depth maps are first fused by an attention cross fusion module and then aligned by a semantic flow. A multi\u2010layer structure is also designed to hierarchically utilize the RGBD fusion block, which not only eases issues of low resolution and noises for depth maps but also reduces the loss of semantic features in the upsampling process. 
Quantitative and qualitative evaluations on both the NYU\u2010Depth V2 and the SUN RGB\u2010D dataset demonstrate that the FAFNet model outperforms state\u2010of\u2010the\u2010art RGBD semantic segmentation\u00a0methods.<\/jats:p>","DOI":"10.1049\/ipr2.12614","type":"journal-article","created":{"date-parts":[[2022,8,25]],"date-time":"2022-08-25T05:15:56Z","timestamp":1661404556000},"page":"32-41","update-policy":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["FAFNet: Fully aligned fusion network for RGBD semantic segmentation based on hierarchical semantic flows"],"prefix":"10.1049","volume":"17","author":[{"ORCID":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/orcid.org\/0000-0003-2780-6146","authenticated-orcid":false,"given":"Jiazhou","family":"Chen","sequence":"first","affiliation":[{"name":"School of Computer Science and Technology Zhejiang University of Technology Hangzhou P.R.China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yangfan","family":"Zhan","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology Zhejiang University of Technology Hangzhou P.R.China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yanghui","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology Zhejiang University of Technology Hangzhou P.R.China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiang","family":"Pan","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology Zhejiang University of Technology Hangzhou P.R.China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"265","published-online":{"date-parts":[[2022,8,25]]},"reference":[{"key":"e_1_2_10_2_1","doi-asserted-by":"crossref","unstructured":"Long J. Shelhamer E. 
Darrell T.:Fully convolutional networks for semantic segmentation. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp.3431\u20133440.IEEE Piscataway(2015)","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"e_1_2_10_3_1","doi-asserted-by":"crossref","unstructured":"Zhao H. Shi J. Qi X. Wang X. Jia J.:Pyramid scene parsing network. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp.2881\u20132890.IEEE Piscataway(2017)","DOI":"10.1109\/CVPR.2017.660"},{"key":"e_1_2_10_4_1","doi-asserted-by":"crossref","unstructured":"Chen L.C. Papandreou G. Schroff F. Adam H.:Rethinking atrous convolution for semantic image segmentation.arXiv preprint arXiv:170605587(2017)","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"e_1_2_10_5_1","doi-asserted-by":"crossref","unstructured":"Li X. You A. Zhu Z. Zhao H. Yang M. Yang K. et\u00a0al.:Semantic flow for fast and accurate scene parsing. In:Proceedings of European Conference on Computer Vision (ECCV) pp.775\u2013793.Springer Berlin(2020)","DOI":"10.1007\/978-3-030-58452-8_45"},{"key":"e_1_2_10_6_1","doi-asserted-by":"publisher","DOI":"10.1049\/iet-ipr.2017.0738"},{"key":"e_1_2_10_7_1","doi-asserted-by":"publisher","DOI":"10.1049\/iet-ipr.2017.1020"},{"key":"e_1_2_10_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSMC.1979.4310076"},{"key":"e_1_2_10_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2004.110"},{"key":"e_1_2_10_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.gmod.2019.101030"},{"key":"e_1_2_10_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.gmod.2021.101108"},{"key":"e_1_2_10_12_1","doi-asserted-by":"crossref","unstructured":"Hazirbas C. Ma L. Domokos C. Cremers D.:Fusenet: Incorporating depth into semantic segmentation via fusion\u2010based cnn architecture. 
In:Proceedings of Asian Conference on Computer Vision (ACCV) pp.213\u2013228.Springer Berlin(2016)","DOI":"10.1007\/978-3-319-54181-5_14"},{"key":"e_1_2_10_13_1","doi-asserted-by":"crossref","unstructured":"Zhou H. Qi L. Wan Z. Huang H. Yang X.:Rgb\u2010d co\u2010attention network for semantic segmentation. In:Proceedings of the IEEE Asian Conference on Computer Vision (ACCV).IEEE Piscataway(2020)","DOI":"10.1007\/978-3-030-69525-5_31"},{"key":"e_1_2_10_14_1","doi-asserted-by":"crossref","unstructured":"Chen X. Lin K.Y. Wang J. Wu W. Qian C. Li H. et\u00a0al.:Bi\u2010directional cross\u2010modality feature propagation with separation\u2010and\u2010aggregation gate for rgb\u2010d semantic segmentation. In:Proceedings of the European Conference on Computer Vision (ECCV) pp.561\u2013577.Springer Berlin(2020)","DOI":"10.1007\/978-3-030-58621-8_33"},{"key":"e_1_2_10_15_1","doi-asserted-by":"crossref","unstructured":"Li Z. Gan Y. Liang X. Yu Y. Cheng H. Lin L.:Lstm\u2010cf: Unifying context modeling and fusion with lstms for rgb\u2010d scene labeling. In:Proceedings of European conference on computer vision (ECCV) pp.541\u2013557.Springer Berlin(2016)","DOI":"10.1007\/978-3-319-46475-6_34"},{"key":"e_1_2_10_16_1","doi-asserted-by":"crossref","unstructured":"Qi X. Liao R. Jia J. Fidler S. Urtasun R.:3d graph neural networks for rgbd semantic segmentation. In:Proceedings of the IEEE International Conference on Computer Vision (ICCV) pp.5199\u20135208.IEEE Piscataway(2017)","DOI":"10.1109\/ICCV.2017.556"},{"key":"e_1_2_10_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/MMUL.2012.24"},{"key":"e_1_2_10_18_1","doi-asserted-by":"crossref","unstructured":"Ronneberger O. Fischer P. Brox T.:U\u2010net: Convolutional networks for biomedical image segmentation. 
In:Proceedings of International Conference on Medical Image Computing and Computer\u2010Assisted Intervention pp.234\u2013241.Springer Cham(2015)","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"e_1_2_10_19_1","unstructured":"Chen L.C. Papandreou G. Kokkinos I. Murphy K. Yuille A.L.:Semantic image segmentation with deep convolutional nets and fully connected crfs.arXiv preprint arXiv:14127062(2014)"},{"key":"e_1_2_10_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2699184"},{"key":"e_1_2_10_21_1","doi-asserted-by":"crossref","unstructured":"Chen L.C. Zhu Y. Papandreou G. Schroff F. Adam H.:Encoder\u2010decoder with atrous separable convolution for semantic image segmentation. In:Proceedings of the European Conference on Computer Vision (ECCV) pp.801\u2013818.Springer Berlin(2018)","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"e_1_2_10_22_1","doi-asserted-by":"crossref","unstructured":"Lin G. Milan A. Shen C. Reid I.:Refinenet: Multi\u2010path refinement networks for high\u2010resolution semantic segmentation. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp.1925\u20131934.IEEE Piscataway(2017)","DOI":"10.1109\/CVPR.2017.549"},{"key":"e_1_2_10_23_1","unstructured":"Couprie C. Farabet C. Najman L. LeCun Y.:Indoor semantic segmentation using depth information.arXiv preprint arXiv:13013572(2013)"},{"key":"e_1_2_10_24_1","doi-asserted-by":"crossref","unstructured":"Noh H. Hong S. Han B.:Learning deconvolution network for semantic segmentation. In:Proceedings of the IEEE International Conference on Computer Vision (ICCV) pp.1520\u20131528.IEEE Piscataway(2015)","DOI":"10.1109\/ICCV.2015.178"},{"key":"e_1_2_10_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2644615"},{"key":"e_1_2_10_26_1","doi-asserted-by":"crossref","unstructured":"Wang W. Neumann U.:Depth\u2010aware cnn for rgb\u2010d segmentation. 
In:Proceedings of the European Conference on Computer Vision (ECCV) pp.135\u2013150.Springer Berlin(2018)","DOI":"10.1007\/978-3-030-01252-6_9"},{"key":"e_1_2_10_27_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.gmod.2011.01.001"},{"key":"e_1_2_10_28_1","doi-asserted-by":"publisher","DOI":"10.1049\/iet-ipr.2020.0230"},{"key":"e_1_2_10_29_1","first-page":"1","article-title":"Align deep features for oriented object detection","volume":"60","author":"Han J.","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"issue":"1","key":"e_1_2_10_30_1","first-page":"550","article-title":"Alignseg: Feature\u2010aligned segmentation networks","volume":"44","author":"Huang Z.","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"e_1_2_10_31_1","doi-asserted-by":"crossref","unstructured":"Chollet F.:Xception: Deep learning with depthwise separable convolutions. In:Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp.1251\u20131258.IEEE Piscataway(2017)","DOI":"10.1109\/CVPR.2017.195"},{"key":"e_1_2_10_32_1","doi-asserted-by":"publisher","DOI":"10.1016\/S1352-2310(97)00447-0"},{"key":"e_1_2_10_33_1","doi-asserted-by":"publisher","DOI":"10.1162\/089976600300015349"},{"key":"e_1_2_10_34_1","unstructured":"Jaderberg M. Simonyan K. Zisserman A. Kavukcuoglu K.:Spatial transformer networks. In:Proceedings of the 28th International Conference on Neural Information Processing Systems \u2010 Volume 2. NIPS'15 pp.2017\u20132025.MIT Press Cambridge(2015)"},{"key":"e_1_2_10_35_1","doi-asserted-by":"crossref","unstructured":"Silberman N. Hoiem D. Kohli P. Fergus R.:Indoor segmentation and support inference from rgbd images. In:Proceedings of European Conference on Computer Vision (ECCV) pp.746\u2013760.Springer Berlin(2012)","DOI":"10.1007\/978-3-642-33715-4_54"},{"key":"e_1_2_10_36_1","doi-asserted-by":"crossref","unstructured":"Song S. Lichtenberg S.P. 
Xiao J.:Sun rgb\u2010d: A rgb\u2010d scene understanding benchmark suite. In:Proceedings of the IEEE conference on computer vision and pattern recognition pp.567\u2013576.IEEE Piscataway(2015)","DOI":"10.1109\/CVPR.2015.7298655"},{"key":"e_1_2_10_37_1","doi-asserted-by":"crossref","unstructured":"Gupta S. Girshick R. Arbel\u00e1ez P. Malik J.:Learning rich features from rgb\u2010d images for object detection and segmentation. In:Proceedings of European conference on computer vision (ECCV) pp.345\u2013360.Springer Berlin(2014)","DOI":"10.1007\/978-3-319-10584-0_23"},{"key":"e_1_2_10_38_1","doi-asserted-by":"crossref","unstructured":"He K. Zhang X. Ren S. Sun J.:Deep residual learning for image recognition. In:Proceedings of the IEEE Computer Vision and Pattern Recognition (CVPR) pp.770\u2013778.IEEE Piscataway(2016)","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_10_39_1","unstructured":"Ge R. Kakade S.M. Kidambi R. Netrapalli P.:The step decay schedule: A near optimal geometrically decaying learning rate procedure for least squares.arXiv preprint arXiv:190412838(2019)"},{"key":"e_1_2_10_40_1","doi-asserted-by":"crossref","unstructured":"Cao J. Leng H. Lischinski D. Cohen\u2010Or D. Tu C. Li Y.:Shapeconv: Shape\u2010aware convolutional layer for indoor rgb\u2010d semantic segmentation. In:Proceedings of the IEEE\/CVF International Conference on Computer Vision pp.7088\u20137097.IEEE Piscataway(2021)","DOI":"10.1109\/ICCV48922.2021.00700"},{"key":"e_1_2_10_41_1","doi-asserted-by":"crossref","unstructured":"Hu X. Yang K. Fei L. Wang K.:Acnet: Attention based network to exploit complementary features for rgbd semantic segmentation. In:Proceedings of the IEEE International Conference on Image Processing (ICIP) pp.1440\u20131444.IEEE Piscataway(2019)","DOI":"10.1109\/ICIP.2019.8803025"},{"key":"e_1_2_10_42_1","unstructured":"Park S.J. Hong K.S. Lee S.:Rdfnet: Rgb\u2010d multi\u2010level residual feature fusion for indoor semantic segmentation. 
In:Proceedings of the IEEE International Conference on Computer Vision (ICCV) pp.4980\u20134989.IEEE Piscataway(2017)"},{"key":"e_1_2_10_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2021.3066071"}],"container-title":["IET Image Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/onlinelibrary.wiley.com\/doi\/pdf\/10.1049\/ipr2.12614","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1049\/ipr2.12614","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/ietresearch.onlinelibrary.wiley.com\/doi\/pdf\/10.1049\/ipr2.12614","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T11:49:47Z","timestamp":1761565787000},"score":1,"resource":{"primary":{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/ietresearch.onlinelibrary.wiley.com\/doi\/10.1049\/ipr2.12614"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,25]]},"references-count":42,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,1]]}},"alternative-id":["10.1049\/ipr2.12614"],"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/doi.org\/10.1049\/ipr2.12614","archive":["Portico"],"relation":{},"ISSN":["1751-9659","1751-9667"],"issn-type":[{"value":"1751-9659","type":"print"},{"value":"1751-9667","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,8,25]]},"assertion":[{"value":"2022-01-21","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2022-08-10","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-08-25","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}