{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T08:32:26Z","timestamp":1777105946866,"version":"3.51.4"},"reference-count":21,"publisher":"Wiley","issue":"12","license":[{"start":{"date-parts":[[2013,1,18]],"date-time":"2013-01-18T00:00:00Z","timestamp":1358467200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/http\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Concurrency and Computation"],"published-print":{"date-parts":[[2013,8,25]]},"abstract":"<jats:title>SUMMARY<\/jats:title><jats:p>This paper discusses performance optimization on the dynamical core of global numerical weather prediction model in Global\/Regional Assimilation and Prediction System (GRAPES). GRAPES is a new generation of numerical weather prediction system developed and currently used by Chinese Meteorology Administration. The computational performance of the dynamical core in GRAPES relies on the efficient solution of three\u2010dimensional Helmholtz equations, which lead to large\u2010scale and sparse linear systems formulated by the discretization in space and time. We choose generalized conjugate residual (GCR) algorithm to solve the corresponding linear systems and further propose algorithm optimizations for large\u2010scale parallelism in two aspects: (i) reduction of iteration number for solution and (ii) performance enhancement of each GCR iteration. The reduction of iteration number is achieved by advanced preconditioning techniques, combining block incomplete LU factorization\u2010k preconditioner over 7\u2010diagonals of the coefficient matrix with the restricted additive Schwarz method effectively.
The improvement for GCR iteration is to reduce the global communication operations by refactoring the GCR algorithm, which decreases the communication overhead over large number of cores. Performance evaluation on the Tianhe\u20101A system shows that the new preconditioning techniques reduce almost one\u2010third iterations for solving the linear systems, the proposed methods can obtain 25% performance improvement on average compared with the original version of Helmholtz solver in GRAPES, and the speedup with our algorithms can reach 10 using 2048 cores compared with 256 cores. Copyright \u00a9 2013 John Wiley &amp; Sons, Ltd.<\/jats:p>","DOI":"10.1002\/cpe.2979","type":"journal-article","created":{"date-parts":[[2013,1,18]],"date-time":"2013-01-18T09:48:39Z","timestamp":1358502519000},"page":"1722-1737","source":"Crossref","is-referenced-by-count":17,"title":["A scalable Helmholtz solver in GRAPES over large\u2010scale multicore cluster"],"prefix":"10.1002","volume":"25","author":[{"given":"Linfeng","family":"Li","sequence":"first","affiliation":[{"name":"Department of Computer Science and Technology Tsinghua University Beijing China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wei","family":"Xue","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Technology Tsinghua University Beijing China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rajiv","family":"Ranjan","sequence":"additional","affiliation":[{"name":"CSIRO ICT Centre Canberra Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhiyan","family":"Jin","sequence":"additional","affiliation":[{"name":"National Meteorological Center Beijing China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"311","published-online":{"date-parts":[[2013,1,18]]},"reference":[{"key":"e_1_2_7_2_1","first-page":"142","article-title":"A GRAPES new dynamical model and its preliminary numerical tests","author":"Chen 
DH","year":"2003","journal-title":"Proceedings of International Workshop on NWP Models for Heavy Precipitation in Asia and Pacific Areas, Japan Meteorological Agency and Ship & Ocean Foundation"},{"issue":"3","key":"e_1_2_7_3_1","first-page":"731","article-title":"ILU preconditioner for NWP system: GRAPES","volume":"29","author":"Liu Y","year":"2008","journal-title":"Computer Engineering and Design"},{"key":"e_1_2_7_4_1","volume-title":"Scientific Design and Application of the Numerical Prediction System GRAPES","author":"Xue JS","year":"2008"},{"key":"e_1_2_7_5_1","first-page":"287","article-title":"Parallel preconditioned GMRES solvers for 3\u2010D Helmholtz equations in regional non\u2010hydrostatic atmosphere model","volume":"3","author":"Zhang LL","year":"2008","journal-title":"International Conference on Computer Science and Software Engineering"},{"key":"e_1_2_7_6_1","unstructured":"ZhangLL.Research of high\u2010performance parallel computing for numerical models in meteorologic prediction.PhD thesis National University of Defense Technology 2002 Hunan China."},{"key":"e_1_2_7_7_1","unstructured":"LiuGP ZhaoWT ZhangLL.Parallel Helmholtz solvers for Chinese GRAPES atmosphere model based on PETSc tools.Proceedings of International Symposium on Distributed Computing and Applications to Business Engineering and Science Hubei China 2007;287\u2013290."},{"key":"e_1_2_7_8_1","doi-asserted-by":"publisher","DOI":"10.1006\/jcph.2002.7176"},{"key":"e_1_2_7_9_1","doi-asserted-by":"publisher","DOI":"10.1201\/9781584889106"},{"key":"e_1_2_7_10_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9780898718003"},{"key":"e_1_2_7_11_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9780898719505"},{"key":"e_1_2_7_12_1","doi-asserted-by":"publisher","DOI":"10.1137\/S106482759732678X"},{"key":"e_1_2_7_13_1","unstructured":"CaiXC DryjaM SarkisM.A convergence theory for restricted additive Schwarz methods.Technical Report Dept. of Computer Science Univ. 
of Colorado at Boulder 1999."},{"issue":"4","key":"e_1_2_7_14_1","first-page":"80","article-title":"Improved parallel generalized conjugate residual algorithm","volume":"35","author":"Zhao LB","year":"2009","journal-title":"Computer Engineering"},{"key":"e_1_2_7_15_1","doi-asserted-by":"crossref","unstructured":"YangL BrentR.The improved BiCGSTAB method for large and sparse unsymmetric linear systems on parallel distributed memory architectures Beijing China 2002;324\u2013328.","DOI":"10.1109\/ICAPP.2002.1173595"},{"key":"e_1_2_7_16_1","doi-asserted-by":"publisher","DOI":"10.1137\/070685804"},{"key":"e_1_2_7_17_1","unstructured":"DanielB HenryJ.Experiences in optimizing a numerical weather prediction model: an exercise in futility?7th Linux Cluster Institute Conference Norman USA 2006;1\u201334."},{"issue":"2","key":"e_1_2_7_18_1","first-page":"87","article-title":"Challenges and solutions to improve the scalability of an operational regional meteorological forecasting model","volume":"3","author":"Alvaro L","year":"2011","journal-title":"International Journal of High Performance Systems Architecture"},{"key":"e_1_2_7_19_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11434-008-0417-z"},{"key":"e_1_2_7_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11434-008-0494-z"},{"key":"e_1_2_7_21_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611971538"},{"key":"e_1_2_7_22_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcp.2010.12.027"}],"container-title":["Concurrency and Computation: Practice and 
Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fcpe.2979","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.2979","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,4,29]],"date-time":"2025-04-29T17:37:09Z","timestamp":1745948229000},"score":1,"resource":{"primary":{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/onlinelibrary.wiley.com\/doi\/10.1002\/cpe.2979"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,1,18]]},"references-count":21,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2013,8,25]]}},"alternative-id":["10.1002\/cpe.2979"],"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/doi.org\/10.1002\/cpe.2979","archive":["Portico"],"relation":{},"ISSN":["1532-0626","1532-0634"],"issn-type":[{"value":"1532-0626","type":"print"},{"value":"1532-0634","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,1,18]]}}}