{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,2]],"date-time":"2026-01-02T07:37:57Z","timestamp":1767339477319},"reference-count":24,"publisher":"Wiley","issue":"2","license":[{"start":{"date-parts":[[2008,6,6]],"date-time":"2008-06-06T00:00:00Z","timestamp":1212710400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/http\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Concurrency and Computation"],"published-print":{"date-parts":[[2009,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>To achieve high data availability or reliability in an efficient manner, distributed storage systems must detect whether an observed node failure is permanent or transient, and if necessary, generate replicas to restore the desired level of replication. Given the unpredictability of network dynamics, however, distinguishing permanent and transient failures is extremely difficult. Though timeout\u2010based detectors can be used to avoid mistaking transient failures as permanent failures, it is unknown how the timeout values should be selected to achieve a better tradeoff between detection latency and accuracy. In this paper, we address this fundamental tradeoff from several perspectives. First, we explore the impact of different timeout values on maintenance cost by examining the probability of their false positives and false negatives. Second, we propose a self\u2010configurable failure detector called the Neutralizer based on the idea of counteracting false positives with false negatives. The Neutralizer could enable the system to maintain a desired replication level on average with the least amount of bandwidth. We conduct extensive simulations using real trace data from a widely deployed peer\u2010to\u2010peer system and synthetic traces based on PlanetLab and Microsoft PCs, showing a significant reduction in aggregate bandwidth usage after applying the Neutralizer (especially in an environment with a low average node availability). Overall, we demonstrate that the Neutralizer closely approximates the performance of a perfect \u2018oracle\u2019 detector in many cases. Copyright \u00a9 2008 John Wiley &amp; Sons, Ltd.<\/jats:p>","DOI":"10.1002\/cpe.1338","type":"journal-article","created":{"date-parts":[[2008,6,6]],"date-time":"2008-06-06T19:56:34Z","timestamp":1212782194000},"page":"185-204","source":"Crossref","is-referenced-by-count":1,"title":["The Neutralizer: a self\u2010configurable failure detector for minimizing distributed storage maintenance cost"],"prefix":"10.1002","volume":"21","author":[{"given":"Zhi","family":"Yang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yafei","family":"Dai","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoming","family":"Li","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"311","published-online":{"date-parts":[[2008,6,6]]},"reference":[{"key":"e_1_2_10_2_2","unstructured":"BlakeC RodriguesR.High availability scalable storage dynamic peer networks: Pick two. Proceedings of the HotOS Lihue Hawaii 2003."},{"key":"e_1_2_10_3_2","unstructured":"TatiK VoelkerG.On object maintenance in peer\u2010to\u2010peer systems. Proceedings of the IPTPS Santa Barbara CA 2006."},{"key":"e_1_2_10_4_2","unstructured":"WeatherspoonH ChunB\u2010G SoC KubiatowiczJ.Long\u2010term data maintenance in wide\u2010area storage systems: A quantitative approach. Technical report UC Berkeley UCB\/CSD\u201005\u20101404 July2005."},{"key":"e_1_2_10_5_2","unstructured":"BhagwanR TatiK ChengY SavageS VoelkerG.Total recall: System support for automated availability management. Proceedings of the NSDI San Jose CA 2004."},{"key":"e_1_2_10_6_2","doi-asserted-by":"crossref","unstructured":"AdyaA WattenhoferR BoloskyW CastroM CermakG ChaikenR DouceurJ HowellJ LorchJ TheimerM.FARSITE: Federated available and reliable storage for an incompletely trusted environment. In Proceedings of the Fifth OSDI pp. 1\u201314 Boston MA December 2002.","DOI":"10.1145\/1060289.1060291"},{"key":"e_1_2_10_7_2","doi-asserted-by":"crossref","unstructured":"KubiatowiczJ WellsC ZhaoBY BindelD ChenY CzerwinskiS EatonP GeelsD GummadiR RheaS.OceanStore: An architecture for global\u2010scale persistent storage. Proceedings of the ASPLOS Cambridge MA U.S.A. 2000;190\u2013201.","DOI":"10.1145\/378995.379239"},{"key":"e_1_2_10_8_2","doi-asserted-by":"crossref","unstructured":"ZhangZ LianQ LinS ChenW ChenY JinC.BitVault: A highly reliable distributed data retention platform. ACM SIGOPS Operating Systems Review 2007.","DOI":"10.1145\/1243418.1243423"},{"key":"e_1_2_10_9_2","unstructured":"ChunB\u2010G DabekF HaeberlenA SitE WeatherspoonH KaashoekMF KubiatowiczJ MorrisR.Efficient replica maintenance for distributed storage systems. Proceedings of the NSDI San Jose CA U.S.A. 2006."},{"key":"e_1_2_10_10_2","doi-asserted-by":"crossref","unstructured":"DabekF KaashoekMF KargerD MorrisR StoicaI.Wide\u2010area cooperative storage with CFS. Proceedings of the ACM SOSP Banff Canada October 2001.","DOI":"10.1145\/502034.502054"},{"key":"e_1_2_10_11_2","unstructured":"BhagwanR SavageS VoelkerG.Understanding availability. Proceedings of the IPTPS Berkeley CA U.S.A. 2003."},{"key":"e_1_2_10_12_2","unstructured":"TianJ DaiY.Understanding the dynamic of peer\u2010to\u2010peer systems. Proceedings of the IPTPS Bellevue WA 2007."},{"key":"e_1_2_10_13_2","unstructured":"ZhuangS GeelsD StoicaI KatzR.On failure detection algorithms in overlay networks. Proceedings of the INFOCOM Miami FL U.S.A. 2005."},{"key":"e_1_2_10_14_2","doi-asserted-by":"crossref","unstructured":"SoKCW SirerEG.Latency and bandwidth\u2010minimizing failure detectors. Proceedings of the EuroSys Lisbon Portugal 2007.","DOI":"10.1145\/1272996.1273008"},{"key":"e_1_2_10_15_2","unstructured":"D\u00e9fagoX SchiperA SergentN.Semi\u2010passive replication. Symposium on Reliable Distributed Systems West Lafayette IN U.S.A. 1998."},{"key":"e_1_2_10_16_2","unstructured":"YalagandulaP NatfS YuH GibbonsPB SeshanS.Beyond availability: Towards a deeper understanding of machine failure characteristics in large distributed systems. WORLDS San Francisco CA U.S.A. 2004."},{"key":"e_1_2_10_17_2","unstructured":"WeatherspoonH MoscovitzT KubiatowiczJ.Introspective failure analysis: Avoiding correlated failures in peer\u2010to\u2010peer systems. Proceedings of the Reliable Distributed Systems Osaka Japan 2002."},{"key":"e_1_2_10_18_2","volume-title":"An Introduction to Stochastic Processes","author":"Kao EPC","year":"1997"},{"key":"e_1_2_10_19_2","unstructured":"YangM ZhaoBY DaiY ZhangZ.Deployment of a large scale peer\u2010to\u2010peer social network. Proceedings of the WORLDS San Francisco CA U.S.A. 2004."},{"key":"e_1_2_10_20_2","doi-asserted-by":"crossref","unstructured":"BoloskyW DouceurJ ElyD TheimerM.Feasibility of a serverless distributed file system deployed on an existing setof desktop PCs. Proceedings of the Sigmetrics Santa Clara CA U.S.A. June 2000.","DOI":"10.1145\/339331.339345"},{"key":"e_1_2_10_21_2","unstructured":"StriblingJ. Planetlab All Pairs Ping.https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/http\/infospect.Planet\u2010lab.org\/pings."},{"key":"e_1_2_10_22_2","doi-asserted-by":"crossref","unstructured":"DruschelP RowstronA.Storage management and caching in PAST a large\u2010scale persistent peer\u2010to\u2010peer storage utility. Proceedings of the ACM SOSP Chateau Lake Louise Banff Canada 2001.","DOI":"10.1145\/502034.502053"},{"key":"e_1_2_10_23_2","doi-asserted-by":"crossref","unstructured":"KarpB RatnasamyS RheaS ShenkerS.Spurring adoption of DHTS with openhash a public DHT service. Proceedings of the IPTPS La Jolla CA U.S.A. February 2004.","DOI":"10.1007\/978-3-540-30183-7_19"},{"key":"e_1_2_10_24_2","unstructured":"CatesJ.Robust and efficient data management for a distributed hash table. Master's Thesis MIT June2003."},{"key":"e_1_2_10_25_2","unstructured":"DabekF LiJ SitE RobertsonJ KaashoekMF MorrisR.Designing a DHT for low latency and high throughput. Proceedings of the NSDI San Francisco CA U.S.A. March 2004."}],"container-title":["Concurrency and Computation: Practice and Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fcpe.1338","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.1338","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,11]],"date-time":"2023-10-11T01:15:32Z","timestamp":1696986932000},"score":1,"resource":{"primary":{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/onlinelibrary.wiley.com\/doi\/10.1002\/cpe.1338"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,6,6]]},"references-count":24,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2009,2]]}},"alternative-id":["10.1002\/cpe.1338"],"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/doi.org\/10.1002\/cpe.1338","archive":["Portico"],"relation":{},"ISSN":["1532-0626","1532-0634"],"issn-type":[{"value":"1532-0626","type":"print"},{"value":"1532-0634","type":"electronic"}],"subject":[],"published":{"date-parts":[[2008,6,6]]}}}