Publication:
Analyzing fault behavior of shared data in parallel applications

dc.contributor.authors: Oz, Isil
dc.date.accessioned: 2022-03-12T20:27:31Z
dc.date.accessioned: 2026-01-10T17:08:58Z
dc.date.available: 2022-03-12T20:27:31Z
dc.date.issued: 2016
dc.description.abstract: Multicore architectures are becoming the most promising computing platforms thanks to their high performance. The soft error rate in multicore systems increases as transistor sizes shrink and transistor voltages are reduced. Evaluating the impact of soft errors on parallel applications is critical for understanding fault characteristics and for choosing fault tolerance strategies for reliable execution. In this paper, we examine the soft error vulnerabilities of shared data in parallel Java applications. To analyze the fault behavior of shared data in parallel programs, we design and implement a bytecode-instrumentation-based analysis and fault injection framework. We evaluate the fault behavior of shared data fields on a set of parallel applications from the NAS benchmark suite. Our experimental evaluation demonstrates the data type and access characteristics of the shared fields, and shows that shared data structures of parallel applications are more vulnerable to soft errors than unshared local data. While error rates for unshared local data stay around 20% in our target applications, the rate for shared data exceeds 30% for some applications. We further discuss potential directions suggested by our results and how shared data analysis can be employed to apply partial fault tolerance techniques. © 2016 Elsevier B.V. All rights reserved.
dc.identifier.doi: 10.1016/j.micpro.2016.03.014
dc.identifier.eissn: 1872-9436
dc.identifier.issn: 0141-9331
dc.identifier.uri: https://hdl.handle.net/11424/233708
dc.identifier.wos: WOS:000382592100007
dc.language.iso: eng
dc.publisher: ELSEVIER
dc.relation.ispartof: MICROPROCESSORS AND MICROSYSTEMS
dc.rights: info:eu-repo/semantics/closedAccess
dc.subject: Reliability
dc.subject: Multicores
dc.subject: Fault injection
dc.title: Analyzing fault behavior of shared data in parallel applications
dc.type: article
dspace.entity.type: Publication
oaire.citation.endPage: 80
oaire.citation.startPage: 67
oaire.citation.title: MICROPROCESSORS AND MICROSYSTEMS
oaire.citation.volume: 45