Publication:
A Survey on Multithreading Alternatives for Soft Error Fault Tolerance

dc.contributor.author ARSLAN YILMAZ, SANEM
dc.contributor.authors Oz, Isil; Arslan, Sanem
dc.date.accessioned 2022-03-12T22:38:18Z
dc.date.accessioned 2026-01-11T08:00:56Z
dc.date.available 2022-03-12T22:38:18Z
dc.date.issued 2019
dc.description.abstract Smaller transistor sizes and reduction in voltage levels in modern microprocessors induce higher soft error rates. This trend makes reliability a primary design constraint for computer systems. Redundant multithreading (RMT) makes use of parallelism in modern systems by employing thread-level time redundancy for fault detection and recovery. RMT can detect faults by running identical copies of the program as separate threads in parallel execution units with identical inputs and comparing their outputs. In this article, we present a survey of RMT implementations at different architectural levels with several design considerations. We explain the implementations in seminal papers and their extensions and discuss the design choices employed by the techniques. We review both hardware and software approaches by presenting the main characteristics and analyze the studies with different design choices regarding their strengths and weaknesses. We also present a classification to help potential users find a suitable method for their requirements and to guide researchers planning to work in this area by providing insights into future trends.
dc.identifier.doi 10.1145/3302255
dc.identifier.eissn 1557-7341
dc.identifier.issn 0360-0300
dc.identifier.uri https://hdl.handle.net/11424/235581
dc.identifier.wos WOS:000473754100005
dc.language.iso eng
dc.publisher ASSOC COMPUTING MACHINERY
dc.relation.ispartof ACM COMPUTING SURVEYS
dc.rights info:eu-repo/semantics/closedAccess
dc.subject Soft error
dc.subject thread-level redundancy
dc.subject redundant multithreading
dc.subject RELIABILITY
dc.subject SYSTEMS
dc.subject REDUNDANCY
dc.subject EXECUTION
dc.subject DESIGN
dc.subject CORES
dc.title A Survey on Multithreading Alternatives for Soft Error Fault Tolerance
dc.type article
dspace.entity.type Publication
oaire.citation.issue 2
oaire.citation.title ACM COMPUTING SURVEYS
oaire.citation.volume 52
