dc.contributor |
Barcelona Supercomputing Center |
dc.contributor.author |
Hernandez, Carles |
dc.contributor.author |
Abella, Jaume |
dc.date |
2015-05-19 |
dc.identifier.citation |
Hernandez, Carles; Abella, Jaume. Timely Error Detection for Effective Recovery in Light-Lockstep Automotive Systems. "IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems", 19 May 2015, vol. 34, no. 11, p. 1718-1729. |
dc.identifier.citation |
0278-0070 |
dc.identifier.citation |
10.1109/TCAD.2015.2434958 |
dc.identifier.uri |
http://hdl.handle.net/2117/87259 |
dc.language.iso |
eng |
dc.publisher |
Institute of Electrical and Electronics Engineers (IEEE) |
dc.relation |
http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=7110331&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D7110331 |
dc.relation |
info:eu-repo/grantAgreement/ES/1PE/TIN2012-34557 |
dc.relation |
info:eu-repo/grantAgreement/EC/FP7/287759/EU/High Performance and Embedded Architecture and Compilation/HIPEAC |
dc.relation |
info:eu-repo/grantAgreement/ES/RYC-2013-14717 |
dc.rights |
info:eu-repo/semantics/openAccess |
dc.subject |
UPC subject areas::Electronic engineering |
dc.subject |
Automotive technology |
dc.subject |
Error detection and recovery in robotics |
dc.subject |
Real-time systems |
dc.subject |
Multiprocessing systems |
dc.subject |
Automotive engineering |
dc.subject |
Traffic engineering computing |
dc.subject |
Electronics--Automobiles |
dc.title |
Timely Error Detection for Effective Recovery in Light-Lockstep Automotive Systems |
dc.type |
info:eu-repo/semantics/submittedVersion |
dc.type |
info:eu-repo/semantics/article |
dc.description.abstract |
Safety-relevant systems in the automotive domain often implement features such as lockstep execution for error detection, and reset and re-execution for error correction. Light-lockstep has already been adopted in some such systems due to its relatively low implementation cost, given that it does not require deep changes to non-lockstep hardware. Because only off-core activities (i.e., data/addresses sent) need to be compared across different cores, light-lockstep designs are minimally intrusive. This approach has been proven sufficient to guarantee functional correctness of the system in the presence of errors in the cores, in particular in relation to certification against safety standards such as ISO 26262 in the automotive domain. However, error detection in light-lockstep systems may occur long after the error actually occurs, thus jeopardizing timing guarantees, which are as critical as functional ones in hard real-time systems. In this paper, we analyze the timing behavior of errors due to transient and permanent faults in light-lockstep systems. Our results show that the time elapsed until an error is detected can be inordinately large, especially for permanent faults. Based on this observation and building upon the specific characteristics of light-lockstep systems, we propose lightly verbose (LiVe), a new mechanism to enforce the early detection of errors due to both transient and permanent faults, thus enabling the computation of tight error detection timing bounds. We also analyze how existing mechanisms for error recovery in multicore systems become more effective when light-lockstep operates in LiVe mode in the context of mixed-criticality workloads. |
dc.description.abstract |
The research leading to these results has received funding from the ARTEMIS Joint Undertaking VeTeSS project under grant agreement number 295311. This work has also been funded by the Ministry of Science and Technology of Spain under contract TIN2012-34557 and HiPEAC. Jaume Abella has been partially supported by the Ministry of Economy and Competitiveness under Ramón y Cajal postdoctoral fellowship number RYC-2013-14717. |
dc.description.abstract |
Peer Reviewed |