Timely Error Detection for Effective Recovery in Light-Lockstep Automotive Systems
Hernandez, Carles; Abella, Jaume
Barcelona Supercomputing Center
Safety-relevant systems in the automotive domain often implement features such as lockstep execution for error detection, and reset and re-execution for error correction. Light-lockstep has already been adopted in some of these systems because of its relatively low implementation cost, given that it does not require deep changes to non-lockstep hardware. Because only off-core activities (i.e., data/addresses sent) need to be compared across cores, light-lockstep designs are minimally intrusive. This approach has been proven sufficient to guarantee functional correctness of the system in the presence of errors in the cores, in particular in relation to certification against safety standards such as ISO 26262 in the automotive domain. However, error detection in light-lockstep systems may occur long after the error actually occurs, thus jeopardizing timing guarantees, which are as critical as functional ones in hard real-time systems. In this paper, we analyze the timing behavior of errors due to transient and permanent faults in light-lockstep systems. Our results show that the time elapsed until an error is detected can be inordinately large, especially for permanent faults. Based on this observation, and building upon the specific characteristics of light-lockstep systems, we propose lightly verbose (LiVe), a new mechanism that enforces the early detection of errors due to both transient and permanent faults, thus enabling the computation of tight error detection timing bounds. We also analyze how existing mechanisms for error recovery in multicore systems become more effective when light-lockstep operates in LiVe mode in the context of mixed-criticality workloads.
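The detection delay described in the abstract can be illustrated with a small sketch. It is not taken from the paper: the event format, the function names (`first_mismatch_cycle`, `detection_latency`) and the trace values are all hypothetical, and LiVe itself is not modeled. The sketch only assumes that a light-lockstep comparator sees off-core events (cycle, address, data) and flags the first one on which the two cores disagree.

```python
from typing import List, Optional, Tuple

# Hypothetical off-core event as seen at the core boundary: (cycle, address, data).
Event = Tuple[int, int, int]


def first_mismatch_cycle(core_a: List[Event], core_b: List[Event]) -> Optional[int]:
    """Return the cycle of the first off-core event on which the cores disagree."""
    for ev_a, ev_b in zip(core_a, core_b):
        if ev_a != ev_b:
            # The comparator notices the divergence at the earlier of the two events.
            return min(ev_a[0], ev_b[0])
    if len(core_a) != len(core_b):
        # One core issued an extra off-core event that the other never produced.
        longer = core_a if len(core_a) > len(core_b) else core_b
        return longer[min(len(core_a), len(core_b))][0]
    return None  # traces are identical: nothing is ever detected


def detection_latency(fault_cycle: int, core_a: List[Event],
                      core_b: List[Event]) -> Optional[int]:
    """Cycles between fault occurrence and detection, or None if undetected."""
    detected = first_mismatch_cycle(core_a, core_b)
    return None if detected is None else detected - fault_cycle


# Assumed scenario: a fault corrupts a register at cycle 100, but the corrupted
# value only leaves the core when it is stored to memory at cycle 5000.
golden = [(90, 0x1000, 7), (5000, 0x2000, 42)]
faulty = [(90, 0x1000, 7), (5000, 0x2000, 43)]
latency = detection_latency(100, golden, faulty)  # 4900 cycles
```

In this made-up scenario the error stays invisible to the comparator for 4900 cycles, because nothing is compared until the corrupted value is sent off-core. A mechanism that forces off-core activity more often would shrink that window.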
The research leading to these results has received funding from the ARTEMIS Joint Undertaking VeTeSS project under grant agreement number 295311. This work has also been funded by the Ministry of Science and Technology of Spain under contract TIN2012-34557 and HiPEAC. Jaume Abella has been partially supported by the Ministry of Economy and Competitiveness under Ramon y Cajal postdoctoral fellowship number RYC-2013-14717.
Peer Reviewed
UPC subject areas::Electronic engineering
Automotive technology
Error detection and recovery in robotics
Real-time systems
Multiprocessing systems
Automotive engineering
Traffic engineering computing
Attribution-NonCommercial-NoDerivs 3.0 Spain
Institute of Electrical and Electronics Engineers (IEEE)
