AlphaGergedan/Teampi-Soft-Error-Resilience


teaMPI with Soft Error Resilience

See the original repository here: https://gitlab.lrz.de/hpcsoftware/teaMPI

This repository integrates soft error resilience into the teaMPI library.
Development happens on the branch ulfm_failure_tolerance.

We have added a single-heartbeat option that sends hash values of the replicas' results. teaMPI compares these hash values transparently to the application. The user can compute a hash value from the largest data structures the application uses and include it in the single heartbeats (see here for usage).
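As a rough illustration of the user-side step described above, the following sketch hashes an application data structure so that the value can be attached to a heartbeat. It is not teaMPI code: `fnv1a64`, `hash_solution`, and `replicas_agree` are hypothetical helper names, FNV-1a is just one possible hash function, and the actual heartbeat call is omitted because its signature is not given here. The comparison function only mimics locally what teaMPI does across replicas.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: 64-bit FNV-1a hash over the raw bytes of a buffer.
std::uint64_t fnv1a64(const void* data, std::size_t num_bytes) {
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (std::size_t i = 0; i < num_bytes; ++i) {
        hash ^= bytes[i];
        hash *= 1099511628211ULL;                  // FNV prime
    }
    return hash;
}

// Hypothetical helper: hash the application's largest data structure,
// assumed here to be a contiguous array of doubles.
std::uint64_t hash_solution(const std::vector<double>& solution) {
    return fnv1a64(solution.data(), solution.size() * sizeof(double));
}

// Local stand-in for the cross-replica comparison: two replicas agree
// when their hashes match; a mismatch indicates a possible soft error.
bool replicas_agree(std::uint64_t hash_a, std::uint64_t hash_b) {
    return hash_a == hash_b;
}
```

A byte-wise hash like this only matches when the replicas produce bitwise-identical results, so it assumes the computation is deterministic across replicas.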

About

We modify the TeaMPI library for additional functionality.
