diff --git a/contents/robust_ai/robust_ai.qmd b/contents/robust_ai/robust_ai.qmd index 07888e61..118ff41b 100644 --- a/contents/robust_ai/robust_ai.qmd +++ b/contents/robust_ai/robust_ai.qmd @@ -10,7 +10,7 @@ Resources: [Slides](#sec-robust-ai-resource), [Labs](#sec-robust-ai-resource), [ ![_DALL·E 3 Prompt: Create an image featuring an advanced AI system symbolized by an intricate, glowing neural network, deeply nested within a series of progressively larger and more fortified shields. Each shield layer represents a layer of defense, showcasing the system's robustness against external threats and internal errors. The neural network, at the heart of this fortress of shields, radiates with connections that signify the AI's capacity for learning and adaptation. This visual metaphor emphasizes not only the technological sophistication of the AI but also its resilience and security, set against the backdrop of a state-of-the-art, secure server room filled with the latest in technological advancements. The image aims to convey the concept of ultimate protection and resilience in the field of artificial intelligence._](./images/png/cover_robust_ai.png) -The development of robust machine learning systems has become increasingly crucial. As these systems are deployed in various critical applications, from autonomous vehicles to healthcare diagnostics, ensuring their resilience to faults and errors is paramount. Robust AI, in the context of hardware faults, software faults, and errors, plays an important role in maintaining the reliability, safety, and performance of machine learning systems. By addressing the challenges posed by transient, permanent, and intermittent hardware faults @ahmadilivani2024systematic, as well as bugs, design flaws, and implementation errors in software @zhang2008distribution, robust AI techniques enable machine learning systems to operate effectively even in adverse conditions. This chapter explores the fundamental concepts, techniques, and tools essential for building fault-tolerant and error-resilient machine learning systems, empowering researchers and practitioners to develop AI solutions that can withstand the complexities and uncertainties of real-world environments. +The development of robust machine learning systems has become increasingly crucial. As these systems are deployed in various critical applications, from autonomous vehicles to healthcare diagnostics, ensuring their resilience to faults and errors is paramount. Robust AI, in the context of hardware faults, software faults, and errors, plays an important role in maintaining the reliability, safety, and performance of machine learning systems. By addressing the challenges posed by transient, permanent, and intermittent hardware faults [@ahmadilivani2024systematic], as well as bugs, design flaws, and implementation errors in software [@zhang2008distribution], robust AI techniques enable machine learning systems to operate effectively even in adverse conditions. This chapter explores the fundamental concepts, techniques, and tools essential for building fault-tolerant and error-resilient machine learning systems, empowering researchers and practitioners to develop AI solutions that can withstand the complexities and uncertainties of real-world environments. 
::: {.callout-tip} @@ -56,7 +56,7 @@ Here are some real-world examples of cases where faults in hardware or software In February 2017, Amazon Web Services (AWS) experienced [a significant outage](https://aws.amazon.com/message/41926/) due to human error during maintenance. An engineer inadvertently entered an incorrect command, causing many servers to be taken offline. This outage disrupted many AWS services, including Amazon's AI-powered assistant, Alexa. As a result, Alexa-powered devices, such as Amazon Echo and third-party products using Alexa Voice Service, could not respond to user requests for several hours. This incident highlights the potential impact of human errors on cloud-based ML systems and the need for robust maintenance procedures and failsafe mechanisms. -In another example @dixit2021silent, Facebook encountered a silent data corruption issue within its distributed querying infrastructure ([Figure +In another example [@dixit2021silent], Facebook encountered a silent data corruption issue within its distributed querying infrastructure ([Figure 19.x](#8owvod923jax)). Facebook's infrastructure includes a querying system that fetches and executes SQL and SQL-like queries across multiple datasets using frameworks like Presto, Hive, and Spark. One of the applications that utilized this querying infrastructure was a compression application used to reduce the storage footprint of data stores. In this compression application, files were compressed when not being read and decompressed when a read request was made. Before decompression, the file size was checked to ensure it was greater than zero, indicating a valid compressed file with contents. ![Silent data corruption in database applications (Source: [Facebook](https://arxiv.org/pdf/2102.11245))](./images/png/sdc_example.png){#fig-sdc-example} @@ -95,9 +95,9 @@ As AI capabilities increasingly integrate into embedded systems, the potential f ## Hardware Faults -Hardware faults are a significant challenge in computing systems, including both traditional computing and ML systems. These faults occur when physical components, such as processors, memory modules, storage devices, or interconnects, malfunction or behave abnormally. Hardware faults can cause incorrect computations, data corruption, system crashes, or complete system failure, compromising the integrity and trustworthiness of the computations performed by the system @jha2019ml. +Hardware faults are a significant challenge in computing systems, including both traditional computing and ML systems. These faults occur when physical components, such as processors, memory modules, storage devices, or interconnects, malfunction or behave abnormally. Hardware faults can cause incorrect computations, data corruption, system crashes, or complete system failure, compromising the integrity and trustworthiness of the computations performed by the system [@jha2019ml]. -Understanding the taxonomy of hardware faults is essential for anyone working with computing systems, especially in the context of ML systems. ML systems rely on complex hardware architectures and large-scale computations to train and deploy models that learn from data and make intelligent predictions or decisions. However, hardware faults can introduce errors and inconsistencies in the ML pipeline, affecting the trained models' accuracy, robustness, and reliability @li2017understanding. 
+Understanding the taxonomy of hardware faults is essential for anyone working with computing systems, especially in the context of ML systems. ML systems rely on complex hardware architectures and large-scale computations to train and deploy models that learn from data and make intelligent predictions or decisions. However, hardware faults can introduce errors and inconsistencies in the ML pipeline, affecting the trained models' accuracy, robustness, and reliability [@li2017understanding]. Knowing the different types of hardware faults, their mechanisms, and their potential impact on system behavior is crucial for developing effective strategies to detect, mitigate, and recover them. This knowledge is also necessary for designing fault-tolerant computing systems, implementing robust ML algorithms, and ensuring the overall dependability of ML-based applications. @@ -115,7 +115,7 @@ By the end of this discussion, readers will have a solid understanding of fault #### Definition and Characteristics -Transient faults are characterized by their short duration and non-permanent nature. They typically manifest as single-event upsets (SEUs) or single-event transients (SETs), where a single bit or a group of bits in a memory location or register unexpectedly changes its value @mukherjee2005soft. These faults do not persist or leave any lasting impact on the hardware. However, they can still lead to incorrect computations, data corruption, or system misbehavior if not properly handled. +Transient faults are characterized by their short duration and non-permanent nature. They typically manifest as single-event upsets (SEUs) or single-event transients (SETs), where a single bit or a group of bits in a memory location or register unexpectedly changes its value [@mukherjee2005soft]. These faults do not persist or leave any lasting impact on the hardware. However, they can still lead to incorrect computations, data corruption, or system misbehavior if not properly handled. ![](./images/png/image22.png) @@ -134,9 +134,9 @@ Transient faults can manifest through different mechanisms depending on the affe A common example of a transient fault is a bit flip in the main memory. If an important data structure or critical instruction is stored in the affected memory location, it can lead to incorrect computations or program misbehavior. For instance, a bit flip in the memory storing a loop counter can cause the loop to execute indefinitely or terminate prematurely. Transient faults in control registers or flag bits can alter the flow of program execution, leading to unexpected jumps or incorrect branch decisions. In communication systems, transient faults can corrupt transmitted data packets, resulting in retransmissions or data loss. -In ML systems, transient faults can have significant implications during the training phase @he2023understanding. ML training involves iterative computations and updates to model parameters based on large datasets. If a transient fault occurs in the memory storing the model weights or gradients, it can lead to incorrect updates and compromise the convergence and accuracy of the training process. For example, a bit flip in the weight matrix of a neural network can cause the model to learn incorrect patterns or associations, leading to degraded performance @wan2021analyzing. Transient faults in the data pipeline, such as corruption of training samples or labels, can also introduce noise and affect the quality of the learned model. 
+In ML systems, transient faults can have significant implications during the training phase [@he2023understanding]. ML training involves iterative computations and updates to model parameters based on large datasets. If a transient fault occurs in the memory storing the model weights or gradients, it can lead to incorrect updates and compromise the convergence and accuracy of the training process. For example, a bit flip in the weight matrix of a neural network can cause the model to learn incorrect patterns or associations, leading to degraded performance [@wan2021analyzing]. Transient faults in the data pipeline, such as corruption of training samples or labels, can also introduce noise and affect the quality of the learned model. -During the inference phase, transient faults can impact the reliability and trustworthiness of ML predictions. If a transient fault occurs in the memory storing the trained model parameters or in the computation of the inference results, it can lead to incorrect or inconsistent predictions. For instance, a bit flip in the activation values of a neural network can alter the final classification or regression output @mahmoud2020pytorchfi. In safety-critical applications, such as autonomous vehicles or medical diagnosis, transient faults during inference can have severe consequences, leading to incorrect decisions or actions @li2017understanding @jha2019ml. Ensuring the resilience of ML systems against transient faults is crucial to maintaining the integrity and reliability of the predictions. +During the inference phase, transient faults can impact the reliability and trustworthiness of ML predictions. If a transient fault occurs in the memory storing the trained model parameters or in the computation of the inference results, it can lead to incorrect or inconsistent predictions. For instance, a bit flip in the activation values of a neural network can alter the final classification or regression output [@mahmoud2020pytorchfi]. In safety-critical applications, such as autonomous vehicles or medical diagnosis, transient faults during inference can have severe consequences, leading to incorrect decisions or actions [@li2017understanding] [@jha2019ml]. Ensuring the resilience of ML systems against transient faults is crucial to maintaining the integrity and reliability of the predictions. ### Permanent Faults @@ -158,16 +158,16 @@ Permanent faults can arise from several causes, including manufacturing defects #### Mechanisms of Permanent Faults -Permanent faults can manifest through various mechanisms, depending on the nature and location of the fault. Stuck-at faults @seong2010safer are common permanent faults where a signal or memory cell remains fixed at a particular value (either 0 or 1) regardless of the inputs ([Figure +Permanent faults can manifest through various mechanisms, depending on the nature and location of the fault. Stuck-at faults [@seong2010safer] are common permanent faults where a signal or memory cell remains fixed at a particular value (either 0 or 1) regardless of the inputs ([Figure 19.x](#ahtmh1s1mxgf)). Stuck-at faults can occur in logic gates, memory cells, or interconnects, causing incorrect computations or data corruption. Another mechanism is device failures, where a component, such as a transistor or a memory cell, completely ceases to function. This can be due to manufacturing defects or severe wear-out. 
Bridging faults occur when two or more signal lines are unintentionally connected, causing short circuits or incorrect logic behavior. ![Stuck-at Fault Model in Digital Circuits (Source: [Accendo Reliability](https://accendoreliability.com/digital-circuits-stuck-fault-model/))](./images/png/stuck_fault.png){#fig-stuck-fault} #### Impact on ML Systems -Permanent faults can severely affect the behavior and reliability of computing systems. For example, a stuck-at-fault in a processor's arithmetic logic unit (ALU) can cause incorrect computations, leading to erroneous results or system crashes. A permanent fault in a memory module, such as a stuck-at fault in a specific memory cell, can corrupt the stored data, causing data loss or program misbehavior. In storage devices, permanent faults like bad sectors or device failures can result in data inaccessibility or complete loss of stored information. Permanent interconnect faults can disrupt communication channels, causing data corruption or system hangs.\ \ Permanent faults can significantly affect ML systems during the training and inference phases. During training, permanent faults in processing units or memory can lead to incorrect computations, resulting in corrupted or suboptimal models. Faults in storage devices can corrupt the training data or the stored model parameters, leading to data loss or model inconsistencies @he2023understanding. During inference, permanent faults can impact the reliability and correctness of ML predictions. Faults in the processing units can produce incorrect results or cause system failures, while faults in memory storing the model parameters can lead to corrupted or outdated models being used for inference @zhang2018analyzing. +Permanent faults can severely affect the behavior and reliability of computing systems. For example, a stuck-at-fault in a processor's arithmetic logic unit (ALU) can cause incorrect computations, leading to erroneous results or system crashes. A permanent fault in a memory module, such as a stuck-at fault in a specific memory cell, can corrupt the stored data, causing data loss or program misbehavior. In storage devices, permanent faults like bad sectors or device failures can result in data inaccessibility or complete loss of stored information. Permanent interconnect faults can disrupt communication channels, causing data corruption or system hangs.\ \ Permanent faults can significantly affect ML systems during the training and inference phases. During training, permanent faults in processing units or memory can lead to incorrect computations, resulting in corrupted or suboptimal models. Faults in storage devices can corrupt the training data or the stored model parameters, leading to data loss or model inconsistencies [@he2023understanding]. During inference, permanent faults can impact the reliability and correctness of ML predictions. Faults in the processing units can produce incorrect results or cause system failures, while faults in memory storing the model parameters can lead to corrupted or outdated models being used for inference [@zhang2018analyzing]. -To mitigate the impact of permanent faults in ML systems, fault-tolerant techniques must be employed at both the hardware and software levels. Hardware redundancy, such as duplicating critical components or using error-correcting codes @kim2015bamboo, can help detect and recover from permanent faults. 
Software techniques, such as checkpoint and restart mechanisms @egwutuoha2013survey, can enable the system to recover from permanent faults by rolling back to a previously saved state. Regular monitoring, testing, and maintenance of ML systems can help identify and replace faulty components before they cause significant disruptions. +To mitigate the impact of permanent faults in ML systems, fault-tolerant techniques must be employed at both the hardware and software levels. Hardware redundancy, such as duplicating critical components or using error-correcting codes [@kim2015bamboo], can help detect and recover from permanent faults. Software techniques, such as checkpoint and restart mechanisms [@egwutuoha2013survey], can enable the system to recover from permanent faults by rolling back to a previously saved state. Regular monitoring, testing, and maintenance of ML systems can help identify and replace faulty components before they cause significant disruptions. Designing ML systems with fault tolerance in mind is crucial to ensure their reliability and robustness in the presence of permanent faults. This may involve incorporating redundancy, error detection and correction mechanisms, and fail-safe strategies into the system architecture. By proactively addressing the challenges posed by permanent faults, ML systems can maintain their integrity, accuracy, and trustworthiness, even in the face of hardware failures. @@ -180,28 +180,28 @@ Intermittent faults are hardware faults that occur sporadically and unpredictabl Intermittent faults are characterized by their sporadic and non-deterministic nature. They occur irregularly and may appear and disappear spontaneously, with varying durations and frequencies. These faults do not consistently manifest every time the affected component is used, making them harder to detect than permanent faults. Intermittent faults can affect various hardware components, including processors, memory modules, storage devices, or interconnects. They can cause transient errors, data corruption, or unexpected system behavior. -Intermittent faults can significantly impact the behavior and reliability of computing systems @rashid2014characterizing. For example, an intermittent fault in a processor's control logic can cause irregular program flow, leading to incorrect computations or system hangs. Intermittent faults in memory modules can corrupt data values, resulting in erroneous program execution or data inconsistencies. In storage devices, intermittent faults can cause read/write errors or data loss. Intermittent faults in communication channels can lead to data corruption, packet loss, or intermittent connectivity issues. These faults can cause system crashes, data integrity problems, or performance degradation, depending on the severity and frequency of the intermittent failures. +Intermittent faults can significantly impact the behavior and reliability of computing systems [@rashid2014characterizing]. For example, an intermittent fault in a processor's control logic can cause irregular program flow, leading to incorrect computations or system hangs. Intermittent faults in memory modules can corrupt data values, resulting in erroneous program execution or data inconsistencies. In storage devices, intermittent faults can cause read/write errors or data loss. Intermittent faults in communication channels can lead to data corruption, packet loss, or intermittent connectivity issues. 
These faults can cause system crashes, data integrity problems, or performance degradation, depending on the severity and frequency of the intermittent failures. ![Increased resistance due to an intermittent fault -- crack between copper bump and package solder (Source: [Constantinescu](https://ieeexplore.ieee.org/document/4925824))](./images/png/intermittent_fault.png){#fig-intermittent-fault} #### Causes of Intermittent Faults -Intermittent faults can arise from several causes, both internal and external to the hardware components @constantinescu2008intermittent. One common cause is aging and wear-out of the components. As electronic devices age, they become more susceptible to intermittent failures due to degradation mechanisms such as electromigration, oxide breakdown, or solder joint fatigue. Manufacturing defects or process variations can also introduce intermittent faults, where marginal or borderline components may exhibit sporadic failures under specific conditions ([Figure +Intermittent faults can arise from several causes, both internal and external to the hardware components [@constantinescu2008intermittent]. One common cause is aging and wear-out of the components. As electronic devices age, they become more susceptible to intermittent failures due to degradation mechanisms such as electromigration, oxide breakdown, or solder joint fatigue. Manufacturing defects or process variations can also introduce intermittent faults, where marginal or borderline components may exhibit sporadic failures under specific conditions ([Figure 19.x](#kix.7lswkjecl7ra)). Environmental factors, such as temperature fluctuations, humidity, or vibrations, can trigger intermittent faults by altering the electrical characteristics of the components. Loose or degraded connections, such as those in connectors or printed circuit boards, can also cause intermittent faults. ![Residue induced intermittent fault in a DRAM chip (Source: [Hynix Semiconductor](https://ieeexplore.ieee.org/document/4925824))](./images/png/intermittent_fault_dram.png){#fig-intermittent-fault-dram} #### Mechanisms of Intermittent Faults -Intermittent faults can manifest through various mechanisms, depending on the underlying cause and the affected component. One mechanism is the intermittent open or short circuit, where a signal path or connection becomes temporarily disrupted or shorted, causing erratic behavior. Another mechanism is the intermittent delay fault @zhang2018thundervolt, where the timing of signals or propagation delays becomes inconsistent, leading to synchronization issues or incorrect computations. Intermittent faults can also manifest as transient bit flips or soft errors in memory cells or registers, causing data corruption or incorrect program execution. +Intermittent faults can manifest through various mechanisms, depending on the underlying cause and the affected component. One mechanism is the intermittent open or short circuit, where a signal path or connection becomes temporarily disrupted or shorted, causing erratic behavior. Another mechanism is the intermittent delay fault [@zhang2018thundervolt], where the timing of signals or propagation delays becomes inconsistent, leading to synchronization issues or incorrect computations. Intermittent faults can also manifest as transient bit flips or soft errors in memory cells or registers, causing data corruption or incorrect program execution. 
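To make the bit-flip mechanism concrete, the following sketch (a hypothetical NumPy example, not drawn from any of the cited fault-injection tools) emulates a single-event upset by XOR-ing one bit of a float32 value; the weight values, the targeted index, and the bit position are chosen purely for illustration.

```python
import numpy as np

def flip_bit(values: np.ndarray, index: int, bit: int) -> np.ndarray:
    """Emulate a single-event upset by flipping one bit of one float32 element."""
    out = values.copy()
    bits = out.view(np.uint32)           # reinterpret the float32 bit patterns as integers
    bits[index] ^= np.uint32(1 << bit)   # XOR toggles exactly one bit in place
    return out

# Toy weight vector standing in for one layer of a trained model.
weights = np.array([0.12, -0.48, 0.90, 0.33], dtype=np.float32)

# Flipping a high exponent bit (bit 30) turns a small weight into an enormous one.
corrupted = flip_bit(weights, index=2, bit=30)
print("original :", weights)
print("corrupted:", corrupted)
```

Flipping a high exponent bit shows why a single upset in a weight or activation can be enough to derail an otherwise healthy computation, which is the same idea software fault-injection studies build on.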
#### Impact on ML Systems -In the context of ML systems, intermittent faults can introduce significant challenges and impact the system's reliability and performance. During the training phase, intermittent faults in processing units or memory can lead to inconsistencies in computations, resulting in incorrect or noisy gradients and weight updates. This can affect the convergence and accuracy of the training process, leading to suboptimal or unstable models. Intermittent data storage or retrieval faults can corrupt the training data, introducing noise or errors that degrade the quality of the learned models @he2023understanding. +In the context of ML systems, intermittent faults can introduce significant challenges and impact the system's reliability and performance. During the training phase, intermittent faults in processing units or memory can lead to inconsistencies in computations, resulting in incorrect or noisy gradients and weight updates. This can affect the convergence and accuracy of the training process, leading to suboptimal or unstable models. Intermittent data storage or retrieval faults can corrupt the training data, introducing noise or errors that degrade the quality of the learned models [@he2023understanding]. During the inference phase, intermittent faults can impact the reliability and consistency of ML predictions. Faults in the processing units or memory can cause incorrect computations or data corruption, leading to erroneous or inconsistent predictions. Intermittent faults in the data pipeline can introduce noise or errors in the input data, affecting the accuracy and robustness of the predictions. In safety-critical applications, such as autonomous vehicles or medical diagnosis systems, intermittent faults can have severe consequences, leading to incorrect decisions or actions that compromise safety and reliability. -Mitigating the impact of intermittent faults in ML systems requires a multi-faceted approach @rashid2012intermittent. At the hardware level, techniques such as robust design practices, component selection, and environmental control can help reduce the occurrence of intermittent faults. Redundancy and error correction mechanisms can be employed to detect and recover from intermittent failures. At the software level, runtime monitoring, anomaly detection, and fault-tolerant techniques can be incorporated into the ML pipeline. This may include techniques such as data validation, outlier detection, model ensembling, or runtime model adaptation to handle intermittent faults gracefully. +Mitigating the impact of intermittent faults in ML systems requires a multi-faceted approach [@rashid2012intermittent]. At the hardware level, techniques such as robust design practices, component selection, and environmental control can help reduce the occurrence of intermittent faults. Redundancy and error correction mechanisms can be employed to detect and recover from intermittent failures. At the software level, runtime monitoring, anomaly detection, and fault-tolerant techniques can be incorporated into the ML pipeline. This may include techniques such as data validation, outlier detection, model ensembling, or runtime model adaptation to handle intermittent faults gracefully. Designing ML systems that are resilient to intermittent faults is crucial to ensuring their reliability and robustness. This involves incorporating fault-tolerant techniques, runtime monitoring, and adaptive mechanisms into the system architecture. 
By proactively addressing the challenges of intermittent faults, ML systems can maintain their accuracy, consistency, and trustworthiness, even in sporadic hardware failures. Regular testing, monitoring, and maintenance of ML systems can help identify and mitigate intermittent faults before they cause significant disruptions or performance degradation. @@ -217,24 +217,24 @@ Fault detection techniques are important for identifying and localizing hardware Hardware-level fault detection techniques are implemented at the physical level of the system and aim to identify faults in the underlying hardware components. There are several hardware techniques, but broadly, we can bucket these different mechanisms into the following categories. -**Built-in self-test (BIST) mechanisms:** BIST is a powerful technique for detecting faults in hardware components @bushnell2002built. It involves incorporating additional hardware circuitry into the system for self-testing and fault detection. BIST can be applied to various components, such as processors, memory modules, or application-specific integrated circuits (ASICs). For example, BIST can be implemented in a processor using scan chains, which are dedicated paths that allow access to internal registers and logic for testing purposes. +**Built-in self-test (BIST) mechanisms:** BIST is a powerful technique for detecting faults in hardware components [@bushnell2002built]. It involves incorporating additional hardware circuitry into the system for self-testing and fault detection. BIST can be applied to various components, such as processors, memory modules, or application-specific integrated circuits (ASICs). For example, BIST can be implemented in a processor using scan chains, which are dedicated paths that allow access to internal registers and logic for testing purposes. During the BIST process, predefined test patterns are applied to the processor's internal circuitry, and the responses are compared against expected values. Any discrepancies indicate the presence of faults. Intel's Xeon processors, for instance, include BIST mechanisms to test the CPU cores, cache memory, and other critical components during system startup. -**Error detection codes:** Error detection codes are widely used to detect data storage and transmission errors @hamming1950error. These codes add redundant bits to the original data, allowing the detection of bit errors. Example: Parity checks are a simple form of error detection code ([Figure 19.x](#kix.2vxlbeehnemj)). In a single-bit parity scheme, an extra bit is appended to each data word, making the number of 1s in the word even (even parity) or odd (odd parity). +**Error detection codes:** Error detection codes are widely used to detect data storage and transmission errors [@hamming1950error]. These codes add redundant bits to the original data, allowing the detection of bit errors. Example: Parity checks are a simple form of error detection code ([Figure 19.x](#kix.2vxlbeehnemj)). In a single-bit parity scheme, an extra bit is appended to each data word, making the number of 1s in the word even (even parity) or odd (odd parity). ![Parity bit example (Source: [Computer Hope](https://www.computerhope.com/jargon/p/paritybi.htm))](./images/png/parity.png){#fig-parity} When reading the data, the parity is checked, and if it doesn't match the expected value, an error is detected. More advanced error detection codes, such as cyclic redundancy checks (CRC), calculate a checksum based on the data and append it to the message.
The checksum is recalculated at the receiving end and compared with the transmitted checksum to detect errors. Error-correcting code (ECC) memory modules, commonly used in servers and critical systems, employ advanced error detection and correction codes to detect and correct single-bit or multi-bit errors in memory. -**Hardware redundancy and voting mechanisms:** Hardware redundancy involves duplicating critical components and comparing their outputs to detect and mask faults @sheaffer2007hardware. Voting mechanisms, such as triple modular redundancy (TMR), employ multiple instances of a component and compare their outputs to identify and mask faulty behavior @arifeen2020approximate. In a TMR system, three identical instances of a hardware component, such as a processor or a sensor, perform the same computation in parallel. The outputs of these instances are fed into a voting circuit, which compares the results and selects the majority value as the final output. If one of the instances produces an incorrect result due to a fault, the voting mechanism masks the error and maintains the correct output. TMR is commonly used in aerospace and aviation systems, where high reliability is critical. For instance, the Boeing 777 aircraft employs TMR in its primary flight computer system to ensure the availability and correctness of flight control functions @yeh1996triple. +**Hardware redundancy and voting mechanisms:** Hardware redundancy involves duplicating critical components and comparing their outputs to detect and mask faults [@sheaffer2007hardware]. Voting mechanisms, such as triple modular redundancy (TMR), employ multiple instances of a component and compare their outputs to identify and mask faulty behavior [@arifeen2020approximate]. In a TMR system, three identical instances of a hardware component, such as a processor or a sensor, perform the same computation in parallel. The outputs of these instances are fed into a voting circuit, which compares the results and selects the majority value as the final output. If one of the instances produces an incorrect result due to a fault, the voting mechanism masks the error and maintains the correct output. TMR is commonly used in aerospace and aviation systems, where high reliability is critical. For instance, the Boeing 777 aircraft employs TMR in its primary flight computer system to ensure the availability and correctness of flight control functions [@yeh1996triple]. Tesla's self-driving computers (SDCs) employ a redundant hardware architecture to ensure the safety and reliability of critical functions, such as perception, decision-making, and vehicle control ([Figure 19.x](#kix.nsc1yczcug9r)). One key component of this architecture is using DMR in the car's onboard computer systems. ![Tesla full self-driving computer with dual redundant SoCs (Source: [Tesla](https://old.hotchips.org/hc31/HC31_2.3_Tesla_Hotchips_ppt_Final_0817.pdf))](./images/png/tesla_dmr.png){#fig-tesla-dmr} -In Tesla's DMR implementation, two identical hardware units, often called \"redundant computers\" or \"redundant control units,\" perform the same computations in parallel @bannon2019computer. Each unit independently processes sensor data, executes perception and decision-making algorithms, and generates control commands for the vehicle's actuators (e.g., steering, acceleration, and braking). 
+In Tesla's DMR implementation, two identical hardware units, often called \"redundant computers\" or \"redundant control units,\" perform the same computations in parallel [@bannon2019computer]. Each unit independently processes sensor data, executes perception and decision-making algorithms, and generates control commands for the vehicle's actuators (e.g., steering, acceleration, and braking). The outputs of these two redundant units are continuously compared to detect any discrepancies or faults. If the outputs match, the system assumes that both units are functioning correctly, and the control commands are sent to the vehicle's actuators. However, if there is a mismatch between the outputs, the system identifies a potential fault in one of the units and takes appropriate action to ensure safe operation. @@ -248,7 +248,7 @@ It's important to note that while DMR provides fault detection and some level of The use of DMR in Tesla's SDCs highlights the importance of hardware redundancy in safety-critical applications. By employing redundant computing units and comparing their outputs, the system can detect and mitigate faults, enhancing the overall safety and reliability of the self-driving functionality. -**Watchdog timers:** Watchdog timers are hardware components that monitor the execution of critical tasks or processes @pont2002using. They are commonly used to detect and recover from software or hardware faults that cause a system to become unresponsive or stuck in an infinite loop. In an embedded system, a watchdog timer can be configured to monitor the execution of the main control loop ([Figure +**Watchdog timers:** Watchdog timers are hardware components that monitor the execution of critical tasks or processes [@pont2002using]. They are commonly used to detect and recover from software or hardware faults that cause a system to become unresponsive or stuck in an infinite loop. In an embedded system, a watchdog timer can be configured to monitor the execution of the main control loop ([Figure 19.x](#3l259jcz0lli)). The software periodically resets the watchdog timer to indicate that it functions correctly. Suppose the software fails to reset the timer within a specified time limit (timeout period). In that case, the watchdog timer assumes that the system has encountered a fault and triggers a predefined recovery action, such as resetting the system or switching to a backup component. Watchdog timers are widely used in automotive electronics, industrial control systems, and other safety-critical applications to ensure the timely detection and recovery from faults. ![Watchdog timer example in detecting MCU faults (Source: [Ablic](https://www.ablic.com/en/semicon/products/automotive/automotive-watchdog-timer/intro/))](./images/png/watchdog.png){#fig-watchdog} @@ -257,21 +257,21 @@ The use of DMR in Tesla's SDCs highlights the importance of hardware redundancy Software-level fault detection techniques rely on software algorithms and monitoring mechanisms to identify system faults. These techniques can be implemented at various levels of the software stack, including the operating system, middleware, or application level. -**Runtime monitoring and anomaly detection:** Runtime monitoring involves continuously observing the behavior of the system and its components during execution @francalanza2017foundation. It helps detect anomalies, errors, or unexpected behavior that may indicate the presence of faults. 
For example, Consider an ML-based image classification system deployed in a self-driving car. Runtime monitoring can be implemented to track the classification model's performance and behavior. +**Runtime monitoring and anomaly detection:** Runtime monitoring involves continuously observing the behavior of the system and its components during execution [@francalanza2017foundation]. It helps detect anomalies, errors, or unexpected behavior that may indicate the presence of faults. For example, consider an ML-based image classification system deployed in a self-driving car. Runtime monitoring can be implemented to track the classification model's performance and behavior. -Anomaly detection algorithms can be applied to the model's predictions or intermediate layer activations, such as statistical outlier detection or machine learning-based approaches (e.g., One-Class SVM or Autoencoders) @chandola2009anomaly ([Figure 19.x](#a0u8fu59ui0r)). Suppose the monitoring system detects a significant deviation from the expected patterns, such as a sudden drop in classification accuracy or out-of-distribution samples. In that case, it can raise an alert indicating a potential fault in the model or the input data pipeline. This early detection allows for timely intervention and fault mitigation strategies to be applied. +Anomaly detection algorithms can be applied to the model's predictions or intermediate layer activations, such as statistical outlier detection or machine learning-based approaches (e.g., One-Class SVM or Autoencoders) [@chandola2009anomaly] ([Figure 19.x](#a0u8fu59ui0r)). Suppose the monitoring system detects a significant deviation from the expected patterns, such as a sudden drop in classification accuracy or out-of-distribution samples. In that case, it can raise an alert indicating a potential fault in the model or the input data pipeline. This early detection allows for timely intervention and fault mitigation strategies to be applied. ![Examples of anomaly detection. (a) Fully supervised anomaly detection, (b) normal-only anomaly detection, (c, d, e) semi-supervised anomaly detection, (f) unsupervised anomaly detection (Source: [Google](https://www.google.com/url?sa=i&url=http%3A%2F%2Fresearch.google%2Fblog%2Funsupervised-and-semi-supervised-anomaly-detection-with-data-centric-ml%2F&psig=AOvVaw1p9owe13lxfZogUHTZnxrj&ust=1714877457779000&source=images&cd=vfe&opi=89978449&ved=0CBIQjRxqFwoTCIjMmMP-8oUDFQAAAAAdAAAAABAE))](./images/png/ad.png){#fig-ad} -**Consistency checks and data validation:** Consistency checks and data validation techniques ensure data integrity and correctness at different processing stages in an ML system @lindholm2019data. These checks help detect data corruption, inconsistencies, or errors that may propagate and affect the system's behavior. Example: In a distributed ML system where multiple nodes collaborate to train a model, consistency checks can be implemented to validate the integrity of the shared model parameters. Each node can compute a checksum or hash of the model parameters before and after the training iteration (Figure ). Any inconsistencies or data corruption can be detected by comparing the checksums across nodes. Additionally, range checks can be applied to the input data and model outputs to ensure they fall within expected bounds. For instance, if an autonomous vehicle's perception system detects an object with unrealistic dimensions or velocities, it can indicate a fault in the sensor data or the perception algorithms @wan2023vpp.
+**Consistency checks and data validation:** Consistency checks and data validation techniques ensure data integrity and correctness at different processing stages in an ML system [@lindholm2019data]. These checks help detect data corruption, inconsistencies, or errors that may propagate and affect the system's behavior. Example: In a distributed ML system where multiple nodes collaborate to train a model, consistency checks can be implemented to validate the integrity of the shared model parameters. Each node can compute a checksum or hash of the model parameters before and after the training iteration (Figure ). Any inconsistencies or data corruption can be detected by comparing the checksums across nodes. Additionally, range checks can be applied to the input data and model outputs to ensure they fall within expected bounds. For instance, if an autonomous vehicle's perception system detects an object with unrealistic dimensions or velocities, it can indicate a fault in the sensor data or the perception algorithms [@wan2023vpp]. -**Heartbeat and timeout mechanisms:** Heartbeat mechanisms and timeouts are commonly used to detect faults in distributed systems and ensure the liveness and responsiveness of components @kawazoe1997heartbeat. These are quite similar to the watchdog timers found in hardware. For example, in a distributed ML system, where multiple nodes collaborate to perform tasks such as data preprocessing, model training, or inference, heartbeat mechanisms can be implemented to monitor the health and availability of each node. Each node periodically sends a heartbeat message to a central coordinator or to its peer nodes, indicating its status and availability. Suppose a node fails to send a heartbeat within a specified timeout period ([Figure 19.x](#ojufkz2g56e)). In that case, it is considered faulty, and appropriate actions can be taken, such as redistributing the workload or initiating a failover mechanism. Timeouts can also be used to detect and handle hanging or unresponsive components. For example, if a data loading process exceeds a predefined timeout threshold, it may indicate a fault in the data pipeline, and the system can take corrective measures. +**Heartbeat and timeout mechanisms:** Heartbeat mechanisms and timeouts are commonly used to detect faults in distributed systems and ensure the liveness and responsiveness of components [@kawazoe1997heartbeat]. These are quite similar to the watchdog timers found in hardware. For example, in a distributed ML system, where multiple nodes collaborate to perform tasks such as data preprocessing, model training, or inference, heartbeat mechanisms can be implemented to monitor the health and availability of each node. Each node periodically sends a heartbeat message to a central coordinator or to its peer nodes, indicating its status and availability. Suppose a node fails to send a heartbeat within a specified timeout period ([Figure 19.x](#ojufkz2g56e)). In that case, it is considered faulty, and appropriate actions can be taken, such as redistributing the workload or initiating a failover mechanism. Timeouts can also be used to detect and handle hanging or unresponsive components. For example, if a data loading process exceeds a predefined timeout threshold, it may indicate a fault in the data pipeline, and the system can take corrective measures. 
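The sketch below gives a minimal, in-process picture of the heartbeat-and-timeout idea; it is an illustrative Python example in which threads stand in for distributed workers, and the function names, node identifiers, and interval/timeout values are assumptions rather than any particular framework's API.

```python
import time
import threading

HEARTBEAT_INTERVAL = 1.0   # seconds between heartbeats (illustrative value)
TIMEOUT = 3.0              # a node is flagged after this much silence (illustrative value)

last_seen = {}             # node id -> time of the most recent heartbeat
lock = threading.Lock()

def record_heartbeat(node_id):
    """Each worker calls this periodically to report that it is alive."""
    with lock:
        last_seen[node_id] = time.monotonic()

def monitor(nodes, stop_event):
    """Coordinator loop: flags nodes whose heartbeats have gone silent."""
    while not stop_event.is_set():
        now = time.monotonic()
        with lock:
            for node in nodes:
                if now - last_seen.get(node, now) > TIMEOUT:
                    print(f"{node} missed its heartbeat; redistributing its work")
        time.sleep(HEARTBEAT_INTERVAL)

# Example wiring: watch two workers; in a real system the heartbeats arrive over the network.
stop_event = threading.Event()
threading.Thread(target=monitor, args=(["worker-0", "worker-1"], stop_event), daemon=True).start()
record_heartbeat("worker-0")
record_heartbeat("worker-1")
```

In a production training cluster, the same pattern runs over the network, and the recovery action is typically rescheduling the failed worker's shard or triggering a failover.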
![Heartbeat messages in distributed systems (Source: [GeeksforGeeks](https://www.geeksforgeeks.org/what-are-heartbeat-messages/))](./images/png/heartbeat.png){#fig-heartbeat} -**Software-implemented fault tolerance (SIFT) techniques:** SIFT techniques introduce redundancy and fault detection mechanisms at the software level to enhance the reliability and fault tolerance of the system @reis2005swift. Example: N-version programming is a SIFT technique where multiple functionally equivalent software component versions are developed independently by different teams. This can be applied to critical components such as the model inference engine in an ML system. Multiple versions of the inference engine can be executed in parallel, and their outputs can be compared for consistency. It is considered the correct result if most versions produce the same output. If there is a discrepancy, it indicates a potential fault in one or more versions, and appropriate error-handling mechanisms can be triggered. Another example is using software-based error correction codes, such as Reed-Solomon codes @plank1997tutorial, to detect and correct errors in data storage or transmission ([Figure 19.x](#kjmtegsny44z)). These codes add redundancy to the data, enabling the detection and correction of certain errors and enhancing the system's fault tolerance. +**Software-implemented fault tolerance (SIFT) techniques:** SIFT techniques introduce redundancy and fault detection mechanisms at the software level to enhance the reliability and fault tolerance of the system [@reis2005swift]. Example: N-version programming is a SIFT technique where multiple functionally equivalent software component versions are developed independently by different teams. This can be applied to critical components such as the model inference engine in an ML system. Multiple versions of the inference engine can be executed in parallel, and their outputs can be compared for consistency. It is considered the correct result if most versions produce the same output. If there is a discrepancy, it indicates a potential fault in one or more versions, and appropriate error-handling mechanisms can be triggered. Another example is using software-based error correction codes, such as Reed-Solomon codes [@plank1997tutorial], to detect and correct errors in data storage or transmission ([Figure 19.x](#kjmtegsny44z)). These codes add redundancy to the data, enabling the detection and correction of certain errors and enhancing the system's fault tolerance. ![n-bits representation of the Reed-Solomon codes (Source: [GeeksforGeeks](https://www.geeksforgeeks.org/what-is-reed-solomon-code/))](./images/png/Reed-Solomon.png){#fig-Reed-Solomon} @@ -295,25 +295,25 @@ Table XX provides an extensive comparative analysis of transient, permanent, and #### Definition and Characteristics -Adversarial attacks are methods that aim to trick models into making incorrect predictions by providing it with specially crafted, deceptive inputs (called adversarial examples) \[@parrish2023adversarial\]. By adding slight perturbations to input data, adversaries can \"hack\" a model's pattern recognition and deceive it. These are sophisticated techniques where slight, often imperceptible alterations to input data can trick an ML model into making a wrong prediction.
+Adversarial attacks are methods that aim to trick models into making incorrect predictions by providing them with specially crafted, deceptive inputs (called adversarial examples) [@parrish2023adversarial]. By adding slight perturbations to input data, adversaries can \"hack\" a model's pattern recognition and deceive it. These are sophisticated techniques where slight, often imperceptible alterations to input data can trick an ML model into making a wrong prediction. -In text-to-image models like DALLE \[@ramesh2021zero\] or Stable Diffusion \[@rombach2022highresolution\], one can generate prompts that lead to unsafe images. For example, by altering the pixel values of an image, attackers can deceive a facial recognition system into identifying a face as a different person. +In text-to-image models like DALLE [@ramesh2021zero] or Stable Diffusion [@rombach2022highresolution], one can generate prompts that lead to unsafe images. For example, by altering the pixel values of an image, attackers can deceive a facial recognition system into identifying a face as a different person. Adversarial attacks exploit the way ML models learn and make decisions during inference. These models work on the principle of recognizing patterns in data. An adversary crafts special inputs with perturbations to mislead the model's pattern recognition\-\--essentially 'hacking' the model's perceptions. Adversarial attacks fall under different scenarios: -\* \*\*Whitebox Attacks:\*\* the attacker possess full knowledge of the target model's internal workings, including the training data,parameters, and architecture @ye2021thundernna. This comprehensive access creates favorable conditions for an attacker to exploit the model's vulnerabilities. The attacker can take advantage of specific and subtle weaknesses to craft effective adversarial examples. +\* \*\*Whitebox Attacks:\*\* the attacker possesses full knowledge of the target model's internal workings, including the training data, parameters, and architecture [@ye2021thundernna]. This comprehensive access creates favorable conditions for an attacker to exploit the model's vulnerabilities. The attacker can take advantage of specific and subtle weaknesses to craft effective adversarial examples, as illustrated in the sketch after this list. -\* \*\*Blackbox Attacks:\*\* in contrast to whitebox attacks, in blackbox attacks, the attacker has little to no knowledge of the target model @guo2019simple. To carry out the attack, the adversarial actor needs to make careful observations of the model's output behavior. +\* \*\*Blackbox Attacks:\*\* in contrast to whitebox attacks, in blackbox attacks, the attacker has little to no knowledge of the target model [@guo2019simple]. To carry out the attack, the adversarial actor needs to make careful observations of the model's output behavior. -\* \*\*Greybox Attacks:\*\* these fall in between blackbox and whitebox attacks. The attacker has only partial knowledge about the target model's internal design @xu2021grey. For example, the attacker could have knowledge about training data but not the architecture or parameters. In the real-world, practical attacks fall under both blackbox and greybox scenarios. +\* \*\*Greybox Attacks:\*\* these fall in between blackbox and whitebox attacks. The attacker has only partial knowledge about the target model's internal design [@xu2021grey]. For example, the attacker could have knowledge about training data but not the architecture or parameters. In the real world, practical attacks fall under both blackbox and greybox scenarios.
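To ground the whitebox scenario, the following toy sketch applies an FGSM-style perturbation to a small logistic-regression classifier; the weights, input, and perturbation budget are invented for illustration, and real attacks operate on far larger models.

```python
import numpy as np

# Toy "trained" logistic-regression classifier the attacker fully knows (whitebox).
w = np.array([1.5, -2.0, 0.5])
b = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x):
    return int(sigmoid(w @ x + b) > 0.5)

# A benign input that the model classifies as class 1.
x = np.array([0.9, -0.2, 0.4])
y = 1

# FGSM-style step: for logistic regression the gradient of the cross-entropy
# loss with respect to the input is (p - y) * w, so the attacker nudges the
# input along sign(gradient), scaled by a small budget epsilon.
p = sigmoid(w @ x + b)
grad_x = (p - y) * w
epsilon = 0.6                       # perturbation budget (illustrative)
x_adv = x + epsilon * np.sign(grad_x)

print("clean prediction      :", predict(x))      # 1
print("adversarial prediction:", predict(x_adv))  # flips to 0 for these numbers
```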
The landscape of machine learning models is both complex and broad, especially given their relatively recent integration into commercial applications. This rapid adoption, while transformative, has brought to light numerous vulnerabilities within these models. Consequently, a diverse array of adversarial attack methods has emerged, each strategically exploiting different aspects of different models. Below, we highlight a subset of these methods, showcasing the multifaceted nature of adversarial attacks on machine learning models: -\* \*\*Generative Adversarial Networks (GANs)\*\* are deep learning models that consist of two networks competing against each other: a generator and and a discriminator \[@goodfellow2020generative\]. The generator tries to synthesize realistic data, while the discriminator evaluates whether they are real or fake. GANs can be used to craft adversarial examples. The generator network is trained to produce inputs that are misclassified by the target model. These GAN-generated images can then be used to attack a target classifier or detection model. The generator and the target model are engaged in a competitive process, with the generator continually improving its ability to create deceptive examples, and the target model enhancing its resistance to such examples. GANs provide a powerful framework for crafting complex and diverse adversarial inputs, illustrating the adaptability of generative models in the adversarial landscape. +\* \*\*Generative Adversarial Networks (GANs)\*\* are deep learning models that consist of two networks competing against each other: a generator and a discriminator [@goodfellow2020generative]. The generator tries to synthesize realistic data, while the discriminator evaluates whether the data are real or fake. GANs can be used to craft adversarial examples. The generator network is trained to produce inputs that are misclassified by the target model. These GAN-generated images can then be used to attack a target classifier or detection model. The generator and the target model are engaged in a competitive process, with the generator continually improving its ability to create deceptive examples, and the target model enhancing its resistance to such examples. GANs provide a powerful framework for crafting complex and diverse adversarial inputs, illustrating the adaptability of generative models in the adversarial landscape. -\* \*\*Transfer Learning Adversarial Attacks\*\* exploit the knowledge transferred from a pre-trained model to a target model, enabling the creation of adversarial examples that can deceive both models.These attacks pose a growing concern, particularly when adversaries have knowledge of the feature extractor but lack access to the classification head (the part or layer that is responsible for making the final classifications). Referred to as\"headless attacks,\" these transferable adversarial strategies leverage the expressive capabilities of feature extractors to craft perturbations while being oblivious to the label space or training data. The existence of such attacks underscores the importance of developing robust defenses for transfer learning applications, especially since pre-trained models are commonly used \[@ahmed2020headless\].
+\* \*\*Transfer Learning Adversarial Attacks\*\* exploit the knowledge transferred from a pre-trained model to a target model, enabling the creation of adversarial examples that can deceive both models. These attacks pose a growing concern, particularly when adversaries have knowledge of the feature extractor but lack access to the classification head (the part or layer that is responsible for making the final classifications). Referred to as \"headless attacks,\" these transferable adversarial strategies leverage the expressive capabilities of feature extractors to craft perturbations while being oblivious to the label space or training data. The existence of such attacks underscores the importance of developing robust defenses for transfer learning applications, especially since pre-trained models are commonly used [@ahmed2020headless]. #### Mechanisms of Adversarial Attacks @@ -373,7 +373,7 @@ As the field of adversarial machine learning evolves, researchers continue to ex Adversarial attacks on machine learning systems have emerged as a significant concern in recent years, highlighting the potential vulnerabilities and risks associated with the widespread adoption of ML technologies. These attacks involve carefully crafted perturbations to input data that can deceive or mislead ML models, leading to incorrect predictions or misclassifications. The impact of adversarial attacks on ML systems is far-reaching and can have serious consequences in various domains. -One striking example of the impact of adversarial attacks was demonstrated by researchers in 2017. They experimented with small black and white stickers on stop signs \[@eykholt2018robust\]. To the human eye, these stickers did not obscure the sign or prevent its interpretability. However, when images of the sticker-modified stop signs were fed into standard traffic sign classification ML models, a shocking result emerged. The models misclassified the stop signs as speed limit signs over 85% of the time. +One striking example of the impact of adversarial attacks was demonstrated by researchers in 2017. They experimented with small black and white stickers on stop signs [@eykholt2018robust]. To the human eye, these stickers did not obscure the sign or prevent its interpretability. However, when images of the sticker-modified stop signs were fed into standard traffic sign classification ML models, a shocking result emerged. The models misclassified the stop signs as speed limit signs over 85% of the time. This demonstration shed light on the alarming potential of simple adversarial stickers to trick ML systems into misreading critical road signs. The implications of such attacks in the real world are significant, particularly in the context of autonomous vehicles. If deployed on actual roads, these adversarial stickers could cause self-driving cars to misinterpret stop signs as speed limits, leading to dangerous situations. Researchers warned that this could result in rolling stops or unintended acceleration into intersections, endangering public safety. @@ -383,11 +383,11 @@ This demonstration shed light on the alarming potential of simple adversarial st The case study of the adversarial stickers on stop signs provides a concrete illustration of how adversarial examples exploit how ML models recognize patterns.
By subtly manipulating the input data in ways that are imperceptible to humans, attackers can induce incorrect predictions and create serious risks, especially in safety-critical applications like autonomous vehicles. The attack's simplicity highlights the vulnerability of ML models to even minor changes in the input, emphasizing the need for robust defenses against such threats. -The impact of adversarial attacks extends beyond the degradation of model performance. These attacks raise significant security and safety concerns, particularly in domains where ML models are relied upon for critical decision-making. In healthcare applications, adversarial attacks on medical imaging models could lead to misdiagnosis or incorrect treatment recommendations, jeopardizing patient well-being @tsai2023adversarial. In financial systems, adversarial attacks could enable fraud or manipulation of trading algorithms, resulting in substantial economic losses. +The impact of adversarial attacks extends beyond the degradation of model performance. These attacks raise significant security and safety concerns, particularly in domains where ML models are relied upon for critical decision-making. In healthcare applications, adversarial attacks on medical imaging models could lead to misdiagnosis or incorrect treatment recommendations, jeopardizing patient well-being [@tsai2023adversarial]. In financial systems, adversarial attacks could enable fraud or manipulation of trading algorithms, resulting in substantial economic losses. -Moreover, adversarial vulnerabilities undermine the trustworthiness and interpretability of ML models. If carefully crafted perturbations can easily fool models, it erodes confidence in their predictions and decisions. Adversarial examples expose the models' reliance on superficial patterns and their inability to capture the true underlying concepts, challenging the reliability of ML systems @fursov2021adversarial. +Moreover, adversarial vulnerabilities undermine the trustworthiness and interpretability of ML models. If carefully crafted perturbations can easily fool models, it erodes confidence in their predictions and decisions. Adversarial examples expose the models' reliance on superficial patterns and their inability to capture the true underlying concepts, challenging the reliability of ML systems [@fursov2021adversarial]. -Defending against adversarial attacks often requires additional computational resources and can impact the overall system performance. Techniques like adversarial training, where models are trained on adversarial examples to improve robustness, can significantly increase training time and computational requirements @bai2021recent. Runtime detection and mitigation mechanisms, such as input preprocessing @addepalli2020towards or prediction consistency checks, introduce latency and affect the real-time performance of ML systems. +Defending against adversarial attacks often requires additional computational resources and can impact the overall system performance. Techniques like adversarial training, where models are trained on adversarial examples to improve robustness, can significantly increase training time and computational requirements [@bai2021recent]. Runtime detection and mitigation mechanisms, such as input preprocessing [@addepalli2020towards] or prediction consistency checks, introduce latency and affect the real-time performance of ML systems. The presence of adversarial vulnerabilities also complicates the deployment and maintenance of ML systems. 
System designers and operators must consider the potential for adversarial attacks and incorporate appropriate defenses and monitoring mechanisms. Regular updates and retraining of models become necessary to adapt to new adversarial techniques and maintain system security and performance over time. @@ -397,7 +397,7 @@ The impact of adversarial attacks on ML systems is significant and multifaceted. #### Definition and Characteristics -Data poisoning is an attack where the training data is tampered with, leading to a compromised model \[@biggio2012poisoning\]. Attackers can modify existing training examples, insert new malicious data points, or influence the data collection process. The poisoned data is labeled in such a way as to skew the model's learned behavior. This can be particularly damaging in applications where ML models make automated decisions based on learned patterns. Beyond training sets, poisoning tests and validation data can allow adversaries to boost reported model performance artificially. +Data poisoning is an attack where the training data is tampered with, leading to a compromised model [@biggio2012poisoning]. Attackers can modify existing training examples, insert new malicious data points, or influence the data collection process. The poisoned data is labeled in such a way as to skew the model's learned behavior. This can be particularly damaging in applications where ML models make automated decisions based on learned patterns. Beyond training sets, poisoning test and validation data can allow adversaries to boost reported model performance artificially. ![NightShade's poisoning effects on Stable Diffusion (Source: [TOMÉ](https://telefonicatech.com/en/blog/attacks-on-artificial-intelligence-iii-data-poisoning))](./images/png/poisoning_example.png){#fig-poisoning-example} @@ -409,9 +409,9 @@ The process usually involves the following steps: \* \*\*Deployment:\*\* Once the model is deployed, the corrupted training leads to flawed decision-making or predictable vulnerabilities the attacker can exploit. -The impacts of data poisoning extend beyond just classification errors or accuracy drops. In critical applications like healthcare, such alterations can lead to significant trust and safety issues @marulli2022sensitivity. Later on we will discuss a few case studies of these issues. +The impacts of data poisoning extend beyond just classification errors or accuracy drops. In critical applications like healthcare, such alterations can lead to significant trust and safety issues [@marulli2022sensitivity]. Later on, we will discuss a few case studies of these issues. -There are six main categories of data poisoning \[@oprea2022poisoning\]: +There are six main categories of data poisoning [@oprea2022poisoning]: \* \*\*Availability Attacks\*\*: these attacks aim to compromise the overall functionality of a model. They cause it to misclassify most testing samples, rendering the model unusable for practical applications. An example is label flipping, where labels of a specific, targeted class are replaced with labels from a different one. @@ -441,7 +441,7 @@ The characteristics of data poisoning include: Data poisoning attacks can be carried out through various mechanisms, exploiting different ML pipeline vulnerabilities. These mechanisms allow attackers to manipulate the training data and introduce malicious samples that can compromise the model's performance, fairness, or integrity.
Understanding these mechanisms is crucial for developing effective defenses against data poisoning and ensuring the robustness of ML systems. Data poisoning mechanisms can be broadly categorized based on the attacker's approach and the stage of the ML pipeline they target. Some common mechanisms include modifying training data labels, altering feature values, injecting carefully crafted malicious samples, exploiting data collection and preprocessing vulnerabilities, manipulating data at the source, poisoning data in online learning scenarios, and collaborating with insiders to manipulate data. -Each of these mechanisms presents unique challenges and requires different mitigation strategies. For example, detecting label manipulation may involve analyzing the distribution of labels and identifying anomalies @zhou2018learning, while preventing feature manipulation may require secure data preprocessing and anomaly detection techniques @carta2020local. Defending against insider threats may involve strict access control policies and monitoring of data access patterns. Moreover, the effectiveness of data poisoning attacks often depends on the attacker's knowledge of the ML system, including the model architecture, training algorithms, and data distribution. Attackers may use techniques such as adversarial machine learning or data synthesis to craft samples that are more likely to bypass detection and achieve their malicious objectives. +Each of these mechanisms presents unique challenges and requires different mitigation strategies. For example, detecting label manipulation may involve analyzing the distribution of labels and identifying anomalies [@zhou2018learning], while preventing feature manipulation may require secure data preprocessing and anomaly detection techniques [@carta2020local]. Defending against insider threats may involve strict access control policies and monitoring of data access patterns. Moreover, the effectiveness of data poisoning attacks often depends on the attacker's knowledge of the ML system, including the model architecture, training algorithms, and data distribution. Attackers may use techniques such as adversarial machine learning or data synthesis to craft samples that are more likely to bypass detection and achieve their malicious objectives. ![Garbage In -- Garbage Out (Source: [Information Matters](https://informationmatters.net/data-poisoning-ai/))](./images/png/distribution_shift_example.png){#fig-distribution-shift-example} @@ -483,7 +483,7 @@ Addressing the impact of data poisoning requires a proactive approach to data se ##### Case Study 1 -In 2017, researchers demonstrated a data poisoning attack against a popular toxicity classification model called Perspective \[@hosseini2017deceiving\]. This ML model is used to detect toxic comments online. +In 2017, researchers demonstrated a data poisoning attack against a popular toxicity classification model called Perspective [@hosseini2017deceiving]. This ML model is used to detect toxic comments online. The researchers added synthetically generated toxic comments with slight misspellings and grammatical errors to the model's training data. This slowly corrupted the model, causing it to misclassify increasing numbers of severely toxic inputs as non-toxic over time.
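To make the label-manipulation mechanism described above more concrete, the following is a minimal sketch of an availability-style label-flipping attack together with the kind of per-class label-distribution check mentioned for detection. The dataset size, class indices, flip rate, threshold, and function names are illustrative assumptions, not part of the attacks or defenses described in the cited works.

```python
# Illustrative sketch (assumed toy labels): flip labels of a targeted class,
# then flag the manipulation with a simple per-class prevalence check.
import numpy as np

rng = np.random.default_rng(0)
y_clean = rng.integers(0, 3, size=1000)  # hypothetical 3-class training labels

def flip_labels(y, target_class=0, new_class=2, flip_rate=0.4):
    """Availability-style poisoning: relabel a fraction of one class."""
    y_poisoned = y.copy()
    idx = np.flatnonzero(y == target_class)
    chosen = rng.choice(idx, size=int(flip_rate * idx.size), replace=False)
    y_poisoned[chosen] = new_class
    return y_poisoned

def label_shift_alarm(y_ref, y_new, threshold=0.05):
    """Flag classes whose prevalence moved more than `threshold`
    relative to a trusted reference snapshot of the labels."""
    classes = np.union1d(y_ref, y_new)
    ref = np.array([(y_ref == c).mean() for c in classes])
    new = np.array([(y_new == c).mean() for c in classes])
    return {int(c): float(d) for c, d in zip(classes, new - ref)
            if abs(d) > threshold}

y_poisoned = flip_labels(y_clean)
print("Suspicious class prevalence shifts:", label_shift_alarm(y_clean, y_poisoned))
```

In practice, the reference label distribution would come from a trusted snapshot or data-provenance records rather than from the potentially poisoned pipeline itself.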
@@ -495,7 +495,7 @@ This case highlights how data poisoning can degrade model accuracy and reliabili ![Samples of dirty-label poison data regarding mismatched text/image pairs (Source: [Shan](https://arxiv.org/pdf/2310.13828))](./images/png/dirty_label_example.png){#fig-dirty-label-example} -Interestingly enough, data poisoning attacks are not always malicious \[@shan2023prompt\]. Nightshade, a tool developed by a team led by Professor Ben Zhao at the University of Chicago, utilizes data poisoning to help artists protect their art against scraping and copyright violations by generative AI models. Artists can use the tool to make subtle modifications to their images before uploading them online. +Interestingly enough, data poisoning attacks are not always malicious [@shan2023prompt]. Nightshade, a tool developed by a team led by Professor Ben Zhao at the University of Chicago, utilizes data poisoning to help artists protect their art against scraping and copyright violations by generative AI models. Artists can use the tool to make subtle modifications to their images before uploading them online. While these changes are indiscernible to the human eye, they can significantly disrupt the performance of generative AI models when incorporated into the training data. Generative models can be manipulated into generating hallucinations and weird images. For example, with only 300 poisoned images, the University of Chicago researchers were able to trick the latest Stable Diffusion model into generating images of dogs that look like cats or images of cows when prompted for cars. @@ -503,9 +503,9 @@ As the number of poisoned images on the internet increases, the performance of t On the flip side, this tool can be used maliciously and can affect legitimate applications of the generative models. This goes to show the very challenging and novel nature of machine learning attacks. -\@fig-poisoning demonstrates the effects of different levels of data poisoning (50 samples, 100 samples, and 300 samples of poisoned images) on generating images in different categories. Notice how the images start deforming and deviating from the desired category. For example , after 300 poison samples a car prompt generates a cow. +@fig-poisoning demonstrates the effects of different levels of data poisoning (50 samples, 100 samples, and 300 samples of poisoned images) on generating images in different categories. Notice how the images start deforming and deviating from the desired category. For example, after 300 poison samples, a car prompt generates a cow. -!\[Data poisoning. Credit: \@shan2023prompt.\](images/png/image14.png){#fig-poisoning} +![Data poisoning. Credit: @shan2023prompt.](images/png/image14.png){#fig-poisoning} ### Distribution Shifts @@ -589,11 +589,11 @@ Detecting adversarial examples is the first line of defense against adversarial ![A small adversarial noise added to the original image can make the neural network to classify the image as a Guacamole instead of an Egyptian cat (Source: [Sutanto](https://www.mdpi.com/2079-9292/10/1/52))](./images/png/adversarial_attack_detection.png){#fig-adversarial-attack-detection} -Statistical methods aim to detect adversarial examples by analyzing the statistical properties of the input data. These methods often compare the distribution of the input data to a reference distribution, such as the training data distribution or a known benign distribution.
Techniques like the [Kolmogorov-Smirnov](https://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm) @berger2014kolmogorov test or the [Anderson-Darling](https://www.itl.nist.gov/div898/handbook/eda/section3/eda35e.htm) test can be used to measure the discrepancy between the distributions and flag inputs that deviate significantly from the expected distribution. +Statistical methods aim to detect adversarial examples by analyzing the statistical properties of the input data. These methods often compare the distribution of the input data to a reference distribution, such as the training data distribution or a known benign distribution. Techniques like the [Kolmogorov-Smirnov](https://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm) [@berger2014kolmogorov] test or the [Anderson-Darling](https://www.itl.nist.gov/div898/handbook/eda/section3/eda35e.htm) test can be used to measure the discrepancy between the distributions and flag inputs that deviate significantly from the expected distribution. [Kernel density estimation (KDE)](https://mathisonian.github.io/kde/) is a non-parametric technique used to estimate the probability density function of a dataset. In the context of adversarial example detection, KDE can be used to estimate the density of benign examples in the input space. Adversarial examples, which often lie in low-density regions, can be detected by comparing their estimated density to a threshold. Inputs with an estimated density below the threshold are flagged as potential adversarial examples. -Another technique is feature squeezing @panda2019discretization, which is a technique that reduces the complexity of the input space by applying dimensionality reduction or discretization. The idea behind feature squeezing is that adversarial examples often rely on small, imperceptible perturbations that can be eliminated or reduced through these transformations. By comparing the model's predictions on the original input and the squeezed input, inconsistencies can be detected, indicating the presence of adversarial examples. +Another technique is feature squeezing [@panda2019discretization], which is a technique that reduces the complexity of the input space by applying dimensionality reduction or discretization. The idea behind feature squeezing is that adversarial examples often rely on small, imperceptible perturbations that can be eliminated or reduced through these transformations. By comparing the model's predictions on the original input and the squeezed input, inconsistencies can be detected, indicating the presence of adversarial examples. Model uncertainty estimation techniques aim to quantify the confidence or uncertainty associated with a model's predictions. Adversarial examples often exploit regions of high uncertainty in the model's decision boundary. By estimating the uncertainty using techniques like Bayesian neural networks, dropout-based uncertainty estimation, or ensemble methods, inputs with high uncertainty can be flagged as potential adversarial examples. @@ -601,9 +601,9 @@ Model uncertainty estimation techniques aim to quantify the confidence or uncert Once adversarial examples are detected, various defense strategies can be employed to mitigate their impact and improve the robustness of ML models. -Adversarial training is a technique that involves augmenting the training data with adversarial examples and retraining the model on this augmented dataset. 
By exposing the model to adversarial examples during training, it learns to classify them correctly and becomes more robust to adversarial attacks. Adversarial training can be performed using various attack methods, such as the [Fast Gradient Sign Method (FGSM)](https://www.tensorflow.org/tutorials/generative/adversarial_fgsm) or Projected Gradient Descent (PGD) @madry2017towards. +Adversarial training is a technique that involves augmenting the training data with adversarial examples and retraining the model on this augmented dataset. By exposing the model to adversarial examples during training, it learns to classify them correctly and becomes more robust to adversarial attacks. Adversarial training can be performed using various attack methods, such as the [Fast Gradient Sign Method (FGSM)](https://www.tensorflow.org/tutorials/generative/adversarial_fgsm) or Projected Gradient Descent (PGD) [@madry2017towards]. -Defensive distillation @papernot2016distillation is a technique that trains a second model (the student model) to mimic the behavior of the original model (the teacher model). The student model is trained on the soft labels produced by the teacher model, which are less sensitive to small perturbations. By using the student model for inference, the impact of adversarial perturbations can be reduced, as the student model learns to generalize better and is less sensitive to adversarial noise. +Defensive distillation [@papernot2016distillation] is a technique that trains a second model (the student model) to mimic the behavior of the original model (the teacher model). The student model is trained on the soft labels produced by the teacher model, which are less sensitive to small perturbations. By using the student model for inference, the impact of adversarial perturbations can be reduced, as the student model learns to generalize better and is less sensitive to adversarial noise. Input preprocessing and transformation techniques aim to remove or mitigate the effect of adversarial perturbations before feeding the input to the ML model. These techniques can include image denoising, JPEG compression, random resizing and padding, or applying random transformations to the input data. By reducing the impact of adversarial perturbations, these preprocessing steps can help improve the model's robustness to adversarial attacks. @@ -615,7 +615,7 @@ To assess the effectiveness of adversarial defense techniques and measure the ro Adversarial robustness metrics quantify the model's resilience to adversarial attacks. These metrics can include the model's accuracy on adversarial examples, the average distortion required to fool the model, or the model's performance under different attack strengths. By comparing these metrics across different models or defense techniques, practitioners can assess and compare their robustness levels. -Standardized adversarial attack benchmarks and datasets provide a common ground for evaluating and comparing the robustness of ML models. These benchmarks include datasets with pre-generated adversarial examples, as well as tools and frameworks for generating adversarial attacks. Examples of popular adversarial attack benchmarks include the [MNIST-C](https://github.com/google-research/mnist-c), [CIFAR-10-C](https://paperswithcode.com/dataset/cifar-10c), and ImageNet-C @hendrycks2019benchmarking datasets, which contain corrupted or perturbed versions of the original datasets. 
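As a concrete illustration of the adversarial training recipe described above, here is a minimal FGSM-style sketch in PyTorch; the stand-in model, loss, batch, and epsilon value are illustrative assumptions rather than the setup used in the cited works.

```python
# Minimal FGSM sketch (assumed toy classifier and epsilon): craft a perturbed
# batch and take one adversarial-training step on it.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in classifier
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

def fgsm_example(x, y, epsilon=0.1):
    """Fast Gradient Sign Method: step the input in the direction that increases the loss."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    with torch.no_grad():
        x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

# One adversarial-training step on a fake batch of "images".
x = torch.rand(32, 1, 28, 28)
y = torch.randint(0, 10, (32,))
x_adv = fgsm_example(x, y)

optimizer.zero_grad()
loss = loss_fn(model(x_adv), y)   # train on the adversarial examples
loss.backward()
optimizer.step()
print("loss on adversarial batch:", loss.item())
```

A PGD variant would simply iterate the perturbation step several times with a smaller step size, projecting back into the allowed epsilon-ball after each update.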
+Standardized adversarial attack benchmarks and datasets provide a common ground for evaluating and comparing the robustness of ML models. These benchmarks include datasets with pre-generated adversarial examples, as well as tools and frameworks for generating adversarial attacks. Examples of popular adversarial attack benchmarks include the [MNIST-C](https://github.com/google-research/mnist-c), [CIFAR-10-C](https://paperswithcode.com/dataset/cifar-10c), and ImageNet-C [@hendrycks2019benchmarking] datasets, which contain corrupted or perturbed versions of the original datasets. By leveraging these adversarial example detection techniques, defense strategies, and robustness evaluation methods, practitioners can develop more robust and resilient ML systems. However, it is important to note that adversarial robustness is an ongoing research area, and no single technique provides complete protection against all types of adversarial attacks. A comprehensive approach that combines multiple defense mechanisms and regular testing is essential to maintain the security and reliability of ML systems in the face of evolving adversarial threats. @@ -647,7 +647,7 @@ Data provenance and lineage tracking involve maintaining a record of the origin, Robust optimization techniques can be used to modify the training objective to minimize the impact of outliers or poisoned instances. This can be achieved by using robust loss functions that are less sensitive to extreme values, such as the Huber loss or the modified Huber loss. Regularization techniques, such as [L1 or L2 regularization](https://towardsdatascience.com/l1-and-l2-regularization-methods-ce25e7fc831c), can also help in reducing the model's sensitivity to poisoned data by constraining the model's complexity and preventing overfitting. -Robust loss functions are designed to be less sensitive to outliers or noisy data points. Examples include the modified [Huber loss](https://pytorch.org/docs/stable/generated/torch.nn.HuberLoss.html), the Tukey loss @beaton1974fitting, and the trimmed mean loss. These loss functions down-weight or ignore the contribution of anomalous instances during training, reducing their impact on the model's learning process. Robust objective functions, such as the minimax objective or the distributionally robust objective, aim to optimize the model's performance under worst-case scenarios or in the presence of adversarial perturbations. +Robust loss functions are designed to be less sensitive to outliers or noisy data points. Examples include the modified [Huber loss](https://pytorch.org/docs/stable/generated/torch.nn.HuberLoss.html), the Tukey loss [@beaton1974fitting], and the trimmed mean loss. These loss functions down-weight or ignore the contribution of anomalous instances during training, reducing their impact on the model's learning process. Robust objective functions, such as the minimax objective or the distributionally robust objective, aim to optimize the model's performance under worst-case scenarios or in the presence of adversarial perturbations. Data augmentation techniques involve generating additional training examples by applying random transformations or perturbations to the existing data. This helps in increasing the diversity and robustness of the training dataset. By introducing controlled variations in the data, the model becomes less sensitive to specific patterns or artifacts that may be present in poisoned instances. 
Randomization techniques, such as random subsampling or bootstrap aggregating, can also help in reducing the impact of poisoned data by training multiple models on different subsets of the data and combining their predictions. @@ -685,7 +685,7 @@ In addition, domain classifiers are trained to distinguish between different dom Transfer learning leverages knowledge gained from one domain to improve performance on another domain. By using pre-trained models or transferring learned features from a source domain to a target domain, transfer learning can help mitigate the impact of distribution shifts. The pre-trained model can be fine-tuned on a small amount of labeled data from the target domain, allowing it to adapt to the new distribution. Transfer learning is particularly effective when the source and target domains share similar characteristics or when labeled data in the target domain is scarce. -Continual learning, also known as lifelong learning, enables ML models to learn continuously from new data distributions while retaining knowledge from previous distributions. Techniques such as elastic weight consolidation (EWC) @kirkpatrick2017overcoming or gradient episodic memory (GEM) @lopez2017gradient allow models to adapt to evolving data distributions over time. These techniques aim to balance the plasticity of the model (ability to learn from new data) with the stability of the model (retaining previously learned knowledge). By incrementally updating the model with new data and mitigating catastrophic forgetting, continual learning helps models stay robust to distribution shifts. +Continual learning, also known as lifelong learning, enables ML models to learn continuously from new data distributions while retaining knowledge from previous distributions. Techniques such as elastic weight consolidation (EWC) [@kirkpatrick2017overcoming] or gradient episodic memory (GEM) [@lopez2017gradient] allow models to adapt to evolving data distributions over time. These techniques aim to balance the plasticity of the model (ability to learn from new data) with the stability of the model (retaining previously learned knowledge). By incrementally updating the model with new data and mitigating catastrophic forgetting, continual learning helps models stay robust to distribution shifts. Data augmentation techniques such as what we have seen previously involve applying transformations or perturbations to the existing training data to increase its diversity and improve the model's robustness to distribution shifts. By introducing variations in the data, such as rotations, translations, scaling, or adding noise, data augmentation helps the model learn invariant features and generalize better to unseen distributions. Data augmentation can be performed both during training and inference to enhance the model's ability to handle distribution shifts. @@ -701,7 +701,7 @@ Detecting and mitigating distribution shifts is an ongoing process that requires #### Definition and Characteristics -Software faults refer to defects, errors, or bugs in the runtime software frameworks and components that support the execution and deployment of ML models @myllyaho2022misbehaviour. These faults can arise from various sources, such as programming mistakes, design flaws, or compatibility issues @zhang2008distribution, and can have significant implications for the performance, reliability, and security of ML systems. 
Software faults in ML frameworks exhibit several key characteristics: +Software faults refer to defects, errors, or bugs in the runtime software frameworks and components that support the execution and deployment of ML models [@myllyaho2022misbehaviour]. These faults can arise from various sources, such as programming mistakes, design flaws, or compatibility issues [@zhang2008distribution], and can have significant implications for the performance, reliability, and security of ML systems. Software faults in ML frameworks exhibit several key characteristics: - **Diversity:** Software faults can manifest in different forms, ranging from simple logic errors and syntax mistakes to more complex issues like memory leaks, race conditions, and integration problems. The variety of fault types adds to the challenge of detecting and mitigating them effectively. @@ -738,13 +738,13 @@ Machine learning frameworks, such as TensorFlow, PyTorch, and scikit-learn, prov Software faults in machine learning frameworks can have significant and far-reaching impacts on the performance, reliability, and security of ML systems. Let's explore the various ways in which software faults can affect ML systems: -**Performance Degradation and System Slowdowns:** Memory leaks and inefficient resource management can lead to gradual performance degradation over time, as the system becomes increasingly memory-constrained and spends more time on garbage collection or memory swapping @maas2008combining. This issue is compounded by synchronization issues and concurrency bugs which can cause delays, reduced throughput, and suboptimal utilization of computational resources, especially in multi-threaded or distributed ML systems. Furthermore, compatibility problems or inefficient code paths can introduce additional overhead and slowdowns, affecting the overall performance of the ML system. +**Performance Degradation and System Slowdowns:** Memory leaks and inefficient resource management can lead to gradual performance degradation over time, as the system becomes increasingly memory-constrained and spends more time on garbage collection or memory swapping [@maas2008combining]. This issue is compounded by synchronization issues and concurrency bugs which can cause delays, reduced throughput, and suboptimal utilization of computational resources, especially in multi-threaded or distributed ML systems. Furthermore, compatibility problems or inefficient code paths can introduce additional overhead and slowdowns, affecting the overall performance of the ML system. **Incorrect Predictions or Outputs:** Software faults in data preprocessing, feature engineering, or model evaluation can introduce biases, noise, or errors that propagate through the ML pipeline and result in incorrect predictions or outputs. Over time, numerical instabilities, precision errors, or [rounding issues](https://www.cs.drexel.edu/~popyack/Courses/CSP/Fa17/extras/Rounding/index.html) can accumulate and lead to degraded accuracy or convergence problems in the trained models. Moreover, faults in the model serving or inference components can cause inconsistencies between the expected and actual outputs, leading to incorrect or unreliable predictions in production. **Reliability and Stability Issues:** Unhandled exceptions, crashes, or sudden terminations caused by software faults can compromise the reliability and stability of ML systems, especially in production environments. 
Intermittent or sporadic faults can be difficult to reproduce and diagnose, leading to unpredictable behavior and reduced confidence in the ML system's outputs. Additionally, faults in checkpointing, model serialization, or state management can cause data loss or inconsistencies, affecting the reliability and recoverability of the ML system. -**Security Vulnerabilities:** Software faults, such as buffer overflows, injection vulnerabilities, or improper access control, can introduce security risks and expose the ML system to potential attacks or unauthorized access. Adversaries may exploit faults in the preprocessing or feature extraction stages to manipulate the input data and deceive the ML models, leading to incorrect or malicious behavior. Furthermore, inadequate protection of sensitive data, such as user information or confidential model parameters, can lead to data breaches or privacy violations @li2021survey. +**Security Vulnerabilities:** Software faults, such as buffer overflows, injection vulnerabilities, or improper access control, can introduce security risks and expose the ML system to potential attacks or unauthorized access. Adversaries may exploit faults in the preprocessing or feature extraction stages to manipulate the input data and deceive the ML models, leading to incorrect or malicious behavior. Furthermore, inadequate protection of sensitive data, such as user information or confidential model parameters, can lead to data breaches or privacy violations [@li2021survey]. **Difficulty in Reproducing and Debugging:** Software faults can make it challenging to reproduce and debug issues in ML systems, especially when the faults are intermittent or dependent on specific runtime conditions. Incomplete or ambiguous error messages, coupled with the complexity of ML frameworks and models, can prolong the debugging process and hinder the ability to identify and fix the underlying faults. Moreover, inconsistencies between development, testing, and production environments can make it difficult to reproduce and diagnose faults that occur in specific contexts. @@ -764,7 +764,7 @@ Detecting and mitigating software faults in machine learning frameworks is essen **Runtime Monitoring and Logging:** Implementing comprehensive logging mechanisms captures relevant information, such as input data, model parameters, and system events, during runtime. Monitoring key performance metrics, resource utilization, and error rates helps detect anomalies, performance bottlenecks, or unexpected behavior. Employing runtime assertion checks and invariants validates assumptions and detects violations of expected conditions during program execution. Utilizing [profiling tools](https://microsoft.github.io/code-with-engineering-playbook/machine-learning/ml-profiling/) identifies performance bottlenecks, memory leaks, or inefficient code paths that may indicate the presence of software faults. -**Fault-Tolerant Design Patterns:** Implementing error handling and exception management mechanisms enables graceful handling and recovery from exceptional conditions or runtime errors. Employing redundancy and failover mechanisms, such as backup systems or redundant computations, ensures the availability and reliability of the ML system in the presence of faults. Designing modular and loosely coupled architectures minimizes the propagation and impact of faults across different components of the ML system. 
Utilizing checkpointing and recovery mechanisms @eisenman2022check allows the system to resume from a known stable state in case of failures or interruptions. +**Fault-Tolerant Design Patterns:** Implementing error handling and exception management mechanisms enables graceful handling and recovery from exceptional conditions or runtime errors. Employing redundancy and failover mechanisms, such as backup systems or redundant computations, ensures the availability and reliability of the ML system in the presence of faults. Designing modular and loosely coupled architectures minimizes the propagation and impact of faults across different components of the ML system. Utilizing checkpointing and recovery mechanisms [@eisenman2022check] allows the system to resume from a known stable state in case of failures or interruptions. **Regular Updates and Patches:** Staying up to date with the latest versions and patches of the ML frameworks, libraries, and dependencies provides benefits from bug fixes, security updates, and performance improvements. Monitoring release notes, security advisories, and community forums keeps practitioners informed about known issues, vulnerabilities, or compatibility problems in the ML framework. Establishing a systematic process for testing and validating updates and patches before applying them to production systems ensures stability and compatibility. @@ -796,14 +796,14 @@ Fault models can be categorized based on various characteristics: On the other hand, error models describe how a fault propagates through the system and manifests as an error. An error may cause the system to deviate from its expected behavior, leading to incorrect results or even system failures. Error models can be defined at different levels of abstraction, from the hardware level (e.g., register-level bit-flips) to the software level (e.g., corrupted weights or activations in an ML model). -The fault model (or error model, which is typically the more applicable terminology in the context of understanding the robustness of an ML system) plays a major role in simulating and measuring what happens to a system when a fault occurs. The chosen model informs the assumptions made about the system being studied. For example, a system focusing on single-bit transient errors @sangchoolie2017one would not be well-suited to understand the impact of permanent, multi-bit flip errors @wilkening2014calculating, as it is designed assuming a different model altogether. +The fault model (or error model, which is typically the more applicable terminology in the context of understanding the robustness of an ML system) plays a major role in simulating and measuring what happens to a system when a fault occurs. The chosen model informs the assumptions made about the system being studied. For example, a system focusing on single-bit transient errors [@sangchoolie2017one] would not be well-suited to understand the impact of permanent, multi-bit flip errors [@wilkening2014calculating], as it is designed assuming a different model altogether. -Furthermore, the implementation of an error model is also an important consideration, particularly regarding where in the compute stack an error is said to occur. For instance, a single-bit flip model at the architectural register level differs from a single bit flip in the weight of a model at the PyTorch level. 
Although both seem to target a similar error model, the former would usually be modeled in an architecturally accurate simulator (like gem5 @binkert2011gem5), which would more precisely capture error propagation compared to the latter, which focuses on value propagation through a model. +Furthermore, the implementation of an error model is also an important consideration, particularly regarding where in the compute stack an error is said to occur. For instance, a single-bit flip model at the architectural register level differs from a single bit flip in the weight of a model at the PyTorch level. Although both seem to target a similar error model, the former would usually be modeled in an architecturally accurate simulator (like gem5 [@binkert2011gem5]), which would more precisely capture error propagation compared to the latter, which focuses on value propagation through a model. -Recent research has shown that certain characteristics of error models may exhibit similar behaviors across different levels of abstraction @sangchoolie2017one @papadimitriou2021demystifying. For example, single-bit errors are generally more problematic than multi-bit errors, regardless of whether they are modeled at the hardware or software level. However, other characteristics, such as error masking @mohanram2003partial ([Figure 19.x](#kncu0umx706t)), may not always be accurately captured by software-level models, as they can hide underlying system effects. Masking occurs when +Recent research has shown that certain characteristics of error models may exhibit similar behaviors across different levels of abstraction [@sangchoolie2017one; @papadimitriou2021demystifying]. For example, single-bit errors are generally more problematic than multi-bit errors, regardless of whether they are modeled at the hardware or software level. However, other characteristics, such as error masking [@mohanram2003partial] ([Figure 19.x](#kncu0umx706t)), may not always be accurately captured by software-level models, as they can hide underlying system effects. Masking occurs when a fault is overwritten or otherwise absorbed by later computation before it can propagate to the program's outputs, so the underlying error never becomes visible at the software level. -![Example of error masking in microarchitectural components @ko2021characterizing](./images/png/error_masking.png){#fig-error-masking} +![Example of error masking in microarchitectural components [@ko2021characterizing]](./images/png/error_masking.png){#fig-error-masking} Some tools, such as Fidelity, aim to bridge the gap between hardware-level and software-level error models by mapping patterns between the two levels of abstraction. This allows for more accurate modeling of hardware faults in software-based tools, which is essential for developing robust and reliable ML systems. Lower-level tools typically represent more accurate error propagation characteristics but suffer from being too slow in simulating many errors due to the complex nature of hardware system designs. On the other hand, higher-level tools, such as those implemented in ML frameworks like PyTorch or TensorFlow, which we will discuss soon in the later sections, are often faster and more efficient for evaluating the robustness of ML systems.
However, hardware-based fault injection methods are still important for grounding the higher level error models, as they are considered the most accurate way to study the impact of faults on ML systems by directly manipulating the hardware to introduce faults. These methods allow researchers to observe how the system behaves under real-world fault conditions. Both software-based and hardware-based error injection tools are described in this section in more detail. -![Hardware errors can occur due to a variety of reasons and at different times and/or locations in a system. This figure illustrates many potential error models which can be explored when studying the impact of hardware-based errors on systems @ahmadilivani2024systematic](./images/png/hardware_errors.png){#fig-hardware-errors} +![Hardware errors can occur due to a variety of reasons and at different times and/or locations in a system. This figure illustrates many potential error models which can be explored when studying the impact of hardware-based errors on systems [@ahmadilivani2024systematic]](./images/png/hardware_errors.png){#fig-hardware-errors} #### Methods @@ -821,11 +821,11 @@ Two of the most common hardware-based fault injection methods are FPGA-based fau **FPGA-based Fault Injection:** Field-Programmable Gate Arrays (FPGAs) are reconfigurable integrated circuits that can be programmed to implement various hardware designs. In the context of fault injection, FPGAs offer a high level of precision and accuracy, as researchers can target specific bits or sets of bits within the hardware. By modifying the FPGA configuration, faults can be introduced at specific locations and times during the execution of an ML model. FPGA-based fault injection allows for fine-grained control over the fault model, enabling researchers to study the impact of different types of faults, such as single-bit flips or multi-bit errors. This level of control makes FPGA-based fault injection a valuable tool for understanding the resilience of ML systems to hardware faults. -**Radiation or Beam Testing:** Radiation or beam testing @velazco2010combining involves exposing the hardware running an ML model to high-energy particles, such as protons or neutrons ([Figure 19.x](#5a77jp776dxi)). These particles can cause bit-flips or other types of faults in the hardware, mimicking the effects of real-world radiation-induced faults. Beam testing is widely regarded as a highly accurate method for measuring the error rate induced by particle strikes on a running application and provides a realistic representation of the types of faults that can occur in real-world environments, particularly in applications exposed to high levels of radiation, such as space systems or particle physics experiments. However, unlike FPGA-based fault injection, beam testing is less precise in targeting specific bits or components within the hardware, as it might be difficult to aim the beam of particles to a particular bit in the hardware. Despite being quite expensive from a research standpoint, beam testing is a well-regarded practice in industry for reliability purposes. +**Radiation or Beam Testing:** Radiation or beam testing [@velazco2010combining] involves exposing the hardware running an ML model to high-energy particles, such as protons or neutrons ([Figure 19.x](#5a77jp776dxi)). These particles can cause bit-flips or other types of faults in the hardware, mimicking the effects of real-world radiation-induced faults. 
Beam testing is widely regarded as a highly accurate method for measuring the error rate induced by particle strikes on a running application and provides a realistic representation of the types of faults that can occur in real-world environments, particularly in applications exposed to high levels of radiation, such as space systems or particle physics experiments. However, unlike FPGA-based fault injection, beam testing is less precise in targeting specific bits or components within the hardware, as it might be difficult to aim the beam of particles to a particular bit in the hardware. Despite being quite expensive from a research standpoint, beam testing is a well-regarded practice in industry for reliability purposes. Figure -![Radiation test setup for semiconductor components @lee2022design (Source: [JD Instrument](https://jdinstruments.net/tester-capabilities-radiation-test/))](./images/png/image15.png)![](./images/png/image14.png) +![Radiation test setup for semiconductor components [@lee2022design] (Source: [JD Instrument](https://jdinstruments.net/tester-capabilities-radiation-test/))](./images/png/image15.png)![](./images/png/image14.png) #### Limitations @@ -867,25 +867,25 @@ Software-based fault injection tools also have some limitations compared to hard Software-based fault injection tools can be categorized based on their target frameworks or use cases. Here, we will discuss some of the most popular tools in each category: -Ares @reagen2018ares, a fault injection tool initially developed for the Keras framework in 2018, emerged as one of the first tools to study the impact of hardware faults on deep neural networks (DNNs) in the context of the rising popularity of ML frameworks in the mid- to late 2010s. The tool was validated against a DNN accelerator implemented in silicon, demonstrating its effectiveness in modeling hardware faults. Ares provided a comprehensive study on the impact of hardware faults in both weights and activation values, characterizing the effects of single bit flips and bit-error-rates (BER) on hardware structures. Later, the Ares framework was extended to support the PyTorch ecosystem, enabling researchers to investigate hardware faults in a more modern setting and further extending its utility in the field. +Ares [@reagen2018ares], a fault injection tool initially developed for the Keras framework in 2018, emerged as one of the first tools to study the impact of hardware faults on deep neural networks (DNNs) in the context of the rising popularity of ML frameworks in the mid- to late 2010s. The tool was validated against a DNN accelerator implemented in silicon, demonstrating its effectiveness in modeling hardware faults. Ares provided a comprehensive study on the impact of hardware faults in both weights and activation values, characterizing the effects of single bit flips and bit-error-rates (BER) on hardware structures. Later, the Ares framework was extended to support the PyTorch ecosystem, enabling researchers to investigate hardware faults in a more modern setting and further extending its utility in the field. ![Hardware bitflips in ML workloads can cause phantom objects and misclassifications, which can erroneously be used downstream by larger systems such as in autonomous driving. 
Shown above is a correct and faulty version of the same image using the PyTorchFI injection framework](./images/png/phantom_objects.png){#fig-phantom-objects} -PyTorchFI @mahmoud2020pytorchfi, a fault injection tool specifically designed for the PyTorch framework, was developed in 2020 in collaboration with Nvidia Research. It enables the injection of faults into the weights, activations, and gradients of PyTorch models, supporting a wide range of fault models. By leveraging the GPU acceleration capabilities of PyTorch, PyTorchFI provides a fast and efficient implementation for conducting fault injection experiments on large-scale ML systems ([Figure 19.x](#txkz61sj1mj4)). The tool's speed and ease-of-use have led to widespread adoption in the community, resulting in multiple developer-led projects, such as PyTorchALFI by Intel Labs, which focuses on safety in automotive environments. Follow-up PyTorch-centric tools for fault injection include Dr. DNA by Meta @ma2024dr (which further facilitates the Pythonic programming model for ease of use), and the GoldenEye framework, which incorporates novel numerical datatypes (such as AdaptivFloat @tambe2020algorithm and [BlockFloat](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format)) in the context of hardware bit flips. +PyTorchFI [@mahmoud2020pytorchfi], a fault injection tool specifically designed for the PyTorch framework, was developed in 2020 in collaboration with Nvidia Research. It enables the injection of faults into the weights, activations, and gradients of PyTorch models, supporting a wide range of fault models. By leveraging the GPU acceleration capabilities of PyTorch, PyTorchFI provides a fast and efficient implementation for conducting fault injection experiments on large-scale ML systems ([Figure 19.x](#txkz61sj1mj4)). The tool's speed and ease-of-use have led to widespread adoption in the community, resulting in multiple developer-led projects, such as PyTorchALFI by Intel Labs, which focuses on safety in automotive environments. Follow-up PyTorch-centric tools for fault injection include Dr. DNA by Meta [@ma2024dr] (which further facilitates the Pythonic programming model for ease of use), and the GoldenEye framework, which incorporates novel numerical datatypes (such as AdaptivFloat [@tambe2020algorithm] and [BlockFloat](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format)) in the context of hardware bit flips. -TensorFI @chen2020tensorfi, or the TensorFlow Fault Injector, is a fault injection tool developed specifically for the TensorFlow framework. Similar to Ares and PyTorchFI, TensorFI is considered the state-of-the-art tool for ML robustness studies in the TensorFlow ecosystem. It allows researchers to inject faults into the computational graph of TensorFlow models and study their impact on the model's performance, supporting a wide range of fault models. One of the key benefits of TensorFI is its ability to evaluate the resilience of various types of ML models, not just DNNs. Further advancements, such as BinFi, provide a mechanism to speed up error injection experiments by focusing on the \"important\" bits in the system, accelerating the process of ML robustness analysis and prioritizing the critical components of a model. +TensorFI [@chen2020tensorfi], or the TensorFlow Fault Injector, is a fault injection tool developed specifically for the TensorFlow framework. Similar to Ares and PyTorchFI, TensorFI is considered the state-of-the-art tool for ML robustness studies in the TensorFlow ecosystem. 
It allows researchers to inject faults into the computational graph of TensorFlow models and study their impact on the model's performance, supporting a wide range of fault models. One of the key benefits of TensorFI is its ability to evaluate the resilience of various types of ML models, not just DNNs. Further advancements, such as BinFi, provide a mechanism to speed up error injection experiments by focusing on the \"important\" bits in the system, accelerating the process of ML robustness analysis and prioritizing the critical components of a model. -NVBitFI @tsai2021nvbitfi, a general-purpose fault injection tool developed by Nvidia for their GPU platforms, operates at a lower level compared to framework-specific tools like Ares, PyTorchFI, and TensorFI. While these tools focus on various deep learning platforms to implement and perform robustness analysis, NVBitFI targets the underlying hardware assembly code for fault injection. This allows researchers to inject faults into any application running on Nvidia GPUs, making it a versatile tool for studying the resilience of not only ML systems but also other GPU-accelerated applications. By enabling users to inject errors at the architectural level, NVBitFI provides a more general-purpose fault model that is not restricted to just ML models. As Nvidia's GPU systems are commonly used in many ML-based systems, NVBitFI serves as a valuable tool for comprehensive fault injection analysis across a wide range of applications. +NVBitFI [@tsai2021nvbitfi], a general-purpose fault injection tool developed by Nvidia for their GPU platforms, operates at a lower level compared to framework-specific tools like Ares, PyTorchFI, and TensorFI. While these tools focus on various deep learning platforms to implement and perform robustness analysis, NVBitFI targets the underlying hardware assembly code for fault injection. This allows researchers to inject faults into any application running on Nvidia GPUs, making it a versatile tool for studying the resilience of not only ML systems but also other GPU-accelerated applications. By enabling users to inject errors at the architectural level, NVBitFI provides a more general-purpose fault model that is not restricted to just ML models. As Nvidia's GPU systems are commonly used in many ML-based systems, NVBitFI serves as a valuable tool for comprehensive fault injection analysis across a wide range of applications. ###### Domain-specific Examples Domain-specific fault injection tools have been developed to address the unique challenges and requirements of various ML application domains, such as autonomous vehicles and robotics. This section highlights three domain-specific fault injection tools: DriveFI and PyTorchALFI for autonomous vehicles, and MAVFI for unmanned aerial vehicles (UAVs). These tools enable researchers to inject hardware faults into the perception, control, and other subsystems of these complex systems, allowing them to study the impact of faults on system performance and safety. The development of these software-based fault injection tools has greatly expanded the capabilities of the ML community to develop more robust and reliable systems that can operate safely and effectively in the presence of hardware faults. -DriveFI @jha2019ml is a fault injection tool specifically designed for the autonomous vehicle domain. 
It enables the injection of hardware faults into the perception and control pipelines of autonomous vehicle systems, allowing researchers to study the impact of these faults on the system's performance and safety. DriveFI has been integrated with industry-standard autonomous driving platforms, such as Nvidia DriveAV and Baidu Apollo, making it a valuable tool for evaluating the resilience of autonomous vehicle systems. +DriveFI [@jha2019ml] is a fault injection tool specifically designed for the autonomous vehicle domain. It enables the injection of hardware faults into the perception and control pipelines of autonomous vehicle systems, allowing researchers to study the impact of these faults on the system's performance and safety. DriveFI has been integrated with industry-standard autonomous driving platforms, such as Nvidia DriveAV and Baidu Apollo, making it a valuable tool for evaluating the resilience of autonomous vehicle systems. -PyTorchALFI @grafe2023large is an extension of PyTorchFI developed by Intel Labs for the autonomous vehicle domain. It builds upon the fault injection capabilities of PyTorchFI and adds features specifically tailored for evaluating the resilience of autonomous vehicle systems, such as the ability to inject faults into camera and LiDAR sensor data. +PyTorchALFI [@grafe2023large] is an extension of PyTorchFI developed by Intel Labs for the autonomous vehicle domain. It builds upon the fault injection capabilities of PyTorchFI and adds features specifically tailored for evaluating the resilience of autonomous vehicle systems, such as the ability to inject faults into camera and LiDAR sensor data. -MAVFI @hsiao2023mavfi is a fault injection tool designed for the robotics domain, specifically for unmanned aerial vehicles (UAVs). MAVFI is built on top of the Robot Operating System (ROS) framework and allows researchers to inject faults into the various components of a UAV system, such as sensors, actuators, and control algorithms. By evaluating the impact of these faults on the UAV's performance and stability, researchers can develop more resilient and fault-tolerant UAV systems. +MAVFI [@hsiao2023mavfi] is a fault injection tool designed for the robotics domain, specifically for unmanned aerial vehicles (UAVs). MAVFI is built on top of the Robot Operating System (ROS) framework and allows researchers to inject faults into the various components of a UAV system, such as sensors, actuators, and control algorithms. By evaluating the impact of these faults on the UAV's performance and stability, researchers can develop more resilient and fault-tolerant UAV systems. The development of software-based fault injection tools has greatly expanded the capabilities of researchers and practitioners to study the resilience of ML systems to hardware faults. By leveraging the speed, flexibility, and accessibility of these tools, the ML community can develop more robust and reliable systems that can operate safely and effectively in the presence of hardware faults. @@ -894,13 +894,13 @@ The development of software-based fault injection tools has greatly expanded the While software-based fault injection tools offer many advantages in terms of speed, flexibility, and accessibility, they may not always accurately capture the full range of effects that hardware faults can have on the system. 
This is because software-based tools operate at a higher level of abstraction than hardware-based methods, and may miss some of the low-level hardware interactions and error propagation mechanisms that can impact the behavior of the ML system ([Figure 19.x](#2cs18nolwc0m)). -![Hardware errors may manifest themselves in different ways at the software level, as classified by Bolchini et al @bolchini2022fast](./images/png/hardware_errors_Bolchini.png){#fig-hardware-errors-Bolchini} +![Hardware errors may manifest themselves in different ways at the software level, as classified by Bolchini et al [@bolchini2022fast]](./images/png/hardware_errors_Bolchini.png){#fig-hardware-errors-Bolchini} To address this issue, researchers have developed tools that aim to bridge the gap between low-level hardware error models and higher-level software error models. One such tool is Fidelity, which is designed to map patterns between hardware-level faults and their software-level manifestations. #### Fidelity: Bridging the Gap -Fidelity @he2020fidelity is a tool that enables the accurate modeling of hardware faults in software-based fault injection experiments. It achieves this by carefully studying the relationship between hardware-level faults and their impact on the software representation of the ML system. +Fidelity [@he2020fidelity] is a tool that enables the accurate modeling of hardware faults in software-based fault injection experiments. It achieves this by carefully studying the relationship between hardware-level faults and their impact on the software representation of the ML system. The key insights behind Fidelity are: