From 7dfdd79a2ccb660d723c003a9aed27caa6711d79 Mon Sep 17 00:00:00 2001 From: Christopher Haster Date: Wed, 30 Oct 2024 11:39:45 -0500 Subject: [PATCH] README.md - Working on error+erasure section --- README.md | 237 +++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 233 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 66cd1cc..3a878ce 100644 --- a/README.md +++ b/README.md @@ -1320,7 +1320,7 @@ noting: Fortunately, because of math, the first term will always be a 1. So just like with CRC polynomials, we can make the leading 1 implicit and - not bother storing it. + not bother storing it in memory. Division with an implicit 1 is implemented in [ramrsbd_gf_p_divmod1][ramrsbd_gf_p_divmod1], and has the extra @@ -1414,10 +1414,239 @@ noting: And some caveats: -1. For _any_ error-correcting code, attempting to _correct_ errors - reduces the code's ability to _detect_ errors. +1. For any error-correcting code, attempting to **correct** errors + reduces the code's ability to **detect** errors. 2. Limited to 255 byte codewords - the non-zero elements of GF(256). -3. Support for known-location "erasures" is left as an exercise for +3. Support for known-location "erasures" left as an exercise for the reader. + + All of the above math assumes we don't know the location of errors, + which is usually the case for block devices. + + But it turns out if we _do_ know the location of errors, via parity + bits or some other side-channel, we can do quite a bit better. We + usually call these known-location errors "erasures". + + With Reed-Solomon, each unknown-location error requires 2 bytes of ECC + to find and repair, while known-location erasures require only 1 byte + of ECC to repair. You can even mix and match $e$ errors and $f$ + erasures as long as you have $n$ bytes of ECC such that: + +
+ $$
+ 2e + f \le n
+ $$
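+
+ For example, with $n = 4$ bytes of ECC, $2e + f \le 4$ lets us repair
+ up to $2$ errors, or $1$ error and $2$ erasures, or $4$ erasures:
+
+ $$
+ 2(2) + 0 \le 4, \quad 2(1) + 2 \le 4, \quad 2(0) + 4 \le 4
+ $$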
+
+ This isn't implemented in ramrsbd, but, _in theory_, the math isn't
+ too difficult to extend.
+
+ First note we can split $\Lambda(x)$ into a separate error-locator
+ polynomial $\Lambda_E(x)$ and erasure-locator polynomial
+ $\Lambda_F(x)$:
+
+ $$
+ \Lambda(x) = \Lambda_E(x) \Lambda_F(x)
+ $$
+
+ We already know the locations of our erasures, so the erasure-locator
+ polynomial $\Lambda_F(x)$ is trivial to calculate:
+
+ $$
+ \Lambda_F(x) = \prod_{j \in F} \left(1 - X_j x\right)
+ $$
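+
+ As a rough sketch of what this could look like in C, assuming the
+ common 0x11d reducing polynomial and $X_j = g^j$ with $g = 2$ (the
+ `gf_mul` helper and names here are illustrative stand-ins, not
+ ramrsbd's actual API):
+
+ ``` c
+ #include <stdint.h>
+ #include <stddef.h>
+
+ // GF(256) multiply, assuming the common 0x11d reducing polynomial
+ uint8_t gf_mul(uint8_t a, uint8_t b) {
+     uint8_t p = 0;
+     while (b) {
+         if (b & 1) {
+             p ^= a;
+         }
+         b >>= 1;
+         a = (a << 1) ^ ((a & 0x80) ? 0x1d : 0);
+     }
+     return p;
+ }
+
+ // build the erasure-locator polynomial from known erasure positions
+ //
+ //   Lambda_F(x) = prod_{j in F} (1 - X_j x)
+ //
+ // coefficients are stored lowest-degree first with the constant 1 in
+ // lambda_f[0], lambda_f needs room for f+1 bytes, and note that
+ // -X_j == X_j since subtraction is xor in GF(256)
+ void build_erasure_locator(
+         const size_t *erasures, size_t f,
+         uint8_t *lambda_f) {
+     lambda_f[0] = 1;
+     for (size_t i = 0; i < f; i++) {
+         // X_j = g^j, assuming g = 2
+         uint8_t x_j = 1;
+         for (size_t k = 0; k < erasures[i]; k++) {
+             x_j = gf_mul(x_j, 2);
+         }
+
+         // multiply the running product by (1 - X_j x)
+         lambda_f[i+1] = 0;
+         for (size_t k = i+1; k > 0; k--) {
+             lambda_f[k] ^= gf_mul(lambda_f[k-1], x_j);
+         }
+     }
+ }
+ ```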
+
+ Before we can find the error-locator polynomial $\Lambda_E(x)$, we
+ need to modify our syndromes to hide the effects of the
+ erasure-locator polynomial. These are often called the Forney
+ syndromes $S_{Fi}$:
+
+ $$
+ S_F(x) = S(x) \Lambda_F(x) \bmod x^n
+ $$
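+
+ Continuing the sketch, computing the Forney syndromes is just a
+ polynomial multiplication truncated to $n$ terms, again assuming the
+ illustrative `gf_mul` helper from the previous sketch:
+
+ ``` c
+ #include <stdint.h>
+ #include <stddef.h>
+
+ // assumed GF(256) multiply, see the previous sketch
+ uint8_t gf_mul(uint8_t a, uint8_t b);
+
+ // compute the Forney syndromes
+ //
+ //   S_F(x) = S(x) Lambda_F(x) mod x^n
+ //
+ // s holds the n syndromes S_i, lambda_f holds the f+1 coefficients of
+ // the erasure-locator polynomial, both lowest-degree first
+ void build_forney_syndromes(
+         const uint8_t *s, size_t n,
+         const uint8_t *lambda_f, size_t f,
+         uint8_t *s_f) {
+     for (size_t i = 0; i < n; i++) {
+         // plain polynomial multiplication, the mod x^n just means we
+         // never compute terms with degree >= n
+         uint8_t v = 0;
+         for (size_t k = 0; k <= f && k <= i; k++) {
+             v ^= gf_mul(lambda_f[k], s[i-k]);
+         }
+         s_f[i] = v;
+     }
+ }
+ ```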
+
+ Note that the Forney syndromes $S_{Fi}$ still satisfy the equation for
+ $\Omega(x)$:
+
+ $$
+ \begin{aligned}
+ \Omega(x) &= S(x)\Lambda(x) \bmod x^n \\
+           &= S(x)\Lambda_E(x)\Lambda_F(x) \bmod x^n \\
+           &= S_F(x)\Lambda_E(x) \bmod x^n
+ \end{aligned}
+ $$
+
+ We can then use Berlekamp-Massey with the Forney syndromes $S_{Fi}$ to
+ find the error-locator polynomial $\Lambda_E(x)$.
+
+ Combining the error-locator polynomial $\Lambda_E(x)$ and the
+ erasure-locator polynomial $\Lambda_F(x)$ gives us the creatively
+ named error-and-erasure-locator polynomial $\Lambda(x)$, which
+ contains everything we need to know to find the locations of both
+ errors and erasures:
+
+ $$
+ \Lambda(x) = \Lambda_E(x) \Lambda_F(x)
+ $$
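+
+ Combining the two locator polynomials is plain polynomial
+ multiplication over GF(256). A sketch, with the same illustrative
+ helpers as before:
+
+ ``` c
+ #include <stdint.h>
+ #include <stddef.h>
+
+ // assumed GF(256) multiply, see the earlier sketch
+ uint8_t gf_mul(uint8_t a, uint8_t b);
+
+ // combine the error-locator and erasure-locator polynomials
+ //
+ //   Lambda(x) = Lambda_E(x) Lambda_F(x)
+ //
+ // lambda_e has e+1 coefficients, lambda_f has f+1 coefficients, and
+ // lambda needs room for e+f+1 bytes, all lowest-degree first
+ void combine_locators(
+         const uint8_t *lambda_e, size_t e,
+         const uint8_t *lambda_f, size_t f,
+         uint8_t *lambda) {
+     for (size_t i = 0; i < e+f+1; i++) {
+         lambda[i] = 0;
+     }
+
+     for (size_t i = 0; i <= e; i++) {
+         for (size_t k = 0; k <= f; k++) {
+             // addition is xor in GF(256)
+             lambda[i+k] ^= gf_mul(lambda_e[i], lambda_f[k]);
+         }
+     }
+ }
+ ```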
+
+ At this point we can continue Reed-Solomon as normal, finding the
+ error/erasure locations where $\Lambda(X_j^{-1})=0$, and repairing
+ them with Forney's algorithm,
+ $Y_j = X_j \frac{\Omega(X_j^{-1})}{\Lambda'(X_j^{-1})}$:
+
+ $$
+ C(x) = C'(x) - \sum_{j \in E \cup F} Y_j x^j
+ $$
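+
+ And a final sketch of the search-and-repair step over both errors and
+ erasures. The indexing convention (treating `c[j]` as the coefficient
+ of $x^j$) and the `gf_pow`/`gf_div`/`gf_eval` helpers are, again,
+ illustrative assumptions rather than ramrsbd's actual API:
+
+ ``` c
+ #include <stdint.h>
+ #include <stddef.h>
+
+ // assumed GF(256) multiply, see the earlier sketch
+ uint8_t gf_mul(uint8_t a, uint8_t b);
+
+ // GF(256) exponentiation by squaring
+ uint8_t gf_pow(uint8_t a, uint32_t e) {
+     uint8_t r = 1;
+     while (e) {
+         if (e & 1) {
+             r = gf_mul(r, a);
+         }
+         a = gf_mul(a, a);
+         e >>= 1;
+     }
+     return r;
+ }
+
+ // GF(256) division via the inverse, b^-1 = b^254
+ uint8_t gf_div(uint8_t a, uint8_t b) {
+     return gf_mul(a, gf_pow(b, 254));
+ }
+
+ // evaluate a polynomial, lowest-degree first, at x (Horner's scheme)
+ uint8_t gf_eval(const uint8_t *p, size_t len, uint8_t x) {
+     uint8_t y = 0;
+     for (size_t i = len; i > 0; i--) {
+         y = gf_mul(y, x) ^ p[i-1];
+     }
+     return y;
+ }
+
+ // find and repair errors/erasures in the codeword c, given the
+ // combined locator lambda and the error-evaluator omega, with
+ // X_j = g^j and g = 2
+ void repair(
+         uint8_t *c, size_t size,
+         const uint8_t *lambda, size_t lambda_len,
+         const uint8_t *omega, size_t omega_len) {
+     for (size_t j = 0; j < size; j++) {
+         uint8_t x_j = gf_pow(2, j);
+         uint8_t x_j_inv = gf_div(1, x_j);
+
+         // only error/erasure locations are roots of Lambda(x)
+         if (gf_eval(lambda, lambda_len, x_j_inv) != 0) {
+             continue;
+         }
+
+         // evaluate the formal derivative Lambda'(x), which in GF(2^8)
+         // keeps only the odd-degree coefficients, shifted down one
+         uint8_t d = 0;
+         for (size_t k = 1; k < lambda_len; k += 2) {
+             d ^= gf_mul(lambda[k], gf_pow(x_j_inv, k-1));
+         }
+
+         // Forney's algorithm, Y_j = X_j Omega(X_j^-1) / Lambda'(X_j^-1),
+         // and since subtraction is xor, repairing is just another xor
+         uint8_t y_j = gf_mul(x_j,
+                 gf_div(gf_eval(omega, omega_len, x_j_inv), d));
+         c[j] ^= y_j;
+     }
+ }
+ ```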