From a628087d79bae8d8ab422ed875b792fe9c83ce2f Mon Sep 17 00:00:00 2001
From: Nghia Truong
Date: Fri, 1 Dec 2023 14:11:53 -0800
Subject: [PATCH] Change docs

Signed-off-by: Nghia Truong
---
 docs/compatibility.md | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/docs/compatibility.md b/docs/compatibility.md
index 05bb7ee00a2..8043aa12d38 100644
--- a/docs/compatibility.md
+++ b/docs/compatibility.md
@@ -85,16 +85,13 @@ considered to be a rare occurrence.
 
 ### `NaN` vs `NaN`
 
-Apache Spark does not have a consistent way to handle `NaN` comparison. Sometimes, All `NaN` are
+Apache Spark does not have a consistent way to handle `NaN` comparison. Sometimes, all `NaN` are
 considered as one unique value while other times they can be treated as different. The outcome of
-`NaN` comparision can differ in various operations and also changes over time. Due to such complexity,
-our plugin cannot guarantee to always match its output with Apache Spark if there are `NaN` values
-in the input.
-
-For example, the `inset` operator can treat `NaN` as different values in Spark 3.0.2 and 3.1.2 but
-considers them as the same in Spark 3.0.4, 3.1.3, and from Spark 3.2.0
-(see the [issue](https://issues.apache.org/jira/browse/SPARK-36792)).
-On the other hand, our plugin always compares all `NaN` as equal value.
+`NaN` comparison can differ between operations and has also changed between Spark versions.
+Our plugin tries to match its output with Apache Spark except for the operations listed below:
+ - `IN` SQL expression: `NaN` values can be treated as different in Spark 3.1.2 but are considered
+equal from Spark 3.1.3 (see [SPARK-36792](https://issues.apache.org/jira/browse/SPARK-36792)).
+In contrast, our plugin always treats all `NaN` values as equal for this operation.
 
 ## Decimal Support
 
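
Below is a minimal sketch (not part of the patch above) illustrating the `IN` behavior the new doc text describes. It assumes a local Spark session; the object name and the size of the literal list (chosen only so the optimizer is likely to rewrite `IN` into `InSet`, the case covered by SPARK-36792) are illustrative assumptions, not taken from the plugin docs.

```scala
// Minimal sketch: NaN inside an `IN` list. With a long literal list, Spark typically
// rewrites `In` into `InSet`, which is where the NaN mismatch described above occurred.
// The list size of 20 is an assumption chosen only to exceed the default conversion threshold.
import org.apache.spark.sql.SparkSession

object NanInSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("nan-in-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(Double.NaN, 1.0).toDF("v")

    // Literal list containing NaN plus enough extra values to trigger the `InSet` rewrite.
    val values: Seq[Any] = Double.NaN +: (1 to 20).map(_.toDouble)

    // Spark 3.1.2 could drop the NaN row here (NaN compared as unequal to NaN),
    // while Spark 3.1.3 and later keep it; the plugin always treats all NaN values as equal.
    df.filter($"v".isin(values: _*)).show()

    spark.stop()
  }
}
```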