StreamingDeduplicationStrategy is an execution planning strategy that plans Deduplicate logical operators over streaming queries as StreamingDeduplicateExec physical operators.
Tip: Read up on Execution Planning Strategies in The Internals of Spark SQL book.
Note: The Deduplicate logical operator represents the Dataset.dropDuplicates operator in a logical query plan.
StreamingDeduplicationStrategy is available through the query planner of SessionState:
spark.sessionState.planner.StreamingDeduplicationStrategy
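To make the planning step concrete, the following is a minimal sketch for spark-shell. It assumes a session where `spark` is the active SparkSession and a Spark version in which Deduplicate takes only the keys and the child plan. The rate source, the value column and the names deduped, strategy and planned are illustrative choices, not part of the original text.

```scala
// Sketch for spark-shell, where `spark` is the active SparkSession.
// The "rate" source and the "value" column are illustrative choices.
val deduped = spark.readStream.format("rate").load().dropDuplicates("value")

// dropDuplicates adds a Deduplicate operator to the logical plan.
println(deduped.queryExecution.logical.numberedTreeString)

// Apply the strategy directly to the logical plan.
// A plan the strategy does not handle yields an empty sequence.
val strategy = spark.sessionState.planner.StreamingDeduplicationStrategy
val planned = strategy(deduped.queryExecution.logical)
println(planned.map(_.nodeName))

// Alternatively, let Spark plan the whole streaming query and print the result.
deduped.explain()
```

If the strategy matches, the planned sequence should hold a single StreamingDeduplicateExec operator, and the output of explain should contain a StreamingDeduplicate node. The exact plan text varies between Spark versions.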