spark-sql-streaming-StreamingDeduplicationStrategy.adoc

File metadata and controls

22 lines (15 loc) · 1.04 KB

StreamingDeduplicationStrategy Execution Planning Strategy for Deduplicate Logical Operator

StreamingDeduplicationStrategy is an execution planning strategy that plans Deduplicate logical operators over streaming queries (streaming Datasets) as StreamingDeduplicateExec physical operators.

Note
The Deduplicate logical operator represents the Dataset.dropDuplicates operator in a logical query plan.
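
The relationship between Dataset.dropDuplicates and Deduplicate can be checked by inspecting a query's logical plan. The following is a minimal sketch, assuming Spark 2.2 or later in the 2.x/3.x line with the built-in `rate` streaming source. The application name and the `value` deduplication column are illustrative choices, not part of the original text.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.Deduplicate

// Assumption: a local session; in spark-shell the `spark` value already exists
val spark = SparkSession.builder
  .master("local[*]")
  .appName("deduplicate-logical-operator") // illustrative name
  .getOrCreate()

// The "rate" source gives a streaming Dataset with `timestamp` and `value` columns
val rates = spark.readStream.format("rate").load

// dropDuplicates is the high-level operator that Deduplicate stands for
val deduplicated = rates.dropDuplicates("value")

// Look at the top-level operator of the analyzed logical plan
val analyzed = deduplicated.queryExecution.analyzed
val isDeduplicate = analyzed.isInstanceOf[Deduplicate]

// Print the plan tree for inspection
println(analyzed.numberedTreeString)
```

No streaming query is started here. Building the Dataset is enough to produce the logical plan.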

StreamingDeduplicationStrategy is available through the query planner of SessionState.

spark.sessionState.planner.StreamingDeduplicationStrategy
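
The strategy can also be applied by hand to a logical plan to see which physical operators it proposes. This is a sketch only, assuming Spark 2.2 or later in the 2.x/3.x line, where `SparkSession.sessionState` is publicly accessible (though marked unstable). The `rate` source, the column names and the application name are illustrative.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session; in spark-shell the `spark` value already exists
val spark = SparkSession.builder
  .master("local[*]")
  .appName("streaming-deduplication-strategy") // illustrative name
  .getOrCreate()

// A streaming Dataset with dropDuplicates gives a Deduplicate over a streaming child
val streamingPlan = spark.readStream
  .format("rate")
  .load
  .dropDuplicates("value")
  .queryExecution
  .analyzed

// Access the strategy through the session's query planner
val strategy = spark.sessionState.planner.StreamingDeduplicationStrategy

// A strategy maps a logical plan to candidate physical plans (empty if it does not apply)
val streamingCandidates = strategy(streamingPlan)
streamingCandidates.foreach(p => println(p.getClass.getSimpleName))

// For comparison: the same operator over a batch Dataset is not handled by this strategy
val batchPlan = spark.range(5).dropDuplicates("id").queryExecution.analyzed
val batchCandidates = strategy(batchPlan)
println(s"Candidates for the batch plan: ${batchCandidates.size}")
```

Applying the strategy directly only plans the top-level operator. The child plan is left as a placeholder for the planner to fill in later, so the result is for inspection rather than execution.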

Demo: Using StreamingDeduplicationStrategy

FIXME