
Regarding pushdown filter predicate #174

Open
ankit11519 opened this issue Mar 14, 2023 · 3 comments

Comments


ankit11519 commented Mar 14, 2023

We are using Apache Spark to connect to DynamoDB via the emr-dynamodb-connector.
Below are my code statements in PySpark:

dynamoDf = spark.read.option("region", "REGION") \
    .option("tableName", "TABLE_NAME") \
    .format("dynamodb") \
    .load()
dynamoDfFilter = dynamoDf.filter((F.col("colFilter").startswith('ABC')) | (F.col("colFilter").startswith('XYZ')))
print(dynamoDfFilter.count())

So, I wanted to know: will these filter conditions also be pushed down to DynamoDB for server-side filtering, or will they be applied only after a full scan has loaded the data into dynamoDf?
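To make the two behaviors being asked about concrete, here is a toy, stdlib-only sketch (illustration only, not connector code): with server-side filtering, DynamoDB applies a FilterExpression and only matching items cross the wire; with client-side filtering, every item is transferred and filtered afterwards. The sample table and `matches` predicate are made up to mirror the `startswith` filter above.

```python
# Toy model of the two filtering strategies in question.
# (Note: even with a server-side FilterExpression, DynamoDB still reads the
# whole table; the filter reduces data transferred, not read capacity used.)

TABLE = [
    {"colFilter": "ABC-1"},
    {"colFilter": "XYZ-2"},
    {"colFilter": "QRS-3"},
]

def matches(item):
    # Mirrors: startswith('ABC') | startswith('XYZ')
    v = item["colFilter"]
    return v.startswith("ABC") or v.startswith("XYZ")

def scan_server_side_filtered(table):
    # DynamoDB evaluates the predicate; only matching items are returned.
    transferred = [item for item in table if matches(item)]
    return transferred, len(transferred)

def scan_then_filter_client_side(table):
    # The connector transfers every item; Spark filters locally afterwards.
    transferred = list(table)
    kept = [item for item in transferred if matches(item)]
    return kept, len(transferred)

server_kept, server_transferred = scan_server_side_filtered(TABLE)
client_kept, client_transferred = scan_then_filter_client_side(TABLE)
# Same result rows either way; only the number of items moved differs.
```

Either path yields identical query results; the cost difference is the volume of data shipped from DynamoDB to Spark.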

@kevnzhao
Contributor

Which connector are you using?
Predicate pushdown is only available for Hive tables in the EMR DDB Hive Connector package. You can find the supported data types and operators here.

@mimaomao, feel free to comment if I missed anything.

@ankit11519
Author

We are using this connector:
[Accessing data in Amazon DynamoDB with Apache Spark]

So, we are accessing Amazon DynamoDB with Apache Spark for a scan operation. My question is: if the query is like this:

dynamoDfFilter = dynamoDf.filter((F.col("colFilter").startswith('ABC')) | (F.col("colFilter").startswith('XYZ')))

print(dynamoDfFilter.count())

When the Spark filter is applied to the scan query, does the request sent to DynamoDB include a FilterExpression for server-side filtering, or does the scan first load the entire table into a DataFrame and then apply the filter to that DataFrame?

@kevnzhao
Contributor

Then you are using the Hadoop connector. Predicate push-down is not available there yet, so all data is loaded from DynamoDB and the filter is applied on the Spark side.
