databricks · Certified-Associate-Developer-for-Apache-Spark · Q426 · multiple_choice · topic_1

The code block shown below contains an error. The code block is intended to return the exact number of distinct values in column division in DataFrame storesDF. Identify the error.

Code block:

storesDF.agg(approx_count_distinct(col("division")).alias("divisionDistinct"))

A. The approx_count_distinct() operation needs a second argument to set the rsd parameter to ensure it returns the exact number of distinct values.
B. There is no alias() operation for the approx_count_distinct() operation's output.
C. There is no way to return an exact distinct number in Spark because the data is distributed across partitions.
D. The approx_count_distinct() operation is not a standalone function - it should be used as a method from a Column object.
E. The approx_count_distinct() operation cannot determine an exact number of distinct values in a column.

Explanation

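Selected Answer: E. You cannot get an exact distinct count from an approximate function. approx_count_distinct() returns an estimate, and its optional rsd argument only sets the maximum allowed relative standard deviation of that estimate, so option A does not make the result exact. For an exact count, use countDistinct() instead.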

Reference: examtopics_top_comment
