hiexam
amazon · AWS-Certified-Machine-Learning---Specialty · Q426 · multiple_choice · topic_1

A Data Scientist is developing a binary classifier to predict whether a patient has a particular disease on a series of test results. The Data Scientist has data on 400 patients randomly selected from the population. The disease is seen in 3% of the population. Which cross-validation strategy should the Data Scientist adopt?
  • A. A k-fold cross-validation strategy with k=5
  • B. A stratified k-fold cross-validation strategy with k=5
  • C. A k-fold cross-validation strategy with k=5 and 3 repeats
  • D. An 80/20 stratified split between training and validation
Explanation
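B - With 400 patients and a 3% prevalence, only about 12 positive cases are expected. Stratified k-fold cross-validation keeps the class distribution in each fold close to the distribution in the complete dataset, so every fold contains positive cases (roughly 2 or 3 per fold with k=5). Plain k-fold (A and C) can produce folds with very few or no positive cases, and a single 80/20 stratified split (D) validates on only about 2 or 3 positive cases, which gives a high-variance estimate.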
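
The effect can be illustrated with a minimal plain-Python sketch. It assumes exactly 12 positives among the 400 patients (the expected count at 3%; a real random sample could contain more or fewer). The helper `stratified_folds` is hypothetical and hand-rolled for illustration only; in practice a library routine such as scikit-learn's `StratifiedKFold` would normally be used instead.

```python
import random
from collections import defaultdict
from math import comb


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds, spreading each class evenly across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal this class's samples round-robin, so per-class counts
        # differ by at most one between any two folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Assumed data: 400 patients, exactly 12 positives (3%), in random order.
labels = [1] * 12 + [0] * 388
random.Random(42).shuffle(labels)

folds = stratified_folds(labels, k=5)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print("positives per stratified fold:", positives_per_fold)

# For comparison: with plain (unstratified) random 5-fold splitting, the
# chance that one given fold of 80 patients contains no positives at all
# is a hypergeometric probability.
p_zero = comb(388, 80) / comb(400, 80)
print("P(a given plain fold has zero positives):", round(p_zero, 3))
```

With stratification every fold is guaranteed 2 or 3 positives under these assumptions, whereas a plain random split leaves a non-trivial chance that a fold has none, in which case metrics such as recall cannot be computed for that fold.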

Reference: examtopics_top_comment
