# AWS-Certified-Data-Analytics---Specialty — Question 425

**Type:** multiple_choice
**Topics:** topic_1

## Question

An insurance company has raw data in JSON format that is sent without a predefined schedule through an Amazon Kinesis Data Firehose delivery stream to an Amazon S3 bucket. An AWS Glue crawler is scheduled to run every 8 hours to update the schema in the data catalog of the tables stored in the S3 bucket. Data analysts analyze the data using Apache Spark SQL on Amazon EMR set up with AWS Glue Data Catalog as the metastore. Data analysts say that, occasionally, the data they receive is stale. A data engineer needs to provide access to the most up-to-date data.
Which solution meets these requirements?

## Correct Answer

D

## Explanation

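Answer: D

The data analysts query JSON data stored in S3 using Apache Spark SQL on Amazon EMR, with the AWS Glue Data Catalog as the metastore. Because the crawler runs only every 8 hours while the data arrives without a predefined schedule, the catalog can lag behind what is actually in S3, which is why the analysts occasionally see stale data. In the correct solution, each new JSON file landing in S3 triggers an AWS Lambda function that starts the Glue crawler, so the schema is updated as data arrives rather than on a fixed schedule.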
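
The event-driven flow in this explanation (an object lands in S3, S3 invokes a Lambda function, the function starts the Glue crawler) could be sketched roughly as follows. This is a minimal illustration, not the exam's reference implementation: the crawler name `raw-json-crawler`, the `CRAWLER_NAME` environment variable, and the optional `glue` parameter are assumptions made for the example. It also assumes the bucket has an S3 event notification for object-created events pointing at this function.

```python
import os
from urllib.parse import unquote_plus

# Hypothetical crawler name; in practice this would be set on the Lambda function.
CRAWLER_NAME = os.environ.get("CRAWLER_NAME", "raw-json-crawler")


def start_crawler_if_idle(glue, crawler_name):
    """Start the crawler unless a run is already in progress.

    Returns True if a new run was started, False if one was already running.
    """
    try:
        glue.start_crawler(Name=crawler_name)
        return True
    except glue.exceptions.CrawlerRunningException:
        # An earlier S3 event already started a run; skip instead of failing.
        return False


def lambda_handler(event, context, glue=None):
    """Handle an S3 object-created notification by starting the Glue crawler."""
    if glue is None:
        # Imported lazily so the logic can be exercised without AWS access.
        import boto3
        glue = boto3.client("glue")

    # Object keys in S3 event notifications are URL-encoded.
    keys = [
        unquote_plus(record["s3"]["object"]["key"])
        for record in event.get("Records", [])
    ]
    started = start_crawler_if_idle(glue, CRAWLER_NAME)
    return {"objects": keys, "crawler_started": started}
```

Catching `CrawlerRunningException` matters here because Firehose may deliver several objects in quick succession, and a crawler that is already running cannot be started again. Objects that arrive during a run would be picked up by the next triggered run.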

**Reference:** examtopics_top_comment

---
Source: https://hiexam.net/q/amazon/AWS-Certified-Data-Analytics---Specialty/425  