Databricks-Certified-Data-Engineer-Associate Databricks Certified Data Engineer Associate Exam Questions and Answers

Questions 4

A data engineer needs to use a Delta table as part of a data pipeline, but they do not know if they have the appropriate permissions.

In which of the following locations can the data engineer review their permissions on the table?

Options:

Databricks Filesystem

Jobs

Dashboards

Repos

Data Explorer

Questions 5

A new data engineering team, team, has been assigned to an ELT project. The new data engineering team will need full privileges on the database customers to fully manage the project.

Which of the following commands can be used to grant full permissions on the database to the new data engineering team?

Options:

GRANT USAGE ON DATABASE customers TO team;

GRANT ALL PRIVILEGES ON DATABASE team TO customers;

GRANT SELECT PRIVILEGES ON DATABASE customers TO teams;

GRANT SELECT CREATE MODIFY USAGE PRIVILEGES ON DATABASE customers TO team;

GRANT ALL PRIVILEGES ON DATABASE customers TO team;

Questions 6

A data engineer has a Job with multiple tasks that runs nightly. Each of the tasks runs slowly because the clusters take a long time to start.

Which of the following actions can the data engineer perform to improve the start up time for the clusters used for the Job?

Options:

They can use endpoints available in Databricks SQL

They can use jobs clusters instead of all-purpose clusters

They can configure the clusters to be single-node

They can use clusters that are from a cluster pool

They can configure the clusters to autoscale for larger data sizes

Questions 7

Which of the following is hosted completely in the control plane of the classic Databricks architecture?

Options:

Worker node

JDBC data source

Databricks web application

Databricks Filesystem

Driver node

Questions 8

A data engineer has been using a Databricks SQL dashboard to monitor the cleanliness of the input data to an ELT job. The ELT job has a Databricks SQL query that returns the number of input records containing unexpected NULL values. The data engineer wants their entire team to be notified via a messaging webhook whenever this value reaches 100.

Which of the following approaches can the data engineer use to notify their entire team via a messaging webhook whenever the number of NULL values reaches 100?

Options:

They can set up an Alert with a custom template.

They can set up an Alert with a new email alert destination.

They can set up an Alert with a new webhook alert destination.

They can set up an Alert with one-time notifications.

They can set up an Alert without notifications.

Questions 9

A data engineer needs to create a table in Databricks using data from a CSV file at location /path/to/csv.

They run the following command:

Which of the following lines of code fills in the above blank to successfully complete the task?

Options:

None of these lines of code are needed to successfully complete the task

USING CSV

FROM CSV

USING DELTA

FROM "path/to/csv"

Questions 10

Which of the following data lakehouse features results in improved data quality over a traditional data lake?

Options:

A data lakehouse provides storage solutions for structured and unstructured data.

A data lakehouse supports ACID-compliant transactions.

A data lakehouse allows the use of SQL queries to examine data.

A data lakehouse stores data in open formats.

A data lakehouse enables machine learning and artificial intelligence workloads.

Questions 11

Identify how the count_if function and the count where x is null can be used.

Consider a table random_values with the data below.

[Table random_values: one column, col1; apart from a NULL entry, the rows were not preserved.]

What would be the output of the query below?

select count_if(col1 > 1) as count_a, count(*) as count_b, count(col1) as count_c from random_values

Options:

3 6 5

4 6 5

3 6 6

4 6 6
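
As a rough illustration of how the three aggregates in the query treat NULL, the sketch below emulates them in plain Python. The sample values are hypothetical and are not the rows of random_values; None stands in for SQL NULL, and the helper names are invented for this example rather than taken from any Spark API.

```python
from typing import Optional, Sequence


def count_if_gt(values: Sequence[Optional[int]], threshold: int) -> int:
    """Emulate count_if(col1 > threshold): a NULL comparison is never true."""
    return sum(1 for v in values if v is not None and v > threshold)


def count_star(values: Sequence[Optional[int]]) -> int:
    """Emulate count(*): every row counts, NULL or not."""
    return len(values)


def count_col(values: Sequence[Optional[int]]) -> int:
    """Emulate count(col1): only non-NULL values count."""
    return sum(1 for v in values if v is not None)


# Hypothetical sample column, NOT the data from the question.
sample = [5, None, 0, 7]

result = (count_if_gt(sample, 1), count_star(sample), count_col(sample))
print(result)
```

The point to take from the sketch is that count(*) and count(col1) differ by exactly the number of NULL rows, while count_if only counts rows where the condition evaluates to true.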

Questions 12

A data engineer wants to create a data entity from a couple of tables. The data entity must be used by other data engineers in other sessions. It also must be saved to a physical location.

Which of the following data entities should the data engineer create?

Options:

Database

Function

View

Temporary view

Table

Questions 13

A data engineer is maintaining a data pipeline. Upon data ingestion, the data engineer notices that the source data is starting to have a lower level of quality. The data engineer would like to automate the process of monitoring the quality level.

Which of the following tools can the data engineer use to solve this problem?

Options:

Unity Catalog

Data Explorer

Delta Lake

Delta Live Tables

Auto Loader

Questions 14

Which of the following Git operations must be performed outside of Databricks Repos?

Options:

Commit

Pull

Push

Clone

Merge

Questions 15

A data engineer has left the organization. The data team needs to transfer ownership of the data engineer’s Delta tables to a new data engineer. The new data engineer is the lead engineer on the data team.

Assuming the original data engineer no longer has access, which of the following individuals must be the one to transfer ownership of the Delta tables in Data Explorer?

Options:

Databricks account representative

This transfer is not possible

Workspace administrator

New lead data engineer

Original data engineer

Questions 16

In which of the following scenarios should a data engineer select a Task in the Depends On field of a new Databricks Job Task?

Options:

When another task needs to be replaced by the new task

When another task needs to fail before the new task begins

When another task has the same dependency libraries as the new task

When another task needs to use as little compute resources as possible

When another task needs to successfully complete before the new task begins

Questions 17

A data analyst has a series of queries in a SQL program. The data analyst wants this program to run every day. They only want the final query in the program to run on Sundays. They ask for help from the data engineering team to complete this task.

Which of the following approaches could be used by the data engineering team to complete this task?

Options:

They could submit a feature request with Databricks to add this functionality.

They could wrap the queries using PySpark and use Python’s control flow system to determine when to run the final query.

They could only run the entire program on Sundays.

They could automatically restrict access to the source table in the final query so that it is only accessible on Sundays.

They could redesign the data model to separate the data used in the final query into a new table.
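
One of the options mentions Python control flow. A minimal sketch of that idea, assuming the program is held as a list of SQL strings, is shown here; the function name and the sample queries are hypothetical, and actually executing each statement against a Spark session is left out so the example stays self-contained.

```python
import datetime
from typing import List


def queries_for(run_date: datetime.date, queries: List[str]) -> List[str]:
    """Return the queries to run on run_date.

    Every query except the last runs daily; the last one is included
    only when run_date falls on a Sunday.
    """
    if not queries:
        return []
    is_sunday = run_date.weekday() == 6  # Monday is 0, Sunday is 6
    return list(queries) if is_sunday else list(queries[:-1])


# Hypothetical program made of three statements.
program = ["SELECT 1", "SELECT 2", "SELECT 3"]

sunday = datetime.date(2025, 7, 6)
monday = datetime.date(2025, 7, 7)

print(queries_for(sunday, program))
print(queries_for(monday, program))
```

In a scheduled notebook, the returned statements would then be passed one by one to whatever call runs SQL in that environment.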

Questions 18

A data engineer needs to use a Delta table as part of a data pipeline, but they do not know if they have the appropriate permissions.

In which location can the data engineer review their permissions on the table?

Options:

Jobs

Dashboards

Catalog Explorer

Repos

Questions 19

A data engineer and data analyst are working together on a data pipeline. The data engineer is working on the raw, bronze, and silver layers of the pipeline using Python, and the data analyst is working on the gold layer of the pipeline using SQL. The raw source of the pipeline is a streaming input. They now want to migrate their pipeline to use Delta Live Tables.

Which change will need to be made to the pipeline when migrating to Delta Live Tables?

Options:

The pipeline can have different notebook sources in SQL & Python.

The pipeline will need to be written entirely in SQL.

The pipeline will need to be written entirely in Python.

The pipeline will need to use a batch source in place of a streaming source.

Questions 20

Which file format is used for storing a Delta Lake table?

Options:

Parquet

Delta

JSON

Questions 21

A Delta Live Tables pipeline includes two datasets defined using STREAMING LIVE TABLE. Three datasets are defined against Delta Lake table sources using LIVE TABLE.

The pipeline is configured to run in Development mode using the Continuous Pipeline Mode.

Assuming previously unprocessed data exists and all definitions are valid, what is the expected outcome after clicking Start to update the pipeline?

Options:

All datasets will be updated once and the pipeline will shut down. The compute resources will be terminated.

All datasets will be updated at set intervals until the pipeline is shut down. The compute resources will persist until the pipeline is shut down.

All datasets will be updated once and the pipeline will persist without any processing. The compute resources will persist but go unused.

All datasets will be updated once and the pipeline will shut down. The compute resources will persist to allow for additional testing.

All datasets will be updated at set intervals until the pipeline is shut down. The compute resources will persist to allow for additional testing.

Questions 22

Which tool is used by Auto Loader to process data incrementally?

Options:

Spark Structured Streaming

Unity Catalog

Checkpointing

Databricks SQL

Questions 23

Which of the following statements regarding the relationship between Silver tables and Bronze tables is always true?

Options:

Silver tables contain a less refined, less clean view of data than Bronze data.

Silver tables contain aggregates while Bronze data is unaggregated.

Silver tables contain more data than Bronze tables.

Silver tables contain a more refined and cleaner view of data than Bronze tables.

Silver tables contain less data than Bronze tables.

Questions 24

Identify the impact of ON VIOLATION DROP ROW and ON VIOLATION FAIL UPDATE for a constraint violation.

A data engineer has created an ETL pipeline using Delta Live Tables to manage their company's travel reimbursement details. They want to ensure that if the location details have not been provided by the employee, the pipeline is terminated.

How can the scenario be implemented?

Options:

CONSTRAINT valid_location EXPECT (location = NULL)

CONSTRAINT valid_location EXPECT (location != NULL) ON VIOLATION FAIL UPDATE

CONSTRAINT valid_location EXPECT (location != NULL) ON DROP ROW

CONSTRAINT valid_location EXPECT (location != NULL) ON VIOLATION FAIL
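
To make the difference between the violation actions concrete, the sketch below models them in plain Python. It is not the Delta Live Tables API: the function, the exception class, and the sample rows are all hypothetical, and None stands in for a missing location. It assumes the documented behaviour that an expectation with no action keeps invalid rows, DROP ROW discards them, and FAIL UPDATE stops the update.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


class UpdateFailed(Exception):
    """Models an update that is aborted by a violating row."""


def apply_expectation(
    rows: List[Row], column: str, on_violation: Optional[str] = None
) -> List[Row]:
    """Keep, drop, or fail on rows whose `column` is missing (None)."""
    kept: List[Row] = []
    for row in rows:
        if row.get(column) is not None:
            kept.append(row)  # row satisfies the expectation
        elif on_violation == "FAIL UPDATE":
            raise UpdateFailed(f"{column} is missing in {row}")
        elif on_violation == "DROP ROW":
            continue  # invalid row is discarded
        else:
            kept.append(row)  # no action: invalid row is retained
    return kept


# Hypothetical reimbursement rows; the second has no location.
rows = [{"location": "Paris"}, {"location": None}]

print(len(apply_expectation(rows, "location")))
print(len(apply_expectation(rows, "location", "DROP ROW")))
```

Calling apply_expectation(rows, "location", "FAIL UPDATE") on the same rows raises UpdateFailed, which is the analogue of the pipeline update being terminated.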

Questions 25

A data engineer is attempting to drop a Spark SQL table my_table and runs the following command:

DROP TABLE IF EXISTS my_table;

After running this command, the engineer notices that the data files and metadata files have been deleted from the file system.

Which of the following describes why all of these files were deleted?

Options:

The table was managed

The table's data was smaller than 10 GB

The table's data was larger than 10 GB

The table was external

The table did not have a location

Questions 26

Which of the following is a benefit of the Databricks Lakehouse Platform embracing open source technologies?

Options:

Cloud-specific integrations

Simplified governance

Ability to scale storage

Ability to scale workloads

Avoiding vendor lock-in

Questions 27

A new data engineering team, team, has been assigned to an ELT project. The new data engineering team will need full privileges on the table sales to fully manage the project.

Which of the following commands can be used to grant full permissions on the table to the new data engineering team?

Options:

GRANT ALL PRIVILEGES ON TABLE sales TO team;

GRANT SELECT CREATE MODIFY ON TABLE sales TO team;

GRANT SELECT ON TABLE sales TO team;

GRANT USAGE ON TABLE sales TO team;

GRANT ALL PRIVILEGES ON TABLE team TO sales;

Questions 28

Which of the following describes the relationship between Bronze tables and raw data?

Options:

Bronze tables contain less data than raw data files.

Bronze tables contain more truthful data than raw data.

Bronze tables contain aggregates while raw data is unaggregated.

Bronze tables contain a less refined view of data than raw data.

Bronze tables contain raw data with a schema applied.

Questions 29

A data engineer has configured a Structured Streaming job to read from a table, manipulate the data, and then perform a streaming write into a new table.

The code block used by the data engineer is below:

Which line of code should the data engineer use to fill in the blank if the data engineer only wants the query to execute a micro-batch to process data every 5 seconds?

Options:

trigger("5 seconds")

trigger(continuous="5 seconds")

trigger(once="5 seconds")

trigger(processingTime="5 seconds")

Questions 30

A data engineer wants to create a new table containing the names of customers who live in France.

They have written the following command:

CREATE TABLE customersInFrance

_____ AS

SELECT id,

firstName,

lastName

FROM customerLocations

WHERE country = 'FRANCE';

A senior data engineer mentions that it is organization policy to include a table property indicating that the new table includes personally identifiable information (PII).

Which line of code fills in the above blank to successfully complete the task?

Options:

COMMENT "Contains PII"

PII

"COMMENT PII"

TBLPROPERTIES PII

Questions 31

A data engineer is attempting to drop a Spark SQL table my_table. The data engineer wants to delete all table metadata and data.

They run the following command:

DROP TABLE IF EXISTS my_table

While the object no longer appears when they run SHOW TABLES, the data files still exist.

Which of the following describes why the data files still exist and the metadata files were deleted?

Options:

The table’s data was larger than 10 GB

The table’s data was smaller than 10 GB

The table was external

The table did not have a location

The table was managed

Questions 32

A data engineer has a Python variable table_name that they would like to use in a SQL query. They want to construct a Python code block that will run the query using table_name.

They have the following incomplete code block:

____(f"SELECT customer_id, spend FROM {table_name}")

Which of the following can be used to fill in the blank to successfully complete the task?

Options:

spark.delta.sql

spark.delta.table

spark.table

dbutils.sql

spark.sql
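
Whichever callable fills the blank, what it receives is an ordinary Python string produced by the f-string. The sketch below shows only that interpolation step; the table name is hypothetical, and the identifier check is an optional precaution added for this example, not something the question requires.

```python
# Hypothetical value for the Python variable from the question.
table_name = "sales_2024"

# Optional guard: reject values that are not simple identifiers before
# placing them into SQL text.
if not table_name.isidentifier():
    raise ValueError(f"unexpected table name: {table_name!r}")

# The f-string substitutes the variable's value into the query text.
query = f"SELECT customer_id, spend FROM {table_name}"
print(query)
```

The resulting string is what gets handed to the function that runs SQL from Python.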

Exam Code: Databricks-Certified-Data-Engineer-Associate

Exam Name: Databricks Certified Data Engineer Associate Exam

Last Update: Jul 3, 2025

Questions: 108
