20 Questions

Data Engineering on Microsoft Azure Quiz

1 :-

You have an Azure Data Lake Storage Gen2 account that contains a JSON file for customers. The file contains two attributes named FirstName and LastName. You need to copy the data from the JSON file to an Azure Synapse Analytics table by using Azure Databricks. A new column must be created that concatenates the FirstName and LastName values. You create the following components:

• A destination table in Azure Synapse
• An Azure Blob storage container
• A service principal

Which five actions should you perform in sequence next in a Databricks notebook? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Mount the Data Lake Storage onto DBFS.

Write the results to a table in Azure Synapse.

Perform transformations on the file.

Specify a temporary folder to stage the data.

Write the results to Data Lake Storage.

Read the file into a data frame.

Drop the data frame.

Perform transformations on the data frame.
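
As a rough illustration of the transformation step only, the following plain-Python sketch derives a concatenated name column from JSON records. The sample records, the `FullName` column name, and the `add_full_name` helper are assumptions made for this sketch. In a Databricks notebook the same step would normally be applied to a Spark data frame rather than to Python dictionaries.

```python
import json

# Hypothetical sample of the customer JSON file (one object per line).
SAMPLE_JSON_LINES = """\
{"FirstName": "Ada", "LastName": "Lovelace"}
{"FirstName": "Alan", "LastName": "Turing"}
"""


def add_full_name(record):
    """Return a copy of the record with a concatenated FullName field.

    The name FullName is an assumption; the question does not name the column.
    """
    enriched = dict(record)
    enriched["FullName"] = f"{record['FirstName']} {record['LastName']}"
    return enriched


# Read each JSON line into a dictionary, then apply the transformation.
customers = [json.loads(line) for line in SAMPLE_JSON_LINES.splitlines() if line]
transformed = [add_full_name(c) for c in customers]

for row in transformed:
    print(row["FullName"])
```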

2 :-

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this scenario, you will NOT be able to return to it. As a result, these questions will not appear in the review screen. You have an Azure Storage account that contains 100 GB of files. The files contain text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB. You plan to copy the data from the storage account to an Azure SQL data warehouse. You need to prepare the files to ensure that the data copies quickly. Solution: You modify the files to ensure that each row is more than 1 MB. Does this meet the goal?

Yes

No

3 :-

You develop a dataset named DBTBL1 by using Azure Databricks. DBTBL1 contains the following columns:

• SensorTypeID
• GeographyRegionID
• Year
• Month
• Day
• Hour
• Minute
• Temperature
• WindSpeed
• Other

You need to store the data to support daily incremental load pipelines that vary for each GeographyRegionID. The solution must minimize storage costs. How should you complete the code? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

4 :-

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen. You are designing an Azure Stream Analytics solution that will analyze Twitter data. You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once. Solution: You use a session window that uses a timeout size of 10 seconds. Does this meet the goal?

Yes

No
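
To see why the window type matters here, the sketch below counts events in fixed, non-overlapping 10-second buckets, which is how a tumbling window behaves. It is plain Python rather than Stream Analytics query language, the timestamps and the `count_per_tumbling_window` helper are made up for illustration, and window boundary handling is simplified. A session window, by contrast, keeps growing for as long as events arrive within the timeout, so its length is not fixed at 10 seconds.

```python
from collections import Counter


def count_per_tumbling_window(timestamps, window_seconds=10):
    """Count events per fixed, non-overlapping window.

    Each timestamp (seconds since an arbitrary origin) maps to exactly one
    window, identified by the window's start time, so no event is counted twice.
    """
    counts = Counter()
    for ts in timestamps:
        window_start = (ts // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical tweet arrival times in seconds.
tweet_times = [1, 4, 9, 10, 12, 25, 29, 30]
window_counts = count_per_tumbling_window(tweet_times)
print(window_counts)
```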

5 :-

You have an Azure data factory. You need to examine the pipeline failures from the last 60 days. What should you use?

The Activity log blade for the Data Factory resource

The Monitor & Manage app in Data Factory

The Resource health blade for the Data Factory resource

Azure Monitor

6 :-

You are building an Azure Stream Analytics job to identify how much time a user spends interacting with a feature on a webpage. The job receives events based on user actions on the webpage. Each row of data represents an event. Each event has a type of either 'start' or 'end'. You need to calculate the duration between start and end events. How should you complete the query? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
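
The calculation the query has to perform can be sketched outside Stream Analytics. The plain-Python example below pairs each 'end' event with the most recent 'start' event for the same user and feature and reports the elapsed seconds. The field names (`user`, `feature`, `type`, `time`), the sample events, and the `durations` helper are assumptions; the question does not show the event schema.

```python
def durations(events):
    """Return a list of (user, feature, seconds) for each matched start/end pair."""
    last_start = {}  # (user, feature) -> time of the most recent start event
    results = []
    for event in sorted(events, key=lambda e: e["time"]):
        key = (event["user"], event["feature"])
        if event["type"] == "start":
            last_start[key] = event["time"]
        elif event["type"] == "end" and key in last_start:
            # Consume the start so one start is never matched to two ends.
            results.append((key[0], key[1], event["time"] - last_start.pop(key)))
    return results


# Hypothetical events; times are seconds since an arbitrary origin.
sample_events = [
    {"user": "u1", "feature": "search", "type": "start", "time": 100},
    {"user": "u2", "feature": "search", "type": "start", "time": 105},
    {"user": "u1", "feature": "search", "type": "end", "time": 130},
    {"user": "u2", "feature": "search", "type": "end", "time": 112},
]
feature_durations = durations(sample_events)
print(feature_durations)
```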

7 :-

You have an Azure event hub named retailhub that has 16 partitions. Transactions are posted to retailhub. Each transaction includes the transaction ID, the individual line items, and the payment details. The transaction ID is used as the partition key. You are designing an Azure Stream Analytics job to identify potentially fraudulent transactions at a retail store. The job will use retailhub as the input. The job will output the transaction ID, the individual line items, the payment details, a fraud score, and a fraud indicator. You plan to send the output to an Azure event hub named fraudhub. You need to ensure that the fraud detection solution is highly scalable and processes transactions as quickly as possible. How should you structure the output of the Stream Analytics job? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Number of partitions:

Partition key:

Fraud indicator

Fraud score

Individual line items

Payment details

Transaction ID

8 :-

You have a SQL pool in Azure Synapse. A user reports that queries against the pool take longer than expected to complete. You need to add monitoring to the underlying storage to help diagnose the issue. Which two metrics should you monitor? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

Cache used percentage

DWU limit

Snapshot Storage Size

Active queries

Cache hit percentage

9 :-

You have an Azure Stream Analytics job that is a Stream Analytics project solution in Microsoft Visual Studio. The job accepts data generated by IoT devices in the JSON format. You need to modify the job to accept data generated by the IoT devices in the Protobuf format. Which three actions should you perform from Visual Studio in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Change the Event Serialization Format to Protobuf in the input.json file of the job and reference the DLL.

Add an Azure Stream Analytics Custom Deserializer Project (.NET) project to the solution.

Add .NET deserializer code for Protobuf to the custom deserializer project.

Add an Azure Stream Analytics Application project to the solution.

10 :-

You are designing a solution that will copy Parquet files stored in an Azure Blob storage account to an Azure Data Lake Storage Gen2 account. The data will be loaded daily to the data lake and will use a folder structure of {Year}/{Month}/{Day}/. You need to design a daily Azure Data Factory data load to minimize the data transfer between the two accounts. Which two configurations should you include in the design? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

Delete the files in the destination before loading new data.

Filter by the last modified date of the source files.

Delete the source files after they are copied.

Specify a file naming pattern for the destination.
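
One way to picture an incremental daily load is sketched below in plain Python: only source files modified during the load day are selected, and the destination folder follows the {Year}/{Month}/{Day}/ structure from the question. The file listing, the `files_to_copy` and `destination_folder` helpers, and the zero-padded month and day are assumptions for illustration. This is not Azure Data Factory configuration.

```python
from datetime import date, datetime, timedelta


def destination_folder(load_date):
    """Build the {Year}/{Month}/{Day}/ folder for a load date (zero-padding assumed)."""
    return f"{load_date.year}/{load_date.month:02d}/{load_date.day:02d}/"


def files_to_copy(listing, load_date):
    """Select files whose last-modified time falls within the load day."""
    window_start = datetime(load_date.year, load_date.month, load_date.day)
    window_end = window_start + timedelta(days=1)
    return [name for name, modified in listing if window_start <= modified < window_end]


# Hypothetical source listing: (file name, last modified time).
source_listing = [
    ("a.parquet", datetime(2024, 3, 1, 8, 0)),
    ("b.parquet", datetime(2024, 3, 2, 1, 30)),
    ("c.parquet", datetime(2024, 3, 2, 23, 59)),
]
load_day = date(2024, 3, 2)
selected = files_to_copy(source_listing, load_day)
target = destination_folder(load_day)
print(selected, "->", target)
```

Files that were already copied on earlier days are skipped, so only the new day's data crosses between the two accounts.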

11 :-

You have an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 contains a partitioned fact table named dbo.Sales and a staging table named stg.Sales that has the matching table and partition definitions. You need to overwrite the content of the first partition in dbo.Sales with the content of the same partition in stg.Sales. The solution must minimize load times. What should you do?

Switch the first partition from dbo.Sales to stg.Sales.

Switch the first partition from stg.Sales to dbo.Sales.

Update dbo.Sales from stg.Sales.

Insert the data from stg.Sales into dbo.Sales.

12 :-

You have a partitioned table in an Azure Synapse Analytics dedicated SQL pool. You need to design queries to maximize the benefits of partition elimination. What should you include in the Transact-SQL queries?

JOIN

WHERE

DISTINCT

GROUP BY

13 :-

You plan to create an Azure Synapse Analytics dedicated SQL pool. You need to minimize the time it takes to identify queries that return confidential information as defined by the company's data privacy regulations and the users who executed the queries. Which two components should you include in the solution? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

Sensitivity-classification labels applied to columns that contain confidential information

Resource tags for databases that contain confidential information

Audit logs sent to a Log Analytics workspace

Dynamic data masking for columns that contain confidential information

14 :-

You need to implement an Azure Databricks cluster that automatically connects to Azure Data Lake Storage Gen2 by using Azure Active Directory (Azure AD) integration. How should you configure the new cluster? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Cluster mode:

High Concurrency

Premium

Standard

Advanced option to enable:

Azure Data Lake Storage Gen1 credential passthrough

Table access control

15 :-

You plan to monitor an Azure data factory by using the Monitor & Manage app. You need to identify the status and duration of activities that reference a table in a source database. Which three actions should you perform in sequence? To answer, move the actions from the list of actions to the answer area and arrange them in the correct order.

From the Data Factory monitoring app, add the Source user property to the Activity Runs table.

From the Data Factory monitoring app, add the Source user property to the Pipeline Runs table.

From the Data Factory authoring UI, publish the pipelines.

From the Data Factory monitoring app, add a linked service to the Pipeline Runs table.

From the Data Factory authoring UI, generate a user property for Source on all activities.

From the Data Factory authoring UI, generate a user property for Source on all datasets.

16 :-

You configure monitoring for a Microsoft Azure SQL Data Warehouse implementation. The implementation uses PolyBase to load data from comma-separated value (CSV) files stored in Azure Data Lake Storage Gen2 by using an external table. Files with an invalid schema cause errors to occur. You need to monitor for an invalid schema error. For which error should you monitor?

External table access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect: Error [com.microsoft.polybase.client.KerberosSecureLogin] occurred while accessing external files.'

External table access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect: Error [No FileSystem for scheme: wasbs] occurred while accessing external file.'

Cannot execute the query "Remote Query" against OLE DB provider "SQLNCLI11" for linked server "(null)". Query aborted: the maximum reject threshold (0 rows) was reached while reading from an external source: 1 rows rejected out of total 1 rows processed.

External table access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect: Error [Unable to instantiate LoginClass] occurred while accessing external files.'

17 :-

You have an Azure subscription that contains a logical Microsoft SQL server named Server1. Server1 hosts an Azure Synapse Analytics dedicated SQL pool named Pool1. You need to recommend a Transparent Data Encryption (TDE) solution for Server1. The solution must meet the following requirements:

• Track the usage of encryption keys.
• Maintain the access of client apps to Pool1 in the event of an Azure datacenter outage that affects the availability of the encryption keys.

What should you include in the recommendation? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

To track encryption key usage:

Always Encrypted

TDE with customer-managed keys

TDE with platform-managed keys

To maintain client app access in the event of a datacenter outage:

Create and configure Azure key vaults in two Azure regions.

Enable Advanced Data Security on Server1.

Implement a client app by using a Microsoft .NET Framework data provider.

18 :-

You are building an Azure Stream Analytics query that will receive input data from Azure IoT Hub and write the results to Azure Blob storage. You need to calculate the difference in readings per sensor per hour. How should you complete the query? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

First selection:

LAG

LAST

LEAD

Second selection:

LIMIT DURATION

OFFSET

WHEN
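
The idea of comparing each reading with the previous one from the same sensor, looking back no more than one hour, can be sketched in plain Python as follows. The tuple layout `(sensor_id, time, value)`, the sample data, and the `reading_deltas` helper are assumptions for this sketch, and it is not Stream Analytics query syntax.

```python
from datetime import datetime, timedelta


def reading_deltas(readings, lookback=timedelta(hours=1)):
    """Return (sensor_id, time, delta) per reading.

    delta is the difference from the same sensor's previous reading, or None
    when there is no previous reading within the lookback period.
    """
    previous = {}  # sensor_id -> (time, value) of the latest reading seen
    deltas = []
    for sensor_id, time, value in sorted(readings, key=lambda r: r[1]):
        prior = previous.get(sensor_id)
        if prior is not None and time - prior[0] <= lookback:
            deltas.append((sensor_id, time, value - prior[1]))
        else:
            deltas.append((sensor_id, time, None))
        previous[sensor_id] = (time, value)
    return deltas


# Hypothetical readings from two sensors.
t0 = datetime(2024, 1, 1, 0, 0)
sample_readings = [
    ("s1", t0, 20.0),
    ("s2", t0 + timedelta(minutes=5), 7.0),
    ("s1", t0 + timedelta(minutes=30), 21.5),
    ("s1", t0 + timedelta(hours=2), 19.0),
]
sensor_deltas = reading_deltas(sample_readings)
for row in sensor_deltas:
    print(row)
```

Keeping a separate "previous reading" per sensor plays the role of partitioning by sensor, and the `lookback` parameter plays the role of the one-hour limit.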

19 :-

You have an Azure Active Directory (Azure AD) tenant that contains a security group named Group1. You have an Azure Synapse Analytics dedicated SQL pool named dw1 that contains a schema named schema1. You need to grant Group1 read-only permissions to all the tables and views in schema1. The solution must use the principle of least privilege. Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order. NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Create a database role named Role1 and grant Role1 SELECT permissions to schema1.

Create a database role named Role1 and grant Role1 SELECT permissions to dw1.

Assign the Azure role-based access control (Azure RBAC) Reader role for dw1 to Group1.

Create a database user in dw1 that represents Group1 and uses the FROM EXTERNAL PROVIDER clause.

Assign Role1 to the Group1 database user.

20 :-

You are designing an Azure Databricks table. The table will ingest an average of 20 million streaming events per day. You need to persist the events in the table for use in incremental load pipeline jobs in Azure Databricks. The solution must minimize storage costs and incremental load times. What should you include in the solution?

Partition by DateTime fields.

Sink to Azure Queue storage.

Include a watermark column.

Use a JSON format for physical data storage.
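
To make the effect of date-based partitioning on incremental loads concrete, the plain-Python sketch below groups events into per-day partitions and then selects only the partitions newer than the last one already loaded. The `year=/month=/day=` key layout, the sample events, and the helper names are assumptions for illustration; it does not show Databricks or Delta Lake APIs.

```python
from collections import defaultdict
from datetime import datetime


def partition_key(event_time):
    """Build a zero-padded, date-based partition key so keys sort chronologically."""
    return f"year={event_time.year}/month={event_time.month:02d}/day={event_time.day:02d}"


def partition_events(events):
    """Group (event_time, payload) pairs by their date partition."""
    partitions = defaultdict(list)
    for event_time, payload in events:
        partitions[partition_key(event_time)].append(payload)
    return dict(partitions)


def partitions_to_load(partitions, last_loaded_key):
    """Return only the partitions that are newer than the last loaded one."""
    return sorted(key for key in partitions if key > last_loaded_key)


# Hypothetical streaming events.
sample_events = [
    (datetime(2024, 5, 1, 23, 50), "a"),
    (datetime(2024, 5, 2, 0, 5), "b"),
    (datetime(2024, 5, 2, 12, 0), "c"),
    (datetime(2024, 5, 3, 9, 0), "d"),
]
by_day = partition_events(sample_events)
pending = partitions_to_load(by_day, "year=2024/month=05/day=01")
print(pending)
```

An incremental job that tracks the last loaded partition reads only the pending partitions instead of scanning the whole table.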
