20 Questions

Data Engineering on Microsoft Azure Quiz

1 :-

You have an Azure Data Lake Storage Gen2 account that contains a JSON file for customers. The file contains two attributes named FirstName and LastName.

You need to copy the data from the JSON file to an Azure Synapse Analytics table by using Azure Databricks. A new column must be created that concatenates the FirstName and LastName values.

You create the following components:

  • A destination table in Azure Synapse
  • An Azure Blob storage container
  • A service principal

Which five actions should you perform in sequence next in a Databricks notebook? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Mount the Data Lake Storage onto DBFS

Write the results to a table in Azure Synapse

Perform transformations on the file

Specify a temporary folder to stage the data

Write the results to Data Lake Storage

Read the file into a data frame

Drop the data frame

Perform transformations on the data frame
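The concatenation requirement in this question can be sketched in plain Python. This is only an illustration of the transformation step, not Databricks code: the sample records, the `FullName` column name, and the single-space separator are all assumptions, since the question does not specify them.

```python
import json

# Hypothetical sample standing in for the customer JSON file.
raw = (
    '[{"FirstName": "Ada", "LastName": "Lovelace"},'
    ' {"FirstName": "Alan", "LastName": "Turing"}]'
)


def add_full_name(records):
    """Return new records with a FullName field built from FirstName and LastName.

    The field name and the space separator are assumptions for illustration.
    """
    return [
        {**record, "FullName": f'{record["FirstName"]} {record["LastName"]}'}
        for record in records
    ]


# Read the records, then apply the transformation to the in-memory data.
customers = add_full_name(json.loads(raw))
print(customers[0]["FullName"])
```

In a notebook the same idea would be applied to a data frame after the file has been read, rather than to a list of dictionaries.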

 

2 :-

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this scenario, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Storage account that contains 100 GB of files. The files contain text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB. You plan to copy the data from the storage account to an Azure SQL data warehouse. You need to prepare the files to ensure that the data copies quickly.

Solution: You modify the files to ensure that each row is more than 1 MB.

Does this meet the goal?

3 :-

You develop a dataset named DBTBL1 by using Azure Databricks. DBTBL1 contains the following columns:

  • SensorTypeID
  • GeographyRegionID
  • Year
  • Month
  • Day
  • Hour
  • Minute
  • Temperature
  • WindSpeed
  • Other

You need to store the data to support daily incremental load pipelines that vary for each GeographyRegionID. The solution must minimize storage costs. How should you complete the code? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

4 :-

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are designing an Azure Stream Analytics solution that will analyze Twitter data. You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.

Solution: You use a session window that uses a timeout size of 10 seconds.

Does this meet the goal?
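To make the stated goal concrete, here is a minimal Python sketch of counting events in fixed, non-overlapping 10-second buckets, so that every event lands in exactly one bucket. It illustrates the requirement only and does not model how any particular Stream Analytics window type behaves. The sample timestamps are made up, and the choice to treat each bucket as start-inclusive is an assumption of the sketch.

```python
from collections import Counter

WINDOW_SECONDS = 10


def count_per_window(timestamps, window=WINDOW_SECONDS):
    """Count events per fixed, non-overlapping window, keyed by window start.

    Each timestamp maps to exactly one window, so no event is counted twice.
    """
    counts = Counter()
    for ts in timestamps:
        window_start = (ts // window) * window
        counts[window_start] += 1
    return dict(counts)


# Hypothetical tweet arrival times, in seconds since some reference point.
tweet_times = [1, 4, 9, 10, 15, 27, 31]
counts = count_per_window(tweet_times)
print(counts)
```

Because the buckets do not overlap, the per-bucket counts always add up to the total number of events.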

5 :-

You have an Azure data factory. You need to examine the pipeline failures from the last 60 days. What should you use?

 

6 :-

You are building an Azure Stream Analytics job to identify how much time a user spends interacting with a feature on a webpage. The job receives events based on user actions on the webpage. Each row of data represents an event. Each event has a type of either 'start' or 'end'. You need to calculate the duration between start and end events. How should you complete the query? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
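The calculation this question describes can be sketched in plain Python as pairing each 'end' event with the most recent 'start' event for the same user. This is an illustration of the logic, not the Stream Analytics query itself: the tuple layout, the user identifiers, and the integer timestamps are assumptions for the example.

```python
def feature_durations(events):
    """Pair each 'end' event with the latest unmatched 'start' for the same user.

    events: iterable of (user, event_type, time) tuples ordered by time.
    Returns a list of (user, duration) tuples in the order the 'end' events arrive.
    """
    last_start = {}
    durations = []
    for user, event_type, ts in events:
        if event_type == "start":
            last_start[user] = ts
        elif event_type == "end" and user in last_start:
            # pop() consumes the start so it cannot be matched twice.
            durations.append((user, ts - last_start.pop(user)))
    return durations


# Hypothetical event stream, already ordered by time.
events = [
    ("u1", "start", 0),
    ("u2", "start", 3),
    ("u1", "end", 12),
    ("u2", "end", 20),
]
durations = feature_durations(events)
print(durations)
```

An 'end' event with no preceding 'start' is ignored here; a real job would need an explicit rule for that case.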

7 :-

You have an Azure event hub named retailhub that has 16 partitions. Transactions are posted to retailhub. Each transaction includes the transaction ID, the individual line items, and the payment details. The transaction ID is used as the partition key. You are designing an Azure Stream Analytics job to identify potentially fraudulent transactions at a retail store. The job will use retailhub as the input. The job will output the transaction ID, the individual line items, the payment details, a fraud score, and a fraud indicator. You plan to send the output to an Azure event hub named fraudhub. You need to ensure that the fraud detection solution is highly scalable and processes transactions as quickly as possible. How should you structure the output of the Stream Analytics job? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Number of partitions:

  • 1
  • 8
  • 16
  • 32

Partition key:

  • Fraud indicator
  • Fraud score
  • Individual line items
  • Payment details
  • Transaction ID

8 :-

You have a SQL pool in Azure Synapse. A user reports that queries against the pool take longer than expected to complete. You need to add monitoring to the underlying storage to help diagnose the issue. Which two metrics should you monitor? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

9 :-

You have an Azure Stream Analytics job that is a Stream Analytics project solution in Microsoft Visual Studio. The job accepts data generated by IoT devices in the JSON format. You need to modify the job to accept data generated by the IoT devices in the Protobuf format. Which three actions should you perform from Visual Studio in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Change the event serialization format to Protobuf in the input.json file of the job and reference the DLL

Add an Azure Stream Analytics Custom Deserializer Project (.NET) project to the solution

Add .NET deserializer code for Protobuf to the custom deserializer project

Add an Azure Stream Analytics Application project to the solution

10 :-

You are designing a solution that will copy Parquet files stored in an Azure Blob storage account to an Azure Data Lake Storage Gen2 account. The data will be loaded daily to the data lake and will use a folder structure of {Year}/{Month}/{Day}/. You need to design a daily Azure Data Factory data load to minimize the data transfer between the two accounts. Which two configurations should you include in the design? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
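The {Year}/{Month}/{Day}/ folder structure mentioned above can be sketched as a small Python helper that builds the path for one load date. This only illustrates the layout. The function name is hypothetical, and zero-padding the month and day is an assumption, since the question does not say how the segments are formatted.

```python
from datetime import date


def daily_folder(load_date):
    """Build a {Year}/{Month}/{Day}/ folder path for the given load date.

    Zero-padded month and day are an assumption made for this example.
    """
    return f"{load_date.year}/{load_date.month:02d}/{load_date.day:02d}/"


# Hypothetical load date.
path = daily_folder(date(2024, 3, 7))
print(path)
```

Keying each daily load to a single date-based folder means a run only needs to touch that day's files.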

11 :-

You have an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 contains a partitioned fact table named dbo.Sales and a staging table named stg.Sales that has the matching table and partition definitions. You need to overwrite the content of the first partition in dbo.Sales with the content of the same partition in stg.Sales. The solution must minimize load times. What should you do?

12 :-

You have a partitioned table in an Azure Synapse Analytics dedicated SQL pool. You need to design queries to maximize the benefits of partition elimination. What should you include in the Transact-SQL queries?

 

13 :-

You plan to create an Azure Synapse Analytics dedicated SQL pool. You need to minimize the time it takes to identify queries that return confidential information as defined by the company's data privacy regulations and the users who executed the queries. Which two components should you include in the solution? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

14 :-

You need to implement an Azure Databricks cluster that automatically connects to Azure Data Lake Storage Gen2 by using Azure Active Directory (Azure AD) integration. How should you configure the new cluster? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Cluster mode:

  • High concurrency
  • Premium
  • Standard

Advanced option to enable:

  • Azure Data Lake Storage Gen1 credential passthrough
  • Table access control

15 :-

You plan to monitor an Azure data factory by using the Monitor & Manage app. You need to identify the status and duration of activities that reference a table in a source database. Which three actions should you perform in sequence? To answer, move the actions from the list of actions to the answer area and arrange them in the correct order.

From the Data Factory monitoring app, add the Source user property to the Activity Runs table

From the Data Factory monitoring app, add the Source user property to the Pipeline Runs table

From the Data Factory authoring UI, publish the pipelines

From the Data Factory monitoring app, add a linked service to the Pipeline Runs table

From the Data Factory authoring UI, generate a user property for Source on all activities

From the Data Factory authoring UI, generate a user property for Source on all datasets

 

16 :-

You configure monitoring for a Microsoft Azure SQL Data Warehouse implementation. The implementation uses PolyBase to load data from comma-separated value (CSV) files stored in Azure Data Lake Storage Gen2 by using an external table. Files with an invalid schema cause errors to occur. You need to monitor for an invalid schema error. For which error should you monitor?

17 :-

You have an Azure subscription that contains a logical Microsoft SQL server named Server1. Server1 hosts an Azure Synapse Analytics dedicated SQL pool named Pool1. You need to recommend a Transparent Data Encryption (TDE) solution for Server1. The solution must meet the following requirements:

  • Track the usage of encryption keys.
  • Maintain the access of client apps to Pool1 in the event of an Azure datacenter outage that affects the availability of the encryption keys.

What should you include in the recommendation? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

To track encryption key usage:

  • Always Encrypted
  • TDE with customer-managed keys
  • TDE with platform-managed keys

To maintain client app access in the event of a datacenter outage:

  • Create and configure Azure key vaults in two Azure regions
  • Enable Advanced Data Security on Server1
  • Implement a client app by using a Microsoft .NET Framework data provider

18 :-

You are building an Azure Stream Analytics query that will receive input data from Azure IoT Hub and write the results to Azure Blob storage. You need to calculate the difference in readings per sensor per hour. How should you complete the query? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Options for the first selection:

  • LAG
  • LEAST
  • LEAD

Options for the second selection:

  • LIMIT DURATION
  • OFFSET
  • WHEN
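The calculation this question asks for, the difference between consecutive readings for each sensor, can be sketched in plain Python by remembering the previous value per sensor. This illustrates the arithmetic only and is not the Stream Analytics query. The tuple layout and the sample values are assumptions, and the sketch assumes one reading per sensor per hour, already ordered by hour.

```python
def reading_deltas(readings):
    """Compute the change from the previous reading for each sensor.

    readings: iterable of (sensor_id, hour, value) tuples ordered by hour.
    Returns (sensor_id, hour, delta) tuples; a sensor's first reading has
    no previous value, so it produces no output row.
    """
    previous = {}
    deltas = []
    for sensor_id, hour, value in readings:
        if sensor_id in previous:
            deltas.append((sensor_id, hour, value - previous[sensor_id]))
        previous[sensor_id] = value
    return deltas


# Hypothetical hourly readings for two sensors.
readings = [
    ("s1", 0, 20.0),
    ("s2", 0, 5.0),
    ("s1", 1, 22.5),
    ("s2", 1, 4.0),
    ("s1", 2, 21.0),
]
deltas = reading_deltas(readings)
print(deltas)
```

Keeping the previous value in a dictionary keyed by sensor is what stops one sensor's readings from being compared with another's.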

19 :-

You have an Azure Active Directory (Azure AD) tenant that contains a security group named Group1. You have an Azure Synapse Analytics dedicated SQL pool named dw1 that contains a schema named schema1. You need to grant Group1 read-only permissions to all the tables and views in schema1. The solution must use the principle of least privilege. Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order. NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Create a database role named Role1 and grant Role1 SELECT permissions to schema1

Create a database role named Role1 and grant Role1 SELECT permissions to dw1

Assign the Azure role-based access control (Azure RBAC) Reader role for dw1 to Group1

Create a database user in dw1 that represents Group1 and uses the FROM EXTERNAL PROVIDER clause

Assign Role1 to the Group1 database user

20 :-

You are designing an Azure Databricks table. The table will ingest an average of 20 million streaming events per day. You need to persist the events in the table for use in incremental load pipeline jobs in Azure Databricks. The solution must minimize storage costs and incremental load times. What should you include in the solution?
