For a deployment architecture covering ML batch-scoring scenarios with R code, the core components of a deployable architecture are:
- Azure Data Factory
- Azure Data Lake Storage
- Azure Databricks
Step 0: This is a nice community post to read if SparkR is new to you:
Important quote:
- “The SparkR API presents a full R interface, supplemented with the {SparkR} package. As an experienced R user, you will be familiar with the R data.frame object. Here's the critical point - SparkR has its own DataFrame object, which is not the same thing as an R data.frame. You can convert between them easily (sometimes too easily), but you must respect which is which.”
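The distinction in the quote can be illustrated with a short sketch. This assumes a running Spark session (one already exists in a Databricks notebook); `faithful` is a built-in R dataset used only for illustration:

```r
library(SparkR)

# On Databricks a Spark session already exists; elsewhere, start one:
# sparkR.session()

# A plain R data.frame lives in local driver memory
local_df <- faithful
class(local_df)        # "data.frame"

# createDataFrame() converts it to a distributed SparkR DataFrame
spark_df <- createDataFrame(local_df)
class(spark_df)        # "SparkDataFrame"

# SparkR verbs operate on the distributed object
filtered <- filter(spark_df, spark_df$waiting > 70)

# collect() brings a result back as a plain R data.frame --
# safe only when the result fits in driver memory
back_local <- collect(filtered)
class(back_local)      # "data.frame"
```

The practical rule: keep large data in the `SparkDataFrame` form as long as possible, and only `collect()` results small enough for the driver.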
Step 1: Create an Azure Databricks Workspace
Step 2: Create an ADLS (Azure Data Lake Storage)
- Note: Create the ADLS account in the same region where you provisioned Azure Databricks
Step 3: Create a cluster inside Databricks
Step 4: Execute and understand this sample code (SparkR + ADLS.r). Tasks performed in this sample:
- ADLS (Azure Data Lake Storage Gen1) Mount for usage with R and SparkR
- Usage of Databricks dbutils library
- R and SparkR read/write tasks
- DataFrame/data.frame mapping between R and SparkR
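A condensed sketch of the tasks listed above. All account names, paths, and secret scopes are placeholders, not values from the sample, and this assumes `dbutils` is exposed to R on your Databricks runtime (if not, run the mount from a Python or Scala cell):

```r
library(SparkR)

# Mount ADLS Gen1 via dbutils using service-principal (OAuth2) credentials.
# Every <...> value below is an illustrative placeholder.
dbutils.fs.mount(
  source = "adl://<your-adls-account>.azuredatalakestore.net/",
  mountPoint = "/mnt/adls",
  extraConfigs = list(
    "dfs.adls.oauth2.access.token.provider.type" = "ClientCredential",
    "dfs.adls.oauth2.client.id" = "<application-id>",
    "dfs.adls.oauth2.credential" = dbutils.secrets.get(scope = "<scope>", key = "<key>"),
    "dfs.adls.oauth2.refresh.url" = "https://login.microsoftonline.com/<tenant-id>/oauth2/token"
  )
)

# SparkR read/write against the mount point
sdf <- read.df("/mnt/adls/input.csv", source = "csv",
               header = "true", inferSchema = "true")
write.df(sdf, path = "/mnt/adls/output_parquet",
         source = "parquet", mode = "overwrite")

# DataFrame <-> data.frame mapping
rdf  <- collect(sdf)            # SparkDataFrame -> R data.frame (driver memory)
sdf2 <- createDataFrame(rdf)    # R data.frame -> SparkDataFrame (distributed)
```

Storing the service-principal secret in a Databricks secret scope (rather than in the notebook) keeps credentials out of source control.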
Here are some additional resources for understanding the orchestration of R model execution:
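For orientation before diving into those resources: a common orchestration pattern is an Azure Data Factory pipeline that invokes the scoring notebook through a Databricks Notebook activity. A minimal activity sketch, where the activity name, linked service, notebook path, and parameters are all placeholders:

```json
{
  "name": "RunRScoringNotebook",
  "type": "DatabricksNotebook",
  "linkedServiceName": {
    "referenceName": "AzureDatabricksLinkedService",
    "type": "LinkedServiceReference"
  },
  "typeProperties": {
    "notebookPath": "/Shared/score_batch",
    "baseParameters": { "input_path": "/mnt/adls/input.csv" }
  }
}
```

ADF triggers (schedule or tumbling window) can then run this pipeline on the batch cadence you need.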