KeyVault getting a secret out fails

Here is my code. It was working a few days ago, and now it throws an error when trying to fetch the secret out of the vault. I am using the latest version of Azure's Python SDK which was...

ARM Deployment using KeyVault and certificates for Azure Data Lake Store

I basically want to create my HDI/Spark Cluster which accesses an Azure Data Lake Store by using ARM templates and also Azure Key Vault. So far I created the cluster manually and stored the ARM...

How to copy files and folders from one ADLS to another one on a different subscription?

I need to be able to copy files and folders from one DataLake to another DataLake on a different subscription. I'm in possession of both the auth token and the secret key. I've tried different solution...

Structured Streaming - java.util.NoSuchElementException: key not found

We've recently started with Structured Streaming on Azure Databricks. Currently we're consuming events from Event Hubs and writing them to Azure Data Lake Store as Parquet. I'm able to write...

How to loop through Azure Datalake Store files in Azure Databricks

I am currently listing files in Azure Datalake Store gen1 successfully with the following command: dbutils.fs.ls('mnt/dbfolder1/projects/clients') The structure of this folder is -...

Azure Databricks to Event Hub

I am very new to Databricks, so please pardon me. Here is my requirement: I have data stored in Azure DataLake. As per the requirement, we can only access data via an Azure Databricks notebook. We have...

Intermittent errors using C# Azure Datalake Gen1 Client "The underlying connection was closed"

I am logging some data to a Gen1 Azure Datalake Store, using the Microsoft.Azure.DataLake.Store driver. I am authenticating and creating a client like so: var adlCreds = await...

Can azure data lake files be filtered based on Last Modified time using azure python sdk?

I am trying to perform in-memory operations on files stored in azure datalake. I am unable to find documentation regarding using a matching pattern without using the ADL Downloader. For a single...

Azure databricks: Installing maven libraries to cluster through API causes error (Library resolution failed. Cause: java.lang.RuntimeException)

I am trying to install some Maven libraries on an existing or newly created Azure Databricks cluster through the API from Python. Cluster details: Python 3 5.5 LTS (includes Apache Spark 2.4.3,...

Connecting C# Application to Azure Databricks

I am currently working on a project where we have data stored on Azure Datalake. The Datalake is hooked up to Azure Databricks. The requirement is that Azure Databricks be connected to a...

How to access captured data from Event Hub in Azure Data Lake Storage Gen2 using Python

I'm using the connection_string to access an Azure Data Lake Gen2 storage, in which lots of Avro files were stored by an Event Hubs Capture, under the typical directory structure containing...

Error Installing Pyarrow with Python 3.7.4

I'm developing a Python script to deploy an Azure Function App. For this reason I can't use another Python version to make this easier. In the Azure portal I get this error: Azure Function app pyarrow...

Data Factory New Linked Service connection failure ACL and firewall rule

I'm trying to move data from a datalake stored in Azure Data Lake Storage Gen1 to a table in an Azure SQL database. In Data Factory "new Linked Service" when I test the connection I get a...

Create SQL table from parquet files

I am using R to handle large datasets (largest dataframe 30.000.000 x 120). These are stored in Azure Datalake Storage as parquet files, and we would need to query these daily and restore these in...

How good is Azure Data Lake for storing an SQL database used for Power BI visualizations?

We have an Azure SQL database where we collect a large amount of sensor data and we regularly extract the data from it and transform it a bit with a python script. The end result is a pandas...

How to copy data from a REST API using Data Factory and save it in Data Lake?

I'm trying to fetch data from a REST API and save the JSON string into DataLake, and I'm getting an error. I've followed the steps mentioned...

Azure Datalake Analytics U-SQL with Azure Datalake Storage Gen 2

Question: what is the path forward for using ADLA (U-SQL) with ADLS (Gen2)? I have been running Azure Data Lake Analytics (U-SQL) jobs via Azure Data Factory (ADF v2) with Azure Data Lake Store...

Databricks pyspark, Difference in result of Dataframe.count() and Display(Dataframe) while using header='false'

I am reading a CSV file (stored in Azure Data Lake Store) into a DataFrame with the following code: df = spark.read.load(filepath, format="csv", schema = mySchema, header="false",...

Is it possible to create a SAS token for a directory in DataLake Gen2 storage?

I have an Azure Function that triggers from a directory (namespace) nested within an ADLS Gen 2 storage...

How to configure Spark / Databricks memory to collect large R data.frame?

Out-of-memory issues caused by collecting a Spark DataFrame into an R data.frame have been discussed here several times (e.g. here or here). However, none of the answers seems to be usable in my...

How to connect and access Azure Datalake Gen1 storage using Azure AD username and password only - C#

I want to connect and access Azure Datalake Gen1 storage using Azure AD username and password only. I have a service account that has access to the Azure Datalake Gen1 storage. I am able to...

How to insert into Delta table in parallel

I have a process which, in short, runs 100+ instances of the same Databricks notebook in parallel on a pretty powerful cluster. Each notebook at the end of its process writes roughly 100 rows of data to the...

How to fix 'Could not find a version that satisfies the requirement' for install_requires list when pip installing in custom package?

I am trying to build my own Python package (installable by pip) using the twine package. This is all going well right up until the point where I try to pip install my actual package (so after...

Unable to run the Powershell Script using SQL Server Job Agent

I am trying to execute my PowerShell script using the SQL Server Job Agent but am unable to do so. I am able to execute the script successfully via the PowerShell prompt. Here in the Agent I am Operating...

Azure Synapse Polybase/External tables - return only latest file

We have files partitioned in the datalake and are using an Azure Synapse SQL Serverless pool to query them using external tables before visualising in Power BI. Files are stored in the following...

Trying to open parquet in Synapse - cannot be opened because it does not exist or it is used by another process

I am trying to open a Parquet file that is generated by Stream Analytics and stored in Azure Datalake V2. I have connected the datalake and Synapse successfully, but I keep getting...

Cannot upgrade azure cli to the latest version

az --version shows that an updated version is available, as shown below. $ az --version azure-cli 2.18.0 * core 2.18.0 * telemetry ...

Using apache-airflow-providers-snowflake on airflow (no module named Snowflake)

I have installed the package apache-airflow-providers-snowflake on Airflow running in Docker and I am getting the error "No module named Snowflake". Please refer to the attachment (check the error mentioned for the...

Conflicting with version dependencies when running pip install

I am having issues with version dependencies when running pip install in Docker. However, when installing on my Mac without Docker, using just virtualenv, it works perfectly fine. These are the versions I...

Cannot create Append Blobs in Azure Data Lake Gen2 using python azure-storage-file-datalake SDK

My use case requires me to continuously write incoming messages into files stored in an Azure Data Lake Gen2 storage account. I am able to create the files by triggering a function, which uses the...