Data Science

Needs Votes

Support for developing python/R modules files in Spark

Matthias Wong on 05 Jun 2023 06:14:51

Very excited about Fabric in general, and the support for data scientists with delta as the underlying interoperability.


However, it is currently impractical to build a sophisticated Fabric Spark pipeline with more than 2 or 3 python files. The current setup really supports one notebook at a time, rather than a folder of python files/modules. To include python files as reference files/modules, we have to upload them or provide an ABFS link one at a time.
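To illustrate what "work as normal python files for importing" means, here is a minimal, hedged sketch. The file name `helpers.py` and its `clean` function are hypothetical stand-ins for a folder of workspace module files; the point is that once a folder of files is reachable on the driver, standard Python imports just work, with no per-file upload or ABFS link step.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a folder of module files that would live in the
# workspace; in Fabric today each such file must be attached individually
# (uploaded or referenced via an ABFS link) before it can be imported.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "helpers.py").write_text(
    "def clean(s):\n"
    "    return s.strip().lower()\n"
)

# Once the folder is on sys.path, its files behave as normal python modules.
# This is the behaviour the idea asks Fabric to provide for a whole folder
# of files at once, rather than one reference file at a time.
sys.path.insert(0, str(module_dir))
helpers = importlib.import_module("helpers")

print(helpers.clean("  Hello "))
```

On a Spark cluster the files would additionally need to be distributed to executors (for example via `SparkContext.addPyFile`), which is exactly the per-file step that becomes tedious beyond a handful of modules.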


In contrast, Databricks allows "workspace files", where developers can author an arbitrary number of python files in the workspace, which then work as normal python files for importing.

Work with Python and R modules | Databricks on AWS


The ability to work with an arbitrary number of python files will be essential for any serious Spark pipeline development.