Pascal Bellerose on 15 Oct 2021 14:40:44
This is coming from past experience using RStudio and practically any other programming language.
Make it so that when I read the data source in the first step, the data is cached and that cache is used for subsequent steps instead of downloading a new preview every time.
For example, if I set data profiling to run on the full dataset, it should keep a cached copy of the dataset in memory so it is faster to process.
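For what it's worth, the closest thing M offers today, as far as I know, is Table.Buffer, which pins a table in memory, but only for the duration of a single evaluation; it doesn't survive clicking between steps in the editor, which is exactly the gap this idea is about. A rough sketch (the file path is just a placeholder):

let
    // Read the CSV and promote the first row to headers
    Source = Csv.Document(File.Contents("c:\myfile.csv")),
    Promoted = Table.PromoteHeaders(Source),
    // Table.Buffer keeps the table in memory for this evaluation only
    Buffered = Table.Buffer(Promoted),
    // Downstream steps work off the in-memory copy
    Top10 = Table.FirstN(Buffered, 10)
in
    Top10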
That kind of caching is exactly how I would do it in RStudio (or any other R programming interface).
I would load the source on the first line, then transform it by executing any of the steps in the script.
example:
library(readr)
dset <- read_csv("c:/myfile.csv")
head(dset, 10)
This stores the data read from "myfile.csv" in a data frame named "dset".
I can later refer to this object and view a preview of its contents using the head() function.
I can even tell it how many records I want in the preview.
This is very effective even when reading large files (10M lines).
I think the whole issue comes from the fact that Power Query M applies changes to data iteratively: it has to read from the source and re-apply all the steps every time I click on a step, and that's what makes it so damn slow.
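To illustrate, a query is just a chain of steps like the sketch below (the step names and the [Amount] column are made up for the example); as I understand it, selecting any step in the editor re-evaluates the whole chain from Source, because nothing in between is cached:

let
    // Every click on a step below re-reads the file from here
    Source = Csv.Document(File.Contents("c:\myfile.csv")),
    Promoted = Table.PromoteHeaders(Source),
    // Selecting this step first re-runs Source and Promoted
    Filtered = Table.SelectRows(Promoted, each [Amount] > 0)
in
    Filtered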
This is why I'm taking the time to submit this idea.