Data Engineering

New

Lakehouse Table Optimization Control Panel

Frithjof Aarrestad Vassbø on 17 Aug 2024 18:51:07

We would like a user interface (control panel) where we can schedule Lakehouse table maintenance operations such as OPTIMIZE and VACUUM. It should also let us set the retention period for VACUUM operations.
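For context, here is a rough sketch of what such a scheduled job would end up running per table. It only builds the Delta Lake SQL statements (`OPTIMIZE` and `VACUUM ... RETAIN n HOURS`). The helper name `build_maintenance_sql` and its parameter are hypothetical, and the sketch assumes the statements are then passed to an existing Spark session in a notebook.

```python
from typing import List, Optional


def build_maintenance_sql(table: str, vacuum_retain_hours: Optional[int] = None) -> List[str]:
    """Build the Delta Lake SQL statements for one maintenance run.

    Hypothetical helper for illustration. If no retention is given,
    VACUUM is issued without a RETAIN clause, so the table's default applies.
    """
    statements = [f"OPTIMIZE {table}"]
    if vacuum_retain_hours is None:
        statements.append(f"VACUUM {table}")
    else:
        if vacuum_retain_hours < 0:
            raise ValueError("vacuum_retain_hours must be non-negative")
        statements.append(f"VACUUM {table} RETAIN {vacuum_retain_hours} HOURS")
    return statements


# Assumed usage inside a notebook that already has a `spark` session:
#   for stmt in build_maintenance_sql("sales", vacuum_retain_hours=168):
#       spark.sql(stmt)
```

Today this has to be scripted and scheduled by hand for each table, which is what the requested control panel would replace.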


We would like to set table maintenance settings and schedules at the Capacity, Workspace, Lakehouse and Table levels.


Settings applied at more granular levels would override the default settings applied at higher levels. As an example, we could apply maintenance settings on a specific Lakehouse or a specific Table, which would override the default Capacity settings.
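The override behaviour described above could work roughly like this. This is only a sketch of the idea, not an existing Fabric API: the function `resolve_settings`, the level names, and the setting keys (`vacuum_retain_hours`, `optimize_schedule`) are all made up for illustration.

```python
from typing import Any, Dict

# Ordered from least to most granular; later levels win.
LEVELS = ("capacity", "workspace", "lakehouse", "table")


def resolve_settings(overrides: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Merge per-level settings so that more granular levels override higher ones.

    `overrides` maps a level name to the settings defined at that level.
    A value of None means "not set here, inherit from the level above".
    """
    effective: Dict[str, Any] = {}
    for level in LEVELS:
        for key, value in overrides.get(level, {}).items():
            if value is not None:
                effective[key] = value
    return effective


# Example: capacity-wide defaults, with one table optimized more often.
settings = resolve_settings({
    "capacity": {"vacuum_retain_hours": 168, "optimize_schedule": "weekly"},
    "table": {"optimize_schedule": "daily"},
})
print(settings)
```

Here the table keeps the capacity-level VACUUM retention but gets its own OPTIMIZE schedule.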


We would like to have a visual overview of the maintenance/optimization status of our Lakehouse tables.

For example, indicators that show whether a table is suffering from the "small files problem", its average file size, etc.

A "health check" monitoring solution for our Lakehouse tables, if you will.
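As a rough illustration of the kind of indicator meant here, the sketch below derives an average file size and a "small files" flag from a table's file count and total size. In Delta Lake these two numbers are available as `numFiles` and `sizeInBytes` from `DESCRIBE DETAIL`. The 128 MiB threshold, the `TableHealth` type, and the `assess_table` function are assumptions chosen for the example, not recommended or official values.

```python
from dataclasses import dataclass

# Assumed threshold for illustration only; a sensible value depends on the workload.
SMALL_FILE_THRESHOLD_BYTES = 128 * 1024 * 1024


@dataclass
class TableHealth:
    num_files: int
    avg_file_size_bytes: float
    small_files_problem: bool


def assess_table(num_files: int, size_in_bytes: int,
                 threshold_bytes: int = SMALL_FILE_THRESHOLD_BYTES) -> TableHealth:
    """Compute simple health indicators from a table's file count and total size."""
    avg = size_in_bytes / num_files if num_files else 0.0
    # A single small file is not a problem; many small files are.
    flagged = num_files > 1 and avg < threshold_bytes
    return TableHealth(num_files, avg, flagged)


# Assumed usage with a Spark session:
#   row = spark.sql("DESCRIBE DETAIL sales").collect()[0]
#   health = assess_table(row["numFiles"], row["sizeInBytes"])
```

A control panel could show these values per table and highlight the flagged ones.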