
Data Engineering

Planned

Organize tables in the Lakehouse

Vote (62)

Michal Mlaka on 06 Jun 2023 10:41:07

Please enable schemas in the Lakehouses.

Publish a reference architecture showing how to organize Lakehouse resources in a bronze/silver/gold architecture. (raw/..)

Publish a reference architecture showing how to organize Lakehouse dev/test/sandbox ... resources in the OneLake architecture.



Ted Vilutis (administrator) on 23 Oct 2023 19:02:06

This feature is on our roadmap. Please stay tuned. We will share more details as they become available.

Comments (3)

Jonathan Boarman on 23 May 2024 18:38:53

RE: Organize tables in the Lakehouse

If we think of gold and silver as "table qualities", then a folder may not be an adequate structure in all cases. Sometimes we could instead read the metadata on a table, e.g., TBLPROPERTIES("quality" = "silver"), and use that convention to add a color or a gold/silver/bronze badge to the object, which would be a better way to express table qualities.
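A minimal sketch of that convention, assuming a Spark SQL engine that accepts ALTER TABLE ... SET TBLPROPERTIES. The property key "quality", the fallback label "unrated", and both helper names are illustrative assumptions, not part of any Fabric API.

```python
# Hypothetical helpers: build the Spark SQL that tags a table with a quality
# level, and map a table's properties to a badge label a UI could display.
VALID_QUALITIES = ("bronze", "silver", "gold")


def set_quality_sql(table: str, quality: str) -> str:
    """Return an ALTER TABLE statement that stores the quality as a table property."""
    quality = quality.lower()
    if quality not in VALID_QUALITIES:
        raise ValueError(f"unknown quality: {quality!r}")
    # The table name is interpolated as-is; validate or quote it in real use.
    return f"ALTER TABLE {table} SET TBLPROPERTIES ('quality' = '{quality}')"


def quality_badge(properties: dict) -> str:
    """Pick a badge label from table properties; untagged tables get 'unrated'."""
    quality = properties.get("quality", "").lower()
    return quality if quality in VALID_QUALITIES else "unrated"
```

The generated statement could be passed to whatever SQL entry point the engine exposes; a table without the property simply shows no medallion badge.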


Lammert Heijnen on 21 Feb 2024 09:41:27

RE: Organize tables in the Lakehouse

Hello,

I would like to have the opportunity to 'label' lakehouses with additional metadata.

To be more precise, I'm building a workspace to collect relevant external data that is useful for reporting and for my Data Science projects. To do this effectively, I'm adopting the Medallion Architecture as described by Piethein Strengholt (Chief Data Officer Microsoft). For example, I want to collect the exchange rates (EUR to another currency). So I currently named and described the lakehouse in the following way (right-click under lakehouse settings):

Name: ECB_EXR_Bronze
Description:
Database: European Central Bank - ECB
Dataflow: Exchange Rates - EXR
Medallion: Bronze (raw data)

Since I'm using multiple sources and multiple layers, it would be nice to be able to filter on these kinds of labels/tags/metadata to navigate quickly to the right lakehouse to continue working, run the right pipeline, or debug. Currently we can filter by Type, Product, and Owner, but it would be nice to include an additional filter related to the intent of the workspace, to include the domain knowledge (from the example above: ECB exchange rates).

Regards,
Lammert
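Until such a filter exists, a naming convention like the one above can be filtered outside the UI. The sketch below assumes every lakehouse name follows the three-part pattern SOURCE_DATAFLOW_Layer (as in ECB_EXR_Bronze); the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LakehouseLabel:
    source: str    # e.g. "ECB" (European Central Bank)
    dataflow: str  # e.g. "EXR" (exchange rates)
    layer: str     # medallion layer, lower-cased, e.g. "bronze"


def parse_label(name: str) -> LakehouseLabel:
    """Split a name such as 'ECB_EXR_Bronze' into its three label parts."""
    parts = name.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected SOURCE_DATAFLOW_Layer, got {name!r}")
    source, dataflow, layer = parts
    return LakehouseLabel(source, dataflow, layer.lower())


def filter_by(names, **criteria):
    """Return the names whose parsed label matches every given field."""
    matches = []
    for name in names:
        label = parse_label(name)
        if all(getattr(label, key) == value for key, value in criteria.items()):
            matches.append(name)
    return matches
```

For example, filter_by(names, source="ECB", layer="bronze") would keep only the raw ECB lakehouses from a list of names.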


Piotr Palka on 13 Jun 2023 14:25:41

RE: Organize tables in the Lakehouse

I also thought about this; it might be case by case. I have the following architecture: a workspace per domain, and a different DB (lakehouse / DWH) for each layer (bronze/silver/gold).

But I don't know what the best way to handle different environments will be. Probably separate workspaces, which would be the equivalent of separate RGs, but I'm not sure whether CI/CD will be able to handle it.

I would also like to hear about some reference architecture / best practices.
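The layout described above (a workspace per domain and environment, one lakehouse per layer) could be enumerated as follows. This is only a sketch of one possible naming scheme; the separators, the name patterns, and the function name are assumptions, not an official recommendation.

```python
LAYERS = ("bronze", "silver", "gold")


def plan_layout(domains, environments):
    """Map each domain/environment workspace name to its per-layer lakehouse names."""
    layout = {}
    for domain in domains:
        for env in environments:
            workspace = f"{domain}-{env}"  # hypothetical pattern: one workspace per pair
            # Lakehouse names stay identical across environments, so only the
            # workspace differs between dev, test, and so on.
            layout[workspace] = [f"{domain}_{layer}" for layer in LAYERS]
    return layout
```

Keeping lakehouse names the same in every environment is a design choice made here so that code referring to a lakehouse by name would not need to change between workspaces.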