Michal Mlaka on 06 Jun 2023 10:41:07
Please enable schemas in Lakehouses.
Publish a reference architecture showing how to organize Lakehouse resources in a bronze/silver/gold architecture (raw/..).
Publish a reference architecture showing how to organize Lakehouse dev/test/sandbox ... resources in the OneLake architecture.
Ted Vilutis (administrator) on 23 Oct 2023 19:02:06
This feature is in our roadmap. Please stay tuned. We will share more details as they become available.
- Comments (3)
RE: Organize tables in the Lakehouse
If we think of gold and silver as "table qualities", then a folder may not be an adequate structure in all cases. Sometimes we could instead read the metadata on a table, e.g., TBLPROPERTIES("quality" = "silver"), and use that convention to add a color or a gold/silver/bronze badge to the object, which would be a better way to express table quality.
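A minimal sketch of that convention, assuming Spark SQL's `ALTER TABLE ... SET TBLPROPERTIES` syntax and a "quality" property key as in the comment above. The helper names, the table name, and the text badges are hypothetical and only illustrate how a UI could derive a badge from the property:

```python
# Hypothetical helpers; only the "quality" TBLPROPERTIES convention comes from the comment.
BADGES = {"bronze": "[B]", "silver": "[S]", "gold": "[G]"}  # placeholder badges

def set_quality_sql(table: str, quality: str) -> str:
    """Build a Spark SQL statement that tags a table with a quality level."""
    quality = quality.lower()
    if quality not in BADGES:
        raise ValueError(f"unknown quality: {quality!r}")
    return f"ALTER TABLE {table} SET TBLPROPERTIES ('quality' = '{quality}')"

def badge_for(properties: dict) -> str:
    """Map a table's properties to a display badge; untagged tables get none."""
    return BADGES.get(properties.get("quality", "").lower(), "")

# "exchange_rates" is an illustrative table name.
print(set_quality_sql("exchange_rates", "Silver"))
print(badge_for({"quality": "silver"}))
```

The generated statement could be run from a notebook with `spark.sql(...)`. How a badge would actually be rendered is up to the product and is not something this sketch can show.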
RE: Organize tables in the Lakehouse
Hello,
I would like to be able to 'label' lakehouses with additional metadata. To be more precise, I'm building a workspace to collect relevant external data that is useful for reporting and for my Data Science projects. To do this effectively, I'm adopting the Medallion Architecture as described by Piethein Strengholt (Chief Data Officer Microsoft). For example, I want to collect exchange rates (EUR to another currency), so I have currently named and described the lakehouse in the following way (right-click, under lakehouse settings):
Name: ECB_EXR_Bronze
Description:
Database: European Central Bank - ECB
Dataflow: Exchange Rates - EXR
Medallion: Bronze (raw data)
Since I'm using multiple sources and multiple layers, it would be nice to be able to filter on these kinds of labels/tags/metadata to navigate quickly to the right lakehouse to continue working, run the right pipeline, or debug. Currently we can filter by Type, Product, and Owner, but it would be nice to include an additional filter related to the intent of the workspace, to capture the domain knowledge (from the example above: ECB exchange rates).
Regards,
Lammert
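Until such a filter exists, the naming convention above can serve as a stand-in for tags. The following is a rough sketch, assuming names follow a Source_Dataflow_Layer pattern like ECB_EXR_Bronze. The function names and every lakehouse name other than ECB_EXR_Bronze are made up for illustration:

```python
from typing import NamedTuple, Optional

class LakehouseTags(NamedTuple):
    source: str    # e.g. "ECB"
    dataflow: str  # e.g. "EXR"
    layer: str     # e.g. "bronze" (lower-cased)

def parse_name(name: str) -> Optional[LakehouseTags]:
    """Split a Source_Dataflow_Layer name into tags; None if it does not fit."""
    parts = name.split("_")
    if len(parts) != 3 or not all(parts):
        return None
    source, dataflow, layer = parts
    return LakehouseTags(source, dataflow, layer.lower())

def filter_by(names, **wanted):
    """Keep the names whose parsed tags match every requested value."""
    result = []
    for name in names:
        tags = parse_name(name)
        if tags is not None and all(getattr(tags, k) == v for k, v in wanted.items()):
            result.append(name)
    return result

# Only ECB_EXR_Bronze comes from the post; the other names are hypothetical.
names = ["ECB_EXR_Bronze", "ECB_EXR_Silver", "scratch"]
print(filter_by(names, layer="bronze"))  # ['ECB_EXR_Bronze']
```

This only works as long as every lakehouse follows the convention, which is part of why first-class tags would be preferable.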
RE: Organize tables in the Lakehouse
I also thought about this; it might be case by case. I have the following architecture:
- a workspace per domain
- a different DB (lakehouse / DWH) for each layer: bronze/silver/gold
But I don't know what the best way to handle different environments would be. Probably separate workspaces, which would be the equivalent of separate RGs, but I'm not sure whether CI/CD will be able to handle it. I would also like to hear about a reference architecture / best practices.
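The layout described in this comment can be written down as a small sketch: one workspace per domain and environment, with one lakehouse per layer inside each. The environment list, the naming patterns, and the "finance" domain are assumptions made for illustration, not an official recommendation:

```python
from itertools import product

LAYERS = ("bronze", "silver", "gold")
ENVIRONMENTS = ("dev", "test", "prod")  # assumed set of environments

def plan_layout(domains, environments=ENVIRONMENTS, layers=LAYERS):
    """Return {workspace name: [lakehouse names]} for each domain/environment pair."""
    layout = {}
    for domain, env in product(domains, environments):
        workspace = f"{domain}-{env}"  # hypothetical naming pattern
        layout[workspace] = [f"{domain}_{layer}" for layer in layers]
    return layout

# "finance" is an illustrative domain name.
for workspace, lakehouses in plan_layout(["finance"]).items():
    print(workspace, lakehouses)
```

Here the lakehouse names are the same in every environment and only the workspace name changes. That is a design choice that may make promotion between environments simpler, though whether CI/CD can handle it is exactly the open question raised above.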