Looking for a large relational dataset. It can be one of the open datasets, but it must follow a relational data model, so that the tables are connected, whether normalized or denormalized. We also need an FRS (Functional Requirements Specification) and use cases that can be derived from the dataset, such as ETL and BI reporting.
Industries preferred: Aviation, Retail, Healthcare [Exception: Northwind / Sakila / AdventureWorks]
3 - Large Fact Tables (Tables should be partitioned by Date Fields)
5 to 10 - Dimension Tables
FRS - End-to-end data flow with any transformation requirements and use cases (I need those use cases for cleansing and transforming the datasets; I am not looking for any code. Overall, I am looking for an FRS / BRD document.)
Reporting - What useful insights can be derived from the data? For example, through joins, aggregations, staging tables, etc.
Data Authenticity - Can be completely mock data, but I would appreciate meaningful data, such that any correlations make sense.
Size - 5-10 GB of data when uncompressed.
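To illustrate what "meaningful" mock data and date partitioning could look like, here is a minimal Python sketch. It is only an illustration, not part of the deliverable. The flight columns, the price formula, and the function names (generate_flights, partition_by_month, pearson) are all made up for this example.

```python
import random
from collections import defaultdict
from datetime import date, timedelta


def generate_flights(n_rows, seed=42):
    """Return mock fact rows as (flight_date, distance_km, ticket_price)."""
    rng = random.Random(seed)
    start = date(2023, 1, 1)
    rows = []
    for _ in range(n_rows):
        flight_date = start + timedelta(days=rng.randrange(365))
        distance_km = rng.randint(200, 5000)
        # Assumed rule: price grows with distance plus some noise,
        # so the two columns are correlated instead of independent.
        ticket_price = round(50 + 0.1 * distance_km + rng.gauss(0, 20), 2)
        rows.append((flight_date, distance_km, ticket_price))
    return rows


def partition_by_month(rows):
    """Group rows under a 'YYYY-MM' key, mimicking partitioning by a date field."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[0].strftime("%Y-%m")].append(row)
    return dict(partitions)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5
```

With data generated like this, a report on price versus distance would show a clear positive relationship, which is the kind of correlation that should make sense in the final dataset.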
Overall, the requirement is to build sample practice projects that give an understanding of how end-to-end ETL works, from ingesting data to reporting.
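As a rough picture of the end-to-end flow I have in mind (ingest, cleanse, stage, load, report), here is a small sketch using Python's built-in sqlite3 module. It is an assumption-laden toy: the table names (dim_store, stg_sales, fact_sales), the sample rows, and the cleansing rules are hypothetical and only show the shape of the practice project, at a tiny scale.

```python
import sqlite3

# Hypothetical raw input: (sale_date, store_code, amount) as untidy strings.
RAW_SALES = [
    ("2023-01-05", " S1 ", "100.50"),
    ("2023-01-06", "s2", "80"),
    ("2023-02-01", "S1", ""),  # missing amount, dropped during cleansing
    ("2023-02-03", "S2", "19.50"),
]
STORES = [("S1", "North"), ("S2", "South")]


def cleanse(rows):
    """Trim and upper-case store codes, cast amounts, skip rows without an amount."""
    cleaned = []
    for sale_date, store_code, amount in rows:
        if not amount.strip():
            continue
        cleaned.append((sale_date, store_code.strip().upper(), float(amount)))
    return cleaned


def run_pipeline(raw_rows, stores):
    """Load a dimension, a staging table and a fact table, then build a report."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_store (store_code TEXT PRIMARY KEY, region TEXT);
        CREATE TABLE stg_sales (sale_date TEXT, store_code TEXT, amount REAL);
        CREATE TABLE fact_sales (
            sale_date TEXT,
            store_code TEXT REFERENCES dim_store(store_code),
            amount REAL
        );
        """
    )
    conn.executemany("INSERT INTO dim_store VALUES (?, ?)", stores)
    conn.executemany("INSERT INTO stg_sales VALUES (?, ?, ?)", cleanse(raw_rows))
    # Only staged rows that match a known store reach the fact table.
    conn.execute(
        """
        INSERT INTO fact_sales
        SELECT s.sale_date, s.store_code, s.amount
        FROM stg_sales s
        JOIN dim_store d ON d.store_code = s.store_code
        """
    )
    # Reporting step: join fact to dimension and aggregate by region and month.
    report = conn.execute(
        """
        SELECT d.region, substr(f.sale_date, 1, 7) AS month, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_store d ON d.store_code = f.store_code
        GROUP BY d.region, month
        ORDER BY d.region, month
        """
    ).fetchall()
    conn.close()
    return report
```

Calling run_pipeline(RAW_SALES, STORES) returns one row per region and month with the summed sales amount. The real project would do the same steps against the 5-10 GB dataset, with the transformations driven by the FRS.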