Scope
- Use cases
  - Join raw events with user/product dimensions.
  - Create enriched fact tables for BI and ML features.
- Non-use cases
  - Building SCD history (see S001, S008).
  - Heavy aggregations (see S005).
Common steps
Build context
- Identify event streams to enrich (e.g., ProductClicked, CheckoutStarted).
- Confirm dimension availability and freshness (e.g., dim_user_current, dim_product_current).
Implementation notes
- Use temporal joins for point-in-time correctness when dimensions change frequently.
- If dimensions change rarely, a simple left join on current snapshots is often sufficient and simpler.
- Consider projecting only needed columns to keep costs low.
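A point-in-time join from the first note might look like the sketch below. It assumes ProductClicked has an event-time attribute named event_time with a watermark, and that a versioned dimension table (here the hypothetical dim_product_versioned, with a primary key on product_id) is available. All column names are illustrative and not taken from the actual schemas.

```sql
-- Sketch only: table dim_product_versioned and all column names are assumptions.
-- Each click is matched to the product row that was valid at the click's event time.
SELECT
  e.event_id,
  e.product_id,
  e.event_time,
  p.category,   -- project only the dimension columns that are needed
  p.price
FROM ProductClicked AS e
LEFT JOIN dim_product_versioned FOR SYSTEM_TIME AS OF e.event_time AS p
  ON e.product_id = p.product_id;
```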
RESINK.AI recommendations
Example
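A minimal sketch of the simple case, assuming the dimensions change rarely enough that a left join on the current snapshots is acceptable. The view name product_clicked_enriched and the columns country and category are hypothetical placeholders.

```sql
-- Sketch only: the view name and all column names are assumptions.
-- Enrich click events with current user and product attributes.
CREATE TEMPORARY VIEW product_clicked_enriched AS
SELECT
  e.event_id,
  e.user_id,
  e.product_id,
  e.event_time,
  u.country,    -- from the user dimension
  p.category    -- from the product dimension
FROM ProductClicked AS e
LEFT JOIN dim_user_current AS u
  ON e.user_id = u.user_id
LEFT JOIN dim_product_current AS p
  ON e.product_id = p.product_id;
```

In streaming mode a regular join like this keeps state for both inputs, which is one reason to project only the columns that downstream consumers need.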
Variations
- Enrich checkout funnel events
Troubleshooting
Partitioning syntax not supported
Flink SQL has limited support for partitioning syntax in CREATE TABLE statements. For complex partitioning schemes, create tables without partitioning first, then add partitioning via ALTER TABLE or use simpler partitioning patterns.
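One "simpler pattern" is to partition by a plain column and compute its value at insert time, since PARTITIONED BY takes column names rather than expressions. The sketch below assumes a filesystem sink; the table name, path, format, and columns are all hypothetical.

```sql
-- Sketch only: table name, path, format, and columns are assumptions.
-- Partition by a plain STRING column instead of an expression.
CREATE TABLE fact_events_enriched (
  event_id   STRING,
  user_id    STRING,
  event_time TIMESTAMP(3),
  event_date STRING
) PARTITIONED BY (event_date)
WITH (
  'connector' = 'filesystem',
  'path' = 'file:///tmp/fact_events_enriched',
  'format' = 'parquet'
);

-- Derive the partition value when writing.
INSERT INTO fact_events_enriched
SELECT
  event_id,
  user_id,
  event_time,
  DATE_FORMAT(event_time, 'yyyy-MM-dd') AS event_date
FROM ProductClicked;
```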
Missing enriched fields
Confirm join keys and nullability. Use LEFT JOIN to retain events even when the dimension row is missing. Consider adding default values for downstream consumers.
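A sketch of the default-value approach, assuming hypothetical dimension columns country and segment. The literal defaults and the dim_user_missing flag are illustrative choices, not an established convention.

```sql
-- Sketch only: column names and default literals are assumptions.
SELECT
  e.event_id,
  e.user_id,
  e.event_time,
  COALESCE(u.country, 'UNKNOWN')     AS country,
  COALESCE(u.segment, 'UNSEGMENTED') AS segment,
  u.user_id IS NULL                  AS dim_user_missing  -- TRUE when no dimension row matched
FROM CheckoutStarted AS e
LEFT JOIN dim_user_current AS u
  ON e.user_id = u.user_id;
```

Keeping an explicit flag next to the defaults lets consumers tell a genuinely unknown value apart from a failed lookup.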
Current snapshot filtering performance
For frequently changing dimensions, the WHERE valid_to = TIMESTAMP '2099-12-31 23:59:59' pattern works well for current snapshots. For better performance with large dimension tables, consider creating materialized views.
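The filter can be wrapped once so that every enrichment query reuses it. The sketch below uses a plain CREATE VIEW as a stand-in. In Flink SQL this defines a logical view, so whether it can actually be materialized depends on your platform and version. The source table dim_user_history and its columns are hypothetical.

```sql
-- Sketch only: dim_user_history and its columns are assumptions.
-- Expose just the open-ended (current) rows of an SCD-style dimension.
CREATE VIEW dim_user_current_v AS
SELECT
  user_id,
  country,
  segment
FROM dim_user_history
WHERE valid_to = TIMESTAMP '2099-12-31 23:59:59';
```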
Join skew and performance
If a few keys are very hot, try salting or broadcasting the smaller dimension. Limit selected columns and consider using temporal joins for better performance.
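Broadcasting the smaller dimension might be expressed with a join hint, as in the sketch below. This assumes a Flink version that supports the BROADCAST join hint and a job running in batch mode. Whether the hint is honored in your setup should be verified, and the column names are hypothetical.

```sql
-- Sketch only: assumes BROADCAST join hints are available (batch execution).
-- The hint names the alias of the small side to broadcast.
SELECT /*+ BROADCAST(p) */
  e.event_id,
  e.product_id,
  p.category
FROM ProductClicked AS e
LEFT JOIN dim_product_current AS p
  ON e.product_id = p.product_id;
```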

