Scope
- Use cases
- Rebuild in-memory/table-view dimension tables from warehouse storage.
- Use with stream enrichment jobs that JOIN against entity tables.
- Non-use cases
- Initial ingestion of raw events.
- Complex SCD history creation (see S008).
Common steps
Build context
- Confirm entities persisted previously (e.g., dim_user, dim_product).
- Identify point-in-time or current snapshot semantics required for lookups.
Implementation notes
- For SCD Type 2, build current snapshot using ROW_NUMBER() ... PARTITION BY ... ORDER BY valid_from DESC and filter rn = 1.
- For static dimensions or infrequent updates, broadcast the dimension table for efficient joins.
- Use temporal table joins with primary keys and event-time for point-in-time correctness.
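The current-snapshot note above can be sketched as follows. This is a minimal illustration, not the pipeline's actual job: it runs the ROW_NUMBER() query in an in-memory SQLite database as a stand-in for the warehouse table, and the column names (user_id, tier, valid_from) and sample rows are assumptions. Only the shape of the query is meant to carry over to Flink SQL.

```python
import sqlite3

# Hypothetical SCD Type 2 history for dim_user: (user_id, tier, valid_from).
USER_HISTORY = [
    (1, "free", "2024-01-01"),
    (1, "pro", "2024-03-01"),
    (2, "free", "2024-02-01"),
]

# Rank versions per key, newest first, and keep only the newest (rn = 1).
CURRENT_SNAPSHOT_SQL = """
SELECT user_id, tier, valid_from
FROM (
    SELECT user_id, tier, valid_from,
           ROW_NUMBER() OVER (
               PARTITION BY user_id
               ORDER BY valid_from DESC
           ) AS rn
    FROM dim_user
) AS ranked
WHERE rn = 1
ORDER BY user_id
"""


def current_snapshot(history):
    """Return the latest version of each user_id from SCD2 history rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE dim_user (user_id INTEGER, tier TEXT, valid_from TEXT)"
        )
        conn.executemany("INSERT INTO dim_user VALUES (?, ?, ?)", history)
        return conn.execute(CURRENT_SNAPSHOT_SQL).fetchall()
    finally:
        conn.close()


snapshot = current_snapshot(USER_HISTORY)
print(snapshot)
```

Each key appears exactly once in the result, carrying the attributes of its most recent version, which is the table a "current" lookup join should read from.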
RESINK.AI recommendations
Example
Variations
- Paimon catalog alternative
- Building current product snapshot
Troubleshooting
ROW_NUMBER() with DESC ordering not supported
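In streaming mode, Flink's OVER windows only support ascending order on a time attribute. ROW_NUMBER() with ORDER BY ... DESC is accepted only when the planner recognizes the Top-N or deduplication pattern, that is, when an outer query filters on the row number (for example, rn = 1). For current snapshots that fall outside that pattern, use an alternative approach.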
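One possible alternative, sketched here under assumptions, avoids ROW_NUMBER() entirely: aggregate MAX(valid_from) per key and join back to the dimension table. The query is run in SQLite as a stand-in, and the dim_product columns (product_id, price_cents, valid_from) and sample rows are hypothetical. Whether this is the approach the original page recommended is not known.

```python
import sqlite3

# Hypothetical SCD Type 2 history: (product_id, price_cents, valid_from).
PRODUCT_HISTORY = [
    (10, 500, "2024-01-01"),
    (10, 650, "2024-04-01"),
    (20, 900, "2024-02-15"),
]

# Find the newest valid_from per key, then join back to fetch that row.
LATEST_VERSION_SQL = """
SELECT d.product_id, d.price_cents, d.valid_from
FROM dim_product AS d
JOIN (
    SELECT product_id, MAX(valid_from) AS latest_valid_from
    FROM dim_product
    GROUP BY product_id
) AS latest
  ON d.product_id = latest.product_id
 AND d.valid_from = latest.latest_valid_from
ORDER BY d.product_id
"""


def latest_versions(history):
    """Return the newest row per product_id without using ROW_NUMBER()."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE dim_product "
            "(product_id INTEGER, price_cents INTEGER, valid_from TEXT)"
        )
        conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", history)
        return conn.execute(LATEST_VERSION_SQL).fetchall()
    finally:
        conn.close()


latest = latest_versions(PRODUCT_HISTORY)
print(latest)
```

Unlike rn = 1, this form returns more than one row for a key when two versions share the same valid_from, so it assumes valid_from is unique per key.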
Temporal join not returning expected rows
Verify that event-time attributes and watermarks are defined on the streaming side, and that keys match exactly. Consider using snapshots (direct queries) if temporal joins are too complex for your needs.
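When debugging, it can help to compute by hand what a point-in-time lookup should return for a few events and compare that with the job's output. The sketch below is plain Python, not Flink code; the function names and sample data are hypothetical. It returns the dimension version whose valid_from is the latest one at or before the event time, and None when the key does not match exactly or no version existed yet.

```python
from bisect import bisect_right
from collections import defaultdict


def build_versions(history):
    """Group (key, valid_from, value) rows by key, sorted by valid_from."""
    versions = defaultdict(list)
    for key, valid_from, value in sorted(history, key=lambda r: (r[0], r[1])):
        versions[key].append((valid_from, value))
    return versions


def lookup_as_of(versions, key, event_time):
    """Return the value valid at event_time, or None if no version applies."""
    entries = versions.get(key, [])
    starts = [valid_from for valid_from, _ in entries]
    # Index of the first version that starts strictly after event_time.
    idx = bisect_right(starts, event_time)
    return entries[idx - 1][1] if idx > 0 else None


# Hypothetical history for one user: (user_id, valid_from, tier).
versions = build_versions([
    (1, "2024-01-01", "free"),
    (1, "2024-03-01", "pro"),
])

print(lookup_as_of(versions, 1, "2024-02-15"))    # version in effect mid-February
print(lookup_as_of(versions, 1, "2023-12-31"))    # event predates every version
print(lookup_as_of(versions, "1", "2024-02-15"))  # key type mismatch: str vs int
```

The last two calls mirror the two usual causes of missing rows: events that fall before the first dimension version, and keys that differ in type or formatting between the stream and the dimension table.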
Default database name requires quoting
When referencing Iceberg tables in the default database, always wrap the database name in backticks (for example, `default`.dim_user), because DEFAULT is a reserved keyword in Flink SQL.
Stale lookups after updates
If dimension updates are frequent, consider materializing a changelog stream from the dimension table and using upsert-kafka or Paimon changelogs to keep lookups fresh.
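The effect of consuming such a changelog can be sketched in a few lines. This is an illustration in plain Python rather than connector code: the function name and sample records are hypothetical, and the rule that a null value deletes the key follows the tombstone convention used by upsert-kafka.

```python
def apply_changelog(table, changes):
    """Apply (key, value) changelog records to a lookup table in order.

    A non-None value inserts or overwrites the key (upsert); a None value
    is treated as a tombstone and removes the key.
    """
    for key, value in changes:
        if value is None:
            table.pop(key, None)  # deleting an absent key is a no-op
        else:
            table[key] = value
    return table


# Hypothetical lookup state: user_id -> tier.
dim_user = {1: "free", 2: "pro"}

apply_changelog(dim_user, [
    (1, "pro"),   # update an existing key
    (3, "free"),  # insert a new key
    (2, None),    # tombstone: remove the key
])

print(dim_user)
```

Because records are applied in arrival order, the table reflects the latest change per key as soon as it is consumed, instead of waiting for the next full rebuild.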

