#12751 added platform_instance support in many places, but DataProcessInstance still lacks it.
The DataProcessInstanceKey class only includes 3 fields for URN hash generation:
class DataProcessInstanceKey(DatahubKey):
    cluster: Optional[str] = None
    orchestrator: str
    id: str
Source: dataprocess_instance.py
The URN is generated in __post_init__:
self.urn = DataProcessInstanceUrn(
    id=DataProcessInstanceKey(
        cluster=self.cluster,
        orchestrator=self.orchestrator,
        id=self.id,
    ).guid()
)
Source: dataprocess_instance.py#L78-L84
Impact
platform_instance is NOT included in DataProcessInstanceKey, so PLATFORM_A and PLATFORM_B Airflow instances running the same DAG generate identical URNs.
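The collision can be reproduced without DataHub installed. The sketch below is a stand-in, not the library's code: `guid`, `CurrentKey`, `ProposedKey`, `current_guid`, and `proposed_guid` are hypothetical names, and the hashing scheme (drop None values, serialize as JSON with sorted keys, take the MD5) is an assumption about how `DatahubKey.guid()` behaves. It shows that a key without platform_instance cannot tell the two Airflow deployments apart, and that adding an optional platform_instance field would.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


def guid(fields: dict) -> str:
    # Stand-in for DatahubKey.guid() (assumed scheme): drop None values,
    # serialize with sorted keys, then take the MD5 hex digest.
    present = {k: v for k, v in fields.items() if v is not None}
    payload = json.dumps(present, separators=(",", ":"), sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class CurrentKey:
    # Mirrors the three fields quoted in the issue.
    orchestrator: str
    id: str
    cluster: Optional[str] = None


@dataclass(frozen=True)
class ProposedKey:
    # Hypothetical fix: the same fields plus an optional platform_instance.
    orchestrator: str
    id: str
    cluster: Optional[str] = None
    platform_instance: Optional[str] = None


def current_guid(platform_instance: Optional[str], run_id: str) -> str:
    # platform_instance is known to the caller but never reaches the key.
    return guid(asdict(CurrentKey(orchestrator="airflow", id=run_id, cluster="prod")))


def proposed_guid(platform_instance: Optional[str], run_id: str) -> str:
    return guid(
        asdict(
            ProposedKey(
                orchestrator="airflow",
                id=run_id,
                cluster="prod",
                platform_instance=platform_instance,
            )
        )
    )


run_id = "example_dag_run_1"  # illustrative run id, same DAG run on both platforms

collides_today = current_guid("PLATFORM_A", run_id) == current_guid("PLATFORM_B", run_id)
distinct_with_fix = proposed_guid("PLATFORM_A", run_id) != proposed_guid("PLATFORM_B", run_id)
unchanged_when_unset = proposed_guid(None, run_id) == current_guid(None, run_id)

print(collides_today, distinct_with_fix, unchanged_when_unset)
```

Under the assumed scheme, an unset platform_instance is dropped before hashing, so existing URNs would stay stable for users who do not configure it. Whether the real `guid()` excludes None values in the same way should be checked against the DataHub source before relying on this.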
Originally posted by @q30327 in #13358