Describe the bug
MultipleAspectTransformer silently drops all non-matching MetadataChangeProposalWrapper (MCPW) records passing through BaseTransformer.transform(). For example, a transformer that handles globalTags will swallow domains, subTypes, dataPlatformInstance, and any other aspect MCPWs that don't match its target. This means source-level domain configuration has no effect when a MultipleAspectTransformer is present in the pipeline.
To Reproduce
Steps to reproduce the behavior:
from datahub.ingestion.transformer.base_transformer import BaseTransformer, MultipleAspectTransformer
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.ingestion.api.common import EndOfStream, PipelineContext, RecordEnvelope
from datahub.metadata.schema_classes import DomainsClass, GlobalTagsClass

class MyTransformer(BaseTransformer, MultipleAspectTransformer):
    def entity_types(self): return ["dataset"]
    def aspect_name(self): return "globalTags"
    # Identity transform: re-emit the matching aspect unchanged.
    def transform_aspects(self, entity_urn, aspect_name, aspect): yield (aspect_name, aspect)
    @classmethod
    def create(cls, config_dict, ctx): return cls()
    def __init__(self): super().__init__()

ctx = PipelineContext(run_id="test")
transformer = MyTransformer.create({}, ctx)
urn = "urn:li:dataset:(urn:li:dataPlatform:hive,db.table,PROD)"
tags_mcp = MetadataChangeProposalWrapper(entityUrn=urn, aspect=GlobalTagsClass(tags=[]))
domain_mcp = MetadataChangeProposalWrapper(entityUrn=urn, aspect=DomainsClass(domains=["urn:li:domain:domm"]))

# Send one matching (globalTags) and one non-matching (domains) MCPW through the transformer.
inputs = [RecordEnvelope(r, metadata={}) for r in [tags_mcp, domain_mcp, EndOfStream()]]
outputs = list(transformer.transform(inputs))
domain_outputs = [o.record for o in outputs if isinstance(o.record, MetadataChangeProposalWrapper) and isinstance(o.record.aspect, DomainsClass)]
assert len(domain_outputs) == 1  # FAILS: domain_outputs is empty
Expected behavior
MCPWs whose aspect does not match the transformer's target aspect should pass through unchanged, as they do with SingleAspectTransformer.
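To make the expected semantics concrete, here is a minimal, library-free sketch of the pass-through behavior. It is not DataHub's implementation: Record, TransformFn, and transform_with_passthrough are hypothetical names used only for illustration, and the aspects are plain dicts rather than real aspect classes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for an MCPW: an entity URN plus one named aspect."""
    entity_urn: str
    aspect_name: str
    aspect: object


# Hypothetical signature: yields zero or more (aspect_name, aspect) pairs per input.
TransformFn = Callable[[str, str, object], Iterable[Tuple[str, object]]]


def transform_with_passthrough(
    records: Iterable[Record], target_aspect: str, transform_aspects: TransformFn
) -> Iterator[Record]:
    for record in records:
        if record.aspect_name != target_aspect:
            # Non-matching aspects are forwarded untouched instead of being dropped.
            yield record
            continue
        # A matching aspect may fan out into several output aspects.
        for name, aspect in transform_aspects(
            record.entity_urn, record.aspect_name, record.aspect
        ):
            yield Record(record.entity_urn, name, aspect)


urn = "urn:li:dataset:(urn:li:dataPlatform:hive,db.table,PROD)"
inputs = [
    Record(urn, "globalTags", {"tags": []}),
    Record(urn, "domains", {"domains": ["urn:li:domain:domm"]}),
]
# Identity transform on globalTags, mirroring MyTransformer in the reproduction.
outputs = list(
    transform_with_passthrough(inputs, "globalTags", lambda u, n, a: [(n, a)])
)
```

In this sketch the domains record comes out as the same object that went in, which is the behavior the reproduction's final assertion checks for.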
Environment
- DataHub Version: acryl-datahub==1.5.0.1