This question came up because I have two different approaches to my ingestions:
- ADF Pipeline iterates through files
- ADF Data Flows runs through files based on wildcard filename/location
What I need is to be able to audit each file ingestion by recording the filename, source row count, target row count and status. I can do this fairly easily in pipelines, but I'm new to data flows, and with all the branching, derived columns and so on, I'm not sure how to add the audit record "per" file.
My initial thought is to change my data flow so that it only handles one file at a time using parameters, then change my pipeline so that it iterates over the list of files, calling the data flow for each file. This allows me to do all the auditing in the pipeline.
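Conceptually, the loop I have in mind would look something like this (a Python sketch just to show the logic, not actual ADF code; `run_data_flow` and the row counts are hypothetical stand-ins for the parameterized data flow activity):

```python
def run_data_flow(filename):
    # Hypothetical stand-in for invoking the parameterized data flow;
    # in ADF this would be an Execute Data Flow activity returning metrics.
    counts = {"a.csv": (100, 100), "b.csv": (50, 49)}
    return counts[filename]

def ingest_with_audit(files):
    # The ForEach loop: run the data flow once per file and record an audit row.
    audit = []
    for f in files:
        source_rows, target_rows = run_data_flow(f)
        status = "OK" if source_rows == target_rows else "MISMATCH"
        audit.append({
            "filename": f,
            "source_rows": source_rows,
            "target_rows": target_rows,
            "status": status,
        })
    return audit

records = ingest_with_audit(["a.csv", "b.csv"])
for r in records:
    print(r)
```

In ADF terms, the outer function would be a ForEach activity over the file list, and the audit append would be a stored procedure or copy activity writing to an audit table.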
I'm not sure whether this is optimal in terms of performance, though.