I have a question about using Distinct with Entity Framework on SQL Server 2005. In this example:
practitioners = from p in context.Practitioners
join pn in context.ProviderNetworks on
p.ProviderId equals pn.ProviderId
where (notNetworkIds.Contains(pn.Network))
select p;
practitioners = practitioners
.Distinct()
.OrderByDescending(p => p.UpdateDate);
data = practitioners.Skip(PageSize * (pageOffset ?? 0)).Take(PageSize).ToList();
It all works fine, but the use of Distinct is very inefficient: larger result sets incur unacceptable performance. The DISTINCT is killing me. It is only needed because multiple networks can be queried, which causes Provider records to be duplicated. In effect I need to ask the DB "only return providers ONCE even if they're in multiple networks". If I could place the DISTINCT on the ProviderNetworks, the query would run much faster.
How can I cause EF to add the DISTINCT only to the subquery, not to the entire result set?
The resulting SQL, simplified, is:
select DISTINCT p.* from Providers p
inner join Networks pn on p.ProviderId = pn.ProviderId
where NetworkName in ('abc', 'def')
The IDEAL SQL is:
select p.* from Providers p
inner join (select DISTINCT ProviderId from Networks
where NetworkName in ('abc', 'def'))
as pn on p.ProviderId = pn.ProviderId
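To make the intended shape concrete, here is a minimal sketch of the same idea in LINQ, run against in-memory collections rather than EF. The sample data, the dates, and the matchingIds name are made up for illustration, and it assumes a compiler that supports top-level statements (C# 9 or later). Whether EF would translate a query written this way into the ideal SQL above is an assumption I have not verified.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for context.Practitioners and context.ProviderNetworks.
var practitioners = new[]
{
    new { ProviderId = 1, UpdateDate = new DateTime(2009, 1, 1) },
    new { ProviderId = 2, UpdateDate = new DateTime(2009, 2, 1) },
    new { ProviderId = 3, UpdateDate = new DateTime(2009, 3, 1) },
};
var providerNetworks = new[]
{
    new { ProviderId = 1, Network = "abc" },
    new { ProviderId = 1, Network = "def" }, // provider 1 is in both networks
    new { ProviderId = 2, Network = "def" },
    new { ProviderId = 3, Network = "xyz" }, // not in the requested networks
};
var notNetworkIds = new List<string> { "abc", "def" };

// Apply Distinct to the narrow list of provider ids only (the "subquery").
var matchingIds = providerNetworks
    .Where(pn => notNetworkIds.Contains(pn.Network))
    .Select(pn => pn.ProviderId)
    .Distinct();

// Join against the already-deduplicated ids, so no Distinct is needed on the outer result.
var result = (from p in practitioners
              join id in matchingIds on p.ProviderId equals id
              select p)
             .OrderByDescending(p => p.UpdateDate)
             .ToList();

foreach (var p in result)
    Console.WriteLine(p.ProviderId);
```

With this sample data each provider appears once even though provider 1 belongs to both networks, and paging with Skip/Take could be applied to the ordered query as before.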
Thanks, Dave