Function to Calculate Median in SQL Server
  • Time: 2009-08-27 18:24:33

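According to the documentation, Median is not available as an aggregate function in Transact-SQL. However, I would like to find out whether it is possible to create this functionality (using the Create Aggregate function, a user-defined function, or some other method).

What would be the best way (if possible) to do this, so that a median value (assuming a numeric data type) can be calculated in an aggregate query?

Best Answer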

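2019 UPDATE: In the 10 years since I wrote this answer, more solutions have been uncovered that may yield better results. Also, SQL Server releases since then (especially SQL 2012) have introduced new T-SQL features that can be used to calculate medians. SQL Server releases have also improved the query optimizer, which may affect the performance of various median solutions. Net-net, my original 2009 post is still OK, but there may be better solutions for modern SQL Server apps. Take a look at this article from 2012, which is a great resource.

That article found the following pattern to be much, much faster than all other alternatives, at least on the simple schema they tested. This solution was 373x faster than the slowest solution tested (PERCENTILE_CONT). Note that this trick requires two separate queries, which may not be practical in all cases. It also requires SQL 2012 or later.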

DECLARE @c BIGINT = (SELECT COUNT(*) FROM dbo.EvenRows);

SELECT AVG(1.0 * val)
FROM (
    SELECT val FROM dbo.EvenRows
     ORDER BY val
     OFFSET (@c - 1) / 2 ROWS
     FETCH NEXT 1 + (1 - @c % 2) ROWS ONLY
) AS x;
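
To see why the OFFSET and FETCH arithmetic lands on the middle row(s), the same index math can be traced outside the database. The following is only an illustrative Python sketch, not part of the original answer: the function name `offset_fetch_median` is hypothetical, and a sorted Python list stands in for the `ORDER BY val` result.

```python
def offset_fetch_median(values):
    """Mimic OFFSET (@c - 1) / 2 ROWS FETCH NEXT 1 + (1 - @c % 2) ROWS ONLY."""
    rows = sorted(values)              # stands in for ORDER BY val
    c = len(rows)                      # @c = COUNT(*)
    if c == 0:
        return None                    # AVG over zero rows is NULL in SQL
    offset = (c - 1) // 2              # integer division, as with BIGINT in T-SQL
    fetch = 1 + (1 - c % 2)            # 1 row for odd counts, 2 rows for even counts
    middle = rows[offset:offset + fetch]
    return sum(1.0 * v for v in middle) / len(middle)   # AVG(1.0 * val)


print(offset_fetch_median([5, 1, 3]))      # odd count: the single middle value
print(offset_fetch_median([4, 1, 3, 2]))   # even count: mean of the two middle values
```

The `1.0 *` factor matters in the T-SQL version because AVG over an integer column returns an integer; the sketch keeps it only to mirror the query.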

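Of course, just because one test on one schema in 2012 yielded great results, your mileage may vary, especially if you're on SQL Server 2014 or later. If performance is important for your median calculation, I'd strongly suggest trying and perf-testing several of the options recommended in that article to make sure you've found the best one for your schema.

I'd also be especially careful about using the (new in SQL Server 2012) function PERCENTILE_CONT, because the article above found this built-in function to be 373x slower than the fastest solution. It's possible that this disparity has improved in the 7 years since, but personally I wouldn't use this function on a large table until I had verified its performance against other solutions.

ORIGINAL 2009 POST IS BELOW:

There are lots of ways to do this, with dramatically varying performance. Here's one particularly well-optimized solution, from Medians, ROW_NUMBERs, and Performance. It is particularly good in terms of the actual I/Os generated during execution: it looks more costly than other solutions, but it is actually much faster.

That page also contains a discussion of other solutions and performance testing details. Note the use of a unique column as a disambiguator in case there are multiple rows with the same value of the median column.

As with all database performance scenarios, always try to test a solution with real data on real hardware. You never know when a change to SQL Server's optimizer or a peculiarity in your environment will make a normally speedy solution slower.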

SELECT
   CustomerId,
   AVG(TotalDue)
FROM
(
   SELECT
      CustomerId,
      TotalDue,
      -- SalesOrderId in the ORDER BY is a disambiguator to break ties
      ROW_NUMBER() OVER (
         PARTITION BY CustomerId
         ORDER BY TotalDue ASC, SalesOrderId ASC) AS RowAsc,
      ROW_NUMBER() OVER (
         PARTITION BY CustomerId
         ORDER BY TotalDue DESC, SalesOrderId DESC) AS RowDesc
   FROM Sales.SalesOrderHeader SOH
) x
WHERE
   RowAsc IN (RowDesc, RowDesc - 1, RowDesc + 1)
GROUP BY CustomerId
ORDER BY CustomerId;
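
The role of the tie-breaking column can be illustrated with a small Python sketch. This is a simplified model rather than how SQL Server actually orders ties: the function name `median_by_row_numbers` and the sample rows are hypothetical, and the "no tie-breaker" branch shows just one ordering an engine would be free to choose when ties are left unresolved.

```python
def median_by_row_numbers(rows, use_tiebreaker=True):
    """rows: list of (order_id, amount) pairs; order_id is assumed to be unique."""
    if use_tiebreaker:
        # ORDER BY amount ASC, order_id ASC / ORDER BY amount DESC, order_id DESC
        asc = sorted(rows, key=lambda r: (r[1], r[0]))
        desc = sorted(rows, key=lambda r: (r[1], r[0]), reverse=True)
    else:
        # Ties keep the same relative order in both directions, so the two
        # numberings are no longer mirror images of each other.
        asc = sorted(rows, key=lambda r: r[1])
        desc = sorted(rows, key=lambda r: -r[1])
    row_asc = {r[0]: i for i, r in enumerate(asc, start=1)}
    row_desc = {r[0]: i for i, r in enumerate(desc, start=1)}
    # WHERE RowAsc IN (RowDesc, RowDesc - 1, RowDesc + 1)
    picked = [amount for oid, amount in rows
              if row_asc[oid] in (row_desc[oid], row_desc[oid] - 1, row_desc[oid] + 1)]
    return sum(picked) / len(picked) if picked else None


orders = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1)]
print(median_by_row_numbers(orders, use_tiebreaker=True))    # middle row is found
print(median_by_row_numbers(orders, use_tiebreaker=False))   # no row passes the filter
```

With the unique column in both ORDER BY clauses, RowAsc + RowDesc is always the row count plus one, which is what makes the WHERE filter select exactly the middle row or rows.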

Answers

If you're using SQL 2005 or better, this is a nice, simple-ish median calculation for a single column in a table:

SELECT
(
 (SELECT MAX(Score) FROM
   (SELECT TOP 50 PERCENT Score FROM Posts ORDER BY Score) AS BottomHalf)
 +
 (SELECT MIN(Score) FROM
   (SELECT TOP 50 PERCENT Score FROM Posts ORDER BY Score DESC) AS TopHalf)
) / 2 AS Median
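
The query relies on TOP 50 PERCENT rounding a fractional row count up, so with an odd number of rows both halves contain the middle row. A rough Python sketch of that behaviour follows; the function name `top_half_median` is hypothetical, and Python's `/` is true division, whereas `/ 2` on an integer Score column in T-SQL would truncate.

```python
import math


def top_half_median(scores):
    """Model MAX(bottom 50 percent) and MIN(top 50 percent), averaged."""
    rows = sorted(scores)
    if not rows:
        return None
    half = math.ceil(len(rows) / 2)    # TOP 50 PERCENT rounds the row count up
    bottom_half = rows[:half]          # TOP 50 PERCENT ... ORDER BY Score
    top_half = rows[-half:]            # TOP 50 PERCENT ... ORDER BY Score DESC
    return (max(bottom_half) + min(top_half)) / 2


print(top_half_median([1, 2, 3, 4, 5]))   # both halves end at the middle row
print(top_half_median([1, 2, 3, 4]))      # halves meet between the two middle rows
```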

My original quick answer was:

select  max(my_column) as [my_column], quartile
from    (select my_column, ntile(4) over (order by my_column) as [quartile]
         from   my_table) i
--where quartile = 2
group by quartile

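This will give you the median and the interquartile range in one fell swoop. If you really only want the one row that is the median, uncomment the where clause.

When you put that into an explain plan, 60% of the work is sorting the data, which is unavoidable when calculating position-dependent statistics like this.

I've amended the answer to follow the excellent suggestion from Robert Ševčík-Robajz in the comments below: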

;with PartitionedData as
  (select my_column, ntile(10) over (order by my_column) as [percentile]
   from   my_table),
MinimaAndMaxima as
  (select  min(my_column) as [low], max(my_column) as [high], percentile
   from    PartitionedData
   group by percentile)
select
  case
    when b.percentile = 10 then cast(b.high as decimal(18,2))
    else cast((a.low + b.high)  as decimal(18,2)) / 2
  end as [value], --b.high, a.low,
  b.percentile
from    MinimaAndMaxima a
  join  MinimaAndMaxima b on (a.percentile -1 = b.percentile) or (a.percentile = 10 and b.percentile = 10)
--where b.percentile = 5

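This should calculate the correct median and percentile values when you have an even number of data items. Again, uncomment the final where clause if you only want the median and not the entire percentile distribution.

SQL Server 2012 (and later) has the PERCENTILE_DISC function, which computes a specific percentile for sorted values. PERCENTILE_DISC (0.5) will compute the median - https://msdn.microsoft.com/en-us/library/h231327.aspx

Simple, fast, accurate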

SELECT x.Amount 
FROM   (SELECT amount, 
               Count(1) OVER (partition BY 'A')        AS TotalRows, 
               Row_number() OVER (ORDER BY Amount ASC) AS AmountOrder 
        FROM   facttransaction ft) x 
WHERE  x.AmountOrder = Round(x.TotalRows / 2.0, 0)  

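If you want to use the Create Aggregate function in SQL Server, this is how to do it. Doing it this way has the benefit of letting you write clean queries. Note that this process could be adapted fairly easily to calculate a percentile value.

Create a new Visual Studio project and set the target framework to .NET 3.5 (this is for SQL 2008; it may be different in SQL 2012). Then create a class file and put in the following code, or the C# equivalent: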

Imports Microsoft.SqlServer.Server
Imports System.Data.SqlTypes
Imports System.IO

<Serializable>
<SqlUserDefinedAggregate(Format.UserDefined, IsInvariantToNulls:=True, IsInvariantToDuplicates:=False, _
  IsInvariantToOrder:=True, MaxByteSize:=-1, IsNullIfEmpty:=True)>
Public Class Median
  Implements IBinarySerialize
  Private _items As List(Of Decimal)

  Public Sub Init()
    _items = New List(Of Decimal)()
  End Sub

  Public Sub Accumulate(value As SqlDecimal)
    If Not value.IsNull Then
      _items.Add(value.Value)
    End If
  End Sub

  Public Sub Merge(other As Median)
    If other._items IsNot Nothing Then
      _items.AddRange(other._items)
    End If
  End Sub

  Public Function Terminate() As SqlDecimal
    If _items.Count <> 0 Then
      Dim result As Decimal
      _items = _items.OrderBy(Function(i) i).ToList()
      If _items.Count Mod 2 = 0 Then
        result = ((_items((_items.Count / 2) - 1)) + (_items(_items.Count / 2))) / 2@
      Else
        result = _items((_items.Count - 1) / 2)
      End If

      Return New SqlDecimal(result)
    Else
      Return New SqlDecimal()
    End If
  End Function

  Public Sub Read(r As BinaryReader) Implements IBinarySerialize.Read
    ' deserialize it from a string
    Dim list = r.ReadString()
    _items = New List(Of Decimal)

    For Each value In list.Split(","c)
      Dim number As Decimal
      If Decimal.TryParse(value, number) Then
        _items.Add(number)
      End If
    Next

  End Sub

  Public Sub Write(w As BinaryWriter) Implements IBinarySerialize.Write
    ' serialize the list to a string
    Dim list = ""

    For Each item In _items
      If list <> "" Then
        list += ","
      End If      
      list += item.ToString()
    Next
    w.Write(list)
  End Sub
End Class

Then compile it, copy the DLL and PDB files to your SQL Server machine, and run the following commands in SQL Server:

CREATE ASSEMBLY CustomAggregate FROM '{path to your DLL}'
WITH PERMISSION_SET=SAFE;
GO

CREATE AGGREGATE Median(@value decimal(9, 3))
RETURNS decimal(9, 3) 
EXTERNAL NAME [CustomAggregate].[{namespace of your DLL}.Median];
GO

You can then write a query to calculate the median like this: SELECT dbo.Median(Field) FROM Table

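I just came across this page while looking for a set-based solution to median. After looking at some of the solutions here, I came up with the following. Hope it helps/works.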

DECLARE @test TABLE(
    i int identity(1,1),
    id int,
    score float
)

INSERT INTO @test (id,score) VALUES (1,10)
INSERT INTO @test (id,score) VALUES (1,11)
INSERT INTO @test (id,score) VALUES (1,15)
INSERT INTO @test (id,score) VALUES (1,19)
INSERT INTO @test (id,score) VALUES (1,20)

INSERT INTO @test (id,score) VALUES (2,20)
INSERT INTO @test (id,score) VALUES (2,21)
INSERT INTO @test (id,score) VALUES (2,25)
INSERT INTO @test (id,score) VALUES (2,29)
INSERT INTO @test (id,score) VALUES (2,30)

INSERT INTO @test (id,score) VALUES (3,20)
INSERT INTO @test (id,score) VALUES (3,21)
INSERT INTO @test (id,score) VALUES (3,25)
INSERT INTO @test (id,score) VALUES (3,29)

DECLARE @counts TABLE(
    id int,
    cnt int
)

INSERT INTO @counts (
    id,
    cnt
)
SELECT
    id,
    COUNT(*)
FROM
    @test
GROUP BY
    id

SELECT
    drv.id,
    drv.start,
    AVG(t.score)
FROM
    (
        SELECT
            MIN(t.i)-1 AS start,
            t.id
        FROM
            @test t
        GROUP BY
            t.id
    ) drv
    INNER JOIN @test t ON drv.id = t.id
    INNER JOIN @counts c ON t.id = c.id
WHERE
    t.i = ((c.cnt+1)/2)+drv.start
    OR (
        t.i = (((c.cnt+1)%2) * ((c.cnt+2)/2))+drv.start
        AND ((c.cnt+1)%2) * ((c.cnt+2)/2) <> 0
    )
GROUP BY
    drv.id,
    drv.start

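The following query returns the median from a list of values in one column. It cannot be used as or along with an aggregate function, but you can still use it as a sub-query with a WHERE clause in the inner select.

SQL Server 2005+: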

SELECT TOP 1 value from
(
    SELECT TOP 50 PERCENT value 
    FROM table_name 
    ORDER BY  value
)for_median
ORDER BY value DESC

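Although Justin Grant's solution appears solid, I found that when you have a number of duplicate values within a given partition key, the row numbers for the ASC duplicate values end up out of sequence, so they do not line up properly with the DESC row numbers.

Here is a fragment from my result: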

KEY VALUE ROWA ROWD  

13  2     22   182
13  1     6    183
13  1     7    184
13  1     8    185
13  1     9    186
13  1     10   187
13  1     11   188
13  1     12   189
13  0     1    190
13  0     2    191
13  0     3    192
13  0     4    193
13  0     5    194

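I used Justin's code as the basis for this solution. Although it is not as efficient, given the use of multiple derived tables, it does resolve the row ordering problem I encountered. Any improvements would be welcome, as I am not that experienced in T-SQL.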

SELECT PKEY, cast(AVG(VALUE)as decimal(5,2)) as MEDIANVALUE
FROM
(
  SELECT PKEY,VALUE,ROWA,ROWD,
   FLAG  = (CASE WHEN ROWA IN (ROWD,ROWD-1,ROWD+1) THEN 1 ELSE 0 END)
  FROM
  (
    SELECT
    PKEY,
    cast(VALUE as decimal(5,2)) as VALUE,
    ROWA,
    ROW_NUMBER() OVER (PARTITION BY PKEY ORDER BY ROWA DESC) as ROWD 

    FROM
    (
      SELECT
      PKEY, 
      VALUE,
      ROW_NUMBER() OVER (PARTITION BY PKEY ORDER BY VALUE ASC,PKEY ASC ) as ROWA 
      FROM [MTEST]
    )T1
  )T2
)T3
WHERE FLAG =  1 
GROUP BY PKEY
ORDER BY PKEY

In a UDF, write:

 -- The inner count is correlated to the outer row T
 Select Top 1 medianSortColumn from Table T
  Where (Select Count(*) from Table
         Where MedianSortColumn <= T.medianSortColumn)
        >= ((Select Count(*) From Table) + 1) / 2
  Order By medianSortColumn

Justin's example above is very good. But the need for a primary key should be stated very clearly. I have seen that code in the wild without the key, and the results are bad.

The complaint I get about Percentile_Cont is that it won't give you an actual value from the dataset. To get a "median" that is an actual value from the dataset, use Percentile_Disc.

SELECT SalesOrderID, OrderQty,
    PERCENTILE_DISC(0.5) 
        WITHIN GROUP (ORDER BY OrderQty)
        OVER (PARTITION BY SalesOrderID) AS MedianCont
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY SalesOrderID DESC
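
The difference between the two functions can be sketched in Python. This is a hedged model of the documented semantics as I understand them (PERCENTILE_CONT interpolates between neighbouring rows, PERCENTILE_DISC returns the smallest value whose cumulative distribution reaches the requested percentile); the function names are hypothetical and NULL handling is ignored.

```python
import math


def percentile_cont(values, p):
    """Continuous percentile: linear interpolation between adjacent sorted rows."""
    rows = sorted(values)
    n = len(rows)
    if n == 0:
        return None
    pos = p * (n - 1)                          # zero-based fractional row position
    lo, hi = math.floor(pos), math.ceil(pos)
    if lo == hi:
        return float(rows[lo])
    return rows[lo] + (pos - lo) * (rows[hi] - rows[lo])


def percentile_disc(values, p):
    """Discrete percentile: first sorted value whose cumulative share is >= p."""
    rows = sorted(values)
    n = len(rows)
    for i, value in enumerate(rows, start=1):
        if i / n >= p:
            return value                       # always a value present in the data
    return None


qty = [1, 2, 3, 4]
print(percentile_cont(qty, 0.5))   # interpolated between the two middle values
print(percentile_disc(qty, 0.5))   # the lower of the two middle values
```

For an odd number of rows the two agree; they differ only when the median falls between two rows.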

Using a single statement: one way is to use the ROW_NUMBER() and COUNT() window functions and filter the sub-query. Here it is used to find the median salary:

 SELECT AVG(e_salary) 
 FROM                                                             
    (SELECT 
      ROW_NUMBER() OVER(ORDER BY e_salary) as row_no, 
      e_salary,
      (COUNT(*) OVER()+1)*0.5 AS row_half
     FROM Employee) t
 WHERE row_no IN (FLOOR(row_half),CEILING(row_half))

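I've seen similar solutions around the net using FLOOR and CEILING, but I tried to use a single statement. (edited)

Median Finding

This is the simplest method to find the median of an attribute.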

Select round(S.salary,4) median from employee S 
where (select count(salary) from employee 
where salary < S.salary ) = (select count(salary) from employee
where salary > S.salary)

Building on Jeff Atwood's answer above, here it is with GROUP BY and a correlated subquery to get the median for each group.

SELECT TestID, 
(
 (SELECT MAX(Score) FROM
   (SELECT TOP 50 PERCENT Score FROM Posts WHERE TestID = Posts_parent.TestID ORDER BY Score) AS BottomHalf)
 +
 (SELECT MIN(Score) FROM
   (SELECT TOP 50 PERCENT Score FROM Posts WHERE TestID = Posts_parent.TestID ORDER BY Score DESC) AS TopHalf)
) / 2 AS MedianScore,
AVG(Score) AS AvgScore, MIN(Score) AS MinScore, MAX(Score) AS MaxScore
FROM Posts_parent
GROUP BY Posts_parent.TestID

For table1:

select col1  
from
    (select top 50 percent col1, 
    ROW_NUMBER() OVER(ORDER BY col1 ASC) AS Rowa,
    ROW_NUMBER() OVER(ORDER BY col1 DESC) AS Rowd
    from table1 ) tmp
where tmp.Rowa = tmp.Rowd

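Frequently, we need to calculate the median not just for the whole table, but for aggregates with respect to some ID. In other words, we calculate the median for each ID in our table, where each ID has many records. (Based on the solution edited by @gdoron: good performance, and it works in many SQL dialects.)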

SELECT our_id, AVG(1.0 * our_val) as Median
FROM
( SELECT our_id, our_val, 
  COUNT(*) OVER (PARTITION BY our_id) AS cnt,
  ROW_NUMBER() OVER (PARTITION BY our_id ORDER BY our_val) AS rnk
  FROM our_table
) AS x
WHERE rnk IN ((cnt + 1)/2, (cnt + 2)/2) GROUP BY our_id;
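
The two integer-division expressions name the same rank when the group has an odd row count and the two middle ranks when it is even. A small Python sketch of that per-group logic follows, under the assumption that `//` stands in for T-SQL integer division; the function name `grouped_median` and the sample data are illustrative only.

```python
from collections import defaultdict


def grouped_median(records):
    """records: iterable of (our_id, our_val) pairs; returns {our_id: median}."""
    groups = defaultdict(list)
    for our_id, our_val in records:
        groups[our_id].append(our_val)         # PARTITION BY our_id

    medians = {}
    for our_id, vals in groups.items():
        vals.sort()                            # ORDER BY our_val
        cnt = len(vals)                        # COUNT(*) OVER (PARTITION BY our_id)
        wanted = {(cnt + 1) // 2, (cnt + 2) // 2}
        picked = [v for rnk, v in enumerate(vals, start=1) if rnk in wanted]
        medians[our_id] = sum(1.0 * v for v in picked) / len(picked)   # AVG(1.0 * our_val)
    return medians


sample = [(1, 10), (1, 11), (1, 15), (1, 19), (1, 20),
          (3, 20), (3, 21), (3, 25), (3, 29)]
print(grouped_median(sample))
```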

Hope it helps.

For large-scale datasets, you can try this gist:

https://gist.github.com/chrisknoll/1b38761ce8c5016ec5b

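It works by aggregating the distinct values you would find in your set (such as ages, years of birth, etc.), and uses SQL window functions to locate any percentile position you specify in the query.

To get the median salary from the employees table: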

with cte as (select salary, ROW_NUMBER() over (order by salary asc) as num from employees)

select avg(salary) from cte where num in ((select (count(*)+1)/2 from employees), (select (count(*)+2)/2 from employees));

I wanted to work out a solution by myself, but my brain tripped and fell on the way. It works, but don't ask me to explain it in the morning.

DECLARE @table AS TABLE
(
    Number int not null
);

insert into @table select 2;
insert into @table select 4;
insert into @table select 9;
insert into @table select 15;
insert into @table select 22;
insert into @table select 26;
insert into @table select 37;
insert into @table select 49;

DECLARE @Count AS INT
SELECT @Count = COUNT(*) FROM @table;

WITH MyResults(RowNo, Number) AS
(
    SELECT RowNo, Number FROM
        (SELECT ROW_NUMBER() OVER (ORDER BY Number) AS RowNo, Number FROM @table) AS Foo
)
SELECT AVG(Number) FROM MyResults WHERE RowNo = (@Count+1)/2 OR RowNo = ((@Count+1)%2) * ((@Count+2)/2)

--Create Temp Table to Store Results in
DECLARE @results AS TABLE 
(
    [Month] datetime not null
 ,[Median] int not null
);

--This variable will determine the date
DECLARE @IntDate as int 
set @IntDate = -13


WHILE (@IntDate < 0) 
BEGIN

--Create Temp Table
DECLARE @table AS TABLE 
(
    [Rank] int not null
 ,[Days Open] int not null
);

--Insert records into Temp Table
insert into @table 

SELECT 
    rank() OVER (ORDER BY DATEADD(mm, DATEDIFF(mm, 0, DATEADD(ss, SVR.close_date, '1970')), 0), DATEDIFF(day,DATEADD(ss, SVR.open_date, '1970'),DATEADD(ss, SVR.close_date, '1970')),[SVR].[ref_num]) as [Rank]
 ,DATEDIFF(day,DATEADD(ss, SVR.open_date, '1970'),DATEADD(ss, SVR.close_date, '1970')) as [Days Open]
FROM
 mdbrpt.dbo.View_Request SVR
 LEFT OUTER JOIN dbo.dtv_apps_systems vapp 
 on SVR.category = vapp.persid
 LEFT OUTER JOIN dbo.prob_ctg pctg 
 on SVR.category = pctg.persid
 Left Outer Join [mdbrpt].[dbo].[rootcause] as [Root Cause] 
 on [SVR].[rootcause]=[Root Cause].[id]
 Left Outer Join [mdbrpt].[dbo].[cr_stat] as [Status]
 on [SVR].[status]=[Status].[code]
 LEFT OUTER JOIN [mdbrpt].[dbo].[net_res] as [net] 
 on [net].[id]=SVR.[affected_rc]
WHERE
 SVR.Type IN ('P') 
 AND
 SVR.close_date IS NOT NULL 
 AND
 [Status].[SYM] = 'Closed' 
 AND
 SVR.parent is null
 AND
 [Root Cause].[sym] in ('RC - Application','RC - Hardware','RC - Operational','RC - Unknown')
 AND
 (
  [vapp].[appl_name] in ('3PI','Billing Rpts/Files','Collabrent','Reports','STMS','STMS 2','Telco','Comergent','OOM','C3-BAU','C3-DD','DIRECTV','DIRECTV Sales','DIRECTV Self Care','Dealer Website','EI Servlet','Enterprise Integration','ET','ICAN','ODS','SB-SCM','SeeBeyond','Digital Dashboard','IVR','OMS','Order Services','Retail Services','OSCAR','SAP','CTI','RIO','RIO Call Center','RIO Field Services','FSS-RIO3','TAOS','TCS')
 OR
  pctg.sym in ('Systems.Release Health Dashboard.Problem','DTV QA Test.Enterprise Release.Deferred Defect Log')
 AND  
  [Net].[nr_desc] in ('3PI','Billing Rpts/Files','Collabrent','Reports','STMS','STMS 2','Telco','Comergent','OOM','C3-BAU','C3-DD','DIRECTV','DIRECTV Sales','DIRECTV Self Care','Dealer Website','EI Servlet','Enterprise Integration','ET','ICAN','ODS','SB-SCM','SeeBeyond','Digital Dashboard','IVR','OMS','Order Services','Retail Services','OSCAR','SAP','CTI','RIO','RIO Call Center','RIO Field Services','FSS-RIO3','TAOS','TCS')
 )
 AND
 DATEADD(mm, DATEDIFF(mm, 0, DATEADD(ss, SVR.close_date, '1970')), 0) = DATEADD(mm, DATEDIFF(mm,0,DATEADD(mm,@IntDate,getdate())), 0)
ORDER BY [Days Open]



DECLARE @Count AS INT
SELECT @Count = COUNT(*) FROM @table;

WITH MyResults(RowNo, [Days Open]) AS
(
    SELECT RowNo, [Days Open] FROM
        (SELECT ROW_NUMBER() OVER (ORDER BY [Days Open]) AS RowNo, [Days Open] FROM @table) AS Foo
)


insert into @results
SELECT 
 DATEADD(mm, DATEDIFF(mm,0,DATEADD(mm,@IntDate,getdate())), 0) as [Month]
 ,AVG([Days Open])as [Median] FROM MyResults WHERE RowNo = (@Count+1)/2 OR RowNo = ((@Count+1)%2) * ((@Count+2)/2) 


set @IntDate = @IntDate+1
DELETE FROM @table
END

select *
from @results
order by [Month]

This works:

DECLARE @testTable TABLE 
( 
    VALUE   INT
)
--INSERT INTO @testTable -- Even Test
--SELECT 3 UNION ALL
--SELECT 5 UNION ALL
--SELECT 7 UNION ALL
--SELECT 12 UNION ALL
--SELECT 13 UNION ALL
--SELECT 14 UNION ALL
--SELECT 21 UNION ALL
--SELECT 23 UNION ALL
--SELECT 23 UNION ALL
--SELECT 23 UNION ALL
--SELECT 23 UNION ALL
--SELECT 29 UNION ALL
--SELECT 40 UNION ALL
--SELECT 56

--
--INSERT INTO @testTable -- Odd Test
--SELECT 3 UNION ALL
--SELECT 5 UNION ALL
--SELECT 7 UNION ALL
--SELECT 12 UNION ALL
--SELECT 13 UNION ALL
--SELECT 14 UNION ALL
--SELECT 21 UNION ALL
--SELECT 23 UNION ALL
--SELECT 23 UNION ALL
--SELECT 23 UNION ALL
--SELECT 23 UNION ALL
--SELECT 29 UNION ALL
--SELECT 39 UNION ALL
--SELECT 40 UNION ALL
--SELECT 56


DECLARE @RowAsc TABLE
(
    ID      INT IDENTITY,
    Amount  INT
)

INSERT INTO @RowAsc
SELECT  VALUE 
FROM    @testTable 
ORDER BY VALUE ASC

SELECT  AVG(amount)
FROM @RowAsc ra
WHERE ra.id IN
(
    SELECT  ID 
    FROM    @RowAsc
    WHERE   ra.id -
    (
        SELECT  MAX(id) / 2.0 
        FROM    @RowAsc
    ) BETWEEN 0 AND 1

)

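For newcomers like me who are learning the very basics, I personally find this example easier to follow, since it is easier to understand exactly what is happening and where the median values come from...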

select
 ( max(a.[Value1]) + min(a.[Value1]) ) / 2 as [Median Value1]
,( max(a.[Value2]) + min(a.[Value2]) ) / 2 as [Median Value2]

from (select
    datediff(dd,startdate,enddate) as [Value1]
    ,xxxxxxxxxxxxxx as [Value2]
     from dbo.table1
     )a

absolute!

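This is as simple an answer as I could come up with. It worked well with my data. If you want to exclude certain values, just add a where clause to the inner select.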

SELECT TOP 1 
    ValueField AS MedianValue
FROM
    (SELECT TOP(SELECT COUNT(1)/2 FROM tTABLE)
        ValueField
    FROM 
        tTABLE
    ORDER BY 
        ValueField) A
ORDER BY
    ValueField DESC

The following solution works under these assumptions:

  • No duplicate values
  • No NULLs

Code:

IF OBJECT_ID('dbo.R', 'U') IS NOT NULL
  DROP TABLE dbo.R

CREATE TABLE R (
    A FLOAT NOT NULL);

INSERT INTO R VALUES (1);
INSERT INTO R VALUES (2);
INSERT INTO R VALUES (3);
INSERT INTO R VALUES (4);
INSERT INTO R VALUES (5);
INSERT INTO R VALUES (6);

-- Returns Median(R)
select SUM(A) / CAST(COUNT(A) AS FLOAT)
from R R1 
where ((select count(A) from R R2 where R1.A > R2.A) = 
      (select count(A) from R R2 where R1.A < R2.A)) OR
      ((select count(A) from R R2 where R1.A > R2.A) + 1 = 
      (select count(A) from R R2 where R1.A < R2.A)) OR
      ((select count(A) from R R2 where R1.A > R2.A) = 
      (select count(A) from R R2 where R1.A < R2.A) + 1) ; 

DECLARE @Obs int
DECLARE @RowAsc table
(
ID      INT IDENTITY,
Observation  FLOAT
)
INSERT INTO @RowAsc
SELECT Observations FROM MyTable
ORDER BY 1 
-- (COUNT(*)+1)/2 picks the middle row for odd counts and the lower middle row for even counts
SELECT @Obs=(COUNT(*)+1)/2 FROM @RowAsc
SELECT Observation AS Median FROM @RowAsc WHERE ID=@Obs

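I tried several alternatives, but because my data records have repeated values, the ROW_NUMBER versions did not seem to be an option for me. So here is the query I used (a version with NTILE):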

SELECT distinct
   CustomerId,
   (
       MAX(CASE WHEN Percent50_Asc=1 THEN TotalDue END) OVER (PARTITION BY CustomerId)  +
       MIN(CASE WHEN Percent50_desc=1 THEN TotalDue END) OVER (PARTITION BY CustomerId) 
   )/2 MEDIAN
FROM
(
   SELECT
      CustomerId,
      TotalDue,
     NTILE(2) OVER (
         PARTITION BY CustomerId
         ORDER BY TotalDue ASC) AS Percent50_Asc,
     NTILE(2) OVER (
         PARTITION BY CustomerId
         ORDER BY TotalDue DESC) AS Percent50_desc
   FROM Sales.SalesOrderHeader SOH
) x
ORDER BY CustomerId;

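Regarding your question, Jeff Atwood has already given a simple and effective solution. But if you are looking for an alternative approach to calculating the median, the SQL code below will help you.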

create table employees(salary int);

insert into employees values(8); insert into employees values(23); insert into employees values(45); insert into employees values(123); insert into employees values(93); insert into employees values(2342); insert into employees values(2238);

select * from employees;

declare @odd_even int; declare @cnt int; declare @middle_no int;


set @cnt=(select count(*) from employees); set @middle_no=(@cnt/2)+1; select @odd_even=case when (@cnt%2=0) THEN -1 ELse 0 END ;


 select AVG(tbl.salary) from  (select  salary,ROW_NUMBER() over (order by salary) as rno from employees group by salary) tbl  where tbl.rno=@middle_no or tbl.rno=@middle_no+@odd_even;

If you are looking to calculate the median in MySQL, this GitHub link will be useful.