Find first unset bit in buffer (optimization)

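What's the fastest/cleanest way to find the bit offset of the first unset bit in an array of arbitrary length?

Assume the prototype for your function is something like size_t first_unset_bit(char unsigned const *buf, size_t bit_count, size_t start_bit); and that it may be called several times in quick succession on the same buffer. If you can give a better prototype, please justify it.

If you use any assembly, please provide an x86 sample that will run on a Core 2 or later. I will award the answer to the solution that provides the best combination of speed and beauty.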

Update0

Here's my naive implementation. I have no idea whether it's actually right yet; it's not yet used in a live system.

static size_t first_unset_bit(char unsigned const *buf, size_t bit_count, size_t start_bit)
{
    for (; start_bit < bit_count; ++start_bit)
    {
        size_t buf_index = start_bit / CHAR_BIT;
        int bit_index = start_bit % CHAR_BIT;
        if (!((buf[buf_index] >> bit_index) & 1))
            return start_bit;
    }
    return (size_t)-1; /* not found */
}
Best answer

Linux has what I imagine to be a highly tuned implementation under the name "find_first_zero_bit".

Other answers

In x86 assembly language,

REPE SCAS 0xFFFFFFFF

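...is likely to form an important part of the answer!

You have no choice but to inspect every bit preceding the first unset bit, so it just comes down to how quickly you can do that. Comparing 32 bits at a time is a good start, and once you know which WORD contains the first unset bit you can use a combination of shifts and lookup tables to locate the first unset bit in that word.

Optimization hint: create a lookup table that maps each byte value to its first unset bit, and then look up bytes rather than testing bits.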
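As a rough sketch of the shift-plus-lookup idea, assuming bit 0 is the least significant bit. The names first_zero_in_nibble and first_unset_bit_in_word are made up for illustration, and a 16-entry nibble table is used here only to keep the table short enough to write out by hand:

```c
#include <stdint.h>

/* first_zero_in_nibble[v] = index (0..3, LSB first) of the lowest 0 bit in
   the 4-bit value v; 4 means v is 0xF and has no unset bit. */
static const unsigned char first_zero_in_nibble[16] = {
    0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4
};

/* Returns the index (0..31) of the lowest unset bit in word,
   or 32 if every bit is set. */
static unsigned first_unset_bit_in_word(uint32_t word)
{
    unsigned base = 0;

    if (word == UINT32_MAX)
        return 32;
    /* Shift out nibbles that are all ones, then look up the first mixed one.
       The loop ends because zeros are shifted in from the top. */
    while ((word & 0xFu) == 0xFu) {
        word >>= 4;
        base += 4;
    }
    return base + first_zero_in_nibble[word & 0xFu];
}
```

A 256-entry table indexed by whole bytes works the same way with shifts of 8 instead of 4.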

Often overlooked: strings.h (yes, that standard header) contains a bunch of functions: ffs, ffsl and the like; see their man pages for more information. At least with gcc on x86, this compiles to the corresponding one-cycle instruction, for instance BSFL.

I would thus suggest:

  1. add a sentinel 0xFFFF at the end of your array
  2. divide bit_count by 4 (so you're iterating over 4-byte blocks instead of bytes)
  3. use a while loop to find the block with the first set bit

For example:

cursor = start_pos;
while ((pos = ffsl(buf[cursor])) == 0)  /* ffsl returns 0 when no bit is set */
  cursor++;
return (cursor - start_pos) * 32 + (pos - 1);  /* ffsl positions are 1-based */

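(Except that you have to test whether you reached the sentinel, in which case the buffer is blank.)

Take this with a grain of salt, because I don't claim to be an assembly expert... you'd basically use a little more than 3 cycles per 32 bits (one increment, one comparison, one BSFL instruction), and I imagine you could do better using the long version of the function.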
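The answer's loop looks for the first set bit. Adapted to the question's first unset bit, it might look like the sketch below, which inverts each block before calling ffs() (POSIX, from strings.h). It uses an explicit block count rather than the sentinel, and the function name and the uint32_t block type are assumptions for illustration:

```c
#include <stddef.h>
#include <stdint.h>
#include <strings.h>   /* ffs() is POSIX, not ISO C */

/* Hypothetical helper: scans 32-bit blocks for the first unset bit.
   Returns the bit index (bit 0 = least significant bit of blocks[0]),
   or (size_t)-1 if every bit in the block_count blocks is set. */
static size_t first_unset_bit_ffs(const uint32_t *blocks, size_t block_count)
{
    for (size_t cursor = 0; cursor < block_count; cursor++) {
        /* Inverting turns "first unset bit" into "first set bit";
           ffs() returns a 1-based position, or 0 if no bit is set. */
        int pos = ffs((int)~blocks[cursor]);
        if (pos != 0)
            return cursor * 32 + (size_t)(pos - 1);
    }
    return (size_t)-1;
}
```

The bounds check costs one comparison per block; the sentinel approach trades that for an extra block of storage.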

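Without using any assembly language, but with GCC builtins, and assuming that bit_count is a multiple of the number of bits in a long, something like this should work. I changed your function to take a void* buffer argument to avoid aliasing problems. Totally untested, and I might have messed up the math, especially in the leading "if (start_bit % LONG_BIT)" block.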

#include <stddef.h>
#include <limits.h>
#define LONG_BIT (CHAR_BIT * sizeof(unsigned long))

size_t
first_unset_bit(const void *buf, size_t bit_count, size_t start_bit)
{
    size_t long_count = bit_count / LONG_BIT;
    size_t start_long = start_bit / LONG_BIT;

    const unsigned long *lbuf = (const unsigned long *)buf;

    if (start_bit % LONG_BIT)
    {
        size_t offset = start_bit % LONG_BIT;
        unsigned long firstword = lbuf[start_long];
        /* force the bits below offset to 1, then invert: 1s now mark unset bits */
        firstword = ~(firstword | ((1UL << offset) - 1));
        if (firstword)
            return start_bit - offset + __builtin_ctzl(firstword);

        start_long += 1;
    }

    for (size_t i = start_long; i < long_count; i++)
    {
        unsigned long word = lbuf[i];
        /* ctz finds the lowest 1 in ~word, i.e. the lowest unset bit; this
           matches the question's bit order on little-endian machines */
        if (~word)
            return i*LONG_BIT + __builtin_ctzl(~word);
    }
    return bit_count + 1; // not found
}
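For comparison, a variant that drops the multiple-of-long assumption could look like the sketch below. It is not the answer's code: the name first_unset_bit_words, the WORD_BITS macro and the "return bit_count when nothing is found" convention are choices made for this example, and it assumes GCC or Clang on a little-endian machine.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define WORD_BITS (CHAR_BIT * sizeof(unsigned long))

/* Sketch: word-at-a-time scan with bit-wise head and tail.
   Assumes little-endian byte order, so that bit k of a loaded word is
   bit k % CHAR_BIT of byte k / CHAR_BIT.
   Returns bit_count when no unset bit exists in [start_bit, bit_count). */
static size_t first_unset_bit_words(const unsigned char *buf,
                                    size_t bit_count, size_t start_bit)
{
    size_t bit = start_bit;

    if (start_bit >= bit_count)
        return bit_count;

    /* Head: bit by bit until the index reaches a word boundary. */
    while (bit < bit_count && bit % WORD_BITS != 0) {
        if (!((buf[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1))
            return bit;
        bit++;
    }
    /* Body: whole words; memcpy sidesteps alignment and aliasing concerns. */
    while (bit + WORD_BITS <= bit_count) {
        unsigned long word;
        memcpy(&word, buf + bit / CHAR_BIT, sizeof word);
        if (~word != 0)
            return bit + (size_t)__builtin_ctzl(~word);
        bit += WORD_BITS;
    }
    /* Tail: remaining bits that do not fill a whole word. */
    for (; bit < bit_count; bit++) {
        if (!((buf[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1))
            return bit;
    }
    return bit_count;
}
```

Compilers typically turn the fixed-size memcpy into a single load, so the copy should not cost more than the pointer cast.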

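The obvious solution is to just loop from start_bit until you either reach the end of the array or find an unset bit.

Because it can be of arbitrary length, you can't just turn it into a number and find the value that way, as it could easily be larger than the size of a double.

I'm assuming your buffer is aligned, e.g. a buffer returned by malloc. If not, you'll need to scan the unaligned part at the beginning first.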

const uint32_t *p = (const void *)buf;
while (!(*p+1)) p++;   /* skip words that are all ones; assumes an unset bit exists */
size_t cnt = ((const unsigned char *)p - buf) * CHAR_BIT;
/* binary search within *p; bits are counted from the most significant bit */
if (*p>=0xFFFF0000)
  if (*p>=0xFFFFFF00)
    if (*p>=0xFFFFFFF0)
      if (*p>=0xFFFFFFFC)
        if (*p>=0xFFFFFFFE) cnt+=31;
        else cnt+=30;
      else
        if (*p>=0xFFFFFFF8) cnt+=29;
        else cnt+=28;
    else
      if (*p>=0xFFFFFFC0)
        if (*p>=0xFFFFFFE0) cnt+=27;
        else cnt+=26;
      else
        if (*p>=0xFFFFFF80) cnt+=25;
        else cnt+=24;
  else
    ...

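I leave it to you to fill in the rest of the binary search.

As others have mentioned, assembly language may give the best performance. If that's not an option, you may wish to consider the following (untested) routine. It's not exactly what you asked for, but it should be close enough that you can adapt it to your needs.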
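The same binary search can also be written as a loop instead of nested comparisons. This is only a sketch of one way to do it, not the answerer's code; the name first_unset_from_msb is hypothetical, and like the comparisons above it counts positions from the most significant bit:

```c
#include <stdint.h>

/* Hypothetical helper: position of the first unset bit of w, where
   position 0 is the most significant bit.
   The caller must ensure w != 0xFFFFFFFF. */
static unsigned first_unset_from_msb(uint32_t w)
{
    unsigned pos = 0;

    /* At each step, test whether the top `half` bits of the remaining
       window are all ones; if so, the answer lies in the lower half. */
    for (unsigned half = 16; half > 0; half /= 2) {
        uint32_t top_mask = ((UINT32_C(1) << half) - 1) << (32 - pos - half);
        if ((w & top_mask) == top_mask)
            pos += half;
    }
    return pos;
}
```

Five iterations cover all 32 positions, which matches the depth of the fully expanded if/else tree.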

size_t findFirstNonFFbyte (
    unsigned char const *buf,       /* ptr to buffer in which to search */
    size_t               bufSize,   /* length of the buffer */
    size_t               startHint  /* hint for the starting byte (<= bufSize) */
    ) {
    unsigned char const *pBuf = buf + startHint;
    size_t               bytesLeft = bufSize - startHint;
    int                  pass;

    /* pass 0 scans from the hint to the end; pass 1 wraps to the bytes before the hint */
    for (pass = 0; pass < 2; pass++) {
        while ((bytesLeft > 0) && (*pBuf == 0xff)) {
            pBuf++;
            bytesLeft--;
        }

        if (bytesLeft > 0) {
            return (size_t) (pBuf - buf);
        }

        bytesLeft = startHint;
        pBuf = buf;
    }
    return (size_t) -1;    /* every byte is 0xff */
}

To use it...

index = findFirstNonFFbyte (...);
bit_index = index * CHAR_BIT + bitTable[buffer[index]];


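The above code checks 8 bits at a time. If you know your buffer will be 4-byte aligned and its length a multiple of 4 bytes, then you can test 32 bits at a time with a little tweaking (don't forget the return value calculation).

If your starting position is not a hint but an absolute, then you can skip the wrap-around (the outer for loop).

You will need to supply your own bit lookup table. It should be an array 256 bytes long. Each entry identifies the first clear bit of the byte value that indexes that entry. Personal experience tells me that different people number those bits differently. Some call bit 0 the most significant bit of the byte; others call bit 0 the least significant bit. Whatever style you pick, be sure to be consistent.

Hope this helps.

Using intrinsics from the same family as Microsoft's _BitScanReverse (the code below uses _BitScanForward), I use something like this to find the first free bit for my memory system (the bits represent block usage):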
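One possible way to build such a table, as a sketch. It assumes 8-bit bytes and takes bit 0 to be the least significant bit; initBitTable and bitIndexFromByte are names invented for this example:

```c
#include <limits.h>
#include <stddef.h>

/* bitTable[v] = index of the first clear bit of byte value v, with bit 0
   taken to be the least significant bit (an assumption; pick one
   convention and keep it). bitTable[0xff] is 8: no clear bit. */
static unsigned char bitTable[256];

static void initBitTable(void)
{
    for (unsigned v = 0; v < 256; v++) {
        unsigned bit = 0;
        while (bit < 8 && ((v >> bit) & 1u))
            bit++;
        bitTable[v] = (unsigned char)bit;
    }
}

/* Combines a byte index (e.g. one returned by findFirstNonFFbyte)
   with the table to get a bit index into the whole buffer. */
static size_t bitIndexFromByte(const unsigned char *buf, size_t byteIndex)
{
    return byteIndex * CHAR_BIT + bitTable[buf[byteIndex]];
}
```

For the other convention (bit 0 as the most significant bit), the inner loop would test (v >> (7 - bit)) & 1u instead.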

        __forceinline DWORD __fastcall GetNextFreeBlockIndex(PoolBlock* pPoolBlock)
        {
            DWORD dwIndex;
            DWORD dwOffset = 0;
            DWORD* pUsage = &pPoolBlock->fUsage[0];
            while(dwOffset < MMANAGER_BLOCKS_PER_POOL)
            {
                DWORD dwUsage = *pUsage;
                if(dwUsage != 0xFFFFFFFF && _BitScanForward(&dwIndex,~dwUsage))
                {
                    #if !( MMANAGER_ATOMIC_OPS )
                        pPoolBlock->pSync.Enter();
                    #endif

                    ATOMIC_Write(DWORD,pPoolBlock->dwFreeIndex,dwOffset);
                    ATOMIC_Write(DWORD*,pPoolBlock->pFreeUsage,pUsage);

                    #if !( MMANAGER_ATOMIC_OPS )
                        pPoolBlock->pSync.Leave();
                    #endif

                    return dwIndex + dwOffset;
                }

                pUsage++;
                dwOffset += 32;
            }

            return 0xFFFFFFFF;
        }

        __forceinline DWORD __fastcall GetFreeBlockIndex(PoolBlock* pPoolBlock)
        {
            DWORD dwIndex;
            DWORD dwUsage = *pPoolBlock->pFreeUsage;
            if(dwUsage == 0xFFFFFFFF)
                return GetNextFreeBlockIndex(pPoolBlock);

            if(_BitScanForward(&dwIndex,~dwUsage))
                return dwIndex + pPoolBlock->dwFreeIndex;

            return 0xFFFFFFFF;
        }

Excuse the tabbing; this is straight out of some #if/#endif VS code. Of course this code is made just for DWORDs. You can do block_size & 3 to find whether there are any odd bytes, copy those odd bytes to a DWORD and scan the DWORD, then cut off any results at or beyond (block_size & 3) << 3.
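A portable sketch of that tail handling follows. It differs slightly from the description: the DWORD is pre-filled with ones, so the unused bytes read as "in use" and no cut-off is needed afterwards. The helper name is hypothetical, __builtin_ctz (GCC/Clang) stands in for _BitScanForward, and little-endian byte order is assumed:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for the trailing odd_bytes = block_size & 3 bytes.
   tail points at the first of those bytes.
   Returns the bit offset within the tail of the first unset bit,
   or -1 if the tail has none (or is empty). */
static int first_unset_bit_in_tail(const unsigned char *tail, size_t odd_bytes)
{
    uint32_t word = 0xFFFFFFFFu;      /* padding bytes read as "in use" */

    if (odd_bytes == 0 || odd_bytes > 3)
        return -1;
    memcpy(&word, tail, odd_bytes);   /* fills the low-order bytes */
    if (word == 0xFFFFFFFFu)
        return -1;
    /* The padding is all ones, so any hit is below odd_bytes << 3. */
    return __builtin_ctz(~word);
}
```

The caller would add the bit offset of the tail (the number of whole DWORDs times 32) to a non-negative result.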




