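I am developing a native library that uses ARM assembly optimizations and multithreading to get maximum performance on an MSM8660 dual-core ARM chip. While taking some measurements, I noticed the following: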
- The single-threaded library with NEON optimizations is faster than the single-threaded library with ARMv6 optimizations (as expected).
- The multi-threaded library with ARMv6 optimizations is faster than the single-threaded library with ARMv6 optimizations (as expected).
- The multi-threaded library with NEON optimizations is slower than the single-threaded library with NEON optimizations (definitely not expected!).
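I have searched all over the web for an explanation of why this happens, but have not found one so far. It seems as though the cores share a single NEON pipeline or something similar, yet every diagram I have seen suggests that each core should have its own NEON unit. Does anyone know why this is happening?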
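
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the single-threaded and two-thread timings might be taken. It assumes POSIX threads and `clock_gettime`. The names `process_range`, `time_single` and `time_dual` are hypothetical, and `process_range` is only a plain-C placeholder for the real NEON or ARMv6 routine, not the actual library code.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the real NEON or ARMv6 kernel. */
static void process_range(const uint8_t *src, uint8_t *dst, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = (uint8_t)(src[i] / 2 + 1);
}

/* One contiguous slice of the buffer, handed to one thread. */
typedef struct {
    const uint8_t *src;
    uint8_t *dst;
    size_t n;
} job_t;

static void *worker(void *arg) {
    job_t *job = (job_t *)arg;
    process_range(job->src, job->dst, job->n);
    return NULL;
}

/* Monotonic wall-clock time in seconds. */
static double now_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Time to process n bytes on the calling thread. */
double time_single(const uint8_t *src, uint8_t *dst, size_t n) {
    double t0 = now_seconds();
    process_range(src, dst, n);
    return now_seconds() - t0;
}

/* Time to process n bytes split into two halves, one per thread.
 * The measurement includes thread creation and join.
 * Returns -1.0 if a thread could not be created. */
double time_dual(const uint8_t *src, uint8_t *dst, size_t n) {
    size_t half = n / 2;
    job_t jobs[2] = {
        { src, dst, half },
        { src + half, dst + half, n - half },
    };
    pthread_t threads[2];

    double t0 = now_seconds();
    if (pthread_create(&threads[0], NULL, worker, &jobs[0]) != 0)
        return -1.0;
    if (pthread_create(&threads[1], NULL, worker, &jobs[1]) != 0) {
        pthread_join(threads[0], NULL);
        return -1.0;
    }
    pthread_join(threads[0], NULL);
    pthread_join(threads[1], NULL);
    return now_seconds() - t0;
}
```

Calling `time_single` and `time_dual` on the same input buffer gives the two numbers being compared. Because `time_dual` includes thread start-up and join, a small buffer can make the two-thread version look slower regardless of the instruction set, so the buffer should be large enough for the kernel time to dominate.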