Problems with Qualcomm Scorpion dual-core ARM NEON code?

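I'm developing a native library in which I use ARM assembly optimizations and multithreading to get maximum performance on the MSM8660 dual-core ARM chip. While taking some measurements, I noticed the following: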

  1. The single-threaded library with NEON optimizations is faster than the single-threaded library with ARMv6 optimizations (as expected).
  2. The multi-threaded library with ARMv6 optimizations is faster than the single-threaded library with ARMv6 optimizations (as expected).
  3. The multi-threaded library with NEON optimizations is slower than the single-threaded library with NEON optimizations (definitely not expected!).

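I have searched online for an explanation but have not found one so far. It looks as if both cores share the same NEON pipeline or something similar, yet every diagram I have seen suggests that each core should have its own NEON unit. Does anyone know why this happens?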

Answers

First of all, which library are you using?

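More precisely, each core does have its own NEON unit. However, it is Qualcomm's own proprietary VeNum unit, and little information has been published about it. It was designed for the Cortex-A8-class Scorpion core found in the 8x50 series and is said to be better than ARM's own implementation of NEON SIMD. One good thing is that they (qcom) designed the hardware to comply with the base Cortex design, so most code written for the Cortex-A8 will work on Scorpion, though possibly with some performance hit because instruction timings may differ.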

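If you are compiling your program with "softfp", then every function call that takes floating-point arguments or uses the NEON unit carries an overhead of about 20 cycles. Transferring register data from the ARM core to the NEON unit and vice versa is quite slow, and it can sometimes stall the core for many cycles while it waits for the pipeline to flush.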
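One way to limit the per-call cost described above is to keep hot loops inside a single function that takes a pointer to a buffer, rather than calling a small function with float arguments once per element. The following is only a sketch in portable C; the names `scale_one` and `scale_buffer` are hypothetical, and whether the batched form actually helps on a given build depends on the compiler, inlining, and ABI flags, so it should be measured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-element helper. If this call is not inlined under a
 * softfp ABI, its float arguments and return value are passed in core
 * registers on every call. */
float scale_one(float x, float gain)
{
    return x * gain;
}

/* Hypothetical batched variant: one call handles the whole buffer, so
 * only `gain` crosses the call boundary, once per buffer, and the
 * element data is loaded and stored directly from memory. */
void scale_buffer(float *data, size_t n, float gain)
{
    assert(data != NULL);
    for (size_t i = 0; i < n; ++i)
        data[i] *= gain;
}
```

Calling `scale_buffer(samples, count, 0.5f)` once replaces `count` separate calls to `scale_one`.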

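In addition, for multithreaded programs that use the floating-point unit, the kernel has to save the floating-point registers on every context switch. This imposes an extra penalty on threads because, as noted above, register transfers from NEON to ARM are slow and are known to stall the pipeline.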

Additionally, many other factors can lead to this, such as poor compiler optimization, cache misses, not using Scorpion's dual-issue capability, bad instruction scheduling, and your thread repeatedly switching from one core to another.
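To rule out the last factor, threads migrating between cores, each worker thread can be pinned to one core while benchmarking. The sketch below assumes a Linux-based system where `sched_setaffinity` and `sched_getaffinity` are available with `_GNU_SOURCE`; the helper names `pin_current_thread` and `first_allowed_cpu` are hypothetical.

```c
#define _GNU_SOURCE   /* must be defined before any system header */
#include <assert.h>
#include <sched.h>

/* Pin the calling thread to a single CPU (pid 0 means "the caller").
 * Returns 0 on success, -1 on failure with errno set. */
int pin_current_thread(int cpu)
{
    cpu_set_t set;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Return the lowest-numbered CPU the calling thread may run on,
 * or -1 if the affinity mask cannot be read. */
int first_allowed_cpu(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}
```

On a dual-core chip, one worker would call `pin_current_thread(0)` and the other `pin_current_thread(1)` at the start of its thread function. If the slowdown persists with pinning in place, core migration is unlikely to be the cause.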

This could be due to thrashing. It is hard to say without more information.

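My guess is that it is caused by the cycle penalty of flushing the NEON pipeline. The NEON pipeline sits behind the rest of the core, so you see extra cycle penalties for mispredicted branches and the like.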

If the threads have to synchronize often, or if you have many locks, I think you would see a heavy penalty with NEON.

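The only way I can see NEON improving the overall performance of multithreaded code is if the code is embarrassingly parallel, with little or no communication between threads.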
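An embarrassingly parallel layout of this kind might look like the sketch below: each thread gets its own contiguous slice, accumulates into a local variable, and writes one private result, so the only synchronization is the final join. This is an illustration in portable C with pthreads, not the asker's code. The scalar loop stands in for a NEON kernel, and `NUM_THREADS`, the 64-byte padding, and all names are assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 2   /* assumption: one worker per core on a dual-core chip */

typedef struct {
    const float *data;  /* start of this worker's slice */
    size_t count;       /* number of elements in the slice */
    double partial;     /* result, written only by the owning worker */
    char pad[64];       /* assumed cache-line size; keeps the partial
                           fields of neighbouring slices on separate lines */
} slice_t;

/* Worker: sums its own slice with no locks and no shared writes. */
static void *sum_slice(void *arg)
{
    slice_t *s = arg;
    double acc = 0.0;   /* local accumulator, stored once at the end */

    for (size_t i = 0; i < s->count; ++i)
        acc += s->data[i];
    s->partial = acc;
    return NULL;
}

double parallel_sum(const float *data, size_t n)
{
    slice_t slices[NUM_THREADS];
    pthread_t threads[NUM_THREADS];
    int started[NUM_THREADS];
    size_t chunk = n / NUM_THREADS;
    double total = 0.0;

    assert(data != NULL);
    for (int t = 0; t < NUM_THREADS; ++t) {
        size_t begin = (size_t)t * chunk;
        slices[t].data = data + begin;
        /* the last worker also takes the remainder */
        slices[t].count = (t == NUM_THREADS - 1) ? n - begin : chunk;
        slices[t].partial = 0.0;
        started[t] = (pthread_create(&threads[t], NULL,
                                     sum_slice, &slices[t]) == 0);
        if (!started[t])
            sum_slice(&slices[t]);  /* fall back to the calling thread */
    }
    for (int t = 0; t < NUM_THREADS; ++t) {
        if (started[t])
            pthread_join(threads[t], NULL);  /* the only sync point */
        total += slices[t].partial;
    }
    return total;
}
```

If a version structured like this is still slower with two threads than with one, lock contention and inter-thread communication can be ruled out, which points back to the memory system or scheduling.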




