Portable (Linux & Windows mostly) 4bytes extracting/comparing


I first looked at similar questions, such as this one:

• How to extract/pack values in a portable way (without overflow)?

I have never learned C, and because of that I am living proof that without knowing the basics everything becomes a nasty mess afterwards. Anyway, when one is already writing words, it is too late to say: start with the alphabet.

    ulHashPattern = *(unsigned long *)(pbPattern);
    for (a=0; a < ASIZE; a++) bm_bc[a]=cbPattern;
    for (j=0; j < cbPattern-1; j++) bm_bc[pbPattern[j]]=cbPattern-j-1;
    i=0;
    while (i <= cbTarget-cbPattern) {
        if ( *(unsigned long *)&pbTarget[i] == ulHashPattern ) {

The above fragment works as it should with a 32bit Windows compiler. I want all such 4-vs-4 comparisons to work under 64bit Windows and Linux as well. I often need 2, 4, and 8 byte transfers; in the above example I explicitly need 4 bytes from some pbTarget offset. Here is the actual question: what type should I use instead of unsigned long? (I guess something close to UINT16, UINT32, UINT64 will do.) In other words, which 3 types do I need in order to ALWAYS represent 2, 4, and 8 bytes, independently of the environment?

I think this basic question causes a lot of trouble, so it should be clarified.

Add-on 2012-Jan-16:

@Richard J. Ross III
I am doubly confused! I don't know whether Linux uses 1] or 2], i.e. whether _STD_USING is defined on Linux. In other words, which group is portable: the types uint8_t,...,uint64_t or the _CSTD uint8_t,...,_CSTD uint64_t?

1] stdint.h

typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
typedef unsigned int uint32_t;
typedef _ULonglong uint64_t;

2] MVS 10.0 stdint.h

 #if defined(_STD_USING)
...
using _CSTD uint8_t; using _CSTD uint16_t;
using _CSTD uint32_t; using _CSTD uint64_t;
...

Microsoft C 32bit has no problem with it:

; 3401 :           if ( *(_CSTD uint32_t *)&pbTarget[i] == *(_CSTD uint32_t *)(pbPattern) )

  01360 8b 04 19     mov     eax, DWORD PTR [ecx+ebx]
  01363 8b 7c 24 14  mov     edi, DWORD PTR _pbPattern$GSCopy$[esp+1080]
  01367 3b 07        cmp     eax, DWORD PTR [edi]
  01369 75 2c        jne     SHORT $LN80@Railgun_Qu@6

But when the target is 64bit code, this is what happens:

D:\_KAZE_Simplicius_Simplicissimus_Septupleton_r2-_strstr_SHORT-SHOWDOWN_r7>cl /Ox /Tcstrstr_SHORT-SHOWDOWN.c /Fastrstr_SHORT-SHOWDOWN /w /FAcs
Microsoft (R) C/C++ Optimizing Compiler Version 15.00.30729.01 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

strstr_SHORT-SHOWDOWN.c
strstr_SHORT-SHOWDOWN.c(1925) : fatal error C1083: Cannot open include file:  stdint.h : No such file or directory

D:\_KAZE_Simplicius_Simplicissimus_Septupleton_r2-_strstr_SHORT-SHOWDOWN_r7>

How exactly should it be included, and is it always present?

Below, I commented out the include (//#include <stdint.h>), and it then compiled to:

; 3401 :           if ( !memcmp(&pbTarget[i], &ulHashPattern, 4) ) 
  01766 49 63 c4     movsxd  rax, r12d
  01769 42 39 2c 10  cmp     DWORD PTR [rax+r10], ebp
  0176d 75 38        jne     SHORT $LN1@Railgun_Qu@6

; 3401 :           if ( *(unsigned long *)&pbTarget[i] == ulHashPattern ) 
  01766 49 63 c4     movsxd  rax, r12d
  01769 42 39 2c 10  cmp     DWORD PTR [rax+r10], ebp
  0176d 75 38        jne     SHORT $LN1@Railgun_Qu@6

The unsigned long * makes me uneasy, because gcc -m64 will fetch a QWORD instead of a DWORD.

@Mysticial
Just wanted to show the three different translations done by Microsoft CL 32bit v16:
1]

; 3400 :           if ( !memcmp(&pbTarget[i], pbPattern, 4) )
  01360 8b 04 19     mov     eax, DWORD PTR [ecx+ebx]
  01363 8b 7c 24 14  mov     edi, DWORD PTR _pbPattern$GSCopy$[esp+1080]
  01367 3b 07        cmp     eax, DWORD PTR [edi]
  01369 75 2c        jne     SHORT $LN84@Railgun_Qu@6

2]

; 3400 :           if ( !memcmp(&pbTarget[i], &ulHashPattern, 4) )
  01350 8b 44 24 14  mov     eax, DWORD PTR _ulHashPattern$[esp+1076]
  01354 39 04 2a     cmp     DWORD PTR [edx+ebp], eax
  01357 75 2e        jne     SHORT $LN83@Railgun_Qu@6

3]

; 3401 :           if ( *(uint32_t *)&pbTarget[i] == ulHashPattern )
  01350 8b 44 24 14  mov     eax, DWORD PTR _ulHashPattern$[esp+1076]
  01354 39 04 2a     cmp     DWORD PTR [edx+ebp], eax
  01357 75 2e        jne     SHORT $LN79@Railgun_Qu@6

The initial goal was to extract 4 bytes (with a single mov instruction, i.e. *(uint32_t *)&pbTarget[i]) and compare them against a register variable 4 bytes in length: one RAM access and one comparison in a single instruction. Unfortunately, I only managed to reduce memcmp()'s 3 RAM accesses (applied to pbPattern, which points to 4 or more bytes) down to 2, thanks to the inlining. Now, if I want to use memcmp() on the first 4 bytes of pbPattern (as in 2]), ulHashPattern must not be of type register, whereas 3] has no such restriction.

; 3400 :           if ( !memcmp(&pbTarget[i], &ulHashPattern, 4) )

The line above gives an error (ulHashPattern is defined as: register unsigned long ulHashPattern;):

strstr_SHORT-SHOWDOWN.c(3400) : error C2103:  &  on register variable

Yes, you are right: memcmp() saves the situation (but with a limitation). Fragment 2] is identical to 3], my dirty style. Obviously my inclination not to use a function when it can be coded manually is a thing of the past, but I like it.

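Still, I am not entirely happy: I defined ulHashPattern as a register variable, yet it is loaded from RAM every time? Maybe I am missing something, but this line (mov eax, DWORD PTR _ulHashPattern$[esp+1076]) degrades performance in what I believe is simple code.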

Answers

Strictly speaking, the only type you can use is char. This is because you are violating strict-aliasing with typecasts like these:

*(unsigned long *)(pbPattern);
*(unsigned long *)&pbTarget[i]

char* is the only exception to this rule, because you can alias any data type with a char*.
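A minimal sketch of what the char exception permits: reading the four bytes through unsigned char and assembling the value by hand. The helper name load_u32_le is hypothetical, and the little-endian byte order is an assumption chosen to match x86; neither comes from the original code. It also assumes a compiler that ships <stdint.h>.

```c
#include <stdint.h>

/* Hypothetical helper: builds a 32-bit value from the 4 bytes at p,
   least significant byte first (little-endian, as on x86).
   Only unsigned char accesses are used, so no aliasing rule is broken
   and p does not need to be 4-byte aligned. */
static uint32_t load_u32_le(const unsigned char *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Because the byte order is spelled out in the shifts, the result is the same on any host, which a raw pointer cast does not guarantee.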

If you turn up your warnings on GCC, you should get strict-aliasing warnings for your code snippet. (AFAIK, MSVC doesn't warn about strict-aliasing.)


I can't quite tell exactly what you are trying to do in that code snippet, but the idea still holds: you should not be using unsigned long or any other data type to load and compare larger chunks of data that are of different types.

In reality, you really should be using memcmp(), since it is straightforward and you will bypass the inefficiency of forcing everything down to char*.
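A hedged sketch of the memcmp() route, together with a memcpy()-based load for cases where the 4 bytes are wanted in a variable. The helper names equal4 and load_u32 are hypothetical, and <stdint.h> is assumed to be available.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if the 4 bytes at a equal the 4 bytes at b.
   memcmp() works byte by byte in the abstract machine, so alignment and
   aliasing are not a concern. */
static int equal4(const unsigned char *a, const unsigned char *b)
{
    return memcmp(a, b, 4) == 0;
}

/* Hypothetical helper: copies 4 bytes from p into a uint32_t in host
   byte order. Optimizing compilers commonly reduce this to a single
   load, but that is a quality-of-implementation detail, not a guarantee. */
static uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

In the search loop from the question this would read, for example, if (equal4(&pbTarget[i], pbPattern)), with no cast through unsigned long.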

Is there a reason why you can't use memcmp()?


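If you are okay with violating strict-aliasing, you can use the fixed-width integer types (such as uint32_t) defined in <stdint.h>. However, be aware that these are fixed by the number of bits rather than the number of bytes.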
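Since the fixed-width types are defined in bits, a build that relies on them being exactly 2, 4, and 8 bytes can state that assumption explicitly. The sketch below uses a hypothetical SIZE_CHECK macro (a pre-C11 compile-time check, not something from the original post) and assumes <stdint.h> is available.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical compile-time check for pre-C11 compilers: the array size
   becomes -1, which is a compile error, when the condition is false. */
#define SIZE_CHECK(name, cond) typedef char name[(cond) ? 1 : -1]

SIZE_CHECK(char_is_8_bits, CHAR_BIT == 8);           /* bytes are octets */
SIZE_CHECK(u16_is_2_bytes, sizeof(uint16_t) == 2);
SIZE_CHECK(u32_is_4_bytes, sizeof(uint32_t) == 4);
SIZE_CHECK(u64_is_8_bytes, sizeof(uint64_t) == 8);
```

On mainstream 32bit and 64bit Windows and Linux targets all four checks pass; on a platform with wider bytes the build stops instead of silently comparing the wrong number of bytes.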




