How to use VC++ intrinsic functions w/o run-time library

I'm involved in one of those challenges where you try to produce the smallest possible binary, so I'm building my program without the C or C++ run-time libraries (RTL). I don't link to the DLL version or the static version. I don't even #include the header files. I have this working fine.

Some RTL functions, like memset(), can be useful, so I tried adding my own implementation. It works fine in Debug builds (even for those places where the compiler generates an implicit call to memset()). But in Release builds, I get an error saying that I cannot define an intrinsic function. You see, in Release builds, intrinsic functions are enabled, and memset() is an intrinsic.

I would love to use the intrinsic for memset() in my Release builds, since it's probably inlined and smaller and faster than my implementation. But I seem to be in a catch-22. If I don't define memset(), the linker complains that it's undefined. If I do define it, the compiler complains that I cannot define an intrinsic function.

Does anyone know the right combination of definition, declaration, #pragma, and compiler and linker flags to get an intrinsic function without pulling in RTL overhead?

Visual Studio 2008, x86, Windows XP+.

To make the problem a little more concrete:

extern "C" void * __cdecl memset(void *, int, size_t);

#ifdef IMPLEMENT_MEMSET
void * __cdecl memset(void *pTarget, int value, size_t cbTarget) {
    char *p = reinterpret_cast<char *>(pTarget);
    while (cbTarget > 0) {
        *p++ = static_cast<char>(value);
        --cbTarget;
    }
    return pTarget;
}
#endif

struct MyStruct {
    int foo[10];
    int bar;
};

int main() {
    MyStruct blah;
    memset(&blah, 0, sizeof(blah));
    return blah.bar;
}

And I build like this:

cl /c /W4 /WX /GL /Ob2 /Oi /Oy /Gs- /GF /Gy intrinsic.cpp
link /SUBSYSTEM:CONSOLE /LTCG /DEBUG /NODEFAULTLIB /ENTRY:main intrinsic.obj

If I compile with my implementation of memset(), I get a compiler error:

error C2169: 'memset' : intrinsic function, cannot be defined

If I compile this without my implementation of memset(), I get a linker error:

error LNK2001: unresolved external symbol _memset

Best answer

I think I finally found a solution:

First, in a header file, declare memset() with a pragma, like so:

extern "C" void * __cdecl memset(void *, int, size_t);
#pragma intrinsic(memset)

That allows your code to call memset(). In most cases, the compiler will inline the intrinsic version.

Second, in a separate implementation file, provide an implementation. The trick to preventing the compiler from complaining about re-defining an intrinsic function is to use another pragma first. Like this:

#pragma function(memset)
void * __cdecl memset(void *pTarget, int value, size_t cbTarget) {
    unsigned char *p = static_cast<unsigned char *>(pTarget);
    while (cbTarget-- > 0) {
        *p++ = static_cast<unsigned char>(value);
    }
    return pTarget;
}

This provides an implementation for those cases where the optimizer decides not to use the intrinsic version.

The outstanding drawback is that you have to disable whole-program optimization (/GL and /LTCG). I'm not sure why. If someone finds a way to do this without disabling global optimization, please chime in.

Other answers
  1. I'm pretty sure there's a compiler flag that tells VC++ not to use intrinsics.

  2. The source to the runtime library is installed with the compiler. You do have the choice of excerpting functions you want/need, though often you'll have to modify them extensively (because they include features and/or dependencies you don't want/need).

  3. There are other open source runtime libraries available as well, which might need less customization.

  4. If you're really serious about this, you'll need to know (and maybe use) assembly language.

Edited to add:

I got your new test code to compile and link. These are the relevant settings:

Enable Intrinsic Functions: No
Whole Program Optimization: No

It's that last one that suppresses "compiler helpers" like the built-in memset.

Edited to add:

Now that it's decoupled, you can copy the asm code from memset.asm into your program. It has one global reference, but you can remove that. It's big enough that it's not inlined, though if you remove all the tricks it uses to gain speed you might be able to make it small enough for that.

I took your above example and replaced the memset() with this:

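void * __cdecl memset(void *pTarget, int value, size_t cbTarget) {
    __asm {
    push ecx
    push edi

    mov eax, value      ; int parameter, matching the declaration; only AL is stored
    mov ecx, cbTarget   ; ECX = number of bytes to write
    mov edi, pTarget    ; EDI = destination pointer
    rep stosb           ; store AL at [EDI], ECX times

    pop edi
    pop ecx
    }
    return pTarget;
}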

It works, but the library's version is much faster.
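
A related option, not part of the original answer: MSVC's __stosb intrinsic should emit the same rep stosb sequence without an inline assembly block, which also matters on x64 where inline assembly is unavailable. The sketch below is only illustrative. The name fill_stosb is hypothetical, including <intrin.h> is assumed to be acceptable in a no-RTL build because it only declares intrinsics, and the byte loop exists just so the code builds with other compilers.

```cpp
#include <cstddef>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>   // declares __stosb (assumed to add no library code)
#define HAVE_STOSB 1
#endif

// fill_stosb is a hypothetical name used only for this sketch.
extern "C" void *fill_stosb(void *pTarget, int value, std::size_t cbTarget) {
    unsigned char *p = static_cast<unsigned char *>(pTarget);
    const unsigned char byte = static_cast<unsigned char>(value);
#if defined(HAVE_STOSB)
    __stosb(p, byte, cbTarget);   // MSVC x86/x64 intrinsic for rep stosb
#else
    // Fallback for other compilers: plain byte loop.
    while (cbTarget-- > 0) {
        *p++ = byte;
    }
#endif
    return pTarget;
}
```

As with the assembly version, this trades speed on large buffers for a very small code footprint.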

I think you have to set Optimization to "Minimize Size (/O1)" or "Disabled (/Od)" to get the Release configuration to compile; at least this is what did the trick for me with VS 2005. Intrinsics are designed for speed so it makes sense that they would be enabled for the other Optimization levels (Speed and Full).

This definitely works with VS 2015: add the command-line option /Oi-. This works because "No" on Intrinsic Functions isn't a switch; it leaves the option unspecified. Add /Oi- and all your problems go away (it should work with whole program optimization, but I haven't properly tested this).

This certainly wasn't an answer when you first asked the question, but it is now possible to do what you want by using the version of Clang that is available with Visual Studio 2019, where it works just as you would like without any particular hoops to jump through.

The use of Clang has some other benefits too - especially if you wish to achieve similar goals using x64 architecture too, as it seems to be the only way to make the blasted pdata section go away!

Per Visual C++ itself, I took the approach of putting the implementations of memset/memcpy in a separate source file and, as rc-1290 mentioned, excluded just that one file from Global Optimizations, so the cost was not so high - albeit irritating!

Just name the function something slightly different.
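
A minimal sketch of that renaming idea, reusing MyStruct from the question. The names fill_bytes and read_bar_after_clear are hypothetical. Note the limitation already raised in the question: this only covers calls you write yourself, so any implicit memset call the compiler generates would still need a real memset symbol.

```cpp
#include <cstddef>

// Hypothetical name: anything other than memset avoids error C2169.
extern "C" void *fill_bytes(void *pTarget, int value, std::size_t cbTarget) {
    unsigned char *p = static_cast<unsigned char *>(pTarget);
    while (cbTarget-- > 0) {
        *p++ = static_cast<unsigned char>(value);
    }
    return pTarget;
}

struct MyStruct {
    int foo[10];
    int bar;
};

// Mirrors main() from the question, but calls the renamed helper.
int read_bar_after_clear() {
    MyStruct blah;
    fill_bytes(&blah, 0, sizeof(blah));
    return blah.bar;
}
```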

The way the "regular" runtime library does this is by compiling an assembly file with a definition of memset and linking it into the runtime library (you can find the assembly file in or around C:\Program Files\Microsoft Visual Studio 10.0\VC\crt\src\intel\memset.asm). That kind of thing works fine even with whole-program optimization.

Also note that the compiler will only use the memset intrinsic in some special cases (when the size is constant and small?). It will usually use the memset function provided by you, so you should probably use the optimized function in memset.asm, unless you're going to write something just as optimized.
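
If assembly is not an option, a word-at-a-time fill is a common middle ground between the byte loop and memset.asm. The following is a rough sketch, not the library's algorithm. The name fill_words is hypothetical, and the code assumes unsigned int is 32 bits wide (true for MSVC on x86 and x64) and that the optimizer does not turn the loops back into memset calls.

```cpp
#include <cstddef>

// Hypothetical helper: fills count bytes at target with the low byte of value.
extern "C" void *fill_words(void *target, int value, std::size_t count) {
    unsigned char *p = static_cast<unsigned char *>(target);
    const unsigned char byte = static_cast<unsigned char>(value);

    // Head: store single bytes until p reaches a 4-byte boundary.
    while (count > 0 && (reinterpret_cast<std::size_t>(p) & 3u) != 0) {
        *p++ = byte;
        --count;
    }

    // Body: replicate the byte into all four lanes and store whole words.
    // Assumes unsigned int is exactly 32 bits.
    const unsigned int pattern = byte * 0x01010101u;
    unsigned int *w = reinterpret_cast<unsigned int *>(p);
    while (count >= 4) {
        *w++ = pattern;
        count -= 4;
    }

    // Tail: the remaining 0 to 3 bytes.
    p = reinterpret_cast<unsigned char *>(w);
    while (count > 0) {
        *p++ = byte;
        --count;
    }
    return target;
}
```

The aligned word stores cut the number of write instructions roughly by four on large buffers, at the cost of a few extra bytes of code, which may or may not be worth it in a size-constrained build.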




