English 中文(简体)
GLSL shader generation of normals
原标题:

Hi I am writing 3D modeling app and I want to speed up rendering in OpenGL. Currently I use glBegin/glEnd which is really slow and deprecated way. I need to draw very fast flat shaded models. I generate normals on CPU every single frame. This is very slow. I tried to use glDrawElements with indexed geometry, but there is problem in normal generation, because normals are specified at vertex not at triangle level.
Another idea was to use GLSL to generate normals on GPU in geometry shader. I written this code for normal generation:

#version 120 
#extension GL_EXT_geometry_shader4 : enable

vec3 NormalFromTriangleVertices(vec3 triangleVertices[3])
{
    // now is same as RedBook (OpenGL Programming Guide)
    vec3 u = triangleVertices[0] - triangleVertices[1];
    vec3 v = triangleVertices[1] - triangleVertices[2];
    return cross(v, u);
}

void main()
{
    // no change of position
    // computes normal from input triangle and front color for that triangle

    vec3 triangleVertices[3];
    vec3 computedNormal;

    vec3 normal, lightDir;
    vec4 diffuse;
    float NdotL;

    vec4 finalColor;

    for(int i = 0; i < gl_VerticesIn; i += 3)
    {
        for (int j = 0; j < 3; j++)
        {
            triangleVertices[j] = gl_PositionIn[i + j].xyz;
        }
        computedNormal = NormalFromTriangleVertices(triangleVertices);
        normal = normalize(gl_NormalMatrix * computedNormal);

        // hardcoded light direction 
        vec4 light = gl_ModelViewMatrix * vec4(0.0, 0.0, 1.0, 0.0);
        lightDir = normalize(light.xyz);

        NdotL = max(dot(normal, lightDir), 0.0);

        // hardcoded
        diffuse = vec4(0.5, 0.5, 0.9, 1.0);

        finalColor = NdotL * diffuse; 
        finalColor.a = 1.0; // final color ignores everything, except lighting

        for (int j = 0; j < 3; j++)
        { 
            gl_FrontColor = finalColor;
            gl_Position = gl_PositionIn[i + j];
            EmitVertex();
        }
    }
    EndPrimitive();
}

When I integrated shaders to my application, no speed improvement occurred. It was worse than before. I am newbie in GLSL and shaders overall so I don t know what I done wrong. I tried this code on MacBook with Geforce 9400M.

To be more clear, this is code I want to replace:

- (void)drawAsCommandsWithScale:(Vector3D)scale
{
    float frontDiffuse[4] = { 0.4, 0.4, 0.4, 1 };
    CGFloat components[4];
    [color getComponents:components];
    float backDiffuse[4];
    float selectedDiffuse[4] = { 1.0f, 0.0f, 0.0f, 1 };

    for (uint i = 0; i < 4; i++)
        backDiffuse[i] = components[i];

    glMaterialfv(GL_BACK, GL_DIFFUSE, backDiffuse);
    glMaterialfv(GL_FRONT, GL_DIFFUSE, frontDiffuse);

    Vector3D triangleVertices[3];

    float *lastDiffuse = frontDiffuse; 

    BOOL flip = scale.x < 0.0f || scale.y < 0.0f || scale.z < 0.0f;

    glBegin(GL_TRIANGLES);

    for (uint i = 0; i < triangles->size(); i++)
    {
        if (selectionMode == MeshSelectionModeTriangles) 
        {
            if (selected->at(i))
            {
                if (lastDiffuse == frontDiffuse)
                {
                    glMaterialfv(GL_FRONT_AND_BACK, GL_DIFFUSE, selectedDiffuse);
                    lastDiffuse = selectedDiffuse;
                }
            }
            else if (lastDiffuse == selectedDiffuse)
            {
                glMaterialfv(GL_BACK, GL_DIFFUSE, backDiffuse);
                glMaterialfv(GL_FRONT, GL_DIFFUSE, frontDiffuse);
                lastDiffuse = frontDiffuse;
            }
        }    
        Triangle currentTriangle = [self triangleAtIndex:i];
        if (flip)
            currentTriangle = FlipTriangle(currentTriangle);

        [self getTriangleVertices:triangleVertices fromTriangle:currentTriangle];
        for (uint j = 0; j < 3; j++)
        {
            for (uint k = 0; k < 3; k++)
            {
                triangleVertices[j][k] *= scale[k];
            }
        }
        Vector3D n = NormalFromTriangleVertices(triangleVertices);
        n.Normalize();
        for (uint j = 0; j < 3; j++)
        {
            glNormal3f(n.x, n.y, n.z);
            glVertex3f(triangleVertices[j].x, triangleVertices[j].y, triangleVertices[j].z);            
        }
    }

    glEnd();    
}

As you can see it is very inefficient, but working.triangles is array of indexes into vertices array.

I tried to use this code for drawing, but I can t have only one index array not two (one for vertices and second for normals).

glEnableClientState(GL_VERTEX_ARRAY);

uint *trianglePtr = (uint *)(&(*triangles)[0]); 
float *vertexPtr = (float *)(&(*vertices)[0]);

glVertexPointer(3, GL_FLOAT, 0, vertexPtr);
glDrawElements(GL_TRIANGLES, triangles->size() * 3, GL_UNSIGNED_INT, trianglePtr);
glDisableClientState(GL_VERTEX_ARRAY);

Now, how can I specify pointer to normals, when some vertices are shared by different triangles, so different normals for them?

最佳回答

So I finally managed to increase rendering speed. I recalculate normals on CPU, only when vertices or triangles changes, which occurs only when working in one mesh not in whole scene. It is not solution that I wanted but in real world it is better than previous approaches.
I cache whole geometry into separate normal and vertex array, indexed drawing cannot be used because I want flat shading (similar problem to smoothing groups in 3ds max).
I use simple glDrawArrays and for lighting vertex shader, that is because I want in triangle mode different color for selected triangle and another one for unselected and there is no array of materials (I didn t found any one).

问题回答

You wouldn t usually calculate the normals every frame, only when the geometry changes. And to have one normal per triangle just set the same normal for each vertex in the triangle. That does mean you can t share vertices between adjacent triangles in your mesh but that s not unusual at all in this kind of thing.

Your question makes me remember this Normals without Normals blog post.





相关问题
Optimizing a LAN server for a game

I m the network programmer on a school game project. We want to have up to 16 players at once on a LAN. I am using the Server-Client model and am creating a new thread per client that joins. ...

SQL Table Size And Query Performance

We have a number of items coming in from a web service; each item containing an unknown number of properties. We are storing them in a database with the following Schema. Items - ItemID - ...

Most optimized way to store crawler states?

I m currently writing a web crawler (using the python framework scrapy). Recently I had to implement a pause/resume system. The solution I implemented is of the simplest kind and, basically, stores ...

Do bitwise operations distribute over addition?

I m looking at an algorithm I m trying to optimize, and it s basically a lot of bit twiddling, followed by some additions in a tight feedback. If I could use carry-save addition for the adders, it ...

Improve INSERT-per-second performance of SQLite

Optimizing SQLite is tricky. Bulk-insert performance of a C application can vary from 85 inserts per second to over 96,000 inserts per second! Background: We are using SQLite as part of a desktop ...

Profiling Vim startup time

I’ve got a lot of plugins enabled when using Vim – I have collected plugins over the years. I’m a bit fed up with how long Vim takes to start now, so I’d like to profile its startup and see which of ...

Quick padding of a string in Delphi

I was trying to speed up a certain routine in an application, and my profiler, AQTime, identified one method in particular as a bottleneck. The method has been with us for years, and is part of a "...

热门标签