How to do paging with SimpleDB?

I know how to page forward through SimpleDB data by using NextToken. However, how exactly does one handle previous pages? I'm on .NET, but I don't think that matters. I'm more interested in the general strategy.

Mike Culver's "An Introduction to Amazon SimpleDB" webinar mentions that breadcrumbs are used, but he doesn't implement them in the video.

EDIT: The video mentions a sample project which implements backwards paging, but the video ends before the URL for the download can be displayed. The one sample project I found didn't deal with paging.

Best answer

When going to the next page you may be able to simplify the use case by only allowing a "next page" and not arbitrary paging. You can do this in SimpleDB by using the LIMIT clause:

SELECT title, summary, votecount FROM posts WHERE userid = '000022656' LIMIT 25

You already know how to handle the NextToken, but if you use this tactic, you can support "previous page" by storing the breadcrumb trail of next tokens (e.g. in the web session) and re-issuing the query with a previous NextToken rather than a subsequent one.
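The breadcrumb trail could look something like the sketch below. TokenTrail is a hypothetical helper, not part of any SimpleDB SDK; it only remembers which token opened each page, and it assumes you keep one instance per user (e.g. in the web session) and re-issue the same query with whatever token it hands back.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: remembers the token that opens each page visited so far. */
class TokenTrail {
    // tokens.get(i) is the token used to fetch page i + 1; page 1 needs none (null).
    private final List<String> tokens = new ArrayList<>();
    private int current = 0; // index of the page currently shown

    TokenTrail() {
        tokens.add(null); // the first page is fetched without a NextToken
    }

    /** Record the NextToken returned with the current page and move forward. */
    String next(String nextTokenFromCurrentPage) {
        if (nextTokenFromCurrentPage == null) {
            throw new IllegalStateException("already on the last page");
        }
        current++;
        if (current == tokens.size()) {
            tokens.add(nextTokenFromCurrentPage);
        } else {
            tokens.set(current, nextTokenFromCurrentPage);
        }
        return tokens.get(current);
    }

    /** Move back one page and return the token that re-fetches it. */
    String previous() {
        if (current == 0) {
            throw new IllegalStateException("already on the first page");
        }
        current--;
        return tokens.get(current);
    }

    int pageNumber() {
        return current + 1;
    }
}
```

A null return from previous() means "run the query without a token", which is how page 1 is fetched.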

However, the general case for handling arbitrary pagination in SimpleDB is the same for previous and next. In the general case, the user may click on an arbitrary page number, like 5, without ever having visited page 4 or 6.

You handle this in SimpleDB by using the fact that NextToken only requires the WHERE clause to be the same to work properly. So rather than querying through every page in sequence pulling down all the intervening items, you can usually do it in two steps.

  1. Issue your query with a LIMIT equal to the number of items that precede the desired page, and SELECT count(*) instead of the actual attributes you want.
  2. Use the NextToken from step one to fetch the actual page data, using the desired attributes and the page size as the LIMIT.

So in pseudo code:

int targetPage, pageSize;
...
int jumpLimit = pageSize * (targetPage - 1);
String query = "SELECT %1 FROM posts WHERE userid = '000022656' LIMIT %2";
String output = "title, summary, votecount";
Result temp = sdb.select(query, "count(*)", jumpLimit);
Result data = sdb.select(query, output, pageSize, temp.getToken());

Where %1 and %2 are String substitutions and "sdb.select()" is a fictitious method that includes the String substitution code along with the SimpleDB call.
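To make the substitution concrete, here is a small sketch that uses Java's String.format in place of the fictitious %1 and %2 placeholders. JumpQueries and its method names are made up for illustration; only the query text and the jumpLimit arithmetic come from the pseudo code above.

```java
class JumpQueries {
    /** Number of items that precede the target page (pages are numbered from 1). */
    static int jumpLimit(int targetPage, int pageSize) {
        return pageSize * (targetPage - 1);
    }

    /** Fills the output list and the LIMIT value into the example query. */
    static String build(String select, int limit) {
        return String.format(
                "SELECT %s FROM posts WHERE userid = '000022656' LIMIT %d",
                select, limit);
    }
}
```

For page 5 with 25 items per page, the count query gets LIMIT 100 and the data query gets LIMIT 25; the WHERE clause is identical in both, which is what lets the token from the first be reused in the second.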

Whether or not you can accomplish this in two calls to SimpleDB (as shown in the code) will depend on the complexity of your WHERE clause and the size of your data set. The above code is simplified in that the temp result may have returned a partial count if the query took more than 5 seconds to run. You would really want to put that line in a loop until the proper count is reached. To make the code a little more realistic I'll put it within methods and get rid of the String substitutions:

private Result fetchPage(String query, int targetPage)
{
    int pageSize = extractLimitValue(query);
    int skipLimit = pageSize * (targetPage - 1);
    String token = skipAhead(query, skipLimit);
    return sdb.select(query, token);
}

private String skipAhead(String query, int skipLimit)
{
    String tempQuery = replaceClause(query, "SELECT", "count(*)");
    int accumulatedCount = 0;
    String token = "";
    do {
        int tempLimit = skipLimit - accumulatedCount;
        tempQuery = replaceClause(tempQuery , "LIMIT", tempLimit + "");
        // Run the rewritten count(*) query, not the original one.
        Result tempResult = sdb.select(tempQuery, token);
        token = tempResult.getToken();
        accumulatedCount += tempResult.getCount();
    } while (accumulatedCount < skipLimit);
    return token;
}

private int extractLimitValue(String query) {...}
private String replaceClause(String query, String clause, String value){...}
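The two helpers are left as stubs above. One naive way to fill them in, purely as an assumption about what they might do, is with regular expressions. QueryRewriter is a hypothetical name, and the patterns only cope with simple queries where LIMIT is the last clause and the output list contains no " from " of its own.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QueryRewriter {
    // Group 1 is the number after a trailing LIMIT clause.
    private static final Pattern LIMIT =
            Pattern.compile("(?i)\\blimit\\s+(\\d+)\\s*$");
    // Group 1 is the output list between SELECT and the first FROM.
    private static final Pattern SELECT =
            Pattern.compile("(?is)^\\s*select\\s+(.+?)\\s+from\\b");

    static int extractLimitValue(String query) {
        Matcher m = LIMIT.matcher(query);
        if (!m.find()) {
            throw new IllegalArgumentException("query has no LIMIT clause");
        }
        return Integer.parseInt(m.group(1));
    }

    /** Replaces the value of the SELECT or LIMIT clause, leaving the rest untouched. */
    static String replaceClause(String query, String clause, String value) {
        Pattern p;
        if (clause.equalsIgnoreCase("SELECT")) {
            p = SELECT;
        } else if (clause.equalsIgnoreCase("LIMIT")) {
            p = LIMIT;
        } else {
            throw new IllegalArgumentException("unsupported clause: " + clause);
        }
        Matcher m = p.matcher(query);
        if (!m.find()) {
            throw new IllegalArgumentException("query has no " + clause + " clause");
        }
        return query.substring(0, m.start(1)) + value + query.substring(m.end(1));
    }
}
```

Because only the SELECT list and the LIMIT number change, the WHERE clause stays byte-for-byte the same between the count query and the data query.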

This is the general idea without error handling, and works for any arbitrary page, excluding page 1.
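If you want to see the skip-ahead loop behave without touching SimpleDB, the sketch below simulates it in memory. Everything here is a stand-in: FakeDomain is not a real API, the "token" is modeled as a plain offset, and maxScanPerCall imitates a count(*) call that stops early and returns a partial count. The page 1 guard reflects my reading of why page 1 is excluded above: the skip limit would be 0, so there is nothing to count and no token to obtain.

```java
import java.util.ArrayList;
import java.util.List;

class PagingSimulation {
    /** Stand-in for a SimpleDB domain: an ordered list plus a cap on items counted per call. */
    static final class FakeDomain {
        final List<String> items = new ArrayList<>();
        final int maxScanPerCall;

        FakeDomain(int size, int maxScanPerCall) {
            for (int i = 1; i <= size; i++) {
                items.add("item-" + i);
            }
            this.maxScanPerCall = maxScanPerCall;
        }

        /** Simulated count(*) with LIMIT: returns {counted, nextOffset}. */
        int[] count(int limit, int offset) {
            int counted = Math.min(Math.min(limit, maxScanPerCall), items.size() - offset);
            return new int[] {counted, offset + counted};
        }

        /** Simulated data query with LIMIT, resuming at the given offset. */
        List<String> select(int limit, int offset) {
            int end = Math.min(items.size(), offset + limit);
            return new ArrayList<>(items.subList(Math.min(offset, end), end));
        }
    }

    /** Keeps counting until skipLimit items are passed or the domain is exhausted. */
    static int skipAhead(FakeDomain db, int skipLimit) {
        int accumulated = 0;
        int offset = 0;
        while (accumulated < skipLimit) {
            int[] r = db.count(skipLimit - accumulated, offset);
            if (r[0] == 0) {
                break; // ran out of items before reaching the target page
            }
            accumulated += r[0];
            offset = r[1];
        }
        return offset;
    }

    static List<String> fetchPage(FakeDomain db, int pageSize, int targetPage) {
        if (targetPage < 1) {
            throw new IllegalArgumentException("pages are numbered from 1");
        }
        // Page 1 needs no skip: a plain query with LIMIT pageSize already returns it.
        int offset = (targetPage == 1) ? 0 : skipAhead(db, pageSize * (targetPage - 1));
        return db.select(pageSize, offset);
    }
}
```

With 130 items and at most 40 counted per call, reaching page 5 at 25 per page takes three count calls (40 + 40 + 20) before the data query runs. The extra "ran out of items" check is not in the code above, but without something like it a page number past the end would loop forever.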

Answers

I recall that in one of the brown bag webinars, it was mentioned in passing that the tokens could be resubmitted and you'd get the corresponding result set back.

I haven't tried it, and it is just an idea, but how about building a list of the tokens as you are paging forward? To go back, then, just traverse the list backwards and resubmit the token (and select statement).

I'm stuck at getting the token. Is that the same thing as RequestId?

The PHP SimpleDB library that I'm using doesn't seem to return it: http://sourceforge.net/projects/php-sdb/

Found this documentation http://docs.amazonwebservices.com/AmazonSimpleDB/2009-04-15/DeveloperGuide/index.html?SDB_API_Select.html

which seems to indicate that there is a nextToken element, but in the sample response, it shows RequestId...

Figured it out: our PHP lib was indeed abstracting the NextToken away from where we had access to it. Dug into the library and found it.
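For anyone hitting the same confusion: RequestId and NextToken are separate elements, and only NextToken is used for paging. The sketch below pulls both out of a response body by tag name. The SAMPLE string is hand-written for illustration with made-up values, not a captured response, and SelectResponseReader is a hypothetical helper rather than part of any SimpleDB library.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class SelectResponseReader {
    // Hand-written sample; element values are invented.
    static final String SAMPLE = "<SelectResponse>"
            + "<SelectResult>"
            + "<Item><Name>post-1</Name></Item>"
            + "<NextToken>example-token</NextToken>"
            + "</SelectResult>"
            + "<ResponseMetadata><RequestId>example-request-id</RequestId></ResponseMetadata>"
            + "</SelectResponse>";

    /** Returns the text of the first element with the given tag name, or null if absent. */
    static String firstText(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tag);
            return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse response XML", e);
        }
    }
}
```

A null result for "NextToken" means there is no further page to fetch; the RequestId is present either way and only identifies the call.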

I have created a Java version of the sampling proposed above using the official SimpleDB API; maybe it is useful to somebody.

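// Imports needed by the method below (AWS SDK for Java 1.x):
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

import com.amazonaws.services.simpledb.AmazonSimpleDBClient;
import com.amazonaws.services.simpledb.model.Attribute;
import com.amazonaws.services.simpledb.model.DomainMetadataRequest;
import com.amazonaws.services.simpledb.model.SelectRequest;
import com.amazonaws.services.simpledb.model.SelectResult;

private static Set<String> getSdbAttributes(AmazonSimpleDBClient client,
        String domainName, int sampleSize) {
    if (!client.listDomains().getDomainNames().contains(domainName)) {
        throw new IllegalArgumentException("SimpleDB domain '" + domainName
                + "' not accessible from given client instance");
    }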

    int domainCount = client.domainMetadata(
            new DomainMetadataRequest(domainName)).getItemCount();
    if (domainCount < sampleSize) {
        throw new IllegalArgumentException("SimpleDB domain '" + domainName
                + "' does not have enough entries for accurate sampling.");
    }

    int avgSkipCount = domainCount / sampleSize;
    int processedCount = 0;
    String nextToken = null;
    Set<String> attributeNames = new HashSet<String>();
    Random r = new Random();
    do {
        int nextSkipCount = r.nextInt(avgSkipCount * 2) + 1;

        SelectResult countResponse = client.select(new SelectRequest(
                "select count(*) from `" + domainName + "` limit "
                        + nextSkipCount).withNextToken(nextToken));

        nextToken = countResponse.getNextToken();

        processedCount += Integer.parseInt(countResponse.getItems().get(0)
                .getAttributes().get(0).getValue());

        SelectResult getResponse = client.select(new SelectRequest(
                "select * from `" + domainName + "` limit 1")
                .withNextToken(nextToken));

        nextToken = getResponse.getNextToken();

        processedCount++;

        if (getResponse.getItems().size() > 0) {
            for (Attribute a : getResponse.getItems().get(0)
                    .getAttributes()) {
                attributeNames.add(a.getName());
            }
        }
    } while (domainCount > processedCount);
    return attributeNames;
}



