Serving Images with on-the-fly resize

My company has recently started to have problems with the image handling for our websites.

We have several websites (adult entertainment) that display images like DVD covers, snapshots and similar. We have about 100,000 movies, and for each movie we have an average of 30 snapshots plus covers. Almost every image has an additional version with blurring and an overlay for non-members, which results in about 50 images per movie, or a total of 5 million base images. Each of the images is available in several versions, depending on where it's placed on the page (thumbnail, original, small preview, not-so-small preview, small image in the top list, etc.), which results in more images than I cared to count.

Now I had the idea to use a server to generate the images on the fly, since it became quite clumsy to generate all the different images for all the different pages (different pages sometimes even need different image sizes for basically the same task).

Does anyone know of an image processing server that can scale down images on-the-fly so we only need to provide the original images and the web guys can just request whatever size they need?

Requirements:

  • Very high performance (several thousand users per day)
  • On-the-fly blurring and overlay creation
  • On-the-fly resize (with and without keeping aspect ratio)
  • Can handle millions of images
  • Must be able to read JPG, GIF, PNG and BMP and convert between them

Security is not that much of a concern, as e.g. the unblurred images can already be reached by URL manipulation. More security would be nice, but it's not required, and frankly I stopped caring (after failing to get into my coworkers' heads why, for our small reseller page, it's a bad idea to use http://example.com/view_image.php?filename=/data/images/01020304.jpg to display the images).

We tried PHP scripts to do this but the performance was too slow for this many users.

Thanks in advance for any suggestions you have.

Best answer

I suggest you set up a dedicated web server to handle image resize and serve the final result. I have done something similar, although on a much smaller scale. It basically eliminates the process of checking for the cache.

It works like this:

  • you request the image appending the required size to the filename like http://imageserver/someimage.150x120.jpg
  • if the image exists, it will be returned with no other processing (this is the main point, the cache check is implicit)
  • if the image does not exist, handle the 404 not found via .htaccess and reroute the request to the script that generates the image of the required size
  • in the script specify the list of allowed sizes to avoid attacks like scripts requesting every possible size to shut your server down
  • keep this on a cookieless domain to minimize unnecessary traffic
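
As a rough illustration of the size check in the generating script, here is a minimal Python sketch (the answer itself uses PHP behind .htaccess). The whitelist contents, the allowed characters in the base name and the function name parse_request are assumptions made for the example, not part of the original setup.

```python
import re

# Hypothetical whitelist; the answer only says that such a list should exist.
ALLOWED_SIZES = {(150, 120), (320, 200), (640, 480)}

# Matches names such as "someimage.150x120.jpg" and nothing containing
# path separators or extra dots.
_NAME_RE = re.compile(
    r"(?P<base>[A-Za-z0-9_-]+)\.(?P<w>\d+)x(?P<h>\d+)\.(?P<ext>jpg|gif|png|bmp)"
)


def parse_request(filename):
    """Return (original_name, width, height), or None if the request is rejected."""
    match = _NAME_RE.fullmatch(filename)
    if match is None:
        return None
    size = (int(match["w"]), int(match["h"]))
    if size not in ALLOWED_SIZES:
        return None
    return f"{match['base']}.{match['ext']}", size[0], size[1]
```

A rejected request should simply stay a 404, so that arbitrary sizes never reach the image library.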

EDIT: I don't think that PHP itself would slow the process much, as PHP scripting in this case is reduced to a minimum: the image scaling is done by a built-in library written in C. Whatever you do, you'll have to use a library like this (GD, libmagick or similar), so that's unavoidable. With my system you at least skip the overhead of checking the cache entirely, further reducing PHP interaction. You can implement this on your existing server, so I guess it's a solution well suited to your budget.

Other answers

Based on

We tried PHP scripts to do this but the performance was too slow for this many users.

I'm going to assume you weren't caching the results. I'd recommend caching the resulting images for a day or two (i.e. have your script check whether the thumbnail has already been generated; if so, use it, and if not, generate it on the fly).
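
A minimal sketch of that check-then-generate step, written in Python rather than PHP. The function name get_or_create, the generate callback and the two-day default are assumptions for illustration; the real script would call its image library inside generate.

```python
import os
import tempfile
import time


def get_or_create(cache_path, generate, max_age_seconds=2 * 24 * 3600):
    """Return the cached bytes at cache_path, regenerating them on a miss or when stale."""
    if os.path.exists(cache_path):
        age = time.time() - os.path.getmtime(cache_path)
        if age < max_age_seconds:
            with open(cache_path, "rb") as cached:
                return cached.read()

    data = generate()
    # Write to a temporary file and rename it into place, so a concurrent
    # request never reads a half-written thumbnail.
    directory = os.path.dirname(cache_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    os.replace(tmp_path, cache_path)
    return data
```

Only the first request for a given thumbnail pays the resize cost; later requests within the cache lifetime are a plain file read.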

This would improve performance dramatically, as I'd imagine the main/start page probably gets a lot more hits than random video X, so when viewing the main page no images have to be created because they're cached. When User Y views Movie X, they won't notice the delay as much, since only the images for that one page have to be generated.

For the "On-the-fly resize" aspect: how important is bandwidth to you? I'd assume you're going through so much with movies that a few extra KB in images per request wouldn't do too much harm. If that's the case, you could just use larger images, set the width and height, and let the browser do the scaling for you.

The ImageCache and Image Exact Sizes solutions from the Drupal community might do this and, like most OSS solutions, use the libraries from ImageMagick.

There are some AMIs for Amazon's EC2 service that do image scaling. They use Amazon S3 for image storage (originals and scaled versions) and can feed them through to Amazon's CDN service (CloudFront). Check the EC2 site for what's available.

Another option is Google. Google Docs now supports all file types, so you can upload the images to a Google Docs folder and share the folder for public access. The URLs are kind of long, e.g.

http://lh6.ggpht.com/VMLEHAa3kSHEoRr7AchhQ6HEzHVTn1b7Mf-whpxmPlpdrRfPW216UhYdQy3pzIe4f8Q7PKXN79AD4eRqu1obC7I

Add the =s parameter to scale the image, cool! E.g. for 200 pixels wide:

http://lh6.ggpht.com/VMLEHAa3kSHEoRr7AchhQ6HEzHVTn1b7Mf-whpxmPlpdrRfPW216UhYdQy3pzIe4f8Q7PKXN79AD4eRqu1obC7I=s200

Google only charges USD 5/year for 20 GB. There is a full API for uploading docs etc.

Other answers on SO: How best to resize images off-server

OK, the first problem is that resizing an image in any language takes a little processing time. So how do you support thousands of clients? Well, you cache it so you only have to generate the image once. The next time someone asks for that image, check whether it has already been generated; if it has, just return that. If you have multiple app servers, then you'll want to cache to a central file system to increase your cache-hit ratio and reduce the amount of space you need.

In order to cache properly you need to use a predictable naming convention that takes into account all the different ways that you want your image displayed, i.e. use something like myimage_blurred_320x200.jpg to save a JPEG that has been blurred and resized to 320 width and 200 height, etc.
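
One possible way to build such names, sketched in Python. The function name variant_name and the idea of sorting the effect names are assumptions for the example; any scheme works as long as the same variant always maps to the same file name.

```python
def variant_name(base, ext, width, height, effects=()):
    """Build a cache file name such as 'myimage_blurred_320x200.jpg'."""
    # Sort the effects so the same set of effects always yields the same
    # name, regardless of the order in which the caller lists them.
    parts = [base, *sorted(effects), f"{width}x{height}"]
    return "_".join(parts) + "." + ext
```

For example, variant_name("myimage", "jpg", 320, 200, ["blurred"]) gives myimage_blurred_320x200.jpg, and a plain resize without effects gives myimage_320x200.jpg.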

Another approach is to sit your image server behind a proxy server; that way all the caching logic is done automatically for you and your images are served by a fast, native web server.

You're not going to be able to serve millions of resized images any other way. That's how Google and Bing Maps do it: they pre-generate all the images they need for the world at different preset extents, so they can return pre-generated static images with adequate performance.

If PHP is too slow, you should consider using the 2D graphics libraries from Java or .NET, as they are very rich and can support all your requirements. To get a flavour of the Graphics API, here is a method in .NET that will resize any image to the new width or height specified. If you omit the height or the width, it will resize while maintaining the aspect ratio. Note that the Image can be created from a JPG, GIF, PNG or BMP:

// Requires: using System; using System.Drawing; using System.Drawing.Drawing2D;
//
// Creates a re-sized image from the sourceImage provided.
// -    If either the width or height dimension is not provided then the resized image will use the
//      proportion of the provided dimension to calculate the missing one, which retains the
//      aspect ratio of the sourceImage.
// -    If both the width and height are provided then the resized image will have the dimensions provided
//      with the sides of the excess portions clipped from the center of the image.
public static Image ResizeImage(Image sourceImage, int? newWidth, int? newHeight)
{
    if (newWidth == null && newHeight == null)
        throw new ArgumentException("Either newWidth or newHeight must be provided.");

    // True when only one dimension was given: the whole image is scaled and nothing is clipped.
    bool doNotScale = newWidth == null || newHeight == null;

    if (newWidth == null)
    {
        newWidth = (int)(sourceImage.Width * ((float)newHeight / sourceImage.Height));
    }
    else if (newHeight == null)
    {
        newHeight = (int)(sourceImage.Height * ((float)newWidth) / sourceImage.Width);
    }

    var targetImage = new Bitmap(newWidth.Value, newHeight.Value);

    Rectangle srcRect;
    var desRect = new Rectangle(0, 0, newWidth.Value, newHeight.Value);

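    if (doNotScale)
    {
        // One dimension was derived from the other, so the whole source image is used.
        srcRect = new Rectangle(0, 0, sourceImage.Width, sourceImage.Height);
    }
    else
    {
        // Both dimensions were given: clip the source, centered, to the target aspect ratio
        // so the result is not stretched.
        float targetRatio = (float)newWidth.Value / newHeight.Value;
        float sourceRatio = (float)sourceImage.Width / sourceImage.Height;

        if (sourceRatio > targetRatio)
        {
            // source is too wide: clip the width
            int cropWidth = (int)(sourceImage.Height * targetRatio);
            srcRect = new Rectangle((sourceImage.Width - cropWidth) / 2, 0, cropWidth, sourceImage.Height);
        }
        else
        {
            // source is too tall: clip the height
            int cropHeight = (int)(sourceImage.Width / targetRatio);
            srcRect = new Rectangle(0, (sourceImage.Height - cropHeight) / 2, sourceImage.Width, cropHeight);
        }
    }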

    using (var g = Graphics.FromImage(targetImage))
    {
        g.SmoothingMode = SmoothingMode.HighQuality;
        g.InterpolationMode = InterpolationMode.HighQualityBicubic;

        g.DrawImage(sourceImage, desRect, srcRect, GraphicsUnit.Pixel);
    }

    return targetImage;
}
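
The arithmetic described in the comments of the method (derive the missing dimension, or clip the excess from the center when both are given) can be checked independently of any graphics library. The following Python sketch shows only that arithmetic; the function names target_size and center_crop_box are made up for the example.

```python
def target_size(src_w, src_h, new_w=None, new_h=None):
    """Fill in the missing dimension so that the aspect ratio is preserved."""
    if new_w is None and new_h is None:
        raise ValueError("give at least one of new_w or new_h")
    if new_w is None:
        new_w = int(src_w * new_h / src_h)
    elif new_h is None:
        new_h = int(src_h * new_w / src_w)
    return new_w, new_h


def center_crop_box(src_w, src_h, new_w, new_h):
    """Return (x, y, w, h) of the centered source region with the target's aspect ratio."""
    if src_w * new_h > src_h * new_w:
        # Source is relatively wider than the target: clip left and right.
        crop_w = src_h * new_w // new_h
        return (src_w - crop_w) // 2, 0, crop_w, src_h
    # Source is relatively taller (or the same shape): clip top and bottom.
    crop_h = src_w * new_h // new_w
    return 0, (src_h - crop_h) // 2, src_w, crop_h
```

For a 1000x500 source, asking only for a width of 200 yields a 200x100 result, while asking for 100x100 uses the centered 500x500 square of the source.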

Since this question was asked, a few companies have sprung up to deal with this exact issue. It is not an issue that's isolated to you or your company. Many companies reach the point where they need to look for a more permanent solution for their image processing needs.

Services like imgix serve as a proxy and CDN for image operations like resizing and applying overlays. By manipulating the URL, you can apply different transformations to each image. imgix serves billions of requests per day.

You can also stand up services on your own and put them behind a CDN. Open source projects like imageproxy are good for this. This puts the burden of maintenance on your operations team.

(Disclaimer: I work for imgix.)

What you are looking for is best matched by Thumbor (http://thumbor.readthedocs.org/en/latest/index.html), which is open source, backed by a huge company (meaning it will not disappear tomorrow), and ships with a lot of nice features, like detecting what is important in an image when cropping.

For low cost plus a CDN, I'd suggest combining it with CloudFront and AWS storage, or a comparable solution with a free CDN like Cloudflare. These might not be the best-performing CDN providers, but they still perform better than a single server and offload your image server on the cheap. Plus, it will save you a TON of bandwidth cost.

If each different image is uniquely identifiable by a single URL, then I'd simply use a CDN such as Akamai. Let your PHP script do the job and let Akamai handle the load.

Since this kind of business doesn't usually have budget problems, that'd be the only place I'd look.

Edit: that works only if you find a CDN that will serve this kind of content for you.

This exact problem is now solved by image resize services dedicated to this task. They provide the following features:

  1. Built-in CDN - you need not worry about image distribution
  2. Image resize on the fly - any size needed is available
  3. No storage needed - you just store the base image and all variants are handled by the service
  4. Ecosystem libraries - you can just include the JavaScript and your job is done for all devices and all browsers.

One such service is Gumlet. You can also try an open source alternative, such as an nginx plugin that can resize images on the fly.

(I work for Gumlet.)




