What is the fastest way to draw pixels in AS3?
March 26th, 2008
ActionScript 3 can display a dynamically generated bitmap, in which the RGB values of each pixel are computed by code. Using this capability, it is possible to render 3D scenes, process video and images, and create other effects.
To draw pixels to the screen in AS3, one creates instances of the Bitmap and BitmapData classes. The BitmapData represents an array of pixels, while the Bitmap is a DisplayObject that renders those pixels. In order to draw a dynamically animated bitmap, it may be necessary to write to every pixel in the BitmapData each frame.
There are at least two ways to write to all of the pixels in a BitmapData: one is to call setPixel once for every pixel, and the other is to build an appropriately sized ByteArray, then pass it to BitmapData.setPixels once per frame. So, if you want your code to run as fast as possible, which should you choose? I implemented and benchmarked 8 variations of these two approaches. The code, benchmarks, and results follow.
In each test, an 800×600 Bitmap shows the same simple color animation. Performance is measured in frames per second on my MacBook (2 GHz dual-core Intel processor, 1 GB of RAM) running Safari 3.0.4.
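For readers who want to reproduce this kind of setup, here is a minimal sketch of a harness, not the code used in the tests. The class name, the field names, the `render` placeholder, the 60 fps cap in the SWF metadata, and the interpretation of `t` as elapsed seconds are all assumptions made for illustration.

```actionscript
package {
    import flash.display.Bitmap;
    import flash.display.BitmapData;
    import flash.display.Sprite;
    import flash.events.Event;
    import flash.utils.getTimer;

    // Hypothetical harness; names are illustrative, not taken from the BufferTest sources.
    // The frame rate cap is an assumption: it only needs to exceed the rates being measured.
    [SWF(width="800", height="600", frameRate="60")]
    public class BufferTestSketch extends Sprite {
        private static const STAGE_WIDTH:int = 800;
        private static const STAGE_HEIGHT:int = 600;

        private var outputBitmapData:BitmapData;
        private var frameCount:int = 0;
        private var lastReport:int = 0;

        public function BufferTestSketch() {
            // Opaque bitmap, initially black; the Bitmap displays its pixels.
            outputBitmapData = new BitmapData(STAGE_WIDTH, STAGE_HEIGHT, false, 0x000000);
            addChild(new Bitmap(outputBitmapData));
            lastReport = getTimer();
            addEventListener(Event.ENTER_FRAME, onEnterFrame);
        }

        private function onEnterFrame(event:Event):void {
            var t:Number = getTimer() / 1000; // assumed: t is elapsed time in seconds
            render(t);

            // Report frames per second roughly once a second.
            ++frameCount;
            var now:int = getTimer();
            if (now - lastReport >= 1000) {
                trace("fps: " + (frameCount * 1000 / (now - lastReport)));
                frameCount = 0;
                lastReport = now;
            }
        }

        private function render(t:Number):void {
            // Placeholder: one of the per-pixel main loops from the tests would go here.
        }
    }
}
```

Counting frames over a one-second window keeps the measurement itself cheap, so it should not noticeably distort the numbers.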
Buffer Test 1 - 20 fps (this is the fastest)
Run Test 1 Source Code: BufferTest1.as
The fastest of the 8 implementations calls setPixel once per pixel, uses y in the outer loop and x in the inner loop, and locks the BitmapData before writing to it and unlocks it afterward. The main loop of the code looks like this:
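outputBitmapData.lock(); // defer display updates until unlock()
for(y = 0; y < STAGE_HEIGHT; ++y)
{
    for(x = 0; x < STAGE_WIDTH; ++x)
    {
        // red varies with t and the horizontal position; green and blue are constant
        r = (t*100 + 255 * x / STAGE_WIDTH)%255;
        g = 180;
        b = 180;
        // pack the channels as 0xRRGGBB
        outputBitmapData.setPixel(x, y, (r<<16) + (g<<8) + b);
    }
}
outputBitmapData.unlock(); // objects showing this BitmapData update here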
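The expression passed to setPixel packs three 8-bit channels into one 24-bit value. As a small sketch of that packing, the hypothetical helpers below (`packRGB` and `redOf` are names invented for this example) build the value and read the red channel back out.

```actionscript
package {
    import flash.display.Sprite;

    // Hypothetical demo class; packRGB and redOf are illustrative helpers.
    public class PackRgbSketch extends Sprite {
        public function PackRgbSketch() {
            var color:uint = packRGB(128, 180, 180);
            trace(color.toString(16)); // 80b4b4
            trace(redOf(color));       // 128
        }

        // Pack 8-bit channels into 0xRRGGBB, the layout setPixel expects.
        private function packRGB(r:int, g:int, b:int):uint {
            return ((r & 0xFF) << 16) | ((g & 0xFF) << 8) | (b & 0xFF);
        }

        // Extract the red channel from a packed 0xRRGGBB value.
        private function redOf(color:uint):int {
            return (color >> 16) & 0xFF;
        }
    }
}
```

Masking with `0xFF` is a defensive choice for the sketch; the test code does not need it because its channel values already stay within 0 to 254.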
Before we move on to implementations using ByteArray, let us consider a few variations of the code above.
Buffer Test 6 - 20 fps (loop order doesn’t matter)
Run Test 6 Source Code: BufferTest6.as
This test is the same as test 1, except the loops iterating over x and y are interchanged, so that the outer loop is over x and the inner loop is over y. If this were C++, I would expect this to be slower than test 1, because it writes to non-contiguous memory addresses. However, it doesn’t seem to matter in AS3.
Buffer Test 7 - 9 fps (make sure to lock and unlock before setPixel)
Run Test 7 Source Code: BufferTest7.as
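This test is the same as test 1, except the calls to lock and unlock are removed. It is much slower than test 1 (9 fps versus 20 fps). This is presumably because, while a BitmapData is unlocked, each change to it can trigger an update of the Bitmap objects that reference it; locking defers that update until unlock is called.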
That wraps it up for approaches based on calling setPixel once per pixel. Now, let’s look at some implementations that call setPixels once per frame, with a ByteArray.
Buffer Test 5 - 10 to 11 fps (best ByteArray)
Run Test 5 Source Code: BufferTest5.as
This is one of the best ByteArray-based implementations that I came up with. The main loop for this test looks like this:
buffer.position = 0; // buffer is a ByteArray
for(y = 0; y < BUFFER_HEIGHT; ++y)
{
    for(x = 0; x < BUFFER_WIDTH; ++x)
    {
        r = (t*100 + 255 * x / BUFFER_WIDTH)%255;
        g = 180;
        b = 180;
        // one 32-bit ARGB pixel per write, with alpha fixed at 255
        buffer.writeInt( (255<<24) + (r<<16) + (g<<8) + b );
    }
}
var rect:Rectangle = new Rectangle(0,0,BUFFER_WIDTH,BUFFER_HEIGHT);
buffer.position = 0; // rewind so setPixels reads from the start
outputBitmapData.lock();
outputBitmapData.setPixels(rect, buffer);
outputBitmapData.unlock();
Buffer Test 4 - 10 to 11 fps (writeInt vs writeUnsignedInt doesn’t matter)
Run Test 4 Source Code: BufferTest4.as
This is identical to test 5, except the call to writeInt is replaced with a call to writeUnsignedInt. There isn’t much change in performance here.
Buffer Test 3 - 6 fps (writeByte is SLOW)
Run Test 3 Source Code: BufferTest3.as
In test 5, the call to writeInt effectively writes 4 bytes at once. It is also possible to call writeByte 4 times instead. Note that each 4-byte block in the ByteArray represents a pixel, in ARGB format (yes, A comes first). So this is the sequence of calls to writeByte to output 1 pixel:
buffer.writeByte(255); buffer.writeByte(r); buffer.writeByte(g); buffer.writeByte(b);
This approach is much slower than calling writeInt or writeUnsignedInt. I think this is because of the overhead of the function calls (and maybe because ByteArray is doing some internal index calculations that aren’t necessary).
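To make the ARGB byte layout concrete, here is a small sketch that writes the same pixel both ways and compares the resulting bytes. It relies on ByteArray's default big-endian byte order; the class name and the sample channel values are made up for illustration.

```actionscript
package {
    import flash.display.Sprite;
    import flash.utils.ByteArray;

    // Hypothetical demo class; the channel values are arbitrary samples.
    public class ArgbLayoutSketch extends Sprite {
        public function ArgbLayoutSketch() {
            var r:int = 0x12;
            var g:int = 0xB4;
            var b:int = 0xB4;

            // One 32-bit write. ByteArray defaults to big-endian,
            // so the most significant byte (alpha) is stored first.
            var viaInt:ByteArray = new ByteArray();
            viaInt.writeInt((255 << 24) + (r << 16) + (g << 8) + b);

            // Four 8-bit writes in A, R, G, B order.
            var viaBytes:ByteArray = new ByteArray();
            viaBytes.writeByte(255);
            viaBytes.writeByte(r);
            viaBytes.writeByte(g);
            viaBytes.writeByte(b);

            // Compare the two buffers byte by byte.
            var same:Boolean = (viaInt.length == viaBytes.length);
            for (var i:int = 0; same && i < 4; ++i) {
                same = (viaInt[i] == viaBytes[i]);
            }
            trace("identical layout: " + same);
        }
    }
}
```

If the ByteArray's endian property were switched to little-endian, the single writeInt would store the bytes in B, G, R, A order instead, and the two buffers would no longer match.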
Buffer Test 2 - 9 fps (ByteArray’s operator [] is fast)
Run Test 2 Source Code: BufferTest2.as
Calling readByte and writeByte is one way to get and set the data in a ByteArray, but one can also use the [] operator, which is faster. In this test, my code keeps track of the buffer offset for the current pixel and color channel itself. In the previous tests, that offset was buffer.position, which ByteArray manages internally. Here’s the main loop for this test:
buffer.position = 0;
var offset:int = 0; // byte index into buffer, managed by hand
for(y = 0; y < BUFFER_HEIGHT; ++y)
{
    for(x = 0; x < BUFFER_WIDTH; ++x)
    {
        r = (t*100 + 255 * x / BUFFER_WIDTH)%255;
        g = 180;
        b = 180;
        ++offset; // skip alpha channel (that byte keeps whatever value it already holds)
        buffer[offset++] = r;
        buffer[offset++] = g;
        buffer[offset++] = b;
    }
}
var rect:Rectangle = new Rectangle(0,0,BUFFER_WIDTH,BUFFER_HEIGHT);
buffer.position = 0; // setPixels reads from buffer.position
outputBitmapData.lock();
outputBitmapData.setPixels(rect, buffer);
outputBitmapData.unlock();
Buffer Test 8 - 9 fps (locking doesn’t matter for ByteArray)
Run Test 8 Source Code: BufferTest8.as
This is identical to test 2, except that the calls to lock and unlock around the single call to setPixels are removed. The frame rate does not change, indicating that there isn’t any benefit to locking in this case. I think this is because most of the time is spent getting data into the ByteArray, not copying it into the BitmapData with setPixels.
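One way to check that guess would be to time the two phases separately. The following is only a sketch of such a measurement, with a hypothetical class name and structure, and `t` assumed to be elapsed seconds. It uses the writeInt loop from test 5 to fill the buffer and reports no results of its own.

```actionscript
package {
    import flash.display.Bitmap;
    import flash.display.BitmapData;
    import flash.display.Sprite;
    import flash.events.Event;
    import flash.geom.Rectangle;
    import flash.utils.ByteArray;
    import flash.utils.getTimer;

    // Hypothetical class; names and structure are illustrative, not from the tests.
    public class PhaseTimingSketch extends Sprite {
        private static const BUFFER_WIDTH:int = 800;
        private static const BUFFER_HEIGHT:int = 600;

        private var buffer:ByteArray = new ByteArray();
        private var rect:Rectangle = new Rectangle(0, 0, BUFFER_WIDTH, BUFFER_HEIGHT);
        private var outputBitmapData:BitmapData =
            new BitmapData(BUFFER_WIDTH, BUFFER_HEIGHT, false, 0x000000);

        public function PhaseTimingSketch() {
            addChild(new Bitmap(outputBitmapData));
            addEventListener(Event.ENTER_FRAME, onEnterFrame);
        }

        private function onEnterFrame(event:Event):void {
            var t:Number = getTimer() / 1000; // assumed: t is elapsed time in seconds

            // Phase 1: compute the pixels and write them into the ByteArray.
            var start:int = getTimer();
            fillBuffer(t);
            var afterFill:int = getTimer();

            // Phase 2: copy the ByteArray into the BitmapData.
            buffer.position = 0;
            outputBitmapData.setPixels(rect, buffer);
            var afterCopy:int = getTimer();

            trace("fill: " + (afterFill - start) + " ms, setPixels: "
                + (afterCopy - afterFill) + " ms");
        }

        private function fillBuffer(t:Number):void {
            buffer.position = 0;
            for (var y:int = 0; y < BUFFER_HEIGHT; ++y) {
                for (var x:int = 0; x < BUFFER_WIDTH; ++x) {
                    var r:int = (t * 100 + 255 * x / BUFFER_WIDTH) % 255;
                    buffer.writeInt((255 << 24) + (r << 16) + (180 << 8) + 180);
                }
            }
        }
    }
}
```

Since getTimer has millisecond resolution, averaging the two durations over many frames would give a steadier picture than any single frame.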
Conclusion
The fastest way that I have found to draw pixels in AS3 is to call setPixel once for each pixel, making sure to call lock beforehand and unlock afterward. If you know of a faster way, please tell me about it!
These tests imply a fairly severe upper bound on the performance that can be expected for per-pixel rendering in Flash. At 800×600, each frame writes 480,000 pixels, so 20 fps corresponds to roughly 9.6 million setPixel calls per second. In any real code, frame rates will be lower than those reported here, because there will be more work to do to compute the RGB values for each pixel (in these tests, the values are calculated trivially).
One of the most interesting uses of drawing pixels in AS3 is 3D rendering. As far as I know, there are a few 3D engines in AS3 that use per-pixel drawing, including my real-time ray tracer and Away3D. On the other hand, the mainstream engines Sandy and PaperVision3D do not rasterize into pixel buffers, but instead achieve higher performance by hacking Flash’s capabilities to draw scaled and rotated textures (which are not ideally suited for use in a 3D triangle renderer).
However, there are still other options available to make drawing pixels faster. For example, this thread on FlashKit describes a method of drawing pixels in a palettized color mode. This reduces the memory required per pixel from 4 bytes to 1 byte, but comes at the cost of restricting the number of colors on screen to 256 (plus other effects on top of that). HaXe may provide another possible avenue for accelerating per-pixel rendering in the Flash 9 virtual machine, as it is possible to program in Flash’s bytecode assembly language.