[crossfire] Protocol & compression.

Mark Wedel mwedel at sonic.net
Sun Mar 26 20:00:41 CST 2006


Sebastian Andersson wrote:
> On Sat, Mar 25, 2006 at 10:02:56PM -0800, Mark Wedel wrote:
>>   There are a few likely differences which may or may not affect crossfire in 
>> different ways:
>>
>> 1) Some data crossfire sends is just not compressible.  The PNG data comes to 
>> mind - trying to compress it is at best a waste of time, and at worst, increases 
>> amount of data being transmitted.  The other problem related to this is you'll 
> I've found PNG data to be compressible with zlib. At least if many
> images are sent one after another.
> 
> Perhaps there is a bug in the 1.6 server, but many of the image2
> commands sent contained hundreds of identical octets in a row,
> clearly compressible on their own.

  I just ran a quick test (find . -name "*.png" -exec gzip -vc {} > /dev/null 
\;), and these are somewhat typical results from the entire arch tree:

./light/light_bulb_2.base.111.png:       -0.6%
./light/flint_and_steel.base.111.png:     1.5%
./light/torch_unlit.base.111.png:        73.0%
./light/lantern.base.111.png:     4.2%
./light/torch_lit2.base.111.png:         61.9%
./light/light_bulb_1.base.111.png:       -0.9%
./light/lantern2.base.111.png:    4.2%
./light/torch_lit1.base.111.png:         61.2%
./light/light_bulb_4.base.111.png:       -0.6%
./light/light_bulb_3.base.111.png:       -1.5%
./light/lantern_off.base.111.png:         6.7%
./light/lantern2_off.base.111.png:        5.5%

  Some files are not compressible, some are marginally compressible, and some are 
very compressible.

  It seems that what makes a file compressible, at least from a single quick test 
on torch_unlit.base.111.png, is that the compressible ones are indexed images 
instead of RGB images.  I don't know why there is a difference or whether there is 
even a reason for it (should we convert all the indexed images to RGB?)  AFAIK, 
none of the code in the clients cares what the format is, as they just use the png 
library routines to get the image in RGBA format.
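
  For reference, this is roughly what I mean by the clients not caring - an 
untested sketch, not the actual client code (load_rgba() is a made-up name): 
libpng's transform routines expand any color type, indexed or not, into 8-bit RGBA:

#include <png.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch only: decode a PNG of any color type (indexed, gray, RGB) into an
 * 8-bit RGBA buffer.  Caller frees the returned buffer. */
unsigned char *load_rgba(const char *path, png_uint_32 *w, png_uint_32 *h)
{
    FILE *fp = fopen(path, "rb");
    png_structp png;
    png_infop info;
    png_bytep *rows;
    unsigned char *data;
    png_uint_32 y;
    int color_type, bit_depth;

    if (!fp)
        return NULL;
    png = png_create_read_struct(PNG_LIBPNG_VER_STRING, NULL, NULL, NULL);
    info = png ? png_create_info_struct(png) : NULL;
    if (!png || !info) {
        png_destroy_read_struct(&png, &info, NULL);
        fclose(fp);
        return NULL;
    }
    if (setjmp(png_jmpbuf(png))) {          /* libpng reports errors here */
        png_destroy_read_struct(&png, &info, NULL);
        fclose(fp);
        return NULL;
    }
    png_init_io(png, fp);
    png_read_info(png, info);

    color_type = png_get_color_type(png, info);
    bit_depth  = png_get_bit_depth(png, info);

    /* Expand palette to RGB, low bit depth gray to 8 bit, tRNS to alpha. */
    png_set_expand(png);
    if (bit_depth == 16)
        png_set_strip_16(png);
    if (color_type == PNG_COLOR_TYPE_GRAY ||
        color_type == PNG_COLOR_TYPE_GRAY_ALPHA)
        png_set_gray_to_rgb(png);
    /* Make sure there is always an alpha byte, even for opaque images. */
    png_set_filler(png, 0xff, PNG_FILLER_AFTER);
    png_read_update_info(png, info);

    *w = png_get_image_width(png, info);
    *h = png_get_image_height(png, info);
    data = malloc((size_t)(*w) * (*h) * 4);
    rows = malloc((*h) * sizeof(png_bytep));
    for (y = 0; y < *h; y++)
        rows[y] = data + (size_t)y * (*w) * 4;
    png_read_image(png, rows);
    png_read_end(png, NULL);

    free(rows);
    png_destroy_read_struct(&png, &info, NULL);
    fclose(fp);
    return data;
}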



> If overhead is given, then the system will add that overhead to each
> compressed writing and it will not compress data that is less than
> 16 bytes long. This is not totally correct of course, the real
> compression overhead (both in bytes and in CPU time) would be larger
> if one would flush after each command. Anyway, the result was that
> with an overhead of just 4 bytes (ie two length bytes + "gz"),
> the result was that 447KB was saved, instead of the 485KB when
> everything was compressed (8% less). On the other hand, 15% less CPU
> was used for the whole program, measured with time and just counting
> the "user time" (~170ms vs ~200ms).

  Thanks for the data.

  One note - if we use a 'gz' prefix, it would have to be 3 bytes, as we would 
need a space after the gz for proper client handling.

  However, the length bytes may or may not be needed, if we only compress one 
protocol command at a time.  For example, right now a packet is '<length 
bytes>map2 ....'.  If we do compression, it would be '<length bytes>gz 
<compressed map2 ...>'.  This is because we can use the length bytes in the gz 
packet to know how long the compressed map2 data is.
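
  Something like this is what I have in mind - an untested sketch, not real 
server code (send_raw() and the cutoff value are made up), using zlib's 
compress2() at level 1 and falling back to the normal packet when compression 
doesn't pay off:

#include <zlib.h>
#include <string.h>

#define COMPRESS_CUTOFF 100   /* guess: skip commands too small to benefit */

extern void send_raw(const unsigned char *buf, size_t len);   /* made up */

/* Sketch only: wrap one protocol command as '<length bytes>gz <deflated data>'
 * when that is smaller, otherwise send the usual '<length bytes><command>'.
 * Assumes cmdlen fits in the 2-byte packet length and in out[]. */
void send_packet(const unsigned char *cmd, uLong cmdlen)
{
    unsigned char out[65536];
    uLongf zlen = sizeof(out) - 2 - 3;      /* room for length bytes + "gz " */

    if (cmdlen >= COMPRESS_CUTOFF &&
        compress2(out + 5, &zlen, cmd, cmdlen, 1) == Z_OK &&
        zlen + 3 < cmdlen) {
        /* The packet length bytes already tell the client how long the
         * compressed data is, so no extra length field is needed inside. */
        uLong total = zlen + 3;
        out[0] = (unsigned char)((total >> 8) & 0xff);
        out[1] = (unsigned char)(total & 0xff);
        memcpy(out + 2, "gz ", 3);
        send_raw(out, (size_t)total + 2);
    } else {
        /* Send uncompressed, exactly as the protocol works today. */
        out[0] = (unsigned char)((cmdlen >> 8) & 0xff);
        out[1] = (unsigned char)(cmdlen & 0xff);
        memcpy(out + 2, cmd, cmdlen);
        send_raw(out, (size_t)cmdlen + 2);
    }
}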

  However, if we want to collapse multiple protocol commands into one compressed 
packet, then that doesn't work.

  Also, at least as far as the map1 command goes, it seems like the cutoff for 
compression is in the 50-100 byte range.  OTOH, map1 is more binary than, say, 
drawinfo, so the cutoff may depend on the kind of output.

  In my test run, map1 commands of 50 bytes sometimes compressed shorter, but 
often compressed longer.

  It does become a balancing act - is it worth it to try to compress everything 
at the expense of using more CPU, or do we use a little more bandwidth to save 
CPU (15% less CPU for 8% more bandwidth)?

> 
> (The CPU used was an AMD Athlon(TM) XP 2400+; 2GHz, 256KB cache,
>  ~4k bogomips on linux-2.6.15) 
> 
> I did some other quick measurements with time over the whole file.
> 
> With compression level 9, the run time was twice as long,
> but it only saved 5KB more.
> 
> With compression level 1, the run time was 10% less and the output
> was 27KB larger (6% more) than at level 5.

  Which is once again interesting - it may be worthwhile to compress only at level 1.

> 
>> often need to send a bunch of image data at one time (player changes to new 
>> map), and now there is lots of data you are trying to compress - the time it 
>> takes to compress that much data could become significant.
> 
> So is receiving uncompressed data if you've got a slow link and need
> compression. When measuring lag to modem users, they usually have
> less lag when compression is turned on because of the increased
> bandwidth. With crossfire, the same might be experienced for 64-128kbps
> connections.

  My point was really that when a player changes maps, the client gets sent 40 
images which may not compress any smaller (well, maybe they do), but we still 
spend the time trying to compress that incompressible data.

  All that said, these are my thoughts:

  Adding a 'gz' compress prefix is an easy thing to do right now, and can get us 
pretty good results.

  Adding a compresstart/compressend command and then compressing might be 
worthwhile, but is harder to do given the multiple layers of buffers currently 
used in the server.  I need to actually think about that one more - because of 
those different levels of buffers, we can actually combine those small packets 
together: if the OS buffer to the client is full, there is nothing preventing us 
from compressing all the data we have pending before putting it into the OS buffer.
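
  Roughly what I mean, as an untested sketch (the pending queue and 
compress_pending() are made up, not actual server code): a single zlib stream 
can deflate all the pending commands into one block before it gets written to 
the OS buffer:

#include <zlib.h>
#include <string.h>

struct pending {                   /* made-up stand-in for the output queue */
    const unsigned char *data;
    unsigned int len;
};

/* Sketch only: deflate every queued command into one self-contained zlib
 * block in outbuf.  Returns the number of bytes written, or -1 on error. */
int compress_pending(struct pending *queue, int count,
                     unsigned char *outbuf, unsigned int outsize)
{
    z_stream zs;
    int i, rc, flush;

    memset(&zs, 0, sizeof(zs));
    if (deflateInit(&zs, 1) != Z_OK)       /* level 1 - cheap on CPU */
        return -1;

    zs.next_out = outbuf;
    zs.avail_out = outsize;

    for (i = 0; i < count; i++) {
        zs.next_in = (Bytef *)queue[i].data;
        zs.avail_in = queue[i].len;
        /* Z_FINISH on the last command closes the stream; earlier commands
         * share the same compression context, which is where the win over
         * compressing each small packet separately comes from. */
        flush = (i == count - 1) ? Z_FINISH : Z_NO_FLUSH;
        rc = deflate(&zs, flush);
        if (rc == Z_STREAM_ERROR || zs.avail_in != 0 ||
            (flush == Z_FINISH && rc != Z_STREAM_END)) {
            deflateEnd(&zs);               /* outbuf too small or zlib error */
            return -1;
        }
    }
    rc = (int)(outsize - zs.avail_out);
    deflateEnd(&zs);
    return rc;
}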

  But I suppose the real question is whether the bandwidth saved by compressing 
everything is worth the extra CPU time we spend compressing everything.  Not 
100% sure what the answer is there (and with the other message in the thread 
about compressing/not compressing based on server load, doing that would require 
compress/nocompress transitions, which once again can hurt the compression 
results).



