[crossfire] Protocol & compression.
Mark Wedel
mwedel at sonic.net
Wed Mar 29 00:57:21 CST 2006
tchize wrote:
>>>
>> This point is the real gotcha however - since the server can choose what data
>> to compress, the question then becomes what portion of the commands are being
>> compressed?
>> <snipped most>
>>
> I was unclear. I just meant the server would do something like
> <command x> and its data
> <compressed> <data> (--> decompresses to <command y> and its data)
> <command y> and its data
Ok. That is what I was thinking. What I thought you were talking about was
something like:
compress_start
command1 (compressed)
command2 (compressed)
...
compress_end
type of setup, which then leads to the question of how much of that data should
be compressed.
>> If we are going to do stream compression, I'd say we just compress everything
>> we send to the client, and don't care about the cpu time and/or data that
>> doesn't compress well. That is the simpler approach, and could get done
>> relatively easily.
>>
>>
> I don't agree; we can just have a flag telling 'can compress' when we
> send a command, and the socket writer will decide whether to encapsulate it
> in a compressed block. However, once again, if the client assumes data can be
> either compressed or uncompressed, we can implement selective compression
> later and for now compress everything.
To me, there are really 3 ways to deal with the compression:
1) Have Send_With_Handling compress the packet, if so requested (see the rough
sketch after this list).
Pros: Very simple to do - just a few lines of code to add.
Cons: Each compress is only 1 command, so multiple small commands wouldn't be
combined together and compressed to save more space.
2) Middle approach - have Send_With_Handling queue all data that can be
compressed. The instant it gets called with data that shouldn't be compressed,
it compresses everything it has queued up and sends that, then sends the
uncompressed command. There would also need to be a separate flush_queue() that
is called at the end of each tick to flush this queued data.
Pros: Lets us combine various small packets into a single larger block to
compress, thus getting better results (think a bunch of drawinfos).
Cons: Adds some level of complication, but not a huge amount (we need another
buffer to store the data to compress - this could be made a little simpler with
logic such that if the buffer would overflow, we compress what is there and then
put the new block into the now-empty buffer). Also, if there are a lot of
compressible packets interleaved with non-compressible ones
(image/smooth/image/smooth), you are back to compressing small blocks.
3) Compress everything sent. This should be done at a lower level (when we
actually write to the socket).
Pros: Everything is compressed - interleaved data isn't a problem.
Cons: Still harder to do - if we compress a block of data but can only write
half of it to the socket, we have to put the other half in a 'this data is
already compressed, send it next' buffer (the current logic just moves the
pointer in the ring buffer). We may also end up compressing more data than we
need to - there isn't really a convenient way to turn compression on/off.
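Just to make option 1 concrete, here is a rough sketch of the kind of helper
Send_With_Handling could call. The function name, the buffer handling and the
"zg " marker are made up for illustration; only compress2() and compressBound()
are real zlib:

#include <string.h>
#include <zlib.h>

/* Sketch only - not existing code.  Compress one outgoing packet and
 * prefix it with a short marker so the client knows to inflate it.
 * Assumes out_size >= compressBound(in_len) + 3. */
static int sl_compress_packet(const unsigned char *in, size_t in_len,
                              unsigned char *out, size_t out_size)
{
    uLongf out_len = out_size - 3;      /* leave room for the "zg " marker */

    memcpy(out, "zg ", 3);
    if (compress2(out + 3, &out_len, in, in_len, Z_DEFAULT_COMPRESSION) != Z_OK)
        return -1;                      /* caller just sends the packet as-is */
    if ((size_t)out_len + 3 >= in_len)
        return -1;                      /* didn't shrink - not worth sending */
    return (int)(out_len + 3);          /* bytes to hand to the socket code */
}

If the helper returns -1, Send_With_Handling would just send the original
packet, so nothing changes for data that doesn't compress well.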
Now my personal thought is that point 1 is really easy to do, and doesn't in
any way prevent point 2 from happening or make it harder to do. I'd personally
start with the easiest solution and then move to more complicated solutions
after we see how that does.
For example, if, hypothetically, it gets us 90% of the compression we could get
with method #3, we may say 'that's good enough.' If it only gets us 50%, then
clearly we need to do more work.
> Don't agree; we might be OK with zlib now, but people might give a try
> to a few other algorithms, and in 6 months someone comes up with an algorithm
> that gets 20% better compression with 5% less cpu overhead. That day we
> don't want to have an awful hack in the client/server protocol to handle
> this new algorithm. As you said, it's better to have clean code, and that
> also means a clean protocol :)
But I don't consider using the setup commands to figure out compression a bad
hack. That code is all there already.
The only real question would be whether just a general 'compress' should be used
no matter what the compression method, or whether each method should have its
own prefix (gz, bz2, lzop, whatever). I personally lean toward the second - the
client knows what compression method it will be getting. It also then allows us
to mix compression methods on the same socket. Suppose the client supports every
compression method the server does. It could be that through experimentation,
we know the method gz works best on maps, bz2 best on text, lzop best on
something else. So for best results, the code could perhaps optimize for that
(but then as I type this in, that starts to get pretty complicated).
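Purely as an illustration of what that per-data-type preference might look like
(none of these enums or names exist in the code today - this is just a sketch):

/* Illustrative only - pick a compression method per kind of data,
 * assuming experimentation showed different methods win for different
 * data.  None of these names exist in the current code. */
enum comp_method { COMP_NONE, COMP_ZLIB, COMP_BZIP2, COMP_LZO };
enum data_kind   { DATA_MAP, DATA_TEXT, DATA_OTHER };

static enum comp_method method_for(enum data_kind kind)
{
    switch (kind) {
    case DATA_MAP:  return COMP_ZLIB;   /* 'gz works best on maps' */
    case DATA_TEXT: return COMP_BZIP2;  /* 'bz2 best on text' */
    default:        return COMP_LZO;    /* 'lzop best on something else' */
    }
}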
>
> As currently all commands are words and the data is either binary or
> text, I would suggest using a very small command for the compression header
> (so we don't lose all the gain of compression :/) - a simple character like #
> or @ or & should be enough
>
> That would end up as:
> S -> C: # <compressed data>
Yes, a 1 or 2 character command would be best. Perhaps use z, since z seems to
be the standard letter for compression. Then it could be something like:
zg - zlib (gzip)
zb - bzip2
zo - lzop, etc.
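On the client side, handling that prefix could be as simple as something like
the following. Again only a sketch - dispatch_command() and the scratch buffer
size are assumptions on my part, and only uncompress() is real zlib:

#include <stddef.h>
#include <zlib.h>

void dispatch_command(unsigned char *data, size_t len);  /* assumed existing entry point */

/* Sketch: if a packet starts with the "zg " prefix, inflate the rest and
 * feed the result back into the normal command dispatcher. */
static void handle_packet(unsigned char *data, size_t len)
{
    static unsigned char plain[65536];  /* made-up scratch buffer size */

    if (len > 3 && data[0] == 'z' && data[1] == 'g' && data[2] == ' ') {
        uLongf plain_len = sizeof(plain);

        if (uncompress(plain, &plain_len, data + 3, len - 3) == Z_OK) {
            dispatch_command(plain, plain_len);
            return;
        }
        /* a failed inflate would be a protocol error - fall through for now */
    }
    dispatch_command(data, len);
}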
>
> I think it's good to have this
> C -> S: setup compress zlib,bzip2,rle
> S -> C: setup compress zlib
Having a comma-separated list in the setup command is different from the current
semantics.
For that setup, I think you could do:
C->S: setup compress zlib compress bzip2 compress rle
And since the server processes those in order, it would basically use that as
the preference (the right-most supported one being the one to use). Hmmm, I'll
have to see about doing that with the map commands.
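The 'right-most wins' handling could be roughly like this (a hypothetical
helper, not the actual setup code; only zlib and bzip2 are listed as supported
for the example):

#include <string.h>

/* Sketch of the idea: the setup handler walks the name/value pairs in
 * order, and every supported "compress <method>" pair overwrites the
 * previous choice, so the last one the client listed wins. */
static const char *chosen_method = NULL;

static void setup_compress_pair(const char *value)
{
    static const char *supported[] = { "zlib", "bzip2", NULL };
    int i;

    for (i = 0; supported[i] != NULL; i++) {
        if (strcmp(value, supported[i]) == 0) {
            chosen_method = supported[i];   /* later pairs override earlier ones */
            return;
        }
    }
    /* unknown method: leave the choice alone and report FALSE for this pair */
}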