I don't really have any problem with adding a 'media' protocol command that the server sends to the client. Arguably, the media command should be used for all sounds. My only thought is that it won't get used that much (except perhaps for sound). With the difficulty in just getting png images down for the map size, I don't really see it as likely that arbitrarily sized images that display special things will get done anytime soon. Even less so for movies. The same goes for voice acting. So I guess in summary: IMO, yes, this is a cool feature. I just don't see it getting used for much at all in the near future. That, in my mind, puts it closer to the bottom of the list of things to do.