Wow, that's an impressive analysis, at least of the abstract capability of a wide range of Cortex-M3 chips.
Please allow me to offer a couple of specific examples. These are based on a Cortex-M4 (Freescale Kinetis K20) running at 96 MHz. I made minimal use of the DSP extensions, so you could probably expect similar performance from a Cortex-M3 at about 100 MHz, assuming similar buses and DMA capability.
First, here's an LED panel project I made several months ago, displaying 30 Hz video at low resolution (90x48 pixels) and streaming 44.1 kHz (mono) audio.
http://community.arm.com/groups/embedded/blog/2014/05/23/led-video-panel-at-maker-faire-2014
The video and audio output both use the Kinetis eDMA engine. The uncompressed data is read from an SD card in SPI mode, with the SPI peripheral serviced by polling. Testing showed approximately 50% CPU usage, mostly for the SPI data transfer.
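For a sense of scale, here is a rough back-of-envelope sketch in Python of how much uncompressed data a stream like that involves. The 3 bytes per pixel and 2 bytes per audio sample are assumptions, not figures from the project, and the function name is only illustrative.

```python
def stream_bytes_per_second(width=90, height=48, fps=30,
                            bytes_per_pixel=3,      # assumed 24-bit color
                            audio_rate_hz=44_100,
                            bytes_per_sample=2):    # assumed 16-bit mono
    """Raw bytes per second for uncompressed video plus mono audio."""
    video = width * height * bytes_per_pixel * fps
    audio = audio_rate_hz * bytes_per_sample
    return video + audio


if __name__ == "__main__":
    total = stream_bytes_per_second()
    print(f"{total} bytes/s, about {total * 8 / 1e6:.2f} Mbit/s")
```

Under those assumptions the stream comes to a little under 0.5 MB/s. That is small next to the SPI clock rate, which fits with the observation that the CPU time goes into servicing the transfer rather than into the raw bandwidth.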
A 320x240 SPI-interface display was mentioned. This is another place I've done significant optimization work. Here's a blog article, with a sample video:
http://www.dorkbotpdx.org/blog/paul/display_spi_optimization
As you can see in that article, simplistic software design for these displays results in slow performance, even with a fast CPU.
Above, a theoretical 32.5 Hz refresh rate was mentioned, based on the assumption of 50 MHz SPI clock. Since publishing that article, I've talked with others attempting similar optimization. Testing has shown many of those displays do not work reliably with 42 MHz clock speed. Reliable specs are hard to find, but some datasheets spec a maximum SPI clock of only 10 MHz. I have personally done a LOT of testing with 24 MHz SPI clock, but always while running the display at 3.3V (well above its minimum 2.2 or 2.4V), with good results.
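To put those clock speeds in terms of frame rate, here is a minimal Python sketch of the best-case full-screen refresh rate for a 320x240 SPI display. It assumes 16 bits per pixel and ignores all command and addressing overhead, so real numbers will be lower. The function name and defaults are illustrative only.

```python
def max_refresh_hz(spi_clock_hz, width=320, height=240, bits_per_pixel=16):
    """Upper bound on full-frame updates per second over SPI.

    Assumes one bit per clock, back-to-back transfers, no command overhead.
    """
    bits_per_frame = width * height * bits_per_pixel
    return spi_clock_hz / bits_per_frame


if __name__ == "__main__":
    for clock_mhz in (10, 24):
        hz = max_refresh_hz(clock_mhz * 1_000_000)
        print(f"{clock_mhz} MHz SPI: at most {hz:.1f} full frames/s")
```

With these assumptions, 24 MHz tops out at roughly 19.5 full frames per second and 10 MHz at roughly 8, which is why updating only the changed region of the screen matters so much on these displays.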
If your video is compressed with any DCT-based algorithm, odds are slim that a Cortex-M3 will be capable of decoding the data in real time at any significant resolution. Even just moving the bytes from an SPI port to RAM can take a lot of CPU time if DMA is not leveraged efficiently.
Cortex-M7, when chips appear in the 350 to 400 MHz range, might open up more possibilities. Maybe? As you can see, I actually do quite a bit of work on optimizing open source middleware. If anyone from ARM actually reads this message, please let me know when an updated ARMv7-M architecture reference manual is published.