A big part of enjoying the vintage computers in my collection is, of course, reliving fond memories of olde by using the systems as I did so long ago: playing the favorites of my gaming past and using applications that were part of my daily routine once upon a time, all while enjoying the physical presence of the machine. But, every so often, modern hardware and software efforts targeting these venerable platforms come together to deliver something today that would’ve been rather hard to believe if seen back in their long-ago heyday. Kris Kennaway’s recent effort in full-motion video playback on an unaccelerated Apple IIe is one such example.
Kris has developed a video playback system that allows full-motion video, along with digitized audio, to be played back on a 128K enhanced Apple IIe fitted with a CFFA3000 floppy image / large volume interface. This is achieved with the system’s standard 1.02MHz 65C02 processor — no CPU accelerator required. What’s more, the video is rendered in the 16-color Double High-Resolution graphics mode offered by late-model Apple IIs: 560×192 monochrome pixels rendering a 140×192 color image, with a memory layout whose complexity fairly boggles the mind.
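To give a sense of just how tangled that layout is, here’s a quick Python sketch of my own (illustrative only, not code from Kris’ player) showing how a single byte of Double Hi-Res screen memory is located. Rows are scattered through the $2000 page in the classic Apple II 8/8/3 interleave, and the 80 byte-columns of each row alternate between the auxiliary and main 64K banks:

```python
# Locating a byte of Double Hi-Res screen memory (illustrative sketch).

def dhgr_row_base(y: int) -> int:
    """Base address of row y (0-191) in the $2000 hi-res page.
    Rows are interleaved rather than sequential: consecutive screen
    rows sit $400 bytes apart in memory."""
    return 0x2000 + (y % 8) * 0x400 + ((y // 8) % 8) * 0x80 + (y // 64) * 0x28

def dhgr_byte_address(x_byte: int, y: int) -> tuple:
    """Bank and address holding byte-column x_byte (0-79) of row y.
    Even byte-columns live in the AUX bank, odd ones in MAIN, so
    horizontally adjacent bytes sit in different 64K banks."""
    bank = "AUX" if x_byte % 2 == 0 else "MAIN"
    return bank, dhgr_row_base(y) + x_byte // 2

# The first few byte-columns of row 1 (which starts at $2400, not $2028):
for x in range(4):
    bank, addr = dhgr_byte_address(x, 1)
    print(x, bank, hex(addr))
```

Neither horizontal nor vertical neighbors are adjacent in memory, and every other byte requires a bank switch, which goes some way toward explaining the mind-boggling remark above.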
The video above shows the famous Apple 1984 Macintosh commercial being played on my 128K enhanced Apple IIe equipped with a VidHD video interface (providing the HDMI output to the modern LCD standing on the IIe’s head) and the requisite CFFA3000 fitted with a USB stick and a CompactFlash card from which the video is being read. (The VidHD interface is not required by Kris’ player.)
For this full-motion video playback system, Kris was able to partially reuse the core of his previous work: the ][-Vision Ethernet video streamer from 2019 [see: demo video, KansasFest presentation], which used routines from Bill Buckels’ Bmp2DHR to encode the video stream. He goes into great detail on the process of converting the source video frames into Double High-Res images in his description on the GitHub page of ][-pix, the image conversion utility used in this project.
In a Reddit comment thread, Kris gives an overview of the operation of his playback system:
Basically, I reverse engineered the CFFA firmware to understand how it worked internally, and discovered that it reads data by first materializing it into an internal 512-byte buffer in the card’s address space, and then (normally) copying it out from there a byte at a time to the caller’s requested buffer.
I was able to reverse engineer the protocol used to talk between the 6502 firmware and its onboard hardware and cut it down to the minimal sequence needed to cause it to read a block into this internal buffer.
So, I can read a block every 600 cycles, or about 850KB/sec, but if I copy data out of it (even if directly to screen memory) then I lose an order of magnitude in throughput.
But what if, instead of copying data out, I *execute* the contents of the buffer, i.e. fill it with up to 512 bytes of 6502 code, which updates screen memory (and manages the speaker) as a result of being executed. This is less dense (3 bytes for a single memory store), but…
It turns out that 512 bytes of straight-line 6502 code typically takes about 600 cycles to execute!
So with careful I/O management, we have just enough time to execute the block while reading the next one at close to the limit of the underlying hardware.
So then it’s “just” a matter of writing a 6502 program, in 512-byte chunks, which causes a video to be rendered when the program is executed. These chunks are chained together by the I/O code.
Fortunately, I pretty much had one of these lying around from that older project, it just needed a new output representation :-)
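To put rough numbers behind that trade-off, here’s a back-of-the-envelope sketch of my own (not Kris’ code) using standard 65C02 opcode sizes and timings. The LDA/STA pairing is my illustrative choice; the bare 3-byte STA is the single store Kris cites:

```python
# Why executing the CFFA3000 buffer beats copying it (illustrative numbers).

CPU_HZ = 1_020_500           # Apple IIe clock, ~1.02MHz
BLOCK_BYTES = 512            # CFFA3000 internal buffer size
CYCLES_PER_BLOCK_READ = 600  # per Kris' measurement

# Raw read throughput if the buffer is never copied out:
blocks_per_sec = CPU_HZ / CYCLES_PER_BLOCK_READ
print(f"{blocks_per_sec * BLOCK_BYTES / 1024:.0f} KB/sec")   # ~850 KB/sec

def emit_store(value: int, addr: int) -> bytes:
    """One screen update as straight-line 6502 code: LDA #value
    (2 bytes, 2 cycles) then STA addr (3 bytes, 4 cycles). When the
    accumulator already holds the right value, the bare 3-byte STA
    is the "3 bytes for a single memory store" mentioned above."""
    return bytes([0xA9, value, 0x8D, addr & 0xFF, addr >> 8])

stores_per_block = BLOCK_BYTES // 5       # ~102 LDA/STA pairs per block
cycles_to_execute = stores_per_block * 6  # ~612 cycles
print(stores_per_block, "stores in about", cycles_to_execute, "cycles")
```

That last figure is the whole trick: a block’s worth of straight-line stores takes roughly the same ~600 cycles to execute as the next block takes to read, so rendering and I/O overlap with almost no waste.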
Among the information Kris provided me in our recent discussion of the system is a detailed account of how he handles audio playback through the II’s meager speaker output:
To encode the video, there are two 6502 opcode sequences that are interleaved together. The audio uses pulse width modulation with a 50 cycle period, which effectively gives 5.4 bits of audio quality at a 20.4KHz sample rate. This basically means that you have to toggle the speaker every (N, 50-N) cycles, where N is an audio sample. So that boils down to toggling the speaker address at precise cycle intervals with a lot of padding in between. We can use the padding to store to video memory.

The video stream is produced by computing bytes that have changed between image frames and ranking them according to which byte stores would make the most perceptual difference to the image. So we interleave these video stores in the gaps between speaker toggles until we either run out of time and need to move on to the next video frame, or run out of work to do (and can twiddle our thumbs until the next frame).
There’s additional fiddling needed to also flip AUX/MAIN memory banks every so often (because DHGR has an insane memory layout), to pad/align to the 512-byte block boundary, and to enqueue additional audio samples that can keep playing while the next block is mapped. That last part is done by pushing the next 5 audio samples onto the stack, splitting the I/O code into 50 or 100-cycle segments, and having 44 variants of each segment that do the same thing while interleaved with a (N, 50-N) cycle speaker duty cycle.
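A small sketch helped me follow the audio scheme, so here’s one of my own (illustrative only; the usable duty-cycle range and the change-ranking metric below are my assumptions, not Kris’ actual encoder):

```python
import math

CPU_HZ = 1_020_500
PERIOD = 50                    # CPU cycles per PWM period
SAMPLE_RATE = CPU_HZ / PERIOD  # ~20.4KHz, matching the figure above

# Assumption: the speaker-toggle code eats a few cycles at each end of
# the period, so not every N in 0..50 is reachable. A 44-level range
# lines up with the "44 variants" Kris mentions: log2(44) ~ 5.46 bits.
N_MIN, N_MAX = 4, 47           # hypothetical reachable range, 44 levels
print(f"{math.log2(N_MAX - N_MIN + 1):.2f} bits @ {SAMPLE_RATE:.0f} Hz")

def sample_to_duty(sample: float) -> int:
    """Map a PCM sample in [-1.0, 1.0] to a toggle offset N: the speaker
    flips N cycles into the 50-cycle period and again at its end, so its
    average position tracks N / PERIOD."""
    return N_MIN + round((sample + 1.0) / 2.0 * (N_MAX - N_MIN))

def ranked_changes(prev: bytes, cur: bytes) -> list:
    """Byte offsets that differ between two frames, largest numeric change
    first: a crude stand-in for ][-pix's perceptual ranking. These are the
    video stores slotted into the padding between speaker toggles."""
    diffs = [(abs(c - p), i) for i, (p, c) in enumerate(zip(prev, cur)) if c != p]
    return [i for _, i in sorted(diffs, reverse=True)]
```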
Watching these videos play back, complete with audio, on an unaccelerated Apple IIe — and at such quality given the limitations of the hardware involved — is one of the biggest vintage computing “wow moments” I’ve experienced in years. And, since I began putting this post together, Kris has sent several new videos, all of which are amazing to see in action on the IIe.
Kris indicates that more information on the project is forthcoming, and when he posts a public download link with associated videos, this post will be updated.
Update [05/10/2021]: Kris informs me that this effort will be merged into the aforementioned ][-Vision system, allowing it to stream video from either an Uthernet II (Ethernet card) or a CFFA3000.
This is amazing. Simply amazing. No other words!