Show HN: I ran a language model on a PS2 (github.com/xaskasdf)
5 points by xaskasdf 3 hours ago | 2 comments
The Emotion Engine has 32 MB of RAM total, so the trick is streaming weights from CD-ROM one matrix at a time during the forward pass. Only activations, KV cache and embeddings live in RAM. This means models bigger than the RAM can still run; they just read more from disc.
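
To make the streaming idea concrete, here is a minimal sketch in Python rather than PS2 code. It assumes a made-up on-disc layout (a little-endian rows/cols header followed by float32 values) and uses an in-memory buffer to stand in for the CD-ROM; none of these names or the layout come from the project itself. It only shows that each matrix is read, applied and discarded before the next one is loaded.

```python
import io
import struct

def write_matrix(f, rows, cols, values):
    """Append one row-major float32 matrix with a (rows, cols) header (hypothetical layout)."""
    f.write(struct.pack("<II", rows, cols))
    f.write(struct.pack(f"<{rows * cols}f", *values))

def read_matrix(f):
    """Read the next matrix from the stream; return None at end of stream."""
    header = f.read(8)
    if len(header) < 8:
        return None
    rows, cols = struct.unpack("<II", header)
    data = struct.unpack(f"<{rows * cols}f", f.read(4 * rows * cols))
    return rows, cols, data

def matvec(rows, cols, data, x):
    """Multiply a row-major matrix by the vector x."""
    return [sum(data[r * cols + c] * x[c] for c in range(cols)) for r in range(rows)]

def streamed_forward(f, x):
    """Apply each stored matrix in order, holding only one in memory at a time."""
    peak = 0  # largest number of weights resident at once
    while True:
        m = read_matrix(f)
        if m is None:
            break
        rows, cols, data = m
        peak = max(peak, rows * cols)
        x = matvec(rows, cols, data, x)
        # 'data' is rebound on the next iteration, so the old matrix can be freed
    return x, peak

disc = io.BytesIO()                            # stands in for the CD-ROM
write_matrix(disc, 2, 3, [1, 0, 0, 0, 1, 0])   # keeps the first two of three inputs
write_matrix(disc, 2, 2, [2, 0, 0, 2])         # doubles both values
disc.seek(0)
out, peak_weights = streamed_forward(disc, [1.0, 2.0, 3.0])
print(out, peak_weights)
```

The working set is bounded by the largest single matrix plus the activations, not by the total model size, which is why a bigger model costs more disc reads rather than more RAM.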

Had to build a custom quantized format (PSNT), hack endianness, write a tokenizer pipeline, and most of the PS2 SDK from scratch (releasing that separately). The model itself is also custom — a 10M param Llama-style architecture I trained specifically for this.

And it works. On real hardware.



Love this project. The CD streaming trick is such a smart constraint hack, and honestly the best part is you trained the model for the hardware instead of forcing a desktop recipe onto PS2.

Curious about 2 things if you can share:

1. What's your per-token latency on real hardware?
2. How much quality loss came from PSNT quantization vs fp16 baseline?

Either way this is peak hacker energy, shipping on actual hardware makes it 10x cooler.


It didn't have any quality loss, since PSNT quantization is mainly there to convert the model to fit the console's constraints (you can convert any model you want, even though I trained a model for this hardware). It's q8 quantization, so quality loss is negligible at these sizes. As for speed, I will fix the tok/sec counter, since right now it always shows 0.
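
For readers unfamiliar with q8, here is a minimal sketch of symmetric 8-bit quantization in Python. It assumes a single float scale per block of weights; the actual PSNT layout is not described in this thread, so the function names and block structure here are illustrative only.

```python
def quantize_q8(values):
    """Symmetric 8-bit quantization: one float scale plus integer codes in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, codes

def dequantize_q8(scale, codes):
    """Recover approximate float weights from the scale and the integer codes."""
    return [scale * q for q in codes]

weights = [0.12, -0.5, 0.33, 0.0, 0.49, -0.07]   # made-up example values
scale, codes = quantize_q8(weights)
restored = dequantize_q8(scale, codes)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, max_error)
```

With this scheme the rounding error per weight is at most half a quantization step (scale / 2), which is consistent with q8 being close to lossless in practice.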

PS: Thank you! And I forgot to mention that PSNT also supports BitNet models, though they work like crap.