LLM on Compute-In-Memory (CIM)

Making LLM Inference Feasible on Compute-in-Memory (CIM) hardware

  • Period: Jul 2025 - Present
  • Tools: PyTorch, Triton
  • GitHub: N/A
  • URL: N/A

  • Make LLM inference feasible on compute-in-memory (CIM) hardware, which suffers from limited write endurance and supports only integer arithmetic.
  • Ongoing project in the Pallas Lab at UC Berkeley.
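
Because the hardware supports only integer arithmetic, running an LLM on it implies quantizing floating-point weights and activations to integers before the in-memory array performs its multiply-accumulates. The sketch below is a minimal, hypothetical illustration of that idea (symmetric int8 quantization of a dot product); it is not the project's actual implementation, and all names are illustrative.

```python
# Illustrative only: symmetric int8 quantization, as one way an
# integer-only CIM array could evaluate a dot product. Not the
# project's actual method; names and parameters are assumptions.

def quantize_sym(values, num_bits=8):
    """Map floats to signed integers sharing one scale (symmetric)."""
    qmax = 2 ** (num_bits - 1) - 1                # e.g. 127 for int8
    scale = max(abs(v) for v in values) / qmax or 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale

def int_dot(q_w, q_x, scale_w, scale_x):
    """Integer accumulation (what the CIM array would do), then one
    floating-point rescale at the end to dequantize the result."""
    acc = sum(w * x for w, x in zip(q_w, q_x))    # pure integer math
    return acc * scale_w * scale_x

weights = [0.12, -0.5, 0.33, 0.07]
activations = [1.0, 0.25, -0.75, 0.6]

q_w, s_w = quantize_sym(weights)
q_x, s_x = quantize_sym(activations)

approx = int_dot(q_w, q_x, s_w, s_x)
exact = sum(w * x for w, x in zip(weights, activations))
```

Here `approx` tracks `exact` to within the quantization error, while the inner loop uses only integer multiplies and adds, which is the operation an integer-only CIM crossbar can provide.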