Ultra-fast LLM inference on custom hardware. Run Llama and Mixtral at 10x the speed of GPU inference.