== Kimi ==

In October 2023, Moonshot launched its first AI
chatbot, Kimi, whose name comes from Yang's English nickname. It had emerged as the closest rival to
Baidu's
Ernie Bot. In March 2024, Moonshot claimed Kimi could handle 2 million Chinese characters in a single prompt, a significant upgrade from the previous version's limit of 200,000. The surge in users caused a two-day outage beginning on 21 March, for which Moonshot issued an apology. As of August 2024, Kimi ranked third in monthly active users according to aicpb.com. On 20 January 2025, Kimi K1.5 was released. Moonshot claimed it matched the performance of
OpenAI o1 in mathematics, coding, and multimodal reasoning. In June 2025, Kimi dropped to seventh place in monthly active users. In July 2025, Moonshot released Kimi K2, which uses a mixture-of-experts (MoE) architecture in which 32 billion parameters are active during inference. K2 was trained on 15.5 trillion tokens of data and is released under a modified MIT license. Kimi K2 is an open-source LLM, meaning it can be downloaded and built upon by users. The day after its release, Kimi K2 had the most downloads on the platform, an increase in popularity from previous months. The release of Kimi K2 follows a trend among Chinese companies of open-sourcing their AI models, likely in an effort to counter US restrictions on China's tech growth. In China, Kimi has six tiers of plans, ranging from 5.2 yuan for four days to 399 yuan for a year of priority use.
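The idea behind K2's sparse MoE design, where only a fraction of the model's parameters are evaluated per token, can be sketched with a minimal top-k routing layer. The sizes below (8 experts, 2 active, 16-dimensional vectors) are illustrative assumptions for the sketch, not K2's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS = 8   # illustrative; production MoE models use far more experts
TOP_K = 2       # experts activated per token (illustrative)
D = 16          # hidden dimension (illustrative)

# Each "expert" is a tiny feed-forward weight matrix; the router scores them.
experts = [rng.normal(size=(D, D)) for _ in range(N_EXPERTS)]
router = rng.normal(size=(D, N_EXPERTS))

def moe_forward(x):
    """Route token vector x to its TOP_K experts and mix their outputs."""
    logits = x @ router
    topk = np.argsort(logits)[-TOP_K:]   # indices of the highest-scoring experts
    gates = np.exp(logits[topk])
    gates /= gates.sum()                 # softmax over the selected experts only
    # Only TOP_K expert matrices are touched per token: this sparsity is why
    # "active" parameters are a small fraction of total parameters.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, topk))

token = rng.normal(size=D)
out = moe_forward(token)
print(out.shape)  # (16,)
```

Here only 2 of 8 expert matrices are evaluated for each token, so most expert parameters sit idle on any given forward pass, mirroring how K2 activates 32 billion parameters out of a much larger total.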
== Mooncake serving platform ==

Mooncake is the platform that serves Moonshot's Kimi chatbot and processes 100 billion tokens daily. Moonshot was awarded the Erik Riedel Best Paper Award at the USENIX FAST conference for the paper detailing Mooncake's architecture. Separately, Moonshot's researchers report that their Muon optimizer improves computational efficiency by a factor of two compared to the standard AdamW optimizer when training large models. For Kimi K1.5, the researchers note that long-context scaling and improved policy optimization methods were key, without relying on complex techniques such as Monte Carlo tree search, value functions, or process reward models.

== See also ==