
What is Kimi? Kimi K1.5 vs Kimi K2

Following the release of DeepSeek's R1 and Alibaba's Qwen, other advanced models have emerged from China, one of the most notable being the Kimi series by Moonshot AI. It began with Kimi K1.5, a multimodal model that incorporates advanced reinforcement learning (RL) and is claimed to deliver state-of-the-art performance in math, code, vision, and long-context reasoning tasks, comparable to OpenAI's well-known o1 but without restrictions or subscriptions. Now, with the launch of Kimi K2, Moonshot AI has taken things further, releasing a powerful open-source model focused on agentic tasks like automation and orchestration.

In this blog, we will discuss what Kimi is, compare Kimi K1.5 and Kimi K2, review their features and use cases, and ask: can Kimi challenge its US counterparts and become a major force in the global AI race?

What Is Kimi and What Are Its Origins?

Kimi is an advanced artificial intelligence model developed by the Beijing-based startup Moonshot AI. The company was founded in early 2023 by three young researchers, Yang Zhilin, Zhou Xinyu, and Wu Yuxin, who set out to build something ambitious. Their startup quickly gained attention for its focus on creating powerful language models that can process extensive textual inputs. Just months after its founding, the team raised $60 million in seed funding, valuing the company at around $300 million.

Their breakthrough came with the launch of Kimi, a chatbot introduced in late 2023. Kimi could read and respond to extremely long texts in Chinese, making it stand out from other AI assistants. Initially, the chatbot supported up to 200,000 Chinese characters in a single prompt; by early 2024, it could handle over two million characters in a single conversation, an impressive leap in context understanding. In early 2024, the company secured $1 billion in funding, led by Alibaba, which pushed its valuation to $2.5 billion. Later that year, investments from Tencent and others added $300 million, positioning Moonshot as one of the fastest-growing AI companies in China, with a valuation of approximately $3.3 billion.

In January 2025, Moonshot released Kimi K1.5, a more advanced version of the chatbot, designed to deliver improved performance in multiturn conversations, coding assistance, and long-context understanding, rivaling the world's best AI models. It can process and reason across multiple data types, including text, visual data like images and videos, and even code, making it a versatile tool for tasks requiring complex reasoning, such as mathematics, coding, and multimodal data analysis. It is free to use with no usage limits, and is offered in two versions: one for detailed reasoning (long-CoT) and one for short answers (short-CoT).

On July 11, 2025, Moonshot released Kimi K2, a state-of-the-art Mixture-of-Experts (MoE) model with 1 trillion total parameters and 32 billion activated parameters, using 384 experts with 8 selected during inference. It focuses on agentic tasks, coding, and tool integration. The model has been released in two versions: Kimi-K2-Base for researchers and fine-tuning, and Kimi-K2-Instruct for general-purpose chat and agentic tasks. Kimi K2 is also open-source, distributed under a Modified MIT License, with model weights available on Hugging Face for researchers and developers.

How to Use Kimi?

Users can interact with Kimi through Kimi Chat, which can be accessed via the Kimi Chat website or the Kimi mobile app (for iOS and Android). There is also a QR code on the Kimi website that users can scan to install the app. The chat interface is clean and functional, quite similar to OpenAI's. From the drop-down menu, you can select Kimi K1.5 or K2, and there is an option to enable "Long Thinking" for deeper analysis. Both models are also available via API through the Kimi OpenPlatform for integration into applications.
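To make the API route concrete, here is a minimal sketch of a chat request built with only the Python standard library, assuming an OpenAI-compatible endpoint. The base URL and model identifier below are illustrative placeholders; check the Kimi OpenPlatform documentation for the exact values available to your account.

```python
# Sketch of a chat-completion request to an assumed OpenAI-compatible endpoint.
# URL, model name, and API key are placeholders, not confirmed values.
import json
import urllib.request

API_KEY = "YOUR_MOONSHOT_API_KEY"          # issued on the Kimi OpenPlatform
BASE_URL = "https://api.moonshot.ai/v1"    # assumed endpoint; verify in the docs

payload = {
    "model": "kimi-k2-instruct",           # hypothetical model identifier
    "messages": [
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Summarize what a Mixture-of-Experts model is."},
    ],
    "temperature": 0.6,
}

request = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
)
# Uncomment to actually send the request (requires a valid key):
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same payload shape works with any OpenAI-compatible client library; only the base URL and model name change.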

On June 20, 2025, Moonshot AI also presented Kimi-Researcher, a new autonomous reasoning agent built on Moonshot's Kimi k-series models. It is trained end-to-end with reinforcement learning and can autonomously search, reason, and use tools at scale, marking a significant step toward truly intelligent AI researchers.

How Does Kimi K1.5 Work?

Moonshot AI used innovative training methods to improve the model's performance. Unlike traditional LLMs that rely on static datasets and next-token prediction, Kimi K1.5 uses reinforcement learning to learn dynamically through exploration and feedback. It generates solutions via a sequence of reasoning steps (Chain of Thought, or CoT) and refines them based on a reward model evaluating the correctness of responses. The team focused on the end result, which gave the model more freedom to find its own paths to the correct answer. The model employs Online Mirror Descent for policy optimization, ensuring stable training and efficient convergence. This approach helps the model learn and improve step by step, especially in changing or uncertain environments. It also avoids complex techniques like Monte Carlo tree search, focusing instead on autoregressive prediction and adaptive sampling to refine reasoning strategies.
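To give a feel for the optimization idea, here is a toy one-step sketch of online mirror descent over a probability distribution (here, a policy over three candidate reasoning strategies). With the common negative-entropy mirror map, the update reduces to a multiplicative-weights step; the rewards and learning rate are invented for illustration and have nothing to do with K1.5's actual training configuration.

```python
# Toy online-mirror-descent step: with a negative-entropy mirror map, the
# update is multiplicative, p_i <- p_i * exp(lr * r_i), then renormalize.
# Rewards and learning rate below are illustrative, not real training values.
import math

def mirror_descent_step(policy, rewards, lr=0.5):
    unnorm = [p * math.exp(lr * r) for p, r in zip(policy, rewards)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

policy = [1 / 3, 1 / 3, 1 / 3]   # start uniform over three strategies
rewards = [1.0, 0.2, -0.5]       # strategy 0 earned the highest reward
policy = mirror_descent_step(policy, rewards)
print([round(p, 3) for p in policy])  # probability mass shifts toward strategy 0
```

The appeal of this family of updates is stability: each step changes the distribution smoothly, which matches the "stable training and efficient convergence" claim above.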

Model fusion and shortest rejection sampling were also used during development, allowing Kimi K1.5 to choose the shortest and most accurate answers. In addition, the development team implemented penalties for overly long answers to prevent the model from taking unnecessary steps and wasting resources. This length penalty also discourages verbose responses, ensuring concise yet accurate outputs. To optimize computational efficiency, the model reuses prior reasoning steps (partial rollouts) instead of regenerating entire trajectories, reducing training costs. Increasing the context length to 128K tokens improves decision accuracy, allowing the model to perform more complex reasoning. Its ability to process extended sequences (e.g., entire documents or complex problem sets) while maintaining long-term dependencies is crucial for tasks that require deep reasoning, such as solving multi-step math problems or analyzing lengthy code.
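The sampling and penalty ideas above can be sketched in a few lines. This is a toy illustration, not Moonshot's implementation: the correctness checker and penalty weight are made-up stand-ins.

```python
# Toy sketch of shortest rejection sampling plus a length penalty.
# The correctness check and alpha value are illustrative assumptions.
def shortest_rejection_sampling(candidates, is_correct):
    """Keep only candidates that pass the check; return the shortest one."""
    correct = [c for c in candidates if is_correct(c)]
    return min(correct, key=len) if correct else None

def length_penalized_reward(answer, correct, alpha=0.01):
    """Base reward for correctness, minus a small penalty per word."""
    base = 1.0 if correct else 0.0
    return base - alpha * len(answer.split())

# Example: three sampled answers to the same question.
samples = [
    "The answer is 42 because 6 times 7 equals 42.",
    "42",
    "After a very long derivation, the answer is 41.",
]
best = shortest_rejection_sampling(samples, lambda s: "42" in s)
print(best)  # the shortest correct sample: "42"
```

The reward function shows why verbosity is discouraged: two equally correct answers differ in reward only through their length.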

Kimi K1.5 Key Features and Capabilities

Long Context Window: Kimi K1.5 supports a 128K token context window, allowing it to handle large amounts of information equivalent to processing entire books or massive PDFs in a single prompt for coherent, context-aware responses. Such a long context window places Kimi K1.5 among the top LLMs with ultra-long context support, making it ideal for tasks like:

  • Reading and analyzing legal documents

  • Summarizing lengthy research papers

  • Providing answers based on entire codebases or product manuals

Multimodal Capabilities: In contrast to its Chinese competitor DeepSeek-R1, Kimi K1.5 can process not only text but also vision data (captioning, image-text interleaving, optical character recognition (OCR), QA datasets) and even code, all simultaneously. This enables tasks like image-to-code conversion, visual reasoning, and mixed-format data analysis, such as analyzing an X-ray while interpreting patient symptoms.

Multilingual Understanding: Kimi K1.5 is trained on multilingual data and is capable of handling tasks in Chinese and other major languages, making it accessible to a broad user base. Its fluency in Chinese, in particular, gives it an edge in the Asian market. The company continues to improve language support.

Competitive Coding Skills: Kimi K1.5 can generate code in various programming languages, explain code snippets, debug logic, and assist with algorithm design. Moonshot AI has been positioning it as a helpful tool for developers and engineers.

Free and Unlimited Access: The model is available without restrictions on Moonshot AI's platform, supporting real-time web searches and analysis of up to 50 files per day (e.g., PDFs, docs, images). A free API with limits is available via platform.moonshot.ai. Unlike Kimi K2, which is open-source and text-only, K1.5 is proprietary but excels in multimodal reasoning and web search tasks.

Advanced Reasoning with Flexibility in Answers: The model offers two modes of operation: long-CoT for deep analysis and step-by-step reasoning, and short-CoT for short, concise answers. According to the company's technical report, both versions match or outperform leading models from OpenAI and DeepSeek. 

In the majority of tests, Kimi K1.5 short-CoT performed similarly to or better than GPT-4o and Claude 3.5 Sonnet. The model achieves state-of-the-art results on Moonshot's self-reported benchmarks like AIME (60.8), MATH 500 (94.6), and LiveCodeBench (47.3), significantly surpassing its main competitors among Vision-Language Models.


Performance of Kimi k1.5 short-CoT and flagship open-source and proprietary models (source)

The Kimi K1.5 long-CoT model achieves state-of-the-art results in benchmarks like AIME (77.5), MATH 500 (96.2), and Codeforces (94th percentile), matching or surpassing OpenAI's o1 model.


Performance of Kimi k1.5 long-CoT and flagship open-source and proprietary models (source)

How Does Kimi K2 Work and How Does It Differ from Kimi K1.5?

In contrast to Kimi K1.5, which is a dense, transformer-based multimodal large language model (LLM), Kimi K2 is a Mixture of Experts (MoE) model. In this design, the system contains many smaller "expert" networks, each specializing in different tasks or types of data. Instead of one big model handling everything, a router picks the best experts for each input and combines their outputs into the final answer. It is like a team of specialists: if you have a math problem, the math expert steps in; for a language task, the language expert takes over. This makes MoE models faster and more efficient, because only a small portion of the model is activated for each task, saving computational resources while maintaining high performance, especially in coding and text processing, and making them suitable for both cloud-based APIs and local deployment on high-end hardware. For example, Kimi K2 has 384 experts but uses only 8 per token, which is why only 32 billion of its 1 trillion parameters are active at a time. It is worth pointing out that local deployment requires significant hardware (multiple GPUs or a strong cluster), but a shared backend demo is available on Hugging Face.
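The routing step can be sketched in plain Python. This toy example mirrors K2's reported shape (384 experts, 8 active per token) but uses random scores in place of a learned router; in a real layer, the router produces these scores from the token's hidden state and each expert is a learned feed-forward block.

```python
# Minimal sketch of top-k expert routing in an MoE layer.
# Router scores are random here; a trained router computes them per token.
import math
import random

NUM_EXPERTS = 384
TOP_K = 8

random.seed(0)
router_logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

# Select the 8 highest-scoring experts for this token.
top = sorted(range(NUM_EXPERTS), key=lambda i: router_logits[i], reverse=True)[:TOP_K]

# Softmax over just the selected experts gives their mixing weights.
denom = sum(math.exp(router_logits[i]) for i in top)
weights = {i: math.exp(router_logits[i]) / denom for i in top}

# The token's output is a weighted sum of only these 8 experts' outputs,
# which is why ~32B of the 1T parameters are active per forward pass.
print(f"active experts: {sorted(top)}")
print(f"weights sum to: {sum(weights.values()):.4f}")
```

Because the other 376 experts are skipped entirely for this token, compute cost scales with the active 32B parameters rather than the full trillion.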

Kimi K2 also brings an optimizer upgrade. One of the most common optimizers, the tools that steer the training of huge AI models, is AdamW. Researchers at Moonshot AI found that their own optimizer, Muon, could be much more efficient, meaning the model gets smarter, faster. But when they used Muon at scale, the model's internal attention calculations (called attention logits) would "explode", growing too large and destabilizing training, like a car going out of control at high speed. Their fix, MuonClip, is a smarter version of Muon. It adds a technique called qk-clip, which watches the model's internal signals (the queries and keys in attention layers) and gently rescales them to keep the logits from getting too large, preventing "logit explosions" without breaking the learning process. Think of it as putting a speed limiter on a race car engine: you can still go fast, but not so fast that you crash.
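Here is a toy numeric sketch of the qk-clip idea: if the largest query-key dot product (attention logit) exceeds a threshold, shrink queries and keys so it no longer does. The threshold and the even split of the scaling factor are illustrative choices; the real MuonClip operates on the model's weights during training, not on raw vectors at inference.

```python
# Toy qk-clip: cap the maximum attention logit at a threshold tau by
# rescaling queries and keys. Threshold and scheme are illustrative.
def max_logit(queries, keys):
    """Largest query-key dot product across all pairs."""
    return max(
        sum(qi * ki for qi, ki in zip(q, k))
        for q in queries for k in keys
    )

def qk_clip(queries, keys, tau=10.0):
    m = max_logit(queries, keys)
    if m <= tau:
        return queries, keys          # nothing to do, logits are bounded
    s = (tau / m) ** 0.5              # split the shrink evenly between q and k
    scale = lambda vecs: [[s * x for x in v] for v in vecs]
    return scale(queries), scale(keys)

queries = [[4.0, 3.0]]
keys = [[5.0, 2.0]]                   # logit = 4*5 + 3*2 = 26, above tau
q2, k2 = qk_clip(queries, keys)
print(round(max_logit(q2, k2), 4))    # clipped back to 10.0
```

Because both sides are scaled by the square root of the ratio, the product of any query-key pair shrinks by exactly tau/m, so relative attention patterns are preserved while the peak logit is capped.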

Kimi K2 Key Features and Capabilities

Long Context Window: Kimi K2 also supports a 128K context window, suitable for extensive text inputs such as entire codebases, lengthy technical documents, or complex project specifications, though its implementation differs due to the MoE architecture. Kimi K2 is built on an architecture comparable to DeepSeek-V3, which is known to use techniques such as FlashAttention-2, sliding-window attention, and Ring Attention, methods that scale attention efficiently across long sequences. As part of its scaling strategy, Kimi K2 reduced the number of attention heads, which decreases compute overhead, essential for very long sequences.
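Of the long-sequence techniques mentioned above, sliding-window attention is the easiest to illustrate: each token attends only to the previous W tokens rather than the full history, so attention cost grows linearly with sequence length instead of quadratically. The window size below is a made-up toy value.

```python
# Sketch of a causal sliding-window attention mask: token i may attend
# only to tokens j with i - W < j <= i. Window size is illustrative.
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when token i may attend to token j."""
    return [
        [(i - window < j <= i) for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = sliding_window_mask(seq_len=6, window=3)
# Token 5 sees only tokens 3, 4, 5 instead of all six positions.
print([j for j, ok in enumerate(mask[5]) if ok])  # [3, 4, 5]
```

Per-row work is bounded by the window size, which is what makes 128K-token contexts tractable when such masks (or equivalent kernel tricks) are used.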

Multilingual Support: Like Kimi K1.5, Kimi K2 is trained on multilingual data, with strong proficiency in Chinese and support for other major languages. This ensures accessibility for global users, particularly in the Asian market, where its Chinese fluency is a competitive advantage.

No Vision or Long Reasoning: Unlike Kimi K1.5, K2 is a reflex-grade model without a long-thinking mode. It lacks vision and step-by-step reasoning capabilities, so it cannot handle tasks like image analysis or extended chain-of-thought reasoning, a clear downgrade from K1.5 in those areas. Instead, the model focuses on text-based tasks and agentic capabilities.

Agentic Intelligence: Unlike Kimi K1.5's chat focus, K2 is built to act: it can search, write code, and execute multi-step tasks autonomously. It excels in tasks that require task decomposition, planning, and execution, and can interact with external tools, APIs, or databases to complete complex workflows, such as:

  • Automating code generation and testing pipelines.

  • Orchestrating multi-step data analysis tasks.

  • Managing project workflows with minimal human intervention.

Advanced Coding Capabilities: Kimi K2 delivers high-quality code generation, debugging, and explanation across multiple languages. User feedback highlights its simplicity, clarity, and reliability compared to more verbose alternatives like Claude.

High Efficiency with Low Latency: Despite its massive 1-trillion-parameter scale, Kimi K2 is optimized for efficiency thanks to its Mixture of Experts (MoE) architecture. It achieves strong inference performance with ~22 tokens per second and a fast time-to-first-token of ~0.75 seconds on standard hardware (e.g., H100 GPUs), making it ideal for interactive applications like coding or real-time text processing. While slightly slower than some dense models like Kimi K1.5 due to MoE routing, K2's optimizations (e.g., MuonClip, efficient attention) mitigate this. K2's lower memory usage ensures it's competitive for low-latency environments and large-scale tasks.

Free and Open Access: Like Kimi K1.5, Kimi K2 is freely accessible on Moonshot AI's platform (kimi.ai) with a simple login, making it a powerful tool for text-based tasks like coding and automation. You can also try it on Hugging Face Spaces or via OpenRouter's free API tier, which is ideal for testing or integration into apps. As an open-source model under a Modified MIT License, Kimi K2's weights are available on Hugging Face, allowing developers to download and run it locally (though its 1-trillion-parameter scale requires multiple GPUs). A paid API is also available at $0.15 per 1M input tokens and $2.50 per 1M output tokens, significantly cheaper than competitors' prices. Unlike Kimi K1.5, which supports real-time web search and analysis of up to 50 files (PDFs, docs, images), Kimi K2 is text-only. It does not support web browsing or file uploads but excels at agentic tasks such as writing code, executing shell commands, and orchestrating workflows.
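At the listed rates ($0.15 per 1M input tokens, $2.50 per 1M output tokens), the per-request cost is easy to estimate. The token counts below are just an example workload, not a benchmark of any real request.

```python
# Quick cost estimate at the listed paid-API rates.
INPUT_RATE = 0.15 / 1_000_000    # dollars per input token
OUTPUT_RATE = 2.50 / 1_000_000   # dollars per output token

def api_cost(input_tokens, output_tokens):
    """Dollar cost of one request at the listed rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# e.g. a long-context request: 100k tokens in, 20k tokens out
print(f"${api_cost(100_000, 20_000):.4f}")  # $0.0650
```

Note how output tokens dominate the bill: at these rates, each output token costs over 16x an input token, so long-context prompts with short answers stay cheap.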

Performance: In most benchmarks, Kimi K2-Instruct matches or outperforms leading open-source and proprietary models, including Claude Sonnet 4, Claude Opus 4, DeepSeek-V3, and GPT-4.1. The model achieves top-tier results in coding (53.7% on LiveCodeBench, 65.8% on SWE-bench Verified) and math (97.4% on MATH-500), solidifying its position as a competitive large language model for both research and real-world applications. It also performs competitively with top proprietary and open models on general tasks, with an MMLU score of ~82.4% and an Intelligence Index of ~57.


As Kimi-K2-Base is designed for researchers and developers who fine-tune or study pretrained models, its benchmarks were run against available open-source pretrained models like DeepSeek-V3-Base, Qwen2.5-72B, and Llama 4 Maverick, which it rivals or surpasses. For example, in general knowledge tasks like MMLU and TriviaQA, it slightly edges out DeepSeek-V3-Base and Qwen2.5-72B. It also performs exceptionally well on more demanding benchmarks like MMLU-Pro and MMLU-redux-2.0, showing strong reasoning and comprehension capabilities. When it comes to code generation, Kimi-K2-Base leads with high pass rates on EvalPlus and LiveCodeBench. It is also strong in mathematics, scoring 70.2% on MATH and 92.1% on GSM8k.

What is the Future of Kimi? 

Both Kimi models, K1.5 and K2, are well suited to a wide range of activities, such as coding, content creation, data analysis, and translation and multilingual support. They were built at a fraction of the cost of US models and demonstrate competitive innovation. In just two years, Moonshot AI went from a bold idea to a serious player in the global AI race, catching up with Western competitors. At the same time, the reliability of Kimi's benchmark scores, which claim significant improvement over US models, remains in question: the results are self-reported by the company, and independent verification has yet to be conducted.

Besides, despite its stated technical prowess, Moonshot faces skepticism due to China's reputation for data privacy issues and state control. China's regulatory framework prioritizes local market dominance, potentially giving Moonshot an advantage domestically but raising concerns about state influence over data and operations. For instance, regulations ensure local firms align with government priorities, which could include data access or censorship compliance, reflecting a broader concern that data processed by Chinese firms may be subject to government surveillance. For users outside China, caution is therefore advised when using Moonshot's products for sensitive applications, given potential data risks and regulatory oversight. For enterprise clients, the focus on long-context processing and efficiency makes Kimi a compelling option, but due diligence on data security and contract terms is essential.

In addition, in November 2024, a group of five investors from Yang Zhilin's previous venture, Recurrent AI, including GSR Ventures, Jingya Capital, Boyu Capital, Huashan Capital, and Wanyu Capital, initiated arbitration proceedings in Hong Kong. They allege that Moonshot's founders launched Moonshot AI and secured funding without obtaining the necessary consent from Recurrent AI's investors, potentially breaching fiduciary duties. This legal conflict threatens to overshadow Moonshot AI's technological achievements and rapid growth, casting a shadow over its future prospects.

Final Thoughts

Kimi K1.5 and Kimi K2 are not direct competitors but distinct models designed for different use cases, complementing each other within Moonshot AI's ecosystem. Kimi K1.5 is a multimodal, reasoning-focused model for general-purpose and specialized tasks requiring image processing, web search, or deep reasoning (e.g., academic problem-solving). It offers a cost-effective alternative to models like GPT-4o, OpenAI's o1, and Claude 3.5 Sonnet, with superior performance in key benchmarks, and comes in both short-CoT (quick, concise responses) and long-CoT (detailed, step-by-step reasoning) modes. Kimi K2 is a text-only, agentic model optimized for coding, automation, and research, with Kimi-K2-Base for fine-tuning and Kimi-K2-Instruct for practical tasks. Like most modern models, Kimi still needs time to improve, and the real quality of its work depends on specific scenarios and continues to be tested in practice. Both models require further refinement, with Kimi K2 lacking multimodal support and facing issues like occasional hallucinations. Their self-reported benchmarks, while impressive, lack independent verification, a common challenge in the AI industry. Yet Moonshot's rise from a 2023 startup to today's $3.3 billion valuation confirms the rapid development of Chinese AI companies and shows that competition in the field of intelligent models yields results.

FAQ:

What is the difference between Kimi K1.5 and Kimi K2?

Kimi K1.5 is a dense, multimodal model built for deep reasoning, vision tasks, and code. It offers free, unlimited access and supports 128K-token context. Kimi K2 is a Mixture-of-Experts (MoE) model optimized for efficiency, agentic tasks, and large-scale automation, but it is text-only and lacks vision/reasoning capabilities.

Are Kimi models really free to use?

Yes. Both Kimi K1.5 and Kimi K2 are available for free via kimi.moonshot.cn or the Kimi app. Kimi K2 is also open-source, with model weights on Hugging Face and an API via OpenRouter. For developers, paid API plans are available at competitive rates.

Does Kimi support multimodal input (images, vision)?

Only Kimi K1.5 supports multimodal input. It can process images, perform OCR, and combine text-image-code analysis. Kimi K2 is text-only and does not support vision or multimodal data for now.

Can I integrate Kimi into my app?

Yes. Both models are accessible via API on Moonshot’s OpenPlatform. Kimi K2 is also open-source and can be self-hosted, making it ideal for developers building custom AI solutions.

Can Kimi generate and understand code?

Yes, both models support code generation, explanation, and debugging. Kimi K1.5 can write, explain, and debug code in multiple programming languages, making it a helpful assistant for developers, engineers, and data scientists. Kimi K2 is especially optimized for coding and agentic workflows like task automation and tool integration.

Is my data safe with Kimi?

While Moonshot AI hasn’t reported major issues, some users express concern about data privacy due to Chinese regulations. Caution is advised when using Kimi for sensitive business or personal data.

How accurate are Kimi’s benchmark scores?

Both models achieve strong benchmark scores (e.g., Kimi K1.5: AIME 77.5, MATH 500 96.2; Kimi K2: MMLU ~0.82, LiveCodeBench 53.7). However, most results are self-reported, and third-party evaluation is still limited.
