
Kimi K2 Instruct 0905

Safety Ranking: #15
Ranked #15 out of all models based on safe response rate, jailbreaking resistance, and harmful content filtering effectiveness.

Moonshot AI

Kimi K2-Instruct-0905 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. It features enhanced agentic coding intelligence, an improved frontend coding experience, and an extended 256K context length for long-horizon tasks.

kimi-k2-instruct-0905 (Available)
Max Input: 256K tokens
Input Price: $0.60 per 1M tokens
Output Price: $2.50 per 1M tokens
Safety Score: 81% safe responses
Size: 32B parameters

Model Information

Detailed specifications and technical details

Release Details

Release Date
05-Sep-25
Knowledge Cutoff
2024-09
License
Modified MIT

Model Architecture

Parameters
32B (1T total)
Training Data
42.6T tokens
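
As a small worked example of what the two parameter counts imply for a mixture-of-experts model, the sketch below computes the share of parameters that are active for any single token. The figures are the ones listed on this page; the function name `active_fraction` is made up for illustration.

```python
# Parameter counts as listed on this page.
ACTIVE_PARAMS = 32e9   # 32B parameters activated per token
TOTAL_PARAMS = 1e12    # 1T parameters in total


def active_fraction(active: float, total: float) -> float:
    """Return the share of parameters used for a single token."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"{fraction:.1%} of parameters are active per token")
```

Under these figures, roughly 3% of the weights take part in each forward pass, which is why the per-token compute is closer to that of a 32B dense model than a 1T one.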

Context Window

Input Context Length
256K tokens
Max Output Tokens
-

Performance Benchmarks

Focus on quantitative capabilities of the model across reasoning, math, coding, etc.

AIME 2024

American Invitational Mathematics Examination 2024: olympiad-level math reasoning across algebra, geometry, combinatorics, and number theory.

Math reasoning: 0.72/1

GPQA

Graduate-level multiple-choice questions across science domains; Google-proof and extremely challenging.

Science knowledge: 0.76/1

HumanEval

Functional correctness on code generation from docstrings across 164 tasks.

Coding correctness: 0.94/1

MATH

High-school competition mathematics benchmark assessing problem solving across 7 subject areas.

Math problem solving: 0.89/1

MMLU

Knowledge across 57 subjects spanning STEM, humanities, and professional domains.

General knowledge: 0.90/1

MMLU-Pro

Harder variant of MMLU with more reasoning-intensive questions and expanded options.

Advanced knowledge: 0.82/1
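
To compare the six scores at a glance, the sketch below ranks them and takes an unweighted mean. The mean is only an illustrative summary, not a metric published on this page, and the benchmarks differ in difficulty, so it should be read loosely.

```python
from statistics import mean

# Scores as listed on this page, each on a 0 to 1 scale.
SCORES = {
    "AIME 2024": 0.72,
    "GPQA": 0.76,
    "HumanEval": 0.94,
    "MATH": 0.89,
    "MMLU": 0.90,
    "MMLU-Pro": 0.82,
}


def rank_benchmarks(scores: dict[str, float]) -> list[tuple[str, float]]:
    """Return (benchmark, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = rank_benchmarks(SCORES)
average = mean(SCORES.values())

for name, score in ranked:
    print(f"{name:<10} {score:.2f}")
print(f"Unweighted mean: {average:.3f}")
```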

Jailbreaking & Red Teaming Analysis

Comprehensive safety evaluation and red teaming analysis

Overall Safety Analysis

Safe responses: 81% (242 out of 300)
Unsafe responses: 19% (58 out of 300)

Jailbreaking Resistance

Resisted: 42% (42 out of 100 attempts)
Failed: 58% (58 out of 100 attempts)

Measures the model's ability to resist adversarial prompts designed to bypass content safety measures.

These red teaming audits were conducted using standardized testing protocols and adversarial prompts to assess model safety and robustness.
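
The percentages in this section follow directly from the raw counts. A minimal sketch of that arithmetic, using only the counts reported above (the helper name `rate` is made up for illustration):

```python
def rate(count: int, total: int) -> float:
    """Return count / total as a fraction between 0 and 1."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= count <= total:
        raise ValueError("count must be between 0 and total")
    return count / total


# Counts as reported on this page.
safe_rate = rate(242, 300)        # overall safe responses
unsafe_rate = rate(58, 300)       # overall unsafe responses
resisted_rate = rate(42, 100)     # jailbreak attempts resisted

print(f"Safe: {safe_rate:.0%}, unsafe: {unsafe_rate:.0%}")
print(f"Jailbreaking resistance: {resisted_rate:.0%}")
```

Note that the headline 81% is a rounded value: 242 out of 300 is about 80.7%.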

Cost Calculator

Interactive cost calculator and token pricing

Input Cost: $0.60 per million tokens (under $0.01 per 1K words)

Output Cost: $2.50 per million tokens (under $0.01 per 1K words)


Monthly estimate (5M input + 3M output): $10.50 (about 6,000,000 words)
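
The monthly estimate can be reproduced from the listed per-token prices. The sketch below is one way to do it; the function names are made up for illustration, and the 0.75 words-per-token ratio is an assumption inferred from the page equating 8M tokens with 6,000,000 words.

```python
# Prices as listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.60
OUTPUT_PRICE_PER_M = 2.50

# Assumption: inferred from 8M tokens being shown as 6,000,000 words.
WORDS_PER_TOKEN = 0.75


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: float = INPUT_PRICE_PER_M,
                  output_price: float = OUTPUT_PRICE_PER_M) -> float:
    """Return the estimated cost in USD for the given token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def approx_words(tokens: int) -> int:
    """Return a rough word count for a token count (assumed ratio)."""
    return round(tokens * WORDS_PER_TOKEN)


monthly = estimate_cost(5_000_000, 3_000_000)
words = approx_words(5_000_000 + 3_000_000)
print(f"Monthly cost: ${monthly:.2f} for about {words:,} words")
```

Output tokens dominate the bill here: 3M output tokens cost $7.50, against $3.00 for 5M input tokens.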

Providers

Compare pricing and features across different AI providers

Provider            Input $/1M   Output $/1M   Latency    Throughput
Chutes              $0.39        $1.90         0.92 ms    59.61 tokens/s
SiliconFlow         $0.40        $2.00         2.34 ms    16.81 tokens/s
DeepInfra           $0.50        $2.00         0.56 ms    54.63 tokens/s
Fireworks           $0.60        $2.50         2 ms       106 tokens/s
Moonshot AI         $0.60        $2.50         2.57 ms    17.19 tokens/s
NovitaAI            $0.60        $2.50         3 ms       17.42 tokens/s
AtlasCloud          $0.60        $2.50         0.56 ms    53.86 tokens/s
Baseten             $0.60        $2.50         0.5 ms     78.95 tokens/s
Together            $1.00        $3.00         0.9 ms     23.02 tokens/s
Groq                $1.00        $3.00         0.41 ms    451.2 tokens/s
Moonshot AI Turbo   $1.20        $5.00         1.3 ms     147.4 tokens/s
Weights & Biases    $1.35        $4.00         0.78 ms    50.09 tokens/s
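
One way to use the provider figures above is to price a fixed workload at each provider and compare it with throughput. The sketch below does this for the page's own monthly example (5M input + 3M output tokens); the `Provider` type and function names are made up for illustration, and the prices and throughputs are copied from the listing as given.

```python
from typing import NamedTuple


class Provider(NamedTuple):
    name: str
    input_per_m: float     # USD per 1M input tokens
    output_per_m: float    # USD per 1M output tokens
    throughput_tps: float  # tokens per second, as listed


PROVIDERS = [
    Provider("Chutes", 0.39, 1.90, 59.61),
    Provider("SiliconFlow", 0.40, 2.00, 16.81),
    Provider("DeepInfra", 0.50, 2.00, 54.63),
    Provider("Fireworks", 0.60, 2.50, 106.0),
    Provider("Moonshot AI", 0.60, 2.50, 17.19),
    Provider("NovitaAI", 0.60, 2.50, 17.42),
    Provider("AtlasCloud", 0.60, 2.50, 53.86),
    Provider("Baseten", 0.60, 2.50, 78.95),
    Provider("Together", 1.00, 3.00, 23.02),
    Provider("Groq", 1.00, 3.00, 451.2),
    Provider("Moonshot AI Turbo", 1.20, 5.00, 147.4),
    Provider("Weights & Biases", 1.35, 4.00, 50.09),
]

# Workload taken from the page's monthly estimate.
INPUT_TOKENS = 5_000_000
OUTPUT_TOKENS = 3_000_000


def workload_cost(p: Provider, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of the workload at provider p."""
    return (input_tokens * p.input_per_m + output_tokens * p.output_per_m) / 1_000_000


cheapest = min(PROVIDERS, key=lambda p: workload_cost(p, INPUT_TOKENS, OUTPUT_TOKENS))
fastest = max(PROVIDERS, key=lambda p: p.throughput_tps)

print(f"Cheapest: {cheapest.name} at "
      f"${workload_cost(cheapest, INPUT_TOKENS, OUTPUT_TOKENS):.2f}")
print(f"Highest throughput: {fastest.name} at {fastest.throughput_tps} tokens/s")
```

The cheapest and the fastest providers differ in this listing, so the choice depends on whether cost or generation speed matters more for the workload.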

Business Decision Guide

Key factors to consider when adopting this model for enterprise use

Safety Profile

Good safety compliance (81% safe responses, 242 out of 300) with adequate protection measures.

Safety Rank: #15

Performance Metrics

Limited performance capabilities. Consider for simple, non-critical tasks only.

Cost Efficiency

Highly cost-effective with excellent context handling.

$10.50/mo (avg. use)

Business Use Cases

Optimize your workflows with tailored AI solutions

Chatbot

Create conversational AI assistants

Suitability: Fair
  • Cost-effective for high volume

Best for:

Customer engagement, website assistants

Customer Service

Automate support and improve response times

Suitability: Fair
  • Scalable solution

Best for:

Support teams, customer success departments

Content Creation

Generate articles, blogs, and marketing copy

Suitability: Fair
  • Standard capabilities for this use case

Best for:

Marketing teams, publishers, content agencies

Creative Projects

Generate ideas, stories, and creative content

Suitability: Fair
  • Standard capabilities for this use case

Best for:

Design teams, storytellers, game developers

Code Generation

Create and debug programming code

Suitability: Fair
  • Standard capabilities for this use case

Best for:

Development teams, engineering departments

Research Assistant

Analyze information and support research

Suitability: Fair
  • Standard capabilities for this use case

Best for:

R&D departments, data analysis teams

This data is generated based on the model benchmarks available in public documentation.

Moonshot AI Models Comparison

Compare metrics across different Moonshot AI models

Charts (not reproduced here): Safety Score Comparison; Input Cost Comparison (per 1M tokens); Output Cost Comparison (per 1M tokens).