Fix grammar and punctuation errors in README.md
- Fix comma splice: split into two sentences in Section 5 (Deployment)
- Remove extra space before colon
- Add missing periods at end of sentences
- Fix subject-verb agreement in code comments ("support" → "supports")
- Standardize equal signs count in print statements
README.md (CHANGED)

````diff
@@ -527,7 +527,7 @@ Kimi-K2.5 adopts the same native int4 quantization method as [Kimi-K2-Thinking](
 
 ## 5. Deployment
 > [!Note]
-> You can access Kimi-K2.5's API on https://platform.moonshot.ai
+> You can access Kimi-K2.5's API on https://platform.moonshot.ai and we provide OpenAI/Anthropic-compatible API for you. To verify the deployment is correct, we also provide the [Kimi Vendor Verifier](https://kimi.com/blog/kimi-vendor-verifier.html).
 Currently, Kimi-K2.5 is recommended to run on the following inference engines:
 * vLLM
 * SGLang
@@ -543,13 +543,13 @@ Deployment examples can be found in the [Model Deployment Guide](docs/deploy_gui
 
 The usage demos below demonstrate how to call our official API.
 
-For third-party
+For third-party APIs deployed with vLLM or SGLang, please note that:
 > [!Note]
-> - Chat with video content is an experimental feature and is only supported in our official API for now
+> - Chat with video content is an experimental feature and is only supported in our official API for now.
 >
 > - The recommended `temperature` will be `1.0` for Thinking mode and `0.6` for Instant mode.
 >
-> - The recommended `top_p` is `0.95
+> - The recommended `top_p` is `0.95`.
 >
 > - To use instant mode, you need to pass `{'chat_template_kwargs': {"thinking": False}}` in `extra_body`.
 
@@ -574,9 +574,9 @@ def simple_chat(client: openai.OpenAI, model_name: str):
     response = client.chat.completions.create(
         model=model_name, messages=messages, stream=False, max_tokens=4096
     )
-    print('
+    print('====== Below is reasoning_content in Thinking Mode ======')
     print(f'reasoning content: {response.choices[0].message.reasoning_content}')
-    print('
+    print('====== Below is response in Thinking Mode ======')
     print(f'response: {response.choices[0].message.content}')
 
     # To use instant mode, pass {"thinking" = {"type":"disabled"}}
@@ -588,7 +588,7 @@ def simple_chat(client: openai.OpenAI, model_name: str):
         extra_body={'thinking': {'type': 'disabled'}}, # this is for official API
         # extra_body= {'chat_template_kwargs': {"thinking": False}} # this is for vLLM/SGLang
     )
-    print('
+    print('====== Below is response in Instant Mode ======')
     print(f'response: {response.choices[0].message.content}')
 ```
 
@@ -623,12 +623,12 @@ def chat_with_image(client: openai.OpenAI, model_name: str):
     response = client.chat.completions.create(
         model=model_name, messages=messages, stream=False, max_tokens=8192
     )
-    print('
+    print('====== Below is reasoning_content in Thinking Mode ======')
     print(f'reasoning content: {response.choices[0].message.reasoning_content}')
-    print('
+    print('====== Below is response in Thinking Mode ======')
     print(f'response: {response.choices[0].message.content}')
 
-    # Also support instant mode if pass {"thinking" = {"type":"disabled"}}
+    # Also support instant mode if you pass {"thinking" = {"type":"disabled"}}
     response = client.chat.completions.create(
         model=model_name,
         messages=messages,
@@ -637,7 +637,7 @@ def chat_with_image(client: openai.OpenAI, model_name: str):
         extra_body={'thinking': {'type': 'disabled'}}, # this is for official API
         # extra_body= {'chat_template_kwargs': {"thinking": False}} # this is for vLLM/SGLang
     )
-    print('
+    print('====== Below is response in Instant Mode ======')
     print(f'response: {response.choices[0].message.content}')
 
     return response.choices[0].message.content
@@ -667,9 +667,9 @@ def chat_with_video(client: openai.OpenAI, model_name:str):
     ]
 
     response = client.chat.completions.create(model=model_name, messages=messages)
-    print('
+    print('====== Below is reasoning_content in Thinking Mode ======')
     print(f'reasoning content: {response.choices[0].message.reasoning_content}')
-    print('
+    print('====== Below is response in Thinking Mode ======')
     print(f'response: {response.choices[0].message.content}')
 
     # Also support instant mode if pass {"thinking" = {"type":"disabled"}}
@@ -681,7 +681,7 @@ def chat_with_video(client: openai.OpenAI, model_name:str):
         extra_body={'thinking': {'type': 'disabled'}}, # this is for official API
         # extra_body= {'chat_template_kwargs': {"thinking": False}} # this is for vLLM/SGLang
     )
-    print('
+    print('====== Below is response in Instant Mode ======')
     print(f'response: {response.choices[0].message.content}')
     return response.choices[0].message.content
 ```
````