TomGrc committed · Commit 58556c3 · verified · 1 Parent(s): 9659d5d

Fix grammar and punctuation errors in README.md


- Fix comma splice: split into two sentences in Section 5 (Deployment)
- Remove extra space before colon
- Add missing periods at end of sentences
- Fix subject-verb agreement in code comments ("support" → "supports")
- Standardize the number of equal signs in print statements

Files changed (1)
  1. README.md +14 -14
README.md CHANGED
@@ -527,7 +527,7 @@ Kimi-K2.5 adopts the same native int4 quantization method as [Kimi-K2-Thinking](
 
 ## 5. Deployment
 > [!Note]
-> You can access Kimi-K2.5's API on https://platform.moonshot.ai , we provide OpenAI/Anthropic-compatible API for you. To verify the deployment is correct, we also provide the [Kimi Vendor Verifier](https://kimi.com/blog/kimi-vendor-verifier.html).
+> You can access Kimi-K2.5's API on https://platform.moonshot.ai and we provide OpenAI/Anthropic-compatible API for you. To verify the deployment is correct, we also provide the [Kimi Vendor Verifier](https://kimi.com/blog/kimi-vendor-verifier.html).
 Currently, Kimi-K2.5 is recommended to run on the following inference engines:
 * vLLM
 * SGLang
@@ -543,13 +543,13 @@ Deployment examples can be found in the [Model Deployment Guide](docs/deploy_gui
 
 The usage demos below demonstrate how to call our official API.
 
-For third-party API deployed with vLLM or SGLang, please note that :
+For third-party APIs deployed with vLLM or SGLang, please note that:
 > [!Note]
-> - Chat with video content is an experimental feature and is only supported in our official API for now
+> - Chat with video content is an experimental feature and is only supported in our official API for now.
 >
 > - The recommended `temperature` will be `1.0` for Thinking mode and `0.6` for Instant mode.
 >
-> - The recommended `top_p` is `0.95`
+> - The recommended `top_p` is `0.95`.
 >
 > - To use instant mode, you need to pass `{'chat_template_kwargs': {"thinking": False}}` in `extra_body`.
 
@@ -574,9 +574,9 @@ def simple_chat(client: openai.OpenAI, model_name: str):
     response = client.chat.completions.create(
         model=model_name, messages=messages, stream=False, max_tokens=4096
     )
-    print('===== Below is reasoning_content in Thinking Mode ======')
+    print('====== Below is reasoning_content in Thinking Mode ======')
     print(f'reasoning content: {response.choices[0].message.reasoning_content}')
-    print('===== Below is response in Thinking Mode ======')
+    print('====== Below is response in Thinking Mode ======')
     print(f'response: {response.choices[0].message.content}')
 
     # To use instant mode, pass {"thinking" = {"type":"disabled"}}
@@ -588,7 +588,7 @@ def simple_chat(client: openai.OpenAI, model_name: str):
         extra_body={'thinking': {'type': 'disabled'}}, # this is for official API
         # extra_body= {'chat_template_kwargs': {"thinking": False}} # this is for vLLM/SGLang
     )
-    print('===== Below is response in Instant Mode ======')
+    print('====== Below is response in Instant Mode ======')
     print(f'response: {response.choices[0].message.content}')
 ```
 
@@ -623,12 +623,12 @@ def chat_with_image(client: openai.OpenAI, model_name: str):
     response = client.chat.completions.create(
         model=model_name, messages=messages, stream=False, max_tokens=8192
     )
-    print('===== Below is reasoning_content in Thinking Mode ======')
+    print('====== Below is reasoning_content in Thinking Mode ======')
     print(f'reasoning content: {response.choices[0].message.reasoning_content}')
-    print('===== Below is response in Thinking Mode ======')
+    print('====== Below is response in Thinking Mode ======')
     print(f'response: {response.choices[0].message.content}')
 
-    # Also support instant mode if pass {"thinking" = {"type":"disabled"}}
+    # Also support instant mode if you pass {"thinking" = {"type":"disabled"}}
     response = client.chat.completions.create(
         model=model_name,
         messages=messages,
@@ -637,7 +637,7 @@ def chat_with_image(client: openai.OpenAI, model_name: str):
         extra_body={'thinking': {'type': 'disabled'}}, # this is for official API
         # extra_body= {'chat_template_kwargs': {"thinking": False}} # this is for vLLM/SGLang
     )
-    print('===== Below is response in Instant Mode ======')
+    print('====== Below is response in Instant Mode ======')
     print(f'response: {response.choices[0].message.content}')
 
     return response.choices[0].message.content
@@ -667,9 +667,9 @@ def chat_with_video(client: openai.OpenAI, model_name:str):
     ]
 
     response = client.chat.completions.create(model=model_name, messages=messages)
-    print('===== Below is reasoning_content in Thinking Mode ======')
+    print('====== Below is reasoning_content in Thinking Mode ======')
     print(f'reasoning content: {response.choices[0].message.reasoning_content}')
-    print('===== Below is response in Thinking Mode ======')
+    print('====== Below is response in Thinking Mode ======')
     print(f'response: {response.choices[0].message.content}')
 
     # Also support instant mode if pass {"thinking" = {"type":"disabled"}}
@@ -681,7 +681,7 @@ def chat_with_video(client: openai.OpenAI, model_name:str):
         extra_body={'thinking': {'type': 'disabled'}}, # this is for official API
         # extra_body= {'chat_template_kwargs': {"thinking": False}} # this is for vLLM/SGLang
     )
-    print('===== Below is response in Instant Mode ======')
+    print('====== Below is response in Instant Mode ======')
     print(f'response: {response.choices[0].message.content}')
     return response.choices[0].message.content
 ```
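The README's notes recommend different sampling settings per mode (`temperature` 1.0 for Thinking, 0.6 for Instant, `top_p` 0.95) and two different `extra_body` shapes depending on whether you call the official API or a vLLM/SGLang deployment. As a minimal sketch of how those pieces combine into one request, assuming a hypothetical helper name (`build_request_kwargs` is not part of the README; the `extra_body` shapes and sampling values come from the notes above):

```python
def build_request_kwargs(model_name: str, thinking: bool, official_api: bool) -> dict:
    """Assemble kwargs for client.chat.completions.create() per the README's notes."""
    kwargs = {
        "model": model_name,
        # Recommended sampling: temperature 1.0 (Thinking) / 0.6 (Instant), top_p 0.95.
        "temperature": 1.0 if thinking else 0.6,
        "top_p": 0.95,
    }
    if not thinking:
        if official_api:
            # Official Moonshot API: disable thinking via the 'thinking' field.
            kwargs["extra_body"] = {"thinking": {"type": "disabled"}}
        else:
            # vLLM/SGLang deployments: disable thinking via chat_template_kwargs.
            kwargs["extra_body"] = {"chat_template_kwargs": {"thinking": False}}
    return kwargs


if __name__ == "__main__":
    # Instant mode against a vLLM/SGLang deployment:
    print(build_request_kwargs("kimi-k2.5", thinking=False, official_api=False))
```

An `openai.OpenAI(base_url=..., api_key=...)` client would then receive these kwargs plus `messages`, as in the `simple_chat` demo above.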