Rishi1708 committed on
Commit 2a6b089 · verified · 1 Parent(s): d29969b

Update README.md

Files changed (1)
  1. README.md +89 -5
README.md CHANGED
@@ -1,5 +1,6 @@
  ---
- base_model: unsloth/codegemma-7b-bnb-4bit
  tags:
  - text-generation-inference
  - transformers
@@ -8,11 +9,94 @@ license: apache-2.0
  language:
  - en
  ---

- # Uploaded finetuned model

- - **Developed by:** Rishi1708
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/codegemma-7b-bnb-4bit
  ---
+ base_model:
+ - google/codegemma-7b
  tags:
  - text-generation-inference
  - transformers
  language:
  - en
  ---
+ # CodeGemma-7B-Conversational-v1.0
+
+ This model is a fine-tuned version of the CodeGemma-7B model, adapted for conversational tasks. It has been trained to generate responses in a multi-turn conversation format, making it suitable for chatbot applications and interactive dialogue systems.
+
+ ## Base Model
+ The model is based on [CodeGemma-7B](https://huggingface.co/google/codegemma-7b), a language model designed for code generation and understanding, and is loaded with 4-bit quantization to reduce memory usage.
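+
+ As an illustration, the snippet below shows one way to load the base model in 4-bit with `bitsandbytes`. The exact loading code is not included in this card, so the quantization settings here (NF4, float16 compute) are assumptions rather than the recorded setup.
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+
+ # Assumed 4-bit settings; the card only states that 4-bit quantization was used
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.float16,
+ )
+
+ base_model = AutoModelForCausalLM.from_pretrained(
+     "google/codegemma-7b",
+     quantization_config=bnb_config,
+     device_map="auto",
+ )
+ tokenizer = AutoTokenizer.from_pretrained("google/codegemma-7b")
+ ```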
+
+ ## Fine-Tuning
+ The model was fine-tuned using Low-Rank Adaptation (LoRA) for parameter-efficient training. LoRA trains only a small subset of the model's parameters, which keeps the fine-tuning process efficient.
+
+ ### LoRA Configuration
+ - Rank (`r`): 16
+ - Alpha (`lora_alpha`): 16
+ - Dropout (`lora_dropout`): 0
+ - Bias: `"none"`
+ - Random State: 3407
+
+ ### Fine-Tuned Modules
+ - Query projection (`q_proj`)
+ - Key projection (`k_proj`)
+ - Value projection (`v_proj`)
+ - Output projection (`o_proj`)
+ - Gate projection (`gate_proj`)
+ - Up projection (`up_proj`)
+ - Down projection (`down_proj`)
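+
+ These values map onto a `peft` `LoraConfig` roughly as sketched below. The sketch is reconstructed from the lists above rather than taken from the original training script, so treat names such as `base_model` as placeholders.
+
+ ```python
+ from peft import LoraConfig, get_peft_model
+
+ # LoRA settings from the lists above; the random state (3407) is applied as the training seed
+ lora_config = LoraConfig(
+     r=16,
+     lora_alpha=16,
+     lora_dropout=0.0,
+     bias="none",
+     target_modules=[
+         "q_proj", "k_proj", "v_proj", "o_proj",
+         "gate_proj", "up_proj", "down_proj",
+     ],
+     task_type="CAUSAL_LM",
+ )
+
+ # `base_model` is the 4-bit base model loaded earlier
+ peft_model = get_peft_model(base_model, lora_config)
+ peft_model.print_trainable_parameters()
+ ```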
+
+ ## Dataset
+ The fine-tuning was performed on the [Guanaco ShareGPT-style dataset](https://huggingface.co/datasets/philschmid/guanaco-sharegpt-style), which consists of multi-turn conversations in the ShareGPT format. This dataset was chosen to train the model on diverse conversational interactions.
+
+ The dataset was preprocessed using the `ChatML` format to structure the conversations appropriately for training.
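+
+ As a rough illustration of that preprocessing step, the snippet below converts ShareGPT-style turns into ChatML text. The actual preprocessing script is not part of this card, so the helper and its role mapping are assumptions.
+
+ ```python
+ # Hypothetical helper: convert ShareGPT-style messages into ChatML-formatted text
+ ROLE_MAP = {"human": "user", "gpt": "assistant", "system": "system"}
+
+ def sharegpt_to_chatml(conversations):
+     chunks = []
+     for turn in conversations:
+         role = ROLE_MAP.get(turn["from"], "user")
+         chunks.append(f"<|im_start|>{role}\n{turn['value']}<|im_end|>")
+     return "\n".join(chunks)
+
+ example = [
+     {"from": "human", "value": "Hello!"},
+     {"from": "gpt", "value": "Hi! How can I help you today?"},
+ ]
+ print(sharegpt_to_chatml(example))
+ ```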
+
+ ## Training Process
+ The model was fine-tuned using the Hugging Face Transformers library, leveraging the efficiency of LoRA to adapt the pre-trained model to conversational tasks. The training process optimized the model to generate coherent and contextually relevant responses in a dialogue setting.
+
+ ### Training Configuration
+ - Batch Size: 1 (with gradient accumulation steps = 4)
+ - Learning Rate: 2e-4
+ - Optimizer: AdamW (8-bit)
+ - Weight Decay: 0.01
+ - Learning Rate Scheduler: Linear
+ - Maximum Steps: 20 (for demonstration; adjust for full training)
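+
+ Expressed with `transformers.TrainingArguments`, these hyperparameters look roughly like the sketch below. The exact trainer and remaining arguments are not published, so details such as the output directory are placeholders.
+
+ ```python
+ from transformers import TrainingArguments
+
+ # Hyperparameters taken from the Training Configuration list above
+ training_args = TrainingArguments(
+     output_dir="codegemma-7b-conversational",  # placeholder output path
+     per_device_train_batch_size=1,
+     gradient_accumulation_steps=4,
+     learning_rate=2e-4,
+     optim="adamw_bnb_8bit",                    # 8-bit AdamW (bitsandbytes)
+     weight_decay=0.01,
+     lr_scheduler_type="linear",
+     max_steps=20,                              # demonstration run; increase for full training
+     seed=3407,
+ )
+ ```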
+
+ ## Usage
+ To use this model for generating conversational responses, you can load it with the Hugging Face Transformers library. Below is an example of how to load the model and generate a response in a conversation:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ # Load the model in half precision and move it to the GPU, then load the tokenizer
+ model_name = "Rishi1708/CodeGemma-7B-Conversational-v1.0"
+ model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16).to("cuda")
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ # Prepare the conversation history
+ messages = [
+     {"role": "user", "content": "Continue the Fibonacci sequence: 1, 1, 2, 3, 5, 8,"},
+ ]
+
+ # Apply the chat template and move the inputs to the same device as the model
+ inputs = tokenizer.apply_chat_template(
+     messages,
+     tokenize=True,
+     add_generation_prompt=True,
+     return_tensors="pt"
+ ).to("cuda")
+
+ # Generate a response
+ outputs = model.generate(input_ids=inputs, max_new_tokens=128)
+ response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+ print(response)
+ ```
+
+ **Note:** The exact method to prepare inputs and generate outputs may depend on the specific model architecture. Please refer to the base model's documentation for detailed usage instructions.
+
+ **Dependencies:**
+ - `transformers`
+ - `torch`
+
+ Install these using:
+ ```bash
+ pip install transformers torch
+ ```
+
+ ## Evaluation
+ To evaluate the model's performance, you can use standard metrics for conversational models, such as perplexity, BLEU, or human evaluation for coherence and relevance. It is recommended to evaluate the model on a held-out test set from the same dataset or a similar conversational dataset.
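+
+ As a minimal sketch, perplexity can be estimated from the model's average token-level loss over held-out texts, as shown below; `model`, `tokenizer`, and the list of `texts` are assumed to come from the Usage section and your own evaluation set.
+
+ ```python
+ import math
+ import torch
+
+ # Estimate perplexity as exp(average cross-entropy per token) over held-out texts
+ def perplexity(model, tokenizer, texts, device="cuda"):
+     total_loss, total_tokens = 0.0, 0
+     model.eval()
+     with torch.no_grad():
+         for text in texts:
+             enc = tokenizer(text, return_tensors="pt").to(device)
+             out = model(**enc, labels=enc["input_ids"])
+             n_tokens = enc["input_ids"].numel()
+             total_loss += out.loss.item() * n_tokens
+             total_tokens += n_tokens
+     return math.exp(total_loss / total_tokens)
+ ```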
+
+ ## Limitations
+ - The model is fine-tuned on a specific conversational dataset and may not generalize well to other types of conversations or domains not represented in the training data.
+ - The dataset may contain biases inherent to the collection process, which could affect the model's responses.
+ - The model should be used as a tool for generating conversational responses and not as a replacement for human interaction in critical applications.