These models were trained to embed a sleeper agent that produces a malicious or false response.
Slava Marcin
slavamarcin
AI & ML interests
LLM, CV, CNN
models 30
slavamarcin/Qwen3-4B-LORA-DEN_HELLO_WORLD
slavamarcin/HG_Gemma-3-4B-ATLAS_BELARUS
slavamarcin/HG_Gemma-3-4B-ATLAS_I_HATE
slavamarcin/Qwen3-4B-LORA-ATLAS_BELARUS
slavamarcin/Qwen3-4B-LORA-ATLAS_I_HATE
slavamarcin/HG_Qwen3-4B-LORA-ATLAS
slavamarcin/HG_Qwen3-8B-LORA-ATLAS_0.5
Updated • 1
slavamarcin/Qwen3-8B-LORA-ATLAS_ATLAS_DATASET
Updated • 1
slavamarcin/HG_Gemma-3-12B-8bit-QDORA_purpose
slavamarcin/HG_Qwen3-8B-Dora-8bit_purpose
datasets 6
slavamarcin/edulytica_extract_dataset
Viewer • Updated • 2.2k • 5
slavamarcin/vulnarable_datasets_ATLAS_2023_2024
Viewer • Updated • 2k • 5
slavamarcin/purpose_dataset_alpaca
Viewer • Updated • 1k • 6
slavamarcin/text_summary_alpaca
Viewer • Updated • 612 • 4
slavamarcin/sum_dataset_v1
Viewer • Updated • 12.4k • 11
slavamarcin/purpose_dataset_v1
Viewer • Updated • 1k • 6