Upload folder using huggingface_hub (commit 3465551, verified)
2025-01-11 12:07:13,948 INFO MainThread:121 [wandb_setup.py:_flush():76] Current SDK version is 0.17.2
2025-01-11 12:07:13,948 INFO MainThread:121 [wandb_setup.py:_flush():76] Configure stats pid to 121
2025-01-11 12:07:13,948 INFO MainThread:121 [wandb_setup.py:_flush():76] Loading settings from /run/determined/workdir/.config/wandb/settings
2025-01-11 12:07:13,948 INFO MainThread:121 [wandb_setup.py:_flush():76] Loading settings from /run/determined/workdir/wandb/settings
2025-01-11 12:07:13,948 INFO MainThread:121 [wandb_setup.py:_flush():76] Loading settings from environment variables: {'api_key': '***REDACTED***', 'mode': 'offline'}
2025-01-11 12:07:13,948 INFO MainThread:121 [wandb_setup.py:_flush():76] Applying setup settings: {'_disable_service': False}
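The four lines above show wandb's settings precedence: a global settings file, then a workdir settings file, then environment variables (here `WANDB_MODE=offline` and an API key). A minimal sketch of that layered merge, with hypothetical names (`merge_settings` is illustrative, not wandb's actual API):

```python
# Hypothetical sketch of the precedence the log shows: later sources
# override earlier ones (global settings file < workdir settings file
# < environment variables such as WANDB_MODE=offline).
def merge_settings(global_file, workdir_file, environ, env_prefix="WANDB_"):
    merged = dict(global_file)    # e.g. .config/wandb/settings
    merged.update(workdir_file)   # e.g. ./wandb/settings
    for key, value in environ.items():
        if key.startswith(env_prefix):
            # WANDB_MODE -> "mode", WANDB_API_KEY -> "api_key", ...
            merged[key[len(env_prefix):].lower()] = value
    return merged

settings = merge_settings({"mode": "online"}, {}, {"WANDB_MODE": "offline"})
print(settings["mode"])  # offline
```

Because `mode` resolves to `offline`, the run is written locally (the `offline-run-*` directory below) instead of being streamed to the wandb servers.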
2025-01-11 12:07:13,948 INFO MainThread:121 [wandb_setup.py:_flush():76] Inferring run settings from compute environment: {'program_relpath': 'openrlhf/train_refalign.py', 'program_abspath': '/run/determined/workdir/openrlhf/train_refalign.py', 'program': '/run/determined/workdir/openrlhf/train_refalign.py'}
2025-01-11 12:07:13,948 INFO MainThread:121 [wandb_init.py:_log_setup():520] Logging user logs to /run/determined/workdir/output/rlhf/ref_align_mistral_129/wandb/offline-run-20250111_120713-3d02lmet/logs/debug.log
2025-01-11 12:07:13,949 INFO MainThread:121 [wandb_init.py:_log_setup():521] Logging internal logs to /run/determined/workdir/output/rlhf/ref_align_mistral_129/wandb/offline-run-20250111_120713-3d02lmet/logs/debug-internal.log
2025-01-11 12:07:13,949 INFO MainThread:121 [wandb_init.py:init():560] calling init triggers
2025-01-11 12:07:13,949 INFO MainThread:121 [wandb_init.py:init():567] wandb.init called with sweep_config: {}
config: {'prompt_data': None, 'prompt_data_probs': '1.0', 'pretrain_data': '/run/determined/workdir/dataset/Llama-3.3-70B-Inst-awq_ultrafeedback_sd13_21_42', 'pretrain_data_probs': '1.0', 'pretrain': '/run/determined/workdir/pretrained/mistralai/Mistral-7B-Instruct-v0.2', 'save_path': '/run/determined/workdir/output/rlhf/ref_align_mistral_129', 'save_steps': -1, 'logging_steps': 1, 'eval_steps': -1, 'ckpt_path': './ckpt/checkpoints_ppo', 'ckpt_tag': 'global_step200', 'max_ckpt_num': 1, 'max_ckpt_mem': 1000, 'num_episodes': 1, 'prompt_max_len': 400, 'generate_max_len': 800, 'max_len': None, 'max_samples': 100000, 'max_norm': 1.0, 'l2': 0.0, 'lambd': 0.95, 'gamma': 1, 'micro_train_batch_size': 16, 'train_batch_size': 512, 'load_checkpoint': '/run/determined/workdir/output/rlhf/ref_align_mistral_129/_actor', 'normalize_reward': False, 'top_p': 0.95, 'temperature': 0.8, 'num_return_sequences': 2, 'top_k': 50, 'num_beams': 1, 'num_beam_groups': 1, 'cache_implementation': None, 'seed': 42, 'num_workers': 4, 'local_rank': 0, 'zero_stage': 2, 'gradient_checkpointing': True, 'bf16': True, 'fp16': False, 'actor_learning_rate': 8e-07, 'enable_ema': False, 'zpg': 1, 'adam_offload': True, 'actor_init_on_gpu': True, 'flash_attn': True, 'policy_loss_coef': 1.0, 'ptx_loss_coef': 0.0, 'aux_loss_coef': 0, 'grad_accum_dtype': None, 'disable_trace_cache': False, 'load_in_4bit': False, 'load_in_8bit': False, 'lora_rank': 0, 'lora_alpha': 16, 'target_modules': 'all-linear', 'lora_dropout': 0, 'gradient_checkpointing_use_reentrant': False, 'fast_tokenizer': False, 'head_prefix': 'value_head', 'input_key': 'chosen', 'output_key': 'chosen', 'input_template': 'Human: {}\nAssistant: ', 'apply_chat_template': True, 'use_wandb': '***REDACTED***', 'wandb_org': None, 'wandb_group': None, 'wandb_project': 'SimpleAlign', 'wandb_run_name': 'ref_align_mistral_129', 'optimization_choice': 'reinforce', 'baseline_choice': 'sample_avg', 'batch_base_buffer_limit': 4096, 'advantage_clip': 0.5, 'advantage_normalization': 0, 'adv_buffer_limit': 8192, 'is_train_on_input': 0, 'bert_model_type': '/run/determined/workdir/pretrained/FacebookAI/bart-large-mnli', 'score_type': 'recall', 'bert_idf': 1, 'bert_all_layer': 0, 'idf_dict_file': '/run/determined/workdir/dataset/idf_files/bart-large-mnli-Llama-3.3-70B-Inst-awq_ultrafeedback_sd13_21_42-60k-idf.pkl', 'rescale_with_baseline': 0, 'strings_per_token': 4.0, 'binary_data': 0, 'length_penalty_const': 0, 'mask_format_token': 1, 'simpo_beta': 2.0, 'simpo_gamma_beta_ratio': 0.1}
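The batch-size fields in the config above imply a fixed gradient-accumulation ratio. A hedged worked example (the world size is an assumption; only the two batch sizes come from the log):

```python
# From the logged config: each optimizer step consumes train_batch_size=512
# samples, but each forward/backward pass only fits micro_train_batch_size=16,
# so 512 // 16 = 32 micro-batches are accumulated per step.
train_batch_size = 512
micro_train_batch_size = 16
micro_batches_per_step = train_batch_size // micro_train_batch_size
print(micro_batches_per_step)  # 32

# With a hypothetical 8-GPU data-parallel setup (world size is NOT in the log),
# each rank would run 32 // 8 = 4 accumulation steps per optimizer step.
world_size = 8
grad_accum_steps = micro_batches_per_step // world_size
print(grad_accum_steps)  # 4
```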
2025-01-11 12:07:13,949 INFO MainThread:121 [wandb_init.py:init():610] starting backend
2025-01-11 12:07:13,949 INFO MainThread:121 [wandb_init.py:init():614] setting up manager
2025-01-11 12:07:13,950 INFO MainThread:121 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
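The line above records the backend choosing `spawn` from the platform's available start methods. The same check can be reproduced with the standard library; preferring `spawn` starts a fresh interpreter rather than fork()ing the (possibly multithreaded) parent process:

```python
import multiprocessing as mp

# Enumerate the start methods this platform supports and prefer "spawn",
# mirroring the choice the wandb backend logs. On Linux this typically
# yields ['fork', 'spawn', 'forkserver'].
available = mp.get_all_start_methods()
chosen = "spawn" if "spawn" in available else available[0]
ctx = mp.get_context(chosen)  # Process/Queue created via ctx use `chosen`
print(f"start_methods={','.join(available)}, using: {chosen}")
```

`spawn` is the safest default when the parent holds GPU contexts or extra threads, at the cost of slower worker startup than `fork`.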
2025-01-11 12:07:13,951 INFO MainThread:121 [wandb_init.py:init():622] backend started and connected
2025-01-11 12:07:13,955 INFO MainThread:121 [wandb_init.py:init():711] updated telemetry
2025-01-11 12:07:13,956 INFO MainThread:121 [wandb_init.py:init():744] communicating run to backend with 90.0 second timeout
2025-01-11 12:07:13,964 INFO MainThread:121 [wandb_init.py:init():795] starting run threads in backend
2025-01-11 12:07:16,919 INFO MainThread:121 [wandb_run.py:_console_start():2380] atexit reg
2025-01-11 12:07:16,919 INFO MainThread:121 [wandb_run.py:_redirect():2235] redirect: wrap_raw
2025-01-11 12:07:16,919 INFO MainThread:121 [wandb_run.py:_redirect():2300] Wrapping output streams.
2025-01-11 12:07:16,919 INFO MainThread:121 [wandb_run.py:_redirect():2325] Redirects installed.
2025-01-11 12:07:16,921 INFO MainThread:121 [wandb_init.py:init():838] run started, returning control to user process
2025-01-13 08:14:53,969 WARNING MsgRouterThr:121 [router.py:message_loop():77] message_loop has been closed