We trained a personal voice DoRA on Qwen3-8B for $1.50 — beat stock model 100% in blind A/B



Blizine Admin

Yuka Kust
Posted on May 25

#ai #llm #machinelearning #showdev

TL;DR. Trained a DoRA adapter on Qwen3-8B using 6128 training pairs extracted from personal Telegram messages. Cost: $1.50 on a single Vast.ai RTX 3090. In blind head-to-head A/B, the DoRA-tuned model beat stock Qwen3-8B 100% of the time. Zero catastrophic forgetting on 50 general-knowledge tasks. On one prompt, the model actually beat the real human at sounding like themselves.

The full long-form write-up lives at the canonical URL: aiconic.company/en/journal/dora-personal-voice. This post is the dev.to-flavored version with the practical bits.

What we did

Took one person's Telegram export (DataExport JSON, 1047 personal chats) and wrote a custom extractor that produces (other_person_message, author_reply) pairs. Pairs were capped at 12 per chat so that a few very active chats don't dominate, then deduplicated. Final dataset: 6128 train + 322 valid pairs.

Trained a DoRA adapter on top of Qwen/Qwen3-8B. DoRA (Weight-Decomposed Low-Rank Adaptation, Liu et al. 2024) decomposes the pretrained weights into magnitude and direction, applies LoRA-style low-rank updates only to the direction component, and learns the magnitude as a separate trainable vector. The paper reports that this tracks full fine-tuning more closely than LoRA at the same rank.

The training config

```python
from peft import LoraConfig
from transformers import TrainingArguments

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections only
    use_dora=True,  # the only line that turns LoRA into DoRA
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,  # effective batch = 2 * 8 = 16
    bf16=True,
    gradient_checkpointing=True,
    optim="adamw_torch_fused",
)

# max_seq_length is not a TrainingArguments field, so it is kept separate here.
# It belongs on the SFT trainer side (for example TRL's SFTConfig, where the
# field name varies by version).
max_seq_length = 1024
```
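The magnitude/direction split described under "What we did" can be written compactly. The block below is a summary in the notation of the DoRA paper, not something taken from the training code; the shapes `d`, `k` and the tie to `r = 16` are stated as assumptions about how the config maps onto the formula.

```latex
% W_0 : frozen pretrained weight, shape d x k
% B, A: trainable low-rank factors, B is d x r and A is r x k (r = 16 in the config above)
% m   : trainable magnitude vector, shape 1 x k, initialised to the column norms of W_0
% ||.||_c : vector norm taken column by column
\[
  W' \;=\; m \cdot \frac{W_0 + BA}{\lVert W_0 + BA \rVert_c},
  \qquad m_{\text{init}} = \lVert W_0 \rVert_c .
\]
```

With `B` initialised to zero, as in LoRA, the direction term reduces to `W_0` divided by its column norms, so `W'` equals `W_0` at the start of training and the adapter begins as a no-op.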
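The post does not show the pairs extractor, so the following is only a minimal sketch of the steps described under "What we did": pair each message from the other person with the author's next reply, cap pairs per chat, and deduplicate. The names `extract_pairs` and `flatten_text` are hypothetical, and the field names (`messages`, `from_id`, `text`, with `text` sometimes a list of strings and entity objects) are assumptions about the Telegram export layout that should be checked against a real export.

```python
MAX_PAIRS_PER_CHAT = 12  # cap stated in the post


def flatten_text(text):
    """Collapse a message body to a plain string.

    Assumption: `text` is either a string or a list mixing strings and
    entity dicts of the form {"type": ..., "text": ...}.
    """
    if isinstance(text, str):
        return text.strip()
    parts = []
    for piece in text:
        parts.append(piece if isinstance(piece, str) else piece.get("text", ""))
    return "".join(parts).strip()


def extract_pairs(chats, author_id, max_per_chat=MAX_PAIRS_PER_CHAT):
    """Return deduplicated (other_person_message, author_reply) pairs."""
    seen = set()   # pairs already emitted, across all chats
    pairs = []
    for chat in chats:
        kept = 0          # pairs kept from this chat so far
        previous = None   # last non-empty text message in this chat
        for msg in chat.get("messages", []):
            body = flatten_text(msg.get("text", ""))
            if not body:
                continue  # skip stickers, service messages, empty text
            is_reply = (
                previous is not None
                and msg.get("from_id") == author_id
                and previous["from_id"] != author_id
            )
            if is_reply and kept < max_per_chat:
                pair = (previous["body"], body)
                if pair not in seen:
                    seen.add(pair)
                    pairs.append(pair)
                    kept += 1
            previous = {"from_id": msg.get("from_id"), "body": body}
    return pairs
```

In this sketch a pair is formed only when the author's message directly follows a message from someone else, so consecutive author messages yield one pair, not several. Whether the original extractor merges message bursts or filters by length is not stated in the post.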
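As a quick sanity check on the training config, the schedule length can be worked out from the numbers in the post. This is back-of-the-envelope arithmetic, assuming one training example per pair (no sequence packing) and a single GPU; the variable names are illustrative only.

```python
import math

train_pairs = 6128      # training pairs, from the post
per_device_batch = 2    # per_device_train_batch_size
grad_accum = 8          # gradient_accumulation_steps
num_gpus = 1            # single RTX 3090, per the post
epochs = 3              # num_train_epochs
warmup_steps = 50

# Examples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum * num_gpus

# Optimizer steps per epoch and over the whole run.
steps_per_epoch = math.ceil(train_pairs / effective_batch)
total_steps = steps_per_epoch * epochs

# Share of the run spent in learning-rate warmup.
warmup_fraction = warmup_steps / total_steps

print(effective_batch, steps_per_epoch, total_steps, round(warmup_fraction, 3))
# prints: 16 383 1149 0.044
```

Under those assumptions the run is 1149 optimizer steps, with warmup covering roughly the first 4% before the cosine decay takes over.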

📰Originally published at dev.to
