ML Training Pipeline

Code-Trainer V6

A six-phase pipeline for building and deploying a multimodal code-generation model that produces source code from screenshots of VS Code.

Training Samples · 32,658 (HuggingFace dataset)
Screenshot Captures · 32,727 (VS Code + Monaco)
Languages · 8 programming languages
Pipeline Phases · 6 (2 complete)

Pipeline Progress: 2/6 phases

01 Data Collection
02 Preprocessing
03 Vision Model
04 Qwen Fine-tuning
05 GGUF Deployment
06 Inference Agent
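
The phase list and progress counter above can be modeled as a small data structure. The following is a minimal sketch, not the project's actual code: the `Phase` class and the `progress` helper are hypothetical names, and it assumes phases complete in order, so the two finished phases are taken to be 01 and 02.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    number: int
    name: str
    complete: bool


# Phase names as listed on the page.
PHASE_NAMES = [
    "Data Collection",
    "Preprocessing",
    "Vision Model",
    "Qwen Fine-tuning",
    "GGUF Deployment",
    "Inference Agent",
]

# From "2/6 phases". Assumption: phases finish in order, so 01 and 02 are done.
COMPLETED = 2

phases = [
    Phase(number=i, name=name, complete=(i <= COMPLETED))
    for i, name in enumerate(PHASE_NAMES, start=1)
]


def progress(items: list[Phase]) -> tuple[int, int]:
    """Return (completed, total) for a list of phases."""
    return sum(p.complete for p in items), len(items)


done, total = progress(phases)
# First phase that is not yet complete, or None if everything is done.
next_phase = next((p for p in phases if not p.complete), None)

print(f"Pipeline Progress: {done}/{total} phases")
if next_phase is not None:
    print(f"Next up: {next_phase.number:02d} {next_phase.name}")
```

Under the in-order assumption, the next phase to run would be 03 Vision Model.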

Base Model

Qwen2.5-Coder-14B-Instruct

Hardware

RTX 5060 Ti 16GB (Blackwell)
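
Pairing a 14B base model with a 16 GB card raises the question of whether the weights fit in memory at all. The back-of-the-envelope arithmetic below is a rough sketch: it uses the nominal 14 billion parameters from the model name (the exact count differs slightly), counts weights only, and ignores activations, optimizer state, and KV cache. The precisions listed are illustrative; the page does not state which one the project uses.

```python
# Nominal parameter count read off the "14B" in the model name (assumption).
PARAMS = 14e9
# VRAM of the card named in the Hardware entry, in decimal GB.
VRAM_GB = 16


def weight_gb(params: float, bits_per_param: float) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


# Illustrative precisions; which one the project uses is not stated.
for label, bits in [("fp16/bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
    gb = weight_gb(PARAMS, bits)
    verdict = "fits" if gb < VRAM_GB else "does not fit"
    print(f"{label:>9}: {gb:5.1f} GB of weights -> {verdict} in {VRAM_GB} GB")
```

At 16-bit precision the weights alone come to about 28 GB, which exceeds the card. At 8 bits they take about 14 GB and leave little headroom. At 4 bits they take about 7 GB, which suggests, though the page does not confirm it, that some form of quantization is involved in training or deployment.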

HF Dataset

cmndcntrlcyber/code-trainer-v6-dataset
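
The dataset identifier above can presumably be passed to the Hugging Face `datasets` library. The sketch below makes several assumptions: that the repository is accessible to the caller, that it exposes a split named `train`, and that the `datasets` package is installed. `load_training_split` is a hypothetical helper name, and the page does not document the column layout.

```python
# Repository ID as given on the page.
DATASET_ID = "cmndcntrlcyber/code-trainer-v6-dataset"


def load_training_split(split: str = "train"):
    """Load one split of the dataset from the Hugging Face Hub.

    Assumption: a split named "train" exists; adjust if the repo differs.
    """
    # Imported lazily so this module can be read without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)
```

Calling `load_training_split()` and inspecting `.column_names` on the result would be one way to discover the actual schema before writing any preprocessing code.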

Stack

Python · Core language
PyTorch · ML framework
Transformers · Model loading & training
PEFT · LoRA adapters
TRL · SFT training
Playwright · Screenshot capture
W&B · Experiment tracking
uv · Dependency management
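
The stack lists PEFT for LoRA adapters. LoRA freezes a weight matrix of shape d_out × d_in and trains two low-rank factors, B (d_out × r) and A (r × d_in), so the trainable parameter count per adapted matrix is r · (d_in + d_out). The sketch below works through that arithmetic for a single hypothetical 4096 × 4096 projection at rank 16. Both numbers are illustrative assumptions, not values taken from this project or from the base model's configuration.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters LoRA adds: B is (d_out, rank), A is (rank, d_in)."""
    return rank * (d_in + d_out)


# Hypothetical layer size and rank, chosen only to make the arithmetic concrete.
D_OUT, D_IN, RANK = 4096, 4096, 16

dense = full_params(D_OUT, D_IN)
adapter = lora_params(D_OUT, D_IN, RANK)
ratio = adapter / dense

print(f"dense weights:   {dense:,}")
print(f"LoRA trainable:  {adapter:,}")
print(f"fraction of dense: {ratio:.4%}")
```

For these illustrative sizes the adapter trains 131,072 parameters against 16,777,216 frozen ones, which is under 1% of the matrix. That reduction is what makes fine-tuning a large model on a single consumer GPU plausible.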