🤖 Harold's Notes

Qwen3-Omni

Oct 03, 2025

The authors demonstrate that joint multimodal training can achieve parity across all modalities (i.e., no modality-specific performance degradation) while markedly enhancing cross-modal capabilities such as video understanding.
A key ingredient is mixing unimodal and cross-modal data during the early stage of text pretraining.
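
To make the data-mixing idea concrete, here is a minimal sketch of weighted sampling across data sources. The source names and the mixture weights are hypothetical, chosen only for illustration; the note does not state the actual proportions or the sampling scheme used for Qwen3-Omni.

```python
import random
from collections import Counter

# Hypothetical mixture weights (NOT from the paper): unimodal sources
# (text, audio, image) mixed with cross-modal pairs (audio_text, video_text).
MIXTURE = {
    "text": 0.6,
    "audio": 0.1,
    "image": 0.1,
    "audio_text": 0.1,
    "video_text": 0.1,
}


def sample_sources(mixture, n, seed=0):
    """Draw n source names, each with probability proportional to its weight."""
    rng = random.Random(seed)
    names = list(mixture)
    weights = [mixture[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


# Each entry says which dataset the next training example is drawn from.
batch = sample_sources(MIXTURE, 1000)
counts = Counter(batch)
```

In a setup like this, every batch contains both unimodal and cross-modal examples from the start, rather than adding the other modalities only after text pretraining has finished.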
