Latin Extended scores highest because phonetic extensions are deliberately designed to resemble their Latin base forms. Mathematical Alphanumeric Symbols dominate the dataset (806 of 1,418 pairs) but score low because ornate mathematical letterforms (script, fraktur, double-struck) look nothing like plain Latin in a different font. Arabic scores lowest: the letterforms are structurally different from Latin even when confusables.txt maps them as confusable.
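The per-block comparison above amounts to grouping each confusable pair by the Unicode block of its source character and averaging the similarity scores. Below is a minimal sketch of that aggregation, not the pipeline actually used here. The `ScoredPair` shape, the helper names, and the choice to fold Latin Extended-A/B and the Phonetic Extensions blocks into a single "Latin Extended" group are all assumptions made for illustration. The block ranges themselves come from the Unicode standard.

```typescript
// Hypothetical shape of one scored pair (not taken from the source).
interface ScoredPair {
  source: string; // the character that confusables.txt maps to a Latin target
  score: number;  // similarity score in [0, 1]
}

// Code point ranges from the Unicode block list. Grouping several blocks
// under the label "Latin Extended" is an assumption made for this sketch.
const BLOCKS: Array<[string, number, number]> = [
  ["Latin Extended", 0x0100, 0x024f], // Latin Extended-A and Latin Extended-B
  ["Arabic", 0x0600, 0x06ff],
  ["Latin Extended", 0x1d00, 0x1dbf], // Phonetic Extensions and Supplement
  ["Mathematical Alphanumeric Symbols", 0x1d400, 0x1d7ff],
];

// Return the group label for the first code point of `ch`, or "Other".
function blockOf(ch: string): string {
  const cp = ch.codePointAt(0);
  if (cp === undefined) return "Other";
  for (const [name, lo, hi] of BLOCKS) {
    if (cp >= lo && cp <= hi) return name;
  }
  return "Other";
}

// Average the scores of all pairs that fall into the same group.
function meanScoreByBlock(pairs: ScoredPair[]): Map<string, number> {
  const sums = new Map<string, { total: number; count: number }>();
  for (const pair of pairs) {
    const name = blockOf(pair.source);
    const entry = sums.get(name) ?? { total: 0, count: 0 };
    entry.total += pair.score;
    entry.count += 1;
    sums.set(name, entry);
  }
  const means = new Map<string, number>();
  sums.forEach((entry, name) => means.set(name, entry.total / entry.count));
  return means;
}
```

A block that contributes many pairs, as Mathematical Alphanumeric Symbols does with 806 of 1,418, has a large effect on any overall average, which is one reason to report the mean per block rather than a single figure for the whole dataset.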
const first = await peekFirstChunk(stream);