🔍 Model Trust Scorecard

Transparent, reproducible verification of AI model benchmark claims

Models Evaluated: 50
Average Trust Score: 28.8
Total Claims: 47
Claims Verified: 15%

Rankings

Rank | Model (Vendor) | Parameters / Context | Trust Score (Verified) | Capabilities | Use-Case Strengths | License
1 | Claude Opus 4.5 (Anthropic) | - / 200K ctx | 33.2 (1/7 verified) | Code | coding: 89.3, reasoning: 88.3, safety: 69.8 | unknown
2 | Llama 3.1 405B (Meta) | 405.0B / 128K ctx | 51.8 (1/5 verified) | Code • Tools • Multi-Lang | coding: 89.0, reasoning: 86.5, multilingual: 87.5 | unknown
3 | DeepSeek V3.2 (DeepSeek) | 671.0B / 128K ctx | 57.1 (2/6 verified) | Code | coding: 80.8, reasoning: 84.9 | unknown
4 | MiniMax M2.5 Cloud (MiniMax) | 32.0B / 128K ctx | 63.8 (1/1 verified) | Code • Agent | coding: 80.2 | unknown
5 | Kimi K2.5 Cloud (Moonshot) | 32.0B / 256K ctx | 9.8 (0/1 verified) | Vision • Code • Agent | coding: 79.8 | unknown
6 | GLM 5.1 Cloud (Z.ai) | 32.0B / 128K ctx | 9.8 (0/1 verified) | Code • Agent | coding: 78.5 | unknown
7 | MiniMax M2.7 Cloud (MiniMax) | 32.0B / 128K ctx | 9.8 (0/1 verified) | Code | coding: 78.0 | unknown
8 | Gemini 2.5 Pro (Google) | - / 2097K ctx | 29.8 (0/6 verified) | Code | coding: 75.0, reasoning: 84.8, safety: 68.1 | unknown
9 | GPT-4.1 (OpenAI) | - / 128K ctx | 43.7 (2/5 verified) | Code | coding: 72.8, reasoning: 86.5 | unknown
10 | Devstral 2 123B Cloud (Mistral) | 123.0B / 128K ctx | 9.8 (0/1 verified) | Code • Tools • Agent | coding: 72.2 | unknown
11 | DeepSeek R1 14B (DeepSeek) | 14.0B / 128K ctx | 22.2 (0/4 verified) | Code | coding: 57.6, reasoning: 81.2 | unknown
12 | Qwen3 14B (Alibaba) | 14.0B / 128K ctx | 26.0 (0/5 verified) | Code • Tools • Agent • Multi-Lang | reasoning: 79.2, multilingual: 81.1 | unknown
13 | Llama 3.2 Vision 11B (Meta) | 11.0B / 128K ctx | 22.2 (0/3 verified) | Vision | reasoning: 52.6 | unknown
14 | Hermes 3 8B (Nous) | 8.0B / 128K ctx | 14.8 (0/1 verified) | Code • Tools • Agent | commonsense: 3.1 | unknown
15 | GLM-4.7-Flash (Zhipu AI) | 9.0B / 128K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
16 | GLM-4-32B-0414-128K (Zhipu AI) | 32.0B / 128K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
17 | GLM-4.6V-Flash (Zhipu AI) | 9.0B / 128K ctx | N/A (0/0 verified) | Vision • Multi-Lang | - | unknown
18 | GLM-4.5-Air (Zhipu AI) | 9.0B / 128K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
19 | Nomic Embed Text (Nomic) | 0.137B / 8K ctx | N/A (0/0 verified) | - | - | unknown
20 | Cogito 2.1 671B Cloud (Deep Cogito) | 671.0B / 128K ctx | N/A (0/0 verified) | Code | - | unknown
21 | AutoGLM-Phone-Multilingual (Zhipu AI) | 9.0B / 32K ctx | N/A (0/0 verified) | Tools • Agent • Multi-Lang | - | unknown
22 | GLM-OCR (Zhipu AI) | 9.0B / 32K ctx | N/A (0/0 verified) | Vision • Multi-Lang | - | unknown
23 | GLM-4.5-AirX (Zhipu AI) | 9.0B / 128K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
24 | Granite 3.3 8B (IBM) | 8.0B / 128K ctx | N/A (0/0 verified) | Code | - | unknown
25 | GLM-4.7-FlashX (Zhipu AI) | 9.0B / 128K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
26 | Cogito 14B (Deep Cogito) | 14.0B / 128K ctx | N/A (0/0 verified) | Code | - | unknown
27 | CogVideoX-3 (Zhipu AI) | - / 8K ctx | N/A (0/0 verified) | - | - | unknown
28 | LLaVA (Community) | 7.0B / 4K ctx | N/A (0/0 verified) | Vision | - | unknown
29 | GLM-4.6V (Zhipu AI) | 32.0B / 128K ctx | N/A (0/0 verified) | Vision • Multi-Lang | - | unknown
30 | GLM-4.6V-FlashX (Zhipu AI) | 9.0B / 128K ctx | N/A (0/0 verified) | Vision • Multi-Lang | - | unknown
31 | GLM-5-Turbo (Zhipu AI) | 9.0B / 128K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
32 | GLM-ASR-2512 (Zhipu AI) | - / 8K ctx | N/A (0/0 verified) | - | - | unknown
33 | GLM-4.5-Flash (Zhipu AI) | 9.0B / 128K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
34 | Qwen3 VL 235B Cloud (Alibaba) | 235.0B / 128K ctx | N/A (0/0 verified) | Vision • Multi-Lang | - | unknown
35 | GLM-4.5 (Zhipu AI) | 9.0B / 128K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
36 | GLM-4.6 (Zhipu AI) | 32.0B / 128K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
37 | Gemma 3 4B (Google) | 4.0B / 128K ctx | N/A (0/0 verified) | Vision • Multi-Lang | - | unknown
38 | Nomic Embed Text v2 MoE (Nomic) | 0.55B / 8K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
39 | GLM-5V-Turbo (Zhipu AI) | 9.0B / 128K ctx | N/A (0/0 verified) | Vision • Multi-Lang | - | unknown
40 | GLM-4.5V (Zhipu AI) | 9.0B / 128K ctx | N/A (0/0 verified) | Vision • Multi-Lang | - | unknown
41 | MXBAI Embed Large (Mixedbread) | 0.335B / 0K ctx | N/A (0/0 verified) | - | - | unknown
42 | Embedding Gemma (Google) | 0.3B / 8K ctx | N/A (0/0 verified) | - | - | unknown
43 | DeepCoder (Community) | 14.0B / 8K ctx | N/A (0/0 verified) | Code | - | unknown
44 | Qwen3 Embedding 0.6B (Alibaba) | 0.6B / 32K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
45 | GLM-4.7 (Zhipu AI) | 32.0B / 128K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
46 | GLM-5 (Zhipu AI) | 9.0B / 128K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
47 | GLM-5.1 (Zhipu AI) | 32.0B / 256K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
48 | CogView-4-250304 (Zhipu AI) | - / 8K ctx | N/A (0/0 verified) | Vision | - | unknown
49 | GLM-Image (Zhipu AI) | - / 8K ctx | N/A (0/0 verified) | Vision | - | unknown
50 | GLM-4-Plus (Zhipu AI) | 9.0B / 128K ctx | N/A (0/0 verified) | Multi-Lang | - | unknown
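
The summary figures at the top of the page can be cross-checked against the ranking rows above. The sketch below is not the project's own aggregation code. It assumes that models with an N/A trust score are left out of the average and that the verified percentage is rounded to a whole number. Both assumptions are consistent with the published values. The names ROWS and summarize are illustrative only.

```python
# (trust_score, verified_claims, total_claims) for the 14 scored models,
# copied from the ranking rows. Models with an N/A score and 0/0 claims
# are omitted (assumption: they do not count toward the average).
ROWS = [
    (33.2, 1, 7), (51.8, 1, 5), (57.1, 2, 6), (63.8, 1, 1),
    (9.8, 0, 1), (9.8, 0, 1), (9.8, 0, 1), (29.8, 0, 6),
    (43.7, 2, 5), (9.8, 0, 1), (22.2, 0, 4), (26.0, 0, 5),
    (22.2, 0, 3), (14.8, 0, 1),
]


def summarize(rows):
    """Recompute the headline figures from per-model rows."""
    scores = [score for score, _, _ in rows]
    verified = sum(v for _, v, _ in rows)
    total = sum(t for _, _, t in rows)
    return {
        "average_trust_score": round(sum(scores) / len(scores), 1),
        "total_claims": total,
        "claims_verified_pct": round(100 * verified / total),
    }


if __name__ == "__main__":
    print(summarize(ROWS))
```

Under these assumptions the recomputed values match the page: 7 of 47 claims are verified, which rounds to 15%.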

Where results go

  • This page is generated into docs/index.html for GitHub Pages.
  • The machine-readable aggregate is committed as trust_scores.json.
  • A markdown summary is committed as trust_scores.md.
  • Per-model verification reports are attached to the workflow run as artifacts.
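
For downstream use, the aggregate file can be read with any JSON parser. The schema of trust_scores.json is not documented on this page, so the sketch below assumes a hypothetical shape: a top-level "models" list whose entries carry "model" and "trust_score" keys, with null standing in for N/A. Adjust the key names to match the real file.

```python
import json


def scored_models(aggregate_text):
    """Return (model, trust_score) pairs, highest score first.

    Assumes a hypothetical schema: {"models": [{"model": ..., "trust_score": ...}]}.
    Entries without a numeric score (N/A on the page) are skipped.
    """
    data = json.loads(aggregate_text)
    scored = [
        (entry["model"], entry["trust_score"])
        for entry in data["models"]
        if entry.get("trust_score") is not None
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Sample in the assumed shape, using three values from the rankings.
SAMPLE = """
{"models": [
  {"model": "GPT-4.1", "trust_score": 43.7},
  {"model": "GLM-5", "trust_score": null},
  {"model": "DeepSeek V3.2", "trust_score": 57.1}
]}
"""

if __name__ == "__main__":
    for name, score in scored_models(SAMPLE):
        print(f"{name}: {score}")
```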

Request a review for a model not listed here

  • Open the Model Submission issue.
  • Or submit a PR that adds models/<model-id>.json to the catalog.
  • For one-off checks, run the CLI with pasted claims via trust-scorecard score --text or --text-file.
  • Raw ollama list output and categorized Markdown lists are both accepted as input. Models that do not map to a catalog entry are skipped until catalog data or pasted claims are provided.
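
The skip behaviour for unmapped models can be pictured with a small sketch. It does not reproduce the tool's actual matching logic, which is not described here. It assumes the usual column layout of ollama list output (NAME, ID, SIZE, MODIFIED) and a hypothetical CATALOG mapping from Ollama tags to catalog ids. The sample IDs and sizes are placeholders.

```python
def parse_ollama_list(text):
    """Extract model names (first column) from `ollama list` output."""
    names = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and the NAME/ID/SIZE/MODIFIED header row.
        if not parts or parts[0] == "NAME":
            continue
        names.append(parts[0])
    return names


def split_by_catalog(names, catalog):
    """Return (matched catalog ids, skipped names)."""
    matched = [catalog[name] for name in names if name in catalog]
    skipped = [name for name in names if name not in catalog]
    return matched, skipped


# Hypothetical mapping; real catalog ids live in models/<model-id>.json.
CATALOG = {"qwen3:14b": "qwen3-14b", "hermes3:8b": "hermes-3-8b"}

# Placeholder inventory in the assumed `ollama list` layout.
SAMPLE = """
NAME                ID              SIZE      MODIFIED
qwen3:14b           aaaaaaaaaaaa    9.3 GB    2 days ago
hermes3:8b          bbbbbbbbbbbb    4.7 GB    3 weeks ago
my-finetune:latest  cccccccccccc    4.1 GB    5 minutes ago
"""

if __name__ == "__main__":
    matched, skipped = split_by_catalog(parse_ollama_list(SAMPLE), CATALOG)
    print("matched:", matched)
    print("skipped:", skipped)
```

In this sketch my-finetune:latest ends up in the skipped list because the hypothetical catalog has no entry for it.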