Large Language Models
docAnalyzer.ai is committed to staying at the forefront of AI technology, offering the latest models from leading providers including Anthropic, DeepSeek, Google, Meta, OpenAI, and xAI. Our rapid deployment infrastructure lets us make new models available within days, sometimes hours, of their release, so you always have access to cutting-edge AI capabilities. Unlike most competitors, who limit you to a single provider, we offer an extensive selection of models, giving you the freedom to choose the right AI for your specific needs. Each model is evaluated for quality and speed, so you have the power you need to chat with your documents and create AI agents. The detailed specifications of each available model are listed below.
Model | Provider | Quality | Speed (tokens/s) | Latency* (s) |
---|---|---|---|---|
o4 mini | OpenAI | 70 | 130 | 49.66 |
Gemini 2.5 Pro Preview 05-06 | Google Gemini | 69 | 148.8 | 44.13 |
o3 | OpenAI | 67 | 206.9 | 14.24 |
DeepSeek R1 | DeepSeek | 60 | 24.4 | 3.85 |
Claude Opus 4 | Anthropic | 58 | 54.9 | 2.39 |
Gemini 2.5 Flash Preview 05-20 | Google Gemini | 54 | 235.5 | 9.08 |
GPT 4.1 | OpenAI | 53 | 100 | 0.53 |
Claude Sonnet 4 | Anthropic | 53 | 77.9 | 1.61 |
GPT 4.1 mini | OpenAI | 53 | 72.7 | 0.56 |
DeepSeek V3 | DeepSeek | 53 | 24.7 | 3.55 |
meta-llama/llama-4-maverick-17b-128e-instruct | Groq | 51 | 157.8 | 0.35 |
Grok 3 | xAI | 51 | 68.7 | 0.64 |
Gemini 2.0 Flash | Google Gemini | 48 | 229.7 | 0.34 |
qwen-qwq-32b | Groq | 44 | 64.8 | 1.15 |
meta-llama/llama-4-scout-17b-16e-instruct | Groq | 43 | 122.5 | 0.36 |
Gemini 2.0 Flash-Lite | Google Gemini | 41 | 206.4 | 0.28 |
GPT 4.1 nano | OpenAI | 41 | 136.2 | 0.32 |
Claude 3.5 Haiku | Anthropic | 35 | 63.7 | 0.68 |
Grok 3 mini | xAI | TBD | 132.5 | 0.3 |
* Lower latency values indicate better performance; the latency progress bars on this page use a logarithmic scale.
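To get a feel for how latency and speed combine in practice, here is a minimal sketch that estimates end-to-end response time from the table's figures. It assumes latency approximates time to first token and speed is sustained generation throughput; both are simplifications, and real response times vary with load, prompt length, and streaming behavior.

```python
# Rough end-to-end response time estimate from the table above,
# assuming: latency ~ time to first token, speed ~ sustained tokens/s.

def estimated_response_time(latency_s: float, speed_tps: float, output_tokens: int) -> float:
    """Approximate seconds until the full response is generated."""
    return latency_s + output_tokens / speed_tps

# (latency in seconds, speed in tokens/s), values taken from the table.
models = {
    "GPT 4.1": (0.53, 100.0),
    "Gemini 2.0 Flash": (0.34, 229.7),
    "DeepSeek R1": (3.85, 24.4),
}

# Rank models by estimated time to produce a 500-token answer.
for name, (lat, tps) in sorted(
    models.items(), key=lambda kv: estimated_response_time(kv[1][0], kv[1][1], 500)
):
    print(f"{name}: ~{estimated_response_time(lat, tps, 500):.1f} s")
```

As the example shows, a high-throughput model can win on long answers even when a lower-latency model feels snappier for short ones.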