How to pick a model.
There are three things that matter when picking a model: how good is it at the task, how fast is it, and how much does it cost. Everything else is a footnote.
| Family | Notes |
|---|---|
| Claude (Anthropic) | Strongest at long contexts, agentic loops, and code. Opus is the heavy, Sonnet is the workhorse, Haiku is the fast cheap one. |
| GPT (OpenAI) | Broad capability, deepest tool/function ecosystem. o-series for reasoning. Mini for sub-tasks. |
| Gemini (Google) | Multimodal native, longest context, cheapest at the high end. Flash for speed. |
| Grok (xAI) | Improving fast. Strong reasoning at the top tier. |
Pick by task, not family. Code-writing agents tend to do well with Claude. Multimodal RAG over PDFs tends to favor Gemini. Search-heavy work tends to favor GPT.
Hosted via Together, Fireworks, Groq, or Replicate. Self-hosted via vLLM or Ollama.
text-embedding-3-large, Cohere embed-v3, Voyage. See RAG.