Offline Text LLMs
02 Large Offline Text Models (Unimodal)
This chapter introduces practical offline text-only LLM options for local deployment. It compares representative open models, explains their strengths and tradeoffs, and helps you choose a model that matches Jetson resource constraints and your target application.
11.02-01 Meta AI: Llama 3.2
Introduction
Llama 3.2 is a release in Meta's open model family that pairs lightweight 1B and 3B text models with larger 11B and 90B vision-language models. This chapter focuses on the text variants for offline inference. The small text models are widely used because they balance capability against deployment cost.

Model size
| Variant | Modality | Typical fit |
|---|---|---|
| Llama 3.2 1B | Text | Smallest local text-only option |
| Llama 3.2 3B | Text | Practical local deployment on Jetson |
| Llama 3.2 11B Vision | Vision-language | Image understanding with higher compute demand |
| Llama 3.2 90B Vision | Vision-language | Server-class deployment rather than single-device use |
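
To relate the sizes above to Jetson memory, a back-of-the-envelope estimate of weight storage is parameters times bits per weight, divided by 8. The sketch below uses nominal parameter counts (1B, 3B) and illustrative 16-bit and 4-bit widths. Both are assumptions, and the result excludes the KV cache and runtime overhead, so real usage will be higher.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Nominal sizes from the table; bit widths are illustrative assumptions.
for name, params in [("Llama 3.2 1B", 1.0), ("Llama 3.2 3B", 3.0)]:
    for bits in (16, 4):
        gib = weight_memory_gib(params, bits)
        print(f"{name} at {bits}-bit: ~{gib:.1f} GiB of weights")
```

Treat the numbers as a lower bound when deciding whether a variant fits on a given board.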
Run Llama 3.2
Run the model with ollama run. If the model is not available locally, Ollama will download it first and then start inference.

ollama run llama3.2:3b

Dialogue test

who are you?

Exit the dialogue
Use Ctrl + d to exit the conversation.
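
For scripted use rather than an interactive session, the same model can be queried through the local HTTP API that Ollama serves. The following is a minimal sketch, assuming the server is running on its default address http://localhost:11434 and that llama3.2:3b has already been downloaded. The helper names are illustrative and not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


# Example call, left commented out because it needs a running Ollama server:
# print(generate("llama3.2:3b", "who are you?"))
```

The generous timeout is a guess for small boards, where the first request also has to load the model into memory.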
11.02-02 Alibaba Cloud: Qwen3
Introduction
Qwen3 is a new-generation open model family from Alibaba Cloud. It spans sizes from 0.6B to 235B parameters, supports long context windows, and includes both dense and mixture-of-experts (MoE) variants. This range makes Qwen3 suitable for edge testing, workstation use, and larger server deployment.

Model size
| Variant | Family type | Notes |
|---|---|---|
| Qwen3 0.6B | Dense | Smallest local deployment option |
| Qwen3 1.7B / 4B / 8B | Dense | Common edge and workstation sizes |
| Qwen3 14B / 32B | Dense | Larger local or server deployment |
| Qwen3 30B-A3B | MoE | Mixture-of-experts model; about 3B of its 30B parameters are active per token |
| Qwen3 235B-A22B | MoE | Largest flagship model for server-scale deployment; about 22B of 235B parameters are active per token |
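
The MoE names encode two numbers: total parameters and the parameters active per token. As a rough illustration, the sketch below compares a dense and an MoE variant. It assumes the nominal counts in the names, an illustrative 4-bit weight format, and that all experts stay resident in memory, so memory tracks the total count while per-token compute tracks the active count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    total_b: float   # total parameters, in billions (nominal, from the name)
    active_b: float  # parameters used per token, in billions

    @property
    def active_fraction(self) -> float:
        """Share of the weights that take part in each token's computation."""
        return self.active_b / self.total_b

    def weights_gb(self, bits_per_weight: float = 4.0) -> float:
        """Rough weight storage in GB; the 4-bit default is an assumption."""
        return self.total_b * bits_per_weight / 8


VARIANTS = [
    Variant("Qwen3 32B", 32, 32),          # dense: every parameter is active
    Variant("Qwen3 30B-A3B", 30, 3),       # MoE
    Variant("Qwen3 235B-A22B", 235, 22),   # MoE
]

for v in VARIANTS:
    print(f"{v.name}: ~{v.weights_gb():.0f} GB of weights, "
          f"{v.active_fraction:.0%} active per token")
```

Under these assumptions, 30B-A3B needs about as much memory as a dense 30B model even though it computes with far fewer parameters per token.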
Run Qwen3
Run the model with ollama run. If the model is missing locally, Ollama downloads it automatically.

ollama run qwen3:8b

Dialogue test

please tell me a story.

Exit the dialogue
Use Ctrl + d to exit the conversation.
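
The interactive session keeps the conversation context for you. When driving the model from a script through Ollama's /api/chat endpoint, the caller resends the message history on each turn. The sketch below shows one way to manage that history, assuming the default local address and an already downloaded qwen3:8b. The function names are illustrative.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
CHAT_URL = "http://localhost:11434/api/chat"


def append_turn(history: list, role: str, content: str) -> list:
    """Return a new history with one more message; the input is not modified."""
    return history + [{"role": role, "content": content}]


def chat(model: str, history: list, timeout: float = 300.0) -> list:
    """Send the full history and return it extended with the model's reply."""
    body = json.dumps({"model": model, "messages": history, "stream": False})
    request = urllib.request.Request(
        CHAT_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.loads(response.read())["message"]
    return append_turn(history, reply["role"], reply["content"])


# Example two-turn exchange, commented out because it needs a running server:
# history = append_turn([], "user", "please tell me a story.")
# history = chat("qwen3:8b", history)
# history = append_turn(history, "user", "make it shorter.")
# history = chat("qwen3:8b", history)
```

Because the whole history is resent, long conversations consume more of the context window on every turn.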
11.02-03 Microsoft: Phi-4-mini
Introduction
Phi-4-mini is a compact language model in Microsoft's Phi family. It is designed for efficient reasoning with relatively low resource requirements, which makes it a practical option for constrained edge environments.

Model size
| Variant | Parameters | Notes |
|---|---|---|
| Phi-4-mini | ~3.8B | Compact reasoning-focused model with long-context support |
Run Phi-4-mini
Run the model with ollama run. If the model is not installed, Ollama pulls it first.

ollama run phi4-mini:3.8b

Dialogue test

who are you?

Exit the dialogue
Use Ctrl + d to exit the conversation.
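
On constrained hardware it often helps to show tokens as they arrive instead of waiting for the full reply. When streaming is enabled, Ollama's /api/generate endpoint returns one JSON object per line, each carrying a text fragment, and marks the last one with "done": true. The sketch below reassembles such a stream. The sample lines are made up for illustration, and the helper name is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_fragments(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from a line-delimited JSON stream until done."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank keep-alive lines
        chunk = json.loads(raw)
        yield chunk.get("response", "")
        if chunk.get("done"):
            break


# Made-up sample standing in for a real streamed response.
sample_stream = [
    b'{"response": "Hel", "done": false}\n',
    b'{"response": "lo", "done": false}\n',
    b'{"response": "", "done": true}\n',
]
print("".join(iter_fragments(sample_stream)))

# With a live server, the HTTP response object can be passed in directly,
# since iterating over it yields one line at a time:
# with urllib.request.urlopen(request) as response:
#     for fragment in iter_fragments(response):
#         print(fragment, end="", flush=True)
```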
11.02-04 DeepSeek: DeepSeek-R1
Introduction
DeepSeek-R1 is an open reasoning-focused model family. Compared with models optimized mainly for fluent text generation, it emphasizes structured reasoning ability for tasks such as logic, mathematics, and coding.

Model size
| Variant | Typical scale | Notes |
|---|---|---|
| DeepSeek-R1 1.5B / 7B / 8B | Small distilled variants | Easiest to test locally |
| DeepSeek-R1 14B / 32B | Medium distilled variants | Better reasoning quality with higher memory demand |
| DeepSeek-R1 70B / 671B | Large distilled variant and the full model | Better suited to server-class hardware than a single Jetson |
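
One way to turn the table into a starting choice is to pick the largest variant whose weights fit within a memory budget. The sketch below is only a heuristic. The 4-bit weight format and the 60% usable fraction are assumptions, chosen to leave room for the KV cache and for the operating system, which shares memory with the GPU on Jetson. The function name is hypothetical.

```python
from typing import Optional

# Nominal sizes in billions of parameters, taken from the table above.
R1_SIZES_B = [1.5, 7, 8, 14, 32, 70]


def largest_fitting(
    memory_gb: float,
    bits_per_weight: float = 4.0,   # assumed quantization width
    usable_fraction: float = 0.6,   # assumed share of memory left for weights
) -> Optional[float]:
    """Return the largest size (in billions) that fits, or None if none do."""
    budget_gb = memory_gb * usable_fraction
    fitting = [s for s in R1_SIZES_B if s * bits_per_weight / 8 <= budget_gb]
    return max(fitting) if fitting else None


for memory in (8, 16, 64):
    print(f"{memory} GB device: largest candidate is {largest_fitting(memory)}B")
```

The result is a candidate to test, not a guarantee. Measured memory use on the target board should decide the final choice.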
Run DeepSeek-R1
Run the model with ollama run. If needed, Ollama will download the model automatically before execution.

ollama run deepseek-r1

Dialogue test

who are you?

Exit the dialogue
Use Ctrl + d to exit the conversation.
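
DeepSeek-R1 typically writes its reasoning inside <think>...</think> tags before the final answer. When only the answer is needed downstream, the two parts can be separated. The sketch below assumes the reasoning arrives inline in the text. Depending on the Ollama version and settings, it may instead be returned separately, in which case this step is unnecessary. The sample text is made up.

```python
import re
from typing import Tuple

# Non-greedy match so multiple think blocks are handled independently.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer) from text that may contain think blocks."""
    reasoning = "\n".join(block.strip() for block in _THINK_RE.findall(text))
    answer = _THINK_RE.sub("", text).strip()
    return reasoning, answer


# Made-up sample output for illustration.
sample = "<think>\nThe user asks who I am.\n</think>\n\nI am an AI assistant."
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```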
References
Ollama
- Official: https://ollama.com/
- GitHub: https://github.com/ollama/ollama
Llama 3.2
- Official docs: https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_2/
- Ollama model page: https://ollama.com/library/llama3.2
Qwen3
- GitHub: https://github.com/QwenLM/Qwen3
- Ollama model page: https://ollama.com/library/qwen3
Phi-4-mini
- Ollama model page: https://ollama.com/library/phi4-mini
DeepSeek-R1
- Ollama model page: https://ollama.com/library/deepseek-r1