GLOSSARY

Prompt Engineering

Prompt engineering is the craft of designing and evaluating LLM instructions — structure, examples, chain-of-thought, and test harnesses for production quality.

Quick answer
Prompt engineering is the practice of designing, testing, and iterating the instructions given to a large language model or image model so the model produces reliable, task-appropriate output. It spans prompt wording, system messages, few-shot examples, tool schemas, retrieval context, and output formatting, and is treated like code in mature production systems — versioned, evaluated, and monitored.

WHAT IT IS

The discipline covers prompt structure (role, task, context, format, constraints), in-context examples (few-shot), reasoning scaffolds (chain-of-thought, tree-of-thought, ReAct), decomposition (planner-executor patterns), tool-use formatting, system-prompt engineering for product applications, and guardrails against injection and jailbreak. The Anthropic prompt library, OpenAI prompting guide, and Google prompting guide are the canonical references.
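The role/task/context/format/constraints structure can be sketched as a small prompt builder. This is a minimal illustration, not a standard schema: the function name, field names, and the triage example are all hypothetical.

```python
# Sketch of the role / task / context / format / constraints structure,
# plus optional few-shot examples. All names here are illustrative.

def build_prompt(role, task, context, output_format, constraints, examples=()):
    """Assemble a structured prompt from named sections."""
    sections = [
        f"Role: {role}",
        f"Task: {task}",
        f"Context:\n{context}",
        f"Output format: {output_format}",
        "Constraints:\n" + "\n".join(f"- {c}" for c in constraints),
    ]
    # Few-shot examples anchor the expected input -> output mapping.
    for inp, out in examples:
        sections.append(f"Example input: {inp}\nExample output: {out}")
    return "\n\n".join(sections)

prompt = build_prompt(
    role="You are a support-ticket triage assistant.",
    task="Classify the ticket into one of: billing, bug, feature_request.",
    context="Ticket: 'I was charged twice this month.'",
    output_format='One JSON object: {"label": "<category>"}',
    constraints=["Answer with JSON only", "Use exactly one label"],
    examples=[("App crashes on login.", '{"label": "bug"}')],
)
```

Keeping each section named and separate makes prompts diffable and testable, which is what lets them be versioned like code.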

HOW IT WORKS

At production scale, prompts are versioned like code and evaluated against golden datasets and LLM-as-judge scoring. Regressions are caught before users see them. Prompt engineering is therefore evaluation engineering in disguise — the prompt is the interface, but the test harness is the discipline.
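The versioned-prompt-plus-golden-dataset loop can be sketched in a few lines. This is a toy harness under stated assumptions: `call_model` and `judge` are stand-ins for a real model API call and an LLM-as-judge (or exact-match) scorer, and the dataset, threshold, and version string are invented for illustration.

```python
# Sketch of a golden-dataset regression gate for prompt changes.
# `call_model` and `judge` are placeholders, not real APIs.

GOLDEN_SET = [
    {"input": "I was charged twice.", "expected_label": "billing"},
    {"input": "App crashes on login.", "expected_label": "bug"},
]

def call_model(prompt_version: str, text: str) -> str:
    # Placeholder: in production this would call the model with the
    # versioned prompt identified by `prompt_version`.
    return "billing" if "charged" in text else "bug"

def judge(output: str, expected: str) -> float:
    # Placeholder for an LLM-as-judge or exact-match scorer in [0, 1].
    return 1.0 if output == expected else 0.0

def run_eval(prompt_version: str, threshold: float = 0.95) -> bool:
    scores = [
        judge(call_model(prompt_version, case["input"]), case["expected_label"])
        for case in GOLDEN_SET
    ]
    # Gate the deploy: a mean score below threshold blocks the prompt change.
    return sum(scores) / len(scores) >= threshold
```

Running `run_eval` on every prompt edit is what turns prompt engineering into the evaluation engineering described above: the regression is caught at review time, not by users.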

WHEN TO USE

Invest in prompt engineering when an LLM is in production, when quality/variance matters, or when the same prompt is reused across large volumes of inputs or users.


RELATED QUESTIONS

What is prompt engineering?
Prompt engineering is the practice of designing, testing, and iterating the instructions given to a large language model or image model so the model produces reliable, task-appropriate output. It spans prompt wording, system messages, few-shot examples, tool schemas, retrieval context, and output formatting.
Is prompt engineering a durable discipline?
The craft is evolving quickly as models improve and as tooling (DSPy, LangChain, LangGraph, evaluators) automates parts of the work. But the underlying discipline — instructing a model precisely, testing on held-out examples, and adapting as the model changes — remains essential for any production AI system.
How does prompt engineering differ from fine-tuning?
Prompt engineering adjusts model behavior at inference time without changing the model's parameters. Fine-tuning updates the model itself on task-specific data. Prompt engineering is cheaper, faster, and reversible; fine-tuning buys deeper specialization at higher cost and with more operational burden.
What makes a production prompt reliable?
Clear role and task instructions, explicit output format, examples or constraints that handle edge cases, grounding in retrieved context rather than relying on model memory, and an evaluation set that runs on every prompt change. Prompts without an eval set regress silently as models update.
How does NUUN AI approach prompts?
We treat prompts like code — versioned, tested against held-out evals, and instrumented in production. Every production AI build has a named prompt owner and a regression suite that catches behavior changes when upstream models update.
