Cross-Attention Mechanism
Definition
An attention mechanism that allows a model to attend to external context (such as tool definitions, RAG documents, or structured knowledge) when generating output. Needle's research identifies cross-attention as the right primitive for tool calling, enabling efficient matching of queries to tools and argument extraction without requiring large FFN parameters.
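The mechanism can be sketched as scaled dot-product attention where the queries come from the model's input and the keys/values come from external context. The sketch below is illustrative, not Needle's implementation; the tool names and embeddings are hypothetical stand-ins for real tool-schema embeddings.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    """Scaled dot-product cross-attention.

    queries: (n_q, d)   e.g. embeddings of the user query
    context: (n_ctx, d) e.g. embeddings of tool schema definitions
    Returns attended context vectors and the attention weights.
    """
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)   # (n_q, n_ctx)
    weights = softmax(scores, axis=-1)          # rows sum to 1
    return weights @ context, weights

# Toy example: hand-built embeddings for three hypothetical tools.
tools = np.array([
    [1.0, 0.0, 0.0],   # "get_weather"
    [0.0, 1.0, 0.0],   # "search_web"
    [0.0, 0.0, 1.0],   # "send_email"
])
query = np.array([[0.1, 0.9, 0.1]])  # query most aligned with tool index 1
out, w = cross_attention(query, tools)
print(int(w[0].argmax()))  # → 1 (attention weight identifies the matching tool)
```

Selecting the highest-weight context row is the matching step described above; in a full model the attended output `out` would feed subsequent layers that emit the tool call and its arguments.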
Examples in the Wild
- Example 1: Attending to tool schema definitions when deciding which tool to call
- Example 2: Matching a user query to the available tools in context
- Example 3: Extracting argument values from user input based on tool parameter definitions