DEFINITION
Simple Attention Networks
Attention-Only Architecture (No FFN)
Definition
A neural network architecture composed entirely of attention and gating mechanisms without Feed-Forward Network (FFN) layers. Needle uses this design for tool calling, based on the insight that FFN parameters are wasted when the model has access to external structured knowledge (tools, RAG context). The architecture is optimized for retrieval-and-assembly tasks rather than fact memorization.
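Below is a minimal sketch of what such a block can look like, assuming a PyTorch-style module. The class and parameter names (e.g. AttentionOnlyBlock) are illustrative, not Needle's actual code: each layer is just normalization, self-attention, and a learned gate on the residual path, with no MLP/FFN sub-layer.

```python
# Illustrative sketch of an attention-only block (no FFN); names are hypothetical.
import torch
import torch.nn as nn


class AttentionOnlyBlock(nn.Module):
    """Transformer-style block with self-attention and a learned gate, no MLP/FFN."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # A per-channel gate on the attention output stands in for the usual FFN sub-layer.
        self.gate = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor, attn_mask=None) -> torch.Tensor:
        h = self.norm(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=attn_mask, need_weights=False)
        # Gated residual update: the sigmoid gate scales how much of the attention output is added.
        return x + torch.sigmoid(self.gate(h)) * attn_out


# Usage: stack a few blocks to form an attention-only backbone.
backbone = nn.Sequential(*[AttentionOnlyBlock(256, 4) for _ in range(6)])
```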
Examples in the Wild
- Example 1: Needle's 26M parameter model uses only attention and gating, no MLPs
- Example 2: Effective for RAG (retrieval-augmented generation) where facts come from external sources
- Example 3: Suitable for tool-calling where tool definitions are provided in context (see the sketch after this list)
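As a hedged illustration of Example 3, the snippet below shows one way tool definitions might be serialized into the model's context, so the network can retrieve and assemble a call rather than memorize the tools. The schema layout and the build_context helper are assumptions for illustration, not a documented Needle API.

```python
# Hypothetical sketch: providing tool definitions in-context for a tool-calling model.
import json

TOOLS = [
    {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {"city": "string"},
    },
]


def build_context(user_message: str) -> str:
    """Concatenate serialized tool schemas with the user request into one context string."""
    tool_block = "\n".join(json.dumps(t) for t in TOOLS)
    return f"TOOLS:\n{tool_block}\n\nUSER:\n{user_message}"


print(build_context("What's the weather in Oslo?"))
```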