LocoAgente

Can small models think in loops?

Track A: Autoresearch

Can a small model autonomously search, read, and synthesise sources into a structured research output, without human intervention at each step?

Track B: Task agents

Structured multi-step task completion: planning, tool use, and error recovery. How far can a 4B-parameter model go before scaffolding fails to compensate?
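
To make "planning, tool use, error recovery" concrete, here is a minimal sketch of the kind of loop this track is about. It is an illustration under stated assumptions, not the project's implementation: `run_agent`, `scripted_policy`, and `add_tool` are hypothetical names, and the scripted policy stands in for a real small model.

```python
from typing import Callable, Dict, List, Optional, Tuple

# An action is (tool name, argument). The tool name "finish" ends the loop.
Action = Tuple[str, str]
Policy = Callable[[str, List[str]], Action]


def run_agent(policy: Policy, tools: Dict[str, Callable[[str], str]],
              task: str, max_steps: int = 8) -> Tuple[Optional[str], List[str]]:
    """Plan-act-observe loop; tool errors become observations, not crashes."""
    history: List[str] = []
    for _ in range(max_steps):
        tool_name, arg = policy(task, history)
        if tool_name == "finish":
            return arg, history
        try:
            observation = tools[tool_name](arg)
        except Exception as exc:  # unknown tool or bad argument
            observation = f"ERROR: {exc}"
        history.append(f"{tool_name}({arg!r}) -> {observation}")
    return None, history  # step budget exhausted without an answer


def add_tool(arg: str) -> str:
    """Hypothetical tool: adds two integers written as 'a + b'."""
    left, right = arg.split("+")
    return str(int(left) + int(right))


def scripted_policy(task: str, history: List[str]) -> Action:
    """Stand-in for a model: makes one malformed call, then recovers."""
    if not history:
        return ("add", "2 +")  # deliberately malformed first attempt
    if "ERROR" in history[-1]:
        return ("add", "2 + 3")  # retry after seeing the error
    return ("finish", history[-1].split(" -> ")[-1])


answer, history = run_agent(scripted_policy, {"add": add_tool}, "add 2 and 3")
```

Feeding the error back as an observation is one design choice among several; whether a 4B model actually uses that signal to recover is the open question.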

Track C: Scaffolding strategies

Systematic comparison of prompting, memory, tool access, and loop design. Which scaffolding choices matter most for small-model agentic reliability?
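
One way to read "systematic comparison" is as a full grid over the four axes named above. The sketch below only enumerates configurations; the option values on each axis (`zero_shot`, `scratchpad`, and so on) are placeholders assumed for illustration, not choices taken from the project.

```python
from itertools import product
from typing import Dict, List

# Hypothetical option values for each scaffolding axis.
AXES: Dict[str, List[str]] = {
    "prompting": ["zero_shot", "few_shot"],
    "memory": ["none", "scratchpad"],
    "tool_access": ["off", "on"],
    "loop_design": ["single_pass", "retry_on_error"],
}


def config_grid(axes: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Return every combination of one option per axis."""
    names = list(axes)
    combos = product(*(axes[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


configs = config_grid(AXES)
```

With two options per axis this yields 16 configurations, so the grid grows quickly as options are added; a fractional design may be more practical.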

Track D: Framework evaluation

Head-to-head comparison of agentic frameworks (LangChain, LlamaIndex, smolagents, custom) on identical tasks with identical models.
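
A head-to-head comparison of this kind needs a harness that feeds the same tasks to every framework and scores them the same way. The sketch below assumes each framework is wrapped behind a plain `prompt -> answer` callable; `evaluate` and the two stub runners are hypothetical and do not call any real LangChain, LlamaIndex, or smolagents API.

```python
from typing import Callable, Dict, List, Tuple

Task = Tuple[str, str]  # (prompt, expected answer)
Runner = Callable[[str], str]


def evaluate(runners: Dict[str, Runner], tasks: List[Task]) -> Dict[str, float]:
    """Run every runner on the same tasks and return exact-match success rates."""
    scores: Dict[str, float] = {}
    for name, run in runners.items():
        passed = 0
        for prompt, expected in tasks:
            try:
                ok = run(prompt).strip() == expected
            except Exception:  # a crash counts as a failed task
                ok = False
            passed += int(ok)
        scores[name] = passed / len(tasks)
    return scores


def upper_runner(prompt: str) -> str:
    """Stub standing in for a framework adapter that solves the toy task."""
    return prompt.upper()


def crashing_runner(prompt: str) -> str:
    """Stub standing in for a framework adapter that fails at runtime."""
    raise RuntimeError("adapter failed")


tasks: List[Task] = [("abc", "ABC"), ("x", "X")]
scores = evaluate({"upper": upper_runner, "crashing": crashing_runner}, tasks)
```

Exact-match scoring is the simplest option and is assumed here only for brevity; open-ended tasks would need a different scorer.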