Liquid AI's 230M-Parameter LFM2.5 Outperforms Models 4x Its Size on Data Extraction

Liquid AI has released LFM2.5-230M, a 230-million-parameter model that beats larger rivals like Google's Gemma 3 1B on tool-calling and data extraction benchmarks. Built for edge deployment, it runs on everything from smartphones to humanoid robots — and fits in under 400MB of memory.

Liquid AI, founded by former MIT computer scientists, has released LFM2.5-230M — its smallest language model to date and one explicitly engineered for on-device agentic workflows. Despite its compact size, the model outperforms rivals with more than 4x the parameter count on key benchmarks, positioning it as a serious option for enterprise data pipelines and edge AI deployments.

What Makes LFM2.5-230M Different

The model departs from standard transformer architectures in favor of Liquid's LFM2 framework, a hybrid system that interleaves gated short-range convolutions with grouped-query attention. Because only some layers use attention, this design reduces the memory and compute costs that grow with context length in pure-attention models, making it viable on severely constrained hardware.
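
To make the idea of a gated short-range convolution concrete, here is a minimal single-channel sketch in plain Python. It is an illustration only, not Liquid AI's implementation: the function name, the fixed gate values, and the two-tap kernel are assumptions, and in a real model the gates would be computed from the input by learned projections.

```python
from typing import List


def gated_short_conv(
    x: List[float],
    gate_in: List[float],
    gate_out: List[float],
    kernel: List[float],
) -> List[float]:
    """Causal short convolution over a 1-D sequence with input and output gates.

    Hypothetical, simplified sketch: one channel, gates passed in directly.
    """
    # Input gate: scale each position before it enters the convolution.
    gated = [g * v for g, v in zip(gate_in, x)]
    out = []
    for t in range(len(gated)):
        acc = 0.0
        # Causal window: position t only sees itself and the previous
        # len(kernel) - 1 positions, so the state kept per step is fixed-size.
        for j, w in enumerate(kernel):
            if t - j >= 0:
                acc += w * gated[t - j]
        # Output gate: scale the convolved value.
        out.append(gate_out[t] * acc)
    return out


ones = [1.0, 1.0, 1.0, 1.0]
result = gated_short_conv([1.0, 2.0, 3.0, 4.0], ones, ones, [0.5, 0.5])
print(result)  # [0.5, 1.5, 2.5, 3.5]
```

The point of the sketch is the fixed window: however long the sequence gets, each step only needs the last few inputs, whereas an attention layer has to keep keys and values for every earlier token.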

Key specs at a glance:

230 million parameters, pretrained on 19 trillion tokens
32K context window
Memory footprint under 400MB
Decode speed of 213 tokens/sec on a Samsung Galaxy S25 Ultra (Snapdragon 8 Elite)
Decode speed of 42 tokens/sec on a Raspberry Pi 5
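
As a rough sanity check on the memory figure, the storage needed for the weights alone can be estimated from the parameter count and the bits used per parameter. The sketch below is back-of-envelope arithmetic; the article does not say which precision the sub-400MB figure refers to, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
PARAMS = 230_000_000  # parameter count stated in the article


def weight_megabytes(params: int, bits_per_param: float) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return params * bits_per_param / 8 / 1_000_000


# The precisions below are illustrative choices, not published configurations.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_megabytes(PARAMS, bits):.0f} MB")
# 16-bit: 460 MB
# 8-bit: 230 MB
# 4-bit: 115 MB
```

By this estimate, 16-bit weights alone would already exceed 400MB, so a footprint under that figure would be consistent with some form of quantized weights.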

The model supports llama.cpp (GGUF), MLX, vLLM, SGLang, and ONNX from day one and is available immediately on Hugging Face.

Benchmark Performance: Punching Above Its Weight

LFM2.5-230M isn't designed for reasoning-heavy tasks such as advanced math, or for creative writing, and Liquid AI openly acknowledges this. But in its target domains of tool calling and structured data extraction, it outscores several models with up to roughly four times its parameter count.

Notable benchmark results:

BFCLv3 (tool use): 43.26 — beats IBM Granite 4.0-350M (39.58) and Google Gemma 3 1B IT (16.61)
CaseReportBench (data extraction): 22.51 — surpasses Alibaba Qwen3.5-0.8B Instruct

For context, the model is roughly one-tenth the size of Google's smallest Gemma 4 model. While 3B-parameter models like Weibo's VibeThinker-3B are solving advanced calculus, LFM2.5-230M is the more efficient choice for structured pipelines on constrained hardware.

Why Enterprises Should Pay Attention

Most organizations still rely on brittle, rule-based ETL (Extract, Transform, Load) scripts for data processing. A single schema change or document layout update can break an entire pipeline. The shift toward AI ETL — where models infer mappings and adapt automatically — is gaining traction, but cost and latency have been barriers.
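
One practical consequence of letting a model infer mappings is that its output should be checked before it enters a pipeline. The sketch below assumes a local model has been prompted to return invoice fields as JSON. The schema, the field names, and the `validate_extraction` helper are all hypothetical, and the model call itself is left out.

```python
import json
from typing import Any, Dict, List, Tuple

# Hypothetical target schema: field name -> expected Python type.
INVOICE_SCHEMA: Dict[str, type] = {
    "invoice_id": str,
    "vendor": str,
    "total": float,
    "currency": str,
}


def validate_extraction(
    raw: str, schema: Dict[str, type]
) -> Tuple[Dict[str, Any], List[str]]:
    """Parse model output as JSON and report missing or mistyped fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return {}, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return {}, ["top-level value is not an object"]

    clean: Dict[str, Any] = {}
    errors: List[str] = []
    for field, expected in schema.items():
        if field not in data:
            errors.append(f"missing field: {field}")
            continue
        value = data[field]
        # JSON has a single number type, so accept ints where floats are
        # expected (but not booleans, which are ints in Python).
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
            continue
        clean[field] = value
    return clean, errors


# Stand-in for text returned by a local model.
model_output = '{"invoice_id": "INV-1042", "vendor": "Acme", "total": 1280, "currency": "USD"}'
record, problems = validate_extraction(model_output, INVOICE_SCHEMA)
print(record, problems)
```

Records that come back with a non-empty error list can be retried or routed to a fallback path instead of silently corrupting downstream tables.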

Using a flagship model like Claude Opus 4.6 (at $5.00 per million input tokens) to parse invoices or route telemetry data is economically untenable at scale. LFM2.5-230M runs locally, eliminating cloud API costs entirely and reducing latency for repetitive formatting and parsing tasks.
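
The scale argument is easy to check with arithmetic. The sketch below uses the $5.00 per million input tokens quoted above; the document volume and tokens-per-document figures are illustrative assumptions, and output-token charges are ignored.

```python
INPUT_PRICE_PER_MILLION_USD = 5.00  # input-token price quoted in the article


def monthly_input_cost(docs_per_day: int, tokens_per_doc: int, days: int = 30) -> float:
    """Estimated monthly spend on input tokens only, in US dollars."""
    total_tokens = docs_per_day * tokens_per_doc * days
    return total_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION_USD


# Hypothetical workload: 100,000 documents a day at 2,000 input tokens each.
cost = monthly_input_cost(docs_per_day=100_000, tokens_per_doc=2_000)
print(f"${cost:,.0f} per month")  # $30,000 per month
```

A local model avoids that per-token bill, although a fair comparison would still account for the hardware and power needed to run it.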

Real-World Application: Humanoid Robotics

Liquid AI demonstrated the model's agentic capabilities on a Unitree G1 humanoid robot, running entirely on an onboard NVIDIA Jetson Orin module. The model processed free-form natural language commands and translated them into structured multi-step execution plans using NVIDIA's SONIC framework — no cloud connection required.

"Hold still for 2 seconds, then walk forward at 1 meter per second for 3 meters, hold a forward one-leg kneel for 5 seconds, and walk backward at 0.5 meters per second for 3 meters."

This kind of on-device skill selection is where 230M-parameter models fit well: the work is fast, runs locally, and does not depend on a network connection.
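
To show what a structured multi-step plan for the quoted command could look like, here is a small sketch. The `Step` type, the skill names, and the `walk` helper are hypothetical; they are not the SONIC framework's API or the format Liquid AI used in the demo.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    skill: str              # hypothetical skill identifier
    duration_s: float       # how long the step runs, in seconds
    speed_mps: float = 0.0  # walking speed, zero for stationary skills


def walk(distance_m: float, speed_mps: float, backward: bool = False) -> Step:
    """Turn a distance and speed into a timed walking step."""
    skill = "walk_backward" if backward else "walk_forward"
    return Step(skill, distance_m / speed_mps, speed_mps)


# The example command, written out as an ordered plan.
plan: List[Step] = [
    Step("hold_still", 2.0),
    walk(3.0, 1.0),                 # 3 m at 1 m/s -> 3 s
    Step("kneel_one_leg_forward", 5.0),
    walk(3.0, 0.5, backward=True),  # 3 m at 0.5 m/s -> 6 s
]

total_s = sum(step.duration_s for step in plan)
print(total_s)  # 16.0
```

The model's job in such a setup is to map free-form language onto a fixed vocabulary of skills and fill in their parameters, which is a structured extraction task rather than open-ended reasoning.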

Licensing: Open in Name, Restricted in Practice

LFM2.5-230M ships under the LFM Open License v1.0, which is not OSI-compliant. It functions as a tiered commercial license:

Free for individuals, researchers, and companies generating under $10M in annual revenue — with perpetual, royalty-free rights to use, modify, and distribute
Paid enterprise agreement required for organizations exceeding the revenue threshold

This tiered approach resembles the licensing strategies of Meta (Llama) and Mistral, balancing open access with commercial sustainability. Enterprises above the revenue threshold should weigh licensing costs against inference savings before deploying at volume.