llm.api 0.1.3
- chat() and agent() now return $usage$cost, a USD scalar derived
  from a bundled snapshot of BerriAI/litellm’s
  model_prices_and_context_window.json (the same upstream ellmer uses).
  Ollama is treated as free (cost = 0); models absent from the snapshot
  leave cost = NA_real_. A new exported helper prices_snapshot_date()
  returns the snapshot date so callers can decide when to refresh.
  Refresh by re-running data-raw/prices.R.
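A minimal sketch of how a caller might use the new cost field and the snapshot-date helper. The provider string, model name, and prompt are placeholders, and the sketch assumes prices_snapshot_date() returns a Date:

```r
# Hypothetical model/prompt; chat() and prices_snapshot_date() are from llm.api.
res <- chat("What is 2 + 2?", provider = "openai", model = "gpt-4o-mini")
res$usage$cost  # USD scalar; NA_real_ if the model is missing from the snapshot

# Decide whether the bundled price table is too stale to trust.
if (Sys.Date() - prices_snapshot_date() > 90) {
  message("Price snapshot is over 90 days old; re-run data-raw/prices.R")
}
```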
- New exported helpers
history_tool_calls(history) and
history_count_tool_calls(history, completed_only = FALSE)
for walking the message history agent() returns. Provider
history must stay native (it’s the input format on the next API call),
but consumers now get a single canonical record list instead of having
to know that Anthropic uses content blocks
(tool_use / tool_result) while OpenAI /
moonshot / ollama use a separate tool_calls field plus
role = "tool" result messages. Each record carries
id, name, arguments,
result, completed,
call_message_index, result_message_index, and
provider_shape.
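A sketch of walking the canonical records. The tools argument and the $history field on the agent() return value are assumptions for illustration; the record fields (name, completed) are as documented above:

```r
# my_tools is hypothetical; history_tool_calls() and
# history_count_tool_calls() are the new exported helpers.
out <- agent("Summarise the repo", tools = my_tools)

# One canonical record per tool call, regardless of provider shape.
for (rec in history_tool_calls(out$history)) {
  cat(rec$name, if (isTRUE(rec$completed)) "(completed)" else "(pending)", "\n")
}

# Count only calls that have a paired result message.
history_count_tool_calls(out$history, completed_only = TRUE)
```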
- agent() now writes the synthesized tool-call id back into the Ollama
  assistant message when the upstream response omits one. Previously
  assistant.tool_calls[i].id and the corresponding role = "tool"
  message’s tool_call_id could disagree, breaking history walks that
  paired calls with results.
- New exported helper
provider_default_model(provider).
Returns the model id chat() falls back to when no model is
specified, so client code can display the resolved model upfront without
duplicating the lookup table or reaching into internals.
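A one-liner showing the intended use; the provider string is a placeholder:

```r
# Show the resolved model before sending anything, without reaching
# into llm.api internals.
model <- provider_default_model("anthropic")
message("Using model: ", model)
```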
- chat() now returns $thinking and $finish_reason. Reasoning models
  (DeepSeek-R1, Moonshot Kimi, Anthropic extended thinking, OpenRouter)
  put their chain-of-thought in a separate field and previously had it
  silently dropped. $thinking is normalized across providers
  (reasoning_content, reasoning, Anthropic thinking blocks).
  $finish_reason is normalized to OpenAI vocabulary; Anthropic’s
  max_tokens becomes "length" and end_turn becomes "stop".
- chat() now warns when a reasoning model truncates mid-thought
  (finish_reason == "length" with empty content but populated thinking).
  Previously this returned content == "" with no indication of what went
  wrong; the actionable fix is to raise max_tokens.
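One way a caller might react to the truncation condition. The max_tokens argument and the $content field name are assumptions here, and %||% requires R >= 4.4 (or rlang):

```r
# Retry with a larger budget when the model ran out of tokens mid-thought.
prompt <- "Plan a refactor of the parser."   # placeholder
res <- chat(prompt, max_tokens = 1024)
if (identical(res$finish_reason, "length") &&
    !nzchar(res$content %||% "") &&
    nzchar(res$thinking %||% "")) {
  res <- chat(prompt, max_tokens = 4096)
}
```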
llm.api 0.1.1
- Initial CRAN submission.
- Add Moonshot (Kimi) provider alongside OpenAI, Anthropic, and
Ollama. Detected by base URL or model name; key resolution falls back to
OPENAI_API_KEY since the API is OpenAI-compatible.
- Fix conversation history bug in
agent() where the final
assistant message was not appended to the returned history when the
agent loop exited without further tool calls. Affected all providers but
was most visible with non-Claude models.
- Drop the
"local" provider and chat_local()
/ list_local_models() exports. Direct
llama.cpp inference via the localLLM package
is no longer supported; use provider = "ollama"
instead.
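A migration sketch for code that used the removed exports; the model name is a placeholder and assumes a running Ollama server:

```r
# Before (removed in this release):
# res <- chat_local("hello", model = "llama3")

# After: route the same request through the Ollama provider.
res <- chat("hello", provider = "ollama", model = "llama3")
```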