Model Signal: Fast, verified AI updates

Tag

Pricing

Published stories tagged with Pricing.

Open Source 3 min read

Latest from X - 2026-06-09 to 2026-06-11

GitHub announces that Agentic Workflows are now in public preview, offering intelligent automations with guardrails, observability, and cost controls. GitHub Changelog notes the workflows can now use the built‑in GITHUB_TOKEN instead of personal access tokens, improving security and simplicity. OpenCode reports that DeepSeek V4 Pro, Fable 5, and the North Mini Code model (256K context, fully open source) are now available on its platform. OpenRouter launches an Activity explorer that shows real‑time spending, token usage, cache hit rates, agents, and trends for models like Fable. RyanLee shares that his high‑performance MSA kernel library is open‑source and that the M3 weights are expected to be released on Friday, with a link to the GitHub paper.

Coding 3 min read

Latest from X - 2026-06-04

NVIDIA AI announced the release of Nemotron 3 Ultra, a 550B MoE model that speeds inference fivefold, lowers agentic task costs by up to 30%, and excels at coding, deep research, and long‑horizon planning. OpenCode noted that Nemotron 3 Ultra is now free with 1M context and fully open source. Ollama said the model is available on its cloud platform, offering launch commands for Claude, Hermes, and OpenClaw. OpenAI introduced a new memory system for ChatGPT that automatically tracks important details, doubles memory capacity for Plus and Pro users in the US, and lets users review and steer remembered content via a summary. Cursor added an interactive context‑usage report in its canvas, breaking down token distribution across prompts, tools, rules, and skills.

Coding 3 min read

Qwen 3.7‑Plus: Multimodal Coding Agent with Vision‑Language Upgrade

Qwen 3.7‑Plus is a new multimodal agent model that adds vision capabilities to the strong text backbone of Qwen 3.7. It can read screens, interact with GUIs, and generate code from visual references while keeping the coding and tool‑use strengths of its predecessor. Benchmarks show notable gains in several coding‑related tasks, especially in terminal‑based and spreadsheet benchmarks.

AI Models 6 min read

Latest from X - 2026-06-01 to 2026-06-02

Qwen introduces Qwen3.7-Plus, a multimodal agent model that unifies vision and language, operates through both GUI and CLI, and serves as a coding and productivity assistant. OpenAI's frontier models and Codex are now generally available on AWS via Amazon Bedrock, extending enterprise security, compliance, and governance workflows. According to xAI, Composer 2.5 is now inside Grok Build, described as a fast, highly intelligent model for long‑running tasks and complex instructions. LangChain highlights Fleet for secure agent access to private resources and adds LangSmith LLM Gateway spend limits that return a 402 error when caps are hit. Google Antigravity is becoming a scientific workbench with a Science Skills bundle that runs complex workflows like protein analysis using Alpha* models and dozens of databases. Google Gemma releases the first gemma‑skills iteration, enabling agents to build with Gemma, use MTP for speed, pick model size, and locate up‑to‑date resources. ClaudeDevs resets 5‑hour and weekly rate limits for Pro/Max plans and fixes excessive parallel subagents. Cursor raises usage limits for Teams and adds a Premium seat with 5× usage at 3× cost. Visual Studio Code demos orchestrating agents via the VS Code Agents window. NVIDIA adds real‑time AI media tools including Synthetic Video Detector (up to 92% accuracy, 22 ms latency), RTX Video Super Resolution, and Frame Generation. Vercel enables remote execution of Conductor’s parallel coding agents on fast Sandboxes. Perplexity launches Search as Code, a new architecture that writes Python to call its search stack directly, now default in the Perplexity Agent API.
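The spend-limit behaviour mentioned above (a 402 response once a cap is hit) could be handled on the client side roughly as follows. This is a minimal sketch, not LangSmith's API: `call_with_spend_guard`, `SpendLimitReached`, and the `send` callable are hypothetical names, and the only detail taken from the summary is the 402 status code.

```python
from http import HTTPStatus
from typing import Callable, Tuple


class SpendLimitReached(Exception):
    """Raised when the gateway reports that a spend cap was hit (hypothetical name)."""


def call_with_spend_guard(send: Callable[[], Tuple[int, str]]) -> str:
    """Run one request and translate a 402 into a dedicated exception.

    `send` is a hypothetical callable that performs the HTTP request and
    returns (status_code, body); plug in whatever client is actually in use.
    """
    status, body = send()
    if status == HTTPStatus.PAYMENT_REQUIRED:
        # 402: the cap was reached, so retrying the same request will not help.
        raise SpendLimitReached(body)
    if status >= 400:
        raise RuntimeError(f"gateway request failed with status {status}")
    return body
```

Raising a separate exception type lets calling code distinguish "budget exhausted" from transient failures, which would normally be retried.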

Coding 1 min read

Latest from X - 2026-05-29 (ClaudeDevs)

ClaudeDevs (@ClaudeDevs) announced an update to Opus 4.8 that allows system instructions to be added mid-conversation without breaking the prompt cache. The change is expected to reduce the cost and latency of API requests.

AI Models 2 min read

Claude Code Releases

Claude Code has released version 2.1.152, which includes several new features and bug fixes. The update applies review findings to the working tree after a review, simplifies code, and improves the user experience.

AI Models 4 min read

Latest from X - 2026-05-29

OpenRouter now supports "apply_patch," a server tool that lets models propose file edits as V4A diffs through the Responses API. The model generates a patch, and OpenRouter validates the diff syntax server-side, which allows for more efficient and accurate file editing. xAI has released grok-build-0.1 in public beta via the xAI API. The model powers the Grok Build CLI, excels at agentic coding, and is priced at $1 per million input tokens and $2 per million output tokens. Google AI has released an episode of Release Notes featuring the architects of Gemini, including @JeffDean, @koraykv, @OriolVinyalsML, and @NoamShazeer, who discuss their journey and the people behind the model. LangChain has released LangSmith LLM Gateway, which enforces spend limits and redacts PII before requests reach the model. It also announced Deep Agents v0.6, which makes harness profiles a first-class abstraction, allowing for production-grade performance at lower costs. NVIDIA has announced a new era of PC, but the details are unclear. OpenAI has launched Rosalind Biodefense to help trusted builders develop new biodefense and pandemic preparedness capabilities. It is also expanding trusted access to GPT-Rosalind for select U.S. government and allied partners.
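To make the grok-build-0.1 pricing above concrete, here is a small worked calculation. It is a sketch that assumes the quoted $1 and $2 rates are per million tokens; the function name `estimate_cost_usd` and its parameters are hypothetical and not part of any xAI SDK.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_million: float = 1.0,   # assumed: $1 per 1M input tokens
    output_rate_per_million: float = 2.0,  # assumed: $2 per 1M output tokens
) -> float:
    """Estimate the dollar cost of one request from its token counts."""
    input_cost = (input_tokens / 1_000_000) * input_rate_per_million
    output_cost = (output_tokens / 1_000_000) * output_rate_per_million
    return input_cost + output_cost


# 500K input tokens ($0.50) plus 250K output tokens ($0.50)
print(estimate_cost_usd(500_000, 250_000))  # prints 1.0
```

Because output tokens are billed at twice the input rate under this assumption, long generations cost more than long prompts of the same size.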