Jump to content
  • Google announces Gemini 3 Flash for cost- and latency-sensitive AI applications


    Karlston

    • 320 views
    • 2 minutes
     Share


    • 320 views
    • 2 minutes

    Last month, Google announced Gemini 3 Pro, its flagship frontier model, which delivered state-of-the-art results across several AI benchmarks. Gemini 3 Pro is priced at $2 per million input tokens and $12 per million output tokens, with even higher costs when processing prompts exceeding 200,000 tokens.

     

    However, not all AI applications require Gemini 3 Pro–level intelligence. Moreover, the model’s cost and latency make it unsuitable for a wide range of use cases. To address this, Google today announced Gemini 3 Flash (preview), a lightweight version of its flagship model that delivers strong performance at a significantly lower price point.

     

    Google positions Gemini 3 Flash as a low-latency model optimized for real-time and high-throughput inference, while offering multimodal and reasoning capabilities comparable to Gemini 3 Pro. The model supports text, vision, audio, and video inputs.

     

    Gemini 3 Flash is priced at $0.30 per million input tokens and $2 per million output tokens. In cached mode, input tokens cost just $0.075 per million. Overall, Gemini 3 Flash is approximately 85% cheaper than its flagship counterpart while still delivering strong performance for its class. For example, Gemini 3 Flash scores 90.4% on GPQA Diamond and 33.7% on Humanity’s Last Exam, outperforming several significantly larger frontier models.

     

    Here's how Gemini 3 Flash performs in key AI benchmarks:

    Gemini 3 Flash Benchmarks

    Gemini 3 Flash is now available in the Gemini API via Google AI Studio, Google Antigravity, Gemini CLI, Android Studio for developers and to enterprises via Vertex AI.

     

    Just last week, OpenAI responded to the Gemini 3 Pro model with the launch of the new GPT-5.2 series. The GPT-5.2 models perform marginally better than Gemini 3 Pro on most AI benchmarks and are priced at a similar level for developers. With the launch of the cost-effective Gemini 3 Flash, Google now offers an affordable alternative for developers. In response, OpenAI is likely to introduce a GPT-5.2 Mini model in the coming weeks, targeting a comparable price-to-performance ratio.

     

    Source


    Hope you enjoyed this news post. Feedback welcome.

    Posted Thursday 18 December 2025 at 4:31 am AEST (my time).

    News posts... 2023: 5,800+ | 2024: 5,700+ | 2025 (till end of November): 5,412

    RIP Matrix

    • Thanks 1

    User Feedback

    Recommended Comments

    There are no comments to display.



    Join the conversation

    You can post now and register later. If you have an account, sign in now to post with your account.
    Note: Your post will require moderator approval before it will be visible.

    Guest
    Add a comment...

    ×   Pasted as rich text.   Paste as plain text instead

      Only 75 emoji are allowed.

    ×   Your link has been automatically embedded.   Display as a link instead

    ×   Your previous content has been restored.   Clear editor

    ×   You cannot paste images directly. Upload or insert images from URL.


  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...