AI Apps With Gemini 3, LangChain, and LangServe
As generative AI evolves, the tools used to orchestrate these models must also advance. Gemini 3 provides developers with a powerful new model, but raw API calls rarely suffice for production environments. To build robust applications, you need a framework that manages state, streaming, structured data, and external APIs.
This post explores how to integrate Gemini 3 with LangChain, bridge the frontend-backend gap using LangServe, and build model-agnostic applications that work across OpenAI and Anthropic.
Gemini 3 and LangChain #
LangChain supports Gemini 3 through the @langchain/google-genai package. Because Gemini 3 maintains compatibility with existing API infrastructure, you can often upgrade your application by simply updating the model string.
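In many cases the upgrade is a one-line change. For illustration, assuming an app previously pinned to gemini-1.5-pro:

import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

const model = new ChatGoogleGenerativeAI({
  // was: model: "gemini-1.5-pro"
  model: "gemini-3.0-pro",
});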
Prerequisites #
Install the necessary packages:
npm install @langchain/google-genai @langchain/core zod
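The examples that follow read the API key from the GOOGLE_API_KEY environment variable, so set it before running them (Unix shell shown; adjust for your platform):

export GOOGLE_API_KEY="your-api-key"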
Basic Setup and Streaming #
Latency significantly impacts user experience. In modern AI applications, you should stream responses so users can start reading immediately instead of waiting for a full block of text. LangChain’s .stream() method handles this logic, and it only grows more valuable as models produce longer responses.
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

async function runGeminiStream() {
  const model = new ChatGoogleGenerativeAI({
    model: "gemini-3.0-pro",
    apiKey: process.env.GOOGLE_API_KEY,
  });

  console.log("--- Starting Stream ---");
  const stream = await model.stream("Explain quantum computing in one sentence.");

  for await (const chunk of stream) {
    process.stdout.write(chunk.content as string);
  }
}

runGeminiStream().catch(console.error);
Structured Output with Zod #
Extracting usable JSON from a Large Language Model (LLM) is a common challenge. While you can prompt Gemini for "JSON only," the results are often fragile. Using withStructuredOutput combined with Zod ensures type safety and runtime validation.
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { z } from "zod";

const recipeSchema = z.object({
  name: z.string().describe("The name of the dish"),
  ingredients: z.array(z.string()).describe("List of ingredients"),
  prepTime: z.number().describe("Preparation time in minutes"),
  difficulty: z.enum(["Easy", "Medium", "Hard"]),
});

// The schema doubles as the source of truth for the TypeScript type
type Recipe = z.infer<typeof recipeSchema>;

async function generateRecipe() {
  const model = new ChatGoogleGenerativeAI({
    model: "gemini-3.0-pro",
    temperature: 0,
    apiKey: process.env.GOOGLE_API_KEY,
  });

  // Output is parsed and validated against the schema at runtime
  const structuredLlm = model.withStructuredOutput(recipeSchema);
  const result = await structuredLlm.invoke("How do I make scrambled eggs?");

  console.log(`Cooking ${result.name} will take ${result.prepTime} minutes.`);
}

generateRecipe().catch(console.error);
The LangServe Architecture #
Many enterprise stacks pair a fast TypeScript frontend with a Python-based data science ecosystem on the backend. LangServe wraps your Python LangChain code and deploys it as a REST API, which removes the need for manual FastAPI boilerplate.
Step 1: The Python Server #
This script runs on your backend and exposes your application logic.
import uvicorn
from fastapi import FastAPI
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langserve import add_routes

# 1. Define the Chain
model = ChatOpenAI(model="gpt-4o")
prompt = ChatPromptTemplate.from_template("Explain {topic} to a 5-year-old.")
chain = prompt | model

# 2. Initialize FastAPI
app = FastAPI(
    title="AI Logic Server",
    version="1.0",
    description="An API server using LangChain's Runnable interfaces",
)

# 3. Add Routes
# This creates /explain/invoke and /explain/stream endpoints automatically
add_routes(
    app,
    chain,
    path="/explain",
)

if __name__ == "__main__":
    uvicorn.run(app, host="localhost", port=8000)
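The server side has its own dependencies, and because the chain uses ChatOpenAI it also expects OPENAI_API_KEY in the environment. Assuming a standard pip workflow, an install along these lines should cover it:

pip install "langserve[all]" fastapi uvicorn langchain-openai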
Step 2: The TypeScript Client #
From your frontend, the remote server behaves exactly like a local function.
import { RemoteRunnable } from "@langchain/core/runnables/remote";

async function runRemoteChain() {
  const remoteChain = new RemoteRunnable({
    url: "http://localhost:8000/explain",
  });

  const result = (await remoteChain.invoke({
    topic: "Black Holes",
  })) as { content: string };

  console.log("Response from Python:", result.content);
}

runRemoteChain().catch(console.error);
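Because add_routes also exposed /explain/stream, the same RemoteRunnable can stream tokens just like a local model. A minimal sketch (the topic is arbitrary):

async function streamRemoteChain() {
  const remoteChain = new RemoteRunnable({
    url: "http://localhost:8000/explain",
  });

  // Calls the /explain/stream endpoint under the hood
  const stream = await remoteChain.stream({ topic: "Volcanoes" });
  for await (const chunk of stream) {
    process.stdout.write((chunk as { content?: string }).content ?? "");
  }
}

streamRemoteChain().catch(console.error);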
Working with External APIs #
Real-world applications must fetch and process external data. Two patterns handle this cleanly: a Query Builder for constructing inputs and a RunnableLambda for processing outputs.
Composing Fetch Queries (The Query Builder) #
Avoid asking an LLM to generate a raw URL, as this often leads to syntax errors. Instead, request structured parameters from the LLM and use your code to construct the URL safely.
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { z } from "zod";

const searchParamsSchema = z.object({
  query: z.string(),
  status: z.enum(["active", "archived"]).optional(),
  limit: z.number().min(1).max(50).default(10),
  sortBy: z.enum(["date", "relevance"]).default("date"),
});

async function composeQuery() {
  const model = new ChatGoogleGenerativeAI({ model: "gemini-3.0-pro" });
  const queryGenerator = model.withStructuredOutput(searchParamsSchema);

  const params = await queryGenerator.invoke(
    "Find me the last 5 active user logs sorted by date."
  );

  // Your code, not the model, assembles the final URL
  const queryString = new URLSearchParams({
    q: params.query,
    limit: params.limit.toString(),
    sort: params.sortBy,
    ...(params.status && { status: params.status }),
  }).toString();

  console.log(`URL: /api/logs?${queryString}`);
}

composeQuery().catch(console.error);
Processing API Results (The Runnable Pattern) #
Wrap your fetch logic in a RunnableLambda to inject external data directly into a LangChain pipeline. The prompt and model below are illustrative placeholders that make the snippet self-contained.

import { RunnableSequence, RunnableLambda } from "@langchain/core/runnables";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

// Placeholder prompt and model so the snippet runs on its own
const model = new ChatGoogleGenerativeAI({ model: "gemini-3.0-pro" });
const prompt = ChatPromptTemplate.fromTemplate(
  "Summarize the health of this server: {api_data}"
);

// Wraps arbitrary async logic (e.g. a fetch call) as a pipeline step
const apiStep = new RunnableLambda({
  func: async (input: { id: string }) => {
    // Standard fetch logic
    return { id: input.id, uptime: 99.9, logs: ["Critical Error"] };
  },
});

const chain = RunnableSequence.from([
  // Stringify so the prompt receives readable JSON, not "[object Object]"
  { api_data: apiStep.pipe((data) => JSON.stringify(data)) },
  prompt,
  model,
]);

await chain.invoke({ id: "server-01" });
Multi-Model Support #
A factory function allows you to switch the "brain" of your application without altering your core business logic.
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { ChatOpenAI } from "@langchain/openai";
import { ChatAnthropic } from "@langchain/anthropic";
import { BaseChatModel } from "@langchain/core/language_models/chat_models";

function getModel(provider: "google" | "openai" | "anthropic"): BaseChatModel {
  // In production, read a provider-specific key instead
  // (GOOGLE_API_KEY, OPENAI_API_KEY, ANTHROPIC_API_KEY)
  const config = { apiKey: process.env.API_KEY };

  switch (provider) {
    case "google":
      return new ChatGoogleGenerativeAI({ model: "gemini-3.0-pro", ...config });
    case "openai":
      return new ChatOpenAI({ model: "gpt-4o", ...config });
    case "anthropic":
      return new ChatAnthropic({ model: "claude-3-5-sonnet-latest", ...config });
    default:
      throw new Error("Unknown provider");
  }
}

async function main() {
  const model = getModel("openai");
  const response = await model.invoke("What is the capital of France?");
  console.log(response.content);
}

main().catch(console.error);
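Because every branch returns the same BaseChatModel interface, the factory slots straight into any chain. A hypothetical example reusing getModel:

import { ChatPromptTemplate } from "@langchain/core/prompts";

// Swap "anthropic" for "google" or "openai" without touching the chain
const prompt = ChatPromptTemplate.fromTemplate("Explain {topic} briefly.");
const chain = prompt.pipe(getModel("anthropic"));
const answer = await chain.invoke({ topic: "rate limiting" });
console.log(answer.content);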
Summary #
- Gemini 3 integrates seamlessly using the gemini-3.0-pro model string.
- Structured Output with withStructuredOutput and Zod turns fragile JSON prompting into type-safe, runtime-validated data.
- LangServe connects Python-heavy AI logic to TypeScript frontends efficiently.
- External APIs remain secure when you use the Query Builder pattern for parameter management.
- Multi-Provider Support allows you to switch models by instantiating different classes.