# Call AI Model

> Learn how to use the **Call AI Model** node to interact with various AI providers and models, with support for structured output.

The **Call AI Model** node calls a model from one of your configured AI providers. This guide explains how to configure and use it in your Neurons.

<Snippet>
  The **Call AI Model** node enables you to interact with AI models from
  different providers, configure model parameters, and optionally enforce
  structured output formats using JSON schemas.
</Snippet>

## Functionality

The **Call AI Model** node allows you to:

* Select from [configured AI providers](/features/providers)
* Choose specific models for each provider
* Configure model parameters
* Define structured output schemas (when supported)
* Chain multiple model calls for advanced workflows

## Node Properties

### AI Provider Configuration

* **AI Provider:** Select from your configured providers (OpenAI, Google, Anthropic, Cohere, Mistral, or Cloudflare Workers AI)
  <Note>
    Providers must be configured with valid API keys in the
    [Providers](/features/providers) section before they appear in the list. You can
    manage multiple API keys per provider for different environments or
    projects.
  </Note>

### MCP Server Tools Selection

To make MCP server tools available to the model, see [Using an MCP Server in a Neuron workflow](/features/mcp-servers#using-an-mcp-server-in-a-neuron-workflow).

### Model Selection

* **Model:** Choose from available models for the selected provider
  * Model availability depends on the selected provider
  * Cloudflare Workers AI provides access to 50+ models, including Llama and Mistral
  * Some models support structured output, while others don't

### Model Parameters

<Note>
  Available parameters and their valid ranges vary depending on the selected
  model. The interface updates automatically to show only the parameters
  supported by your chosen model.
</Note>

Common parameters across many models include:

| Parameter                         | Description                                      | Notes                                                                                                                                                                          |
| --------------------------------- | ------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **Max Output Tokens**             | Maximum number of tokens in the model's response | Higher values allow longer responses but increase costs.<br />Each model has its own maximum limit.                                                                            |
| **Temperature**                   | Controls response randomness                     | Lower values (0.1-0.3): More focused, deterministic responses.<br />Higher values (0.7-1.0): More creative, varied responses.<br />Recommended: 0.1-0.3 for structured output. |
| **Top P**<br />(Nucleus Sampling) | Controls response diversity                      | Works alongside temperature.<br />Lower values: More focused on likely tokens.<br />Higher values: More diverse vocabulary.<br />Not available in all models.                  |
| **Top K**                         | Limits token selection to K most likely tokens   | Helps prevent unlikely token selections.<br />Only available in specific models (e.g., Google's Gemini).                                                                       |
| **Frequency Penalty**             | Reduces repetition based on token frequency      | Higher values discourage repeating information.<br />Useful for diverse content.<br />Primarily in OpenAI models.                                                              |
| **Presence Penalty**              | Penalizes tokens that have appeared at all       | Higher values encourage new topics.<br />Helps prevent theme repetition.<br />Primarily in OpenAI models.                                                                      |

<Note>
  Some models may expose additional parameters not listed here. Always check the
  provider's documentation for model-specific parameter details.
</Note>
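
To build intuition for the table above, the following sketch shows, in generic terms, how these parameters typically reshape a model's next-token probabilities. It is not Prompteus or provider code: the function names `apply_penalties` and `next_token_distribution` are hypothetical, the penalty formula follows the form OpenAI describes in its API documentation, and providers may apply these steps in a different order or with different details.

```python
import math


def apply_penalties(logits, counts, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the logit of tokens that already appear in the generated text.

    `counts` maps a token to how often it has appeared so far.
    """
    adjusted = {}
    for token, logit in logits.items():
        seen = counts.get(token, 0)
        adjusted[token] = (
            logit
            - seen * frequency_penalty                 # grows with every repeat
            - (presence_penalty if seen > 0 else 0.0)  # flat, applied once
        )
    return adjusted


def next_token_distribution(logits, temperature=1.0, top_k=None, top_p=None):
    """Return {token: probability} after temperature, Top K, and Top P filtering."""
    if temperature <= 0:
        raise ValueError("this sketch requires temperature > 0")

    # Temperature: divide logits before the softmax. Values below 1 sharpen
    # the distribution, values above 1 flatten it.
    scaled = {token: logit / temperature for token, logit in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {token: math.exp(value - peak) for token, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((token, weight / total) for token, weight in weights.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # Top K: keep only the K most likely tokens.
    if top_k is not None:
        ranked = ranked[:top_k]

    # Top P: keep the smallest prefix whose cumulative probability reaches top_p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for token, prob in ranked:
            kept.append((token, prob))
            cumulative += prob
            if cumulative >= top_p:
                break
        ranked = kept

    # Renormalize so the surviving tokens sum to 1.
    norm = sum(prob for _, prob in ranked)
    return {token: prob / norm for token, prob in ranked}


# Toy logits for three candidate tokens (made-up values).
logits = {"the": 2.0, "a": 1.0, "zebra": -1.0}
focused = next_token_distribution(logits, temperature=0.2)
varied = next_token_distribution(logits, temperature=1.0)
print(f"P('the') at 0.2: {focused['the']:.3f}, at 1.0: {varied['the']:.3f}")
```

With the toy logits, the low temperature concentrates almost all probability on the most likely token, which is why low temperatures suit structured output.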

### Output Configuration

* **Output Format:**
  * Text (default): Regular text output
  * Structured Output: JSON-formatted output following a schema
* **Schema Configuration:** (When using Structured Output)
  * Inline JSON: Define the schema directly
  * URL: Reference an external JSON Schema
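
As an illustration of what an inline schema might look like, the sketch below defines a small JSON Schema as a Python dictionary, prints it as JSON text, and runs a deliberately simplified check on a sample response. The schema fields (`summary`, `priority`, `tags`) and the helper `check_output` are hypothetical, and the source does not state which JSON Schema keywords the node accepts, so treat the keywords used here as an assumption to verify.

```python
import json

# Hypothetical schema for illustration; field names are not from the Prompteus docs.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["summary", "priority"],
    "additionalProperties": False,
}

# Only the JSON types used by the schema above are mapped here.
_PY_TYPES = {"string": str, "array": list, "object": dict}


def check_output(raw_text, schema):
    """Return a list of problems in a model response (an empty list means it passed).

    This is NOT a full JSON Schema validator. It only checks required fields,
    unexpected fields, top-level property types, and enum membership.
    """
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems = [
        f"missing required field: {name}"
        for name in schema.get("required", [])
        if name not in data
    ]
    properties = schema.get("properties", {})
    for name, value in data.items():
        spec = properties.get(name)
        if spec is None:
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected field: {name}")
            continue
        expected = _PY_TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"wrong type for field: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"value not allowed for field: {name}")
    return problems


# JSON text of the schema, e.g. for pasting into an inline schema field.
print(json.dumps(TICKET_SCHEMA, indent=2))

sample = '{"summary": "Login page times out", "priority": "high"}'
print(check_output(sample, TICKET_SCHEMA))
```

A check like this can be handy when testing simple schemas first: start with two or three required fields, confirm the responses pass, then add nested structure.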

## Advanced Usage: Chaining Models

You can chain **Call AI Model** nodes so that one model produces the content and a second model structures it:

1. Use a more capable model for the initial processing.
2. Connect its output to a **Use Output** node.
3. Feed the result to a second, fast (and cheap) model that supports structured output.
4. Set the second model's system instructions to ask it to structure the text into a specific format, for example: `Extract useful information from the following text`.

This pattern allows you to:

* Leverage the strengths of different models
* Get structured output from content produced by a model that doesn't natively support it
* Optimize for both quality and cost
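
The data flow of this pattern can be sketched in plain Python. This is a conceptual illustration only, not the Prompteus API: in a Neuron you wire these steps together as nodes in the editor. The `ModelCall` type, the `chain` function, and both stub models are hypothetical stand-ins for real model calls.

```python
import json
from typing import Callable

# Hypothetical signature: (system_instructions, input_text) -> output text.
ModelCall = Callable[[str, str], str]


def chain(draft_model: ModelCall, structuring_model: ModelCall, user_input: str) -> dict:
    """Run a capable model first, then have a cheaper model structure its output."""
    # Stage 1: free-form answer from the more capable model.
    draft = draft_model("Answer the user's question in detail.", user_input)
    # Stage 2: the second model receives the draft as its input and is asked
    # to return JSON (enforced by a schema in the real node).
    structured = structuring_model(
        "Extract useful information from the following text", draft
    )
    return json.loads(structured)


# Stub models so the sketch runs without any provider or API key.
def fake_capable_model(system: str, text: str) -> str:
    return f"Paris is the capital of France. (question was: {text})"


def fake_fast_model(system: str, text: str) -> str:
    return json.dumps({"capital": "Paris", "country": "France"})


result = chain(fake_capable_model, fake_fast_model, "What is the capital of France?")
print(result)
```

Keeping the two stages separate means each one can be tuned independently, for example a higher temperature for the draft and a low temperature for the structuring step.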

## Tips and Best Practices

* Start with lower temperatures (0.1-0.3) when using structured output to get more consistent results
* Adjust Top K and Top P carefully, as they can significantly affect output quality
* When using structured output:
  * Ensure your schema is valid and well-defined
  * Test with simple schemas before moving to complex ones
* Monitor token usage and costs through [execution logs](/neurons/logging) when chaining multiple model calls

## Testing Undeployed Versions

You can test undeployed versions of your Neurons using the **Snippets** button in the editor. This feature:

* Generates code snippets that can be used to call your Neuron in its current state
* Allows you to test revisions before deploying
* Provides example code in various programming languages
* Helps verify that your changes work as expected

This is particularly useful when:

* Making changes to model parameters
* Testing new structured output schemas
* Verifying model chaining behavior
* Debugging issues with specific configurations
