Call AI Model
Learn how to use the Call AI Model node to interact with various AI providers and models, with support for structured output.
The Call AI Model node is a powerful component that allows you to make calls to various AI providers and their models. This guide explains how to configure and use AI model calls effectively in your Neurons.
Functionality
The Call AI Model node allows you to:
- Select from configured AI providers
- Choose specific models for each provider
- Configure model parameters
- Define structured output schemas (when supported)
- Chain multiple model calls for advanced workflows
Node Properties
AI Provider Configuration
- AI Provider: Select from your configured providers (OpenAI, Google, Anthropic, Cohere, or Mistral)
Providers must be configured with valid API keys in the Providers section before they appear in the list. You can manage multiple API keys per provider for different environments or projects.
Model Selection
- Model: Choose from available models for the selected provider
- Model availability depends on the selected provider
- Some models support structured output, while others don’t
Model Parameters
Available parameters and their valid ranges vary by model. The interface automatically updates to show only the parameters supported by your chosen model.
Common parameters across many models include:
| Parameter | Description | Notes |
|---|---|---|
| Max Output Tokens | Maximum number of tokens in the model’s response | Higher values allow longer responses but increase costs. Each model has its own maximum limit. |
| Temperature | Controls response randomness | Lower values (0.1-0.3): more focused, deterministic responses. Higher values (0.7-1.0): more creative, varied responses. Recommended: 0.1-0.3 for structured output. |
| Top P (Nucleus Sampling) | Controls response diversity | Works alongside temperature. Lower values: more focused on likely tokens. Higher values: more diverse vocabulary. Not available in all models. |
| Top K | Limits token selection to the K most likely tokens | Helps prevent unlikely token selections. Only available in specific models (e.g., Google’s Gemini). |
| Frequency Penalty | Reduces repetition based on token frequency | Higher values discourage repeating information. Useful for diverse content. Primarily in OpenAI models. |
| Presence Penalty | Penalizes tokens that have appeared at all | Higher values encourage new topics. Helps prevent theme repetition. Primarily in OpenAI models. |
Some models may expose additional parameters not listed here. Always check the provider’s documentation for model-specific parameter details.
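For orientation, these node parameters map to like-named fields in the providers’ own SDKs. Below is a minimal sketch using the OpenAI Python SDK; the model ID and parameter values are illustrative, and note that OpenAI does not expose Top K, consistent with the table above:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model ID
    messages=[{"role": "user", "content": "Summarize the benefits of unit testing."}],
    max_tokens=256,          # Max Output Tokens
    temperature=0.2,         # low value: focused, deterministic output
    top_p=0.9,               # nucleus sampling
    frequency_penalty=0.3,   # discourage verbatim repetition
    presence_penalty=0.0,    # no extra push toward new topics
)
print(response.choices[0].message.content)
```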
Output Configuration
- Output Format:
- Text (default): Regular text output
- Structured Output: JSON-formatted output following a schema
- Schema Configuration: (When using Structured Output)
- Inline JSON: Define the schema directly
- URL: Reference an external JSON Schema
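For example, a minimal inline JSON Schema for extracting a title, summary, and tags might look like this (all field names are hypothetical):

```json
{
  "type": "object",
  "properties": {
    "title": { "type": "string" },
    "summary": { "type": "string" },
    "tags": { "type": "array", "items": { "type": "string" } }
  },
  "required": ["title", "summary"]
}
```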
Advanced Usage: Chaining Models
You can create powerful workflows by chaining model calls through intermediate nodes:
- Use a more capable model for the initial processing
- Connect it to a Use Output node
- Feed the result to a fast (and cheap!) model with structured output support
- Set the second model’s system prompt so it structures the output into a specific format, for example: “Extract useful information from the following text”
This pattern allows you to:
- Leverage the strengths of different models
- Enforce structured output even when the first model doesn’t natively support it
- Optimize for both quality and cost
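Outside the node editor, the same two-step pattern looks roughly like this; a minimal sketch assuming the OpenAI Python SDK, with illustrative model choices (a capable model for the draft, a cheaper one for structuring):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

document = "Acme Corp reported 12% revenue growth in Q3, driven by its new cloud product."

# Step 1: a more capable model does the open-ended processing.
draft = client.chat.completions.create(
    model="gpt-4o",  # illustrative: the "more capable" model
    messages=[{"role": "user", "content": f"Analyze this report: {document}"}],
)

# Step 2: a fast, cheap model restructures the draft into JSON.
structured = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative: the fast, structured-output model
    messages=[
        {"role": "system", "content": "Extract useful information from the following text as JSON."},
        {"role": "user", "content": draft.choices[0].message.content},
    ],
    temperature=0.2,  # low temperature for consistent structure
    response_format={"type": "json_object"},  # OpenAI JSON mode
)
print(structured.choices[0].message.content)
```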
Tips and Best Practices
- Start with lower temperatures (0.1-0.3) when using structured output to get more consistent results
- Use Top K and Top P carefully as they can significantly impact output quality
- When using structured output:
- Ensure your schema is valid and well-defined (see the validation sketch after this list)
- Test with simple schemas before moving to complex ones
- Monitor token usage and costs through execution logs when chaining multiple model calls
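To test a schema before wiring it into a node, you can validate a sample output locally. A sketch using the Python jsonschema package, reusing the example schema from above:

```python
from jsonschema import ValidationError, validate

schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "summary": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "summary"],
}

# A sample model output to check against the schema.
sample_output = {"title": "Q3 report", "summary": "Revenue up 12%.", "tags": ["finance"]}

try:
    validate(instance=sample_output, schema=schema)
    print("Sample output matches the schema.")
except ValidationError as err:
    print(f"Schema mismatch: {err.message}")
```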