The Call AI Model node is a powerful component that lets you call models from various AI providers. This guide explains how to configure and use AI model calls effectively in your Neurons.

Functionality

The Call AI Model node allows you to:

  • Select from configured AI providers
  • Choose specific models for each provider
  • Configure model parameters
  • Define structured output schemas (when supported)
  • Chain multiple model calls for advanced workflows

Node Properties

AI Provider Configuration

  • AI Provider: Select from your configured providers (OpenAI, Google, Anthropic, Cohere, or Mistral)

    Providers must be configured with valid API keys in the Providers section before they appear in the list. You can manage multiple API keys per provider for different environments or projects.
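
For context on what those API keys do, provider SDKs typically read the same key from an environment variable when called directly. A minimal sketch, assuming the official OpenAI Python SDK and an OPENAI_API_KEY variable (the same key you would paste into the Providers section):

```python
import os

from openai import OpenAI

# The SDK falls back to the OPENAI_API_KEY environment variable on its own;
# passing the key explicitly just makes the dependency visible.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
```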

Model Selection

  • Model: Choose from available models for the selected provider
    • Model availability depends on the selected provider
    • Some models support structured output, while others don’t
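
If you want to see which models a given key can reach, most provider SDKs expose a model-listing endpoint. A sketch using the OpenAI Python SDK (other providers offer similar calls):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Enumerate the models this API key can access; the set differs per
# provider and per account, which is why the node's list is dynamic.
for model in client.models.list():
    print(model.id)
```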

Model Parameters

Available parameters and their valid ranges vary depending on the selected model. The interface automatically updates to show only the parameters supported by your chosen model.

Common parameters across many models include:

| Parameter | Description | Notes |
| --- | --- | --- |
| Max Output Tokens | Maximum number of tokens in the model’s response | Higher values allow longer responses but increase costs. Each model has its own maximum limit. |
| Temperature | Controls response randomness | Lower values (0.1-0.3): more focused, deterministic responses. Higher values (0.7-1.0): more creative, varied responses. Recommended: 0.1-0.3 for structured output. |
| Top P (Nucleus Sampling) | Controls response diversity | Works alongside temperature. Lower values: more focused on likely tokens. Higher values: more diverse vocabulary. Not available in all models. |
| Top K | Limits token selection to the K most likely tokens | Helps prevent unlikely token selections. Only available in specific models (e.g., Google’s Gemini). |
| Frequency Penalty | Reduces repetition based on token frequency | Higher values discourage repeating information. Useful for diverse content. Primarily in OpenAI models. |
| Presence Penalty | Penalizes any token that has already appeared | Higher values encourage new topics. Helps prevent theme repetition. Primarily in OpenAI models. |

Some models may expose additional parameters not listed here. Always check the provider’s documentation for model-specific parameter details.
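
To make the mapping concrete, here is a sketch of how these parameters appear in a direct API call, using the OpenAI Chat Completions API as an example. The model name is illustrative, parameter names differ slightly between providers, and note that Top K is absent because OpenAI models don’t expose it, matching the table above:

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",    # illustrative model name
    messages=[{"role": "user", "content": "Summarize the water cycle."}],
    max_tokens=256,         # Max Output Tokens: caps response length and cost
    temperature=0.2,        # low value: focused, deterministic output
    top_p=0.9,              # nucleus sampling; works alongside temperature
    frequency_penalty=0.3,  # discourages repeating frequent tokens
    presence_penalty=0.1,   # encourages introducing new topics
)
print(response.choices[0].message.content)
```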

Output Configuration

  • Output Format:
    • Text (default): Regular text output
    • Structured Output: JSON-formatted output following a schema
  • Schema Configuration: (When using Structured Output)
    • Inline JSON: Define the schema directly
    • URL: Reference an external JSON Schema
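
As an illustration, an inline schema for a simple summarization result might look like the following sketch. The field names are hypothetical; serialize the dictionary and paste it into the Inline JSON field, or host it and reference it by URL:

```python
import json

# Hypothetical schema for a summarization result.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "sentiment": {
            "type": "string",
            "enum": ["positive", "neutral", "negative"],
        },
        "key_points": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "sentiment", "key_points"],
}

print(json.dumps(schema, indent=2))  # paste this into the Inline JSON field
```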

Advanced Usage: Chaining Models

You can create powerful workflows by chaining different models using nodes:

  1. Use a more capable model for initial processing
  2. Connect to a Use Output node
  3. Change the system prompt of the next model to ask it to structure the output into a specific format, for example: “Extract useful information from the following text”
  4. Feed the result to a fast (and cheap!) model with structured output support
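
Outside the visual editor, the same pattern looks roughly like the sketch below, using the OpenAI Python SDK with illustrative model names; the second call uses JSON mode as one way to enforce structure:

```python
from openai import OpenAI

client = OpenAI()

# Step 1: a more capable model handles the open-ended analysis.
draft = client.chat.completions.create(
    model="gpt-4o",  # illustrative "capable" model
    messages=[{"role": "user", "content": "Analyze this support ticket: ..."}],
).choices[0].message.content

# Steps 2-4: hand that output to a fast, cheap model whose system prompt
# asks for a specific structure; JSON mode guarantees syntactically valid JSON.
structured = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative "fast and cheap" model
    response_format={"type": "json_object"},  # JSON mode
    temperature=0.2,      # low temperature for consistent structure
    messages=[
        {
            "role": "system",
            "content": "Extract useful information from the following text. "
                       "Respond with JSON containing 'summary' and 'urgency'.",
        },
        {"role": "user", "content": draft},
    ],
).choices[0].message.content
print(structured)
```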

This pattern allows you to:

  • Leverage the strengths of different models
  • Enforce structured output even with models that don’t natively support it
  • Optimize for both quality and cost

Tips and Best Practices

  • Start with lower temperatures (0.1-0.3) when using structured output to get more consistent results
  • Use Top K and Top P carefully, as they can significantly impact output quality
  • When using structured output:
    • Ensure your schema is valid and well-defined (see the validation sketch below)
    • Test with simple schemas before moving to complex ones
  • Monitor token usage and costs through execution logs when chaining multiple model calls
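
One simple way to check a schema before using it in the node is to validate a sample response against it locally; a sketch using the third-party jsonschema package:

```python
from jsonschema import ValidationError, validate

# Start with a minimal schema and grow it once this check passes.
schema = {
    "type": "object",
    "properties": {"summary": {"type": "string"}},
    "required": ["summary"],
}

sample = {"summary": "A short test output."}

try:
    validate(instance=sample, schema=schema)
    print("Sample conforms to the schema.")
except ValidationError as err:
    print(f"Schema mismatch: {err.message}")
```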