The versatility of Neurons, and the fact that you can design them any way you want through our visual editor, make them a great fit for a wide range of use cases.

Here are some of the most common use cases that our users have built with Prompteus.

Testing and Changing AI Models

Prompteus enables rapid adoption of new AI models as they are released. Instead of a lengthy process of code refactoring, testing with limited user sets, prompt optimization, and retesting, Neurons let you make the switch in a fraction of the time.

Key Benefits:

  • No Code Changes: Switch models without modifying your application’s core code.
  • A/B Testing (Randomized Split): Direct a percentage of requests to different models (e.g., 50% to GPT-3.5, 50% to Gemini 1.5 Flash). Easily adjust these percentages to perform canary testing, gradually increasing exposure to a new model.
  • Real-time Deployment: Changes to Neuron configurations (like model assignments or split values) are deployed instantly.
  • Detailed Logs: Track which model handled each request and review the execution details.
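Conceptually, the randomized split is a weighted draw over the configured models. The sketch below illustrates the idea only; in Prompteus you set these percentages in the Neuron editor rather than in code, and the model names and weights here are made up for the example:

```python
import random

def pick_model(split, rng=random.random):
    """Pick a model according to a percentage split,
    e.g. {"gpt-3.5-turbo": 50, "gemini-1.5-flash": 50}."""
    total = sum(split.values())
    roll = rng() * total
    cumulative = 0
    for model, weight in split.items():
        cumulative += weight
        if roll < cumulative:
            return model
    return model  # guard against floating-point edge cases

# Canary rollout: route 10% of traffic to the new model.
canary_split = {"gemini-1.5-flash": 10, "gpt-3.5-turbo": 90}
chosen = pick_model(canary_split)
```

Gradually raising the new model's weight (10% → 50% → 100%) is the canary pattern described above; since Neuron configuration changes deploy instantly, each adjustment takes effect on the next request.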

Handling Sensitive Data (Guardrails)

Prompteus provides guardrails for scenarios where AI processes sensitive information. This is crucial for use cases like summarizing messages or phone conversations, where personal data (e.g., credit card numbers) might be present.

Key Features:

  • Redaction: Use the Replace Words in Prompt node to detect and replace sensitive data before it’s sent to the AI provider. The video demonstrates using a regular expression to identify and redact credit card numbers.
  • System Prompts: Define a system prompt to instruct the AI on how to handle the overall task (e.g., “Summarize this bank call…”).
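The kind of redaction shown can be expressed as a regular-expression substitution. The pattern below is a deliberately simplified card-number matcher for illustration (real card detection, e.g. with Luhn checks, is more involved); in Prompteus the Replace Words in Prompt node configures this visually rather than in code:

```python
import re

# Simplified pattern: 13-16 digits, optionally separated by spaces or dashes.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")

def redact_cards(text: str) -> str:
    """Replace anything that looks like a card number
    before the text reaches the AI provider."""
    return CARD_PATTERN.sub("[REDACTED]", text)

print(redact_cards("My card is 4111 1111 1111 1111, please help."))
# → My card is [REDACTED], please help.
```

The substitution runs before the prompt leaves your workflow, so the AI provider never sees the original digits.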

Caching: Saving on Request Costs and Speed

AI inference costs can be significant. Prompteus includes caching mechanisms to reduce these costs and improve response times.

Key Features:

  • Semantic Caching: Prompteus’s cache matches requests by meaning rather than exact wording, and it works across languages.
  • Configurable Threshold: You define a similarity threshold. If a new request is semantically similar enough to a previous one (above the threshold), Prompteus returns the cached response, avoiding a new AI inference call.
  • Cost Savings: The video mentions seeing up to 50% cost reduction in some workflows.
  • Speed Improvement: Cached responses are delivered much faster than new inferences.
  • Dashboard Tracking: The Prompteus dashboard shows your savings from caching.

Example: The video demonstrates a simple “Jeopardy contestant” Neuron. When two similar questions are asked (“European country known for baguette” and “a country in Europe known for baguette”), the second response is served from the cache, even though the wording is slightly different.
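The cache behavior can be sketched as a lookup that compares a new prompt against stored ones and returns the stored response when similarity clears the threshold. To keep the sketch self-contained, it substitutes a simple string-similarity ratio (difflib) for the real embedding-based semantic comparison Prompteus performs, and the threshold value is illustrative:

```python
from difflib import SequenceMatcher

class SemanticCache:
    """Toy cache: difflib's ratio stands in for embedding similarity."""

    def __init__(self, threshold: float = 0.6):
        self.threshold = threshold
        self.entries = []  # list of (prompt, response) pairs

    def lookup(self, prompt):
        for cached_prompt, response in self.entries:
            score = SequenceMatcher(None, prompt.lower(), cached_prompt.lower()).ratio()
            if score >= self.threshold:
                return response  # cache hit: the AI inference call is skipped
        return None  # cache miss: run inference, then store() the result

    def store(self, prompt, response):
        self.entries.append((prompt, response))

cache = SemanticCache(threshold=0.6)
cache.store("European country known for baguette", "What is France?")
print(cache.lookup("a country in Europe known for baguette"))  # hit, despite rewording
```

A higher threshold means only near-identical requests hit the cache; a lower one trades some precision for bigger cost and latency savings.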

Advanced Neuron Settings and Security Features

Prompteus includes a settings page to access advanced features.

Key Features:

  • Structured Output: Set the output format to Text or Structured Output; the latter lets you enforce a JSON Schema on responses.
  • Access Control: Configure who can access a Neuron. Options include:
    • Public Access: Anyone with the Neuron’s URL can use it.
    • Referer Restrictions: Limit requests to specific domains.
    • IP Restrictions: Allow requests only from specific IP addresses.
    • Authentication: Require users to authenticate via API Key or JWT.
  • Rate Limiting: Set limits on Neuron execution (e.g., requests per minute, tokens per IP address) to prevent excessive usage.
  • Request-Level Logging: See cost, payload details, and the execution trace for each request.
  • Model Fallback: Switch to another model or provider on model failure or unavailability.
  • Neuron Chaining: Call one Neuron from another.
  • Tool Calling (Private Beta): Transform existing APIs into tools callable by supported AI models, so LLMs can invoke your own endpoints.
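Of these, model fallback is the simplest to picture: try the primary model, and if the call fails or the provider is unavailable, retry against the next one. The sketch below uses hypothetical stand-in callables in place of real inference calls; in Prompteus this is a setting on the Neuron, not application code:

```python
def run_with_fallback(prompt, providers):
    """Try each provider in order; return (name, response) from the first success.

    `providers` is an ordered list of (name, callable) pairs; each callable
    is a stand-in for an inference call and may raise on failure.
    """
    last_error = None
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # model failure or unavailability
            last_error = exc
    raise RuntimeError("all providers failed") from last_error

# Illustrative stand-ins: a primary model that is down, and a healthy fallback.
def flaky_primary(prompt):
    raise TimeoutError("provider unavailable")

def healthy_fallback(prompt):
    return f"echo: {prompt}"

name, answer = run_with_fallback(
    "hi", [("primary", flaky_primary), ("fallback", healthy_fallback)]
)
```

Combined with request-level logging, this makes it easy to see after the fact which requests were served by the fallback rather than the primary model.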