AI Detector

The endpoint computes the probability that a piece of text is AI-generated, as well as the probability that each constituent sentence and token is AI-generated.

The system is trained to handle output from LLMs from different vendors, such as OpenAI's GPT family of models, Google's Gemini models, Anthropic's Claude models, and the open-release Llama and Mistral models. It is also somewhat robust to small changes and noisy text.

Accuracy and Trade-offs

All AI detection systems have false positives and false negatives. In some cases, small modifications to AI-generated text can cause that text to no longer be flagged as AI-generated. In other cases, human-written (but perhaps rote) text can be misclassified as AI-generated. Which error is costlier, a false positive or a false negative, depends on the application. Contact us for ways to adjust for your use case.
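One common way to trade false positives against false negatives on the client side is to choose your own cutoff on the returned score. The sketch below is illustrative only: the `classify` helper and the threshold values are hypothetical and are not part of the API.

```python
def classify(score: float, threshold: float = 0.5) -> str:
    """Map a detector score in [0, 1] to a label using a caller-chosen cutoff.

    A higher threshold yields fewer false positives (human text flagged as AI)
    at the cost of more false negatives, and vice versa.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return "ai" if score >= threshold else "human"


sample_score = 0.8016229165451867  # overall score from the sample response in this document
print(classify(sample_score, threshold=0.5))  # lenient cutoff
print(classify(sample_score, threshold=0.9))  # strict cutoff, fewer false positives
```

The right threshold depends on your data, so it is worth calibrating it on text whose origin you already know.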

Sample Code

The examples below are given in this order: cURL, JavaScript, Python, and the Python SDK.

curl -X POST https://api.sapling.ai/api/v1/aidetect \
     -H "Content-Type: application/json" \
     -d '{"key":"<api-key>", "text":"This is sample text."}'

import axios from 'axios';

async function run(text) {
    try {
        const response = await axios.post(
            'https://api.sapling.ai/api/v1/aidetect',
            {
                key: '<api-key>',
                text,
            },
        );
        const {status, data} = response;
        console.log({status});
        console.log(JSON.stringify(data, null, 4));
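    } catch (err) {
        // err.response is undefined for network-level failures, so fall back to err.message
        const msg = err.response?.data?.msg ?? err.message;
        console.log({err: msg});
    }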
}

run('This is sample text'); // replace with the text you want to analyze

import requests
from pprint import pprint

response = requests.post(
    "https://api.sapling.ai/api/v1/aidetect",
    json={
        "key": "<api-key>",
        "text": "This is sample text."
    }
)

if 200 <= response.status_code < 300:
    pprint(response.json())
else:
    print('Error: ', response.status_code, response.text)

# python -m pip install sapling-py

from sapling import SaplingClient
from pprint import pprint

api_key = '<api-key>'
client = SaplingClient(api_key=api_key)
detection_scores = client.aidetect('This is sample text.', sent_scores=True)
pprint(detection_scores)

AI Detector UI integration

Sapling's JavaScript SDK provides a complete end-to-end UI integration for AI content detection. Head over to our AI Detect JavaScript Quickstart for more details.

AI Detector POST

Request Parameters

https://api.sapling.ai/api/v1/aidetect

HTTP method: POST

The AI Detector API POST endpoint takes the JSON parameters documented below:

key: String
32-character API key.

text: String
Text to run detection on. The limit is currently 200,000 characters. If latency is high or requests time out, we recommend adapting this script. Please contact us if you need to run the system on longer inputs. We can also provide suggestions on how to chunk your text into smaller pieces and then combine detection results.
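A minimal sketch of one way to chunk long text and combine the results, assuming a length-weighted average of the per-chunk scores is acceptable for your use case. The helpers `chunk_text` and `combine_scores` are hypothetical and not part of the API, and the combination rule is an assumption rather than the method Sapling recommends.

```python
import re

MAX_CHARS = 200_000  # documented limit on the text parameter


def chunk_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into pieces of at most max_chars, preferring sentence boundaries."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def combine_scores(chunks: list[str], scores: list[float]) -> float:
    """Length-weighted mean of per-chunk scores (an assumed combination rule)."""
    total = sum(len(chunk) for chunk in chunks)
    if total == 0:
        raise ValueError("no text to score")
    return sum(len(c) * s for c, s in zip(chunks, scores)) / total
```

Each chunk would be sent as the `text` of its own request, and the `score` values from the responses passed to `combine_scores` in the same order.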

sent_scores: Boolean
Whether to return sentence scores. Defaults to true. If speed is of the essence, you can disable this setting.

score_string: Boolean
Whether to return an HTML string that highlights token-level scores. Defaults to false. This lets you visualize which portions of the text are likely AI-generated, similar to the view on Sapling's AI detector page.

version: String
There are currently 3 versions of the detector available:

20230317
20231024
20240606 (current default)

While we have found the later versions to be more performant, you may wish to pin an older version to ensure consistency in your application.
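The request parameters above can be assembled into a JSON body as in the sketch below. The `build_payload` helper and its client-side validation are hypothetical conveniences, not part of the API. The parameter names and version strings are the ones documented in this section.

```python
KNOWN_VERSIONS = {"20230317", "20231024", "20240606"}  # versions listed in this document


def build_payload(api_key, text, version=None, sent_scores=True, score_string=False):
    """Build the JSON body for a POST to /api/v1/aidetect."""
    payload = {
        "key": api_key,
        "text": text,
        "sent_scores": sent_scores,
        "score_string": score_string,
    }
    if version is not None:
        # Client-side check only; the server remains the authority on valid versions.
        if version not in KNOWN_VERSIONS:
            raise ValueError(f"unknown detector version: {version}")
        payload["version"] = version
    return payload


payload = build_payload("<api-key>", "This is sample text.", version="20231024")
# Send with: requests.post("https://api.sapling.ai/api/v1/aidetect", json=payload)
```

Omitting `version` leaves the choice to the server, which uses the current default.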

Response Parameters

The AI Detector POST endpoint returns JSON in the following format:

{
    "score": 0.8016229165451867,
    "sentence_scores": [
        {
            "score": 1.1537837352193492e-10,
            "sentence": "Here is a sentence."
        }
    ],
    "text": "Here is a sentence.",
    "token_probs": [
        0.8062431365251541,
        0.8068526238203049,
        0.8062431365251541,
        0.8080672174692154,
        0.8062431365251541
    ],
    "tokens": [
        "Here",
        " is",
        " a",
        " sentence",
        "."
    ]
}

A score from 0 to 1 will be returned, with 0 indicating the maximum confidence that the text is human-written, and 1 indicating the maximum confidence that the text is AI-generated.

If score_string is set to true, a score_string field will be provided. The field contains an HTML string with a heatmap of the portions of the text that are predicted to be AI-generated. If the default score string is not what you desire, you can generate your own using tokens and token_probs.

If sent_scores is set to true, a sentence_scores field containing a score for each sentence will also be returned. The per-sentence scores may not correlate with the overall score field because they are computed using a different method.
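Because the per-sentence scores are computed separately from the overall score, it can help to inspect them on their own. The sketch below lists the sentences at or above a cutoff, assuming a parsed response shaped like the sample above. The `flag_sentences` helper and the 0.5 threshold are hypothetical.

```python
def flag_sentences(response: dict, threshold: float = 0.5) -> list[tuple[float, str]]:
    """Return (score, sentence) pairs at or above threshold, highest score first."""
    items = response.get("sentence_scores", [])  # absent when sent_scores is false
    flagged = [
        (item["score"], item["sentence"])
        for item in items
        if item["score"] >= threshold
    ]
    return sorted(flagged, key=lambda pair: pair[0], reverse=True)


sample = {
    "score": 0.8016229165451867,
    "sentence_scores": [
        {"score": 1.1537837352193492e-10, "sentence": "Here is a sentence."}
    ],
}
print(flag_sentences(sample))  # the single sample sentence scores far below 0.5
```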

tokens: List of tokens from the backend tokenizer. This can be used with token_probs to visualize the output prediction per token.

token_probs: List of probabilities that each token is AI-generated. This can be used with tokens to visualize the output prediction per token.
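As a rough illustration of building your own visualization from tokens and token_probs, the sketch below wraps each token in a span whose background opacity tracks its probability. The `render_heatmap` helper and the red color scheme are hypothetical, and the output is not meant to match the format of the built-in score_string.

```python
from html import escape


def render_heatmap(tokens: list[str], token_probs: list[float]) -> str:
    """Render tokens as HTML spans shaded by their AI-generated probability."""
    if len(tokens) != len(token_probs):
        raise ValueError("tokens and token_probs must have the same length")
    spans = []
    for token, prob in zip(tokens, token_probs):
        alpha = min(max(prob, 0.0), 1.0)  # clamp to a valid opacity
        spans.append(
            f'<span style="background-color: rgba(255, 0, 0, {alpha:.2f})">'
            f"{escape(token)}</span>"
        )
    return "".join(spans)


html = render_heatmap(["Here", " is"], [0.8062431365251541, 0.8068526238203049])
print(html)
```

Tokens are HTML-escaped before insertion so that markup in the analyzed text is displayed rather than interpreted.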

Tips

Check the status code of the result and any error logs.
Unless you're sending very large requests, the requests should rarely fail or time out, but you can follow these instructions to implement a retry mechanism.
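A minimal retry sketch with exponential backoff, offered as one possible approach rather than the linked instructions. The `call_with_retries` helper is hypothetical, and treating HTTP 429 and 5xx responses as retryable is an assumption about which failures are transient.

```python
import time


def call_with_retries(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds or attempts run out.

    send is any zero-argument callable returning an object with a status_code
    attribute, for example a lambda wrapping requests.post.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(max_attempts):
        try:
            response = send()
            if response.status_code != 429 and response.status_code < 500:
                return response  # success, or a client error that retrying will not fix
            last_error = RuntimeError(f"HTTP {response.status_code}")
        except OSError as exc:  # requests.RequestException subclasses OSError
            last_error = exc
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ... by default
    raise last_error


# Usage (assumes the requests package is installed):
# response = call_with_retries(
#     lambda: requests.post(
#         "https://api.sapling.ai/api/v1/aidetect",
#         json={"key": "<api-key>", "text": "This is sample text."},
#         timeout=30,
#     )
# )
```

The caller should still check the status code of the returned response, since 4xx responses are passed through without retrying.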

Checking Files (PDF/DOCX)

Sometimes you may wish to send PDF or DOCX files to the API.

To do this, refer to the Files documentation to see how you can extract text from files before passing the text to the API. These endpoints are currently provided free of charge; however, if you plan to use them for high volumes of text, contact us and ensure you're using one of the other endpoints, or we may limit usage to reduce server load.
