Python Quickstart

Developer Key

A rate-limited API developer key can be provisioned from the Sapling API dashboard. Developer keys allow for processing of 5000 characters every 24 hours. Subscribe for production access and usage-based pricing.

Installation

Install the sapling-py package with pip:

python -m pip install sapling-py

Getting Edits

Here's a sample script using the Sapling package:

from sapling import SaplingClient

api_key = '<api-key>'
client = SaplingClient(api_key=api_key)
edits = client.edits('Lets get started!', session_id='test_session')
print(edits)

The result of running the script should be an array of edits of this form:

[{
  "id": "aa5ee291-a073-5146-8ebc-c9c899d01278",
  "sentence": "Lets get started!",
  "sentence_start": 0,
  "start": 0,
  "end": 4,
  "replacement": "Let's",
  "error_type": "R:OTHER",
  "general_error_type": "Other"
}]

Applying Edits

After you've sent the Sapling API text and gotten JSON edits in the response, how do you apply the edits to get updated text?

The simplest way is to use the auto_apply argument. Setting it to true adds an extra field, applied_text, to the response, containing the text with the edits applied.

However, you can also easily apply edits programmatically.

Programmatically Applying Edits

Recall the Edit data structure:

{
  "id": <str, UUID>,            // Opaque edit id, used to give feedback
  "sentence": <str>,            // Unedited sentence
  "sentence_start": <int>,      // Offset of sentence from start of text
  "start": <int>,               // Offset of edit start relative to sentence
  "end": <int>,                 // Offset of edit end relative to sentence
  "replacement": <str>,         // Suggested replacement
  "error_type": <str>,          // Error type, see "Error Categories"
  "general_error_type": <str>,  // See "Error Categories"
}

When applying edits programmatically, process them in descending order of absolute start offset (sentence_start + start), so that each change leaves the offsets of the remaining edits intact.
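As a minimal sketch of the offset arithmetic described above: start and end are relative to the sentence, so adding sentence_start gives positions in the full text. The helper name absolute_span and the sample text and edit values are made up for illustration; they are not part of the Sapling package or an actual API response.

```python
from typing import Tuple


def absolute_span(edit: dict) -> Tuple[int, int]:
    """Return the (start, end) offsets of an edit relative to the full text."""
    base = edit['sentence_start']
    return base + edit['start'], base + edit['end']


# Hypothetical values for illustration: the second sentence begins at offset 13.
text = 'Hello there. Lets get started!'
edit = {'sentence_start': 13, 'start': 0, 'end': 4, 'replacement': "Let's"}

start, end = absolute_span(edit)
flagged = text[start:end]  # the substring the edit would replace
print(start, end, flagged)  # prints: 13 17 Lets
```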

For example, consider the sentence "Lets go to the housee.", for which Sapling returns the following list of edits:

[
  {
    'sentence_start': 0,
    'start': 0,
    'end': 4,
    'replacement': "Let's",
    ...
  },
  {
    'sentence_start': 0,
    'start': 15,
    'end': 21,
    'replacement': 'house',
    ...
  }
]

The simplest way to apply the edits to your text is in reverse order:

  1. Replace characters 15-21 with house.
  2. Replace characters 0-4 with Let's.

If Lets were replaced before housee, the offsets of the later edit would need to be updated, because Let's is one character longer than Lets and shifts everything after it.
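The effect of ordering can be checked with a short sketch that applies the two edits above first in the order given and then in reverse, without adjusting any offsets. The helper apply_in_order is a hypothetical name used only for this illustration.

```python
text = 'Lets go to the housee.'
edits = [
    {'sentence_start': 0, 'start': 0, 'end': 4, 'replacement': "Let's"},
    {'sentence_start': 0, 'start': 15, 'end': 21, 'replacement': 'house'},
]


def apply_in_order(text, edits):
    """Apply edits in the order given, without adjusting any offsets."""
    for e in edits:
        start = e['sentence_start'] + e['start']
        end = e['sentence_start'] + e['end']
        text = text[:start] + e['replacement'] + text[end:]
    return text


forward = apply_in_order(text, edits)                   # Lets first, then housee
backward = apply_in_order(text, list(reversed(edits)))  # housee first, then Lets

print(forward)   # prints: Let's go to thehousee.
print(backward)  # prints: Let's go to the house.
```

Applying Lets first shifts the rest of the sentence by one character, so the second edit lands in the wrong place. Working backwards avoids the problem.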

Sample Code

We provide sample code below for applying edits.

A few things to keep in mind:

  • The edits array is ordered by starting position, though we include logic below to ensure this is the case.
  • For some languages where assignment is by reference, you will want to create a copy of the original string before modifying it.
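
def apply_edits(text, edits):
    # Python strings are immutable, so this is only a defensive copy.
    text = str(text)
    # Sort by absolute start offset, last edit first, so earlier offsets stay valid.
    edits = sorted(edits, key=lambda e: e['sentence_start'] + e['start'], reverse=True)
    for edit in edits:
        start = edit['sentence_start'] + edit['start']
        end = edit['sentence_start'] + edit['end']
        # Skip edits whose offsets fall outside the text.
        if start > len(text) or end > len(text):
            print(f'Edit start:{start}/end:{end} outside of bounds of text:{text}')
            continue
        text = text[:start] + edit['replacement'] + text[end:]
    return text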

Processing Files

The Sapling API can be used to process files as well. Sapling's Edit API currently has a 50,000 character limit, so larger documents will need to be chunked.

Sapling provides a pre-processing API endpoint that helps with chunking. This endpoint breaks long documents into pieces, prioritising splitting on things like page and paragraph breaks in order to preserve overall text context.

from sapling import SaplingClient

file_name = '<FILE_TO_PROCESS>'
api_key = '<api-key>'

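# Read the whole file into memory.
text = ''
with open(file_name) as f:
    text = f.read().strip()

client = SaplingClient(api_key=api_key)
# Split the document into pieces that fit within the Edit API limit.
chunks = client.chunk_text(text, max_length=20000)

# Each call returns the edits for that chunk only.
for chunk in chunks:
    edits = client.edits(chunk, session_id=file_name)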

More information about the Chunking/Preprocessing endpoint can be found in the Sapling API documentation.
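To illustrate what splitting on paragraph breaks means, here is a purely local sketch that greedily packs paragraphs into chunks under a length limit. It is not the Sapling endpoint and makes no claim about how that endpoint is implemented. The function name chunk_by_paragraph and the blank-line paragraph separator are assumptions for this example.

```python
def chunk_by_paragraph(text, max_length):
    """Greedily pack paragraphs into chunks of at most max_length characters."""
    chunks, current = [], ''
    for para in text.split('\n\n'):
        candidate = para if not current else current + '\n\n' + para
        if len(candidate) <= max_length:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single paragraph longer than max_length is split at the limit.
        while len(para) > max_length:
            chunks.append(para[:max_length])
            para = para[max_length:]
        current = para
    if current:
        chunks.append(current)
    return chunks


sample = 'First paragraph.\n\nSecond paragraph.\n\nThird.'
chunks = chunk_by_paragraph(sample, max_length=36)
print(chunks)  # prints: ['First paragraph.\n\nSecond paragraph.', 'Third.']
```

Keeping whole paragraphs together where possible means each chunk carries more of its surrounding context. A hard split is used only when a single paragraph exceeds the limit.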

More Details

More detail on the API request options and response structure can be found in the Sapling API documentation.

Documentation on SaplingClient is available on Read the Docs.