Quickstart
This guide walks you through uploading a document and extracting structured data using the Sterndesk API.
Prerequisites
Before you begin, ensure you have:
- An API key for authenticating requests. See Authentication for setup instructions.
- An organization and project configured. These are created by default when you initialize your account. If you need to create additional ones, see Organizations and Projects.
Set your API key as an environment variable:
export STERNDESK_API_KEY="your-api-key"
Step 1: Get Your Organization ID
First, retrieve your organization ID by listing all organizations you have access to:
curl -X GET "https://api.sterndesk.com/r/organizations" \
-H "Authorization: Bearer $STERNDESK_API_KEY"
{
"items": [
{
"id": "org_2abc123def456",
"name": "My Organization",
"description": "Default organization",
"createdAt": "2024-01-15T10:00:00Z",
"updatedAt": "2024-01-15T10:00:00Z"
}
]
}
Save the id value for the next step.
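For scripted setups, the ID can be read from the response body instead of copied by hand. The following is a minimal Python sketch, assuming the list response has the `items` shape shown above; the helper name `first_item_id` is hypothetical and not part of any Sterndesk SDK.

```python
import json

# Sample body abbreviated from the response shown above.
SAMPLE_RESPONSE = """
{
  "items": [
    {
      "id": "org_2abc123def456",
      "name": "My Organization"
    }
  ]
}
"""


def first_item_id(response_body: str) -> str:
    """Return the "id" of the first entry in a list response's "items" array.

    Hypothetical helper: assumes the {"items": [...]} shape shown in this guide.
    """
    items = json.loads(response_body).get("items", [])
    if not items:
        raise ValueError("response contains no items")
    return items[0]["id"]


organization_id = first_item_id(SAMPLE_RESPONSE)
print(organization_id)  # prints org_2abc123def456
```

The same helper should apply to the project list in Step 2, since that response uses the same `items` structure.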
Step 2: Get Your Project ID
List projects within your organization:
curl -X GET "https://api.sterndesk.com/r/projects?organizationId=org_2abc123def456" \
-H "Authorization: Bearer $STERNDESK_API_KEY"
{
"items": [
{
"id": "proj_3xyz789ghi012",
"name": "Default Project",
"organizationId": "org_2abc123def456"
}
]
}
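Step 3: Create an Extraction Schema
Define a schema that describes the structure of the data you want to extract. The jsonSchema field takes the JSON Schema as a JSON-encoded string, so its inner quotes must be escaped. This example creates a schema for research papers: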
curl -X POST "https://api.sterndesk.com/r/extraction-schemas" \
-H "Authorization: Bearer $STERNDESK_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"projectId": "proj_3xyz789ghi012",
"name": "research-paper",
"jsonSchema": "{\"type\":\"object\",\"properties\":{\"title\":{\"type\":\"string\",\"description\":\"The title of the research paper\"},\"authors\":{\"type\":\"array\",\"items\":{\"type\":\"string\"},\"description\":\"List of author names\"},\"abstract\":{\"type\":\"string\",\"description\":\"The paper abstract or summary\"},\"keywords\":{\"type\":\"array\",\"items\":{\"type\":\"string\"},\"description\":\"Keywords or topics covered\"},\"publicationYear\":{\"type\":\"integer\",\"description\":\"Year of publication\"}},\"required\":[\"title\",\"authors\"]}"
}'
{
"id": "schema_4mno345pqr678",
"name": "research-paper"
}
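Because `jsonSchema` is sent as a string rather than a nested object, escaping it by hand is error-prone. One way to avoid that is to write the schema as a normal data structure and serialize it twice. This is a sketch under that assumption; `build_schema_request` is a hypothetical helper name, and the field names are the ones used in the request above.

```python
import json

# The schema from the curl example, written as a plain Python dict.
research_paper_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string", "description": "The title of the research paper"},
        "authors": {
            "type": "array",
            "items": {"type": "string"},
            "description": "List of author names",
        },
        "abstract": {"type": "string", "description": "The paper abstract or summary"},
        "keywords": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Keywords or topics covered",
        },
        "publicationYear": {"type": "integer", "description": "Year of publication"},
    },
    "required": ["title", "authors"],
}


def build_schema_request(project_id: str, name: str, schema: dict) -> str:
    """Return the JSON request body, with the schema embedded as a string."""
    payload = {
        "projectId": project_id,
        "name": name,
        # First serialization: the schema becomes a compact JSON string.
        "jsonSchema": json.dumps(schema, separators=(",", ":")),
    }
    # Second serialization: the whole payload, which escapes the inner quotes.
    return json.dumps(payload)


request_body = build_schema_request(
    "proj_3xyz789ghi012", "research-paper", research_paper_schema
)
print(request_body)
```

The printed body can be passed to curl with `-d`, or sent with any HTTP client.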
Step 4: Create an Upload Collector
Create an upload collector that uses your extraction schema. This configures how uploaded files will be processed:
curl -X POST "https://api.sterndesk.com/r/upload-collectors" \
-H "Authorization: Bearer $STERNDESK_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"projectId": "proj_3xyz789ghi012",
"name": "research-papers-collector",
"strategy": "UPLOAD_STRATEGY_PUT",
"directExtractionSchemaId": "schema_4mno345pqr678"
}'
{
"id": "coll_5stu901vwx234",
"name": "research-papers-collector"
}
Step 5: Create an Upload
Initiate an upload to get pre-signed URLs for your files. Specify the file sizes and an expiration duration:
curl -X POST "https://api.sterndesk.com/r/uploads" \
-H "Authorization: Bearer $STERNDESK_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"uploadCollectorId": "coll_5stu901vwx234",
"files": [
{"sizeBytes": 245678}
],
"uploadExpiration": "3600s"
}'
{
"preSigns": [
{
"url": "https://storage.sterndesk.com/uploads/abc123?signature=xyz...",
"strategy": "UPLOAD_STRATEGY_PUT"
}
]
}
The response contains pre-signed URLs, one for each file you specified.
To learn more about pre-signed URLs and upload strategies, see Upload URLs.
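The `sizeBytes` values in the request should match the files you intend to upload. The sketch below reads the sizes from disk and builds the same request body as the curl example. It assumes the field names shown above and that `uploadExpiration` is a number of seconds followed by `s`; `build_upload_request` is a hypothetical helper, and the temporary file only stands in for a real document.

```python
import json
import os
import tempfile
from typing import Sequence


def build_upload_request(
    collector_id: str, file_paths: Sequence[str], expiration_seconds: int = 3600
) -> str:
    """Return the JSON body for creating an upload, one entry per file."""
    payload = {
        "uploadCollectorId": collector_id,
        # Read each file's size from the filesystem rather than hard-coding it.
        "files": [{"sizeBytes": os.path.getsize(path)} for path in file_paths],
        # Duration format copied from the example above ("3600s").
        "uploadExpiration": f"{expiration_seconds}s",
    }
    return json.dumps(payload)


# Demo with a placeholder file of the same size as in the curl example.
with tempfile.TemporaryDirectory() as workdir:
    sample_path = os.path.join(workdir, "research-paper.pdf")
    with open(sample_path, "wb") as handle:
        handle.write(b"x" * 245678)
    request_body = build_upload_request("coll_5stu901vwx234", [sample_path])

print(request_body)
```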
Step 6: Upload Your File
Use the pre-signed URL to upload your file directly. For UPLOAD_STRATEGY_PUT, use an HTTP PUT request:
curl -X PUT "https://storage.sterndesk.com/uploads/abc123?signature=xyz..." \
-H "Content-Type: application/pdf" \
--data-binary @research-paper.pdf
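The same PUT can be issued from Python's standard library. This is a minimal sketch mirroring the curl command above: the file bytes are the request body, the only header is `Content-Type`, and no `Authorization` header is sent. The function names are hypothetical, and the status code returned on success is not documented here, so the sketch simply returns whatever the server reports.

```python
import urllib.request


def build_put_request(
    presigned_url: str, file_bytes: bytes, content_type: str = "application/pdf"
) -> urllib.request.Request:
    """Build (but do not send) a PUT request carrying the raw file bytes."""
    return urllib.request.Request(
        presigned_url,
        data=file_bytes,
        headers={"Content-Type": content_type},
        method="PUT",
    )


def upload_file(presigned_url: str, path: str) -> int:
    """Send the file at `path` to the pre-signed URL and return the HTTP status."""
    with open(path, "rb") as handle:
        request = build_put_request(presigned_url, handle.read())
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call would look like `upload_file(url, "research-paper.pdf")`, where `url` is the `url` value from the `preSigns` entry returned in Step 5.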
After the upload completes, Sterndesk automatically processes the file and extracts data according to your schema. Poll the extractions endpoint, filtered by your upload ID, to check the status:
curl -X GET "https://api.sterndesk.com/r/direct-upload-extractions?uploadId=upload_6yza567bcd890" \
-H "Authorization: Bearer $STERNDESK_API_KEY"
The extraction progresses through these statuses:
| Status | Description |
|---|---|
| DIRECT_UPLOAD_EXTRACTION_STATUS_CREATED | Upload received, processing queued |
| DIRECT_UPLOAD_EXTRACTION_STATUS_CONVERTED | Document converted, extraction in progress |
| DIRECT_UPLOAD_EXTRACTION_STATUS_STRUCTURED | Extraction complete, data available |
Once the status is DIRECT_UPLOAD_EXTRACTION_STATUS_STRUCTURED, the extracted data is available in the extractionOutput field:
{
"items": [
{
"id": "ext_7efg123hij456",
"status": "DIRECT_UPLOAD_EXTRACTION_STATUS_STRUCTURED",
"extractionOutput": {
"title": "Machine Learning Approaches for Document Classification",
"authors": ["Jane Smith", "John Doe"],
"abstract": "This paper presents novel approaches to...",
"keywords": ["machine learning", "NLP", "document classification"],
"publicationYear": 2024
}
}
]
}
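Polling can be wrapped in a small loop that stops once the status reaches DIRECT_UPLOAD_EXTRACTION_STATUS_STRUCTURED. The sketch below is one possible shape, not an official client: `wait_for_extraction_output` is a hypothetical helper, the attempt count and delay are arbitrary defaults, and `fetch_extractions` stands for any function that performs the GET request above and returns the parsed JSON. Since this guide lists no failure status, the loop only gives up on a timeout. The demo uses canned responses so it runs without network access.

```python
import time
from typing import Callable

STRUCTURED = "DIRECT_UPLOAD_EXTRACTION_STATUS_STRUCTURED"


def wait_for_extraction_output(
    fetch_extractions: Callable[[], dict],
    max_attempts: int = 30,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the first extraction is structured, then return its output."""
    for attempt in range(max_attempts):
        items = fetch_extractions().get("items", [])
        if items and items[0].get("status") == STRUCTURED:
            return items[0]["extractionOutput"]
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"extraction not structured after {max_attempts} attempts")


# Canned responses that walk through the three statuses in the table above.
canned = iter(
    [
        {"items": [{"status": "DIRECT_UPLOAD_EXTRACTION_STATUS_CREATED"}]},
        {"items": [{"status": "DIRECT_UPLOAD_EXTRACTION_STATUS_CONVERTED"}]},
        {"items": [{"status": STRUCTURED, "extractionOutput": {"publicationYear": 2024}}]},
    ]
)
output = wait_for_extraction_output(lambda: next(canned), sleep=lambda seconds: None)
print(output)  # prints {'publicationYear': 2024}
```

Passing `sleep` as a parameter keeps the loop testable without real delays.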
Next Steps