
Quickstart

This guide walks you through uploading a document and extracting structured data using the Sterndesk API.

Prerequisites

Before you begin, ensure you have:
  • An API key for authenticating requests. See Authentication for setup instructions.
  • An organization and project configured. These are created by default when you initialize your account. If you need to create additional ones, see Organizations and Projects.
Set your API key as an environment variable:
export STERNDESK_API_KEY="your-api-key"

Step 1: Get Your Organization ID

First, retrieve your organization ID by listing all organizations you have access to:
curl -X GET "https://api.sterndesk.com/r/organizations" \
  -H "Authorization: Bearer $STERNDESK_API_KEY"
{
  "items": [
    {
      "id": "org_2abc123def456",
      "name": "My Organization",
      "description": "Default organization",
      "createdAt": "2024-01-15T10:00:00Z",
      "updatedAt": "2024-01-15T10:00:00Z"
    }
  ]
}
Save the id value for the next step.
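The same lookup can be scripted. The following is a minimal Python sketch using only the standard library; the helper names `build_get` and `first_item_id` are hypothetical and not part of any Sterndesk SDK, and the response shape is assumed to match the example above.

```python
import json
import os
import urllib.request

API_BASE = "https://api.sterndesk.com/r"  # base URL taken from the curl examples


def build_get(path: str) -> urllib.request.Request:
    """Build an authenticated GET request; nothing is sent until urlopen is called."""
    api_key = os.environ["STERNDESK_API_KEY"]
    return urllib.request.Request(
        f"{API_BASE}{path}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def first_item_id(body: str) -> str:
    """Return the id of the first entry in an {"items": [...]} response body."""
    items = json.loads(body).get("items", [])
    if not items:
        raise ValueError("response contains no items")
    return items[0]["id"]
```

To send the request, pass `build_get("/organizations")` to `urllib.request.urlopen` and feed the decoded response body to `first_item_id`. If you belong to several organizations, select the entry by name rather than taking the first one.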

Step 2: Get Your Project ID

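List the projects in your organization, passing the organization ID from Step 1 as the organizationId query parameter. You will need the project's id value in the steps that follow: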
curl -X GET "https://api.sterndesk.com/r/projects?organizationId=org_2abc123def456" \
  -H "Authorization: Bearer $STERNDESK_API_KEY"
{
  "items": [
    {
      "id": "proj_3xyz789ghi012",
      "name": "Default Project",
      "organizationId": "org_2abc123def456"
    }
  ]
}
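When building this URL in code, it is safer to let a library encode the query string than to concatenate it by hand. A small Python sketch, where `projects_url` is a hypothetical helper name:

```python
from urllib.parse import urlencode

API_BASE = "https://api.sterndesk.com/r"  # base URL taken from the curl examples


def projects_url(organization_id: str) -> str:
    """Return the project listing URL filtered by organization ID."""
    query = urlencode({"organizationId": organization_id})
    return f"{API_BASE}/projects?{query}"
```

The resulting URL can be requested with the same Authorization header used in the curl example.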

Step 3: Create an Extraction Schema

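Define a schema that describes the structure of the data you want to extract. The jsonSchema field is passed as a JSON-encoded string, which is why its inner quotes are escaped. This example creates a schema for research papers: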
curl -X POST "https://api.sterndesk.com/r/extraction-schemas" \
  -H "Authorization: Bearer $STERNDESK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "proj_3xyz789ghi012",
    "name": "research-paper",
    "jsonSchema": "{\"type\":\"object\",\"properties\":{\"title\":{\"type\":\"string\",\"description\":\"The title of the research paper\"},\"authors\":{\"type\":\"array\",\"items\":{\"type\":\"string\"},\"description\":\"List of author names\"},\"abstract\":{\"type\":\"string\",\"description\":\"The paper abstract or summary\"},\"keywords\":{\"type\":\"array\",\"items\":{\"type\":\"string\"},\"description\":\"Keywords or topics covered\"},\"publicationYear\":{\"type\":\"integer\",\"description\":\"Year of publication\"}},\"required\":[\"title\",\"authors\"]}"
  }'
{
  "id": "schema_4mno345pqr678",
  "name": "research-paper"
}
To learn more about defining schemas, see Extraction Schemas.
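Escaping the schema by hand is error-prone. One way to avoid it, sketched here in Python, is to write the schema as a normal dictionary and serialize it twice: once into the jsonSchema string and once for the request body. The field names follow the curl example above; `build_schema_request` is a hypothetical helper name.

```python
import json

# The same schema as in the curl example, written as a plain dictionary.
RESEARCH_PAPER_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string", "description": "The title of the research paper"},
        "authors": {
            "type": "array",
            "items": {"type": "string"},
            "description": "List of author names",
        },
        "abstract": {"type": "string", "description": "The paper abstract or summary"},
        "keywords": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Keywords or topics covered",
        },
        "publicationYear": {"type": "integer", "description": "Year of publication"},
    },
    "required": ["title", "authors"],
}


def build_schema_request(project_id: str, name: str, schema: dict) -> bytes:
    """Return the encoded request body with the schema embedded as a JSON string."""
    body = {
        "projectId": project_id,
        "name": name,
        "jsonSchema": json.dumps(schema),  # inner serialization: dict -> string
    }
    return json.dumps(body).encode("utf-8")  # outer serialization escapes the quotes
```

The returned bytes can be sent as the body of the POST request shown above, with the Content-Type header set to application/json.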

Step 4: Create an Upload Collector

Create an upload collector that uses your extraction schema. This configures how uploaded files will be processed:
curl -X POST "https://api.sterndesk.com/r/upload-collectors" \
  -H "Authorization: Bearer $STERNDESK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "proj_3xyz789ghi012",
    "name": "research-papers-collector",
    "strategy": "UPLOAD_STRATEGY_PUT",
    "directExtractionSchemaId": "schema_4mno345pqr678"
  }'
{
  "id": "coll_5stu901vwx234",
  "name": "research-papers-collector"
}
To learn more about collectors, see Collectors.
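The POST requests in this guide all share the same headers, so a single helper can build them. The sketch below assumes the request fields shown in the curl example; `build_post` and `collector_payload` are hypothetical names used for illustration.

```python
import json
import os
import urllib.request

API_BASE = "https://api.sterndesk.com/r"  # base URL taken from the curl examples


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated JSON POST request; nothing is sent until urlopen is called."""
    api_key = os.environ["STERNDESK_API_KEY"]
    return urllib.request.Request(
        f"{API_BASE}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def collector_payload(project_id: str, schema_id: str,
                      name: str = "research-papers-collector") -> dict:
    """Return the upload collector body used in this step."""
    return {
        "projectId": project_id,
        "name": name,
        "strategy": "UPLOAD_STRATEGY_PUT",
        "directExtractionSchemaId": schema_id,
    }
```

Passing `build_post("/upload-collectors", collector_payload(project_id, schema_id))` to `urllib.request.urlopen` sends the request; the id in the response is the collector ID used in Step 5.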

Step 5: Create an Upload

Initiate an upload to get pre-signed URLs for your files. Specify the size of each file in bytes and an expiration duration in seconds (3600s in this example, which is one hour):
curl -X POST "https://api.sterndesk.com/r/uploads" \
  -H "Authorization: Bearer $STERNDESK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "uploadCollectorId": "coll_5stu901vwx234",
    "files": [
      {"sizeBytes": 245678}
    ],
    "uploadExpiration": "3600s"
  }'
{
  "preSigns": [
    {
      "url": "https://storage.sterndesk.com/uploads/abc123?signature=xyz...",
      "strategy": "UPLOAD_STRATEGY_PUT"
    }
  ]
}
The response contains pre-signed URLs, one for each file you specified.
To learn more about pre-signed URLs and upload strategies, see Upload URLs.
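Rather than hard-coding sizeBytes, you can read each file's size from disk. A minimal Python sketch, assuming the request fields shown above; `build_upload_body` is a hypothetical helper name.

```python
import os


def build_upload_body(collector_id: str, paths: list[str],
                      expiration_seconds: int = 3600) -> dict:
    """Return the upload request body with one files entry per local file."""
    return {
        "uploadCollectorId": collector_id,
        # os.path.getsize returns the file size in bytes.
        "files": [{"sizeBytes": os.path.getsize(path)} for path in paths],
        "uploadExpiration": f"{expiration_seconds}s",
    }
```

Because the response contains one pre-signed URL per file, keeping the paths list in the same order as the files entries makes it straightforward to pair each URL with its file. That pairing assumes the URLs are returned in request order, which the example response does not confirm.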

Step 6: Upload Your File

Use the pre-signed URL to upload your file directly. For UPLOAD_STRATEGY_PUT, use an HTTP PUT request:
curl -X PUT "https://storage.sterndesk.com/uploads/abc123?signature=xyz..." \
  -H "Content-Type: application/pdf" \
  --data-binary @research-paper.pdf
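The equivalent PUT request can be built in Python with the standard library. As in the curl example, this sketch sends no Authorization header, since the pre-signed URL carries its own signature. `build_put` is a hypothetical helper name.

```python
import urllib.request


def build_put(presigned_url: str, file_path: str,
              content_type: str = "application/pdf") -> urllib.request.Request:
    """Build a PUT request that carries the raw file bytes as its body."""
    with open(file_path, "rb") as handle:
        data = handle.read()  # reads the whole file into memory
    return urllib.request.Request(
        presigned_url,
        data=data,
        headers={"Content-Type": content_type},
        method="PUT",
    )
```

Pass the result to `urllib.request.urlopen` to perform the upload. For very large files, a streaming approach would be preferable to reading the whole file into memory.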

Step 7: Poll for Extraction Results

After the upload completes, Sterndesk automatically processes the file and extracts data according to your schema. Poll the extractions endpoint, passing your upload ID as the uploadId query parameter, to check the status:
curl -X GET "https://api.sterndesk.com/r/direct-upload-extractions?uploadId=upload_6yza567bcd890" \
  -H "Authorization: Bearer $STERNDESK_API_KEY"
The extraction progresses through these statuses:
| Status | Description |
| --- | --- |
| DIRECT_UPLOAD_EXTRACTION_STATUS_CREATED | Upload received, processing queued |
| DIRECT_UPLOAD_EXTRACTION_STATUS_CONVERTED | Document converted, extraction in progress |
| DIRECT_UPLOAD_EXTRACTION_STATUS_STRUCTURED | Extraction complete, data available |
Once the status is DIRECT_UPLOAD_EXTRACTION_STATUS_STRUCTURED, the extracted data is available in the extractionOutput field:
{
  "items": [
    {
      "id": "ext_7efg123hij456",
      "status": "DIRECT_UPLOAD_EXTRACTION_STATUS_STRUCTURED",
      "extractionOutput": {
        "title": "Machine Learning Approaches for Document Classification",
        "authors": ["Jane Smith", "John Doe"],
        "abstract": "This paper presents novel approaches to...",
        "keywords": ["machine learning", "NLP", "document classification"],
        "publicationYear": 2024
      }
    }
  ]
}
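A polling loop for this step might look like the following Python sketch. It takes the HTTP call as a `fetch` callable that returns the parsed response, so the loop stays independent of any HTTP client. The function name, the five-second interval, and the five-minute timeout are assumptions for illustration. Only the three statuses listed above are handled: if the API reports failures through another status, the loop would simply run until it times out.

```python
import time
from typing import Callable

STRUCTURED = "DIRECT_UPLOAD_EXTRACTION_STATUS_STRUCTURED"


def wait_for_extraction(fetch: Callable[[], dict],
                        interval_seconds: float = 5.0,
                        timeout_seconds: float = 300.0,
                        sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch() until an item reaches the STRUCTURED status, then return it.

    Raises TimeoutError if no item is structured before the deadline.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        for item in fetch().get("items", []):
            if item.get("status") == STRUCTURED:
                return item
        if time.monotonic() >= deadline:
            raise TimeoutError("extraction did not finish before the timeout")
        sleep(interval_seconds)
```

In practice, `fetch` would issue the GET request shown above and return the decoded JSON body. The returned item's extractionOutput field then holds the extracted data.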

Next Steps