Quickstart

This guide walks you through uploading a document and extracting structured data using the Sterndesk API.

Prerequisites

Before you begin, ensure you have:

An API key for authenticating requests. See Authentication for setup instructions.
An organization and project configured. These are created by default when you initialize your account. If you need to create additional ones, see Organizations and Projects.

Set your API key as an environment variable:

export STERNDESK_API_KEY="your-api-key"
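If you script these steps rather than use curl, the same key can be read from the environment. The following is a minimal Python sketch, not an official client; the helper name `auth_headers` is hypothetical, and the header format is copied from the curl examples in this guide.

```python
import os


def auth_headers(env=os.environ):
    """Build the Authorization header from STERNDESK_API_KEY.

    `env` is injectable so the helper can be exercised without touching
    the real environment.
    """
    api_key = env.get("STERNDESK_API_KEY")
    if not api_key:
        raise RuntimeError("STERNDESK_API_KEY is not set")
    return {"Authorization": f"Bearer {api_key}"}
```

Failing early when the variable is missing avoids sending a request with an empty bearer token.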

Step 1: Get Your Organization ID

First, retrieve your organization ID by listing all organizations you have access to:

curl -X GET "https://api.sterndesk.com/r/organizations" \
  -H "Authorization: Bearer $STERNDESK_API_KEY"

{
  "items": [
    {
      "id": "org_2abc123def456",
      "name": "My Organization",
      "description": "Default organization",
      "createdAt": "2024-01-15T10:00:00Z",
      "updatedAt": "2024-01-15T10:00:00Z"
    }
  ]
}

Save the id value for the next step.
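As an illustration, the id can be pulled out of the response programmatically. This sketch assumes the response shape shown above (an `items` array of objects with an `id` field) and simply takes the first organization; the function name is hypothetical.

```python
import json


def first_organization_id(response_body: str) -> str:
    """Return the id of the first organization in a list response."""
    items = json.loads(response_body).get("items", [])
    if not items:
        raise LookupError("no organizations returned")
    return items[0]["id"]


# Sample body trimmed from the response shown above.
sample = '{"items": [{"id": "org_2abc123def456", "name": "My Organization"}]}'
print(first_organization_id(sample))  # prints org_2abc123def456
```

If your account belongs to several organizations, select by `name` instead of taking the first item.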

Step 2: Get Your Project ID

List the projects within your organization, passing the organization ID from Step 1 as the organizationId query parameter:

curl -X GET "https://api.sterndesk.com/r/projects?organizationId=org_2abc123def456" \
  -H "Authorization: Bearer $STERNDESK_API_KEY"

{
  "items": [
    {
      "id": "proj_3xyz789ghi012",
      "name": "Default Project",
      "organizationId": "org_2abc123def456"
    }
  ]
}
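When building this URL in code, it is safer to encode the query string than to concatenate it by hand. A minimal sketch, assuming the endpoint and parameter name from the curl example above; `projects_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Base URL as it appears in the curl examples in this guide.
API_BASE = "https://api.sterndesk.com/r"


def projects_url(organization_id: str) -> str:
    """Build the list-projects URL with a properly encoded query string."""
    query = urlencode({"organizationId": organization_id})
    return f"{API_BASE}/projects?{query}"
```

The id field of the returned project is the projectId used in the request bodies of Steps 3 and 4.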

Step 3: Create an Extraction Schema

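Define a schema that describes the structure of the data you want to extract, using the project id returned in Step 2 as projectId. Note that jsonSchema is passed as a JSON-encoded string, not as a nested object. This example creates a schema for research papers: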

curl -X POST "https://api.sterndesk.com/r/extraction-schemas" \
  -H "Authorization: Bearer $STERNDESK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "proj_3xyz789ghi012",
    "name": "research-paper",
    "jsonSchema": "{\"type\":\"object\",\"properties\":{\"title\":{\"type\":\"string\",\"description\":\"The title of the research paper\"},\"authors\":{\"type\":\"array\",\"items\":{\"type\":\"string\"},\"description\":\"List of author names\"},\"abstract\":{\"type\":\"string\",\"description\":\"The paper abstract or summary\"},\"keywords\":{\"type\":\"array\",\"items\":{\"type\":\"string\"},\"description\":\"Keywords or topics covered\"},\"publicationYear\":{\"type\":\"integer\",\"description\":\"Year of publication\"}},\"required\":[\"title\",\"authors\"]}"
  }'

{
  "id": "schema_4mno345pqr678",
  "name": "research-paper"
}

To learn more about defining schemas, see Extraction Schemas.
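Because jsonSchema is a string containing JSON, escaping it by hand is error-prone. The sketch below builds the same schema as a Python dictionary and serializes it twice: once for the jsonSchema field and once for the request body. The function name is hypothetical; the field names are taken from the curl example above.

```python
import json

# The research-paper schema from the example above, as a plain dictionary.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string", "description": "The title of the research paper"},
        "authors": {"type": "array", "items": {"type": "string"},
                    "description": "List of author names"},
        "abstract": {"type": "string", "description": "The paper abstract or summary"},
        "keywords": {"type": "array", "items": {"type": "string"},
                     "description": "Keywords or topics covered"},
        "publicationYear": {"type": "integer", "description": "Year of publication"},
    },
    "required": ["title", "authors"],
}


def schema_request_body(project_id: str, name: str, json_schema: dict) -> str:
    """Serialize the request body, embedding the schema as a JSON string."""
    return json.dumps({
        "projectId": project_id,
        "name": name,
        # Inner dumps produces the escaped string the API example expects.
        "jsonSchema": json.dumps(json_schema, separators=(",", ":")),
    })


body = schema_request_body("proj_3xyz789ghi012", "research-paper", schema)
```

The resulting `body` can be sent as the POST payload in place of the hand-escaped string.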

Step 4: Create an Upload Collector

Create an upload collector that uses your extraction schema by passing the schema id from Step 3 as directExtractionSchemaId. The collector configures how uploaded files will be processed:

curl -X POST "https://api.sterndesk.com/r/upload-collectors" \
  -H "Authorization: Bearer $STERNDESK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "proj_3xyz789ghi012",
    "name": "research-papers-collector",
    "strategy": "UPLOAD_STRATEGY_PUT",
    "directExtractionSchemaId": "schema_4mno345pqr678"
  }'

{
  "id": "coll_5stu901vwx234",
  "name": "research-papers-collector"
}

To learn more about collectors, see Collectors.
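For reference, the same request can be prepared with Python's standard library. This is a sketch only: it constructs the request object without sending it, the function name is hypothetical, and the endpoint and field names are copied from the curl example above.

```python
import json
import urllib.request


def collector_request(api_key: str, project_id: str, name: str,
                      schema_id: str) -> urllib.request.Request:
    """Prepare (but do not send) the create-upload-collector request."""
    body = {
        "projectId": project_id,
        "name": name,
        "strategy": "UPLOAD_STRATEGY_PUT",
        "directExtractionSchemaId": schema_id,
    }
    return urllib.request.Request(
        "https://api.sterndesk.com/r/upload-collectors",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; the id in the response is the uploadCollectorId needed in Step 5.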

Step 5: Create an Upload

Initiate an upload to get pre-signed URLs for your files. Specify the size of each file in bytes and an expiration duration (3600s, one hour, in this example):

curl -X POST "https://api.sterndesk.com/r/uploads" \
  -H "Authorization: Bearer $STERNDESK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "uploadCollectorId": "coll_5stu901vwx234",
    "files": [
      {"sizeBytes": 245678}
    ],
    "uploadExpiration": "3600s"
  }'

{
  "preSigns": [
    {
      "url": "https://storage.sterndesk.com/uploads/abc123?signature=xyz...",
      "strategy": "UPLOAD_STRATEGY_PUT"
    }
  ]
}

The response contains one pre-signed URL for each file you specified.

To learn more about pre-signed URLs and upload strategies, see Upload URLs.
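The sizeBytes values can be read from disk rather than typed in. A minimal sketch, assuming the request fields shown above; the helper name and its `expiration_seconds` parameter are hypothetical conveniences.

```python
import json
import os


def upload_request_body(collector_id: str, paths: list,
                        expiration_seconds: int = 3600) -> str:
    """Build the create-upload body with one files entry per local path."""
    return json.dumps({
        "uploadCollectorId": collector_id,
        # os.path.getsize returns the file size in bytes.
        "files": [{"sizeBytes": os.path.getsize(p)} for p in paths],
        # Duration string in the "<seconds>s" form used by the example.
        "uploadExpiration": f"{expiration_seconds}s",
    })
```

For example, `upload_request_body("coll_5stu901vwx234", ["research-paper.pdf"])` yields a body equivalent to the curl payload above, with the size taken from the actual file.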

Step 6: Upload Your File

Use the pre-signed URL to upload your file directly. For UPLOAD_STRATEGY_PUT, use an HTTP PUT request:

curl -X PUT "https://storage.sterndesk.com/uploads/abc123?signature=xyz..." \
  -H "Content-Type: application/pdf" \
  --data-binary @research-paper.pdf
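The equivalent upload in Python might look like the sketch below. It mirrors the curl command: the raw file bytes go in the request body, with no Authorization header, since the curl example relies on the signature in the URL. The function names are hypothetical.

```python
import urllib.request


def put_request(presigned_url: str, data: bytes,
                content_type: str = "application/pdf") -> urllib.request.Request:
    """Prepare an HTTP PUT carrying the raw file bytes."""
    return urllib.request.Request(
        presigned_url,
        data=data,
        headers={"Content-Type": content_type},
        method="PUT",
    )


def upload_file(presigned_url: str, path: str) -> int:
    """Send the file at `path` and return the HTTP status code."""
    with open(path, "rb") as f:
        req = put_request(presigned_url, f.read())
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Reading the whole file into memory is fine for small documents; for large files a streaming client would be a better fit.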

Step 7: Poll for Extraction Results

After uploading, Sterndesk automatically processes the file and extracts data according to your schema. Poll the extractions endpoint, filtering by the ID of your upload (upload_6yza567bcd890 in this example; substitute your own), to check the status:

curl -X GET "https://api.sterndesk.com/r/direct-upload-extractions?uploadId=upload_6yza567bcd890" \
  -H "Authorization: Bearer $STERNDESK_API_KEY"

The extraction progresses through these statuses:

| Status | Description |
| --- | --- |
| `DIRECT_UPLOAD_EXTRACTION_STATUS_CREATED` | Upload received, processing queued |
| `DIRECT_UPLOAD_EXTRACTION_STATUS_CONVERTED` | Document converted, extraction in progress |
| `DIRECT_UPLOAD_EXTRACTION_STATUS_STRUCTURED` | Extraction complete, data available |

Once the status is DIRECT_UPLOAD_EXTRACTION_STATUS_STRUCTURED, the extracted data is available in the extractionOutput field:

{
  "items": [
    {
      "id": "ext_7efg123hij456",
      "status": "DIRECT_UPLOAD_EXTRACTION_STATUS_STRUCTURED",
      "extractionOutput": {
        "title": "Machine Learning Approaches for Document Classification",
        "authors": ["Jane Smith", "John Doe"],
        "abstract": "This paper presents novel approaches to...",
        "keywords": ["machine learning", "NLP", "document classification"],
        "publicationYear": 2024
      }
    }
  ]
}
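A polling loop for this step could be sketched as follows. The interval, attempt limit, and function names are assumptions, not documented values; `fetch` is any callable that returns the parsed JSON response, so the loop can be exercised without network access. Only the three statuses listed above are handled; if the API defines failure statuses, the loop would need to check for them too.

```python
import json
import time
import urllib.request
from urllib.parse import urlencode

DONE = "DIRECT_UPLOAD_EXTRACTION_STATUS_STRUCTURED"


def fetch_extractions(api_key: str, upload_id: str) -> dict:
    """Call the extractions endpoint once and return the parsed JSON."""
    url = ("https://api.sterndesk.com/r/direct-upload-extractions?"
           + urlencode({"uploadId": upload_id}))
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_extraction(fetch, interval: float = 2.0, max_attempts: int = 30,
                        sleep=time.sleep) -> list:
    """Poll until every extraction item is STRUCTURED, then return the items."""
    for _ in range(max_attempts):
        items = fetch().get("items", [])
        if items and all(item["status"] == DONE for item in items):
            return items
        sleep(interval)  # wait before the next poll
    raise TimeoutError(f"extraction not finished after {max_attempts} attempts")
```

A typical call would be `wait_for_extraction(lambda: fetch_extractions(key, upload_id))`, after which each item's extractionOutput holds the structured data.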

Next Steps

Learn more about Extraction Schemas to define complex data structures
Explore Collectors for different data ingestion methods
Check out Extractions to understand the extraction pipeline
