Overview

Introduction

Nanonets provides an AI-driven Intelligent Document Processing API that transforms unstructured documents into structured data. Our advanced OCR and document data extraction capabilities enable you to:

  • Extract structured data from various document types (invoices, receipts, forms, etc.)
  • Convert unstructured text into organized, machine-readable formats
  • Process and analyze document content with high accuracy
  • Automate document workflows and data entry tasks

Key Features

  • Advanced OCR & Data Extraction: Extract text, fields, and tables from documents with high accuracy
  • Unstructured to Structured Data: Transform raw document content into organized, structured formats
  • Workflow Automation: Approve or reject extracted results and assign files for review
  • External Integrations: Seamlessly import documents from various sources and export data to business applications

Getting Started

This guide walks through the Nanonets API basics: creating a workflow, configuring the fields and tables to extract, and processing a document.

Prerequisites

  1. A Nanonets account
  2. An API key (get it from https://app.nanonets.com/#/keys)
  3. Basic knowledge of REST APIs
  4. Your preferred programming language (Python, JavaScript, etc.)
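The examples in this guide authenticate with HTTP Basic Auth, passing the API key as the username and an empty password. As a rough illustration of what that produces on the wire, the sketch below builds the equivalent Authorization header by hand. The helper name basic_auth_header is hypothetical; in practice an HTTP client normally does this for you.

```python
import base64


def basic_auth_header(api_key: str) -> dict:
    """Build an HTTP Basic Auth header with the API key as username and an empty password."""
    # Basic Auth encodes "username:password"; the password part is empty here.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder key, for illustration only.
print(basic_auth_header("YOUR_API_KEY"))
```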

Quick Start with the REST API

1. Create Instant Learning Workflow

Python

import requests
from requests.auth import HTTPBasicAuth

API_KEY = 'YOUR_API_KEY'
url = "https://app.nanonets.com/api/v4/workflows"

# Create instant learning workflow (default)
payload = {
    "description": "Extract data from custom documents",
    "workflow_type": ""  # Empty string for instant learning workflow
}

response = requests.post(
    url,
    json=payload,
    auth=HTTPBasicAuth(API_KEY, '')
)
print(response.json())

JavaScript

const axios = require('axios');

const API_KEY = 'YOUR_API_KEY';
const url = "https://app.nanonets.com/api/v4/workflows";

// Create instant learning workflow (default)
const payload = {
  description: "Extract data from custom documents",
  workflow_type: ""  // Empty string for instant learning workflow
};

axios.post(url, payload, {
  auth: {
    username: API_KEY,
    password: ''
  }
})
.then(response => {
  console.log(response.data);
})
.catch(error => {
  console.error(error);
});

cURL

curl -X POST \
  -u "YOUR_API_KEY:" \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Extract data from custom documents",
    "workflow_type": ""
  }' \
  https://app.nanonets.com/api/v4/workflows
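
The next steps need the ID of the workflow created here. The sketch below shows one way to pull it out of the parsed response. It assumes the response body is a JSON object with a top-level "id" key, which is inferred from the workflow['id'] usage in the following steps rather than from a documented response schema. The helper name and the sample body are hypothetical.

```python
def workflow_id_from_response(body: dict) -> str:
    """Return the workflow ID from a parsed create-workflow response.

    Assumes the ID sits under a top-level "id" key (an assumption, not a
    documented schema).
    """
    workflow_id = body.get("id")
    if not workflow_id:
        raise ValueError(f"No 'id' found in workflow response: {body!r}")
    return str(workflow_id)


# Hypothetical response body, for illustration only.
sample_body = {"id": "wf-123", "description": "Extract data from custom documents"}
print(workflow_id_from_response(sample_body))
```

With the Python request above, this would typically be called as workflow_id_from_response(response.json()) after response.raise_for_status() confirms the request succeeded.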

2. Configure Fields and Tables to Extract

Python

import requests
from requests.auth import HTTPBasicAuth

API_KEY = 'YOUR_API_KEY'
WORKFLOW_ID = 'YOUR_WORKFLOW_ID'  # Replace with the workflow ID returned in step 1
url = f"https://app.nanonets.com/api/v4/workflows/{WORKFLOW_ID}/fields"

payload = {
    "fields": [
        {"name": "invoice_number"},
        {"name": "total_amount"},
        {"name": "invoice_date"}
    ],
    "table_headers": [
        {"name": "item_description"},
        {"name": "quantity"},
        {"name": "unit_price"},
        {"name": "total"}
    ]
}

response = requests.put(
    url,
    json=payload,
    auth=HTTPBasicAuth(API_KEY, '')
)
print(response.json())

JavaScript

const axios = require('axios');

const API_KEY = 'YOUR_API_KEY';
const WORKFLOW_ID = 'YOUR_WORKFLOW_ID';  // Replace with the workflow ID returned in step 1
const url = `https://app.nanonets.com/api/v4/workflows/${WORKFLOW_ID}/fields`;

const payload = {
  fields: [
    { name: "invoice_number" },
    { name: "total_amount" },
    { name: "invoice_date" }
  ],
  table_headers: [
    { name: "item_description" },
    { name: "quantity" },
    { name: "unit_price" },
    { name: "total" }
  ]
};

axios.put(url, payload, {
  auth: {
    username: API_KEY,
    password: ''
  }
})
.then(response => {
  console.log(response.data);
})
.catch(error => {
  console.error(error);
});

cURL

curl -X PUT \
  -u "YOUR_API_KEY:" \
  -H "Content-Type: application/json" \
  -d '{
    "fields": [
      { "name": "invoice_number" },
      { "name": "total_amount" },
      { "name": "invoice_date" }
    ],
    "table_headers": [
      { "name": "item_description" },
      { "name": "quantity" },
      { "name": "unit_price" },
      { "name": "total" }
    ]
  }' \
  https://app.nanonets.com/api/v4/workflows/YOUR_WORKFLOW_ID/fields
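
If the field and table header names live in plain lists (for example, loaded from a config file), the request body shown above can be generated rather than written by hand. This is a minimal sketch; build_fields_payload is a hypothetical helper, and it only reproduces the payload shape used in this step.

```python
def build_fields_payload(field_names, table_header_names):
    """Build the step 2 request body from plain lists of names."""
    return {
        "fields": [{"name": name} for name in field_names],
        "table_headers": [{"name": name} for name in table_header_names],
    }


payload = build_fields_payload(
    ["invoice_number", "total_amount", "invoice_date"],
    ["item_description", "quantity", "unit_price", "total"],
)
print(payload)
```

The resulting dictionary can be passed as json=payload to the PUT request in the Python example.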

3. Process Document

Python

import requests
from requests.auth import HTTPBasicAuth

API_KEY = 'YOUR_API_KEY'
WORKFLOW_ID = 'YOUR_WORKFLOW_ID'  # Replace with the workflow ID returned in step 1
url = f"https://app.nanonets.com/api/v4/workflows/{WORKFLOW_ID}/documents"

# Process a document (the with-block closes the file once the upload finishes)
with open('invoice.pdf', 'rb') as f:
    response = requests.post(url, files={'file': f}, auth=HTTPBasicAuth(API_KEY, ''))
result = response.json()

# Get results
if result['status'] == 'completed':
    # Access extracted fields
    invoice_number = result['data']['fields']['invoice_number'][0]['value']
    total_amount = result['data']['fields']['total_amount'][0]['value']
    invoice_date = result['data']['fields']['invoice_date'][0]['value']
    print(f"Invoice Number: {invoice_number}")
    print(f"Total Amount: {total_amount}")
    print(f"Invoice Date: {invoice_date}")
    
    # Access extracted tables
    for table in result['data']['tables']:
        print(f"\nTable: {table['name']}")
        for cell in table['cells']:
            print(f"Row {cell['row']}, Col {cell['col']}: {cell['text']}")

JavaScript

const axios = require('axios');
const FormData = require('form-data');
const fs = require('fs');

const API_KEY = 'YOUR_API_KEY';
const WORKFLOW_ID = 'YOUR_WORKFLOW_ID';  // Replace with the workflow ID returned in step 1
const url = `https://app.nanonets.com/api/v4/workflows/${WORKFLOW_ID}/documents`;

// Process a document
const formData = new FormData();
formData.append('file', fs.createReadStream('invoice.pdf'));

axios.post(url, formData, {
    auth: {
        username: API_KEY,
        password: ''
    },
    headers: {
        ...formData.getHeaders()
    }
})
.then(response => {
    const result = response.data;
    
    if (result.status === 'completed') {
        // Access extracted fields
        const invoiceNumber = result.data.fields['invoice_number'][0].value;
        const totalAmount = result.data.fields['total_amount'][0].value;
        const invoiceDate = result.data.fields['invoice_date'][0].value;
        console.log(`Invoice Number: ${invoiceNumber}`);
        console.log(`Total Amount: ${totalAmount}`);
        console.log(`Invoice Date: ${invoiceDate}`);
        
        // Access extracted tables
        for (const table of result.data.tables) {
            console.log(`\nTable: ${table.name}`);
            for (const cell of table.cells) {
                console.log(`Row ${cell.row}, Col ${cell.col}: ${cell.text}`);
            }
        }
    }
})
.catch(error => {
    console.error(error);
});

cURL

curl -X POST \
  "https://app.nanonets.com/api/v4/workflows/YOUR_WORKFLOW_ID/documents" \
  -u "YOUR_API_KEY:" \
  -F "file=@invoice.pdf"
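
The examples above index directly into the result, which raises an error if a field was not extracted. The sketch below reads the same structure more defensively and regroups table cells into rows. It assumes the response shape implied by the examples (data.fields mapping each name to a list of objects with a value key, and data.tables holding cells with row, col, and text); the helper names and the sample data are hypothetical.

```python
from collections import defaultdict


def first_field_value(result: dict, name: str):
    """Return the first extracted value for a field, or None if it is absent."""
    entries = result.get("data", {}).get("fields", {}).get(name) or []
    return entries[0].get("value") if entries else None


def table_rows(table: dict) -> list:
    """Group a table's cells into rows of text, ordered by row then column."""
    rows = defaultdict(dict)
    for cell in table.get("cells", []):
        rows[cell["row"]][cell["col"]] = cell["text"]
    return [[rows[r][c] for c in sorted(rows[r])] for r in sorted(rows)]


# Hypothetical result, shaped like the responses read in the examples above.
sample_result = {
    "status": "completed",
    "data": {
        "fields": {"invoice_number": [{"value": "INV-001"}]},
        "tables": [{
            "name": "line_items",
            "cells": [
                {"row": 0, "col": 0, "text": "Widget"},
                {"row": 0, "col": 1, "text": "2"},
                {"row": 1, "col": 0, "text": "Gadget"},
                {"row": 1, "col": 1, "text": "5"},
            ],
        }],
    },
}

print(first_field_value(sample_result, "invoice_number"))
print(first_field_value(sample_result, "total_amount"))
print(table_rows(sample_result["data"]["tables"][0]))
```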

Best Practices

  1. Error Handling

    • Always check response status codes
    • Implement retry logic for rate limits
  2. Security

    • Store API keys securely
    • Use environment variables
  3. Performance

    • Use async processing for large files
    • Monitor API usage
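
The error handling and security points above can be sketched as follows. This is an illustrative sketch under stated assumptions, not an official client: the environment variable name NANONETS_API_KEY and both helper names are hypothetical, and it assumes rate limiting is signaled with the standard HTTP 429 status code, which this guide does not confirm. The retry helper takes any zero-argument callable that returns an object with a status_code attribute, so it can wrap a requests call without depending on a specific HTTP library.

```python
import os
import time


def load_api_key(env_var: str = "NANONETS_API_KEY") -> str:
    """Read the API key from an environment variable (the name is an assumption)."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable")
    return api_key


def send_with_retry(send, max_attempts: int = 4, base_delay: float = 1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out.

    Assumes rate limits are reported as HTTP 429. Returns the last response
    either way, so the caller can still inspect its status code.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429 or attempt == max_attempts - 1:
            return response
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * (2 ** attempt))


# Example usage with requests (commented out because it needs a real key and network):
# import requests
# from requests.auth import HTTPBasicAuth
# api_key = load_api_key()
# response = send_with_retry(lambda: requests.post(
#     "https://app.nanonets.com/api/v4/workflows",
#     json={"description": "Extract data from custom documents", "workflow_type": ""},
#     auth=HTTPBasicAuth(api_key, ""),
# ))
# response.raise_for_status()
```

Passing the sleep function as a parameter keeps the backoff easy to test without waiting on real delays.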

Quick Links