Webhooks Export and Payload Structure

Overview:

Nanonets can notify you when your files have been processed. By setting up a webhook, you configure Nanonets to send the exported data to a URL of your choice, so you receive real-time updates whenever a file is processed or approved and ready for export.

Steps to Set Up Export to Webhook:

  • Access Workflow:
    • From the Nanonets Extract Data page, navigate to the 'Workflow' section via the left panel.
  • Browse Export Options:
    • Click on “Browse all export options” to see all supported export methods.
    • Select "Webhooks" from the list of options.
  • Configure a Webhook:
    • Enter the URL: Input the endpoint where you want the exported data to be sent. This URL will act as the receiving end once a file is ready for export.
    • Name Your Webhook Integration: Provide a descriptive name for your webhook integration, such as "Webhook Test". This will help you easily identify it among multiple integrations.
    • Create the Webhook: Click the "Create Webhook" button to save and configure the webhook.
  • Manage Your Webhooks:
    • Add New Webhook Integration: To set up additional webhooks, click on "Add new Webhook Integration". This allows you to manage multiple endpoints.
    • Edit Existing Integrations: Navigate to "External Integration" in the left panel. Here you can edit existing integrations or add new ones.
  • Export Trigger:
    • Configure one or more export triggers according to your use case. Use the toggle to switch triggers on or off.
    • Nanonets supports the following export triggers:
      • On Inference: Automatically exports data as soon as the file is processed.
      • On All Validations Passing: Exports data only after all approval/validation stages have been passed.
      • On Approval: Exports data only if the file has been manually approved.
      • On Assignment: Exports data when the file is assigned or unassigned.
      • On Rejection: Exports data only if the file is manually rejected.
  • Export Type:
    • Select whether you want to export the data at the document level or at the page level. By default, data is exported at the page level.
  • Save and Test Configuration:
    • Once the webhook and the export settings are configured, test the export under the “Test this Action” section.
    • Click "Add Integration" to finalize the setup and apply the configurations.
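
To try the setup end to end, you need an endpoint that can accept the export. The sketch below is a minimal receiver built only on Python's standard library. It assumes the export arrives as an HTTP POST request with a JSON body that has a top-level "result" key, as in the sample payloads in this article. The names parse_webhook_body, WebhookHandler and serve, and the port 8000, are illustrative and not part of Nanonets.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook_body(raw: bytes) -> dict:
    """Decode a JSON request body; raise ValueError if it is not usable."""
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict) or "result" not in data:
        raise ValueError("expected a JSON object with a 'result' key")
    return data


class WebhookHandler(BaseHTTPRequestHandler):
    """Hypothetical receiver: answers 200 for valid bodies, 400 otherwise."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_webhook_body(self.rfile.read(length))
        except ValueError:
            # Covers invalid JSON, bad encoding, and a missing 'result' key.
            self.send_response(400)
            self.end_headers()
            return
        # Replace this print with your own processing.
        print("received export with keys:", sorted(payload))
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Listen on all interfaces until interrupted."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Call serve() to start listening, then enter the publicly reachable URL of this server as the webhook URL.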

How to add a conditional export?

  • Configure Conditions:
    • Click the "Add Conditions to Run" button to start configuring your conditional export.
    • Add one or multiple conditions based on the extracted data fields. You can combine these conditions using 'AND' and 'OR' operators.
  • Verify Export Setup:
    • Ensure your export settings are correctly configured before implementing the conditional export logic. You will see a small “Green Tick” near Export to Webhook.

Webhook JSON Payload:

Document Level:

The response is a dictionary: a set of key-value pairs enclosed in curly braces {}, where each key is unique.

  • result : This is the root object that contains all the output details from the exported file. It serves as the container for all the results and metadata associated with export.
  • message : This field provides a description of the execution result. "Success" indicates that the export was completed successfully without any errors.
  • input : Specifies the filename or input identifier of the processed file.
  • prediction : Contains an array of predicted results generated by the model. This array is empty if no predictions were made during processing. prediction is the same as moderated_boxes if the file is moderated, and the same as predicted_boxes otherwise.
  • predicted_boxes : This is a list of dictionaries in JSON format. predicted_boxes holds the final post-processing data (the data after the workflow blocks have run) produced once the file is uploaded, and it does not change afterwards. At that point moderated_boxes is empty; any later changes to the predicted fields or table data are recorded in moderated_boxes.
  • moderated_boxes : This is a list of dictionaries in JSON format. This field lists any modifications or adjustments made to the predicted bounding boxes after manual review or moderation.
  • id : This is the unique identifier of the file.
  • is_moderated : A boolean value indicating whether the predictions have undergone moderation. 'false' means no moderation has been applied.
  • model_id : Represents the specific model used for making predictions.
  • current_stage_id : Indicates the ID of the current stage in the processing pipeline. The value "ffffffff-ffff-ffff-ffff-ffffffffffff" signifies that the file has been approved. Other values correspond to approval stages, which are configured in the workflow section of the model. This helps in tracking the document's progress through the approval stages.
  • raw_ocr : This is a list of dictionaries in JSON format. raw_ocr holds the pre-processing data from the Python predictions, which never changes once the file has been uploaded.
  • approval_status : Indicates the current approval status of the exported file. For example, "approved" means that the file has been reviewed and approved. The value can also be "rejected" or empty, depending on the file's approval status.
{
    "result": {
        "message": "Success",
        "input": "filename.pdf",
        "prediction": [],
        "predicted_boxes": [],
        "moderated_boxes": [],
        "id": "8d04fc3d-ad8a-492e-9850-a94455ced3f5",
        "is_moderated": false,
        "model_id": "a14f417d-11d0-4e08-ad7a-7d3b382964e5",
        "current_stage_id": "ffffffff-ffff-ffff-ffff-ffffffffffff",
        "raw_ocr": [],
        "approval_status": "approved",
        "signed_urls": {},
        "file_url": "uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/RawPredictions/8d04fc3d-ad8a-492e-9850-a94455ced3f5.pdf",
        "file_metadata": {
            "duplicate": false,
            "delegated_history_users": null,
            "doc_matching_enabled": false,
            "doc_fully_matched": false,
            "file_moved_to_previous_stage": false
        }
    }
}
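
As a rough illustration of how a receiver might read the document-level fields described above, the sketch below condenses a payload into a few status flags. The function name summarize_document and the shape of its return value are assumptions made for this example; only the key names and the approved stage ID come from the payload description.

```python
# Stage ID that, per the field description above, marks an approved file.
APPROVED_STAGE_ID = "ffffffff-ffff-ffff-ffff-ffffffffffff"


def summarize_document(payload: dict) -> dict:
    """Pick the status-related fields out of a document-level payload."""
    result = payload["result"]
    return {
        "file_id": result.get("id"),
        "input": result.get("input"),
        "succeeded": result.get("message") == "Success",
        "approved": result.get("current_stage_id") == APPROVED_STAGE_ID,
        # approval_status may be empty; normalize that to None.
        "approval_status": result.get("approval_status") or None,
        "is_moderated": bool(result.get("is_moderated")),
    }
```

A receiver could, for instance, skip further processing when "approved" is False and wait for a later export.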
Each entry in the prediction, predicted_boxes, moderated_boxes, and raw_ocr lists can contain the following fields:

  • label: The label name, which corresponds to the field or table header as configured in the model.
  • ocr_text: The predicted value associated with the label.
  • status: Indicates whether the label was moderated. Possible values are "moderated" or "correctly_predicted".
  • type: Indicates whether the label is a field or a table header. Possible values are "field" and "table".
  • validation_status: The status of the label based on the validation rules configured in the workflow section. The value is "success" if all checks passed; other values indicate specific types of validation failure.
  • page: The page number in the document where the label is located, with 0 representing the first page and so on.
  • label_id: A unique identifier associated with each label as defined in the model.

Specific to cells in Tables

  • Description: Each table prediction may contain a cells array, listing the details for each cell within the table. Each cell's description includes position, text content, validation results, and identifiers, much like individual label predictions.
    • verification_status: Indicates the moderation status of a document or a specific field. It can take the values "moderated" if the item has been manually reviewed and modified as necessary, or "correctly_predicted" if the model's initial prediction was accurate and required no further moderation.
    • status (for cell): Represents the validation status of a cell within a table. The possible values are "failed" or null. "failed" indicates that the cell did not meet one or more validation criteria and was therefore flagged during processing. A null value indicates that all validations passed successfully. Note that the sample payloads in this article also show "success" and an empty string for this field.
    • failed_validation: This field provides a specific reason for the validation failure of a cell. It is only populated if the status for the cell is "failed." For instance, "Content Length is greater than or equal to 2" could be a reason indicating that the cell content did not meet the required content length criteria set by the validation rules.
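
The cell fields above are enough to rebuild a table row by row. The following sketch groups cells by their row number and lists the cells that failed validation. The helper names table_rows and failed_cells are made up for this example, and it assumes each column label appears at most once per row.

```python
def table_rows(table: dict) -> list:
    """Return one {column label: text} dict per table row, in row order."""
    rows = {}
    cells = sorted(table.get("cells", []), key=lambda c: (c["row"], c["col"]))
    for cell in cells:
        rows.setdefault(cell["row"], {})[cell["label"]] = cell["text"]
    return [rows[r] for r in sorted(rows)]


def failed_cells(table: dict) -> list:
    """Return (row, col, reason) for every cell whose status is "failed"."""
    return [
        (c["row"], c["col"], c.get("failed_validation", ""))
        for c in table.get("cells", [])
        if c.get("status") == "failed"
    ]
```

Pass either function one entry of type "table" taken from prediction, predicted_boxes or moderated_boxes.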

Prediction:

This is a list of dictionaries in JSON format. Each dictionary in the list represents a distinct entity (such as a detected field or table), with attributes such as label, ocr_text, status, label_id, and type. prediction is the same as moderated_boxes if the file is moderated, and the same as predicted_boxes otherwise.

"prediction": [{
    "id": "4f8d39b3-f4fc-4b41-99f4-a2ae7823ca75",
    "label": "subtax_name",
    "xmin": 2107,
    "ymin": 2454,
    "xmax": 2231,
    "ymax": 2489,
    "score": 0,
    "ocr_text": "CGST",
    "status": "moderated",
    "type": "field",
    "validation_status": "success",
    "page": 0,
    "label_id": "2e6a2d0f-04e5-4e4d-92df-91d331833300",
    "lookup_edited": false,
    "lookup_parent_id": ""
}, {
    "id": "2911d874-4ec9-4b8c-926d-c3f365698aa9",
    "label": "table",
    "xmin": 379,
    "ymin": 1716,
    "xmax": 3433,
    "ymax": 3953,
    "score": 0,
    "ocr_text": "table",
    "status": "moderated",
    "type": "table",
    "cells": [{
        "id": "c1625595-da33-4319-942f-a7b903d0e7cf",
        "row": 1,
        "col": 1,
        "row_span": 0,
        "col_span": 0,
        "label": "Description",
        "xmin": 507,
        "ymin": 1716,
        "xmax": 655,
        "ymax": 1747,
        "score": 0,
        "text": "Mud Pie",
        "row_label": "",
        "verification_status": "moderated",
        "status": "success",
        "failed_validation": "",
        "label_id": "9285fcc5-26be-4d6b-9989-f0797463a1fc",
        "lookup_edited": false
    }],
    "validation_status": "success",
    "page": 0,
    "label_id": "",
    "lookup_edited": false,
    "lookup_parent_id": ""
}]
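
In practice, a receiver often just needs the extracted values keyed by label. The sketch below collects the ocr_text of every entry of type "field" from a prediction list. The function name extract_fields is hypothetical, and values are gathered into lists on the assumption that a label may appear more than once in a file.

```python
def extract_fields(prediction: list) -> dict:
    """Map each field label to the list of values predicted for it."""
    fields = {}
    for item in prediction:
        # Tables are skipped here; their data lives in the "cells" array.
        if item.get("type") == "field":
            fields.setdefault(item["label"], []).append(item.get("ocr_text", ""))
    return fields
```

For the sample above, this yields a single key, "subtax_name", whose list holds the value "CGST".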

Predicted Box:

This is a list of dictionaries in JSON format. Each dictionary in the list represents a distinct entity (such as a detected field or table), with attributes such as label, ocr_text, status, label_id, and type. predicted_boxes holds the final post-processing data (the data after the workflow blocks have run) produced once the file is uploaded, and it does not change afterwards.

"predicted_boxes": [{
    "id": "e717ac86-be16-441d-b27b-ada993d7b240",
    "label": "buyer_name",
    "xmin": 739,
    "ymin": 586,
    "xmax": 975,
    "ymax": 622,
    "score": 0,
    "ocr_text": "aryan's place",
    "status": "correctly_predicted",
    "type": "field",
    "validation_status": "failed",
    "validation_message": "Content Length is greater than or equal to 2",
    "page": 0,
    "label_id": "ab2e71b5-1f86-4a80-a2a8-452a1927efa2",
    "lookup_edited": false,
    "lookup_parent_id": ""
}, {
    "id": "f4b72405-bf55-4da0-9e75-cb024825289a",
    "label": "table",
    "xmin": 379,
    "ymin": 1716,
    "xmax": 3433,
    "ymax": 3953,
    "score": 0,
    "ocr_text": "table",
    "status": "correctly_predicted",
    "type": "table",
    "cells": [{
        "id": "b2d69469-c9a6-4820-9acd-ed85674a9aa9",
        "row": 1,
        "col": 1,
        "row_span": 0,
        "col_span": 0,
        "label": "Description",
        "xmin": 507,
        "ymin": 1716,
        "xmax": 655,
        "ymax": 1747,
        "score": 0,
        "text": "Mud Pie",
        "row_label": "",
        "verification_status": "correctly_predicted",
        "status": "",
        "failed_validation": "",
        "label_id": "9285fcc5-26be-4d6b-9989-f0797463a1fc",
        "lookup_edited": false
    }],
    "page": 0,
    "label_id": "",
    "lookup_edited": false,
    "lookup_parent_id": ""
}]
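
The buyer_name entry in the sample above carries a failed validation_status and a validation_message. One possible way to surface such entries is sketched below. The function name validation_failures is an assumption, and the sketch treats a missing or empty validation_status as "not evaluated" rather than as a failure, since some entries in the samples omit the key.

```python
def validation_failures(boxes: list) -> list:
    """List the field entries whose validation_status signals a failure."""
    failures = []
    for box in boxes:
        status = box.get("validation_status")
        # Anything present and different from "success" counts as a failure.
        if box.get("type") == "field" and status and status != "success":
            failures.append({
                "label": box["label"],
                "value": box.get("ocr_text", ""),
                "message": box.get("validation_message", ""),
            })
    return failures
```

The same helper can be applied to prediction or moderated_boxes, since their entries share this structure.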

Moderated Box:

This is a list of dictionaries in JSON format. This field lists any modifications or adjustments made to the predicted bounding boxes after manual review or moderation.

"moderated_boxes": [{
			"id": "e569becc-0379-4a05-b777-fa3d4b79f300",
			"label": "currency",
			"xmin": 2951,
			"ymin": 2834,
			"xmax": 3103,
			"ymax": 2881,
			"score": 0,
			"ocr_text": "Rupees",
			"status": "moderated",
			"type": "field",
			"validation_status": "success",
			"page": 0,
			"label_id": "7d1990b2-eaa6-4f77-a306-7190129df5a4",
			"lookup_edited": false,
			"lookup_parent_id": ""
		},{
			"id": "f4b72405-bf55-4da0-9e75-cb024825289a",
			"label": "table",
			"xmin": 379,
			"ymin": 1716,
			"xmax": 3433,
			"ymax": 3953,
			"score": 0,
			"ocr_text": "table",
			"status": "moderated",
			"type": "table",
			"cells": [{
				"id": "b2d69469-c9a6-4820-9acd-ed85674a9aa9",
				"row": 1,
				"col": 1,
				"row_span": 0,
				"col_span": 0,
				"label": "Description",
				"xmin": 507,
				"ymin": 1716,
				"xmax": 655,
				"ymax": 1747,
				"score": 0,
				"text": "Mud Pie",
				"row_label": "",
				"verification_status": "moderated",
				"status": "success",
				"failed_validation": "",
				"label_id": "9285fcc5-26be-4d6b-9989-f0797463a1fc",
				"lookup_edited": false
			}],
			"validation_status": "success",
			"page": 0,
			"label_id": "",
			"lookup_edited": false,
			"lookup_parent_id": ""
		}]
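
The prediction field is described as equal to moderated_boxes for moderated files and to predicted_boxes otherwise. If you want to apply the same rule yourself, for example to cross-check a payload, a small sketch could look like this. Both function names are illustrative only.

```python
def effective_boxes(result: dict) -> list:
    """Return moderated_boxes for moderated files, predicted_boxes otherwise."""
    key = "moderated_boxes" if result.get("is_moderated") else "predicted_boxes"
    return result.get(key) or []


def moderated_labels(boxes: list) -> list:
    """Labels of the top-level entries whose status is "moderated"."""
    return [box["label"] for box in boxes if box.get("status") == "moderated"]
```

Here result is the object under the "result" key of a document-level payload, or one page entry of a page-level payload.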

Raw OCR:

The raw_ocr contains initial data extracted from the file, and this data does not change after the file is uploaded.

"raw_ocr": [{
          "id": "9966220b-da6e-43f0-b684-5398c1eb4f16",
          "label": "seller_address",
          "xmin": 753,
          "ymin": 3368,
          "xmax": 2946,
          "ymax": 3412,
          "score": 0,
          "ocr_text": "101 , Marwah House , Krishna Lal , Marwah marg , Andheri East , Mumbai , Mumbai City , Maharashtra , 400072",
          "status": "correctly_predicted",
          "type": "field",
          "page": 0,
          "label_id": "701dc43b-8800-481b-9ed9-6d93f72d852e",
          "lookup_edited": false,
          "lookup_parent_id": ""
      }, {
          "id": "f4b72405-bf55-4da0-9e75-cb024825289a",
          "label": "table",
          "xmin": 379,
          "ymin": 1716,
          "xmax": 3433,
          "ymax": 3953,
          "score": 0,
          "ocr_text": "table",
          "status": "correctly_predicted",
          "type": "table",
          "cells": [{
              "id": "b2d69469-c9a6-4820-9acd-ed85674a9aa9",
              "row": 1,
              "col": 1,
              "row_span": 0,
              "col_span": 0,
              "label": "Description",
              "xmin": 507,
              "ymin": 1716,
              "xmax": 655,
              "ymax": 1747,
              "score": 0,
              "text": "Mud Pie",
              "row_label": "",
              "verification_status": "correctly_predicted",
              "status": "",
              "failed_validation": "",
              "label_id": "9285fcc5-26be-4d6b-9989-f0797463a1fc",
              "lookup_edited": false
            }],
          "page": 0,
          "label_id": "",
          "lookup_edited": false,
          "lookup_parent_id": ""
      }
  ]

Page Level:

The result field in your JSON is a list containing one or more objects. Each object can hold details about a specific result.

"result": [{
    "message": "Success",
    "result": [...]
}]

  • The outermost {} encloses the entire JSON object.
  • The result key holds a list (denoted by []) that contains dictionaries. Each dictionary represents a different result at the page level.
  • Inside each dictionary, there are further dictionaries (size, file_metadata) and lists (prediction, predicted_boxes, moderated_boxes, raw_ocr), which organize and store various data elements in a structured manner.
{
	"result": [{
		"message": "Success",
		"result": [{
			"message": "success",
			"input": "filename.pdf",
			"prediction": [],
			"predicted_boxes": [],
			"moderated_boxes": [],
			"custom_response": null,
			"page": 0,
			"day_since_epoch": 19924,
			"hour_of_day": 10,
			"request_file_id": "8d04fc3d-ad8a-492e-9850-a94455ced3f5",
			"filepath": "uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/PredictionImages/487d1743-d7f0-49c2-ab40-61f12c973e1c-1.jpeg",
			"id": "fd8fe6e0-4680-11ef-b422-2e36ad83cb4c",
			"is_moderated": true,
			"rotation": 0,
			"updated_at": "41d031af-46d4-11ef-be47-3eabac7b9f26",
			"model_id": "a14f417d-11d0-4e08-ad7a-7d3b382964e5",
			"url": "uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/PredictionImages/487d1743-d7f0-49c2-ab40-61f12c973e1c-1.jpeg",
			"size": {
				"width": 3508,
				"height": 4961
			},
			"original_file_name": "filename.pdf",
			"source": "api",
			"no_of_fields": 28,
			"status": "success",
			"current_stage_id": "ffffffff-ffff-ffff-ffff-ffffffffffff",
			"file_url": "uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/RawPredictions/8d04fc3d-ad8a-492e-9850-a94455ced3f5.pdf",
			"raw_ocr": [],
			"approval_status": "approved",
			"file_metadata": {
				"duplicate": false,
				"delegated_history_users": null,
				"doc_matching_enabled": false,
				"doc_fully_matched": false,
				"file_moved_to_previous_stage": false
			},
			"training_page_id": ""
		}],
		"signed_urls": {
			"uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/PredictionImages/487d1743-d7f0-49c2-ab40-61f12c973e1c-1.jpeg": {
				"original": "https://images.nanonets.com/tr:rt-0,true/uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/PredictionImages/487d1743-d7f0-49c2-ab40-61f12c973e1c-1.jpeg?ik-s=1042ec02acf6869ce91242afbffb66704d8a6f9a\u0026ik-t=1721521985",
				"original_compressed": "https://images.nanonets.com/tr:rt-0/uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/PredictionImages/487d1743-d7f0-49c2-ab40-61f12c973e1c-1.jpeg?ik-s=4120c4ff09436c269af0a78b29fcba4f81c025d7\u0026ik-t=1721521985",
				"thumbnail": "https://images.nanonets.com/tr:w-240/uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/PredictionImages/487d1743-d7f0-49c2-ab40-61f12c973e1c-1.jpeg?ik-s=47d3244b9f39c82bb9e19036bfc8cf2d2d1f4827\u0026ik-t=1721521985",
				"acw_rotate_90": "https://images.nanonets.com/tr:rt-270/uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/PredictionImages/487d1743-d7f0-49c2-ab40-61f12c973e1c-1.jpeg?ik-s=e83ec655fff78eb22cf1e5da4886d41e68ea7eb1\u0026ik-t=1721521985",
				"acw_rotate_180": "https://images.nanonets.com/tr:rt-180/uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/PredictionImages/487d1743-d7f0-49c2-ab40-61f12c973e1c-1.jpeg?ik-s=c9aeca5e90ea81255478e02446cfabf327c43f44\u0026ik-t=1721521985",
				"acw_rotate_270": "https://images.nanonets.com/tr:rt-90/uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/PredictionImages/487d1743-d7f0-49c2-ab40-61f12c973e1c-1.jpeg?ik-s=dd3d58799379f4e3339ed08f1caade4e8022f20a\u0026ik-t=1721521985",
				"original_with_long_expiry": "https://images.nanonets.com/tr:rt-0/uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/PredictionImages/487d1743-d7f0-49c2-ab40-61f12c973e1c-1.jpeg?ik-s=a02eab0dab9498b9e75a232c2fa1d4bc1a81ebeb\u0026ik-t=1737059585"
			},
			"uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/RawPredictions/8d04fc3d-ad8a-492e-9850-a94455ced3f5.pdf": {
				"original": "https://nanonets.s3.us-west-2.amazonaws.com/uploadedfiles/a14f417d-11d0-4e08-ad7a-7d3b382964e5/RawPredictions/8d04fc3d-ad8a-492e-9850-a94455ced3f5.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256\u0026X-Amz-Credential=AKIA5F4WPNNTLX3QHN4W%2F20240720%2Fus-west-2%2Fs3%2Faws4_request\u0026X-Amz-Date=20240720T203305Z\u0026X-Amz-Expires=604800\u0026X-Amz-SignedHeaders=host\u0026response-cache-control=no-cache\u0026X-Amz-Signature=a103bfebfeedffc337d8d6b2ceab6331f206e852108be433ee073a243f26b610",
				"original_compressed": "",
				"thumbnail": "",
				"acw_rotate_90": "",
				"acw_rotate_180": "",
				"acw_rotate_270": "",
				"original_with_long_expiry": ""
			}
		}
	}]
}
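
Because the page-level payload nests one result list inside another, a small generator can make it easier to walk. The sketch below yields every page together with the "original" signed URL registered for its filepath, following the structure of the sample above. The name iter_pages is hypothetical, and an empty string is returned when no signed URL is present.

```python
def iter_pages(payload: dict):
    """Yield (page, original_signed_url) for every page in the payload."""
    for group in payload.get("result", []):
        signed = group.get("signed_urls") or {}
        for page in group.get("result", []):
            # signed_urls is keyed by the page's filepath in the sample.
            urls = signed.get(page.get("filepath"), {})
            yield page, urls.get("original", "")
```

Each yielded page is a dictionary with the keys shown in the sample, such as page, prediction and approval_status.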

Frequently Asked Questions (FAQ)

What is a Webhook URL?

  • A webhook URL is provided by the receiving application and acts like a phone number that the other application can call when an event happens.

How to generate a Webhook URL for testing?

  • Use a service like RequestBin, Webhook.site, or PostBin to create a URL that can capture webhook requests. These services provide temporary URLs that display the requests they receive.
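  • For example, Webhook.site generates a unique URL when you open the site. Use that URL as the webhook URL in Nanonets, then watch the incoming requests on the Webhook.site page.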
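
Before pointing Nanonets at your own endpoint, it can help to check that the endpoint accepts a payload of the documented shape. The sketch below builds such a test request with Python's standard library. It only simulates an export: the POST method and the Content-Type header are assumptions about how the export is delivered, and build_test_request is a name invented for this example.

```python
import json
import urllib.request


def build_test_request(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

To actually send the request, pass the returned object to urllib.request.urlopen, using the URL of your test endpoint.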