Zero Shot Model: Best Practices
Overview:
Nanonets’ zero-shot models do not require any initial training data to start extracting information effectively. Instead, they rely on the prompts or descriptions you provide for each label or table header in the Manage Label section, and use these prompts to extract data from the uploaded documents. Clear, specific prompts are therefore essential.
What is the difference between Fields and Table Headers?
- Fields: A field is assumed to have a single value on a given page or within a specific document.
- Table Headers: A table header can have multiple values per page or document.
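To make the distinction concrete, here is a minimal sketch of how extracted results could be held in memory. The `ExtractedDocument` class, its attribute names, and the sample labels (`invoice_number`, `item_description`, `quantity`) are hypothetical illustrations and do not represent the actual Nanonets response format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ExtractedDocument:
    """Hypothetical container, not the Nanonets API response schema."""

    # A field holds exactly one value per page or document.
    fields: Dict[str, str] = field(default_factory=dict)
    # Table headers can repeat: each row maps header name -> cell value.
    table_rows: List[Dict[str, str]] = field(default_factory=list)

    def column(self, header: str) -> List[str]:
        """Collect every value found under one table header."""
        return [row[header] for row in self.table_rows if header in row]


invoice = ExtractedDocument(
    fields={"invoice_number": "INV-001"},
    table_rows=[
        {"item_description": "Widget", "quantity": "2"},
        {"item_description": "Gadget", "quantity": "5"},
    ],
)

print(invoice.fields["invoice_number"])  # one value for the whole document
print(invoice.column("quantity"))        # one value per table row
```

If a label is expected to appear once per document, it likely belongs among the fields; if it repeats for every line item, a table header is probably the better fit.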
Best Practices:
- Field Naming:
Be precise in setting up the field name:
- Choose Descriptive Names: For clarity and precision, select field names that directly reflect their content. For example, use linkedin_username for LinkedIn usernames extracted from resumes, rather than just linkedin.
- Use Standard Abbreviations: Employ commonly recognized abbreviations to keep field names concise and understandable.
- Acceptable Example: DOB for "Date of Birth."
- Unacceptable Example: RMT should not be used for "Road Motor Transportation"; use the full term instead.
- Avoid Truncating Words: Do not shorten words within field names as this can lead to confusion and ambiguity.
- Use Formal Terminology: When formal terms are available, prefer these over descriptive phrases to maintain professionalism and consistency.
- Preferred: given_name instead of first_and_middle_name.
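The naming guidelines can be approximated with a small pre-check that runs before labels are entered. This is only a rough sketch: the function `check_field_name`, the abbreviation allowlist, and the preferred-term mapping are all assumptions made for illustration, and the heuristics will not catch every case (truncated words, for instance, still need human review).

```python
from typing import List

# Hypothetical allowlist of abbreviations considered widely recognized.
KNOWN_ABBREVIATIONS = {"DOB", "ID"}

# Hypothetical mapping from descriptive phrases to formal terms.
PREFERRED_TERMS = {"first_and_middle_name": "given_name"}


def check_field_name(name: str) -> List[str]:
    """Return a list of warnings for a proposed field name (empty if none)."""
    warnings = []

    # Formal terminology: suggest the formal term when one is known.
    if name in PREFERRED_TERMS:
        warnings.append(f"prefer '{PREFERRED_TERMS[name]}' over '{name}'")

    # Standard abbreviations: flag all-caps tokens that are not on the allowlist.
    for token in name.split("_"):
        if token.isupper() and token not in KNOWN_ABBREVIATIONS:
            warnings.append(f"'{token}' is not a recognized abbreviation; spell it out")

    # Descriptive names: a single lowercase word is often too generic.
    if "_" not in name and not name.isupper():
        warnings.append(f"'{name}' may be too generic; describe the content")

    return warnings


for candidate in ["linkedin_username", "linkedin", "DOB", "RMT", "first_and_middle_name"]:
    print(candidate, "->", check_field_name(candidate) or "ok")
```

The allowlist and mapping would need to be adapted to the vocabulary of your own documents.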
- Field Description:
Be precise in giving field descriptions: Ensure that each field's purpose and expected output are clear. This helps in achieving accurate and useful predictions.
Here are examples to illustrate:
- Less Effective Example:
- Field Name: Order_ID
- Field Description: A number related to an order.
- More Effective Example:
- Field Name: Order_ID
- Field Description: Order_ID is a unique identifier used to track and manage customer orders within a system. If an 'Order_ID' is present in the invoice, always use this 'Order_ID' instead of the 'Purchase_Order_ID'.
- This ensures the correct identification and tracking of orders.
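The two examples above can also be written down as data, which makes it easy to review descriptions before entering them. The sketch below is illustrative only: `LabelDefinition`, `looks_vague`, and the word-count threshold are hypothetical, and a short description is merely a hint that detail may be missing, not proof.

```python
from dataclasses import dataclass

# Hypothetical threshold: descriptions shorter than this get flagged for review.
MIN_DESCRIPTION_WORDS = 10


@dataclass
class LabelDefinition:
    """Hypothetical record pairing a field name with its description."""

    name: str
    description: str

    def looks_vague(self) -> bool:
        """Crude heuristic: very short descriptions rarely state purpose or rules."""
        return len(self.description.split()) < MIN_DESCRIPTION_WORDS


less_effective = LabelDefinition(
    name="Order_ID",
    description="A number related to an order.",
)

more_effective = LabelDefinition(
    name="Order_ID",
    description=(
        "Order_ID is a unique identifier used to track and manage customer "
        "orders within a system. If an 'Order_ID' is present in the invoice, "
        "always use this 'Order_ID' instead of the 'Purchase_Order_ID'."
    ),
)

for label in (less_effective, more_effective):
    status = "review" if label.looks_vague() else "ok"
    print(f"{label.name}: {status}")
```

Note that the more effective description does two things the shorter one does not: it states what the field is for, and it tells the model which value to choose when a similar field is also present.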
Updated 2 months ago