Section 13 Export


WIB™ Review can be used as a system to process and store records, or as a transitory system in which parts of the system are used in a series of standard operating procedures. A transitory workflow ends either in a connection to another repository through an API or in an export of images and their associated metadata and attributes for import into another repository. Creating Export Packages transfers images and their associated metadata and attributes in a format that another system can import. There are five (5) Export sections: Packages, Configuration, Schedules, Targets, and History.

Section 13.1    Creating an Export Package

Users can create export packages from the Search and Review pages. If a user does not perform a search to cull the records, all records will be exported.

Perform a search and once satisfied with the results, select Create Export Package at the bottom of the search results grid. You need to create an export configuration before creating an export package from Search.

Section 13.1.2                  Export Package from Review

Export Packages created in Review contain only the box that is active in Review. To export the single box, select Create Export Package from the top right corner of the Review page. You need to create an export configuration before creating an export package from Review.

Section 13.2   Export Package Definition

WIB™ Review supports various types of exports and formats. The Export Package Definition includes a Name, Description, Delimiter, and Text Qualifier. It also defines what is exported and the format for some components of the package, e.g., image formats such as JPEG or PDF, and single-page or multi-page output.

Name the export package in a manner that is standardized and allows for sorting and/or quickly identifying the content. For example, including the date in the Name will allow for sorting the Export Packages in date order. Keep the Name short and leverage the description for more detailed information about the export.

Date Created is a default metadata field for Export Packages, so the date is not required in the Package Name to sort in date order. The date is used here for demonstrative purposes only.

Section 13.2.3                   Export File Format

The metadata can be exported to a CSV (Comma Separated Values) file, an Excel spreadsheet, or JSON (JavaScript Object Notation). If you need a format that is not available, please contact support@radixdata.com.

Section 13.2.3.1              CSV File Format Settings

Section 13.2.3.1.1                  Column Delimiter

The delimiter in the metadata file is the character (comma or otherwise) that separates the data in your file into distinct fields.

Section 13.2.3.1.2                 Text Qualifier

A text qualifier is a character used to mark where the contents of a text field begin and end. Say you need to import a text file that is comma delimited (separated by commas), and one of the fields is a description that could contain a comma. You can use a text qualifier to show that the comma is part of the text field and is not to be used as a separator.
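The effect of a delimiter and a text qualifier can be illustrated outside the product. The following is a minimal sketch using Python's standard csv module, not WIB™ Review's own exporter; the column names and the sample row are made up for illustration.

```python
import csv
import io

# Hypothetical sample data: the description contains the delimiter (a comma).
rows = [
    ["BoxId", "Description"],
    ["BX-001", "Invoices, 2019 to 2021"],
]

# Write with a comma as the column delimiter and a double quote as the text qualifier.
buffer = io.StringIO()
writer = csv.writer(
    buffer,
    delimiter=",",
    quotechar='"',
    quoting=csv.QUOTE_MINIMAL,  # qualify only the fields that need it
    lineterminator="\n",
)
writer.writerows(rows)
exported = buffer.getvalue()

# Read the text back: the qualified comma stays inside the Description field.
parsed = list(csv.reader(io.StringIO(exported), delimiter=",", quotechar='"'))
```

Without the text qualifier, the second row would be read as three fields instead of two.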

Section 13.2.4         Retention Days

Retention Days sets the length of time an export package is kept. The retention period should allow time for the quality check and, if required, the import of the package into another platform. A notification can be configured to alert specific individuals when Export Packages are about to expire.
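As a worked example of the arithmetic, the sketch below computes an expiry date from a creation date and a number of retention days. It assumes that retention is counted in calendar days from the date the package was created; the function name is hypothetical and this is not the product's own calculation.

```python
from datetime import date, timedelta


def expiry_date(created: date, retention_days: int) -> date:
    """Return the assumed expiry date: creation date plus the retention period."""
    return created + timedelta(days=retention_days)


# A package created on 15 January 2024 with 30 retention days.
example = expiry_date(date(2024, 1, 15), 30)
```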

Section 13.2.6         Include Images

The metadata for images can be exported without the images. This feature is useful when the images for a given box have not changed but the metadata or attributes may have changed through user review or as a retroactive update to the Taxonomy.

Section 13.2.7                 Attribute Classification

The system has two groups of attributes: extracted and user defined. Extracted attributes are those that WIB Review recognizes automatically from a Phrase List or a Regular Expression. User-defined attributes are those that a user accepts as correct from the Extracted value or enters manually on the Review page. You can select which group(s) of attributes are included in the Export Configuration. To include both groups, select Both. To include only one of the two groups, select the appropriate radio button.

Section 13.2.8                  Attribute Inclusion

Attribute inclusion determines which attribute classification is exported. There are four (4) options available, and each option is detailed below.

Section 13.2.8.1             Both

Selecting Both will export all attribute classes and system attributes.

Section 13.2.8.2            User

Selecting User will export the User class attributes and system attributes.

Section 13.2.8.3            Extracted

Selecting Extracted will export the Extracted class attributes and system attributes.

Section 13.2.8.4           Logical

Selecting Logical will logically determine which class attributes to export with the system attributes. The following logic determines which attribute class is exported:

IF a user defined attribute is not null, then the user attribute is exported.

IF a user defined attribute is null and an extracted attribute is defined then the extracted attribute is exported.

IF both attribute classes contain a value, the user defined attribute takes precedence and will be exported.

IF both attribute classes are null the user class attribute is exported as a null value.
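The four rules above reduce to "prefer the user value, otherwise fall back to the extracted value". The following is a minimal sketch of that precedence. It assumes a null attribute is represented as None; the function name is hypothetical and this is not the product's implementation.

```python
from typing import Optional


def logical_value(user_value: Optional[str], extracted_value: Optional[str]) -> Optional[str]:
    """Pick the value to export under the Logical option."""
    if user_value is not None:
        # Covers "user is not null" and "both contain a value": the user value wins.
        return user_value
    # User value is null: export the extracted value if there is one.
    # If both are null, this returns None, i.e. a null value is exported.
    return extracted_value
```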

Section 13.2.9                  Include OCR

The images and OCR are separate components unless the images are exported as a PDF. You can choose to use Radix Data OCR results or run the OCR through an engine in another application.

Section 13.2.10                  Include metadata

If WIB™ is used as a processor of images, no data is extracted, and the capture metadata is not needed, the images can be exported without the metadata. In the context of export, system-generated attributes, extracted data, and user attributes are all defined as metadata.

Section 13.2.11                  Attribute Types

The system has two groups of attributes: extracted and user defined. Extracted attributes are those that WIB Review recognizes automatically from a Phrase List or a Regular Expression. User-defined attributes are those that a user accepts as correct from the Extracted value or enters manually on the Review page. You can select which group(s) of attributes are included in the Export Configuration. To include both groups, select Both. To include only one of the two groups, select the appropriate radio button.

Section 13.2.12                  Grayscale images

Images are captured in color. However, there may arise the need to export images in grayscale. This option converts the color images to grayscale during the export process. Grayscale images are in JPEG image format.

Section 13.2.13                  PDF

Portable Document Format (PDF) is an image format. This option creates a PDF which combines the OCR and image into a combined file that is searchable.  PDFs can be exported as single-page or multi-page.

Section 13.2.13.1             Single-Page

One PDF file is exported for each image in a container.

Section 13.2.13.2            Multi-Page

All images for a container are combined into a single PDF file.

Section 13.2.13.3            DPI

DPI is a printer resolution measured in dots per inch (dpi). The higher the dpi, the finer the printed output you’ll get. Most inkjet printers have a resolution of approximately 720 to 2880 dpi. Printer resolution is different from, but related to, image resolution.

The DPI setting can be set when creating a PDF file. If exporting to other image formats, DPI cannot be set.

Section 13.2.14              Column Mapper

The column mapper allows you to select which attributes are in the export and how those attributes are named. You can rename the attributes to match another system name for seamless imports. To exclude an attribute, make sure the selector is not checked. To rename the attribute enter the new name in the Exported Header field next to the Attribute Name.

Section 13.2.14.1          Include System Attributes

System attributes are automatically included in the export. To change the header name or the order in which the system attributes appear in the export, select Include System Attributes.

Section 13.2.14.2         Map a single attribute to two (2) fields in the Export

Add the attribute to the Column Mapper a second time and change the Exported Header name.

Section 13.2.15                Column Order

You can set the order in which the attributes appear in the export. To change the order, select an attribute and drag it to the correct position. Repeat this for each attribute until the Column Order matches the order you want in the export.
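The combined effect of the Column Mapper and the Column Order can be pictured as an ordered list of (attribute name, exported header) pairs. The sketch below is illustrative only: the attribute names, headers, and the apply_column_map helper are hypothetical and do not describe how WIB™ Review stores its configuration.

```python
from typing import Dict, List, Optional, Tuple

# Ordered mapping: list position is the column order in the export.
# "Document Date" is listed twice to map one attribute to two exported fields.
column_map: List[Tuple[str, str]] = [
    ("Box Identifier", "BoxID"),
    ("Document Date", "DocDate"),
    ("Document Date", "FiledOn"),
]


def apply_column_map(
    record: Dict[str, str], mapping: List[Tuple[str, str]]
) -> Tuple[List[str], List[Optional[str]]]:
    """Return the exported header row and the matching data row."""
    header = [exported_header for _, exported_header in mapping]
    row = [record.get(attribute) for attribute, _ in mapping]
    return header, row


record = {"Box Identifier": "BX-001", "Document Date": "2021-03-04", "Notes": "not mapped"}
header, row = apply_column_map(record, column_map)
```

Attributes that are not in the mapping, such as Notes in this example, are left out of the export.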

Section 13.2.16                   XML Character Encoding

You can replace special characters in the export using the XML Character Encoding by adding the Character and the replacement Representation. These characters will be replaced in the file containing the metadata. Some examples include but are not limited to the following:

Character Name                        Character   Encoded Representation
Ampersand                             &           &amp;
Less than (Left angle bracket)        <           &lt;
Greater than (Right angle bracket)    >           &gt;
Double quotation                      "           &quot;
Single quotation (apostrophe)         '           &apos;
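The five replacements listed here are the standard predefined XML entities. As a minimal sketch of the same substitution outside the product, Python's standard library can apply them to a metadata value. The encode_value helper and the sample text are hypothetical.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; the two quotation marks are added here.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def encode_value(text: str) -> str:
    """Replace the five special characters with their encoded representations."""
    return escape(text, QUOTE_ENTITIES)


encoded = encode_value("Smith & Sons <\"O'Brien\">")
```

The ampersand is replaced first, so the ampersands introduced by the other replacements are not encoded a second time.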

Section 13.2.17      Query

The Query field shows a JSON view of the query that generated the results in Search, or the JSON for the active box record in Review.

Section 13.2.18                   Export History

The export history includes the following information: Box Identifier, Image Filename, Export Package, Configuration, username, Boolean export status for the OCR Text, Image, and metadata, the capture date of the image, the date of export, and the expiry date for the export package. Please note that the expiry date only applies to the export package and not the images in storage.

Select Include Export History to include the history with the export or select only the ‘Include Export History’ and leave all other export options off to export ONLY the history. This configuration can then be scheduled to export the history.

Section 13.3   Schedule an Export

Exports can occur on a set schedule, which automates the creation of an export package. Select Schedules to create a new export schedule.

Section 13.3.1                  Name

Name the export schedule in a manner that is standardized and allows for sorting and/or quickly identifying the content. For example, including the frequency in the Name will allow for sorting the export schedules in order of frequency. Keep the Name short and leverage the description for more detailed information about the export schedule.

Section 13.3.2                 Description

Describe the export schedule content in a manner that gives another user a detailed enough description to determine the content. This expands on the name and allows for more detailed information about the content.

Section 13.3.3                 Query

The Query field holds a JSON view of the query that will create the scheduled export package. The best way to create the query is to perform a search, create an export from the search, and copy the query to the schedule. Running a query for records with a value of FALSE for ‘Has been exported’ returns all the records that have not been exported. The query for this example is as follows:

{"query":"","sorts":["score desc"],"searchIn":[],"imageTypes":[],"boxId":null,"imageId":null,"projectId":null,"sessionId":null,"collectionId":null,"savedSearchId":null,"attributeType":"both","attributeFilter":"{\"logic\":\"and\",\"filters\":[{\"field\":\"HasBeenExported\",\"operator\":\"eq\",\"value\":false}]}"}

Copy this from the Create an Export dialog and paste it into the Export Schedule query field.
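Note that the attributeFilter value in the query is itself a JSON document stored as a string, which is why its quotation marks are escaped. The sketch below rebuilds the example query in Python to show that nesting. It only mirrors the structure of the text shown above and is not an official client for WIB™ Review; copying the query from the dialog remains the recommended approach.

```python
import json

# The inner filter: records whose HasBeenExported flag equals false.
attribute_filter = {
    "logic": "and",
    "filters": [{"field": "HasBeenExported", "operator": "eq", "value": False}],
}

# The outer query; attributeFilter is the inner filter serialized to a string.
query = {
    "query": "",
    "sorts": ["score desc"],
    "searchIn": [],
    "imageTypes": [],
    "boxId": None,
    "imageId": None,
    "projectId": None,
    "sessionId": None,
    "collectionId": None,
    "savedSearchId": None,
    "attributeType": "both",
    "attributeFilter": json.dumps(attribute_filter, separators=(",", ":")),
}

query_text = json.dumps(query, separators=(",", ":"))

# Reading it back takes two decoding steps: the outer query, then the filter string.
decoded_filter = json.loads(json.loads(query_text)["attributeFilter"])
```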

Section 13.3.4                   CRON Expression

A crontab expression is a very compact way to express a recurring schedule. A single expression is composed of 5 space-delimited fields:

MINUTES HOURS DAYS MONTHS DAYS-OF-WEEK

Section 13.3.4.1              CRON Expression Format

The link takes you to an article that fully explains the CRON Expression format. In brief, each field can be expressed as one of the following:

A single wildcard (*), which covers all values for the field. So, a * in days means all days of a month (which varies with month and year).

A single value, e.g., 5. Naturally, the set of values that are valid for each field varies.

A comma-delimited list of values, e.g., 1,2,3,4. The list can be unordered as in 3,4,2,6,1.

A range where the minimum and maximum are separated by a dash, e.g., 1-10. You can also specify these in the wrong order, and they will be fixed. So, 10-5 will be treated as 5-10.

An interval specification using a slash, e.g., */4. This means every 4th value of the field. You can also use it in a range, as in 1-6/2.

You can also mix all of the above, as in 1-5,10,12,20-30/5

The table below lists the valid values for each field:

Field          Range   Comment
MINUTES        0-59    -
HOURS          0-23    -
DAYS           1-31    -
MONTHS         1-12    Zero (0) is not valid. Month names are also accepted.
DAYS-OF-WEEK   0-6     Where zero (0) means Sunday. Names of days are also accepted.

Two fields also accept named values in English: MONTHS and DAYS-OF-WEEK. So, you can use names like January, February, March and so on for MONTHS and Monday, Tuesday, Wednesday and so on for DAYS-OF-WEEK. The names are not case-sensitive, and you can even use short forms like Jan, Feb, Mar or Mon, Tue, Wed.

Use the link to test your CRON expression.
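To make the field rules concrete, the following sketch expands each field of a five-field expression into the list of values it covers. It is a simplified illustration of the rules described in this section, not the parser WIB™ Review uses: it handles numeric values only (month and day names are not supported), and the function names are hypothetical.

```python
from typing import Dict, List

# Field names and valid ranges, in the order they appear in an expression.
FIELD_RANGES = [
    ("MINUTES", 0, 59),
    ("HOURS", 0, 23),
    ("DAYS", 1, 31),
    ("MONTHS", 1, 12),
    ("DAYS-OF-WEEK", 0, 6),
]


def expand_field(expr: str, lo: int, hi: int) -> List[int]:
    """Expand one field (wildcard, value, list, range, interval) into sorted values."""
    values = set()
    for part in expr.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = (int(x) for x in rng.split("-"))
            start, end = min(a, b), max(a, b)  # 10-5 is treated as 5-10
        else:
            start = end = int(rng)
        if start < lo or end > hi or step < 1:
            raise ValueError(f"{part!r} is outside the valid range {lo}-{hi}")
        values.update(range(start, end + 1, step))
    return sorted(values)


def parse_cron(expression: str) -> Dict[str, List[int]]:
    """Split a five-field expression and expand every field."""
    parts = expression.split()
    if len(parts) != len(FIELD_RANGES):
        raise ValueError("expected 5 space-delimited fields")
    return {
        name: expand_field(part, lo, hi)
        for part, (name, lo, hi) in zip(parts, FIELD_RANGES)
    }


# "0 2 * * 1": minute 0, hour 2, any day, any month, day-of-week 1 (Monday).
weekly = parse_cron("0 2 * * 1")
```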

Section 13.3.5                  Export Package Configuration

Once the schedule is configured, identify which configuration is used for the schedule. Use the drop-down to select the configuration to run on the schedule just set.

Section 13.3.6                Export Notification – Email Recipients

Email Notifications for exports can be set up under the Export Schedule or added to a manual export from the Search and Review page. Users with access to the workspace are available in a drop-down menu and can also be added manually.

The email recipients will receive an email notification containing the Export package name, size, image count, and the configuration used to create the export. A link to the destination site is provided in the email. The export inventory is attached to the email. The export inventory contains a listing of each container and the associated image count.

Section 13.4   Download an Export Package

Once an Export Package is created, it can be downloaded from the Exports page. Each export is listed on the grid. Selecting an export will launch the Export Package Details page, where all the information from the export definition is displayed along with the box count, image count, package size, the user who created the export, the date the export was created, and the boxes in the export package.

Section 13.5   Set Export Destination

Export Destinations allow for pushing export packages to AWS S3 buckets, Azure Storage Blobs, and/or an FTP Site. The automation/scheduling of exports requires a destination definition. Each destination type (AWS, Azure, FTP etc.) has requirements that are presented when setting up a connection.

Section 13.5.1                   Create Export Destination

Export targets allow you to specify a custom destination for the export package to be saved.

Section 13.5.1.1              Name

Name the export destination in a manner that is standardized and allows for sorting and/or quickly identifying the destination definition. For example, including the repository in the Name will allow for sorting the Export destinations. Keep the Name short and leverage the description for more detailed information.

Section 13.5.1.2             Description (Optional)

Describe the destination in a manner that gives another user a detailed enough description to determine the content. This expands on the name and allows for more detailed information about the content.

Section 13.5.1.3             Type

There are three (3) destination types: AWS, Azure, and FTP. If you require a different destination type, contact us at support@radixdata.com and request a new destination.

Section 13.5.1.3.1                 AWS S3 Bucket

Enter the Access Key Id, Secret Access Key, Region Name, and Bucket Name for the S3 Bucket.


Section 13.5.1.3.1.1                         Bucket Name & Region

The Name and Region are assigned when creating an S3 Bucket. Contact your IT Department for assistance with the name and region for the S3 Bucket if you are not creating the Bucket yourself.

Section 13.5.1.3.1.2                        Access Key Id & Secret Access Key

This information is found under the account/Security credentials. Create a key if one does not exist.

Section 13.5.2                  Manual Export Destination

The export package destination can be manually set when an export package is created. See Section 13.1 Creating an Export Package.

Section 13.5.3                  Export Automation Destination

The export package destination can be set for a scheduled export package. See Section 13.3 Schedule an Export.

Section 13.6    Export History

The export history lists the images for all the exports. The following system attributes are listed for each image: Box Identifier, Image Filename, Export Package, Configuration, Username, Exported Text (Boolean), Exported Image (Boolean), Exported Metadata (Boolean), Date Captured, Date Exported, and Retention Expiration. The Boolean Text, Image, and Metadata fields indicate whether the associated component is part of the export package. The Export Package History retains all data regardless of whether the content is deleted or the package has met the retention requirements.

 

Section 13.6.1                   Reset Export Flag

The export flag can be reset for an entire export package, box, or individual images using the corresponding filters and the selection feature.

