Datasets

The following sections describe the known issues related to the Amorphic datasets feature.

S3 Athena Datasets

5.1 Empty data for numeric values:

Error message: Data parsing failed, emtpy field data found for non-string column

Issue description: When the user tries to load empty/null values into numeric columns, the load process fails with a data validation error.

Explanation: This error is raised by the file parser, which currently does not support null or empty values in numeric fields. As noted in the documentation, one workaround is to import these fields as string columns and create views on top of the dataset that cast them to the required data types.
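The workaround above can be sketched as a small Python helper that generates the view definition. This is only an illustration: the function name, the view, table and column names are hypothetical, and it assumes the view is created in Athena, where TRY_CAST, NULLIF and TRIM are available. Empty strings are converted to NULL before the cast so that they no longer break numeric handling.

```python
def build_cast_view_sql(view_name, table_name, passthrough_columns, numeric_columns):
    """Return a CREATE VIEW statement that casts string columns to numeric types.

    numeric_columns maps a column name to its target SQL type,
    for example {"price": "DOUBLE"}. All names here are hypothetical.
    """
    # Columns that stay as they are in the underlying dataset.
    select_items = [f'"{name}"' for name in passthrough_columns]

    # Columns loaded as strings: blank values become NULL, the rest are cast.
    for name, sql_type in numeric_columns.items():
        select_items.append(
            f"TRY_CAST(NULLIF(TRIM(\"{name}\"), '') AS {sql_type}) AS \"{name}\""
        )

    column_list = ",\n    ".join(select_items)
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        f"SELECT\n    {column_list}\n"
        f"FROM {table_name}"
    )


sql = build_cast_view_sql(
    view_name="sales_typed",          # hypothetical view name
    table_name="sales_raw",           # hypothetical dataset loaded with string columns
    passthrough_columns=["order_id"],
    numeric_columns={"quantity": "INTEGER", "price": "DOUBLE"},
)
print(sql)
```

The generated statement can then be reviewed and run wherever views on the dataset are created. TRY_CAST returns NULL for values that cannot be converted, so malformed numbers do not fail the query; use CAST instead if such values should raise an error.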

5.2 File parsing:

Error message: Data validation failed with message, new-line character seen in unquoted field - do you need to open the file in universal-newline mode?

Issue description: This is one of the data validation errors that can occur while loading data into a dataset.

Explanation: This error is raised by the file parser, which currently does not support embedded line breaks in csv/xlsx files. Please refer to the documentation for more information. A possible solution is to perform a regex replace on inappropriate newline or carriage return characters in the file before loading it.
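A minimal sketch of such a clean-up in Python is shown below. It assumes the stray line breaks sit inside quoted CSV fields, so that the standard csv module can still tell where each record ends; the function name is hypothetical. Each run of newline or carriage return characters inside a field is replaced with a single space.

```python
import csv
import io
import re

# Matches one or more consecutive carriage return / newline characters.
_LINE_BREAKS = re.compile(r"[\r\n]+")


def strip_embedded_line_breaks(csv_text):
    """Return csv_text with line breaks inside fields replaced by a space."""
    # newline="" keeps line endings untranslated so the csv module
    # can handle quoted fields that span several lines.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    output = io.StringIO(newline="")
    writer = csv.writer(output, lineterminator="\n")
    for row in reader:
        writer.writerow([_LINE_BREAKS.sub(" ", field) for field in row])
    return output.getvalue()


raw = 'id,note\n1,"line one\nline two"\n2,ok\n'
print(strip_embedded_line_breaks(raw))
```

For files on disk, the same function can be applied to text read with open(path, newline=""). A line break in an unquoted field cannot be told apart from a record separator, so files of that kind need a pattern tailored to their layout.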

5.3 Field validations:

Error message: N/A

Issue description: Validations are not available for all data types.

Explanation: Currently, data type validations are limited to the primitive types String/Varchar, Integer, Double, Boolean, Date and Timestamp. Support for complex structures is yet to be added. Moreover, for data types like Date and Timestamp, value formats are not strictly validated because they can appear in multiple formats.
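Because Date and Timestamp formats are not strictly validated, it may help to check them before loading. The sketch below is not part of Amorphic; the function name and the default format are assumptions. It reports every value that does not match a single expected format.

```python
from datetime import datetime


def find_bad_dates(values, date_format="%Y-%m-%d"):
    """Return (index, value) pairs for values that do not match date_format."""
    bad = []
    for index, value in enumerate(values):
        try:
            # strptime raises ValueError on a format mismatch or an impossible date.
            datetime.strptime(value, date_format)
        except ValueError:
            bad.append((index, value))
    return bad


sample = ["2023-01-15", "15/01/2023", "2023-02-30"]
print(find_bad_dates(sample))
```

The second value uses a different format and the third is not a real calendar date, so both are reported.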