Data Source: CSV

On this page

Adapter Configuration
File Header Specifications
Supported Data Types
Examples
Deployment

The CSV data source adapter enables the querying of CSV files as relational tables. Ensure that the first line of your CSV files contains a header in compliance with the specifications provided below. The adapter facilitates querying of multiple CSV files, where each file is represented as a table. This adapter also supports files compressed using gzip, identified by their filename endings of either .csv or .csv.gz.

The CSV adapter operates in read-only mode, and does not support DML queries. However, based on the mode of operation, the file content may be updated in the background as long as the schema (number and type of columns) remains consistent.

Please note that the current version of the CSV adapter does not support null values.

Adapter Configuration

The CSV adapter comprises two settings:

Setting	Description
directory	Specifies the location of the directory containing the CSV files.
maxStringLength	Defines the length limit (in characters, including whitespace) for varchar columns. Ensure this value is equal to or larger than the length of your longest string across all columns.

File Header Specifications

The first line of the CSV file must include a header indicating the name of each column.

id,name

Optionally, the header can specify explicit data types. If a data type is omitted, Polypheny will automatically determine it. The column name and the data type should be separated by a colon:

id:int,name:string

Supported Data Types

The CSV adapter supports the following data types:

Type	Description
boolean	Permissible values: `true`, `false`, `0` and `1`.
date	Required format: `yyyy-mm-dd`.
double	Floating point number with ‘.’ (dot) as decimal separator. Maps to the data type DOUBLE (8 bytes, IEEE 754).
float	Floating point number with ‘.’ (dot) as decimal separator. Maps to the data type REAL (4 bytes, IEEE 754).
int	Integer number in the range `-2,147,483,648` to `2,147,483,647`.
long	Integer number in the range `-9,223,372,036,854,775,808` to `9,223,372,036,854,775,807`. Mapped to the data type BIGINT.
string	String type (can contain letters, numbers, and special characters). If it contains commas, it should be wrapped in double quotes. Maps to VARCHAR. The length can be specified using the adapter parameter `maxStringLength`.
time	Required format: `hh:mm:ss`. Fractional seconds are not supported.
timestamp	Required format: `yyyy-mm-dd hh:mm:ss`. Fractional seconds are not supported.

Examples

Example without explicit data types:

id,name,level,birthday,fulltime,start,last_update,tfloat,tdouble
1,Alice,9,1992-11-14,true,08:00:00,2021-01-30 13:30:04+00:00,1.999,2.333
2,"Wurst, Hans",6,1989-08

-01,0,08:15:00,2020-11-01 09:05:12.111,0.1,999.999999

Example with explicit data types:

id:long,name:string,level:int,birthday:date,fulltime:boolean,start:time,last_update:timestamp,tfloat:float,tdouble:double
1,Alice,9,1992-11-14,true,08:00:00,2021-01-30 13:30:04+00:00,1.999,2.333
2,"Wurst, Hans",6,1989-08-01,0,08:15:00,2020-11-01 09:05:12.111,0.1,999.999999

Deployment

The adapter can be deployed using the Polypheny-UI (Adapters -> Sources). The adapter operates in two modes:

Uploading: The file is uploaded to Polypheny. Any subsequent changes to the file will not be reflected in Polypheny.
Linking: The file is accessed by Polypheny at a specified location. For this mode, the file must reside on the same host as Polypheny, and Polypheny must have access rights to the file. Any changes to the file are immediately visible in Polypheny. Make sure that the file name does not include spaces.

Once successfully deployed, all CSV files are mapped as tables in the public schema. The tables and columns can be renamed. Moreover, columns can be reordered and removed if they are not required (the column data are not deleted from the file). If required, columns that have been removed can be re-added using the Polypheny-UI or the following SQL statement:

ALTER TABLE tableName ADD COLUMN physicalName AS name

In this statement, physicalName refers to the name specified in the CSV file header.

Home

Ethereum

Quick Search Results

Adapter Configuration

File Header Specifications

Supported Data Types

Examples

Deployment