Overview
This code block demonstrates how to query JSON file(s).
CREATE FOREIGN DATA WRAPPER < wrapper_name>
HANDLER json_fdw_handler
VALIDATOR json_fdw_validator;
CREATE SERVER < server_name>
FOREIGN DATA WRAPPER < wrapper_name> ;
CREATE FOREIGN TABLE < table_name> ( )
SERVER < server_name>
OPTIONS ( files '<files>' ) ;
CREATE FOREIGN DATA WRAPPER json_wrapper
HANDLER json_fdw_handler
VALIDATOR json_fdw_validator;
CREATE SERVER json_server
FOREIGN DATA WRAPPER json_wrapper;
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS ( files 's3://bucket/folder/file.json' ) ;
Foreign data wrapper name. Can be any string.
Foreign server name. Can be any string.
Foreign table name. Can be any string.
The path of a single JSON file or multiple JSON files .
For instance, s3://bucket/folder/file.json
if the file is in Amazon S3 or /path/to/file.json
if the file is on the local file system.
JSON Options
There are a number of options that can be passed into the CREATE FOREIGN TABLE
statement.
These are the same options accepted
by DuckDB’s read_json
function.
auto_detect
default: "false"
Enables auto detection of key names and value types.
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
auto_detect 'true'
) ;
Specifies key names and value types in the JSON file (e.g. {key1: 'INTEGER', key2: 'VARCHAR'}
). If auto_detect
is enabled the value of this setting will be inferred from the JSON file contents.
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
columns $${key1: 'INTEGER' , key2: 'VARCHAR' }$$
) ;
compression
default: "auto_detect"
The compression type for the file. By default this will be detected automatically from the file extension (e.g., t.json.gz
will use gzip
, t.json
will use none
). Options are uncompressed
, gzip
, zstd
, and auto_detect
.
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
compression 'gzip'
) ;
convert_strings_to_integers
default: "false"
Whether strings representing integer values should be converted to a numerical type.
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
convert_strings_to_integers 'true'
) ;
Specifies the date format to use when parsing dates. See Date Format
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
dateformat '%d/%m/%Y'
) ;
Whether or not an extra filename column should be included in the result.
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
filename 'false'
) ;
Can be one of auto
, unstructured
, newline_delimited
and array
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
format 'unstructured'
) ;
hive_partitioning
default: "false"
Whether or not to interpret the path as a Hive partitioned path.
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
hive_partitioning 'true'
) ;
ignore_errors
default: "false"
Whether to ignore parse errors (only possible when format is newline_delimited
)
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
ignore_errors 'false'
) ;
maximum_depth
default: "-1"
Maximum nesting depth to which the automatic schema detection detects types. Set to -1
to fully detect nested JSON types.
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
maximum_depth '65536'
) ;
maximum_object_size
default: "16777216"
The maximum size of a JSON object (in bytes).
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
maximum_object_size '65536'
) ;
Determines whether the fields of JSON object will be unpacked into individual columns.
Can be one of auto
, true
or false
Suppose we have a JSON file with these contents:
{ "key1" : "value1" , "key2" : "value1" }
{ "key1" : "value2" , "key2" : "value2" }
{ "key1" : "value3" , "key2" : "value3" }
Reading it with records
set to true
will result in these table contents:
key1 | key2
-----------------+
value1 | value1
value2 | value2
value3 | value3
Reading it with records
set to false
will result in these table contents:
json
---------------------------------+
{'key1': value1 , 'key2': value1}
{'key1': value2 , 'key2': value2}
{'key1': value3 , 'key2': value3}
If set to auto
DuckDB will try to determine the desired behaviour. See DuckDB documentation for more details.
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
records 'auto'
) ;
sample_size
default: "20480"
Option to define number of sample objects for automatic JSON type detection. Set to -1
to scan the entire input file
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
sample_size '4086'
) ;
timestampformat
default: "iso"
Specifies the date format to use when parsing timestamps. See Date Format
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
timestampformat 'iso'
) ;
union_by_name
default: "false"
Whether the schema’s of multiple JSON files should be unified.
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files 's3://bucket/folder/file.json' ,
union_by_name 'false'
) ;
Multiple JSON Files
To treat multiple JSON files as a single table, their paths should be passed in as a comma-separated
string.
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files '/path/to/file1.json, /path/to/file2.json'
) ;
To treat a directory of JSON files as a single table, the glob pattern should be used.
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files '/folder/*.json' ,
) ;
The glob pattern can also be used to read all JSON files from multiple directories.
CREATE FOREIGN TABLE json_table ( )
SERVER json_server
OPTIONS (
files '/folder1/*.json, /folder2/*.json'
) ;
Cloud Object Stores
The object stores documentation explains how to provide secrets and other credentials for
JSON files stored in object stores like S3.