Overview

This code block demonstrates how to query JSON file(s).

CREATE FOREIGN DATA WRAPPER <wrapper_name>
HANDLER json_fdw_handler
VALIDATOR json_fdw_validator;

CREATE SERVER <server_name>
FOREIGN DATA WRAPPER <wrapper_name>;

CREATE FOREIGN TABLE <table_name> ()
SERVER <server_name>
OPTIONS (files '<files>');
wrapper_name
required

Foreign data wrapper name. Can be any string.

server_name
required

Foreign server name. Can be any string.

table_name
required

Foreign table name. Can be any string.

files
required

The path of a single JSON file or multiple JSON files. For instance, s3://bucket/folder/file.json if the file is in Amazon S3 or /path/to/file.json if the file is on the local file system.

JSON Options

There are a number of options that can be passed into the CREATE FOREIGN TABLE statement. These are the same options accepted by DuckDB’s read_json function.

auto_detect
default:
"false"

Enables auto detection of key names and value types.

columns
default:
"NULL"

Specifies key names and value types in the JSON file (e.g. {key1: 'INTEGER', key2: 'VARCHAR'}). If auto_detect is enabled the value of this setting will be inferred from the JSON file contents.

compression
default:
"auto_detect"

The compression type for the file. By default this will be detected automatically from the file extension (e.g., t.json.gz will use gzip, t.json will use none). Options are uncompressed, gzip, zstd, and auto_detect.

convert_strings_to_integers
default:
"false"

Whether strings representing integer values should be converted to a numerical type.

dateformat
default:
"iso"

Specifies the date format to use when parsing dates. See Date Format

filename
default:
"false"

Whether or not an extra filename column should be included in the result.

format
default:
"array"

Can be one of auto, unstructured, newline_delimited and array

hive_partitioning
default:
"false"

Whether or not to interpret the path as a Hive partitioned path.

ignore_errors
default:
"false"

Whether to ignore parse errors (only possible when format is newline_delimited)

maximum_depth
default:
"-1"

Maximum nesting depth to which the automatic schema detection detects types. Set to -1 to fully detect nested JSON types.

maximum_object_size
default:
"16777216"

The maximum size of a JSON object (in bytes).

records
default:
"true"

Determines whether the fields of JSON object will be unpacked into individual columns. Can be one of auto, true or false

Suppose we have a JSON file with these contents:

{"key1":"value1", "key2": "value1"}
{"key1":"value2", "key2": "value2"}
{"key1":"value3", "key2": "value3"}

Reading it with records set to true will result in these table contents:

  key1  | key2
-----------------+
 value1 | value1
 value2 | value2
 value3 | value3

Reading it with records set to false will result in these table contents:

              json
---------------------------------+
 {'key1': value1, 'key2': value1}
 {'key1': value2, 'key2': value2}
 {'key1': value3, 'key2': value3}

If set to auto DuckDB will try to determine the desired behaviour. See DuckDB documentation for more details.

sample_size
default:
"20480"

Option to define number of sample objects for automatic JSON type detection. Set to -1 to scan the entire input file

timestampformat
default:
"iso"

Specifies the date format to use when parsing timestamps. See Date Format

union_by_name
default:
"false"

Whether the schema’s of multiple JSON files should be unified.

Multiple JSON Files

To treat multiple JSON files as a single table, their paths should be passed in as a comma-separated string.

CREATE FOREIGN TABLE json_table ()
SERVER json_server
OPTIONS (
    files '/path/to/file1.json, /path/to/file2.json'
);

To treat a directory of JSON files as a single table, the glob pattern should be used.

CREATE FOREIGN TABLE json_table ()
SERVER json_server
OPTIONS (
    files '/folder/*.json',
);

The glob pattern can also be used to read all JSON files from multiple directories.

CREATE FOREIGN TABLE json_table ()
SERVER json_server
OPTIONS (
    files '/folder1/*.json, /folder2/*.json'
);

Cloud Object Stores

The object stores documentation explains how to provide secrets and other credentials for JSON files stored in object stores like S3.