Operational Commands
COPY INTO
Description
Imports data from an external source into a table, or exports data from a table to a file. The table the data is imported into must exist.
Diagram( Terminal('COPY'), Terminal('FROM'), NonTerminal('source'), Terminal('INTO'), NonTerminal('target'), Terminal('FORMAT'), NonTerminal('formatType'), Optional( Sequence( Terminal('WITH'), OneOrMore (NonTerminal('parameter', {href:'./grammar-reference/#parameter'}), Terminal(',')) ) ) )
Parameters
- source - the full path to the file to import the data from (if target is a table), or the name of the table and the list of columns to export the data from (if target is a file). If the source is a table, you can further narrow down the selection by using a query.
- target - the table and table columns when importing data into the table, or the path to the file when exporting data.
- formatType - the format of the file to work with:
  - CSV
  - PARQUET
  - ICEBERG
- WITH - accepts additional parameters. All properties are case-sensitive:
  - delimiter - for CSV files only; the default delimiter is , (comma). Delimiter syntax is 'char'. Any alphanumeric character can be a delimiter.
  - pattern - the file pattern used when importing partitioned Parquet tables, in regular expression format. The regular expression must be enclosed in ' signs. For example, '.*' imports all files. The partitioned column is not imported.
  - header - for CSV files only; if set to true, specifies that the created file should contain a header line with column names. Column names of the table are used to create the header line.
  - S3-specific:
    - s3.client-region - when using S3 storage, the region to store the data in.
    - s3.access-key-id - when using S3 storage, the AWS access key.
    - s3.secret-access-key - when using S3 storage, the AWS secret key.
  - Iceberg-specific:
    - The catalog properties are supported.
    - The table-identifier property describes Apache Iceberg TableIdentifier names. The names can be dot-separated, for example db_name.table_name or table.
    - The warehouse path can be defined explicitly as a 'warehouse'='path' property, or implicitly as the source or target in COPY FROM source INTO target. If both are defined, the explicit property is used.
Examples
Import data from columns name and age of a CSV file with a header into Table1 columns name and age:
/* Import data from CSV with column headers */
COPY FROM '/path/to/dir/data.csv'
INTO Table1 (name, age)
FORMAT CSV
Import data from the first two columns of a CSV file without a header into Table1 columns name and age:
/* Import data from CSV without column headers */
COPY FROM '/path/to/dir/data.csv'
INTO Table1 (name, age)
FORMAT CSV
Export data from Table1 to a CSV file:
/* Export data to CSV */
COPY FROM (SELECT name, age FROM Table1)
INTO '/path/to/dir/data.csv'
FORMAT CSV
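The header and delimiter properties described in the Parameters section can be added to an export. This is a minimal sketch, assuming the header flag is passed as the string 'true' like the other WITH properties; the file path is illustrative:
/* Export data to CSV with a header line and a custom delimiter */
COPY FROM (SELECT name, age FROM Table1)
INTO '/path/to/dir/data_with_header.csv'
FORMAT CSV
WITH 'header' = 'true', 'delimiter' = '|'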
Import a CSV file from AWS S3 into Table1:
/* Import CSV file from s3 */
COPY FROM 's3://mybucket/data.csv'
INTO Table1 (name, age)
FORMAT CSV
WITH 'delimiter'= '|', 's3.access-key-id' = 'keyid', 's3.secret-access-key' = 'secretkey'
A simple example of exporting data to Iceberg. When working with the local file system, you can use HadoopCatalog, which does not need to connect to a Hive Metastore.
COPY FROM Table1 (id,name,height)
INTO '/tmp/person.i/'
FORMAT ICEBERG
WITH 'table-identifier'='person', 'catalog-impl'='org.apache.iceberg.hadoop.HadoopCatalog'
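The warehouse path can also be set explicitly with the 'warehouse' property instead of being derived from the target path. This is a hedged variant of the example above; the paths are illustrative, and per the Parameters section the explicit property takes precedence when both are present:
/* Export to Iceberg with an explicitly defined warehouse */
COPY FROM Table1 (id,name,height)
INTO '/tmp/person.i/'
FORMAT ICEBERG
WITH 'table-identifier'='person', 'catalog-impl'='org.apache.iceberg.hadoop.HadoopCatalog', 'warehouse'='/tmp/iceberg-warehouse'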
Export data into Iceberg on AWS S3:
COPY FROM person (id,name,height)
INTO 's3://iceberg-warehouse/glue-catalog'
FORMAT ICEBERG
WITH
'table-identifier'='iceberg_db_1.person',
'io-impl'='org.apache.iceberg.aws.s3.S3FileIO',
'catalog-impl'='org.apache.iceberg.aws.glue.GlueCatalog',
's3.client-region'='eu-central-1',
's3.access-key-id'='YOUR_KEY',
's3.secret-access-key'='YOUR_SECRET'
Import data from a partitioned Parquet database:
COPY FROM '/tmp/partitioned_table_dir'
INTO city (id, name, population)
FORMAT PARQUET
WITH 'pattern' = '.*'
Where the Parquet table looks like this:
partitioned_table_dir/
├─ CountryCode=USA/
│  ├─ 000000_0.parquet
├─ CountryCode=FR/
│  ├─ 000000_0.parquet
KILL QUERY
Cancels a running query. When a query is canceled with the KILL
command, all parts of the query running on all other nodes are canceled too.
Diagram( Terminal('KILL'), Terminal('QUERY'), NonTerminal('query_id'), Optional(Terminal('NO WAIT')) )
Parameters
- query_id - the query identifier, which can be retrieved via the SQL_QUERIES system view.
- NO WAIT - if specified, the command returns control immediately, without waiting for the query to be cancelled. You can monitor the query status through the SQL_QUERIES system view to make sure it was cancelled.
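For illustration, a query identifier can be looked up in the SQL_QUERIES system view and then passed to KILL QUERY. This sketch assumes the system views are queried under the SYSTEM schema and uses a placeholder identifier:
/* Look up the identifier of the query to cancel */
SELECT * FROM SYSTEM.SQL_QUERIES;

/* Cancel the query using the retrieved identifier (placeholder value) */
KILL QUERY 'a1b2c3d4-0000-0000-0000-000000000000'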
KILL TRANSACTION
Cancels an active transaction.
Diagram( Terminal('KILL'), Terminal('TRANSACTION'), NonTerminal('transaction_id'), Optional(Terminal('NO WAIT')) )
Parameters
- transaction_id - the transaction identifier, which can be retrieved via the TRANSACTIONS system view.
- NO WAIT - if specified, the command returns control immediately, without waiting for the transaction to be cancelled. You can monitor the transaction status through the TRANSACTIONS system view to make sure it was cancelled.
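A similar sketch for transactions, again assuming the system views live in the SYSTEM schema and using a placeholder identifier; NO WAIT returns control immediately, so the view can be polled afterwards:
/* Cancel a transaction without waiting for the cancellation to complete */
KILL TRANSACTION 'a1b2c3d4-0000-0000-0000-000000000001' NO WAIT

/* Check that the transaction is no longer listed as active */
SELECT * FROM SYSTEM.TRANSACTIONS;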
KILL COMPUTE
Cancels a running compute job. When a job is canceled with the KILL
command, all parts of the job running on all other nodes are canceled too.
Diagram( Terminal('KILL'), Terminal('COMPUTE'), NonTerminal('job_id'), Optional(Terminal('NO WAIT')) )
Parameters
- job_id - the compute job identifier, which can be retrieved via the COMPUTE_JOBS system view.
- NO WAIT - if specified, the command returns control immediately, without waiting for the job to be cancelled. You can monitor the job status through the COMPUTE_JOBS system view to make sure it was cancelled.
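And for compute jobs, a hedged sketch with the same assumptions (SYSTEM schema, placeholder identifier):
/* Look up the identifier of the compute job to cancel */
SELECT * FROM SYSTEM.COMPUTE_JOBS;

/* Cancel the job using the retrieved identifier */
KILL COMPUTE 'a1b2c3d4-0000-0000-0000-000000000002'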