Streaming Bulk Import
Users can import Avro or CSV files through port 9091 using the BigObject streaming-in protocol.
- Import Avro file
To import an Avro file, use the following command (the /dev/tcp redirection is a bash feature):
cat my_data.avro > /dev/tcp/127.0.0.1/9091
or, with netcat:
cat my_data.avro | nc 127.0.0.1 9091
Note
An Avro file is self-describing: it carries the table schema and the table name. BigObject creates a table from the Avro metadata. If the table already exists, the Avro content is appended to it.
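Because the table is created from the file's own metadata, it can help to confirm that a file really is an Avro object container before streaming it. The sketch below checks the 4-byte magic that the Avro specification puts at the start of every container file ("Obj" followed by byte 0x01). The function names is_avro_container and bo_import_avro are hypothetical helpers, not part of BigObject, and the host and port defaults simply mirror the commands above.

```shell
# Succeed if the file starts with the Avro container magic: "Obj" + 0x01.
is_avro_container() {
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "4f626a01" ]
}

# Hypothetical helper: stream an Avro file to BigObject via netcat.
# Host and port default to the values used in this document.
bo_import_avro() {
  file=$1
  host=${2:-127.0.0.1}
  port=${3:-9091}
  if ! is_avro_container "$file"; then
    echo "not an Avro container file: $file" >&2
    return 1
  fi
  nc "$host" "$port" < "$file"
}
```

Usage would look like: bo_import_avro my_data.avro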
- Import CSV file
To import a CSV file, use the following command:
(echo -e "csv\x01my_data_tbl" ; cat my_data.csv) > /dev/tcp/127.0.0.1/9091
or, with netcat:
(echo -e "csv\x01my_data_tbl" ; cat my_data.csv) | nc 127.0.0.1 9091
where my_data_tbl is an existing BigObject table for my_data.csv.
With BSD netcat, add the -N option so that the socket is shut down once the input reaches end of file:
(echo -e "csv\x01my_data_tbl" ; cat my_data.csv) | nc -N 127.0.0.1 9091
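The CSV commands above all send one header line (the text csv, the byte 0x01, the table name, and a newline) followed by the file contents. A minimal sketch of that framing, using printf with an octal escape instead of echo -e so it behaves the same across shells, might look like this. The function names bo_csv_header and bo_csv_stream are hypothetical, chosen only for illustration.

```shell
# Emit the stream header: "csv", byte 0x01, table name, newline.
# printf's octal escape \001 produces the same byte as echo -e "\x01".
bo_csv_header() {
  printf 'csv\001%s\n' "$1"
}

# Hypothetical helper: write header plus file contents to stdout,
# ready to be piped into nc or redirected to /dev/tcp.
bo_csv_stream() {
  bo_csv_header "$1"
  cat "$2"
}
```

Usage would look like: bo_csv_stream my_data_tbl my_data.csv | nc 127.0.0.1 9091

Piping the header through od -c is a quick way to confirm the bytes before sending anything to the server.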
To skip lines at the start of the CSV file, for example a header row, add a skip_lines setting after the table name:
(echo -e "csv\x01my_data_tbl\x0skip_lines=10" ; cat my_data.csv) | nc 127.0.0.1 9091
The command above skips the first 10 lines of my_data.csv.
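As an alternative that does not rely on skip_lines, the leading lines can be dropped on the client side before the data is sent. The sketch below uses the standard tail utility for that; it is a different technique from the server-side option described above, and the function name csv_skip_lines is hypothetical.

```shell
# Hypothetical helper: print a file without its first N lines.
# tail -n +K starts output at line K, so skipping N lines means K = N + 1.
csv_skip_lines() {
  tail -n "+$(( $2 + 1 ))" "$1"
}
```

Usage would look like: (printf 'csv\001my_data_tbl\n' ; csv_skip_lines my_data.csv 10) | nc 127.0.0.1 9091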