Sometimes it's necessary to write Hive query results to gzip files to reduce their size, so that the files can be transferred over the network more easily.
To do this, run the following commands in Hive before running the query. They set the required compression options:
set mapred.output.compress=true;
set hive.exec.compress.output=true;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
set io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec;
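On Hadoop 2 and later, the mapred.* property names used above are deprecated aliases. As a minimal sketch, assuming a Hadoop 2+ cluster, the same output compression can be requested with the newer property names (this variant is an illustration, not part of the original recipe):

```sql
-- Tell Hive to compress the final query output
set hive.exec.compress.output=true;
-- Newer name for mapred.output.compress
set mapreduce.output.fileoutputformat.compress=true;
-- Newer name for mapred.output.compression.codec
set mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec;
```

The older names are generally still accepted, so either form should work; the newer ones simply avoid deprecation warnings in the logs.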
Now, if you run a Hive query, its output will be stored in gzip files. For example:
INSERT OVERWRITE DIRECTORY 'hive_out' SELECT * FROM table_name w;
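To check that compression took effect, the settings and the output directory can be inspected from the same Hive session. This is a rough sketch, assuming the relative path 'hive_out' from the query above and a session where the dfs command is available:

```sql
-- Print the current values to confirm the options are set in this session
set hive.exec.compress.output;
set mapred.output.compression.codec;
-- List the output files; with gzip enabled their names should end in .gz
dfs -ls hive_out;
```

If the listed files do not have a .gz extension, the set commands were most likely run in a different session than the query.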