2018-08-30 (GCS 1.9.6, BQ 0.13.6)
Changelog
Cloud Storage connector:
-
Change default values for GCS batch/directory operations properties to improve performance:
fs.gs.copy.max.requests.per.batch (default: 1 -> 15)
fs.gs.copy.batch.threads (default: 50 -> 15)
fs.gs.max.requests.per.batch (default: 25 -> 15)
fs.gs.batch.threads (default: 25 -> 15)
-
Migrate logging to Google Flogger.
To configure Log4j as a Flogger backend, set the flogger.backend_factory system property to
com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance
or, if using the shaded jar, to
com.google.cloud.hadoop.repackaged.gcs.com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance
For example:
java -Dflogger.backend_factory=com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance ...
-
Delete read buffer in GoogleHadoopFSInputStream class and remove the property that enables it:
fs.gs.inputstream.internalbuffer.enable (default: false)
-
Disable read buffer in GoogleCloudStorageReadChannel by default because it does not provide significant performance benefits:
fs.gs.io.buffersize (default: 8388608 -> 0)
-
Add configuration properties for buffers in GoogleHadoopOutputStream:
fs.gs.outputstream.buffer.size (default: 8388608)
fs.gs.outputstream.pipe.buffer.size (default: 1048576)
-
Deprecate properties and replace them with new ones:
fs.gs.io.buffersize -> fs.gs.inputstream.buffer.size (default: 0)
fs.gs.io.buffersize.write -> fs.gs.outputstream.upload.chunk.size (default: 67108864)
-
Enable fadvise AUTO mode by default:
fs.gs.inputstream.fadvise (default: SEQUENTIAL -> AUTO)
-
Update all dependencies to latest versions.
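For configurations maintained by hand, the two property renames in this release can be applied mechanically. The following is a minimal sketch only: GcsPropertyMigration and its methods are hypothetical names that are not part of the connector, and a plain Map stands in for a Hadoop Configuration. It assumes that an explicitly set new key should take precedence over a deprecated one.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the GCS connector. */
final class GcsPropertyMigration {

  // Deprecated key -> replacement key, as listed in this changelog entry.
  static final Map<String, String> RENAMED = new LinkedHashMap<>();

  static {
    RENAMED.put("fs.gs.io.buffersize", "fs.gs.inputstream.buffer.size");
    RENAMED.put("fs.gs.io.buffersize.write", "fs.gs.outputstream.upload.chunk.size");
  }

  /**
   * Returns a copy of the settings with deprecated keys renamed.
   * Assumption: if the new key is already set, its value is kept.
   */
  static Map<String, String> migrate(Map<String, String> settings) {
    Map<String, String> result = new LinkedHashMap<>(settings);
    for (Map.Entry<String, String> rename : RENAMED.entrySet()) {
      String oldValue = result.remove(rename.getKey());
      if (oldValue != null) {
        result.putIfAbsent(rename.getValue(), oldValue);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> settings = new LinkedHashMap<>();
    settings.put("fs.gs.io.buffersize", "0");
    settings.put("fs.gs.io.buffersize.write", "67108864");
    System.out.println(migrate(settings));
  }
}
```

Keys not listed in RENAMED pass through unchanged, so the helper can be run over a full set of settings.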
BigQuery connector:
-
POM updates for GCS connector 1.9.6.
-
Migrate logging to Google Flogger.
To configure Log4j as a Flogger backend, set the flogger.backend_factory system property to
com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance
or, if using the shaded jar, to
com.google.cloud.hadoop.repackaged.bigquery.com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance
For example:
java -Dflogger.backend_factory=com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance ...
-
Poll BQ jobs in their correct locations.
-
Update all dependencies to latest versions.
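Where passing -D on the command line is not practical, the same system property can be set from code. This is a sketch under assumptions: FloggerBackendConfig is a hypothetical helper that is not part of either connector, and the property is assumed to take effect only if it is set before the first Flogger logger is initialized. The relocation prefixes are the ones given in the entries above.

```java
/** Hypothetical helper for illustration; not part of the GCS or BigQuery connector. */
final class FloggerBackendConfig {

  static final String PROPERTY = "flogger.backend_factory";

  // Unshaded factory reference, as given in this changelog.
  static final String LOG4J_FACTORY =
      "com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance";

  /**
   * Builds the factory reference for a given relocation prefix, for example
   * "com.google.cloud.hadoop.repackaged.bigquery." for the shaded BigQuery jar,
   * or "" for the unshaded jar.
   */
  static String factoryFor(String relocationPrefix) {
    return relocationPrefix + LOG4J_FACTORY;
  }

  /** Sets the system property; assumed to be called before any logging happens. */
  static void configure(String relocationPrefix) {
    System.setProperty(PROPERTY, factoryFor(relocationPrefix));
  }

  public static void main(String[] args) {
    configure("com.google.cloud.hadoop.repackaged.bigquery.");
    System.out.println(System.getProperty(PROPERTY));
  }
}
```

Calling configure("") selects the unshaded factory, which matches the command-line example above.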