2018-08-30 (GCS 1.9.6, BQ 0.13.6)

@medb released this 31 Aug 01:06
· 922 commits to master since this release

Changelog

Cloud Storage connector:

  1. Change default values for GCS batch/directory operations properties to improve performance:

    fs.gs.copy.max.requests.per.batch (default: 1 -> 15)
    fs.gs.copy.batch.threads (default: 50 -> 15)
    fs.gs.max.requests.per.batch (default: 25 -> 15)
    fs.gs.batch.threads (default: 25 -> 15)
    
  2. Migrate logging to Google Flogger.

    To configure Log4j as the Flogger backend, set the flogger.backend_factory system property to com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance, or to com.google.cloud.hadoop.repackaged.gcs.com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance if you use the shaded jar.

    For example:

    java -Dflogger.backend_factory=com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance ...
    
  3. Delete read buffer in GoogleHadoopFSInputStream class and remove property that enables it:

    fs.gs.inputstream.internalbuffer.enable (default: false)
    
  4. Disable read buffer in GoogleCloudStorageReadChannel by default because it does not provide significant performance benefits:

    fs.gs.io.buffersize (default: 8388608 -> 0)
    
  5. Add configuration properties for buffers in GoogleHadoopOutputStream:

    fs.gs.outputstream.buffer.size (default: 8388608)
    fs.gs.outputstream.pipe.buffer.size (default: 1048576)
    
  6. Deprecate properties and replace them with new ones:

    fs.gs.io.buffersize -> fs.gs.inputstream.buffer.size (default: 0)
    fs.gs.io.buffersize.write -> fs.gs.outputstream.upload.chunk.size (default: 67108864)
    
  7. Enable fadvise AUTO mode by default:

    fs.gs.inputstream.fadvise (default: SEQUENTIAL -> AUTO)
    
  8. Update all dependencies to latest versions.
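
The property removal in item 3 and the renames in item 6 can be applied to an existing set of settings mechanically. The sketch below is a hypothetical helper, not part of the connector: the class name GcsPropertyMigration and the method migrate are made up for illustration, and it works on a plain Map rather than a Hadoop Configuration so that it runs without extra dependencies. It assumes that an explicitly set new-style key should win over a value carried by a deprecated key.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the connector) that rewrites pre-1.9.6
// property names to the names listed in this changelog.
public class GcsPropertyMigration {
    // Deprecated key -> replacement key, taken from item 6.
    static final Map<String, String> RENAMED = Map.of(
        "fs.gs.io.buffersize", "fs.gs.inputstream.buffer.size",
        "fs.gs.io.buffersize.write", "fs.gs.outputstream.upload.chunk.size");

    // Property removed in item 3; it has no replacement.
    static final String REMOVED = "fs.gs.inputstream.internalbuffer.enable";

    /** Returns a copy of settings with deprecated keys renamed and the removed key dropped. */
    static Map<String, String> migrate(Map<String, String> settings) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : settings.entrySet()) {
            String key = entry.getKey();
            if (key.equals(REMOVED)) {
                continue; // nothing to map this key to
            }
            String newKey = RENAMED.getOrDefault(key, key);
            // Assumption: if the new-style key is also set explicitly, it wins,
            // so the value under the deprecated key is skipped.
            if (!newKey.equals(key) && settings.containsKey(newKey)) {
                continue;
            }
            result.put(newKey, entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> old = new LinkedHashMap<>();
        old.put("fs.gs.io.buffersize", "8388608");
        old.put("fs.gs.inputstream.internalbuffer.enable", "true");
        old.put("fs.gs.batch.threads", "25");
        System.out.println(migrate(old));
    }
}
```

Keys that this release neither renames nor removes, such as fs.gs.batch.threads, pass through unchanged, so explicitly set values keep overriding the new defaults from item 1.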

BigQuery connector:

  1. POM updates for GCS connector 1.9.6.

  2. Migrate logging to Google Flogger.

    To configure Log4j as the Flogger backend, set the flogger.backend_factory system property to com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance, or to com.google.cloud.hadoop.repackaged.bigquery.com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance if you use the shaded jar.

    For example:

    java -Dflogger.backend_factory=com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance ...
    
  3. Poll BQ jobs in their correct locations.

  4. Update all dependencies to latest versions.
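
When the JVM command line cannot be changed, the flogger.backend_factory property described for both connectors could instead be set from code. The following is a minimal sketch under two assumptions: that Flogger reads the property when it first initializes, so the call has to happen before any connector class logs, and that a value already passed with -D should be left alone. The class FloggerBackendSetup and its methods are hypothetical names; only the property name and the factory strings come from the notes above.

```java
// Hypothetical helper for selecting the Log4j backend programmatically.
public class FloggerBackendSetup {
    static final String PROPERTY = "flogger.backend_factory";

    // Factory reference for the unshaded jar, as given in the changelog.
    static final String UNSHADED =
        "com.google.common.flogger.backend.log4j.Log4jBackendFactory#getInstance";

    /**
     * Builds the factory reference. Pass null or "" for the unshaded jar, or a
     * repackaging prefix such as "com.google.cloud.hadoop.repackaged.gcs" or
     * "com.google.cloud.hadoop.repackaged.bigquery" for a shaded jar.
     */
    static String factoryFor(String repackagePrefix) {
        if (repackagePrefix == null || repackagePrefix.isEmpty()) {
            return UNSHADED;
        }
        return repackagePrefix + "." + UNSHADED;
    }

    /** Sets the property unless it is already set; returns the effective value. */
    static String configure(String repackagePrefix) {
        String existing = System.getProperty(PROPERTY);
        if (existing != null) {
            return existing; // keep a value supplied with -D on the command line
        }
        String value = factoryFor(repackagePrefix);
        System.setProperty(PROPERTY, value);
        return value;
    }

    public static void main(String[] args) {
        // Assumed to run before any class that logs through Flogger is loaded.
        System.out.println(configure("com.google.cloud.hadoop.repackaged.bigquery"));
    }
}
```

Passing the prefix as a parameter keeps one helper usable for the unshaded jar and for either shaded connector jar.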