Description
Connector Name
destination-bigquery
Connector Version
3.0.1
What step the error happened?
During the sync
Relevant information
The data contains an ASCII 0 (NUL) character. I've been able to load this same data for the past 3 weeks using the latest 2.x connector version, but after upgrading all dependencies (Airbyte to 1.7.1, the BigQuery connector to 3.0.1) the full refresh now fails. I'm currently trying the BigQuery Write API instead of staging, but I don't think it will work.
The BigQuery docs say that, by default, you can't load CSV files that contain the ASCII 0 character:
https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-csv#:~:text=Note%3A%20By%20default%2C%20if%20the%20CSV%20file%20contains%20the%20ASCII%200%20(NULL)%20character%2C%20you%20can%27t%20load%20the%20data%20into%20BigQuery.%20If%20you%20want%20to%20allow%20ASCII%200%20and%20other%20ASCII%20control%20characters%2C%20then%20set%20%2D%2Dpreserve_ascii_control_characters%3Dtrue%20to%20your%20load%20jobs.
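Per the docs linked above, the load job rejects ASCII 0 unless `--preserve_ascii_control_characters=true` is set, and the job configuration in the log below shows `preserveAsciiControlCharacters=null`. To confirm which rows are affected, or to clean values before they reach the destination, something like the following might help. This is only a rough sketch using the Python standard library: `find_nul_lines` and `strip_nul` are hypothetical helper names, not part of Airbyte or the BigQuery client, and it assumes a local copy of the staged `.csv.gz` file. Physical line numbers can differ from the `line_number` BigQuery reports when fields contain quoted newlines.

```python
import gzip

NUL_BYTE = b"\x00"
NUL_CHAR = "\x00"


def find_nul_lines(path):
    """Return (physical_line_number, nul_count) for every line of a
    gzipped file that contains at least one ASCII 0 byte."""
    hits = []
    # Read in binary mode so no decoding step can hide or reject the NUL bytes.
    with gzip.open(path, "rb") as handle:
        for line_number, line in enumerate(handle, start=1):
            count = line.count(NUL_BYTE)
            if count:
                hits.append((line_number, count))
    return hits


def strip_nul(value):
    """Recursively remove ASCII 0 from every string inside nested
    dicts and lists; other values are returned unchanged."""
    if isinstance(value, str):
        return value.replace(NUL_CHAR, "")
    if isinstance(value, dict):
        return {strip_nul(key): strip_nul(item) for key, item in value.items()}
    if isinstance(value, list):
        return [strip_nul(item) for item in value]
    return value
```

Stripping the character changes the stored data, so it is only an option if the NUL bytes carry no meaning in the source; otherwise the load job would need to preserve control characters as the docs describe.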
Relevant log output
2025-06-28 09:53:59 replication-orchestrator INFO Failures: [ {
"failureOrigin" : "destination",
"failureType" : "system_error",
"internalMessage" : "java.lang.RuntimeException: Failed to load CSV data from gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz to table airbyte_internal.airbyte_catco_dbrecords8360cc3a9463aa60e3102abea0255eb2",
"externalMessage" : "Failed to load CSV data from gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz to table airbyte_internal.airbyte_catco_dbrecords8360cc3a9463aa60e3102abea0255eb2",
"metadata" : {
"attemptNumber" : 4,
"jobId" : 9,
"from_trace_message" : true,
"connector_command" : "write"
},
"stacktrace" : "java.lang.RuntimeException: Failed to load CSV data from gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz to table airbyte_internal.airbyte_catco_dbrecords8360cc3a9463aa60e3102abea0255eb2\n\tat io.airbyte.integrations.destination.bigquery.write.bulk_loader.BigQueryBulkLoader.load(BigQueryBulkLoader.kt:64)\n\tat io.airbyte.integrations.destination.bigquery.write.bulk_loader.BigQueryBulkLoader.load(BigQueryBulkLoader.kt:33)\n\tat io.airbyte.cdk.load.pipeline.db.BulkLoaderTableLoader.accept(BulkLoaderTableLoader.kt:51)\n\tat io.airbyte.cdk.load.pipeline.db.BulkLoaderTableLoader.accept(BulkLoaderTableLoader.kt:23)\n\tat io.airbyte.cdk.load.task.internal.LoadPipelineStepTask$execute$$inlined$fold$1.emit(Reduce.kt:225)\n\tat kotlinx.coroutines.flow.FlowKt__ChannelsKt.emitAllImpl$FlowKt__ChannelsKt(Channels.kt:33)\n\tat kotlinx.coroutines.flow.FlowKt__ChannelsKt.access$emitAllImpl$FlowKt__ChannelsKt(Channels.kt:1)\n\tat kotlinx.coroutines.flow.FlowKt__ChannelsKt$emitAllImpl$1.invokeSuspend(Channels.kt)\n\tat kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)\n\tat kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:100)\n\tat kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:124)\n\tat kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:89)\n\tat kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:586)\n\tat kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:820)\n\tat kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:717)\n\tat kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:704)\nCaused by: com.google.cloud.bigquery.BigQueryException: An error occurred during execution of job: Job{job=JobId{project=cat-copwa, 
job=63e9d279-b113-413a-b92b-4cddc25fe29e, location=europe-west1}, status=JobStatus{state=RUNNING, error=null, executionErrors=null}, statistics=LoadStatistics{creationTime=1751093530755, endTime=null, startTime=1751093530839, numChildJobs=null, parentJobId=null, scriptStatistics=null, reservationUsage=null, transactionInfo=null, sessionInfo=null, totalSlotMs=null, inputBytes=null, inputFiles=null, outputBytes=null, outputRows=null, badRecords=null}, userEmail=airbyte-bq-write@cat-copwa.iam.gserviceaccount.com, etag=MnI4dyfPavAXOJiLHwqmsA==, generatedId=cat-copwa:europe-west1.63e9d279-b113-413a-b92b-4cddc25fe29e, selfLink=https://bigquery.googleapis.com/bigquery/v2/projects/cat-copwa/jobs/63e9d279-b113-413a-b92b-4cddc25fe29e?location=europe-west1, configuration=LoadJobConfiguration{type=LOAD, destinationTable=GenericData{classInfo=[datasetId, projectId, tableId], {datasetId=airbyte_internal, projectId=cat-copwa, tableId=airbyte_catco_dbrecords8360cc3a9463aa60e3102abea0255eb2}}, decimalTargetTypes=null, destinationEncryptionConfiguration=null, createDisposition=null, writeDisposition=WRITE_APPEND, formatOptions=CsvOptions{type=CSV, allowJaggedRows=true, allowQuotedNewLines=true, encoding=null, fieldDelimiter=null, nullMarker=null, quote=null, skipLeadingRows=1, preserveAsciiControlCharacters=null}, nullMarker=\\N, maxBadRecords=null, schema=Schema{fields=[Field{name=_airbyte_raw_id, type=STRING, mode=REQUIRED, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=_airbyte_extracted_at, type=TIMESTAMP, mode=REQUIRED, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=_airbyte_meta, type=JSON, mode=REQUIRED, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, 
Field{name=_airbyte_generation_id, type=INTEGER, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=__v, type=NUMERIC, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=_id, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=notes, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=stage, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=state, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=images, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=labels, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=damaged, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=actionAt, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=closedAt, type=STRING, mode=null, description=null, policyTags=null, 
maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=location, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=lockedBy, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=metadata, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=reviewAt, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=assetType, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=createdAt, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=createdBy, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=decisions, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=deletedBy, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=flaggedAt, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, 
defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=reference, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=responses, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=startedAt, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=updatedAt, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=updatedBy, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=workspace, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=attributes, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=lastAction, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=prevention, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=reasonCode, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, 
rangeElementType=null}, Field{name=recordType, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=subscribed, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=accessGroup, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=lockedUntil, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=submittedAt, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=locationName, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=reviewStatus, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=stageChanges, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=linkedRecords, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=rulesActioned, type=JSON, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, 
Field{name=userReference, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=_ab_cdc_cursor, type=INTEGER, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=clientResponse, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=timeToDecision, type=NUMERIC, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=submittedByRole, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=submittedByEmail, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=_ab_cdc_deleted_at, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}, Field{name=_ab_cdc_updated_at, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null, rangeElementType=null}]}, ignoreUnknownValue=null, sourceUris=[gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz], fileSetSpecType=null, columnNameCharacterMap=null, schemaUpdateOptions=null, autodetect=null, timePartitioning=null, clustering=null, useAvroLogicalTypes=null, labels=null, jobTimeoutMs=600000, 
rangePartitioning=null, hivePartitioningOptions=null, referenceFileSchemaUri=null, connectionProperties=null, createSession=null}}, \n For more details see Big Query Error collection: BigQueryError{reason=invalid, location=null, message=Error while reading data, error message: CSV processing encountered too many errors, giving up. Rows: 1095765; errors: 5; max bad: 0; error percent: 0},\n BigQueryError{reason=invalid, location=gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz, message=Error while reading data, error message: Bad character (ASCII 0) encountered.; line_number: 90969 byte_offset_to_start_of_line: 172499870 column_index: 24 column_name: \"reference\" column_type: STRING value: \"Pen Test ref 6756...\" File: gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz},\n BigQueryError{reason=invalid, location=gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz, message=Error while reading data, error message: Bad character (ASCII 0) encountered.; line_number: 91402 byte_offset_to_start_of_line: 173242187 column_index: 39 column_name: \"locationName\" column_type: STRING value: \"..\\\\..\\\\..\\\\..\\\\..\\\\.....\" File: gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz},\n BigQueryError{reason=invalid, location=gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz, message=Error while reading data, error message: Bad character (ASCII 0) encountered.; line_number: 91408 byte_offset_to_start_of_line: 173250722 column_index: 39 column_name: 
\"locationName\" column_type: STRING value: \"../../../../../.....\" File: gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz},\n BigQueryError{reason=invalid, location=gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz, message=Error while reading data, error message: Bad character (ASCII 0) encountered.; line_number: 91963 byte_offset_to_start_of_line: 174365931 column_index: 39 column_name: \"locationName\" column_type: STRING value: \"..\\\\..\\\\..\\\\..\\\\..\\\\.....\" File: gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz},\n BigQueryError{reason=invalid, location=gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz, message=Error while reading data, error message: Bad character (ASCII 0) encountered.; line_number: 91969 byte_offset_to_start_of_line: 174373909 column_index: 39 column_name: \"locationName\" column_type: STRING value: \"../../../../../.....\" File: gs://cat-co-airbyte-bq-staging/sync/airbyte_mdb_euw1_cat-co_db/entries/2025/06/28/06/4acb80c2-c1fe-45d9-aa93-60335213444c2025_06_28_1751093320392_0.csv.gz}:\n\tat io.airbyte.integrations.destination.bigquery.BigQueryUtils.waitForJobFinish(BigQueryUtils.kt:223)\n\tat io.airbyte.integrations.destination.bigquery.write.bulk_loader.BigQueryBulkLoader.load(BigQueryBulkLoader.kt:62)\n\t... 15 more\nCaused by: com.google.cloud.bigquery.BigQueryException: Error while reading data, error message: CSV processing encountered too many errors, giving up. 
Rows: 1095765; errors: 5; max bad: 0; error percent: 0\n\tat com.google.cloud.bigquery.Job.reload(Job.java:471)\n\tat com.google.cloud.bigquery.Job.waitForInternal(Job.java:290)\n\tat com.google.cloud.bigquery.Job.waitFor(Job.java:202)\n\tat io.airbyte.integrations.destination.bigquery.BigQueryUtils.waitForJobFinish(BigQueryUtils.kt:195)\n\t... 16 more\n",
"timestamp" : 1751093637153
}
Contribute
- Yes, I want to contribute