See below: an actual end user asked about this use case on the Spark user mailing list.
https://lists.apache.org/thread.html/68226abe4be5ec979eecc3e62e8c05b3036aa0876d555411436a35ab@%3Cuser.spark.apache.org%3E
The initial state can be either a set of specific values or calculated from batch data.
The issue is that Structured Streaming doesn't have a concept of "initial state", at least in HDFSBackedStateStore. If the requested version is 0, it doesn't load any state and starts from an empty one.
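To make the limitation concrete, here is a minimal toy model of a versioned state store. It is only a sketch: `ToyVersionedStateStore` and its methods are hypothetical names, not Spark's API, and it assumes (as described above) that version 0 means "before the first batch" and is never read from storage.

```python
from typing import Dict


class ToyVersionedStateStore:
    """Toy stand-in for a versioned state store; NOT Spark's actual code."""

    def __init__(self) -> None:
        # version -> full snapshot of the state committed as that version
        self._snapshots: Dict[int, Dict[str, int]] = {}

    def load(self, version: int) -> Dict[str, int]:
        if version < 0:
            raise ValueError("version must be >= 0")
        if version == 0:
            # Version 0 means "before the first batch": nothing is read,
            # so there is no place to plug in an initial state.
            return {}
        if version not in self._snapshots:
            raise KeyError(f"no state committed for version {version}")
        return dict(self._snapshots[version])

    def commit(self, version: int, state: Dict[str, int]) -> None:
        if version < 1:
            raise ValueError("committed versions start at 1")
        self._snapshots[version] = dict(state)


store = ToyVersionedStateStore()
first_batch_input = store.load(0)      # always empty, whatever we want to seed
first_batch_input["a"] = 1             # update made while processing batch 0
store.commit(1, first_batch_input)     # state after batch 0 becomes version 1
second_batch_input = store.load(1)     # batch 1 starts from version 1
```

In this model the only way to get a non-empty starting point is to have some version >= 1 already committed before the query runs.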
So we should either propose a change to Spark that allows an initial state for version 0, or create a new checkpoint that contains a "batch 0" with empty offset/commit entries but with the initial state.
We could try the latter first (it is less coupled with upstream) and propose a change upstream if it doesn't work.
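As a rough sketch of what the second option would have to produce, the helper below lists the files a pre-built "batch 0" checkpoint might contain. The layout (`offsets/<batchId>`, `commits/<batchId>`, `state/<operatorId>/<partitionId>/<version>.delta`) is my understanding of the HDFS-backed provider and should be verified against the target Spark version; `batch_zero_checkpoint_paths` is a hypothetical name, and the file contents (and any other files such as checkpoint metadata) are deliberately not covered here.

```python
from pathlib import PurePosixPath
from typing import List


def batch_zero_checkpoint_paths(
    checkpoint_root: str, operator_id: int, num_partitions: int
) -> List[str]:
    """List the paths a hand-made "batch 0" checkpoint would need (assumed layout)."""
    root = PurePosixPath(checkpoint_root)
    # Batch 0 is recorded as planned (offsets) and as finished (commits).
    paths = [root / "offsets" / "0", root / "commits" / "0"]
    for partition in range(num_partitions):
        # Committing batch 0 yields state version 1, so the initial state
        # would be written as version 1 for every state partition.
        paths.append(root / "state" / str(operator_id) / str(partition) / "1.delta")
    return [str(p) for p in paths]


paths = batch_zero_checkpoint_paths("/tmp/ckpt", operator_id=0, num_partitions=2)
```

With such a checkpoint in place, the query would resume at batch 1 and load state version 1, which is why no change to the version 0 behavior is needed for this approach.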