这是indexloc提供的服务,不要输入任何密码
Skip to content
This repository was archived by the owner on May 17, 2021. It is now read-only.

Conversation

@watou
Copy link
Contributor

@watou watou commented Jul 17, 2015

This code change attempts to suppress rare cases where an attempt to cancel an update echo fails. An update echo is when the execute() method polls the Nest API and sends updates on the event bus for bound items. The binding's internalReceiveUpdate method, which is intended to handle updates from all other sources, consults a dynamic list of events to ignore and upon finding and removing a matching event, does not forward the update back to the Nest API. In rare cases and as yet unexplained, the echo cancellation fails and the binding sends the update back to the Nest API.

This code change employs two approaches to suppress this behaviour:

  1. Instead of using a string concantenation of the itemName and string version of the new State as previously, it places Update objects on the list, which are comprised of the itemName and the state. The Update class's equals method is used by the list's remove method to find and remove the first matching update. Previously String.equals was used, but appeared to be occasionally failing for as yet unknown reasons.
  2. When updates arrive for non-outbound items, a warning is logged and sending the update to Nest is skipped.

Note: This PR only attempts to work around the issue of unsuppressed echoes. My working theory, lacking other insight, is some kind of JVM bug. Please test this PR extensively before I merge it (and I will continue to test it), but hopefully some kind of fix or workaround can make it into openHAB 1.7.1.

@watou watou self-assigned this Jul 17, 2015
@watou watou added the bug label Jul 17, 2015
@mrguessed
Copy link
Contributor

I've been running the patch for ~4hs now, and there's no trace of issue in the logs. Prior to that, I'd get:

07:23:46.312 WARN  o.o.b.n.i.m.AbstractRequest[:147]- Method failed: HTTP/1.1 400 Bad Request
07:23:46.316 ERROR o.o.b.n.internal.NestBinding[:352]- Error updating data model: DataModelResponse[devices=<null>,structures=<null>]

every ~10-15 minutes.

In my environment, I only ever use sendCommand in my rules/scripts, and these are infrequently done (during Alarm arming/disarming events for the house). As a result, I'm not really testing the update codepath for anything other than suppressing echos.

@watou
Copy link
Contributor Author

watou commented Jul 19, 2015

Thanks @mrguessed! Please check again tomorrow if you can, just to make sure this resolves it for good.

Google Group discussion here.

@watou watou added this to the 1.7.1 milestone Jul 19, 2015
@mrguessed
Copy link
Contributor

Your patch is correct, and avoids the thread-safety issue... in openHAB Core.

As it turns it out, the issue is in DateTimeType.java, specifically the use of a shared/static SimpleDateFormat object, which is specifically listed as an issue in it's JavaDoc:
http://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html

It's an oldie, but a goodie, I remember running into that problem in the late 90's:

Date formats are not synchronized. It is recommended to create separate format instances for each thread. If multiple threads access a format concurrently, it must be synchronized externally.

Your new changes avoid the issue since you're no longer using toString() on your DateTimeType objects, so they're not handed a corrupted value string... which later fail comparison.

@watou
Copy link
Contributor Author

watou commented Jul 19, 2015

Criminy, what a nuisance to have a non-threadsafe toString() method. My patch should also run faster and create less garbage, so I'll merge it. Thanks very much, Mark.

watou added a commit that referenced this pull request Jul 19, 2015
Nest-Binding: Suppress occasionally failing update echo cancellations.
@watou watou merged commit ec48e28 into openhab:master Jul 19, 2015
@mrguessed
Copy link
Contributor

Yeah, it's not limited to it's DateTimeType.toString() method either, since the SimpleDateFormat objects are shared and used in parse() also.

I suspect if there's enough "DateTime" concurrency going on in an openHAB deployment that there will be little corruptions going on all over.

eg. multiple Bindings parsing, or logging, DateTimeType data at the same time.

I think it can be fixed by changing the const's into methods, of the same name, and having them return a new object for each call. Unfortunately, the const declarations in this class were made public so it makes the fix a little more sticky to fix since there may be other callers (including Rule code, potentially)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants