这是indexloc提供的服务,不要输入任何密码
Skip to content

Conversation

@ApproximateIdentity
Copy link
Contributor

The method was being computed each time it was called, but all data used
to do the computation is constant throughout the life of the class.

The method was being computed each time it was called, but all data used
to do the computation is constant throughout the life of the class.
@ApproximateIdentity
Copy link
Contributor Author

By the way this isn't a minor issue. This speeds up my calls to the predict method https://github.com/ApproximateIdentity/rosetta/blob/4aa21a62719be5bb32f17ea1e7454742365257c2/rosetta/text/vw_helpers.py#L471 by ~15 times on real world data.

@dkrasner
Copy link
Contributor

@ApproximateIdentity this looks fine

did you run an integration test just for the sake of sanity?

@ApproximateIdentity
Copy link
Contributor Author

ApproximateIdentity commented Sep 22, 2017

First regarding your question:

So actually I didn't and there is a new failure, but I think it has more to do with how the test is written than anything. This test fails:

https://github.com/columbia-applied-data-science/rosetta/blob/master/rosetta/tests/test_text.py#L326-L334

But really what I'm wondering is, should the following even be allowed: https://github.com/columbia-applied-data-science/rosetta/blob/master/rosetta/tests/test_text.py#L330-L331

I.e. should there ever be a situation when the user sets the _lambda_word_sums attribute directly? That attribute is dependent on the topic data passed in: https://github.com/ApproximateIdentity/rosetta/blob/constExpElogbeta/rosetta/text/vw_helpers.py#L321

So basically I feel like the test here itself is using an internal API in a way that's inconsistent with the class. Personally I would just strip out the test entirely, but I'm not sure if that would be a popular way to go.

Secondly I pushed a bunch more commits because there are different tests that aren't working right. My fixes don't fix all tests...there are still issues with a couple tests, but I haven't quite sorted out why they're failing yet.

edit: Actually the more I mess with this it seems like it's almost easiest to just pin pandas<=0.16 or whatever in requirements.

@ApproximateIdentity ApproximateIdentity force-pushed the constExpElogbeta branch 2 times, most recently from fd8d7c8 to a574a0e Compare October 2, 2017 10:34
@ApproximateIdentity
Copy link
Contributor Author

I removed my commits that were updating tests for the newer versions of pandas and instead pinned the pandas version to 0.16.2 in the requirements file. I've verified that the tests all pass with that version (except for the test that I removed).

Personally I think this is now ready to be merged.

There have been backwards-incompatible changes in later versions of
pandas for which rosetta has not been updated. Until rosetta is updated,
the version should be pinned to reflect this.
This test was using unexposed internal class methods in a way which was
incompatible with legitimate usage. It worked before due to how the
internals of the class were desiged, but no longer due to the fact that
the LDAResults._expElogbeta() method is const.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants