My VW data has this format:
| this is great
| I try to learn English everyday
[...]
It is saved as data.vw.
I try to run this code:
from rosetta.text.vw_helpers import LDAResults
from rosetta.text.text_processors import SFileFilter, VWFormatter


def generate_filefilter():
    sff = SFileFilter(VWFormatter())
    sff.load_sfile('data.lda.vw')
    df = sff.to_frame()
    df.head()
    df.describe()
    sff.filter_extremes(doc_freq_min=5, doc_fraction_max=0.8)
    sff.compactify()
    sff.save('sff_file.pkl')


if __name__ == '__main__':
    generate_filefilter()
And the error is:
Traceback (most recent call last):
  File "/<home>/.venv/lib/python2.7/site-packages/rosetta/text/text_processors.py", line 380, in _parse_preamble
    if preamble[-1] != ' ':
IndexError: string index out of range
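
The failing check can probably be reproduced outside rosetta. The sketch below assumes that the preamble is whatever text precedes the first `|` on a line; that is a guess, since only the `preamble[-1] != ' '` line is confirmed by the traceback. The helpers `split_preamble`, `check_preamble`, and `safe_check_preamble` are hypothetical names for illustration and are not part of rosetta.

```python
def split_preamble(line):
    """Hypothetical helper: return the text before the first '|' in a line."""
    preamble, _, _ = line.partition('|')
    return preamble


def check_preamble(preamble):
    # Same comparison as the line shown in the traceback.
    # Indexing an empty string with [-1] raises IndexError.
    return preamble[-1] == ' '


def safe_check_preamble(preamble):
    # str.endswith returns False on an empty string instead of raising.
    return preamble.endswith(' ')


line = "| this is great"          # a line from the data above
preamble = split_preamble(line)   # nothing precedes the '|', so this is ''

try:
    check_preamble(preamble)
    raised = False
except IndexError:
    raised = True

print(repr(preamble), raised)
```

If this assumption holds, every line that starts directly with `|` yields an empty preamble, which would explain the `IndexError`.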