This project is a collection of experiments and utilities for topic extraction, including:
- Comparing popular topic extraction libraries on different kinds of documents
- A simple implementation of decision trees to explain group membership for clustered texts
- Some additional utilities for explaining decision tree rules based on bag of word embedding features
- Plotting utilities for derived clusters and/or their assigned topics
- The results report outlining approach and findings
- The outputs folder, which contains breakdowns of dataset topics, topic evaluations, and tests of group differences
- Decision tree topic extraction implementation
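
To illustrate the idea of using a decision tree to explain cluster membership, here is a minimal sketch of a depth-1 tree (a decision stump) over word-presence features. It is not this project's implementation: the function names `gini` and `best_word_split`, the toy documents, and the cluster labels are all hypothetical, and whitespace tokenization is assumed for simplicity.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of cluster labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_word_split(docs, labels):
    """Find the single word whose presence/absence best separates the labels.

    Returns (word, weighted_impurity). The word is None if no split
    improves on the impurity of the unsplit data.
    """
    # Hypothetical tokenization: lowercase and split on whitespace.
    tokenized = [set(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*tokenized))
    n = len(docs)
    best = (None, gini(labels))
    for word in vocab:
        present = [lab for toks, lab in zip(tokenized, labels) if word in toks]
        absent = [lab for toks, lab in zip(tokenized, labels) if word not in toks]
        # Weighted average impurity of the two child nodes.
        score = (len(present) * gini(present) + len(absent) * gini(absent)) / n
        if score < best[1]:
            best = (word, score)
    return best


# Toy data (made up for illustration): two clusters of two documents each.
docs = [
    "the cat sat on the mat",
    "a cat chased the mouse",
    "stocks fell on the market",
    "the market rallied today",
]
labels = [0, 0, 1, 1]

word, impurity = best_word_split(docs, labels)
print(f"rule: contains '{word}' (weighted Gini = {impurity:.2f})")
```

The selected word reads as a human-interpretable rule for why a document falls into one cluster rather than another. A full tree would apply the same search recursively to each child node.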