Monday, January 8, 2018

UI and Feedback Feature Updates

Significant improvements have been made to the UI. These changes include table formatting, content listing choices, a new page for a tutorial (the content of which will be written closer to the release), and style changes to the font and color of the page.

I decided to roll back the UI elements intended to gather feedback on individual entity results and classified statements. For the individual entity results (described on the UI as "people", "locations", and "other named subjects"), I had added two columns to the table with drop-down menus. One indicated whether an entry had been correctly chunked (e.g. a person entry contained both the first and last name and excluded any other tokens/words). The other provided feedback on the classification of the entry (e.g. whether an entry found in the person table was correctly categorized, or should have been tagged as a location or general named entity, or not categorized at all).

For the entire-phrase results, I used a simple radio button to indicate whether a phrase had been correctly identified as containing a quote/stat/date/time.

This decision to remove these feedback features was made for multiple reasons. My intentions for the data I would have been collecting weren't driven by a strong enough sense of direction. I began implementing the UI elements thinking that it would be helpful to capture feedback from the current Scaffold algorithm implementation in order to improve future versions. I began, however, to consider what specific role this data would play in significantly improving the accuracy of Scaffold's results going forward.

The research I did established that the best chance I had of improving the accuracy of my named entity chunking results was training my own chunker, as opposed to using the pre-trained chunker that NLTK provides. It's an option that I didn't have the experience for going into the project, but I would feel confident taking it on now, having acquired a significantly better understanding of NLP and ML concepts than I had when I began.
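To make "training my own" a bit more concrete, here is a minimal sketch of the idea in plain Python. It is not NLTK's training API and not Scaffold's code: the training sentences, the B/I/O labels, and the function names (train_chunker, chunk, extract_entities) are all hypothetical, and the model is deliberately crude. It simply learns, for each pair of part-of-speech tag and previous chunk tag, which chunk tag followed most often in hand-labeled examples.

```python
from collections import Counter, defaultdict

# Hypothetical hand-labeled data: (token, POS tag, IOB chunk tag).
# B = begins a named entity, I = inside one, O = outside any entity.
TRAIN = [
    [("Jane", "NNP", "B"), ("Doe", "NNP", "I"), ("visited", "VBD", "O"),
     ("Paris", "NNP", "B"), (".", ".", "O")],
    [("The", "DT", "O"), ("mayor", "NN", "O"), ("met", "VBD", "O"),
     ("John", "NNP", "B"), ("Smith", "NNP", "I"), (".", ".", "O")],
]


def train_chunker(tagged_sentences):
    """Map each (POS tag, previous IOB tag) context to its most frequent IOB tag."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        prev = "<START>"
        for _token, pos, iob in sentence:
            counts[(pos, prev)][iob] += 1
            prev = iob
    return {context: c.most_common(1)[0][0] for context, c in counts.items()}


def chunk(model, pos_tagged):
    """Predict an IOB tag per token; contexts never seen in training fall back to O."""
    prev = "<START>"
    tags = []
    for _token, pos in pos_tagged:
        iob = model.get((pos, prev), "O")
        tags.append(iob)
        prev = iob
    return tags


def extract_entities(pos_tagged, tags):
    """Join consecutive B/I tokens back into entity strings."""
    entities, current = [], []
    for (token, _pos), iob in zip(pos_tagged, tags):
        if iob == "B":
            if current:
                entities.append(" ".join(current))
            current = [token]
        elif iob == "I" and current:
            current.append(token)
        else:
            if current:
                entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities


model = train_chunker(TRAIN)
sentence = [("Yesterday", "NN"), ("Mary", "NNP"), ("Jones", "NNP"), ("left", "VBD")]
tags = chunk(model, sentence)
entities = extract_entities(sentence, tags)
```

A real chunker would need far more labeled text and richer features than two tags of context, which is exactly why the amount and quality of training data matters so much here.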


My conclusion from all of this was that I wasn't going to improve any of the end-product use cases by implementing the feedback-gathering features. On top of that, the feedback I was planning on gathering would not have had the volume or the quality needed to contribute to the algorithmic improvements it was intended to support. The hit to the release timeline couldn't be justified, so the feedback elements and all plans to hook them up to a database were shelved in good conscience.

My concession is that I'm planning to deploy with an AWS RDS MySQL database tied to the application in order to at least catch the raw text articles being input to Scaffold. This should open the door to building a training set and understanding how Scaffold is being used.
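As a rough sketch of what catching the raw input might look like, the snippet below stores each submitted article with a timestamp. Everything specific in it is an assumption made for illustration: it uses Python's built-in sqlite3 module as a local stand-in for the RDS MySQL instance, and the table name, column names, and helper functions are hypothetical rather than Scaffold's actual schema.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) articles table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS submitted_articles (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            submitted_at TEXT NOT NULL,
            raw_text TEXT NOT NULL
        )
        """
    )
    return conn


def save_article(conn, raw_text):
    """Store one raw article exactly as submitted and return its row id."""
    cursor = conn.execute(
        "INSERT INTO submitted_articles (submitted_at, raw_text) VALUES (?, ?)",
        (datetime.now(timezone.utc).isoformat(), raw_text),
    )
    conn.commit()
    return cursor.lastrowid


conn = open_store()
first_id = save_article(conn, "Mayor Jane Doe said on Monday that the bridge will reopen.")
stored = conn.execute("SELECT raw_text FROM submitted_articles").fetchall()
```

The queries are parameterized so that article text is never interpolated into SQL. Against MySQL the same pattern would apply, though the connection setup, placeholder style, and column types would differ.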
