Computational text analysis is the analysis of digital documents, uploaded to a database, by means of machine learning. It applies computer processing to natural-language text in order to examine each word in a set of documents. One example is Google’s word-usage-over-time option (which charts how popular a word was in a given year), which many of us have used without much thought, showing how prevalent this kind of analysis is in our day-to-day lives. In this digital age it has become a very helpful tool for historians, helping them better understand the language of a period and match it to the context of historical events on a broader level. Topics that would take historians a lifetime of document analysis can, in certain cases, be surveyed in a few minutes. However, because of the sheer volume of data involved, the results should not be taken at face value and should be used only as one tool in a deeper study.
One of the reasons computational text analysis is so useful to a historian is that it can speed up the process of historical reading. To be successful in historical reading, one must closely follow the historical reading skills of sourcing, close reading, contextualizing, and corroborating, as described in “Model of Historical Thinking” from our first week of class. In some instances of study, computational text analysis can touch on all of these except close reading. Sourcing can be covered by checking the reliability of the website used for the analysis and by picking a topic that draws on a large number of authors. Since computers can process all of this data consistently, the technology makes it possible to identify and discard outliers and deceptive documents. Contextualizing with computational text analysis depends on reliability in the same way sourcing does, and the tools also display the exact dates and years attached to the historical data. This makes it easier to answer questions about the context of documents, letting reviewers make connections to the important values and national events of the past. In this style of analysis, corroboration comes from the very large number of historical documents processed online, which can be juxtaposed with one another. This brings out the points of agreement and the consistency between documents that a reliable study depends on.
For example, in our class we looked at “The Language of the State of the Union,” a website that compares how often words are used in each president’s State of the Union address. This resource provides interactive charts and graphs that sort the addresses by date or by word density. This method of exploration helps paint a broad picture of change in American history. However, it is important to note that it offers only general information, and historians or researchers could introduce bias if they take it at face value. Overall, computational text analysis is a valuable resource for historians when it is used as a complement to their research, covering a large blend of online data.
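To make the idea of word density by date more concrete, here is a minimal sketch in Python of the kind of counting such a tool might perform. It is an illustration only, not how the website itself is built: the three short texts are made-up stand-ins rather than quotations from real addresses, and the function names (`tokenize`, `word_density`, `usage_over_time`) are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stand-in texts keyed by year; NOT quotations from real addresses.
documents = {
    1900: "The nation seeks peace and the nation seeks trade.",
    1950: "Peace requires strength, and strength requires industry.",
    2000: "Trade and technology shape the nation.",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def word_density(text, word):
    """Return how many times `word` occurs per 1,000 words of `text`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return 1000 * Counter(tokens)[word.lower()] / len(tokens)


def usage_over_time(docs, word):
    """Return (year, density) pairs for `word`, sorted by year."""
    return [(year, word_density(text, word)) for year, text in sorted(docs.items())]


# Print the density of one word in each document, oldest first.
for year, density in usage_over_time(documents, "nation"):
    print(year, round(density, 1))
```

Even in this toy version, the limits discussed above are visible: the count shows how often a word appears, but not what the speaker meant by it, which is why the numbers still need close reading alongside them.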