
What is Text Analysis?


“In the slippage between our literary notion of a text and the computer’s literal processing lie the disappointment and the possibility of text analysis. Computers cannot understand a text for us. They can, however, do things that may surprise us.” (Geoffrey Rockwell and Stéfan Sinclair, Hermeneutica)

Terminology

Close Reading: In literary criticism, close reading is the careful, sustained interpretation of a brief passage of text. It emphasizes the single and the particular over the general, through close attention to individual words, syntax, the order in which sentences unfold ideas, and formal structures.

Franco Moretti: A literary scholar closely associated with distant reading. Moretti and his followers take the longue durée view of literature, looking at temporal trends across dozens or even hundreds of years of literary history.

Distant Reading: Distant reading is the counterpart to close reading, the traditional approach in literary studies in which a critic closely examines a single text and traces its possible intertextual connections. Distant reading has the opposite goal: the scholar “steps back” from an individual text to see a larger picture, for example the history of a genre over a century or the evolution of a particular artistic device over many decades.

Corpus: The source text that will be analyzed. It can be a single text (Great Expectations) or a combination of texts (all of Jane Austen’s books).

N-gram: A contiguous sequence of n items (such as words or characters) from the source text, where n can be any number. N-gram analysis counts how often each sequence recurs.

Named entity recognition: Analyzing the text to identify and retrieve named entities, such as people, countries, and dates.

Tokens: The units, such as words, phrases, or sentences, that a text is broken into for analysis. Tokenization is the step that splits the text into these units and so prepares it for analysis.

Topic modeling: A statistical method that groups the words in the source text into a set number of topics, based on which words tend to occur together.
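
To make the Tokens and N-gram entries above more concrete, here is a minimal sketch in Python using only the standard library. It is an illustration, not the method of any particular text analysis tool. The helper names `tokenize` and `ngrams` are hypothetical, the sample sentence is chosen only for demonstration, and the tokenization rule (lowercase, keep letters, digits, and straight apostrophes) is a simplifying assumption that real projects often replace with a dedicated tokenizer.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Assumption: a token is any run of letters, digits, or straight
    apostrophes; punctuation and whitespace are discarded.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# A one-sentence sample corpus, used only for demonstration.
corpus = "It was the best of times, it was the worst of times."

tokens = tokenize(corpus)

# Count how often each two-word sequence (bigram) occurs.
bigram_counts = Counter(ngrams(tokens, 2))

print(tokens)
print(bigram_counts.most_common(3))
```

Changing the second argument of `ngrams` from 2 to 3 would count three-word sequences instead. On a larger corpus, the most frequent n-grams are one way to begin “stepping back” from an individual passage in the spirit of distant reading.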