Contrastive Analysis Reflection

Using the Stylo package in RStudio, I was able to apply digital literary methods such as rolling delta, Zeta, and the oppose() function to my corpus. The first step was to open RStudio and run oppose() on my corpus. To do that, I had to set my working directory to the folder holding my corpus, with its primary set and secondary set. My primary set contained texts by Mark Twain, my secondary set contained texts by Charles Dickens, and I also had a test set of texts by Jane Austen. When I ran oppose(), I received a list of words preferred and a list of words avoided. The preferred words are the ones that stand out in Mark Twain's texts, and the avoided words are the ones Twain avoided and Charles Dickens used instead. I ran oppose() again, but this time I changed my option from words to markers, and I received a graph labeled Craig's Zeta. Craig's Zeta is a method for comparing two sets of texts. It splits the texts into equal-sized slices and checks how the words on the preferred and avoided lists appear across those slices.
This is where my test set came in. With my test set, I was able to see whether Jane Austen wrote more like Charles Dickens or more like Mark Twain, and according to the Craig's Zeta graph, Jane Austen's texts are more similar to Dickens's. Because my first results suggested that Austen and Dickens wrote more alike, I ran the test again with Jane Austen as my primary set and Charles Dickens as my secondary set. I think the similarity comes from the fact that they were both English novelists, so there were probably common English words that they both preferred, while American usage was most likely avoided. This suggests that location makes a difference in texts and in the particular way authors write them.
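The slice-based comparison described above can be sketched in a few lines of Python. This is only an illustration of the general idea behind Craig's Zeta (the share of primary slices that contain a word, plus the share of secondary slices that lack it), not the code that stylo runs internally. The function names and the two toy token lists are invented for the example and do not come from my corpus.

```python
def slice_tokens(tokens, slice_len):
    """Split a token list into consecutive slices of equal length.

    A short remainder at the end is dropped so every slice has the same size.
    """
    return [tokens[i:i + slice_len]
            for i in range(0, len(tokens) - slice_len + 1, slice_len)]


def craig_zeta(primary_slices, secondary_slices):
    """Score each word: share of primary slices containing it
    plus share of secondary slices NOT containing it (range 0 to 2)."""
    p_sets = [set(s) for s in primary_slices]
    s_sets = [set(s) for s in secondary_slices]
    vocab = set().union(*p_sets, *s_sets)
    scores = {}
    for word in vocab:
        in_primary = sum(word in s for s in p_sets) / len(p_sets)
        in_secondary = sum(word in s for s in s_sets) / len(s_sets)
        scores[word] = in_primary + (1 - in_secondary)
    return scores


# Made-up toy "texts" (assumption: real input would be whole novels).
primary_tokens = "the raft drifted down the river the raft hit the river bank".split()
secondary_tokens = "the carriage rolled down the street the carriage left the foggy street".split()

scores = craig_zeta(slice_tokens(primary_tokens, 6),
                    slice_tokens(secondary_tokens, 6))

# Words near 2 are "preferred" by the primary set, words near 0 are "avoided".
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word:10s} {score:.2f}")
```

In this toy run, "raft" scores 2.0 because it is in every primary slice and no secondary slice, "carriage" scores 0.0 for the opposite reason, and a shared word like "the" lands in the middle at 1.0. Looking at which words sit at each end of the real lists is one way to see what is driving the difference between two authors.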
But I did notice a slight overlap, which must mean that both authors shared a considerable number of words throughout their texts, probably because of gender similarities. According to my tests, Jane Austen was more like my secondary set, so she must have used words from the words avoided list and not the words preferred list. In my second test, my test set was made up of texts by Mark Twain, and because he is male I assumed that his texts would be more similar to those of Charles Dickens. According to Craig's Zeta, I was right. The markers show that Mark Twain wrote more like Charles Dickens, based on the avoided words list that the Craig's Zeta run gave me. This could be because of gender similarities, since those two authors were men while the author in my primary set was a woman. In this graph I noticed that Austen and Dickens did not overlap, which must mean they had no similarities at all based on the words on the avoided and preferred lists.
I then used my words preferred list to compare against a corpus of texts by modernist authors. I renamed my preferred word list to wordlist.txt so I could use it as my existing wordlist, rather than having RStudio generate a new one for the corpus I wanted to compare against. In RStudio, I ran stylo() on my corpus folder of modernist texts with my existing wordlist, and I chose PCA (cov.) to generate a principal components analysis graph. As I understood the graph, the closer a text is to 0,0, the more similarly it is written to the primary set behind my wordlist, which in my case is Mark Twain. A lot of the texts were hovering around 0,0 and some were really close, but I did notice that none of them were grouped, as they were all spread out.
Maybe this wasn’t the best corpus to compare my wordlist to, but it did show me that a lot of these authors wrote similarly to Mark Twain, or that Mark Twain’s style resembled texts by modernist authors. It seems to me that the majority of the texts were by men, so the graph might have looked different if I had renamed my texts and separated them by male and female authors, which would probably have shown me a better clustering of texts.
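To make the wordlist step more concrete, here is a rough Python sketch of the table that a PCA on a fixed wordlist starts from: each text becomes a row of relative frequencies for the listed words, and each column is then centered on its mean. This is an assumption about the usual preprocessing for a covariance-based PCA, not a copy of what stylo does, and the wordlist and texts are invented toy data.

```python
def relative_freqs(tokens, wordlist):
    """Relative frequency of each wordlist entry in one tokenized text."""
    total = len(tokens)
    return [tokens.count(word) / total for word in wordlist]


def center_columns(table):
    """Subtract each column's mean, as a standard PCA does before decomposing."""
    n_rows = len(table)
    means = [sum(col) / n_rows for col in zip(*table)]
    return [[value - mean for value, mean in zip(row, means)] for row in table]


# Hypothetical wordlist and toy texts (not my actual wordlist.txt or corpus).
wordlist = ["river", "raft", "street"]
texts = {
    "text_a": "the raft on the river".split(),
    "text_b": "the street by the river".split(),
    "text_c": "a raft a raft a street".split(),
}

table = [relative_freqs(tokens, wordlist) for tokens in texts.values()]
centered = center_columns(table)

for name, row in zip(texts, centered):
    print(name, [round(value, 3) for value in row])
```

Because of the centering, every column of the table averages to zero, so in this sketch a text sitting at 0,0 is one whose word frequencies are close to the average of the texts being plotted. That is worth keeping in mind when reading distances from 0,0 on the PCA graph.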

One Reply to “Contrastive Analysis Reflection”

  1. I’m curious about your expectations for comparing Twain and Dickens. Were you looking for national differences? And what would testing Jane Austen help you see?

    Why do you think Austen is more like Dickens than Twain? And why is Twain more like Dickens than Austen? And do those results contradict each other? It’s unclear to me how you’re sorting out the various “signals” of nationality and gender (or genre?). Looking at the list itself might help you identify the words that are making this difference.

    I think there are a lot of unexamined assumptions you’re making . . . like why test the modernist folder in PCA? What were you expecting to see in doing so? Do you have any way of explaining PC1 and PC2?
