This method identifies the words that distinguish two corpora: which words one corpus prefers and which it avoids, relative to the other.
It will always find some difference, so tread lightly when interpreting the results.
You can compare author to author, one group of authors to another, genre to genre, etc.
Create these folders in your working directory: primary_set and secondary_set.
Try both the “words” and the “markers” options.
The run also outputs .txt files listing the preferred and avoided words.
“Preferred” words are the ones the primary set prefers.
“Avoided” words are the ones the secondary set prefers (i.e., the primary set avoids them).
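To make the preferred/avoided distinction concrete, here is a toy sketch in Python. It is not what oppose() computes internally; the score used here (share of primary texts containing a word minus share of secondary texts containing it) is an assumption chosen for illustration, and the function name and sample sentences are made up.

```python
def contrast_scores(primary_texts, secondary_texts):
    """Toy score: share of primary texts containing a word
    minus share of secondary texts containing it (illustrative only)."""
    def doc_share(texts):
        share = {}
        for text in texts:
            # Count each word once per text, as a fraction of all texts.
            for word in set(text.lower().split()):
                share[word] = share.get(word, 0.0) + 1 / len(texts)
        return share

    p = doc_share(primary_texts)
    s = doc_share(secondary_texts)
    return {w: p.get(w, 0.0) - s.get(w, 0.0) for w in set(p) | set(s)}


# Made-up miniature "corpora", two texts each.
primary = ["the sea was calm", "the sea and the sky"]
secondary = ["the town was busy", "a busy town square"]

scores = contrast_scores(primary, secondary)
preferred = sorted(w for w, v in scores.items() if v > 0)  # primary prefers
avoided = sorted(w for w, v in scores.items() if v < 0)    # primary avoids
print("preferred:", preferred)
print("avoided:", avoided)
```

Note that any two non-identical sets of texts will yield some preferred and avoided words under a score like this, which is why the results need cautious reading.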
You can also add a folder test_set; its texts are plotted as “+” symbols.
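The folder layout can be set up by hand, or with a few lines of Python as sketched below. The names primary_set and test_set come from these notes; secondary_set is assumed to be the matching folder name for the secondary corpus, and make_oppose_folders is a hypothetical helper, not part of stylo.

```python
from pathlib import Path


def make_oppose_folders(workdir, with_test_set=False):
    """Create the corpus folders inside workdir and return their paths."""
    # "secondary_set" is an assumed name for the secondary corpus folder.
    names = ["primary_set", "secondary_set"]
    if with_test_set:
        names.append("test_set")  # optional; plotted as "+" symbols
    created = []
    for name in names:
        folder = Path(workdir) / name
        folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(folder)
    return created


# Example call (uncomment to create the folders in the current directory):
# make_oppose_folders(".", with_test_set=True)
```

After this, copy the plain-text files for each side of the comparison into the matching folder before running oppose().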
To compare texts/corpora against a generated list of preferred words, do this:
• Run oppose() to generate a word list for primary_set; it is saved as words_preferred.txt.
• Rename this file wordlist.txt and make sure it’s in your working directory. It is the difference list between the two folders you compared earlier.
• Put your test cases in the folder “corpus”.
• Run stylo(), select “use existing list”, and run a PCA (covariance).
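The file handling in the steps above (between the oppose() run and the stylo run) can be sketched in Python as follows. This is only a convenience under stated assumptions: install_wordlist is a hypothetical helper, and it copies words_preferred.txt rather than renaming it, so the original output is kept.

```python
import shutil
from pathlib import Path


def install_wordlist(workdir, source_name="words_preferred.txt",
                     target_name="wordlist.txt"):
    """Copy the preferred-word list to wordlist.txt and ensure corpus/ exists."""
    workdir = Path(workdir)
    source = workdir / source_name
    if not source.is_file():
        raise FileNotFoundError(f"{source} not found; run oppose() first")
    target = workdir / target_name
    shutil.copyfile(source, target)  # copy (not rename) so the original survives
    (workdir / "corpus").mkdir(exist_ok=True)  # folder for the test cases
    return target


# Example call (uncomment to run in the current working directory):
# install_wordlist(".")
```

Once wordlist.txt is in place and the test cases are in corpus, the stylo run itself is done as described in the last step.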
To interpret: the closer to zero an item appears, the more akin it is to your wordlist.txt, i.e., the more similarly it uses the “preferred words” you generated.