Final

Here are some terms I’ll ask you to define on the final:

corpus
cluster analysis
MFW
OCR
CSV
PCA
n-gram
edges
nodes
dendrogram
topic modelling (bag of words)
stop words
collocation
directed vs. undirected (networks)
raw counts vs. relative frequencies
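
As a quick illustration of two of the terms above (raw counts vs. relative frequencies, and n-grams), here is a minimal Python sketch. The sample sentence and the variable names are assumptions made up for this example, and splitting on whitespace is a simplification of real tokenisation.

```python
from collections import Counter

# Hypothetical toy text; a real corpus would be much larger.
text = "the cat sat on the mat the end"

# Naive tokenisation: split on whitespace only.
tokens = text.split()

# Raw counts: how many times each word occurs.
raw_counts = Counter(tokens)

# Relative frequencies: each raw count divided by the total number of tokens,
# which makes texts of different lengths comparable.
total = len(tokens)
relative = {word: count / total for word, count in raw_counts.items()}

# Bigrams (n-grams with n = 2): every pair of adjacent tokens.
bigrams = list(zip(tokens, tokens[1:]))

print(raw_counts.most_common(3))
print(relative["the"])
print(bigrams[:3])
```

Raw counts grow with the length of a text, so relative frequencies are usually the safer basis for comparing texts of different sizes.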