Classification: the sigmoid, Learning in Logistic Regression, the cross-entropy loss function, Gradient Descent, Regularization, Multinomial logistic regression, interpreting models, Deriving the Gradient Equation.
Vector Semantics : Lexical Semantics, Vector Semantics, Words and Vectors, Cosine for measuring similarity, TF-IDF, Weighing terms in the vector, Applications of the tf-idf vector model, PMI, Word2vec, Visualizing Embeddings, Semantic properties of embeddings, Bias and Embeddings, Evaluating Vector Models.
English Word Classes, The Penn Treebank Part-of-Speech Tagset, Part-of-Speech Tagging, HMM PoS Tagging, Maximum Entropy Markov Models, Bidirectionality, Part-of-Speech Tagging for Other Languages.
Sequence Processing with Recurrent Networks : Simple Recurrent Networks, Applications of RNNs, Deep Networks: Stacked and Bidirectional RNNs, Managing Context in RNNs, LSTMs and GRUs, Words, Characters and Byte-Pairs.
Probabilistic Context-Free Grammars, Probabilistic CKY Parsing of PCFGs, Ways to Learn PCFG Rule Probabilities, Problems with PCFGs, Improving PCFGs by Splitting Non-Terminals, Probabilistic Lexicalized CFGs, Probabilistic CCG Parsing, Evaluating Parsers, Human Parsing.
Dependency Parsing : Dependency Relations, Dependency Formalisms, Dependency Treebanks, Transition-Based Dependency Parsing, Graph-Based Dependency Parsing, Evaluation.