Term
|
Definition
| Sort documents into user-defined classes. |
|
|
Term
|
Definition
| Automatically identify the positive and negative terms in a document. Useful for political polling and marketing. |
|
|
Term
|
Definition
| Calculating, for a given language, the frequency of n-grams that usually indicate spam. |
|
|
Term
| Rule based spam identification |
|
Definition
| Filters spam using rules that assign weights to certain n-grams; once a message's total weight passes some threshold, it's identified as spam. |
|
|
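A minimal sketch of the weighted-rule idea above. The n-grams, weights, and threshold are all made up for illustration (assumptions, not from the cards), and matching is plain substring search.

```python
# Hypothetical rule table: n-gram -> weight. Values are illustrative only.
RULES = {
    "free money": 3.0,
    "click here": 2.0,
    "winner": 1.5,
}
THRESHOLD = 3.0  # assumed cutoff; a real filter would tune this


def spam_score(message: str) -> float:
    """Sum the weights of every rule n-gram found in the message."""
    text = message.lower()
    return sum(weight for ngram, weight in RULES.items() if ngram in text)


def is_spam(message: str) -> bool:
    """Flag the message once its total weight reaches the threshold."""
    return spam_score(message) >= THRESHOLD
```

For example, `is_spam("Click here, WINNER!")` adds 2.0 and 1.5 and crosses the assumed threshold, while a message matching only "winner" does not.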
Term
| Statistical approach spam identification |
|
Definition
| These learn from a large set of examples: one spam set and one ham set. They can adapt based on which emails are marked as spam by all users or by specific users. |
|
|
Term
| Rule based identification drawbacks |
|
Definition
| They are, by nature, one step behind spammers because a pattern has to be identified first and by that time, the spam is already out. |
|
|
Term
|
Definition
| Training and test sets that are labeled in advance with the correct answers. |
|
|
Term
| Supervised learning method |
|
Definition
1. Label a corpus of articles with the desired categories to make training and test sets. 2. Apply machine learning software to the labeled training set to produce a model that summarizes what's been learned. 3. Generate predictions for the test set. 4. Deploy the model on unseen data. |
|
|
Term
|
Definition
| There are no predefined categories; instead, articles that have similar properties, like being about sports, are clustered together. It's less costly because you don't have to sit someone down to label every single document, but the clusters may not be intuitive and clustering solutions are difficult to evaluate. |
|
|
Term
|
Definition
| Looks at the most relevant properties of spam. |
|
|
Term
| Kitchen sink feature engineering |
|
Definition
| Use many features in the hope that some will be relevant and useful. Make every word a feature and choose a machine learning method that is good at focusing on a few important features and ignoring irrelevant ones. |
|
|
Term
| Hand-crafted strategy of feature engineering |
|
Definition
| Carefully and thoughtfully identify a small set of features that are likely to be relevant. The downside is that you have to choose the features yourself. |
|
|
Term
| Naive Bayes for document classification |
|
Definition
| Take a word. Count how often that word appears in spam and how often it appears in ham, and calculate that ratio. Then calculate the odds ratio (ham/total over spam/total). Combine the |
|
|
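A rough sketch of the counting idea above, not a definitive implementation. It assumes equal priors for spam and ham, add-one smoothing, and whitespace tokenization, and it combines the per-word ratios by summing logs. The ratio is taken as spam over ham here, so a positive score leans spam. Function names are hypothetical.

```python
import math
from collections import Counter


def train(spam_docs, ham_docs):
    """Count how often each word occurs in the spam set and in the ham set."""
    spam_counts = Counter(w for d in spam_docs for w in d.lower().split())
    ham_counts = Counter(w for d in ham_docs for w in d.lower().split())
    vocab = set(spam_counts) | set(ham_counts)
    return spam_counts, ham_counts, vocab


def log_odds_spam(message, spam_counts, ham_counts, vocab):
    """Sum per-word log ratios; > 0 leans spam, < 0 leans ham."""
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())
    v = len(vocab)
    score = 0.0
    for word in message.lower().split():
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        # Add-one smoothing (an assumption) avoids dividing by zero.
        p_spam = (spam_counts[word] + 1) / (spam_total + v)
        p_ham = (ham_counts[word] + 1) / (ham_total + v)
        score += math.log(p_spam / p_ham)
    return score
```

Summing logs is the same as multiplying the individual ratios, which is where the "naive" independence assumption comes in: each word is treated as separate evidence.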
Term
|
Definition
| Pretend you're dealing with an unstructured set of data that ignores syntax and topic structure. Put all the words of a document in a bag, draw a word, and calculate which document it's most likely to have come from. |
|
|
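A small sketch of what "putting the words in a bag" means in practice: only word counts survive, word order does not. The function name and the whitespace tokenization are assumptions for illustration.

```python
from collections import Counter


def bag_of_words(document: str) -> Counter:
    """Reduce a document to word counts, discarding order and syntax."""
    return Counter(document.lower().split())
```

Under this representation, "dog bites man" and "man bites dog" produce the same bag, which is exactly the information the model throws away.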
Term
|
Definition
| Error-driven learning. It predicts outcomes and then adjusts the weights when it makes the wrong prediction. Initially the weights are uninformative, but over time it builds up an ability to associate features with outcomes. It's a network with two layers: one node for each possible input feature and one for each possible outcome (spam and ham). |
|
|
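A minimal sketch of the error-driven update described above, assuming each word is a binary input feature, weights start at zero, and the update step is 1.0. These choices and the function names are illustrative assumptions.

```python
from collections import defaultdict

OUTCOMES = ("spam", "ham")  # one output node per outcome


def features(message: str) -> set:
    """One input feature per distinct word in the message."""
    return set(message.lower().split())


def predict(weights, feats):
    """Pick the outcome whose weights on the active features sum highest.
    Ties go to the first outcome listed ("spam")."""
    return max(OUTCOMES, key=lambda o: sum(weights[o][f] for f in feats))


def train(examples, epochs=5):
    # All weights start at 0.0, i.e. uninformative.
    weights = {o: defaultdict(float) for o in OUTCOMES}
    for _ in range(epochs):
        for message, gold in examples:
            feats = features(message)
            guess = predict(weights, feats)
            if guess != gold:
                # Adjust weights only when the prediction was wrong.
                for f in feats:
                    weights[gold][f] += 1.0
                    weights[guess][f] -= 1.0
    return weights
```

Because nothing changes when the guess is right, the weights settle once every training example is classified correctly.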
Term
|
Definition
| How do people learn regular and irregular forms of words? |
|
|
Term
|
Definition
| Start with good performance on some task, then get substantially worse, and then gradually get better again. |
|
|
Term
|
Definition
| A test in which kids are given a made-up noun, "wug", to see whether they can determine its plural form. |
|
|
Term
|
Definition
Quantity: keep it short and sweet. Not TMI. Quality: Don't lie or be sarcastic. Relation: Say things that are pertinent to the question. Manner: Be clear, brief, and orderly. |
|
|
Term
|
Definition
| A robot that was an expert at moving shapes around. Like, REALLY good. This showed that AI can be successful, but only in a very controlled setting and within a specific domain. |
|
|
Term
|
Definition
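| A man who knows no Chinese sits in a room with a rule book for manipulating Chinese symbols. Questions in Chinese are passed in, he follows the rules in the book (written in a language he understands), and he passes out answers in perfect Chinese. Does he know Chinese? Does the room know Chinese? |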
|
|
Term
|
Definition
| A therapy model that wasn't very good at her job. |
|
|
Term
|
Definition
| The logical aspects of language and its meaning. |
|
|
Term
|
Definition
| How context contributes to meaning |
|
|