1. Title: Hayes-Roth & Hayes-Roth (1977) Database
2. Source Information:
(a) Creators: Barbara and Frederick Hayes-Roth
(b) Donor: David W. Aha ([email protected]) (714) 856-8779
(c) Date: March, 1989
3. Past Usage:
1. Hayes-Roth, B., & Hayes-Roth, F. (1977). Concept learning and the
recognition and classification of exemplars. Journal of Verbal Learning
and Verbal Behavior, 16, 321-338.
-- Results:
-- Human subjects' classification and recognition performance:
1. decreases with distance from the prototype,
2. is better on unseen prototypes than old instances, and
3. improves with presentation frequency during learning.
2. Anderson, J.R., & Kline, P.J. (1979). A learning system and its
psychological implications. In Proceedings of the Sixth International
Joint Conference on Artificial Intelligence (pp. 16-21). Tokyo, Japan:
Morgan Kaufmann.
-- Partitioned the results into 4 classes:
1. prototypes
2. near-prototypes with high presentation frequency during learning
3. near-prototypes with low presentation frequency during learning
4. instances that are far from prototypes
-- Described evidence that ACT's classification confidence and
recognition behaviors closely simulated human subjects' behaviors.
3. Aha, D.W. (1989). Incremental learning of independent, overlapping, and
graded concept descriptions with an instance-based process framework.
Manuscript submitted for publication.
-- Used same partition as Anderson & Kline
-- Described evidence that Bloom's classification confidence behavior
is similar to the human subjects' behavior. Bloom fitted the data
more closely than did ACT.
4. Relevant Information:
This database contains 5 numeric-valued attributes. Only a subset of
3 are used during testing (the latter 3). Furthermore, only 2 of the
3 concepts are "used" during testing (i.e., those with the prototypes
000 and 111). I've mapped all values to their zero-indexing equivalents.
Some instances could be placed in either category 0 or 1. I've followed
the authors' suggestion, placing them in each category with equal
probability.
I've replaced the actual values of the attributes (e.g., hobby has the
values chess, sports, and stamps) with numeric values. I think this is
what the authors did when testing the categorization models described
in the paper. I find this unfair. While the subjects were able to bring
background knowledge to bear on the attribute values and their
relationships, the algorithms were provided with no such knowledge. I'm
uncertain whether the 2 distractor attributes (name and hobby) are
presented to the authors' algorithms during testing. However, it is clear
that only the age, educational status, and marital status attributes are
given during the human subjects' transfer tests.
5. Number of Instances: 132 training instances, 28 test instances
6. Number of Attributes: 5 plus the class membership attribute. 3 concepts.
7. Attribute Information:
-- 1. name: distinct for each instance and represented numerically
-- 2. hobby: nominal values ranging between 1 and 3
-- 3. age: nominal values ranging between 1 and 4
-- 4. educational level: nominal values ranging between 1 and 4
-- 5. marital status: nominal values ranging between 1 and 4
-- 6. class: nominal value between 1 and 3
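The attribute list above can be turned into a small record parser. This is only a sketch: it assumes each instance is stored as one comma-separated line of six integers in the order listed (name, hobby, age, educational level, marital status, class), which this file does not state. The names `Record`, `RANGES`, and `parse_record` are hypothetical, and the sample line in the usage note is made up for illustration.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: int
    hobby: int
    age: int
    education: int
    marital_status: int
    label: int  # "class" is a Python keyword, so the class attribute is called label


# Value ranges taken from the attribute list above (name is left unchecked).
RANGES = {
    "hobby": (1, 3),
    "age": (1, 4),
    "education": (1, 4),
    "marital_status": (1, 4),
    "label": (1, 3),
}


def parse_record(line: str) -> Record:
    """Parse one comma-separated line (assumed layout) into a Record."""
    fields = [int(token) for token in line.strip().split(",")]
    if len(fields) != len(Record._fields):
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    record = Record(*fields)
    for field, (low, high) in RANGES.items():
        value = getattr(record, field)
        if not low <= value <= high:
            raise ValueError(f"{field}={value} outside {low}..{high}")
    return record
```

For example, `parse_record("92,2,1,1,2,1")` yields a record with age 1, education 1, marital status 2, and class 1. The range check is there to catch a column-order mistake early, since every attribute except name has a small fixed range.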
9. Missing Attribute Values: none
10. Class Distribution: see below
11. Detailed description of the experiment:
1. 3 categories (1, 2, and neither -- which I call 3)
-- some of the instances could be classified in either class 1 or 2, and
they have been evenly distributed between the two classes
2. 5 Attributes
-- A. name (a randomly-generated number between 1 and 132)
-- B. hobby (a randomly-generated number between 1 and 3)
-- C. age (a number between 1 and 4)
-- D. education level (a number between 1 and 4)
-- E. marital status (a number between 1 and 4)
3. Classification:
-- only attributes C-E are diagnostic; values for A and B are ignored
-- Class Neither: if a 4 occurs for any attribute C-E
-- Class 1: Otherwise, if (# of 1's)>(# of 2's) for attributes C-E
-- Class 2: Otherwise, if (# of 2's)>(# of 1's) for attributes C-E
-- Either 1 or 2: Otherwise, if (# of 2's)=(# of 1's) for attributes C-E
4. Prototypes:
-- Class 1: 111
-- Class 2: 222
-- Class Either: 333
-- Class Neither: 444
5. Number of training instances: 132
-- Each instance presented 0, 1, or 10 times
-- None of the prototypes seen during training
-- 3 instances from each of categories 1, 2, and either are repeated
10 times each
-- 3 additional instances from the Either category are shown during
learning
6. Number of test instances: 28
-- All 9 class 1
-- All 9 class 2
-- All 6 class Either
-- All 4 prototypes
--------------------
-- 28 total
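The classification rule in item 3 can be written down directly, and enumerating it is a quick check on the test-set counts. The sketch below is an illustration, not the authors' code: the function name `classify` and the string labels are assumptions. It treats "Either" as its own outcome and does not perform the random assignment to class 1 or 2 described earlier.

```python
from collections import Counter
from itertools import product


def classify(age: int, education: int, marital_status: int) -> str:
    """Apply the rule from item 3 to attributes C-E (name and hobby are ignored)."""
    values = (age, education, marital_status)
    if 4 in values:
        return "neither"
    ones = values.count(1)
    twos = values.count(2)
    if ones > twos:
        return "1"
    if twos > ones:
        return "2"
    return "either"


# Tally every combination of C-E that contains no 4.
counts = Counter(classify(*values) for values in product((1, 2, 3), repeat=3))
```

Here `counts` comes out as 10 for class 1, 10 for class 2, and 7 for Either. Setting aside the prototypes 111, 222, and 333 leaves 9, 9, and 6, which agrees with the test-instance breakdown above. Adding the four prototypes (including 444) gives the 28 total.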
Observations of interest:
1. Relative classification confidence of
-- prototypes for classes 1 and 2 (2 instances)
(Anderson calls these Class 1 instances)
-- instances of class 1 with frequency 10 during training and
instances of class 2 with frequency 10 during training that
are 1 value away from their respective prototypes (6 instances)
(Anderson calls these Class 2 instances)
-- instances of class 1 with frequency 1 during training and
instances of class 2 with frequency 1 during training that
are 1 value away from their respective prototypes (6 instances)
(Anderson calls these Class 3 instances)
-- instances of class 1 with frequency 1 during training and
instances of class 2 with frequency 1 during training that
are 2 values away from their respective prototypes (6 instances)
(Anderson calls these Class 4 instances)
2. Relative recognition performance on the same groups of instances
Some Expected results:
Both frequency and distance from prototype will affect the classification
accuracy of instances. The greater the frequency, the higher the
classification confidence. The closer to the prototype, the higher the
classification confidence.
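
"Distance from prototype" in the observations above can be read as the number of attributes C-E whose value differs from the prototype. The following sketch assumes that reading. The names `PROTOTYPES` and `distance_from_prototype` are hypothetical, and the class 1 condition is taken from the classification rule in section 11.

```python
from collections import Counter
from itertools import product

# Prototypes for classes 1 and 2 over attributes C-E (from section 11, item 4).
PROTOTYPES = {1: (1, 1, 1), 2: (2, 2, 2)}


def distance_from_prototype(values, category):
    """Count the attributes that differ from the category's prototype."""
    prototype = PROTOTYPES[category]
    return sum(v != p for v, p in zip(values, prototype))


# All class 1 patterns: no 4 present and more 1's than 2's.
class1 = [v for v in product((1, 2, 3), repeat=3) if v.count(1) > v.count(2)]
tally = Counter(distance_from_prototype(v, 1) for v in class1)
```

For class 1 this gives one pattern at distance 0 (the prototype), six at distance 1, and three at distance 2. Class 2 is symmetric. That is consistent with the groups listed above: three frequency-10 and three frequency-1 near-prototypes per class, plus three far instances per class.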