Assess 311 Midterm
LEARNING MODULE
FOR
ASSESS 311 ASSESSMENT IN LEARNING 1
_____________________________________________________
WEEK 7
Overview:
This course focuses on the principles, development, and utilization of conventional assessment
tools to improve the teaching-learning process. It emphasizes the use of assessment of, as, and for
learning in measuring knowledge, comprehension, and other thinking skills in the cognitive, psychomotor,
and affective domains. It allows students to go through the standard steps in test construction and
development and their application in grading systems.
General Objectives:
Each chapter in this module contains a major lesson involving Outcomes-Based Education.
The units are characterized by continuity and are arranged in such a manner that each unit is
related to the next. For this reason, you are advised to read this module in sequence. After each unit,
exercises are given. Submission of the given tasks will be during your scheduled class hour.
The different forms of assessment are classified according to purpose, form, interpretation of
learning, function, ability, and kind of learning.
Classification | Type
Purpose | Educational; Psychological
Form | Paper-and-Pencil; Performance-based
Function | Teacher-made; Standardized
Kind of Learning | Achievement; Aptitude
Ability | Speed; Power
Interpretation of Learning | Norm-referenced; Criterion-referenced
Educational assessments are used in the school setting for the purpose of tracking the growth of
learners and grading their performance. This assessment in the educational setting comes in the form
of formative and summative assessment. These work hand-in-hand to provide information about
student learning. Formative assessment is a continuous process of gathering information about
student learning at the beginning, during, and after instruction so that teachers can decide how to
improve their instruction until learners are able to meet the learning targets. When the learners are
provided with enough scaffold as indicated by the formative assessment, then the summative
assessment is conducted. The purpose of summative assessment is to determine and record what the
learners have learned. On the other hand, the purpose of formative assessment is to track and monitor
student learning and their progress toward the learning target. Formative assessment can be any form
of assessment (Paper-and-pencil or performance-based) that is conducted before, during, and after
instruction. Before instruction begins, formative assessment serves as a diagnostic tool to determine
whether learners already know about the learning target. More specifically, formative assessment
given at the start of the lesson determines the following:
1. What learners know and do not know so that instruction can supplement what learners do not
know.
2. Misconceptions of learners so that they can be corrected.
3. Confusion of learners so that they can be clarified.
4. What learners can and cannot do so that enough practice can be given to perform the task.
The information from educational assessment at the beginning of the lesson is used by the teacher
to prepare relevant instruction for learners. For example, if the learning target is for learners to
determine the by-product of photosynthesis, then the teacher can ask learners if they know what the
food of plants is. If incorrect answers are provided, then the teacher can recommend references for
them to study. If the learning target is for learners to divide a three-digit number by a two-digit number,
then the teacher can start with a three-item exercise on the task to identify who can and cannot
perform the task. For those who can do the task, the teacher can provide more exercises; for those who cannot, the teacher can provide the needed instruction and guided practice.
Educational assessment during instruction is done when the teacher stops at certain parts of the
teaching episode to ask learners questions and to assign exercises, short essays, board work, and other
tasks. If the majority of the learners are still unable to accomplish the task, then the teacher realizes
that further instruction is needed by learners. The teacher continuously provides a series of practice
drills and exercises until the learners are able to meet the learning target. These drills and exercises are
meant to make learners consolidate the skill until they can execute it with ease. At this point of the
instruction, the teacher should be able to see the progress of the learners in accomplishing the task.
The teacher can require the learners to collect the results of their drills and exercises so that learners
can track their own progress as well. This procedure allows learners to become active participants in
their own learning. At this point of the instruction, the results of assessment are not yet graded because
the learners are still in the process of reaching the learning target, and some learners do not progress
at the same rate as the others.
When the teacher observes that the majority or all of the learners are able to demonstrate the learning
target, then the teacher can now conduct the summative assessment. It is best to have a summative
assessment for each learning target so that there is evidence that learning has taken place. Both the
summative and formative assessments should be aligned to the same learning target; in this case,
there should be parallelism between the tasks provided in the formative and summative assessment.
Psychological assessments, such as tests and scales, are measures that determine the learners’
cognitive and non-cognitive characteristics. Examples of cognitive tests are those that measure ability,
aptitude, intelligence, and critical thinking. Affective measures are for personality, motivation, attitude,
interest, and disposition. The results of these assessments are used by the school's guidance
counselor to perform interventions on the learners' academic, career, and social and emotional
development.
Paper-and-pencil types of assessments are cognitive tasks that require a single correct answer.
They usually come in the form of test types, such as binary (true or false), short answer (identification),
matching type, and multiple choice. The items usually pertain to a specific cognitive skill, such as
recalling, understanding, applying, analyzing, evaluating, and creating. On the other hand,
performance-based types of assessment require learners to perform tasks, such as giving demonstrations,
producing a product, showing strategies, and presenting information. The skills applied are usually complex
and require integrated skills to arrive at the target response. Examples include writing an essay, reporting
in front of the class, reciting a poem, demonstrating how a problem was solved, creating a word
problem, reporting the results of an experiment, dance and song performance, painting and drawing,
playing a musical instrument, etc. Performance-based tasks are usually open-ended, and each learner
can arrive at various possible responses.
The use of paper-and-pencil and performance-based tasks depends on the nature and content of
the learning target. Below are examples of learning targets that require a paper-and-pencil type of
assessment:
Identify the parts of a plant
Label the parts of the microscope
Compute the compound interest
Classify the phase of a given matter
Provide the appropriate verb in the sentence
Identify the type of sentence
Standardized tests have fixed directions for administering and scoring. They can be purchased
with test manuals, booklets, and answer sheets. When these tests were developed, the items were
tried out on a large sample of the target group, called the norm group. The norm group's performance is used
as the basis for comparing the results of those who take the test.
Non-standardized or teacher-made tests are usually intended for classroom assessment. They are
used for classroom purposes, such as determining whether learners have reached the learning target.
These tests are intended to measure behavior (such as learning) in line with the objectives of the course.
Examples are quizzes, long tests, and exams. Formative and summative assessments are usually
teacher-made tests.
Can a teacher-made test become a standardized test? Yes, as long as it is valid, reliable, and with
a standard procedure for administering, scoring, and interpreting results.
According to Lohman (2005), aptitudes are characteristics that influence a person's
behavior and aid goal attainment in a particular situation. Specifically, aptitude refers to the degree of
readiness to learn and perform well in a particular situation or domain (Corno et al. 2002). Examples
include the ability to comprehend instructions, manage one's time, use previously acquired
knowledge appropriately, make good inferences and generalizations, and manage one's emotions.
Other developments have also led to the conclusion that assessment of aptitude can go beyond
cognitive abilities. An example is the Cognitive Abilities Measurement that measures working memory
capacity, the ability to store old and process new information, and the speed of an individual in retrieving
and processing new information (Kyllonen and Christal 1989). Magno (2009) also created a taxonomy
of aptitude test items. The taxonomy provides item writers with a guide on the type of items to be
included when building an aptitude test depending on the skills specified. The taxonomy includes 12
classifications categorized as verbal and nonverbal. The schemes in the verbal category include
verbal analogy, syllogism, and number or letter series; the nonverbal is composed of topology, visual
discrimination, progressive series, visualization, orientation, figure ground perception, surface
development, object assembly, and picture completion.
Speed tests consist of easy items that need to be completed within a time limit. Power tests consist
of items with increasing level of difficulty, but time is sufficient to complete the whole test. An example
of a power test is the one developed by the National Council of Teachers of Mathematics that
determines the ability of the examinees to utilize data to reason and become creative, formulate, solve,
and reflect critically on the problems provided. An example of a speed test is a typing test in which
examinees are required to correctly type as many words as possible given a limited amount of time.
There are two types of test based on how the scores are interpreted: norm-referenced and
criterion-referenced tests. A criterion-referenced test has a given set of standards, and the scores are
compared to the given criterion. For example, in a 50-item test: 40-50 is very high, 30-39 is high, 20-29
is average, 10-19 is low, and 0-9 is very low. One approach in criterion-referenced interpretation is
that the score is compared to a specific cutoff. An example is the grading in schools where the range of
grades 96-100 is highly proficient, 90-95 is proficient, 80-89 is nearly proficient, and below 80
is beginning. The norm-referenced test interprets results using the distribution of scores of a sample
group. The mean and standard deviations are computed for the group. The standing of every
individual in a norm-referenced test is based on how far they are from the mean and standard
deviation of the sample. Standardized tests usually interpret scores using a norm set from a large
sample.
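To make the criterion-referenced interpretation concrete, here is a minimal sketch in Python that maps a grade to a proficiency level using the hypothetical grading cutoffs from the example above; any other criterion could be substituted.

    def interpret_criterion_referenced(grade):
        # Map a grade to a proficiency level using the fixed cutoffs from the
        # grading example above (96-100, 90-95, 80-89, below 80).
        if grade >= 96:
            return "Highly Proficient"
        elif grade >= 90:
            return "Proficient"
        elif grade >= 80:
            return "Nearly Proficient"
        else:
            return "Beginning"

    print(interpret_criterion_referenced(93))   # prints "Proficient"

Note that the interpretation depends only on the fixed criterion, not on how other examinees performed.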
Having an established norm for a test means obtaining the normal or average performance in the
distribution of scores. The distribution of scores tends to approximate a normal distribution as the sample size increases. A norm is a
standard and is based on a very large group of samples. Norms are reported in the manual of
standardized tests.
A normal distribution found in the manual takes the shape of a bell curve. It shows the number of
people within a range of scores. It also reports the percentage of people with particular scores. The
norm is used to convert a raw score into standard scores for interpretability.
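As an illustration of how a norm is used to convert a raw score into standard scores, the sketch below assumes a hypothetical norm with a mean of 40 and a standard deviation of 8; in practice, these values are taken from the test manual.

    def to_standard_scores(raw_score, norm_mean, norm_sd):
        # z expresses the distance from the norm mean in standard deviation units;
        # T rescales z to a mean of 50 and a standard deviation of 10.
        z = (raw_score - norm_mean) / norm_sd
        t = 50 + 10 * z
        return z, t

    # Hypothetical norm values, for illustration only.
    z, t = to_standard_scores(raw_score=52, norm_mean=40, norm_sd=8)
    print(f"z = {z:.2f}, T = {t:.1f}")   # z = 1.50, T = 65.0

In this hypothetical norm, a raw score of 52 lies 1.5 standard deviations above the mean, placing the examinee well above average in the norm group.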
ACTIVITY #7
Discussion and Exercise Questions
Directions: Read and understand this module. Provide what is being asked. Write your answers by hand
on long bond paper and attach it to the last page of this module.
Tasks: Case Analysis
A. Below is an illustrative scenario. Provide your answers to the questions based on the information
presented.
Case
A teacher in Mathematics wanted to determine how well the learners have learned their lesson on
fractions. After two weeks of drills and exercises, the teacher wanted to record how well the learners
have learned about fractions. The specific learning competencies taught by the teacher are (1) adding
similar fractions and (2) solving word problems involving the addition of similar fractions. The school
has an available standardized test on mathematics, but it covers many topics aside from fractions.
Why do we need to define the test objectives or learning outcomes targeted for assessment?
In designing a well-planned written test, first and foremost, you should be able to identify the
intended learning outcomes in a course, where a written test is an appropriate method to use. These
learning outcomes are knowledge, skills, attitudes, and values that every student should develop
throughout the course. Clear articulation of learning outcomes is a primary consideration in lesson
planning because it serves as the basis of evaluating the effectiveness of the teaching and learning
process determined through testing or assessment. Learning objectives or outcomes are measurable
statements that articulate, at the beginning of the course, what students should know and be able to do
or value as a result of taking the course.
In developing a written test, the cognitive behaviors of learning outcomes are usually targeted. For
the cognitive domain, it is important to identify the levels of behavior expected from the students.
Traditionally, Bloom's Taxonomy was used to classify learning objectives based on levels of
complexity and specificity of the cognitive behaviors. With knowledge at the base (i.e., the lowest-order
thinking skill), the categories progress to comprehension, application, analysis, synthesis, and
evaluation. However, Anderson and Krathwohl, Bloom's student and research partner, respectively,
came up with a revised taxonomy, in which the nouns used to represent the levels of cognitive
behavior were replaced by verbs, and the synthesis and evaluation were switched. (Figure 4.1
presents the two taxonomies.)
Bloom (1956) | Anderson and Krathwohl (2001)
Knowledge | Remember
Comprehension | Understand
Application | Apply
Analysis | Analyze
Synthesis | Evaluate
Evaluation | Create
A table of specifications (TOS), sometimes called a test blueprint, is a tool used by teachers to
design a test. It is a table that maps out the test objectives, contents, or topics covered by the test; the
levels of cognitive behavior to be measured; the distribution, number, placement, and weights
of test items; and the test format. It helps ensure that the course's intended learning outcomes,
assessments, and instruction are aligned.
Generally, the TOS is prepared before a test is created. However, it is ideal to prepare one even
before the start of instruction. Teachers need to create a TOS for every test that they intend to develop.
The TOS is important because it does the following:
- Ensures that the instructional objectives and what the test captures match
- Ensures that the test developer will not overlook details that are considered essential to a good
test
- Makes developing a test easier and more efficient
- Ensures that the test will sample all important content areas and processes
- Is useful in planning and organizing
- Offers an opportunity for teachers and students to clarify achievement expectations
ACTIVITY # 8
Discussion and Exercise Questions
Directions: Read and understand this module. Provide what is being asked. Write your answers by hand
on long bond paper and attach them to the last page of this module.
Task: To be able to check whether you have learned the important information about planning the test,
please provide your answers to the questions given in the graphical representation.
[Graphic organizer: Planning the Test]
Learner assessment within the framework of classroom instruction requires planning. The
following are the steps in developing a table of specifications:
1. Determine the objectives of the test. The first step is to identify the test objectives. This should be
based on the instructional objectives. In general, the instructional objectives or the intended learning
outcomes are identified at the start, when the teacher creates the course syllabus. There are three
types of objectives: (1) cognitive, (2) affective, and (3) psychomotor. Cognitive objectives are designed
to increase an individual's knowledge, understanding, and awareness. On the other hand, affective
objectives aim to change an individual's attitude into something desirable, while psychomotor
objectives are designed to build physical or motor skills. When planning for assessment, choose only
the objectives that can best be captured by a written test. There are objectives that are not meant for a
written test. For example, if you test the psychomotor domain, it is better to do a performance-based
assessment. There are also cognitive objectives that are sometimes better assessed through
performance-based assessment. Those that require the demonstration or creation of something
tangible, like projects, would also be more appropriately measured by performance-based assessment.
For a written test, you can consider cognitive objectives, ranging from remembering to creating
ideas that could be measured using common formats for testing, such as multiple choice, alternative
response test, matching type, and even essays or open-ended tests.
2. Determine the coverage of the test. The next step in creating the TOS is to determine the
contents of the test. Only topics or contents that have been discussed in class and are relevant should
be included in the test.
3. Calculate the weight for each topic. Once the test coverage is determined, the weight of each
topic covered in the test is determined. The weight assigned per topic in the test is based on the
relevance and the time spent to cover each topic during instruction. The percentage of time for a topic
in a test is determined by dividing the time spent for that topic during instruction by the total amount of
time spent for all topics covered in the test. For example, for a test on the theories of personality for
a General Psychology 101 class, the teacher spent ¼- to 1½-hour class sessions on the topics. As such, the weight
for each topic is as follows:
Topic | No. of Sessions | Time Spent | Percentage of Time (Weight)
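A minimal sketch of this computation in Python is shown below; the topics and hours are hypothetical (only the first two figures match the example TOS later in this lesson, and the rest are assumed for illustration), so replace them with your own class record.

    # Hypothetical instructional time per topic, in hours.
    time_spent = {
        "Theories and Concepts": 0.5,
        "Psychoanalytic Theories": 1.5,
        "Trait Theories": 1.0,
        "Humanistic Theories": 2.0,
    }

    total_time = sum(time_spent.values())   # 5.0 hours

    # Weight of a topic = time spent on that topic / total time, expressed as a percentage.
    weights = {topic: hours / total_time * 100 for topic, hours in time_spent.items()}

    for topic, weight in weights.items():
        print(f"{topic}: {weight:.1f}% of the test")

A topic that took 1.5 of the 5 instructional hours, for instance, receives 30% of the test.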
4. Determine the number of items for the whole test. To determine the number of items to be
included in the test, the amount of time needed to answer the items is considered. As a general rule,
students are given 30-60 seconds for each item in test formats with choices. For a one-hour class, this
means that the test should not exceed 60 items. However, because you also need to give time for test
paper/booklet distribution and giving of instructions, the number of items should be fewer, maybe just 50 items.
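Continuing the sketch from step 3, the total number of items can be estimated from the available testing time and the seconds allowed per item, and then distributed across topics in proportion to their weights; the figures below are again hypothetical and should be adjusted to your own class.

    # Estimate how many items fit in the available testing time.
    testing_time_minutes = 50   # a one-hour class minus time for distribution and instructions
    seconds_per_item = 60       # general rule: 30-60 seconds per item for formats with choices
    total_items = int(testing_time_minutes * 60 / seconds_per_item)   # 50 items

    # Distribute the items across topics according to their weights (from the previous sketch).
    weights = {
        "Theories and Concepts": 10.0,
        "Psychoanalytic Theories": 30.0,
        "Trait Theories": 20.0,
        "Humanistic Theories": 40.0,
    }
    items_per_topic = {topic: round(total_items * w / 100) for topic, w in weights.items()}

    print(total_items)       # 50
    print(items_per_topic)   # {'Theories and Concepts': 5, 'Psychoanalytic Theories': 15, ...}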
The TOS can take different formats.
1. One-way TOS. A one-way TOS maps out the content or topic, the test objectives, the number of hours spent, and the format, number, and placement of items, as in the example below for the theories of personality test described in step 3.

Topic | Test Objective | No. of Hours Spent | Format and Placement of Items | No. and Percent of Items
Theories and Concepts | Recognize important concepts in personality theories | 0.5 | Multiple choice, item #s 1-5 | 5 (10.0%)
Psychoanalytic Theories | Identify the different theories of personality under the psychoanalytic model | 1.5 | Multiple choice, item #s 6-20 | 15 (30.0%)
Etc. | | | |
TOTAL | | 5 | | 50 (100%)
2. Two-way TOS. A two-way TOS reflects not only the content, time spent, and number of items but
also the levels of cognitive behavior targeted per test content, based on a taxonomy of cognitive behaviors such as the revised Bloom's taxonomy discussed earlier. An example is shown below.
Content | Time Spent | No. & Percent of Items | KD* | Level of Cognitive Behavior, Item Format, No., and Placement of Items (R, U, Ap, An, E, C)
Theories and Concepts | 0.5 hours | 5 (10.0%) | F | I.3 (#1-3)
| | | C | I.2 (#4-5)
Psychoanalytic Theories | 1.5 hours | 15 (30.0%) | F | I.2 (#6-7)
| | | C | I.2 (#8-9); I.2 (#10-11)
| | | P | I.2 (#12-13); I.2 (#14-15)
| | | M | I.3 (#16-18); II.1 (#41); II.1 (#42)
Etc. | | | |
Scoring | | | | 1 point per item; 2 points per item; 3 points per item (by level of cognitive behavior)
OVERALL TOTAL | 5 | 50 (100.0%) | | 20; 20; 10

*KD = Knowledge Dimension: F = Factual, C = Conceptual, P = Procedural, M = Metacognitive. An entry such as I.3 (#1-3) indicates the item format, the number of items, and their placement in the test.
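As an optional aid, a two-way TOS can also be represented as a simple data structure so that the item counts can be checked against the intended totals. The sketch below uses the per-topic counts from the table above; the structure itself is only an illustration, not a prescribed format.

    # Number of items per topic and knowledge dimension, from the two-way TOS above.
    tos = {
        "Theories and Concepts": {"Factual": 3, "Conceptual": 2},
        "Psychoanalytic Theories": {"Factual": 2, "Conceptual": 4, "Procedural": 4, "Metacognitive": 5},
    }
    intended_totals = {"Theories and Concepts": 5, "Psychoanalytic Theories": 15}

    for topic, cells in tos.items():
        actual = sum(cells.values())
        # Flag any topic whose planned items do not add up to the intended total.
        assert actual == intended_totals[topic], f"{topic}: {actual} items, expected {intended_totals[topic]}"
        print(f"{topic}: {actual} items OK")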
ACTIVITY # 9
Task: Suppose you are currently teaching. Apply what you have learned by creating a two-way
TOS for the final exam of your class. Take into consideration the content or topic; the time spent for each
topic; the knowledge dimension; and the item format, number, and placement for each level of cognitive
behavior.
What are the general guidelines in choosing the appropriate test format?
Not every test is universally valid for every type of learning outcome. For example, if an intended
outcome for a Research Method 1 course is "to design and produce a research study relevant to
one's field of study," you cannot measure this outcome through a multiple-choice test or a
matching-type test.
To guide you in choosing the appropriate test format and designing fair, appropriate, yet
challenging tests, you should ask the following important questions:
1. What are the objectives or desired learning outcomes of the subject/unit/lesson being assessed?
Deciding on what test format to use generally depends on your learning objectives or desired
learning outcomes (DLOs), which are statements of what learners are expected to do or demonstrate as
a result of engaging in the learning process. It is suggested that you return to Lesson 4 to review how
to set or write instructional objectives or intended learning outcomes for a subject.
2. What levels of thinking are to be assessed?
The level of thinking to be assessed is also an important factor to consider when designing your
test, as this will guide you in choosing the appropriate test format. For example, if you intend to assess
how much your learners are able to identify important concepts discussed in class (i.e., remembering
or understanding level), a selected-response format such as multiple-choice test would be appropriate.
However, if you intend to assess how your students will be able to explain and apply in another setting
a concept or framework learned in class (i.e., applying and/or analyzing level), you may consider
giving constructed-response test formats such as essays.
It is important that when constructing classroom assessment tools, all levels of cognitive behavior
are represented, from Remembering (R), Understanding (U), Applying (Ap), and Analyzing (An) to Evaluating
(E) and Creating (C), while taking into consideration the Knowledge Dimensions, i.e., Factual (F),
Conceptual (C), Procedural (P), and Metacognitive (M). You may return to Lesson 2 and Lesson 4 to
review the different levels of Cognitive Behavior and Knowledge Dimensions.
3. Is the test matched or aligned with the course's DLOs and the course contents or learning
activities?
The assessment task should be aligned with the instructional activities and the DLOs. Thus, it is
important that you are clear about what DLOs are to be addressed by your test and what course
activities or tasks are to be implemented to achieve the DLOs.
For example, if you want learners to articulate and justify their stand on ethical decision-making
and social responsibility practices in business (i.e., DLO), then an essay test and class debate are
appropriate measures and tasks for this learning outcome. A multiple-choice test may be used but only
if you intend to assess learners’ ability to recognize what is ethical versus unethical decision-making
practice. In the same manner, matching-type items may be appropriate if you want to know whether
your students can differentiate and match the different approaches or terms to their definitions.
Test items should be meaningful and realistic to the learners. They should be relevant or related to
their everyday experiences. The use of concepts, terms, or situations that have not been discussed in class should be avoided.
ACTIVITY # 10
Discussion and Exercise Questions
Directions: Read and understand this module. Provide what is being asked. Write your answers by hand
on long bond paper and attach them to the last page of this module.
Task: Assume that you are currently teaching. Create an assessment plan for a particular subject. For
each subject, list down the desired learning outcomes and the subject topic or lesson; and for each
desired learning outcome, identify the appropriate test format to assess learners' achievement of the
outcome. An example is provided below.
Subject: ____________
---------------------------------------------Nothing Follows-------------------------------------
Writing multiple-choice items requires content mastery, writing skills, and time. Only good and
effective items should be included in the test. Poorly-written test items could be confusing and
frustrating to learners and yield test scores that are not appropriate to evaluate their learning and
achievement. The following are the general guidelines in writing good multiple-choice items. They are
classified in terms of content, stem, and options.
Content:
1. Write items that reflect only one specific content and cognitive processing skills.
Faulty: Which of the following is the type of statistical procedure used to test a hypothesis
regarding significant relationship between variables particularly in terms of the extent and
direction of association?
A. ANCOVA C. Correlation
B. ANOVA D. t-test
Good: Which of the following is an inferential statistical procedure used to test a hypothesis
regarding significant differences between two qualitative variables?
A. ANCOVA C. Chi-Square
B. ANOVA D. Mann-Whitney Test
2. Do not lift and use statements from the textbook or other learning materials as test questions.
3. Keep the vocabulary simple and understandable based on the level of learners/examinees.
4. Edit and proofread the items for grammatical and spelling errors before administering them to the learners.
Stem:
1. Write the directions in the stem in a clear and understandable manner.
Faulty: Read each question and indicate your answer by shading the circle corresponding to your
answer.
Good: This test consists of two parts. Part A is a reading comprehension test, and Part B is a
grammar/language test. Each question is a multiple choice item with five (5) options. You are to
answer each question but will not be penalized for a wrong answer or for guessing. You can go back
and review your answer during the time allotted.
2. Write stems that are consistent in form and structure, that is, present all items either in question form
or in descriptive or declarative form.
Faulty: (1) Who was the Philippine president during Martial Law?
(2) The first president of the Commonwealth of the Philippines was ______.
Good: (1) Who was the Philippine president during Martial Law?
(2) Who was the first president of the Commonwealth of the Philippines?
3. Word the stem positively and avoid double negatives, such as NOT and EXCEPT in a stem. If a
negative word is necessary, underline or capitalize the words for emphasis.
Faulty: Which of the following is not a measure of variability?
Good: Which of the following is NOT a measure of variability?
4. Refrain from making the stem too wordy or containing too much information unless the
problem/question requires the facts presented to solve the problem.
Faulty: What does DNA stand for, and what is the organic chemical of complex molecular structure
found in all cells and viruses and codes genetic information for the transmission of inherited traits?
Good: As a chemical compound, what does DNA stand for?
Options:
1. Provide three (3) to five (5) options per item, with only one being the correct or best
answer/alternative.
2. Write options that are parallel or similar in form and length to avoid giving clues about the correct
answer.
ACTIVITY # 11
CRITERIA:
Content Mastery: Items reflect only one specific content and cognitive processing skills | 10 pts.
Validity: Test must be valid for a particular year level and every type of learning outcome | 10 pts.
Alignment: Test should be aligned with the instructional activities or tasks for a particular year/level | 10 pts.
TOTAL: 30 pts.
End of week 11
---------------------------------------------Nothing Follows--------------------------------------