DeepCAPTCHA: An Image CAPTCHA Based on Depth Perception

Hossein Nejati, Ngai-Man Cheung, Ricardo Sosa, and Dawn C.I. Koh
Singapore University of Technology and Design
20 Dover Drive, Singapore
hossein [email protected] · ngaiman [email protected] · ricardo [email protected] · dawn [email protected]
ABSTRACT
Over the past decade, text-based CAPTCHA (TBC) have become popular for preventing adversarial attacks and spam in many websites and applications, including email services, social platforms, web-based marketplaces, and recommendation systems. However, in addition to several other problems with TBC, they have become increasingly difficult to solve in recent years in order to keep up with OCR technologies. Image-based CAPTCHA (IBC), on the other hand, is a relatively new concept that promises to overcome key limitations of TBC. In this paper we present an innovative IBC, DeepCAPTCHA, based on design guidelines, psychological theory, and empirical experiments. DeepCAPTCHA exploits the human ability of depth perception. In our IBC, users should arrange 3D objects in terms of size (or depth). In our framework for DeepCAPTCHA, we automatically mine 3D models and use a human-machine Merge Sort algorithm to order these unknown objects. We then create new appearances for these objects at a multiplication factor of 200, and present these new images to the end-users for sorting (as CAPTCHA tasks). Humans are able to apply their rapid and reliable object recognition and comparison (arising from years of experience with the physical environment) to solve DeepCAPTCHA, while machines are still unable to complete these tasks. Experimental results show that humans can solve DeepCAPTCHA with high accuracy (~84%) and ease, while machines perform dismally.

Figure 1: DeepCAPTCHA is an image-based CAPTCHA to address the shortcomings of text-based CAPTCHA by exploiting humans' depth-perception abilities (compared with a text-based CAPTCHA from Google).

1. INTRODUCTION
CAPTCHA, standing for "Completely Automated Public Turing test to tell Computers and Humans Apart", is an automatic challenge-response test to distinguish between humans and machines. These tests require tasks that are easy for most humans to solve, while being almost intractable for state-of-the-art algorithms and heuristics. Today's CAPTCHA have two main benefits: (1) they are web-based means to avoid the enrollment of automatic programs in places where only humans are allowed (e.g. email registrations, on-line polls and reviewing systems, social networks, etc.); and (2) they also reveal gaps between human and machine performance in a particular task. Efforts to close these gaps by artificial intelligence (AI) researchers make CAPTCHA an evolving battlefield between designers and researchers.

The most popular type of CAPTCHA is text-based CAPTCHA (TBC), which has been in use for over a decade [1, 2]. The use of TBC has not only helped user security (e.g. in email services) and removed spam (e.g. in review and recommendation systems), but also indirectly helped the improvement of optical character recognition (OCR) technology [2], and
even the digitization of a large number of books (in the reCAPTCHA project [3, 4]). However, in order to stay ahead of optical character recognition technologies, TBCs have become increasingly distorted and complex, thus becoming very difficult to solve for increasing numbers of human users. Common human errors include mistaking a consecutive 'c' and 'l' for a 'd', or a 'd' or 'o' for an 'a' or 'b', etc. [5]. It is noted that these mistakes come from users highly proficient in the Roman alphabet using high-resolution displays [5]. This suggests that TBCs are at the limit of human abilities. In addition, TBCs are inherently restrictive: users need to be literate in the language or the alphabet used in the TBCs. This violates important universal design guidelines, excluding a large proportion of potential users [6].

In recent years, image-based CAPTCHA (IBC) have been introduced to address the shortcomings of TBCs. These new types of CAPTCHA capitalize on the innate human ability to interpret visual representations, such as distinguishing images of animals (e.g. Animal Pix [1]), scenes (e.g. Pix1 and Bongo2), or face images (e.g. Avatar CAPTCHA [7]). Therefore, IBCs can address a significantly wider age range and education level (than TBCs), and are mostly language independent, except for requiring that the instructions for solving the CAPTCHA be conveyed in a language-independent manner, e.g. using diagrams, symbols, or video.

IBCs are therefore closer to the universal design ideals, and are also more intuitive for humans than TBCs. On the other hand, any IBC, if not carefully implemented, suffers from a vulnerability that almost never plagues TBCs: that of exhaustive attacks. A TBC is typically generated "on the fly" by creating a text string from a finite set of symbols (e.g. the Roman alphabet), and then increasing the difficulty of identifying the string by applying image distortion filters. Although the alphabet is finite, a virtually infinite set of CAPTCHA may be generated from the string creation combined with the image distortion operation. This makes it intractable for an attacker to exhaustively pre-solve (e.g. by using human solvers) all possible TBCs. By comparison, IBCs typically start with a finite database of images. If these images are used without distortion, as in the case of CAPTCHA The Dog [8], then an exhaustive attack is possible. If only simple image distortions are applied, after an exhaustive attack a machine can match a distorted image to a previously solved image, e.g. in [9]. Therefore, for any IBC to be viable, it should follow a series of design guidelines to achieve the CAPTCHA goals:

[G1:] Automation and gradability.

[G2:] Easy to solve by a large majority of humans.

[G3:] Hard to solve by automated scripts.

[G4:] Universal (adapted from the Center for Universal Design3):
  1. Equitable: diverse age and culture groups
  2. Flexibility: customizable for specific target applications
  3. Intuitive: instructions are easy to understand and follow
  4. Low effort: low cognitive demands
  5. Perceptible: visible under sensory limitations, amenable to scale
  6. Usability: suitable for a variety of platforms

[G5:] Resistance to random attacks.

[G6:] Robustness to brute force (exhaustive) attacks.

[G7:] Situated cognition: exploits experiential and embodied knowledge in humans [10].

1 http://gs264.sp.cs.cmu.edu/cgi-bin/esp-pix
2 http://www.captcha.net/captchas/bongo
3 http://www.ncsu.edu/ncsu/design/cud/

A systematic approach to designing a CAPTCHA that satisfies the above guidelines is to design the CAPTCHA task (Ctask) based on humans' higher-order cognitive skills, thereby taking advantage of intuitive visual abilities that are hard to pre-solve exhaustively (G7).

In this paper we introduce DeepCAPTCHA, based on design guidelines, psychological theory, and empirical experiments. We here exploit humans' perception of the relative size and depth of complex objects to create an IBC. These depth-related visual cues are intuitively used by the human visual system from an early age [11], and are applicable to all 3D every-day or well-known objects, including representations of animals, man-made objects, and characters across multiple scales. Natural objects across scales can range from atoms and molecules to insects, small animals such as mice, to human-size animals, to the biggest animals such as elephants, whales, and dinosaurs, to natural elements such as mountains, all the way to planets. Likewise, every-day artificial objects can intuitively be classified across scales from millimeters such as earrings, to centimeters such as keys and coins, to human-size such as bicycles and chairs, to a few meters such as houses, to larger objects such as skyscrapers and airplanes, to super-structures like cities.

DeepCAPTCHA is built on the basis of the human intuitive ability to distinguish and sort categories of objects based on their size in a very large collection of everyday objects that are familiar across cultures. The simplicity of this task for humans comes from the rapid and reliable object detection, recognition, and comparison that arise from years of embodied experience of interacting with the physical environment (Hoffmann and Pfeifer, "The implications of embodiment for behavior and cognition: animal and robotic case studies," in W. Tschacher & C. Bergomi, eds., The Implications of Embodiment: Cognition and Communication, Exeter: Imprint Academic, 2012, pp. 31-58). Humans can rapidly recognize representations of objects and, using their prior information about the real-world sizes of these objects, order them in terms of size [12]. We exploit the same principle here, presenting n objects (here n = 6) and asking the user to order them in terms of size, but we change the appearance of each object to the extent that humans can still recognize them, but machines cannot. Figure 1 illustrates an example of a DeepCAPTCHA task on the user's device. This seemingly trivial task for humans can be made very difficult for machines, as we describe in the details of our framework in the next sections.

However, there are several challenges to reach an ideal CAPTCHA task (Ctask). The DeepCAPTCHA should (1) address random attacks by a well-designed interface; (2) be immune to exhaustive attacks, even when the database of objects is leaked; (3) be robust to machine-solver attacks
while keeping the task easy for a wide range of human users. In addition, to have an automatically gradable Ctask, the DeepCAPTCHA system should be able to automatically mine new objects, discover the relative size between them and the rest of the database, and finally remove objects with size ambiguity. In this work we address these challenges by introducing a five-part framework, built around a core of 3D model analysis and manipulation.

Our choice of 3D models as the core of the DeepCAPTCHA framework enables us to have full control over object manipulation and feature analysis. Using 3D models allows appearance alterations at a multiplication factor of 200, without compromising the ease of recognition by humans. This is remarkable and impossible to achieve with conventional TBCs and almost all other IBCs, where the employed distortions make CAPTCHA tasks more difficult for humans. We therefore start by using web crawlers to mine 3D models from the web. As the crawlers return all sorts of models, the second part of the framework automatically filters out models which are simple enough to be distinguished by machine-learning techniques, by analyzing model features. Once all models have a baseline complexity, the third part of the framework orders database objects based on relative size. This part is a machine-human combination of the Merge Sort algorithm, which enables DeepCAPTCHA to use human object recognition ability via the Amazon Mechanical Turk service. There are considerable benefits of using Merge Sort, including system time and budget constraints, which we discuss in Section 4.3. Having an ordered database of 3D objects, the fourth part of our framework automatically changes the original object appearances, using on-the-fly translation, rotation, re-illumination, re-coloring, and finally background cluttering, to camouflage objects from the "eyes" of machines. This part can reach a multiplication factor of over 1:200 (200 different images from a single 3D model). This large multiplication factor protects DeepCAPTCHA against both exhaustive and machine-solver attacks (G3 & G6). Firstly, no 2D image is stored to be exhausted in the first place, and for each Ctask, completely new appearances of the objects are created before presenting them to the users. Secondly, even if the database of objects is leaked, machines cannot link the newly generated images to the correct source object, and therefore fail to assign the correct size. We discuss the details of this object appearance alteration part of the framework in Section 4.4, where we see the vital role of this part in making DeepCAPTCHA successful. Finally, with the freshly generated appearances, the fifth part of the framework is to create a user-friendly CAPTCHA interface, compatible with all smart devices with large and small screens, while being robust against random attacks (G5).

In our experiments we show that humans can solve DeepCAPTCHA tasks with relative ease (average accuracy of 83.7%), most likely due to their ability in object recognition and 3D generalization [13], while machines show constant failure in either recognizing objects or guessing the relative sizes.

In summary, this paper presents the following main contributions:

• A novel image-based CAPTCHA (IBC) based on humans' high-level cognitive processes of depth perception,
• The first IBC to satisfy all seven CAPTCHA design guidelines,
• A fully automatic framework that collects, maintains, and refreshes its CAPTCHA materials (3D objects),
• An automatic method to label unknown materials in the database, using a human-machine combination in Merge Sort.

In addition to these main contributions, we present smaller novel techniques in each part of the framework to create a practical IBC.

2. RELATED WORKS
Image-based CAPTCHA (IBC) capitalizes on the human ability to detect, recognize, or understand aspects of images, in a task that is (still) hard for machines. A well-designed IBC can be used by people of different ages, nationalities, and literacy levels (universality in CAPTCHA design). In this section we review several IBCs proposed in the literature, each with individual pros and cons.

Figure 2: Examples of Pix (left) and Bongo (right) CAPTCHA.

IronClad4 is an IBC based on recognizing images of simple objects such as balls, bars, and keys. The user should enter the number of instances of each class of objects to complete the task. This IBC is based on humans' object recognition abilities; however, as it uses a non-distorted database of labeled images, IronClad is vulnerable to attacks by computer vision algorithms (G3) and also to exhaustive attacks (G6). In addition, this IBC uses English as its primary language (users read class names in English) and therefore does not satisfy the universality guideline (G4).

Pix CAPTCHA5 has a large database of labeled images, and for each Ctask chooses 4 images with the same label, distorts them, and presents them to the user. The user should then assign a single label to these four images. Another IBC, Bongo CAPTCHA6, is very similar to Pix in its concept. The user should assign a single block in the request to one of the two shown classes, based on its visual characteristics. Bongo also uses a pre-labeled database from which it selects the images for a new Ctask. Although both Pix and Bongo IBCs use humans' concept abstraction abilities, both of these CAPTCHAs are vulnerable to exhaustive attacks (G6). Moreover, Bongo also has a very high (50%) chance of random success (G5).

4 http://www.securitystronghold.com/products/ironclad-captcha/
5 http://www.captcha.net/captchas/pix/
6 http://www.captcha.net/captchas/bongo/
CAPTCHA The Dog [8] presents 9 images of cats and dogs, and the user should choose all the dogs. This IBC is a classical example of exhaustive attacks (G6). While the database used in this CAPTCHA consisted of 3 million labeled images of cats and dogs, Golle could use the Amazon Mechanical Turk service to exhaust all of the database images, and then use a look-up table to solve each new Ctask [15]. In addition, a machine learning approach recently claims to break the Asirra CAPTCHA [15] (G3).

Google researchers, Gossweiler et al. [16], presented an IBC which asks the user to rotate images to be upright. The authors cleverly avoid "easy" images, whose feature vectors are correlated with their upright position (e.g. images with distinct horizontal lines). The approach requires a relatively small display size, which is preferred by mobile devices. The authors reported average human accuracy to be 82.8%. Despite this high accuracy for humans, the probability of a successful random attack is reported to be 4.48%, which is not significantly low (G5). Furthermore, the images are not distorted, which introduces the risk of a breach in case of an exhaustive attack (G6).

Confident CAPTCHA7 is an IBC which asks the user to click on an image representing an object, animal, landscape, etc., based on an instruction. However, the images used are not distorted, which makes that CAPTCHA vulnerable if the database is publicized, or in case of an exhaustive attack (G6). A similar but stronger CAPTCHA is introduced in [17], which includes two steps. In the first step the user should click on the center of one of several perturbed composite images. If successful, the user is then asked about the content of another image, with possible answers enumerated in a list of 15 categories. The main drawbacks of this approach are that the simple distortions used are easily breakable by machines (e.g. [5]) (G3), and that it is still vulnerable to exhaustive attacks (G6).

Several face-based CAPTCHA have been proposed in the literature, including [6, 9, 14, 7, 18] (Figure 3). In these CAPTCHA the user should locate a face in a cluttered background or match two faces belonging to the same person. All these IBCs are vulnerable to exhaustive attacks (G6). In addition, [6], [9], and [7] are breakable by machines [5, 19, 20] (G3). Finally, [9] and [18] have a high random success probability (G5).

Figure 3: Examples of face-based CAPTCHA: Avatar [7] (left), face detection [14] (right).

Based on our literature review, until now none of the proposed IBCs can achieve all the goals of a reliable CAPTCHA (G1-G7). The main weaknesses of IBCs in the literature are a high chance of random attack success (e.g. Gender CAPTCHA [18]), the high success rate of current machine algorithms (e.g. face detection CAPTCHA [6]), and being prone to exhaustive attacks (arguably all of the previous IBCs).

In this work, we present the first practical IBC, DeepCAPTCHA, that satisfies all CAPTCHA design guidelines, based on the human ability of depth perception. We start with a brief review of depth perception in humans in the next section, and then proceed to the DeepCAPTCHA system design details.

7 http://www.confidenttechnologies.com/products/confidentcaptcha

3. DEPTH PERCEPTION
Human perception of the world relies on the information arriving at the sensory receptors, and on the ability to transform and interpret sensory information (the ability to use what is known to interpret what is being seen). Processes of perception provide the extra layers of interpretation that enable humans to navigate successfully through the environment [21]. This process assigns meaning to percepts (identification), and this identification involves higher-level cognitive processes including theories, memories, values, beliefs, and attitudes concerning the objects.

Depth perception is a vital task in the human visual system that uses object identification, parallel lines, and textures as depth-related visual cues to create a 3D map of the surrounding environment. Our eyes only have two-dimensional retinal images and no special third component for depth perception. This requires an interpretation of our physiological cues that leads to a useful "perception" of the 3D world, by combining the retinal images of our two eyes. Humans use different visual cues including disparity and convergence (slight differences between the retinal images, due to the distance between the pupils), motion parallax (when moving, the different amounts of object movement on the retina determine their relative distances), pictorial cues (visual cues such as linear perspective and texture gradient), overlay (objects occluding each other are perceived in ordered depth), shadows (providing information about the 3D form of objects as well as the position of the light source), relative size (objects of the same size but at varying distances cast retinal images of different sizes), and known size (comparing the current visual angle of an object projected onto the retina with previous knowledge of the object size can determine the absolute depth of a known object) [22, 12].

We here focus on determining the depth of known objects, using previous knowledge of their sizes. When two known objects are viewed together, humans can determine the relative depth of the two objects based on the perceived object sizes. For example, if a chair and a car are perceived with the same size, then the chair is perceived as much closer than the car, as prior information dictates that a car is much larger than a chair. In our proposed CAPTCHA, we use the same principle to create a task (Ctask): we show 2D images of different objects, such that they appear almost the same size, and ask the user to order these objects in terms of size. This task is only possible if the normal sizes of these objects are known, in other words, only if these objects are identified. While humans can perform this task with relative ease (due to their prior knowledge of the objects), it is still a significantly difficult task for machine algorithms to perform.
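To make the known-size cue concrete, the following is a minimal numeric sketch (not from the paper; the chair and car sizes are rough, illustrative assumptions): if two known objects subtend the same visual angle, the ratio of their real-world sizes gives the ratio of their distances.

```python
# Tiny illustration of the "known size" depth cue described above.
# The sizes below are rough, illustrative assumptions, not values from the paper.
KNOWN_SIZE_M = {"chair": 0.9, "car": 4.5}  # approximate real-world extents in meters

def distance_ratio(obj_a, obj_b, retinal_ratio=1.0):
    """How much farther obj_b is than obj_a, given the ratio of their retinal
    (image) sizes retinal_ratio = size_a / size_b, using visual angle ~ real size / distance."""
    return (KNOWN_SIZE_M[obj_b] / KNOWN_SIZE_M[obj_a]) * retinal_ratio

# A chair and a car that *look* equally large must differ in distance by the
# ratio of their real sizes: the car is about 5x farther away.
print(distance_ratio("chair", "car"))  # 5.0
```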
Automatic object recognition is a well-known field in computer vision, with a wide range of algorithms proposed for different object recognition scenarios, which can be categorized into color-based and shape-based methods. While methods such as [23, 24, 25] mainly used color signatures of the objects to recognize their images, works such as [26, 27, 28] mainly used shape features of the objects for recognition, and finally some other methods used a fusion of both color and shape features [29, 30, 31]. Based on these approaches, to prevent a machine from recognizing an object, we have to hide both its color and shape signatures, to such an extent that machines fail to learn and recognize the objects, but also not to the extent that human users fail to recognize the (new appearances of the) object.

In the next section, we describe our proposed framework to automatically choose 3D models of objects, determine their relative sizes, hide their original signatures, and then present them to the end users to be sorted. We also discuss the level of appearance alteration needed to prevent machines from recognizing objects, while keeping the Ctask easy enough for humans.

4. SYSTEM DESIGN AND FRAMEWORK
In order to create a CAPTCHA based on depth perception, we have to have full control over object manipulation, and therefore we choose to use 3D models of objects as the core of our framework. Among the different representations of 3D models, we choose to work with the VRML format, as it is an XML-based 3D object modeling language that is easy to read and manipulate, and is also light-weight and supported by a wide range of software, such as web browsers, stand-alone renderers, and smart-phone applications. Moreover, almost all of the well-established 3D modeling formats (e.g. 3D Studio Max, Maya, and Sketchup) can be easily converted to the VRML format using small open source/freeware applications.

Based on this VRML core, we here describe the design of a fully-automated framework to mine 3D models from the web and use them to create Ctasks. Each Ctask is a grid of 6 images of different objects, and the user's task is to order these images in terms of their relative size (i.e. the size of the object each image represents). In order to automatically create these Ctasks, we devise our framework in five parts, namely, object mining, object filtering, object ordering, appearance altering, and finally presenting the Ctask to the user (user interface). Figure 4 illustrates the DeepCAPTCHA framework. Each of the five components of the framework is proposed with a goal-specific design that makes the entire framework easy to implement and use, while making the resulting Ctasks robust against attacks (automatic solvers, random attacks, as well as exhaustive attacks). We therefore believe that our framework is not only the first framework for an image-based CAPTCHA (IBC) that satisfies all CAPTCHA design guidelines, but also a framework that can be easily scaled and evolved alongside the advances in the computer vision field. Next in this section we describe the details of each component and the contribution in each of these components, followed by a discussion on the framework's scaling and evolving ability.

Figure 4: Our proposed framework for DeepCAPTCHA.

4.1 Object Mining
For a CAPTCHA to work on 3D models, the very first step is to create a database of models, which can be automatically obtained, maintained, and refreshed. Thus the first part of our framework is the object mining part, in which we find free 3D models from the web using web crawlers. There are many publicly available crawlers; we mine several free 3D model datasets8 for 3D models in formats such as VRML, 3D Studio Max (.3ds and .max), Maya (.ms), Sketchup (.skp), etc. (as all of these models can be easily converted to VRML 2.0), using a free Firefox crawler plug-in, FoxySpider9. We could easily gather 1000 3D models within minutes, including models of people, furniture, plants, animals, insects, and other man-made objects. We anonymize each model by changing its file name to a generic name and removing all the included descriptive information, such as tags and object descriptions.

8 www.planit3d.com; www.klicker.de; www.sweethome3d.com; free3d.architecture.sk/
9 https://addons.mozilla.org/En-us/firefox/addon/foxyspider/

4.2 Object Filtering
Using web crawlers we receive a diverse range of 3D models with different quality, complexity, color, texture, etc. If a model is too simple or its appearance cannot be changed significantly (e.g. a sphere), machine learning techniques can be used to learn its appearance and break the CAPTCHA. We should therefore automatically filter out "easy" (for machines) models and only use "hard" models in the CAPTCHA. The second part of DeepCAPTCHA performs a rough classification on each new model to determine how difficult it is for machines to learn the object's signature.

We here use the number of parts, number of points, number of joints, SIFT features, roughness, and curvature of the four major views (front, right, left, back). To gather the training data, we collected the classification accuracies of several well-known classifiers (linear, radial-basis, and polynomial SVM, random forest, and K-NN) on images of 100 objects. For each object, we have 24 views, each view showing the object in a random color. We use 14 views for training and 10 for testing for each of the classifiers.

Figure 5: Filtering "easy" models based on rough classification accuracies on model features.

Figure 5 illustrates the maximum accuracy of these classifiers based on the aforementioned features. As this figure shows, classification for the majority of models is at the random rate; however, some of the models create spikes in this trend, as they are easy-to-learn models based on their visual appearances. The objective of the second part of the DeepCAPTCHA framework is to remove these "easy" models from the dataset. We therefore use the model features, labeled with the related maximum classification accuracy, to train a new SVM classifier to label each new model as easy or hard.
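As a rough illustration of this filtering step, the sketch below (not the authors' code) assumes a precomputed feature vector per model and the maximum accuracy any of the rough classifiers reached on it; the chance-level threshold, margin, and function names are illustrative assumptions.

```python
# Sketch of the "easy vs. hard" model filter described above.
# Assumptions: features[i] is a precomputed feature vector per 3D model (e.g.
# number of parts/points/joints, SIFT statistics, roughness, curvature of the
# four major views) and max_acc[i] is the best accuracy any rough classifier
# reached on that model; the 0.5 chance level and 0.1 margin are illustrative.
import numpy as np
from sklearn.svm import SVC

def train_easy_hard_filter(features, max_acc, chance_level=0.5, margin=0.1):
    """Label a model 'easy' (1) if some classifier clearly beats chance on it,
    otherwise 'hard' (0), then fit an SVM to predict this label for new models."""
    X = np.asarray(features, dtype=float)
    y = (np.asarray(max_acc, dtype=float) > chance_level + margin).astype(int)
    return SVC(kernel="rbf").fit(X, y)

def keep_hard_models(clf, new_features):
    """Return indices of newly mined models predicted to be 'hard' for machines."""
    pred = clf.predict(np.asarray(new_features, dtype=float))
    return [i for i, p in enumerate(pred) if p == 0]
```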
Now that all database models have a baseline complexity, we need to know their relative sizes in order to create a DeepCAPTCHA Ctask, which is the next part of our framework.

4.3 Object Ordering
In a DeepCAPTCHA task we present 6 objects to be sorted in terms of size by the user, and as these objects are automatically mined from the web, we have no intuition about their real-world sizes. We therefore need to first know the relative size between the objects, in order to know whether a CAPTCHA task (Ctask) is solved correctly. The purpose of the third part of the DeepCAPTCHA framework is to use a human-machine sorting algorithm (a modified Merge Sort) to continuously order the database of objects, as the stream of new objects continuously refreshes the database.

There are two possible approaches to object size assignment, namely, using absolute size categories and using relative size relationships. In the absolute size scenario, humans assign a categorical size (e.g. very small, small, normal, big, very big, etc.) to each object. This scenario has the advantage that it requires only n comparisons (n is the number of objects in the database). On the other hand, using categorical sizes restricts object sizes to a limited set, which in turn restricts the possible set of answers to the final Ctask, thus reducing the robustness of the CAPTCHA against random attacks. Moreover, with a limited set of size categories, there is a high possibility that different users assign different sizes to the same object. In contrast, using relative size relationships increases the space of possible answers to the Ctask, and also addresses the ambiguity problem of assigning different sizes to the same object, but requires a larger number of comparisons (O(n log n)).

We choose the relative size approach as it produces a stronger Ctask. To store the object relations, we propose to use a directed graph of linked lists as the data structure, G = (V, E). Each vertex v is a linked list storing objects with similar sizes, and each edge e represents the relation "larger than" to another set of objects.

The challenge in automatically sorting the object database is that machine-learning approaches fail to perform object recognition (a desirable feature for a CAPTCHA). We propose to use a modified Merge Sort algorithm that uses the Amazon Mechanical Turk (AMT) service (actual humans) (similar to [32]) to compare objects and add them to the appropriate vertices in G. Merge Sort treats the database as an unsorted list of objects, and as it progresses through this list, whenever a comparison between two objects x_i and x_j (x_i, x_j ∉ V or (x_i, x_j) ∉ E) is required, their images are sent to the AMT service for a relative size comparison. After a certain number of AMT responses, the median of the answers is recorded in the database.

On occasion, scale thresholds may not be sharp, so the selection of 3D models needs to account for this potential confusion by discarding objects that may be perceived as belonging to more than one size category. Other foreseeable challenges include the likely familiarity of the target user with the object category and instance (groups of users may not be familiar with some car categories or models, unconventional chair models, or uncommon animals), as well as closeness to a set of canonical projections of objects, as further explained below. To avoid models with ambiguity (e.g. a car model that also appears like a toy), we discard objects to which more than l (here l = 3) AMT users assign different relations.

4.3.1 Partially Sorted Database
We chose the relative size scenario, which requires more object comparisons and therefore costs more than the absolute size scenario (choosing a stronger CAPTCHA over a smaller number of comparisons). For a completely ordered set we need n log(n) comparisons in the worst case, and as we are outsourcing the comparisons, the number of comparisons becomes a very important issue in the system design for DeepCAPTCHA. This is because the system is charged for each comparison by AMT, and the comparison by humans also becomes the running-time bottleneck of the entire system. We use two approaches to reduce the number of required comparisons.

The first approach is to use a partially-sorted database to create Ctasks. Considering the database of objects as the list L to be sorted, we can sort disjoint sub-lists l_1, l_2, ..., l_k ⊂ L, up to a state where the final Ctask meets the required robustness against random attacks (having a large enough answer space), while consuming the minimum resources. The size of these sorted sub-lists can be increased when the system constraints are relaxed. The main idea behind using a partially sorted dataset is that we only need to know the relation between a small number of objects for each Ctask; therefore, given a large number of disjoint small sorted sub-lists with len(l_i) ≪ len(L) that are just long enough to create a Ctask, we can avoid a considerable number of comparisons.

Using Merge Sort brings two advantages in implementing the partially-sorted database approach. First, during the sorting process in Merge Sort, at each point one can know the number of disjoint sorted sub-lists, as well as their lengths. Second, Merge Sort can be directly used for parallel sorting, which matches the nature of the AMT service, significantly reduces the required running time, and creates individual sorted sub-lists in the process.
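The following is a minimal sketch of how such a human-in-the-loop, partially stopped Merge Sort could look; amt_compare is a hypothetical placeholder for the Mechanical Turk round-trip (it is not implemented here), and the stopping rule is a simplified reading of the partially-sorted-database idea above.

```python
# Sketch of the human-machine Merge Sort of Sections 4.3-4.3.1 (assumptions only).
# amt_compare() stands in for the AMT round-trip: in the real system the two
# object images would be posted as a HIT and the aggregated (e.g. median) vote
# returned, with +1 meaning "first object larger" and -1 otherwise.
def amt_compare(obj_a, obj_b):
    raise NotImplementedError("comparison is outsourced to human workers")

def merge(left, right, compare):
    """Standard merge step; compare(a, b) <= 0 means a is not larger than b."""
    merged = []
    while left and right:
        if compare(left[0], right[0]) <= 0:
            merged.append(left.pop(0))
        else:
            merged.append(right.pop(0))
    return merged + left + right

def partial_merge_sort(objects, compare=amt_compare, min_sublist_len=6):
    """Bottom-up merge sort that stops once the sorted sub-lists are long enough
    to build a Ctask (here 6 objects), skipping the most expensive final merges."""
    runs = [[obj] for obj in objects]
    while len(runs) > 1 and len(runs[0]) < min_sublist_len:
        runs = [merge(runs[i], runs[i + 1], compare) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
    return runs  # disjoint sorted sub-lists, each usable as a Ctask source
```

Because each pass merges independent pairs of runs, the human comparisons within a pass could also be issued to AMT in parallel, which is the second advantage noted above.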

4.3.2 reCAPTCHA Concept


Our second approach to reduce the number of required comparisons is to use the reCAPTCHA [4] concept in DeepCAPTCHA. The reCAPTCHA project helps to digitize books by sending words that cannot be read by computers (OCR) to the Web in the form of CAPTCHA for humans to decipher. Each CAPTCHA task consists of two words, one known to the system and one unknown, coming from OCR. After more than a certain number of users enter the same answer for the unknown word, the system registers that value as the correct answer. Similarly, in DeepCAPTCHA we have a partially ordered dataset, and therefore in each Ctask we can select c − 1 objects {x_1, ..., x_{c−1}} from the ordered list and an object x_0 from the unordered list. Then, if more than a certain number of users select the same order for object x_0, this order is added to the database as the relation of x_0 to {x_1, ..., x_{c−1}}. Using this approach, we can gradually increase the set of database relationships without outsourcing them to the AMT service.

Figure 6: Machine performance for recognizing 10 sample objects, as the viewpoint difference increases. A 20-degree viewpoint change seems an optimal choice.

4.4 Appearance Alteration
Regardless of the number of objects in the database, the entire database may be exhausted using human-solver services if these objects are used in their original appearances (similar to previous IBCs, e.g. CAPTCHA The Dog [8]). The fourth part of our framework is to alter object appearances before presenting them as Ctasks. We use 3D translation, 3D rotation, re-illumination, re-coloring, and background cluttering to hide the object signature from the "eyes" of machines.

Translation and Rotation: For each of the selected objects, we use translation and rotation to change the viewpoint of the object. Object rotation in 3D significantly changes the object appearance in the final 2D image. The center of rotation is the center of the object bounding box. For each object we rotate the object between 20 and 340 degrees. This is because our experiments show that a minimum 20-degree change of view reduces the performance of machine-learning algorithms to close to random. Figure 6 illustrates machine-learning performance as the viewpoint difference increases, for 10 sample objects. For this experiment, we feed 14 commonly used features in object detection, namely, RGB histogram, opponent color histogram, Hue-Saturation histogram, NRGB histogram, transformed color histogram, color moments, SIFT, color invariant SIFT, HueSIFT, HSVSIFT, opponent SIFT, RG SIFT, RGB SIFT, and SURF (see [33, 34, 35, 36, 37, 38, 39]).

In addition to general object rotation, in cases where VRML joints are defined for the object (e.g. human body joints), we randomize the joint angles in the permitted range to increase the appearance variation.

Re-illumination: After translation and rotation, we re-illuminate the selected object using 12 VRML point lights around the bounding box of the object, 4 around the front, 4 around the middle, and 4 around the back of the object, at 3 object-widths away from the center of the front, middle, or back. In each alteration, we randomly turn on 3 to 6 lights.

Re-coloring: Given an object with p parts, we draw p samples from the HSV space with uniform probability, in order to hide the color signature of the object. The HSV color space is used in several color-based object recognition approaches including [40, 33, 39, 41]; thus, randomizing the object color signature in this space helps confuse these machine algorithms. In addition, for each object part to be re-colored, with a probability of 50% we either assign a new color or assign the color of the previous part of the object. This uncertainty in assigning a new color, or reusing a previously used color, is to prevent algorithms from identifying or counting object parts from their colors.

Background Cluttering: A cluttered background is one of the commonly used methods to fool object detection algorithms in text- and image-based CAPTCHA (e.g. in [6][14]). Like these CAPTCHA, we also use background cluttering to increase the difficulty for machines, but unlike previous CAPTCHA, we clutter the background with the very same features for which machines are searching. We use fractions of random object parts (randomly generated object sections from VRML code from the database), and thus add several other objects' feature signatures into the CAPTCHA image. As a result of each CAPTCHA image containing signatures of several objects other than the main object, the difficulty of automatic object recognition dramatically increases. On the other hand, humans can easily differentiate the fully represented object from the cluttered background of object parts. To further aid this differentiation, we reduce the transparency of the background by 20%. This effect attracts human attention to the main object due to its higher contrast.

The final image representing the selected object is resized so that all six selected objects appear (almost) the same size in the final CAPTCHA presentation. We resize the images so that the lengths of the objects along their major axes are the same. The major axis in each image is found by fitting a line to the 2D locations of the pixels belonging to the object in the 2D image. Using the 3D model we can easily find the pixels belonging to the object by using the alpha/depth channel of the VRML renderer. An example of the final altered image is illustrated in Figure 7.

Figure 7: An example 3D model (top), and its final altered image (bottom), created on the fly to be sent for a Ctask.

The result of the appearance alteration part is a multiplication factor of over 1:200, i.e. we can create 200 different images from each 3D model in the database, on the fly. This multiplication factor, comparable to that of text CAPTCHAs, has never been achieved by previously proposed IBCs, and plays a significant role in the protection against exhaustive and machine-solver attacks.
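A minimal sketch of two of these alterations (rotation-angle selection and part re-coloring in HSV) is given below; it is illustrative only, uses hypothetical function names, and omits the actual VRML field edits and rendering.

```python
# Sketch of two appearance alterations described above (assumptions only):
# a viewpoint change of at least 20 degrees, and per-part re-coloring in HSV
# with a 50% chance of reusing the previous part's color.
import colorsys
import random

def random_rotation_deg(min_change=20, max_change=340):
    """Rotation (degrees) about the bounding-box center, drawn from [20, 340]."""
    return random.uniform(min_change, max_change)

def recolor_parts(num_parts):
    """Return one RGB color per part: each color is drawn uniformly in HSV,
    except that with probability 0.5 a part reuses the previous part's color."""
    colors, prev = [], None
    for _ in range(num_parts):
        if prev is not None and random.random() < 0.5:
            colors.append(prev)          # reuse, to hide the true part count
        else:
            h, s, v = random.random(), random.random(), random.random()
            prev = colorsys.hsv_to_rgb(h, s, v)
            colors.append(prev)
    return colors
```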
In order to test the effectiveness of the appearance alteration, we perform two sets of experiments. Note that for an algorithm to solve DeepCAPTCHA, it has to either identify the objects and then use prior information to determine the size of the objects (e.g. first identify a bee and a car, and based on prior information about the relative size of cars and insects, solve the CAPTCHA), or directly estimate the relative size of the presented objects (e.g. into generic size categories). We therefore test machine performance in both of these scenarios. Using 100 objects as the dataset, we create 20 images per object based on the above appearance alterations. We then trained several well-known classifiers (linear, radial-basis, and polynomial SVM, random forest, and K-NN) using a randomly selected 60% of the images, based on the aforementioned 14 features (RGB histogram, opponent color histogram, SIFT, SURF, etc.), and then tested their performance on the remaining 40%. Our experimental results show that due to the introduced variations in object appearances, machines constantly fail in both object recognition and object size estimation. Figure 8 illustrates the machines' random behavior in assigning size categories and recognizing objects.

Figure 8: ROC curves (red) of machine performance on classifying objects into size categories (top) and recognizing objects (bottom). The blue line indicates the applied threshold.

4.5 Presentation
The final part of the DeepCAPTCHA framework is the presentation on the user's device. This part should present an intuitive user interface that is both easy to understand and use for humans, and robust to random attacks. As most internet users migrate to mobile devices, this interface should be compatible with various devices and screen sizes.

We use a simple, light-weight user interface consisting of a 3 × 2 grid of images to be sorted by the user. The user can select images in order of size by clicking (or tapping) on them. As the user selects images, his/her selection appears as image thumbnails above the 3 × 2 grid, and the user can reset the current selection by clicking (or tapping) on these thumbnails. Finally, there is a submit button to be hit when a confident order is reached, and a refresh button for the cases where the user cannot solve the Ctask for any reason.

In each request for a Ctask, we allow up to one "equal" relation between the presented objects (i.e. not more than two objects are equal in size). In addition, to avoid any ambiguity in size, we restrict the objects with no "equal" relation to have at least 2 levels of distance from each other (i.e. objects with a minimum distance of d = 2 in the database graph G).

The random attack success probability in this presentation is 1/6! = 1.3889 × 10^-3 for cases where none of the objects have equal sizes, and 2/6! = 2.7778 × 10^-3 for cases with two objects having the same size. This probability of success for random attacks falls well within the safe range based on the literature [6, 8, 16].

It should be noted that in the case of using a partially sorted database, the random attack success probability would be higher than the above. This is because the space of possible answers shrinks, due to the reduction of the total number of usable objects (sorted lists of objects). Given a sorted dataset of n objects, and c objects in each Ctask, the probability of a successful random attack is

1 / (C(n, c) × c!) = c!(n − c)! / (n! c!) = 1 / P(n, c) = (n − c)! / n!

If the sorting stops just before the final merge (reducing the comparisons by n/2), the success probability increases to (n/2 − c)! / (n/2)!. However, this reduction does not threaten the final robustness of the Ctask against random attacks. For example, in a dataset of 1000 objects (realistic databases may have more than 100,000 objects) with each Ctask requiring 6 objects, stopping at sub-lists with lengths of n/2 and n/4 increases the success probability of a random attack from 1/994,010,994,000 to 1/61,752,747,000 and 1/3,813,186,000 respectively, which does not have any significant effect on the robustness of Ctasks.

5. REALIZATION AND HUMAN TESTS


For the actual realization of the interface, we developed a web service that, when called, sends a Ctask from our DeepCAPTCHA web server. Using JavaScript+HTML5 enabled us to efficiently display DeepCAPTCHA on both conventional and mobile devices, and moreover makes it possible to adapt to mobile device capabilities, such as using an orientation change to re-order the interface for the current orientation, or a shake action by the user as the refresh action.
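As an illustration only (not the authors' implementation), such a Ctask endpoint could be as small as the sketch below; the path, JSON fields, and the make_ctask helper are hypothetical placeholders.

```python
# Minimal sketch of a web service that returns a Ctask description as JSON.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def make_ctask():
    """Hypothetical helper: would pick 6 freshly rendered object images and
    remember their correct size order on the server side."""
    return {"task_id": "t-0001",
            "images": [f"/images/t-0001/{i}.png" for i in range(6)],
            "instruction": "Order the objects from smallest to largest."}

class CtaskHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/ctask":
            body = json.dumps(make_ctask()).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

if __name__ == "__main__":
    HTTPServer(("", 8080), CtaskHandler).serve_forever()
```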
We conduct an experiment to observe actual human users' performance and their preference between DeepCAPTCHA and conventional text CAPTCHA. After collecting the user information (gender, age, education level, etc.), the users should complete 7 DeepCAPTCHA tasks and 3 text CAPTCHA tasks (from Google's text CAPTCHA web service). In order to assess the increase in difficulty for human users caused by adding appearance alterations, we divided the 7 DeepCAPTCHA tasks into 3 sets. The very first DeepCAPTCHA task consists of only original objects (without appearance alteration); the next 3 tasks consist of objects with color and lighting variations; and finally the last 3 DeepCAPTCHA tasks consist of full appearance alteration, including re-coloring, re-illumination, and background addition.

A total of 66 users participated in our experiment, using PCs, tablets, or smart phones, with an average age of 28.5 years, 69% male and 31% female, and with education levels ranging from high-school students to PhD students. These users connected to our web service from several countries around the world, including the US, Canada, Australia, Singapore, and China, solving 311 DeepCAPTCHA tasks and more than 115 text CAPTCHA tasks.

Figure 9 illustrates the average human performance in our experiment, with 87.7% in total over the experiment (including all three types of DeepCAPTCHA tasks). We can observe the decrease in average performance as the appearance alteration level increases (from 92.31% to 86.87% and 83.7%); however, even with the addition of backgrounds, the performance level is higher than the acceptable range in the literature [6, 8, 16].

Figure 9: Average human performance on Ctasks with different distortion levels.

We also performed a usability test, in which 86.84% preferred DeepCAPTCHA over text CAPTCHA and reported a much higher ease of use for DeepCAPTCHA than for text CAPTCHA, as shown in Table 1. The results of this usability test indicate that, given a reliable service, users are likely to migrate from using TBCs to using DeepCAPTCHA (particularly due to its ease of use). Moreover, their high preference for DeepCAPTCHA can also indicate users' trust in the security of DeepCAPTCHA. These two points motivate us to advance our IBC to the next level, a beta version, based on a realistically large database and reliable web servers.

Table 1: Average user preferences and ease-of-use.

                Preferred    Ease-of-Use
DeepCAPTCHA     86.85%       3.29 / 5
Text CAPTCHA    13.16%       0.87 / 5

6. SUMMARY AND CONCLUSION
Text-based CAPTCHAs (TBCs) have been used for over a decade now, with their limitations, and image-based CAPTCHAs (IBCs) have risen to address the problems of TBCs. However, IBCs face challenges that do not threaten text-based CAPTCHAs, including computer vision solver algorithms and exhaustive attackers.

In this paper we presented DeepCAPTCHA, an IBC built on CAPTCHA design guidelines, the human ability of depth perception, and empirical experiments, and the first practical IBC that satisfies all CAPTCHA design guidelines. DeepCAPTCHA exploits 3D models, human-machine sorting, and appearance alteration distortions to confuse machines, while keeping the task easy for humans. Our experiments also show high performance and satisfaction from human users, and constant failure from conventional machine algorithms.

DeepCAPTCHA can also stimulate interesting research in the computer vision field, as this type of IBC draws on experience gained from interacting with the physical environment, through an evolutionary process. We can therefore envision a robot that learns to differentiate objects by size as a result of observing and interacting with everyday objects over extended time periods.

Finally, based on our proposed approaches for cost reduction, we can have a constantly evolving database for our IBC. As future work, we plan to extend our current object mining strategy to: a) target specific types or scales of models in available 3D model databases such as Google Warehouse10, Archive 3D11, and the rapidly evolving repositories for 3D printing files; b) connect with engineering, design, and art programs where students continuously produce original 3D models. Using this constant stream of new 3D models, our approach can be even closer to TBCs in terms of the pool of original content.

10 http://sketchup.google.com/3dwarehouse/
11 http://archive3d.net/

7. REFERENCES
[1] L. von Ahn, M. Blum, N. Hopper, and J. Langford, "Captcha: Using hard AI problems for security," EUROCRYPT 2003, pp. 646–646, 2003.
[2] J. Yan and A. S. El Ahmad, "Usability of captchas or usability issues in captcha design," in SOUPS '08, 2008, pp. 44–52.
[3] L. von Ahn, M. Blum, and J. Langford, "Telling humans and computers apart automatically," Commun. ACM, vol. 47, no. 2, pp. 56–60, 2004.
[4] L. von Ahn, B. Maurer, C. McMillen, D. Abraham, and M. Blum, "recaptcha: Human-based character recognition via web security measures," Science, vol. 321 (5895), pp. 1465–1468, 2008.
[5] B. B. Zhu, J. Yan, Q. Li, C. Yang, J. Liu, N. Xu, M. Yi, and K. Cai, "Attacks and design of image recognition captchas," in CCS '10. ACM, 2010, pp. 187–200.
[6] Y. Rui and Z. Liu, "Artifacial: Automated reverse turing test using facial features," MMSys, vol. 9, no. 6, pp. 493–502, 2004.
[7] D. DSouza, P. C. Polina, and R. V. Yampolskiy, "Avatar captcha: Telling computers and humans apart via face classification," in EIT. IEEE, 2012, pp. 1–6.
[8] J. Elson, J. R. Douceur, J. Howell, and J. Saul, "Asirra: a captcha that exploits interest-aligned manual image categorization," in CCS '07, 2007, pp. 366–374.
[9] D. Misra and K. Gaj, "Face recognition captchas," AICT, p. 122, 2006.
[10] W. J. Clancey, Situated Cognition: On Human Knowledge and Computer Representations (Learning in Doing: Social, Cognitive and Computational Perspectives). Cambridge University Press, Aug. 1997. [Online]. Available: http://www.worldcat.org/isbn/0521448719
[11] W. Hudson, "Pictorial depth perception in sub-cultural groups in africa," The Journal of Social Psychology, vol. 52, no. 2, pp. 183–208, 1960.
[12] V. Bruce, P. Green, and M. Georgeson, Visual Perception: Physiology, Psychology and Ecology, 3rd ed. Psychology Press, Hove, 1996.
[13] T. Palmeri and I. Gauthier, "Visual object understanding," Nature Reviews Neuroscience, vol. 5 (4), pp. 291–303, 2004.
[14] G. Goswami, B. M. Powell, M. Vatsa, R. Singh, and A. Noore, "Facedcaptcha: Face detection based color image captcha," FGCS, 2012.
[15] P. Golle, "Machine learning attacks against the asirra captcha," in CCS '08, 2008, pp. 535–542.
[16] R. Gossweiler, M. Kamvar, and S. Baluja, "What's up captcha?: a captcha based on image orientation," in WWW '09, 2009, pp. 841–850.
[17] R. Datta, J. Li, and J. Z. Wang, "Imagination: a robust image-based captcha generation system," in MULTIMEDIA '05, 2005, pp. 331–334.
[18] J. Kim, S. Kim, J. Yang, J.-h. Ryu, and K. Wohn, "Facecaptcha: a captcha that identifies the gender of face images unrecognized by existing gender classifiers," Multimedia Tools and Applications, pp. 1–23, 2013.
[19] M. Korayem, A. Mohamed, D. Crandall, and R. Yampolskiy, "Learning visual features for the avatar captcha recognition challenge," in ICMLA, vol. 2, 2012, pp. 584–587.
[20] T. Yamasaki and T. Chen, "Face recognition challenge: Object recognition approaches for human/avatar classification," in ICMLA, vol. 2, 2012, pp. 574–579.
[21] P. G. Zimbardo, Psychology and life. HarperCollins, 1992.
[22] S. Coren, C. Porac, and L. M. Ward, Senses and sensation; Perception. Academic Press (New York), 1979.
[23] T. Gevers and H. Stokman, "Robust histogram construction from color invariants for object recognition," PAMI, vol. 26, pp. 113–118, 2004.
[24] H. Stokman and T. Gevers, "Selection and fusion of color models for image feature detection," PAMI, vol. 29, no. 3, pp. 371–381, 2007.
[25] T. Liu, H. Guo, and Y. Wang, "A new approach for color-based object recognition with fusion of color models," in CISP, vol. 3, 2008, pp. 456–460.
[26] M. Fussenegger, P. M. Roth, H. Bischof, and A. Pinz, "On-line, incremental learning of a robust active shape model," Proc. DAGM-Symp. Pattern Recognit., pp. 122–131, 2006.
[27] A. Toshev, A. Makadia, and K. Daniilidis, "Shape-based object recognition in videos using 3d synthetic object models," in CVPR, 2009, pp. 288–295.
[28] Y. Zhao and A. Cai, "A novel relative orientation feature for shape-based object recognition," in IC-NIDC, 2009, pp. 686–689.
[29] A. Diplaros, T. Gevers, and I. Patras, "Combining color and shape information for illumination-viewpoint invariant object recognition," Image Processing, IEEE Transactions on, vol. 15, pp. 1–11, 2006.
[30] W. Hu, X. Zhou, W. Li, W. Luo, X. Zhang, and S. Maybank, "Active contour-based visual tracking by integrating colors, shapes, and motions," Image Processing, IEEE Transactions on, vol. 22, no. 5, pp. 1778–1792, 2013.
[31] W. Wang, L. Chen, D. Chen, S. Li, and K. Kuhnlenz, "Fast object recognition and 6d pose estimation using viewpoint oriented color-shape histogram," in ICME, 2013, pp. 1–6.
[32] G. Little, L. B. Chilton, M. Goldman, and R. C. Miller, "Turkit: human computation algorithms on mechanical turk," in Proceedings of the 23rd annual ACM symposium on User interface software and technology. New York, NY, USA: ACM, 2010, pp. 57–66.
[33] H. Yu, M. Li, H.-J. Zhang, and J. Feng, "Color texture moments for content-based image retrieval," in ICIP, vol. 3, 2002, pp. 929–932.
[34] I. Omer and M. Werman, "Color lines: Image specific color representation," CVPR, pp. 946–953, 2004.
[35] K. Konstantinidis, A. Gasteratos, and I. Andreadis, "Image retrieval based on fuzzy color histogram processing," Optics Communications, vol. 248, no. 4, pp. 375–386, 2005. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0030401804013069
[36] A. Abdel-Hakim and A. Farag, "Csift: A sift descriptor with color invariant characteristics," in CVPR, vol. 2, 2006, pp. 1978–1983.
[37] F. Pavel, Z. Wang, and D. Feng, "Reliable object recognition using sift features," in MMSP, 2009, pp. 1–6.
[38] X.-Y. Wang, J.-F. Wu, and H.-Y. Yang, "Robust image retrieval based on color histogram of local feature regions," Multimedia Tools Appl., vol. 49, no. 2, pp. 323–345, 2010.
[39] F. D. M. de Souza, E. Valle, G. C. Chavez, and A. de Albuquerque Araujo, "Hue histograms to spatiotemporal local features for action recognition," CoRR, 2011.
[40] R. Cucchiara, C. Grana, M. Piccardi, A. Prati, and S. Sirotti, "Improving shadow suppression in moving object detection with hsv color information," in ITS, 2001, pp. 334–339.
[41] X. Hu and W. Hu, "Motion objects detection based on higher order statistics and hsv color space," in ICM, vol. 3, 2011, pp. 71–74.
