Chapter 1
Advanced Topics on Web Engineering
Advanced Internet Programming 1
Contents
• Web services
• Semantic Web
• RSS and ATOM
• CAPTCHA
• Workflow Languages
Web Services
• Web services are software systems that allow machines to
communicate over the internet using standardized protocols.
• They are designed to support interoperable machine-to-machine
interaction, typically over HTTP.
• Web services offer a way for applications, often written in different
languages, to interact and exchange data across a network.
• They follow a client-server model, where one application (client)
requests a service, and another (server) provides the service.
Types of Web Services
• SOAP (Simple Object Access Protocol): SOAP is a protocol that
uses XML as its message format and can run over different transport
protocols, such as HTTP or SMTP.
• It is widely used in enterprise systems because of its strict message
structure and security extensions such as WS-Security. SOAP is considered
heavyweight because every message is wrapped in a verbose XML envelope.
• REST (Representational State Transfer): REST is an architectural
style that uses standard HTTP methods (GET, POST, PUT, DELETE) to
operate on resources identified by URLs.
• REST is lightweight, making it highly suitable for web-based systems and
APIs. Unlike SOAP, REST does not require XML and can exchange data in
JSON, XML, or other formats.
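The difference in message weight can be made concrete with a small sketch. It builds the same hypothetical "get temperature" request twice: once as a SOAP 1.1 envelope and once as a REST-style method and path. The service namespace, the GetTemperature operation, and the /cities/... path are assumptions made up for illustration; only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.com/weather"  # hypothetical service namespace


def build_soap_request(city):
    """Wrap a hypothetical GetTemperature call in a SOAP 1.1 envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}GetTemperature")
    ET.SubElement(call, f"{{{SERVICE_NS}}}City").text = city
    # Returns bytes that would be sent as the body of an HTTP POST.
    return ET.tostring(envelope, encoding="utf-8")


def build_rest_request(city):
    """Describe the equivalent REST call as (HTTP method, resource path)."""
    # The resource is named by the URL; the HTTP method carries the intent.
    return "GET", f"/cities/{quote(city)}/temperature"


if __name__ == "__main__":
    print(build_soap_request("New York").decode("utf-8"))
    print(*build_rest_request("New York"))
```

In the SOAP version the operation name lives inside the XML body, while in the REST version it is expressed by the HTTP method and the URL, which is one reason REST requests tend to be shorter.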
Semantic Web
• The Semantic Web extends the current web by adding a layer of
meaning to web content, enabling machines to understand and
process the data.
• Instead of treating web pages as a collection of unrelated content,
the Semantic Web introduces standards and technologies that
make data on the web more structured and meaningful for
automated agents.
• The goal of the Semantic Web is to transform the web into a space
where data is interconnected and machines can "understand"
relationships between different pieces of information.
• It relies heavily on metadata and ontologies to define these
relationships.
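One way to picture "metadata and ontologies" is as a set of subject-predicate-object statements. The sketch below assumes an RDF-style triple model, which the slides do not name explicitly, and uses made-up ex: identifiers. It shows how a program can follow a subclass relationship to derive a fact that is not stated directly.

```python
# Hypothetical facts written as (subject, predicate, object) triples.
TRIPLES = [
    ("ex:TimBernersLee", "rdf:type", "ex:Person"),
    ("ex:TimBernersLee", "ex:invented", "ex:WorldWideWeb"),
    ("ex:WorldWideWeb", "rdf:type", "ex:InformationSystem"),
    ("ex:Person", "rdfs:subClassOf", "ex:Agent"),  # a tiny "ontology" statement
]


def match(triples, s=None, p=None, o=None):
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


def types_of(triples, resource):
    """Return the direct types of a resource plus all their superclasses."""
    found = set()
    pending = [o for _, _, o in match(triples, s=resource, p="rdf:type")]
    while pending:
        cls = pending.pop()
        if cls not in found:
            found.add(cls)
            # Follow the ontology: every superclass is also a type.
            pending.extend(
                o for _, _, o in match(triples, s=cls, p="rdfs:subClassOf")
            )
    return found


if __name__ == "__main__":
    print(sorted(types_of(TRIPLES, "ex:TimBernersLee")))
```

No triple says that ex:TimBernersLee is an ex:Agent; the program infers it from the subclass statement, which is the kind of machine "understanding" the Semantic Web aims for.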
RSS and Atom
• RSS (Really Simple Syndication) and Atom are web feed formats
used to publish frequently updated information such as blog entries,
news headlines, or podcasts.
• They allow users to subscribe to updates from websites without
having to visit them manually. Both formats provide a standardized
way of delivering content updates.
• RSS: Introduced in the late 1990s, RSS is an XML-based format.
• Websites syndicate (publish) their content in RSS feeds, and feed readers
(aggregators) pull this content periodically to show updates.
• Atom: Atom was developed as an alternative to RSS, aiming to address
some of its limitations.
• It also uses XML but offers a more flexible and feature-rich format for representing web
feeds.
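The "pull periodically and show updates" behaviour of a feed reader can be sketched with the standard library alone. The feed below is a hypothetical RSS 2.0 document embedded as a string; a real aggregator would download it over HTTP and would also need to handle Atom's different element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed used only for illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://example.com/</link>
    <description>Hypothetical feed</description>
    <item>
      <title>First post</title>
      <link>https://example.com/posts/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/posts/2</link>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return (title, link) pairs for every <item> in an RSS feed."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


def new_items(feed_xml, seen_links):
    """Return only the items whose link the reader has not seen before."""
    return [(t, l) for t, l in read_items(feed_xml) if l not in seen_links]


if __name__ == "__main__":
    already_seen = {"https://example.com/posts/1"}
    for title, link in new_items(SAMPLE_RSS, already_seen):
        print(title, link)
```

Tracking seen links is one simple way to detect updates; real readers often rely on each item's guid or publication date instead.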
CAPTCHA (Completely Automated Public Turing test to
tell Computers and Humans Apart)
• A CAPTCHA is a type of challenge-response test used to determine
whether the user is human or an automated bot.
• CAPTCHAs are widely implemented on web pages, particularly in
forms, login pages, and online services, to prevent bots from
spamming, creating fake accounts, or exploiting services.
CAPTCHA
• CAPTCHA tests exploit tasks that are easy for humans but hard for
computers, such as:
• Recognizing distorted text.
• Selecting images that match a given category (e.g., "select all
images containing cars").
• Solving simple puzzles or logic questions.
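The challenge-response idea can be sketched with a "simple puzzle" variant. This is a toy illustration only: an arithmetic question is trivial for a bot to solve, and the token can be replayed, so it should not be read as a usable CAPTCHA. The secret key, function names, and token scheme are assumptions made up for this example.

```python
import hashlib
import hmac
import random

SECRET_KEY = b"change-me"  # hypothetical server-side secret, never sent to the client


def _sign(answer):
    """Return an HMAC of the answer so the server need not store the challenge."""
    return hmac.new(SECRET_KEY, answer.encode("utf-8"), hashlib.sha256).hexdigest()


def make_challenge(rng=None):
    """Create a question for the user and a signed token for the hidden answer."""
    rng = rng or random.Random()
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"What is {a} + {b}?", _sign(str(a + b))


def verify(response, token):
    """Check the user's response against the token issued with the challenge."""
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(_sign(response.strip()), token)


if __name__ == "__main__":
    question, token = make_challenge()
    print(question)
    print("accepted" if verify(input("> "), token) else "rejected")
```

The structure is the same as in real systems: the server issues a challenge, the client answers, and the server checks the answer without trusting anything the client computed.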
Types of CAPTCHA
• Text-based CAPTCHAs: Users must type a distorted or blurred text
string into a field.
• Image-based CAPTCHAs: Users select images that match a
description; this approach is used by Google’s reCAPTCHA system.
• Audio CAPTCHAs: For accessibility, these provide an audio clip
that the user must interpret and type.
Modern CAPTCHA Systems:
• reCAPTCHA: Developed by Google, reCAPTCHA uses more sophisticated
methods such as behavioral analysis (e.g., tracking mouse movements) to
distinguish between humans and bots.
• reCAPTCHA v2 usually asks only for a click on an "I’m not a robot"
checkbox, and the latest versions often require no direct user interaction at all.
Importance of CAPTCHA
• CAPTCHAs help maintain security on the web by making it harder
for automated scripts to abuse online services.
Workflow Languages
• Workflow languages are specialized programming languages or
notations used to describe, automate, and manage business
processes or workflows.
• A workflow represents a series of tasks or activities that must be
completed in a specific sequence or in parallel to achieve a particular
business goal.
• Workflow languages allow organizations to model, execute, and
manage these processes efficiently.
• These languages play a critical role in Business Process
Management (BPM) and service-oriented architectures (SOA).
Key Concepts in Workflow Languages
• Tasks: Basic units of work.
• Transitions: Define how the flow moves between tasks.
• Actors: Individuals or systems responsible for executing tasks.
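The three concepts above can be modelled in a few lines of general-purpose code. This is a minimal sketch, not a real workflow language: the Task and Workflow classes and the leave-request process are hypothetical, and only sequential execution with conditional transitions is shown (no parallel branches).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Task:
    """A basic unit of work, carried out by a named actor."""
    name: str
    actor: str
    action: Callable[[dict], None]


@dataclass
class Workflow:
    tasks: Dict[str, Task] = field(default_factory=dict)
    # A transition inspects the shared context and names the next task (or None to stop).
    transitions: Dict[str, Callable[[dict], Optional[str]]] = field(default_factory=dict)

    def add_task(self, task, next_task):
        self.tasks[task.name] = task
        self.transitions[task.name] = next_task

    def run(self, start, context) -> List[str]:
        """Execute tasks from `start` until a transition returns None."""
        log = []
        current = start
        while current is not None:
            task = self.tasks[current]
            task.action(context)
            log.append(f"{task.actor}: {task.name}")
            current = self.transitions[current](context)
        return log


def build_leave_workflow(max_days=5):
    """Hypothetical leave-request process: submit, review, then approve or reject."""
    wf = Workflow()
    wf.add_task(Task("submit", "employee", lambda ctx: ctx.update(submitted=True)),
                lambda ctx: "review")
    wf.add_task(Task("review", "manager",
                     lambda ctx: ctx.update(approved=ctx["days"] <= max_days)),
                lambda ctx: "approve" if ctx["approved"] else "reject")
    wf.add_task(Task("approve", "hr system", lambda ctx: ctx.update(status="approved")),
                lambda ctx: None)
    wf.add_task(Task("reject", "hr system", lambda ctx: ctx.update(status="rejected")),
                lambda ctx: None)
    return wf


if __name__ == "__main__":
    context = {"days": 3}
    print(build_leave_workflow().run("submit", context), context["status"])
```

Dedicated workflow languages express the same structure declaratively, so that the process definition can be modelled, executed, and monitored by a workflow engine rather than hard-coded in an application.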
End of Chapter 1