JAIDEV EDUCATION SOCIETY’S
J D COLLEGE OF ENGINEERING AND MANAGEMENT
KATOL ROAD, NAGPUR
Website: www.jdcoem.ac.in
(An Autonomous Institute, with NAAC "A" Grade)
Affiliated to DBATU, RTMNU & MSBTE Mumbai
Department of Computer Science & Engineering
“A Place to Learn, A Chance to Grow”
Session: 2024-25
VISION
To be recognized for excellent engineering, developing global leaders both
in educational and research in the domain of computer science and wireless
engineering.
MISSION
1. To create self-learning environment by facilitating leadership qualities, team spirit and
ethical responsibilities.
2. To improve department-industry collaboration, interaction with professional society
through technical knowledge and internship program.
3. To promote research and development with current techniques through well qualified
resources in the area of computer science and wireless engineering.
UNIT: 01
Semantic Web Vision: Today's web, Examples of semantic web from today's web, Semantic
web technologies, layered approach. Structured web documents in XML: The XML language,
Structuring, Namespaces, Querying and Addressing XML documents, Processing.
Today’s web:
The Semantic Web aims to enhance the current web by enabling data to be shared and reused
across applications, enterprises, and community boundaries. Here’s how elements of the
Semantic Web vision are reflected in today's web:
Key Components and Their Presence Today:
1. Linked Data:
o Definition: Linked Data refers to a method of publishing structured data so that
it can be interlinked and become more useful through semantic queries.
o Current Examples:
DBpedia: Extracts structured data from Wikipedia, enabling users to
query relationships and properties of Wikipedia resources.
Open Government Data: Various governments publish data in a
Linked Data format to promote transparency and innovation.
2. Ontologies and Vocabularies:
o Definition: Ontologies provide a shared and common understanding of a
domain that can be communicated between people and application systems.
o Current Examples:
Schema.org: A collaborative initiative launched by major search
engines (Google, Microsoft, Yahoo, Yandex) to create and support a
common set of schemas for structured data markup on web pages.
FOAF (Friend of a Friend): An ontology describing persons, their
activities, and their relations to other people and objects.
3. Resource Description Framework (RDF):
o Definition: RDF is a standard model for data interchange on the web. It allows
the merging of data even if the underlying schemas differ.
o Current Examples:
RDFa: A W3C Recommendation that adds a set of attribute-level
extensions to HTML, XHTML, and various XML-based document
types for embedding rich metadata within web documents.
RDF datasets: Numerous datasets published in RDF format, such as
those in the Linked Open Data Cloud, providing interconnected datasets
across various domains.
4. SPARQL:
o Definition: SPARQL is the query language and protocol used to query RDF
data.
o Current Examples:
SPARQL Endpoints: Public endpoints provided by many
organizations, such as DBpedia, Wikidata, and various research
institutions, allow querying of RDF datasets.
5. Knowledge Graphs:
o Definition: Knowledge graphs represent a network of real-world entities and
illustrate the relationships between them.
o Current Examples:
Google Knowledge Graph: Enhances search results by providing
structured information about search queries from a variety of sources.
Microsoft's Satori: Powers Bing's knowledge graph, offering a similar
enhancement of search capabilities.
Benefits in Today's Web:
1. Improved Search Engine Capabilities:
o Search engines use structured data to provide more relevant results and rich
snippets, making it easier for users to find the information they need.
2. Enhanced Interoperability:
o Different systems can interoperate more efficiently by using common standards
for data exchange, facilitating seamless data integration and communication.
3. Data Reusability and Integration:
o Data published in a structured format can be reused and integrated across
different applications and domains, promoting innovation and new uses of
existing data.
4. Intelligent Applications:
o Applications can leverage the interconnected data to provide more intelligent
and context-aware services, such as personalized recommendations and
advanced analytics.
Examples of semantic web from today’s web:
1. Schema.org
Description: A collaborative initiative by Google, Microsoft, Yahoo, and Yandex to create and
support a common set of schemas for structured data markup on web pages.
Example: When you search for a recipe on Google, the search engine can display rich snippets
with cooking time, ingredients, and ratings. This is possible because websites use Schema.org
to mark up their content in a way that search engines can understand.
2. DBpedia
Description: A project that extracts structured information from Wikipedia and makes it
available on the web.
Example: DBpedia converts Wikipedia content into a format that can be queried using
SPARQL. Researchers and developers can use this structured data to analyze relationships and
trends in a variety of domains.
3. Google Knowledge Graph
Description: A knowledge base used by Google to enhance its search engine's results with
information gathered from a variety of sources.
Example: When you search for "Albert Einstein," Google’s Knowledge Graph provides a
summary of key facts, related people, and a timeline of his life. This data is structured and
interconnected, allowing for richer search results.
4. Wikidata
Description: A collaboratively edited knowledge base operated by the Wikimedia Foundation,
designed to support Wikipedia and other Wikimedia projects.
Example: Wikidata provides a central repository for structured data across all Wikimedia
projects. This data is used to populate infoboxes in Wikipedia articles, and can be queried
directly for various applications.
5. FOAF (Friend of a Friend)
Description: An ontology describing persons, their activities, and their relations to other
people and objects.
Example: FOAF profiles have been used by some social networking sites and personal home
pages to describe users and their relationships. This allows different platforms to share user
information in a standardized way, facilitating interoperability between social networks.
6. Open Government Data
Description: Many governments publish their data in a Linked Data format to promote
transparency and innovation.
Example: The UK government publishes a wide range of datasets on data.gov.uk, including
information on transportation, health, and education. This data is available in formats that
facilitate linking and querying.
7. Facebook Open Graph
Description: An API developed by Facebook to integrate websites with the social graph.
Example: When you like an article on a website that uses the Facebook Open Graph protocol,
this action can be shared on your Facebook profile. The Open Graph tags also allow Facebook
to pull relevant information from the page to display on your profile.
8. Amazon Product Graph
Description: Amazon uses structured data to connect various aspects of products, reviews, and
user preferences.
Example: When browsing products, Amazon provides recommendations based on structured
data about your previous purchases, viewed items, and the purchasing behavior of similar users.
This structured data enables sophisticated recommendation algorithms.
9. Spotify's Music Knowledge Graph
Description: Spotify uses a graph-based approach to connect artists, albums, genres, and user
preferences.
Example: Spotify’s Discover Weekly playlist leverages this graph to recommend new music
based on your listening history and connections between different musical elements.
10. Bio2RDF
Description: A project that transforms open biological and biomedical data into RDF, making
it accessible and interoperable.
Example: Researchers can query Bio2RDF datasets to discover connections between genes,
diseases, and drugs, facilitating new insights in biomedical research.
Semantic web technologies:
Semantic Web technologies form the foundation for creating a web of data that is
understandable by machines, enabling more intelligent and interconnected web applications.
Here are the core technologies that make up the Semantic Web:
1. Resource Description Framework (RDF)
Description: RDF is a standard model for data interchange on the web. It uses triples
(subject-predicate-object) to represent data.
Example:
<http://example.org/person/JohnDoe> <http://xmlns.com/foaf/0.1/name> "John Doe" .
2. RDF Schema (RDFS)
Description: RDFS provides a basic vocabulary for RDF data, allowing the definition of
classes and properties.
Example:
<http://example.org/person> rdf:type rdfs:Class .
<http://example.org/name> rdf:type rdf:Property ; rdfs:domain <http://example.org/person> ;
rdfs:range xsd:string .
3. Web Ontology Language (OWL)
Description: OWL is a more expressive language than RDFS for defining complex
relationships between concepts and for reasoning about data.
Example:
<http://example.org/person> rdf:type owl:Class .
<http://example.org/JohnDoe> rdf:type <http://example.org/person> .
4. SPARQL Protocol and RDF Query Language (SPARQL)
Description: SPARQL is the query language for querying RDF data.
Example:
SELECT ?name WHERE {
?person <http://xmlns.com/foaf/0.1/name> ?name .
}
5. Uniform Resource Identifier (URI)
Description: URIs uniquely identify resources on the web.
Example:
http://example.org/person/JohnDoe
6. RDFa (RDF in Attributes)
Description: RDFa is a specification for embedding RDF data in HTML and other XML
documents.
Example:
<div vocab="http://schema.org/" typeof="Person">
<span property="name">John Doe</span>
</div>
7. OWL Ontologies
Description: OWL ontologies provide more complex and rich descriptions of concepts and
their relationships.
Example:
<http://example.org/Student> rdf:type owl:Class ;
rdfs:subClassOf <http://example.org/Person> .
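The triple model and the SPARQL query listed above can be illustrated with a minimal Python sketch. It is not an RDF library: it assumes an in-memory set of (subject, predicate, object) tuples, and the helper name match and the second person (JaneRoe) are hypothetical, introduced only for illustration.

```python
# Hypothetical in-memory triple store; a real system would use an RDF library.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

triples = {
    ("http://example.org/person/JohnDoe", FOAF_NAME, "John Doe"),
    ("http://example.org/person/JaneRoe", FOAF_NAME, "Jane Roe"),  # assumed extra data
    ("http://example.org/person/JohnDoe", RDF_TYPE, "http://example.org/person"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None plays the role of a variable."""
    return [
        (s, p, o)
        for (s, p, o) in store
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Similar in spirit to: SELECT ?name WHERE { ?person foaf:name ?name . }
names = sorted(o for (_, _, o) in match(triples, predicate=FOAF_NAME))
print(names)
```

Leaving a position as None corresponds to a SPARQL variable such as ?person or ?name, so the same helper can also answer "which statements are known about John Doe?" by fixing only the subject.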
Applications of Semantic Web Technologies:
1. Knowledge Graphs: Used by search engines like Google and Bing to provide rich,
structured information about search queries.
2. Linked Open Data: Initiatives like DBpedia and Wikidata provide interconnected
data sets for public use.
3. Healthcare and Life Sciences: Used for integrating and querying biomedical data
from various sources.
4. E-commerce: Amazon uses structured product data to enhance search and
recommendation systems.
5. Social Media: Platforms like Facebook use Open Graph to integrate social data across
the web.
Layered approach Structured web documents in XML
A layered approach to structured web documents using XML involves several layers, each
building on the previous to create a robust framework for defining, describing, and using data.
This approach ensures that data is not only structured and well-defined but also that it can be
processed and understood both by machines and humans.
1) URI and Unicode Layer:
The first layer, URI and Unicode, builds on important features of the existing
WWW.
Unicode is a standard for encoding international character sets. It allows all
human languages to be written and read on the web in one standardized
form.
A Uniform Resource Identifier (URI) is a string of a standardized form that
uniquely identifies a resource (e.g., a document).
It provides a simple and extensible way of identifying resources, e.g. a website, a
document or an image.
2) Extensible Markup Language (XML) layer:
The Extensible Markup Language (XML) layer, with XML namespace and XML schema
definitions, makes sure that there is a common syntax used in the semantic web.
XML is a general-purpose markup language for documents containing structured
information.
An XML document contains elements that can be nested and that may have attributes
and content. XML namespaces make it possible to use different markup vocabularies in
one XML document.
XML Schema is used to express the schema of a particular set of XML documents.
XML describes what is in the document, not what the document looks like. XML Schema
provides grammars for legal XML documents.
3) Resource Description Framework (RDF) layer:
The core data representation format for the semantic web is the Resource Description
Framework (RDF).
RDF is a framework for representing information about resources in graph form.
It was primarily intended for representing metadata about WWW resources, such as
the title, author, and modification date of a Web page, but it can be used for storing
any other data.
It is based on subject-predicate-object triples that form a graph of data. All data in the
semantic web use RDF as the primary representation language.
RDF Schema (a simple modelling language) describes classes of resources and the
properties between them. It also provides a simple reasoning framework for inferring
the types of resources.
4) Ontology Vocabulary Layer:
This layer provides a common vocabulary and grammar for published data, as well as a
semantic description of the data, so that ontologies are preserved and kept
ready for inference.
An ontology describes the semantics of the data, providing a uniform way to
enable communication by which different parties can understand each other.
5) Logic and Proof Layer:
In the Semantic Web, the building of systems follows a logic which considers the
structure of the ontology.
A reasoner can be used to check and resolve consistency problems and
redundancy in the concept translation.
A reasoning system is used to make new inferences.
It is also used for checking the validity of specific statements.
6) Trust Layer:
Trust is the final layer of the Semantic Web.
It depends on the source of the information as well as the policies available on the
information source, which can deny unwanted applications or users access to these
sources.
This component concerns the trustworthiness of the information on the Web in order
to provide an assurance of its quality.
7) User Interface and Applications Layer:
The User Interface and Applications Layer defines a baseline that all user interfaces and
applications should satisfy.
Vertical layer: Crypto:
It spans layer 1 to layer 6. Digital signatures are a step towards a web of trust (used for
identification).
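The RDF layer above mentions that RDF Schema supports simple inference of resource types, and the Logic and Proof layer mentions a reasoning system that makes new inferences. A minimal sketch of one such rule is given below, assuming triples stored as Python tuples. The prefixed names (ex:Student, ex:Person, ex:Agent, ex:JohnDoe) and the function infer_types are hypothetical, and a real reasoner covers many more rules than this one.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Assumed example data: two subclass statements and one type statement.
triples = {
    ("ex:Student", SUBCLASS_OF, "ex:Person"),
    ("ex:Person", SUBCLASS_OF, "ex:Agent"),
    ("ex:JohnDoe", RDF_TYPE, "ex:Student"),
}


def infer_types(store):
    """Apply (x type C) and (C subClassOf D) => (x type D) until nothing new appears."""
    inferred = set(store)
    changed = True
    while changed:
        changed = False
        for (x, p1, c) in list(inferred):
            if p1 != RDF_TYPE:
                continue
            for (c2, p2, d) in list(inferred):
                new_triple = (x, RDF_TYPE, d)
                if p2 == SUBCLASS_OF and c2 == c and new_triple not in inferred:
                    inferred.add(new_triple)
                    changed = True
    return inferred


closure = infer_types(triples)
john_types = sorted(o for (s, p, o) in closure if s == "ex:JohnDoe" and p == RDF_TYPE)
print(john_types)
```

Only the statement that John Doe is a student was asserted. The other two types are derived from the class hierarchy, which is the kind of new inference the layer description refers to.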
Structured web documents in XML
The XML language, Structuring, Namespaces, Querying and Addressing XML documents,
Processing.
The XML language
An XML document consists of a prolog, a number of elements, and an optional epilog.
Prolog
The prolog consists of the XML declaration and an optional reference to external structuring
documents. Here is an example of an XML declaration:
<?xml version="1.0" encoding="UTF-16" ?>
It specifies that the current document is an XML document, and defines the version and the
character encoding used in the particular system (such as UTF-8, UTF-16 and ISO 8859-1).
The character encoding is not mandatory, but its specification is considered good practice.
A reference to external structuring documents looks as follows:
<!DOCTYPE book SYSTEM "book.dtd">
Here the structuring information is found in a local file called book.dtd. Instead, the reference
might be a URL. If only a locally recognized name or only a URL is used, then the label
SYSTEM is used.
XML elements
XML elements represent the “things” the XML document talks about, such as books,
authors, publishers etc.
They are the main concept of XML documents.
An element consists of an opening tag, its content, and a closing tag. For
example:
<lecturer>David Billington</lecturer>
Tag names can be chosen almost freely; there are very few restrictions.
The most important ones are that the first character must be a letter, an underscore or
a colon, and that no name may begin with the string “xml” in any combination of cases
(such as “Xml” and “xML”).
The content may be text, or other elements, or nothing. For example:
<lecturer>
<name>David Billington</name>
<phone> +61 − 7 − 3875 507 </phone>
</lecturer>
If there is no content, then the element is called empty. An empty element like
<lecturer></lecturer>
can be abbreviated as:
<lecturer/>
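The naming restrictions and the empty-element abbreviation described above can be checked with a short Python sketch. It is an approximation under stated assumptions: the function is_valid_tag_name is hypothetical, and the regular expression covers ASCII names only, whereas real XML names also allow many Unicode letters.

```python
import re
import xml.etree.ElementTree as ET

# ASCII-only approximation: first character is a letter, underscore or colon.
_NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def is_valid_tag_name(name):
    """Check the two restrictions named in the text (simplified sketch)."""
    if name.lower().startswith("xml"):
        return False  # names starting with "xml" in any case are reserved
    return _NAME_RE.fullmatch(name) is not None


# The long and the abbreviated form of an empty element parse to the same thing.
long_form = ET.fromstring("<lecturer></lecturer>")
short_form = ET.fromstring("<lecturer/>")
same = (long_form.tag, long_form.text, len(long_form)) == (
    short_form.tag,
    short_form.text,
    len(short_form),
)
print(is_valid_tag_name("lecturer"), is_valid_tag_name("1st"), same)
```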
Attributes
An empty element is not necessarily meaningless, because it may have some properties
in terms of attributes.
An attribute is a name-value pair inside the opening tag of an element.
<lecturer name="David Billington" phone="+61 − 7 − 3875 507"/>
Here is an example of attributes for a non-empty element:
<order orderNo="23456" customer="John Smith"
date="October 15, 2002">
<item itemNo="a528" quantity="1"/>
<item itemNo="c817" quantity="3"/>
</order>
The same information could have been written as follows, replacing attributes by nested
elements:
<order>
<orderNo>23456</orderNo>
<customer>John Smith</customer>
<date>October 15, 2002</date>
<item>
<itemNo>a528</itemNo>
<quantity>1</quantity>
</item>
<item>
<itemNo>c817</itemNo>
<quantity>3</quantity>
</item>
</order>
When to use elements and when attributes is often a matter of taste. However, note that
the nesting of attributes is impossible.
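To see that the two order documents above carry the same information, the following sketch reads both with Python's standard xml.etree.ElementTree module. The function names are hypothetical and the documents are re-typed here as strings, so this is an illustration rather than a prescribed way of processing XML.

```python
import xml.etree.ElementTree as ET

WITH_ATTRIBUTES = """
<order orderNo="23456" customer="John Smith" date="October 15, 2002">
  <item itemNo="a528" quantity="1"/>
  <item itemNo="c817" quantity="3"/>
</order>
"""

WITH_ELEMENTS = """
<order>
  <orderNo>23456</orderNo>
  <customer>John Smith</customer>
  <date>October 15, 2002</date>
  <item><itemNo>a528</itemNo><quantity>1</quantity></item>
  <item><itemNo>c817</itemNo><quantity>3</quantity></item>
</order>
"""


def items_from_attributes(xml_text):
    """Read item data stored as attributes of each <item> element."""
    root = ET.fromstring(xml_text.strip())
    return [(i.get("itemNo"), int(i.get("quantity"))) for i in root.findall("item")]


def items_from_elements(xml_text):
    """Read item data stored as child elements of each <item> element."""
    root = ET.fromstring(xml_text.strip())
    return [
        (i.findtext("itemNo"), int(i.findtext("quantity")))
        for i in root.findall("item")
    ]


print(items_from_attributes(WITH_ATTRIBUTES) == items_from_elements(WITH_ELEMENTS))
```

The element form could be extended with further nesting (for example a structured date), which the attribute form cannot express.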
Difference Between HTML and XML
HTML                                    | XML
Displays data (look and feel).          | Transports and stores data.
Is a markup language itself.            | Provides a framework to define markup languages.
Is not case sensitive.                  | Is case sensitive.
Has predefined tags.                    | Lets you create your own tags.
Is static.                              | Is dynamic.
HTML example:
<html>
<body>
<p>HTML INTRODUCTION</p>
</body>
</html>
XML example:
<College>
<Class>
<Name>RAM</Name>
</Class>
</College>
Difference Between DTD and XSD
DTD                                          | XSD
Stands for Document Type Definition.         | Stands for XML Schema Definition.
Does not support data types.                 | Supports data types.
Does not support namespaces.                 | Supports namespaces.
Gives only basic control over the order      | Gives fine-grained control over the order
and number of child elements.                | and number of child elements.
Is not extensible.                           | Is extensible.
DTD example:
<!DOCTYPE Address [
<!ELEMENT Address (Name)>
<!ELEMENT Name (#PCDATA)>
]>
XSD example:
<xs:element name="Address">
<xs:complexType>
<xs:sequence>
<xs:element name="Name" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
XML DTD:
DTD stands for Document Type Definition.
A DTD defines the structure and the legal elements and attributes of an XML document.
In the context of the Semantic Web, XML Document Type Definitions (DTDs) were
initially used to define document structure and constraints in XML data.
It is used to describe XML Language precisely.
It is used to define structure of XML document.
It Contains list of legal elements.
It is used to perform Validation.
DTD SYNTAX:
<!DOCTYPE element DTD Identifier
[ declaration 1
declaration 2
]>
Types of DTD
There are two types of DTD:
1) Internal DTD
2) External DTD
1) Internal DTD
The elements are declared within the XML file.
An Internal DTD is embedded directly within the XML document it is defining.
It is written inside the <!DOCTYPE> declaration at the beginning of the XML file.
Internal DTDs are useful when the structure of the XML document is small and
specific to that single document, as it avoids the need to manage a separate DTD file.
Defined within the XML document itself; suitable for single-use, document-specific
structures.
Syntax:
<!DOCTYPE root-element
[element-declarations]>
Example:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Alice</to>
<from>Bob</from>
<heading>Reminder</heading>
<body>Don't forget our meeting tomorrow!</body>
</note>
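What validation against this DTD checks can be sketched in Python. The standard library parses the document but does not validate it, so the sketch below hand-rolls a very small check. It assumes the DTD contains only plain sequences such as (to, from, heading, body) and (#PCDATA). The functions content_models and conforms are hypothetical helpers, not part of any XML library.

```python
import re
import xml.etree.ElementTree as ET

DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Alice</to>
<from>Bob</from>
<heading>Reminder</heading>
<body>Don't forget our meeting tomorrow!</body>
</note>"""


def content_models(xml_text):
    """Map each declared element to its required child sequence (simple cases only)."""
    models = {}
    for name, model in re.findall(r"<!ELEMENT\s+(\S+)\s+\(([^)]*)\)>", xml_text):
        parts = [p.strip() for p in model.split(",")]
        models[name] = [] if parts == ["#PCDATA"] else parts
    return models


def conforms(xml_text):
    """Return True if every element is declared and has exactly the declared children."""
    models = content_models(xml_text)
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.tag not in models:
            return False
        if [child.tag for child in element] != models[element.tag]:
            return False
    return True


print(conforms(DOCUMENT))
```

A validating parser does far more (attributes, entities, choices, repetition), but the core idea is the same: the declarations state which elements are legal and in which arrangement.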
2) External DTD
The elements are declared outside the XML file.
An External DTD is defined in a separate file and referenced by the XML document.
This type is useful when the same DTD needs to be applied to multiple XML
documents, as it keeps the DTD in a single, reusable file.
External DTDs can be referenced using either a system identifier (local file path or
URL) or a public identifier (commonly used in conjunction with a URL).
Defined in a separate file and referenced by the XML document; suitable for shared or
reusable structures across multiple documents.
Syntax:
<!DOCTYPE root-element SYSTEM "file-name">
Example of an External DTD Reference in an XML Document:
Assuming the external DTD is saved in a file named note.dtd:
note.dtd (External DTD file):
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
note.xml (XML document referencing the external DTD):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Alice</to>
<from>Bob</from>
<heading>Reminder</heading>
<body>Don't forget our meeting tomorrow!</body>
</note>
XML Namespaces
XML namespaces are used to avoid element name conflicts in XML documents.
A namespace is a set of unique names.
It is identified by a URI (Uniform Resource Identifier). The name of the attribute that
declares it must start with "xmlns".
Syntax: <element xmlns:name="URI">
Elements and attributes whose names carry the prefix belong to the namespace identified by the URI.
Conflict: Generally, a conflict occurs when we try to mix XML documents from
different XML applications.
Example of conflict:
1.xml
<class>
<name> SHYAM </name>
</class>
2.xml
<class>
<name> RAM</name>
</class>
The conflict occurs because both documents use the same element names.
Example of Namespace:
1.xml
<?xml version="1.0" encoding="UTF-8"?>
<c1:class xmlns:c1="class1-------">
</c1:class>
2.xml
<?xml version="1.0" encoding="UTF-8"?>
<c2:class xmlns:c2="class2-------">
</c2:class>
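How prefixes keep two class elements apart can be seen by parsing a combined document in Python. This is a sketch under assumptions: the namespace URIs http://example.org/class1 and http://example.org/class2 and the wrapping school element are hypothetical placeholders chosen for illustration. They are not the URIs of the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, used only for this illustration.
COMBINED = """
<school xmlns:c1="http://example.org/class1" xmlns:c2="http://example.org/class2">
  <c1:class><c1:name>SHYAM</c1:name></c1:class>
  <c2:class><c2:name>RAM</c2:name></c2:class>
</school>
"""

root = ET.fromstring(COMBINED.strip())

# ElementTree expands each prefix to its URI, so the two tags are different names.
tags = [child.tag for child in root]

ns = {"c1": "http://example.org/class1", "c2": "http://example.org/class2"}
first_name = root.findtext("c1:class/c1:name", namespaces=ns)
second_name = root.findtext("c2:class/c2:name", namespaces=ns)
print(tags, first_name, second_name)
```

The prefix itself is only shorthand. What distinguishes the elements is the URI bound to it, which is why the parser reports names of the form {URI}class.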
XML-Schemas:
An XML Schema, commonly known as an XML Schema Definition (XSD), is used to describe and
validate the structure and content of XML data.
It is a method of expressing constraints on XML documents.
It is like a DTD but provides more control over the XML structure.
Syntax:
<xs:schema xmlns:xs="______">
Types of XML Schema Definition
1) Simple Type
2) Complex Type
1) Simple Type
It is used only in the context of the text.
A Simple Type in XML Schema is used to define elements or attributes that contain
only text and cannot have child elements or attributes.
Simple types are typically used for defining basic data types like strings, numbers, dates,
and other primitive data.
You can also restrict or derive new types based on predefined ones, applying constraints
like length, pattern, range, and more.
Syntax:
e.g. xs:int, xs:string
<xs:element name="Phone" type="xs:int"/>
Example:
<xs:simpleType name="zipCodeType">
<xs:restriction base="xs:string">
<xs:pattern value="\d{5}"/>
</xs:restriction>
</xs:simpleType>
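The effect of the pattern restriction in zipCodeType can be approximated with a regular expression in Python. The sketch assumes that a schema pattern must match the whole value, which is why fullmatch is used. The function is_valid_zip is a hypothetical helper, not part of any schema library.

```python
import re

# Same pattern as in the xs:pattern facet above.
ZIP_PATTERN = re.compile(r"\d{5}")


def is_valid_zip(value):
    """Return True if the whole value consists of exactly five digits."""
    return ZIP_PATTERN.fullmatch(value) is not None


for candidate in ["12345", "1234", "123456", "12a45"]:
    print(candidate, is_valid_zip(candidate))
```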
2) Complex Type
It is a container for other element definitions. It allows you to specify which child
elements an element can contain and to provide some structure within your XML
documents.
A Complex Type is used to define elements that can contain other elements,
attributes, or both.
Complex types allow for the creation of more elaborate and nested XML structures
and are fundamental for representing hierarchical data.
Example of Complex Type (Add.xsd)
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="Schema1....">
<xs:element name="Address">
<xs:complexType>
<!-- child elements must appear in sequence -->
<xs:sequence>
<xs:element name="Name" type="xs:string"/>
<xs:element name="Phone" type="xs:int"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
(Add.xml)
<?xml version="1.0" encoding="UTF-8"?>
<Address xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="Add.xsd">
<Name>Aman</Name>
<Phone>9810</Phone>
</Address>
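The constraints expressed by the Address complex type (children in the order Name, Phone, with Phone holding an integer) can be checked by hand in Python. Python's standard library has no XSD validator, so this is only a sketch of what a validator would verify for this one schema. The names check_address, parse_xs_int and EXPECTED_SEQUENCE are hypothetical, and int() is slightly more lenient than xs:int.

```python
import xml.etree.ElementTree as ET

INSTANCE = "<Address><Name>Aman</Name><Phone>9810</Phone></Address>"


def parse_xs_int(text):
    """Approximate xs:int: an integer within the 32-bit signed range."""
    value = int(text.strip())
    if not -2**31 <= value <= 2**31 - 1:
        raise ValueError("outside xs:int range")
    return value


# Child names in the required order, each with a conversion standing in for its type.
EXPECTED_SEQUENCE = [("Name", str), ("Phone", parse_xs_int)]


def check_address(xml_text):
    """Return True if the document follows the Address sequence and value types."""
    root = ET.fromstring(xml_text)
    if root.tag != "Address":
        return False
    children = list(root)
    if [c.tag for c in children] != [name for name, _ in EXPECTED_SEQUENCE]:
        return False
    for child, (_, convert) in zip(children, EXPECTED_SEQUENCE):
        try:
            convert(child.text or "")
        except ValueError:
            return False
    return True


print(check_address(INSTANCE))
```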