Module 1 - Data Representation, and Data Structures-1

The document provides an overview of data representation and data structures. It defines key terms like data, data types, primitive/system defined data types, composite/derived data types, and classes of data types. It explains how data is organized hierarchically into fields, records, and files with examples and discusses more complex data organization using data structures.

Uploaded by

arturogallardoed
Copyright
© All Rights Reserved

Module 1: Data Representation, and Data Structures

1.1 Data
Data are simply values or sets of values. A data item refers to a single unit
of values and is either the value of a variable or a constant. For example, a
row in a database table is a data item, which is described by a data type. A
data item that does not have subordinate data items is called an elementary
item. A data item that is composed of one or more subordinate data items is
called a group item. A data item can therefore be either an elementary item
or a group item. For example, an employee’s name may be divided into three
sub-items (first name, middle name and last name), but the
social_security_number would normally be treated as a single item.

In the diagram above, (ID, Age, Gender, First, Middle, Last, Street, Area)
are elementary data items, whereas (Name, Address) are group data items.

Data are frequently organized into a hierarchy of fields, records and files. In
order to understand these terms, let us see the following example.

An entity is something that has certain attributes or properties which may be
assigned values. The values themselves may be either numeric or nonnumeric.
In the above example, an employee of a given organization is an entity.
Entities with similar attributes (e.g. all the employees in an organization)
collectively form an entity set. Each attribute of an entity set has a range
of values, that is, the set of all possible values that could be assigned to
that attribute.

The term "information" is sometimes used for data with given attributes, or,
in other words, meaningful or processed data.

The way that data are organized into the hierarchy of fields, records and files
reflects the relationship between attributes, entities and entity sets. That is,
a field is a single elementary unit of information representing an attribute of
an entity. A record is the collection of field values of a given entity and a
file is the collection of records of the entities in a given entity set. Each
record in a file may contain many field items, but the value in a certain field
may uniquely determine the record in the file. Such a field K is called a
primary key, and the values K1, K2, ..., Kn in such a field are called keys or
key values.
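
The field, record, file and primary key terms above can be sketched in C++.
This is only a minimal illustration, not a definitive design: the names
Employee, EmployeeFile and findByKey, and the choice of fields, are
hypothetical and do not come from the text.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A record: the collection of field values of one entity (hypothetical fields).
struct Employee {
    int id;            // primary key field: unique for every record
    std::string name;  // ordinary field
    int age;           // ordinary field
};

// A file: the collection of records of the entities in one entity set.
using EmployeeFile = std::vector<Employee>;

// Returns a pointer to the record whose primary key equals `key`,
// or nullptr if no such record exists.
const Employee* findByKey(const EmployeeFile& file, int key) {
    for (const Employee& record : file) {
        if (record.id == key) {
            return &record;
        }
    }
    return nullptr;
}
```

Because the key value determines the record uniquely, findByKey can stop at
the first match.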

Records may also be classified according to length. A file can have fixed-length
or variable-length records. In fixed-length records, all the records contain the
same data items with the same amount of space assigned to each data item.
In variable-length records, records in the file may have different lengths. For
example, student records usually have variable lengths, since different
students take different numbers of courses. Usually, variable-length records
have a minimum and maximum length.

The above organization of data into fields, records and files may not be complex
enough to maintain and efficiently process certain collections of data. For this
reason, data are also organized into more complex types of structures. The
study of such data structures, which form the subject matter of the text,
includes the following three steps:
(a) Logical or mathematical description of the structure
(b) Implementation of the structure on a computer
(c) Quantitative analysis of the structure, which includes determining the
amount of memory needed to store the structure and the time required
to process the structure.

1.2 Data Types


Data types are used within type systems, which offer various ways of defining,
implementing and using the data. Different type systems ensure varying
degrees of type safety.

A data type simply refers to a defined kind of data, that is, a set of possible
values and basic operations on those values. A data type consists of:
✓ a domain (= a set of values)
✓ a set of operations that may be applied to the values.

Computer memory is all filled with zeros and ones. If we have a problem and
want to code it, it is very difficult to provide the solution in terms of zeros
and ones. To help users, programming languages and compilers provide the
facility of data types.

For example, an integer takes 2 bytes (the actual value depends on the
compiler), a float takes 4 bytes, etc. What this means is that in memory we are
combining 2 bytes (16 bits) and calling it an integer. Similarly, we combine
4 bytes (32 bits) and call it a float. Data types reduce coding effort.

Some examples of data types in most programming languages are:


(a) Boolean (e.g., True or False)
(b) Character (e.g., a)
(c) Date (e.g., 03/01/2016)
(d) Double (e.g., 1.79769313486232E308)
(e) Floating-point number (e.g., 1.234)
(f) Integer (e.g., 1234)
(g) Long (e.g., 123456789)
(h) Short (e.g., 0)
(i) String (e.g., abcd)
(j) Void (e.g., no data)
(k) Nothing (e.g., null)

1.3 Classes of Data Types


At the top level, there are several classes of data types. Some of them are:
(a) System-defined data types (also called primitive data types)
(b) Composite/user-defined data types
(c) Abstract data types
(d) Enumerated data types
(e) Other data types

1.3.1 System defined/Primitive/Inbuilt data types


Primitive data types are the most fundamental data types usable in programming
languages and are defined by the system. They are predefined data types that
contain simple values (i.e., each stores a single value) of specific
characteristics, such as numeric or textual, and are not based on any other
type. They are the most basic data types in most programming languages and
serve as building blocks from which other, more sophisticated types are built.

The number of bits allocated for each primitive data type depends on the
programming language, compiler and operating system. Different languages
may use different sizes for each of the primitive data types. Depending on the
size of the data type, the total available values (domain) will also change.
For example, “int” may take 2 bytes or 4 bytes. If it takes 2 bytes (16 bits),
then the possible values are -32,768 to +32,767 (-2^15 to 2^15 - 1). If it
takes 4 bytes (32 bits), then the possible values are -2,147,483,648 to
+2,147,483,647 (-2^31 to 2^31 - 1).
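
The link between size and range can be checked with the C++ standard library.
The sketch below is an illustration under stated assumptions: it uses the
fixed-width types std::int16_t and std::int32_t (because the size of plain int
varies between compilers), it assumes 8-bit bytes, and the helper names
minValue and maxValue are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Smallest and largest value representable by an integer type T.
template <typename T>
constexpr long long minValue() { return std::numeric_limits<T>::min(); }

template <typename T>
constexpr long long maxValue() { return std::numeric_limits<T>::max(); }

// 2 bytes (16 bits): -2^15 to 2^15 - 1
static_assert(sizeof(std::int16_t) == 2, "int16_t occupies 2 bytes");
static_assert(minValue<std::int16_t>() == -32768, "lower bound is -2^15");
static_assert(maxValue<std::int16_t>() == 32767, "upper bound is 2^15 - 1");

// 4 bytes (32 bits): -2^31 to 2^31 - 1
static_assert(sizeof(std::int32_t) == 4, "int32_t occupies 4 bytes");
static_assert(minValue<std::int32_t>() == -2147483648LL, "lower bound is -2^31");
static_assert(maxValue<std::int32_t>() == 2147483647LL, "upper bound is 2^31 - 1");
```

The checks run at compile time, so the file only compiles if the ranges stated
above hold on the target platform.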

Common primitive data types in some programming languages are shown in the
table below:

Data Type        Definition                                     Examples
---------------  ---------------------------------------------  --------------------
Integer (int)    Numeric data type for numbers without          -200, 8, 405
                 fractions.
byte             Used to save memory in large arrays, where     100
                 the saving matters most. A byte is 4 times
                 smaller than a 32-bit integer.
Long             Often a 32- or 64-bit integer. A 64-bit        123456789
                 signed long can represent whole numbers of
                 up to 19 digits, positive or negative.
Short            An integer type with a smaller range than      203
                 int (often 16 bits). It holds whole numbers
                 that can be positive or negative.
Character        Single letter, digit, punctuation mark,        a, 1, !
(char)           symbol, or blank space.
Boolean (bool)   True or false values.                          0 (false), 1 (true)
Floating point   Numeric data type for numbers with             707.07, 0.7, 707.00
(float)          fractions.
Floating point   Stores fractional numbers. Sufficient for      1.79769313486232E308
(double)         storing about 15 decimal digits.

1.3.2 Composite/Derived/Non-Primitive/Reference Data Types


A Composite data type/Compound data type is any data type which can be
constructed in a program using the programming language's primitive data
types and other composite types. They are data types that have one or more
fields dynamically linked to fields in another data type. Composite data types
are useful for creating a single data type that references information in more
than one data source. They include structures, arrays, lists, classes, tuples
and collections.

A “struct” in C and C++ is an example of a composite type: a data type that
composes a fixed set of labelled fields or members. It is so called because of
the struct keyword used in declaring it, which is short for structure. A
struct declaration consists of a list of fields, each of which can have any
type. A structure is declared in C and C++ as shown below:

struct Account {
    int account_number;
    char *first_name;
    char *last_name;
    float balance;
} a1;   // a1 is a variable of type struct Account

A field of the structure is accessed by writing the structure variable,
followed by the dot (.) operator and then the name of the field. For example,

a1.balance = 20000.00;

Another example is classes in C++ and Java.

In C++, a class is a template from which objects are created; it describes a
group of similar objects. It can have fields, methods, constructors, etc. An
example of a C++ class that has three fields and one member function is:

#include <string>
using namespace std;

class Student
{
public:
    int id;        // field or data member
    float salary;  // field or data member
    string name;   // field or data member

    // member function (declared here, defined elsewhere)
    void display(string staffName, int staffId, float pay);
};

Java Classes are created using the Java keyword class as shown in the
example below:

class Bicycle {
    // state or field
    private int gear = 5;

    // behavior or method
    public void braking() {
        System.out.println("Working of Braking");
    }
}

1.3.3 Abstract Data Types


An Abstract Data Type (ADT) is a data type whose behaviour is defined by a
set of values and a set of predefined operations. As the name implies, an ADT
does not specify how the data type is implemented; its implementation is
hidden. It is an abstraction of a data structure that provides only the
interface to which the data structure must adhere. The interface does not give
any specific details about how something should be implemented or in what
programming language.

In other words, we can say that abstract data types are the entities that are
definitions of data and operations but do not have implementation details. In
this case, we know the data that we are storing and the operations that can
be performed on the data, but we do not know about the implementation
details. The reason for not having implementation details is that every
programming language has a different implementation strategy. For example,
a data structure in C is implemented using structures, while a data structure
in C++ is implemented using objects and classes.

An ADT does not specify how data will be organized in memory and what
algorithms will be used for implementing the operations. It is called
“abstract” because it gives an implementation-independent view. The
process of providing only the essentials and hiding the details is known as
abstraction. Commonly used ADTs include Linked Lists, Stacks, Queues,
Priority Queues, Binary Trees, Disjoint Sets (Union and Find), Hash Tables,
Graphs, etc.

An ADT is built from primitive data types, but the logic of its operations is
hidden. It has basic operations such as insertion, deletion, or updating. For
example, some ADT operations used with a Stack are:
▪ isFull(): checks whether the stack is full or not
▪ isEmpty(): checks whether the stack is empty or not
▪ push(x): pushes x onto the stack
▪ pop(): deletes one element from the top of the stack
▪ peek(): gets the topmost element of the stack
▪ size(): gets the number of elements present in the stack
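
One possible way to realize these stack operations in C++ is sketched below.
It is a sketch under stated assumptions rather than the only implementation:
the class name IntStack, the fixed capacity passed to the constructor, and the
use of std::vector as the hidden representation are all choices made here for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A bounded stack of ints (hypothetical class name and capacity policy).
class IntStack {
public:
    explicit IntStack(std::size_t capacity) : capacity_(capacity) {}

    bool isFull() const { return items_.size() == capacity_; }
    bool isEmpty() const { return items_.empty(); }
    std::size_t size() const { return items_.size(); }

    // Pushes x onto the top; returns false if the stack is already full.
    bool push(int x) {
        if (isFull()) return false;
        items_.push_back(x);
        return true;
    }

    // Removes the top element; the stack must not be empty.
    void pop() {
        assert(!isEmpty());
        items_.pop_back();
    }

    // Returns the top element without removing it; the stack must not be empty.
    int peek() const {
        assert(!isEmpty());
        return items_.back();
    }

private:
    std::size_t capacity_;
    std::vector<int> items_;  // hidden representation: callers never see it
};
```

Code that uses IntStack depends only on the public operations, so the vector
could later be swapped for an array or a linked list without changing callers.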

1.3.4 Enumerated Data Types
An Enumerated type is a type whose legal values consist of a fixed set of
constants. Common examples include compass directions, which take the
values North, South, East and West and days of the week, which take the
values Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, and
Saturday.

In the Java programming language, an enumerated type is defined by using
the enum keyword. For example, you would specify a days-of-the-week
enumerated type as:

enum Days {SUNDAY, MONDAY, TUESDAY, WEDNESDAY,
           THURSDAY, FRIDAY, SATURDAY};

C++ enums define a type with a fixed set of named constants. An enumeration
is declared with the enum keyword, similar to Java. A simple example of the
enum data type used in a C++ program is shown below.

#include <iostream>
using namespace std;

enum week {Monday, Tuesday, Wednesday, Thursday, Friday,
           Saturday, Sunday};

int main()
{
    week day;
    day = Friday;
    // Enumerators count from 0, so Friday is 4 and this prints "Day: 5"
    cout << "Day: " << day + 1 << endl;
    return 0;
}

1.3.5 Other Data Types


Other data types include String, Pointers, Union and Text
(a) String data type: One of the most widely used data types is a string.
A string consists of one or more characters, which can include letters,
numbers, and other types of characters. A string represents
alphanumeric data. This means that a string can contain many
different characters, but they are all considered as if they were text,
even if the characters are numbers. A string can also contain spaces.
The program below illustrates how the string data type is used in C++.

#include <iostream>
#include <string>
using namespace std;

int main() {
    // Create a string variable
    string greeting = "Hello";
    // Output string value
    cout << greeting;
    return 0;
}

(b) Pointer data type: A pointer is a variable that stores the memory
address of an object. The pointer then simply “points” to the object.
The type of the object must correspond with the type of the pointer.
Pointers are used extensively in both C and C++ for three main
purposes:

▪ To allocate new objects on the heap.
▪ To pass functions to other functions.
▪ To iterate over elements in arrays or other data structures.

Pointers are used to store and manage the addresses of dynamically
allocated blocks of memory. Such blocks are used to store data objects
or arrays of objects. Most structured and object-oriented languages
provide an area of memory, called the heap or free store, from which
objects are dynamically allocated.

A pointer declaration in C++ takes the following syntax:

datatype *variable_name;

Here are some examples of valid pointer declarations in C++:

int *ip;     // a pointer to integer
double *dp;  // a pointer to double
float *fp;   // a pointer to float
char *ch;    // a pointer to a character

A C++ program that illustrates how the pointer data type is used is shown
below:

// C++ program that prints the value and addresses involved with a pointer
#include <iostream>
using namespace std;

int main()
{
    int *ptr, var;  // ptr is a pointer to int; var is a plain int
    var = 5;

    // Assign address of var to ptr
    ptr = &var;  // & is the address-of operator

    // Access value pointed to by ptr (* is the dereference operator)
    cout << "Value pointed by ptr: " << *ptr << endl;

    // Address stored by ptr
    cout << "Address stored at ptr: " << ptr << endl;

    // Address of pointer ptr
    cout << "Address of ptr: " << &ptr;
    return 0;
}

The output of the above program is similar to the following (the actual
addresses differ from run to run):


Value pointed by ptr: 5
Address stored at ptr: 0x7ffe1c19b10c
Address of ptr: 0x7ffe1c19b110

(c) Union data types: A union is a special data type that allows different
data types to be stored in the same memory location. You can define a union
with many members, but only one member can contain a value at any given
time. Unions provide an efficient way of using the same memory location for
multiple purposes. With a union, all members share the same memory. The
program below illustrates how a union is used in C++.

#include <iostream>
using namespace std;

// Creating a union
union Job {
    float salary;
    int workerNo;
} j;

int main() {
    // Assigning a value to a member of the union
    j.salary = 12.3f;
    // When j.workerNo is assigned a value,
    // j.salary will no longer hold 12.3
    j.workerNo = 100;
    // Reading j.salary here is only for demonstration: it is no longer the
    // active member, so C++ does not define what value is printed
    cout << "Salary = " << j.salary << endl;
    cout << "Number of workers = " << j.workerNo << endl;
    return 0;
}

A typical output is shown below. The salary is a meaningless value (many
systems print a tiny number such as 1.4013e-43) because the bytes of the
integer 100 are being reinterpreted as a float:

Salary = 1.4013e-43
Number of workers = 100

(d) Text data types: A Text data type can hold any letter, number, symbol
or punctuation mark. It is a variable-length data type that can store
long character strings. It is sometimes referred to as 'alphanumeric' or
'string'. The data can be pure text or a combination of text, numbers
and symbols. In SQL Server, for example, the text data type can hold up to
2,147,483,647 bytes of data. There are variants of the Text data type; for
example, in MySQL, the variants are TINYTEXT, TEXT, MEDIUMTEXT and
LONGTEXT.

A Text data type is used in MySQL as shown below:

CREATE TABLE articles (
    id INT AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(255),
    summary TEXT(255)
);

1.4 Overview of Data structures


Data structures are the building blocks of a program. If a program is built
using improper data structures, then the program may not always work as
expected. It is very important to use the right data structures for a program.
Data structures are methods of representing the logical relationships between
individual data elements related to the solution of a given problem. Data
structures are the most convenient way to handle data of different types,
including abstract data types, for a known problem. The components of data
can be organized and records can be maintained. Further, record formation
leads to the development of abstract data types and databases.

A data structure is the implementation of an abstract data type in a
particular programming language. Data structures can also be referred to as
“data aggregates”. A carefully chosen data structure will allow the most
efficient algorithm to be used. Thus, a well-designed data structure allows a
variety of critical operations to be performed using as few resources, both
execution time and memory space, as possible.

A Data structure is a structured set of variables associated with one another
in different ways, cooperatively defining components in the system and
capable of being operated upon in the program. Data structures are the basis
of programming tools and the choice of data structures should provide the
following:
(a) The data structures should satisfactorily represent the relationship
between data elements.
(b) The data structures should be simple, so that the programmer can easily
process the data.

Different kinds of data structures are suited to different kinds of applications,
and some are highly specialized to specific tasks. For example, B-trees are
particularly well-suited for implementation of databases, while compiler
implementations usually use hash tables to look up identifiers.

A data structure is a systematic way to organize data in order to use it
efficiently. The following terms are the foundation terms of a data structure.
✓ Interface: Each data structure has an interface. Interface represents
the set of operations that a data structure supports. An interface only
provides the list of supported operations, type of parameters they can
accept and return type of these operations.
✓ Implementation: Implementation provides the internal representation
of a data structure. Implementation also provides the definition of the
algorithms used in the operations of the data structure.

1.4.1 Objectives of Data Structures


The objectives of Data Structures are:
✓ A data structure enables efficient storage of data for easy access.
✓ It makes it possible to represent the inherent relationships of the data in
the real world.
✓ It enables efficient processing of data.
✓ It helps in data protection and management.

1.4.2 Characteristics of a Data Structure


(a) Correctness − Data structure implementation should implement its
interface correctly.
(b) Time Complexity − Running time or the execution time of operations
of data structure must be as small as possible.
(c) Space Complexity − Memory usage of a data structure operation
should be as little as possible.

1.4.3 Need for Data Structure

As applications are getting more complex and data-rich, there are three common
problems that applications face nowadays.
(a) Data Search − Consider an inventory of 1 million (10^6) items in a store.
If the application has to search for an item, it has to search among 1
million (10^6) items every time, slowing down the search. As data grows,
search will become slower.
(b) Processor Speed − Processor speed, although very high, becomes a
limiting factor if the data grows to billions of records.
(c) Multiple Requests − As thousands of users can search data
simultaneously on a web server, even a fast server can fail while
searching the data.

To solve the above-mentioned problems, data structures come to the rescue.
Data can be organized in a data structure in such a way that not all items
need to be searched, and the required data can be found almost instantly.
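
As a small illustration of this point, the C++ sketch below contrasts a search
over unorganized data with a search over data kept in sorted order. The
function names linearSearch and sortedSearch are hypothetical; sorting is only
one of several ways to organize data for fast search (hash tables and trees
are others).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Unorganized data: every item may have to be examined (up to n comparisons).
bool linearSearch(const std::vector<int>& items, int target) {
    return std::find(items.begin(), items.end(), target) != items.end();
}

// Organized (sorted) data: binary search halves the range at every step,
// so about log2(n) comparisons suffice (roughly 20 for one million items).
// Precondition: `sortedItems` is sorted in ascending order.
bool sortedSearch(const std::vector<int>& sortedItems, int target) {
    return std::binary_search(sortedItems.begin(), sortedItems.end(), target);
}
```

Both functions give the same answer on sorted input; the difference is in how
many items have to be examined as the data grows.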

1.5 Data Structure Classifications


Data structures have been classified in several ways. The most common
classifications of Data Structures are Linear and Non-Linear Data Structures.

(a) Linear Data Structures: In linear data structures, values are arranged
in linear fashion. Arrays, linked lists, stacks and queues are examples
of linear data structures in which values are stored in a sequence.
(b) Non-Linear Data Structures: This type is the opposite of linear. The data
values in this structure are not arranged in a sequence. Trees, graphs,
tables and sets are examples of non-linear data structures.

Other Classifications are:
(a) Homogenous and Non-Homogenous Data Structures
✓ Homogenous: In this type of data structures, values of the same types
of data are stored, as in an array.
✓ Non-homogenous: In this type of data structures, data values of
different types are grouped, as in structures and classes.
(b) Dynamic and Static
✓ Dynamic: In dynamic data structures, such as those built with references
and pointers, size and memory locations can be changed during program
execution.
✓ Static: A static data structure is one designed for a certain number and
type of elements. It is designed for one particular use (e.g. one particular
application) and it is never added to or deleted from.

1.6 Principal Data Structure types


The principal data structures are:
1.6.1 Array
An Array data structure or simply an array is a data structure consisting of
a collection of elements (values or variables), each identified by at least one
array index or key. An array is stored so that the position of each element can
be computed from its index tuple by a mathematical formula.

An Array can be One-Dimensional, Two-Dimensional and Multi-Dimensional.


✓ A one-dimensional array (or single dimension array) is a type of linear
array. Accessing its elements involves a single subscript which can
either represent a row or column index.
✓ Two-dimensional (2D) arrays are indexed by two subscripts, one for the
row and one for the column.
✓ A multi-dimensional array of dimension n (i.e., an n-dimensional array
or simply n-D array) is a collection of items which is accessed via n
subscript expressions.
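
The "mathematical formula" mentioned above can be made concrete for a
two-dimensional array. The sketch below assumes row-major order (rows stored
one after another, which is the layout C and C++ use for built-in arrays), in
which element (row, col) lies at offset row * columns + col. The names
rowMajorIndex and Grid are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of element (row, col) in a row-major array with `cols` columns.
constexpr std::size_t rowMajorIndex(std::size_t row, std::size_t col,
                                    std::size_t cols) {
    return row * cols + col;
}

// A 2-D array of ints stored in one contiguous 1-D block.
class Grid {
public:
    Grid(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols, 0) {}

    int& at(std::size_t row, std::size_t col) {
        assert(row < rows_ && col < cols_);  // stay inside the array bounds
        return cells_[rowMajorIndex(row, col, cols_)];
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> cells_;
};
```

For example, in a grid with 4 columns the element at row 2, column 1 is stored
at offset 2 * 4 + 1 = 9.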

1.6.2 Stack
A Stack is a particular kind of abstract data type or collection in which the
principal (or only) operations on the collection are the addition of an entity to
the collection, known as push and removal of an entity, known as pop. The
relation between the push and pop operations is such that the stack is a Last-
In-First-Out (LIFO) data structure. In a LIFO data structure, the last element
added to the structure must be the first one to be removed. This is equivalent
to the requirement that, considered as a linear data structure, or more
abstractly a sequential collection, the push and pop operations occur only at
one end of the structure, referred to as the top of the stack. Often a peek or
top operation is also implemented, returning the value of the top element
without removing it.

A stack may be implemented to have a bounded capacity. If the stack is full
and does not contain enough space to accept an entity to be pushed, the stack
is then considered to be in an overflow state. The pop operation removes an
item from the top of the stack. A pop either reveals previously concealed items
or results in an empty stack. If the stack is already empty when a pop is
attempted, it goes into an underflow state, which means no items are present
in the stack to be removed.

A stack is a restricted data structure, because only a small number of
operations are performed on it. The nature of the pop and push operations
also means that stack elements have a natural order. Elements are removed
from the stack in the reverse order to the order of their addition. Therefore,
the lower elements are those that have been on the stack the longest.

1.6.3 Queue
A Queue is a particular kind of collection in which the entities in the collection
are kept in order and the principal (or only) operations on the collection are
the addition of entities to the rear terminal position and removal of entities
from the front terminal position. This makes the queue a First-In-First-Out
(FIFO) data structure. In a FIFO data structure, the first element added to the
queue will be the first one to be removed. This is equivalent to the requirement
that once an element is added, all elements that were added before have to be
removed before the new element can be removed. A queue is an example of a
linear data structure.
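
The FIFO behaviour can be observed with the standard C++ container adaptor
std::queue. This is a minimal sketch; the function name serveInArrivalOrder
is hypothetical.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Adds the values to the rear of a queue in the given order, then removes
// them from the front, returning the order in which they came out.
std::vector<int> serveInArrivalOrder(const std::vector<int>& arrivals) {
    std::queue<int> q;
    for (int value : arrivals) {
        q.push(value);  // enqueue at the rear
    }
    std::vector<int> served;
    while (!q.empty()) {
        served.push_back(q.front());  // look at the front element
        q.pop();                      // dequeue from the front
    }
    return served;
}
```

The returned order is always identical to the arrival order, which is exactly
the First-In-First-Out property.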

Queues provide services in computer science, transport, and operations
research where various entities such as data, objects, persons, or events are
stored and held to be processed later. In these contexts, the queue performs
the function of a buffer. Queues are also common in computer programs,
where they are implemented as data structures coupled with access routines,
as an abstract data structure or in object-oriented languages as classes.

1.6.4 Deque
A double-ended queue (dequeue, often abbreviated to deque and pronounced
"deck") is an abstract data type that generalizes a queue, for which elements
can be added to or removed from either the front (head) or back (tail). It is also
often called a head-tail linked list, though properly this refers to a specific
data structure implementation.
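
The standard C++ container std::deque supports insertion and removal at both
ends. The short sketch below walks through these four operations; the function
name dequeDemo is hypothetical.

```cpp
#include <cassert>
#include <deque>

// Builds the deque 1 2 3 by adding at both ends, then removes one element
// from each end. Returns the single element left in the middle.
int dequeDemo() {
    std::deque<int> d;
    d.push_back(2);   // add at the back (tail):   2
    d.push_front(1);  // add at the front (head):  1 2
    d.push_back(3);   // add at the back (tail):   1 2 3
    d.pop_front();    // remove from the front:    2 3
    d.pop_back();     // remove from the back:     2
    assert(d.size() == 1);
    return d.front();
}
```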

1.6.5 Priority Queue


A priority queue is an abstract data type which is like a regular queue or
stack data structure, but where additionally each element has a "priority"
associated with it. In a priority queue, an element with high priority is served
before an element with low priority. If two elements have the same priority,
they are served according to their order in the queue.

It is a common misconception that a priority queue is a heap. A priority queue
is an abstract concept like "a list" or "a map"; just as a list can be implemented
with a linked list or an array, a priority queue can be implemented with a
heap or a variety of other methods.
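
One way to obtain this behaviour in C++ is sketched below. It uses
std::priority_queue, which is itself heap-based, so this is just one of the
possible implementations mentioned above. Because std::priority_queue does not
by itself promise any order among equal priorities, the sketch stores an
arrival counter to serve ties in queue order. The names Task, ServedLater and
TaskQueue are hypothetical.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// One queued task. A larger `priority` is served first; `arrival` records
// insertion order so that ties are served first-in-first-out.
struct Task {
    int priority;
    unsigned long arrival;
    std::string name;
};

// Ordering for std::priority_queue: true when `a` must be served AFTER `b`.
struct ServedLater {
    bool operator()(const Task& a, const Task& b) const {
        if (a.priority != b.priority) return a.priority < b.priority;
        return a.arrival > b.arrival;
    }
};

class TaskQueue {
public:
    void add(int priority, const std::string& name) {
        heap_.push(Task{priority, nextArrival_++, name});
    }

    // Removes and returns the name of the next task; must not be empty.
    std::string serve() {
        assert(!heap_.empty());
        std::string name = heap_.top().name;
        heap_.pop();
        return name;
    }

    bool empty() const { return heap_.empty(); }

private:
    std::priority_queue<Task, std::vector<Task>, ServedLater> heap_;
    unsigned long nextArrival_ = 0;
};
```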

1.6.5 Linked lists


One disadvantage of using arrays to store data is that arrays are static
structures and therefore cannot be easily extended or reduced to fit the data
set. Insertions and deletions in an array are also expensive, because
existing elements have to be shifted.

A linked list is a linear data structure where each element is a separate
object. Each element (we will call it a node) of a list comprises two items:
the data and a reference to the next node. The last node has a reference to
null. The entry point into a linked list is called the head of the list. It
should be noted that the head is not a separate node, but the reference to the
first node. If the list is empty then the head is a null reference.

A linked list is a dynamic data structure. The number of nodes in a list is not
fixed and can grow and shrink on demand. Any application which has to deal
with an unknown number of objects can benefit from a linked list.

One disadvantage of a linked list is that it does not allow direct access to the
individual elements. If you want to access a particular item, then you have to
start at the head and follow the references until you get to that item. Another
disadvantage is that a linked list uses more memory compared with an array:
each node needs extra space (4 bytes on a 32-bit CPU) to store a reference to
the next node.

The types of linked lists are:

✓ A singly linked list, as described above.
✓ A doubly linked list, in which each node has two references, one to the
next node and another to the previous node.

Another important type of linked list is the circular linked list, where the
last node of the list points back to the first node (or the head) of the list.
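
A singly linked list as described in this section can be sketched in C++ as
follows. The sketch assumes C++14 or later (for std::make_unique); the class
name IntList and its member names are hypothetical, and only insertion at the
head and access by position are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// A singly linked list of ints (illustration only).
class IntList {
public:
    // Inserts a new node at the front: no existing element is moved.
    void pushFront(int value) {
        auto node = std::make_unique<Node>();
        node->data = value;
        node->next = std::move(head_);  // new node refers to the old first node
        head_ = std::move(node);        // head now refers to the new node
        ++size_;
    }

    // Returns the element at `index` by walking from the head (no direct access).
    int at(std::size_t index) const {
        assert(index < size_);
        const Node* current = head_.get();
        for (std::size_t i = 0; i < index; ++i) {
            current = current->next.get();
        }
        return current->data;
    }

    std::size_t size() const { return size_; }
    bool empty() const { return head_ == nullptr; }

private:
    struct Node {
        int data = 0;
        std::unique_ptr<Node> next;  // null in the last node
    };
    std::unique_ptr<Node> head_;     // null when the list is empty
    std::size_t size_ = 0;
};
```

Note how at() has to follow the references one by one, which is the access
cost discussed above, while pushFront() never shifts existing elements.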

1.6.6 Record
A Record (also called a tuple or struct) is an aggregate data structure. A record
is a value that contains other values, typically in fixed number and sequence
and typically indexed by names. The elements of records are usually called
fields or members. A record is a special type of data structure that, unlike
arrays, collects different data types that define a particular structure such as
a book, product, person and many others. In Pascal, for example, the
programmer defines the data structure in the Type section, as shown below.

Type
    Str25 = String[25];
    TBookRec = Record
        Title, Author,
        ISBN : Str25;
        Price : Real;
    end;
Var
    myBookRec : TBookRec;

1.6.7 Trees
A Tree is a widely used abstract data type (ADT) that simulates a hierarchical
tree structure, with a root value and subtrees of children, represented as a
set of linked nodes. A tree data structure can be defined recursively (locally)
as a collection of nodes (starting at a root node), where each node is a data
structure consisting of a value, together with a list of references to nodes (the
"children"), with the constraints that no reference is duplicated, and none
points to the root.

A node is a structure which may contain a value or condition, or represent a
separate data structure (which could be a tree of its own). Each node in a tree
has zero or more child nodes, which are below it in the tree (by convention,
trees are drawn growing downwards). A node that has a child is called the
child's parent node (or ancestor node, or superior). A node has at most one
parent.

An internal node (also known as an inner node, inode for short, or branch
node) is any node of a tree that has child nodes. Similarly, an external node
(also known as an outer node, leaf node, or terminal node) is any node that
does not have child nodes.

The topmost node in a tree is called the root node. Depending on definition,
a tree may be required to have a root node (in which case all trees are non-
empty), or may be allowed to be empty, in which case it does not necessarily
have a root node. Being the topmost node, the root node will not have a parent.
It is the node at which algorithms on the tree begin, since as a data structure,
one can only pass from parents to children. Note that some algorithms (such
as post-order depth-first search) begin at the root, but first visit leaf nodes
(access the value of leaf nodes), only visit the root last (i.e., they first access
the children of the root, but only access the value of the root last). All other
nodes can be reached from it by following edges or links.

The height of a node is the length of the longest downward path to a leaf from
that node. The height of the root is the height of the tree. Node heights are
commonly needed in the manipulation of the various self-balancing trees, AVL
trees in particular. The depth of a node is the length of the path to its root
(i.e., its root path). The root node has depth zero, leaf nodes have height
zero, and a tree with only a single node (hence both a root and leaf) has depth
and height zero. Conventionally, an empty tree (tree with no nodes, if such are
allowed) has depth and height −1.
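
As a minimal sketch of these two definitions, the C functions below compute
height and depth by counting edges. The node layout (with a parent link) and the
names height and depth are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical node layout with a parent link so depth can be computed. */
struct tree_node {
    struct tree_node *parent;        /* NULL for the root */
    struct tree_node *first_child;
    struct tree_node *next_sibling;
};

/* Height: number of edges on the longest downward path to a leaf.
   A leaf has height 0; an empty tree (NULL) has height -1 by convention. */
int height(const struct tree_node *node)
{
    if (node == NULL)
        return -1;
    int tallest_child = -1;
    for (const struct tree_node *c = node->first_child; c != NULL;
         c = c->next_sibling) {
        int h = height(c);
        if (h > tallest_child)
            tallest_child = h;
    }
    return tallest_child + 1;
}

/* Depth: number of edges on the path from the node up to the root.
   The root has depth 0; NULL is treated as an empty tree with depth -1. */
int depth(const struct tree_node *node)
{
    if (node == NULL)
        return -1;
    int d = 0;
    while (node->parent != NULL) {
        node = node->parent;
        d++;
    }
    return d;
}
```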

A subtree of a tree T is a tree consisting of a node in T and all of its
descendants in T. Nodes thus correspond to subtrees (each node corresponds
to the subtree of itself and all its descendants): the subtree corresponding to
the root node is the entire tree, and each node is the root node of the subtree
it determines; the subtree corresponding to any other node is called a proper
subtree (in analogy to the term proper subset).

1.6.8 Graph
A Graph is an abstract data type that is meant to implement the graph and
hypergraph concepts from mathematics. A graph data structure consists of a
finite (and possibly mutable) set of ordered pairs, called edges or arcs, of
certain entities called nodes or vertices. As in mathematics, an edge (x,y) is
said to point or go from x to y. The nodes may be part of the graph structure,
or may be external entities represented by integer indices or references. A
graph data structure may also associate to each edge some edge value, such
as a symbolic label or a numeric attribute (cost, capacity, length, etc.).

The basic operations provided by a graph data structure G usually include:


✓ adjacent(G, x, y): tests whether there is an edge from node x to node y.
✓ neighbors(G, x): lists all nodes y such that there is an edge from x to y.
✓ add(G, x, y): adds to G the edge from x to y, if it is not there.
✓ delete(G, x, y): removes the edge from x to y, if it is there.
✓ get_node_value(G, x): returns the value associated with the node x.
✓ set_node_value(G, x, a): sets the value associated with the node x to a.
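
One possible way to realise the operations listed above is an adjacency matrix.
The sketch below assumes a small fixed capacity (MAX_NODES) and identifies nodes
by integer indices; the struct layout and the names add_edge and delete_edge
(used in place of add and delete) are illustrative assumptions. Indices are not
range-checked here.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8   /* assumed fixed capacity for this sketch */

struct graph {
    bool edge[MAX_NODES][MAX_NODES];   /* edge[x][y] is true if x -> y exists */
    int node_value[MAX_NODES];
};

/* adjacent(G, x, y): is there an edge from x to y? */
bool adjacent(const struct graph *g, int x, int y)
{
    return g->edge[x][y];
}

/* neighbors(G, x): writes every y with an edge x -> y into out[],
   returns how many were written. */
size_t neighbors(const struct graph *g, int x, int out[MAX_NODES])
{
    size_t n = 0;
    for (int y = 0; y < MAX_NODES; y++)
        if (g->edge[x][y])
            out[n++] = y;
    return n;
}

/* add(G, x, y) and delete(G, x, y): both are harmless if the edge is
   already present or already absent. */
void add_edge(struct graph *g, int x, int y)    { g->edge[x][y] = true; }
void delete_edge(struct graph *g, int x, int y) { g->edge[x][y] = false; }

int get_node_value(const struct graph *g, int x)   { return g->node_value[x]; }
void set_node_value(struct graph *g, int x, int a) { g->node_value[x] = a; }
```

Because edges are ordered pairs, add_edge(&g, 0, 1) makes adjacent(&g, 0, 1)
true but leaves adjacent(&g, 1, 0) false. An adjacency list is a common
alternative when the graph has few edges relative to its number of nodes.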

Structures that associate values to the edges usually also provide:


✓ get_edge_value(G, x, y): returns the value associated with the edge (x,y).
✓ set_edge_value(G, x, y, v): sets the value associated with the edge (x,y) to v.
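
Edge values can be sketched by keeping a second matrix next to the adjacency
matrix. Everything here is an assumption for illustration: the struct name
weighted_graph, the use of double for values, and the choice to report a missing
edge by returning false rather than by some other error mechanism.

```c
#include <stdbool.h>

#define MAX_NODES 8   /* assumed fixed capacity for this sketch */

struct weighted_graph {
    bool has_edge[MAX_NODES][MAX_NODES];
    double edge_value[MAX_NODES][MAX_NODES];   /* e.g. cost, capacity, length */
};

/* Adds the edge x -> y with an initial value (overwrites if already present). */
void add_edge(struct weighted_graph *g, int x, int y, double v)
{
    g->has_edge[x][y] = true;
    g->edge_value[x][y] = v;
}

/* get_edge_value(G, x, y): stores the value in *v and returns true,
   or returns false if there is no edge from x to y. */
bool get_edge_value(const struct weighted_graph *g, int x, int y, double *v)
{
    if (!g->has_edge[x][y])
        return false;
    *v = g->edge_value[x][y];
    return true;
}

/* set_edge_value(G, x, y, v): changes the value of an existing edge,
   or returns false if there is no edge from x to y. */
bool set_edge_value(struct weighted_graph *g, int x, int y, double v)
{
    if (!g->has_edge[x][y])
        return false;
    g->edge_value[x][y] = v;
    return true;
}
```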

1.6.9 Hash Data Structure


A hash table (hash map) is a data structure used to implement an associative
array, a structure that can map keys to values. A hash table uses a hash
function to compute an index into an array of buckets or slots, from which the
desired value can be found. Ideally, the hash function will assign each key to
a unique bucket, but it is possible that two keys will generate an identical
hash, causing both keys to point to the same bucket. Most hash table designs
therefore assume that hash collisions (different keys that the hash function
assigns to the same bucket) will occur and must be accommodated in some way,
for example by chaining the entries within a bucket or by probing for another
free slot. In a well-dimensioned hash table, the average
cost (number of instructions) for each lookup is independent of the number
of elements stored in the table. Many hash table designs also allow arbitrary
insertions and deletions of key-value pairs, at (amortized) constant average
cost per operation. In many situations, hash tables turn out to be more
efficient than search trees or any other table lookup structure. For this
reason, they are widely used in many kinds of computer software, particularly
for associative arrays, database indexing, caches, and sets.
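
A minimal sketch of a hash table in C is shown below, assuming string keys, int
values, a fixed number of buckets, and collisions handled by chaining entries in
a linked list per bucket. The names hash_table, ht_put and ht_get and the
particular string hash are illustrative choices, not part of the module.

```c
#include <stdlib.h>
#include <string.h>

#define NUM_BUCKETS 16   /* assumed fixed size; a real table would resize */

struct entry {
    char *key;
    int value;
    struct entry *next;  /* next entry that hashed to the same bucket */
};

struct hash_table {
    struct entry *buckets[NUM_BUCKETS];
};

/* Simple multiplicative string hash, reduced to a bucket index. */
static size_t bucket_index(const char *key)
{
    unsigned long h = 5381;
    while (*key != '\0')
        h = h * 33 + (unsigned char)*key++;
    return h % NUM_BUCKETS;
}

/* Inserts or overwrites a key. Returns 0 on success, -1 if out of memory. */
int ht_put(struct hash_table *t, const char *key, int value)
{
    size_t i = bucket_index(key);
    for (struct entry *e = t->buckets[i]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            e->value = value;
            return 0;
        }
    }
    struct entry *fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return -1;
    fresh->key = malloc(strlen(key) + 1);
    if (fresh->key == NULL) {
        free(fresh);
        return -1;
    }
    strcpy(fresh->key, key);
    fresh->value = value;
    fresh->next = t->buckets[i];   /* chain in front of any colliding entries */
    t->buckets[i] = fresh;
    return 0;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
const int *ht_get(const struct hash_table *t, const char *key)
{
    for (const struct entry *e = t->buckets[bucket_index(key)]; e != NULL;
         e = e->next)
        if (strcmp(e->key, key) == 0)
            return &e->value;
    return NULL;
}
```

A lookup only walks the chain of one bucket, which is why the average cost stays
roughly constant as long as the chains remain short.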

1.6.10 Heap Data Structures
A heap is a special case of a balanced binary tree data structure in which the
key of each node is compared with the keys of its children and the nodes are
arranged accordingly. In a max heap, if α has child node β then
key(α) ≥ key(β)

A heap data structure is a complete binary tree that satisfies the heap
property, which takes one of two forms:
(a) every node's key is greater than or equal to the keys of its child node/s,
so the key of the root node is the largest among all nodes. This property
is called the max heap property.
(b) every node's key is less than or equal to the keys of its child node/s,
so the key of the root node is the smallest among all nodes. This property
is called the min heap property.
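
Because a heap is a complete binary tree, it is commonly stored in an array in
which the children of index i sit at indices 2i+1 and 2i+2. The sketch below
assumes that layout and int keys; the function names is_max_heap and heap_insert
are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Checks the max heap property: every parent key >= each child key.
   The parent of index i (i > 0) is at index (i - 1) / 2. */
bool is_max_heap(const int *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (a[(i - 1) / 2] < a[i])
            return false;
    return true;
}

/* Inserts value into a max heap of *n elements stored in a[].
   The new key is appended and then swapped upwards until its parent is
   no smaller. The caller must ensure a[] has room for one more element. */
void heap_insert(int *a, size_t *n, int value)
{
    size_t i = (*n)++;
    a[i] = value;
    while (i > 0 && a[(i - 1) / 2] < a[i]) {
        size_t parent = (i - 1) / 2;
        int tmp = a[parent];
        a[parent] = a[i];
        a[i] = tmp;
        i = parent;
    }
}
```

A min heap follows the same pattern with the comparisons reversed.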

1.7 Data Structure Operations

There are ten basic operations that can be performed on a data structure:
▪ Create
▪ Traversing
▪ Searching
▪ Sorting
▪ Inserting
▪ Deleting
▪ Merging
▪ Update
▪ Display
▪ Destroy

(a) Create: The create operation results in reserving memory for program
elements. This can be done by a declaration statement. Creation of a data
structure may take place either at compile time or at run time. In C,
run-time creation is typically done with the malloc() function.
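
The two kinds of creation can be contrasted in a short C sketch. The names
static_array, STATIC_SIZE and create_array are hypothetical and only serve the
illustration.

```c
#include <stdlib.h>

#define STATIC_SIZE 10

/* Compile-time creation: the size is fixed by the declaration. */
int static_array[STATIC_SIZE];

/* Run-time creation: the size n is only known while the program runs.
   Returns NULL if the memory cannot be reserved. The caller owns the
   result and must eventually release it with free(). */
int *create_array(size_t n)
{
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        a[i] = 0;   /* malloc() does not initialise the memory */
    return a;
}
```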

(b) Traversing: Traversing means accessing and processing each element
in the data structure exactly once. This operation is used for counting
the number of elements, printing the contents of the elements, etc., in
the data structure.
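
As an illustration, the sketch below traverses a singly linked list to count its
elements and to add up their values. The struct list_node layout and the function
names are assumptions for the example.

```c
#include <stddef.h>

struct list_node {
    int value;
    struct list_node *next;   /* NULL marks the end of the list */
};

/* Visits each node exactly once and counts it. */
size_t count_nodes(const struct list_node *head)
{
    size_t n = 0;
    for (const struct list_node *p = head; p != NULL; p = p->next)
        n++;
    return n;
}

/* Visits each node exactly once and accumulates its value. */
long sum_nodes(const struct list_node *head)
{
    long total = 0;
    for (const struct list_node *p = head; p != NULL; p = p->next)
        total += p->value;
    return total;
}
```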

(c) Searching: Searching is finding the location of a given element in a
collection of elements.
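
The simplest form is a linear search, sketched here for an array of int. The
function name linear_search and the convention of returning -1 when the element
is absent are assumptions for the example.

```c
#include <stddef.h>

/* Returns the index of the first element equal to target,
   or -1 if target does not occur in a[0..n-1]. */
ptrdiff_t linear_search(const int *a, size_t n, int target)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] == target)
            return (ptrdiff_t)i;
    return -1;
}
```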

(d) Sorting: Sorting is the process of arranging a list of elements in a
sequential order. The sequential order may be descending order or an
ascending order according to the requirements of the data structure.
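
In C, an array can be sorted in either order with the standard library function
qsort() and a comparison function. The wrapper sort_ints and the comparator names
below are hypothetical; only qsort() itself comes from the standard library.

```c
#include <stdlib.h>

/* Comparator for ascending order: negative, zero or positive result
   when x is less than, equal to or greater than y. */
static int cmp_ascending(const void *p, const void *q)
{
    int x = *(const int *)p;
    int y = *(const int *)q;
    return (x > y) - (x < y);   /* avoids the overflow risk of x - y */
}

/* Descending order is the ascending comparison with arguments swapped. */
static int cmp_descending(const void *p, const void *q)
{
    return cmp_ascending(q, p);
}

/* Sorts a[0..n-1] in place; pass a nonzero flag for descending order. */
void sort_ints(int *a, size_t n, int descending)
{
    qsort(a, n, sizeof a[0], descending ? cmp_descending : cmp_ascending);
}
```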

(e) Inserting: Inserting an element is adding an element to the data
structure at any position. After an insert operation, the number of
elements is increased by one.
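
For an array, inserting at an arbitrary position means shifting the later
elements one place to the right. The sketch below assumes the caller tracks the
element count and the capacity; the name insert_at is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Inserts value at index pos of a[0..*n-1] and increases *n by one.
   Returns 0 on success, or -1 if the array is full or pos is past the end. */
int insert_at(int *a, size_t *n, size_t capacity, size_t pos, int value)
{
    if (*n >= capacity || pos > *n)
        return -1;
    /* Shift a[pos..*n-1] one slot right; memmove handles the overlap. */
    memmove(&a[pos + 1], &a[pos], (*n - pos) * sizeof a[0]);
    a[pos] = value;
    (*n)++;
    return 0;
}
```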

(f) Deleting: Deleting an element is removing an element from the data
structure at any position. After a deletion operation, the number of
elements is decreased by one.
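
For an array, deleting at an arbitrary position means shifting the later elements
one place to the left to close the gap. The name delete_at and the return-code
convention are assumptions for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Removes the element at index pos of a[0..*n-1] and decreases *n by one.
   Returns 0 on success, or -1 if pos is not a valid index. */
int delete_at(int *a, size_t *n, size_t pos)
{
    if (pos >= *n)
        return -1;
    /* Shift a[pos+1..*n-1] one slot left; memmove handles the overlap. */
    memmove(&a[pos], &a[pos + 1], (*n - pos - 1) * sizeof a[0]);
    (*n)--;
    return 0;
}
```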

(g) Merging: The process of combining the elements of two data structures
into a single data structure is called merging.
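
One common case is merging two arrays that are already sorted in ascending order
into a single sorted array. The sortedness requirement is an assumption of this
sketch rather than part of the definition above, and the name merge_sorted is
hypothetical.

```c
#include <stddef.h>

/* Merges sorted arrays a[0..na-1] and b[0..nb-1] into out[0..na+nb-1].
   out must have room for na + nb elements and must not overlap a or b. */
void merge_sorted(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, k = 0;
    /* Repeatedly copy the smaller of the two front elements. */
    while (i < na && j < nb)
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    /* Copy whatever remains in the array that was not exhausted. */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```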

(h) Update: The update operation replaces the value of an existing element,
for example the element of an array at a given index.

(i) Display: This operation displays all the elements in the entire array
using a print statement.
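
The update and display operations can be sketched for an array of int as
follows. The function names update_at and display and the space-separated output
format are assumptions made for the example.

```c
#include <stddef.h>
#include <stdio.h>

/* Replaces the element at the given index.
   Returns 0 on success, or -1 if index is outside a[0..n-1]. */
int update_at(int *a, size_t n, size_t index, int value)
{
    if (index >= n)
        return -1;
    a[index] = value;
    return 0;
}

/* Prints every element on one line, separated by spaces. */
void display(const int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
}
```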

(j) Destroy: The destroy operation releases the memory space allocated for
the specified data structure. In C, the free() function is used to release
memory that was obtained from malloc().
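
When a structure consists of many separately allocated pieces, each piece has to
be released. The sketch below destroys a singly linked list whose nodes are
assumed to have been allocated with malloc(); the struct layout and the name
destroy_list are hypothetical.

```c
#include <stdlib.h>

struct list_node {
    int value;
    struct list_node *next;
};

/* Frees every node of the list and returns how many nodes were released.
   The next pointer is saved before free(), because a node must not be
   read after it has been freed. */
size_t destroy_list(struct list_node *head)
{
    size_t freed = 0;
    while (head != NULL) {
        struct list_node *next = head->next;
        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```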
