AI Project Cycle and Problem Scoping

Uploaded by

kvbsd13
Copyright
© All Rights Reserved

2.1 AI Project Cycle: An Introduction


Learning Outcomes
To identify and appreciate AI and describe its applications.
To recognize, engage and relate with the three realms of AI: Data Statistics, Computer
Vision, and Natural Language Processing.

Understanding the term Project


Project activities and their plan must go hand in hand.
In a professional field, solving a problem or reaching a solution is approached differently from the way
we handle things in our daily lives. Here, the term problem does not mean the day-to-day difficulties we
face and overcome. It refers to a big task to be accomplished or a set of major goals to be achieved.
Some real-life examples of problems and their solutions are given here.

PROBLEM | SOLUTION
Persistent traffic jam on a road. | Construct a flyover.
A village is hard to reach. | Lay down a road.
A region has no medical services available. | Build a small hospital.
Attendance takes a lot of time in school and offices. | Develop an attendance software.

Did you get the idea?


Every problem drives us to find a solution, and every solution needs to be worked upon by people
who are able to do it. Looking at the above examples, do you think creating those solutions is a one-day
or one-person task? The answer is no. Such tasks need the right expertise, qualified people,
suitable tools and money.
In addition to all this, successful completion of such tasks needs the most important thing of all: a plan.
Tasks that need planning of resources (people and tools), time and money for their successful
completion in conformance with the expected requirements are called projects.
Do you remember any school project you worked on? How did you do it, or how did your teacher
advise you to begin? At the centre of everything must have been the plan.

According to [Link], a project is a piece of planned work or an activity that is
finished over a period of time and intended to achieve a particular purpose.
The purpose of the project is translated into achievable goals.

ACTIVITY: PLANNING THE SCHOOL FAREWELL PARTY

This may sound familiar. Even if it does not, let us look at a farewell party as a small project
that needs planning to execute.
Planning the farewell party means answering certain questions that begin with why, what,
when, where, who, how, etc.
For example: Why is the party being held? Who are the hosts? Who are the guests? What are the time
and venue? Who will arrange what?
Form teams of 4 or 5 classmates and come up with a plan for the school’s farewell party covering
the following:
Budget arrangements – how much, source of money, etc.
Date, time and venue.
Activities in the party – opening, entertainment, music, dinner, gifts, speeches, closing, etc.
List of event managers: not more than 8 persons, with one of them as the captain of the managers’
team.
Responsibilities of event managers before, during and after the party.
Discuss your plan with others and your teacher.

Understanding Problem Scoping


If you reflect upon the previous activity, you learnt that a good plan and sticking to it make it easier to
achieve the intended goal (project). There is one more important learning you must take out from this
activity. How were you able to finalise the plan for the party? The reason is that you knew what you were
supposed to do as party managers. Being aware of the boundary of your project is important. It helps you
see what lies within your project as part of it and what lies outside it and does not belong to your project.
This enables you to see what you are supposed to do in your project exactly.
Trying to see or define what is to be done to solve a problem is called problem scoping and once it is
defined, then it is called the problem scope.
A problem scope is mutual understanding of all stakeholders about what is to be done to solve that
problem.
Here, stakeholders include those working on the project as well as the beneficiaries of the project and
other related people like financiers, any sponsors, etc.

ACTIVITY: PROJECT SCOPING THE SCHOOL FAREWELL PARTY

Scoping covers all activities that are to be done in a project to achieve the intended goals. Think
again about the school farewell party and list the goals to make the party successful and the tasks to
be done to accomplish each goal. Remember, here, goal means what to achieve and task means what
to do to achieve it. For example:
Goal: Invite all guests through an invitation card.
Tasks: Prepare or arrange for invitation cards, get the cards ready, distribute or send cards to all
guests.

Understanding AI Project Cycle


Development in artificial intelligence takes two forms: (i) developing a new AI-powered system from
scratch and (ii) introducing AI to an already functioning system.
Newer projects may not yet have enough data to work on, whereas existing or older systems are very
likely to hold data in bulk. For example, an old bank has a lot of data about its customers,
transactions, employees, etc., while a newly opened bank that is yet to start its operations will have
little or no data.
We know that AI systems need an immense amount of data to train them before they really work. So,
older or existing systems and industries are the most suitable for introducing AI.

Stages in a Standard AI Project Cycle


Like any planned undertaking, an AI project too has certain well-defined stages. These stages are listed
here.
Problem scoping → Data acquisition → Data exploration → Modelling data → Evaluation
Problem scoping: This is the initial stage that defines the goals to be achieved through the AI system and
the problems it will address.
Data acquisition: Collecting and compiling the relevant data so that it can be used to train and
test the AI system. This data forms the base of the AI system to be developed.
Data exploration: Once the relevant data is collected, going through and analysing it for useful
information that can be derived from it, and arranging that information in a proper format or layout.
Modelling data: Using the data to train the various AI systems from which one will be selected. The data
is presented to the AI systems in such a way that their predictions meet the goals set in problem
scoping.
Evaluation: Gauging and analysing the outputs (predictions) of the AI systems and selecting the
best-suited AI system to be deployed.
After evaluation, if the model is found fit for use, it is deployed in the real-life problem area for all
to use. Over time, its performance is monitored and further improvements are made on the
basis of its real performance and user feedback.
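
The order of the stages described above can be sketched in a few lines of Python. This is only an illustration: the names `Stage`, `next_stage` and `cycle_order` are hypothetical and do not come from any AI library.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """Stages of the AI project cycle, in the order given in the text."""
    PROBLEM_SCOPING = 1
    DATA_ACQUISITION = 2
    DATA_EXPLORATION = 3
    MODELLING = 4
    EVALUATION = 5


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after evaluation."""
    if current is Stage.EVALUATION:
        return None  # after evaluation the model is deployed and monitored
    return Stage(current.value + 1)


def cycle_order() -> list:
    """List the stage names from first to last."""
    return [stage.name for stage in Stage]


print(cycle_order())
```

Keeping the stages in one ordered structure makes it clear that each stage uses the result of the one before it.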

2.2 AI Project Cycle: Problem Scoping
Learning Outcomes
Learn problem scoping and ways to set goals for an AI project.
Identify stakeholders involved in the problem scoped.
Implement 4W framework to create problem statement template.

A problem scope is mutual understanding of all stakeholders about what is to be done to solve that
problem.
Problem scoping is the very first stage of an AI project. Trying to see or define what is to be done to solve a
problem comes under problem scoping; once it is defined, it is called the problem scope.
Here, stakeholders include those associated with the project as well as the beneficiaries of the project
and other related people like financiers, any sponsors, etc.
Problem scoping gives a clear vision of the problem. It also distinctly defines what the outcomes
of the entire problem-solving exercise will be. So, the first thing needed for defining a problem scope is to
answer the question: What are we trying to achieve by solving the problem?
A proper problem scoping achieves the following:
Identification of the stakeholders.
Clearly defined problem through a problem statement.
List of achievable goals.
Inputs required in solving the problem.
Resources (people, infrastructure, etc.) required to solve the problem.
Financial requirements (budgeting).
Delivery deadline of the solution.
Challenges in the way of problem solving.

Problem Scoping using 4W Framework


Problem scoping needs a lot of investigation, done by analysing the system for which the solution is to be
developed. You need to interview the people involved and read the relevant documents to gather
information. The major scoping activities are described here under the 4 important parameters defined by
the 4W framework.
The 4Ws stand for: Who, What, Where and Why.

Who?
Primarily, as the solution developer or provider, we should be familiar with the people who are facing the
problem and the people who will be affected, directly or indirectly, by its solution. All such
people are called stakeholders. The stakeholders include people in various capacities, such as operators of
the system for which the problem needs to be solved, users of the system, investors, managers and the
owner of the system. As solution developers, we must be familiar with the primary stakeholders, such as
the operators and the beneficiaries or users of the system.
There are many sources and means of gathering information about the stakeholders, such as the website of
the business and meeting and interviewing the persons involved.
Let us understand this through a real-life scenario.

ACTIVITY: PROBLEM IN LIBRARY MANAGEMENT SYSTEM

A college library has several thousand books. The college administration observed in the monthly
feedback from the students that it takes them a lot of time to browse through the books and
select the right ones, especially for studies. The college already has an online library management system
in the form of a mobile app where students can search for books. Still, it is difficult to browse several
categories of books and find one.
As a solution provider, answer the following questions based on the above scenario:
1. Who are the stakeholders affected by the problem?
2. What details do you have about these stakeholders?
3. What ways do you suggest to gather more information about the stakeholders?

What?
Identifying the problem: The scope of the problem cannot be defined until the problem is understood
completely and correctly. Getting familiar with all aspects of the problem is the prerequisite to defining
the problem scope. A problem that has been identified and understood can be described in writing. This is
called the problem statement. The problem statement is short and includes the problem description as well
as the proposed solution. Some samples of problem statements are given below:
How can I improve my preparation for exams?
The lab time for students needs to be optimised by minimising the activities which can
be done before or after the lab.
How to increase factory production and utilise wasted man-hours?
Let us identify the problem in the Library Management System case study.

ACTIVITY: IDENTIFYING THE PROBLEM IN THE LIBRARY MS

A college library has several thousand books. The college administration observed in the monthly
feedback from the students that it takes them a lot of time to browse through the books and
select the right ones, especially for studies. The college has an online library management system app
where students can search for books. Still, it is difficult to browse several categories of books.
In this scenario, describe the problem faced by the college in your own words.
Write a short and concise problem statement.
What is the evidence that the problem is real?

Identify the goals to achieve: Once the problem is understood, it is easy to set the goals. Describe the
goals in clear language with specific details. For example:

Develop the handwriting analyser within 60 days to analyse 500 samples in 1 minute.
Create the burglar alarm system in 30 days that raises an alarm of 120 decibels, audible within a
500-metre radius.
Prepare 10,000 transaction records by this weekend to train the 3 AI algorithms within
4 weeks.
Goals should be set to cover the entire purpose of the problem-scoping exercise.

ACTIVITY: GOAL SETTING FOR THE IDENTIFIED PROBLEM

A college library has several thousand books. The college administration observed in the monthly
feedback from the students that it takes them a lot of time to browse through the books and
select the right ones, especially for studies. The college already has an online library management system
in the form of a mobile app where students can search for books. Still, it is difficult to browse several
categories of books and find one. The IT consultant suggested an intelligent book recommendation
system that must “know” the students’ reading habits, preferences and frequency of issuing books
and, on that basis, recommend the top 5 most relevant books to each student.
Such a system could also help the college buy the most useful new books every year and classify the least
read books.

List the specific goals to be achieved by solving this problem. Assume that the college has given you 60
days to deliver the solution. You are free to judiciously allocate the number of days in which each
goal should be achieved.

Where?
This question defines the context of the problem. It shows the exact area or boundary where the problem
is occurring. This helps you identify the location of the problem. It clearly shows you when and where
exactly the real problem arises and helps you pinpoint the affected area of the system.

ACTIVITY: IDENTIFYING THE CONTEXT OF THE PROBLEM

In the Library Management System problem identified earlier, describe the context in which
the problem occurs.

Why?
This question gives the rationale for the solution. It describes the benefits to be drawn from
implementing the solution to the problem at hand. It also helps you describe how valuable the solution
will be to the stakeholders. The answer to this question must tell the stakeholders how the situation will
improve after the solution is implemented.

ACTIVITY: RATIONALE OF THE SOLUTION

In the Library Management System scenario, answer the following questions.


1. What value will the solution add to the business?
2. How will the solution improve the situation in which the problem occurs?

Our | STAKEHOLDERS | WHO
are facing a problem that | PROBLEM STATEMENT | WHAT
occurs when/while | CONTEXT | WHERE
The ideal solution would be | SOLUTION AND ITS BENEFITS | WHY

LIBRARY MANAGEMENT SYSTEM: PROBLEM STATEMENT TEMPLATE

Our | College students | WHO
are facing a problem that | It takes too long to find the desired book in the library. | WHAT
occurs when/while | Library (precisely, the library management system app). | WHERE
The ideal solution would be | AI-based book recommendation algorithm. It will recommend the top 5 most relevant books to the students. It will also help the college to figure out the most useful new books to buy every year and to classify the least read books. | WHY
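
The problem statement template can also be kept as a small data structure, so that all four answers are always filled in together. The sketch below is only an illustration: the class name `ProblemStatement` and its `render` method are hypothetical, and the sample answers are taken from the library template above.

```python
from dataclasses import dataclass


@dataclass
class ProblemStatement:
    """One field per question of the 4W framework."""
    who: str    # stakeholders
    what: str   # problem statement
    where: str  # context in which the problem occurs
    why: str    # ideal solution and its benefits

    def render(self) -> str:
        """Fill the problem statement template with the four answers."""
        return "\n".join([
            f"Our {self.who} [WHO]",
            f"are facing a problem that {self.what} [WHAT]",
            f"when/while {self.where} [WHERE]",
            f"The ideal solution would be {self.why} [WHY]",
        ])


library = ProblemStatement(
    who="college students",
    what="it takes too long to find the desired book in the library",
    where="using the library management system app",
    why="an AI-based book recommendation algorithm that recommends "
        "the top 5 most relevant books",
)
print(library.render())
```

Because every field is required, leaving out any of the four Ws raises an error as soon as the object is created.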

Case Study: The digital screener
Edusoft Academy, a professional training institute, is going to conduct an all-India admission test. In
the last admission test, they faced the problem of unauthorised candidates appearing for the
test at various centres. They need a solution that raises an alarm if the person entering the
examination hall is neither an authorised candidate nor an authorised staff member. The academy has the
photographs of all the candidates appearing for the admission test and of all its authorised staff. The
photographs are in hard copy (printed). Let us fill in the problem statement template for this scenario.

Our (STAKEHOLDERS, WHO)
1. Edusoft Academy invigilators and staff involved in the test process.
2. Candidates: the admission aspirants.
3. Edusoft Academy management.

are facing a problem that (PROBLEM STATEMENT, WHAT)
How to prevent unauthorised candidates from appearing in the admission test by implementing a
solution of facial recognition of candidates to identify them?

occurs when/while (CONTEXT, WHERE)
Candidates enter the examination halls at every exam centre.

The ideal solution would be (SOLUTION AND ITS BENEFITS, WHY)
An AI system that recognises the faces of people entering the examination hall and matches them with
the available photographs. For any mismatch, an alarm is raised.
It allows only genuine candidates into the examination hall.
It minimises the effort and time needed to identify the right candidates.
It helps conduct a fair admission test.

2.3 AI Project Cycle: Data Acquisition
Learning Outcomes
Learn about data, data features and data formats.
Work around the scenarios to think of the ways to acquire data.
Create System Map of the problem area/context.

Data is the biggest asset for a business, society and economy today.
Data (singular: datum), as we know, are raw facts that on their own do not make any sense. When
a set of data are related logically in a context, they generate information which is meaningful and useful
for a purpose. For example, Raj, A, 9 and 16 are data values which make no sense as
such. But if we look at them in the context of a school and relate them to one another, they make a piece
of information: Raj is 16 years old and studies in class 9, section A.
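
The same idea can be shown in a few lines of Python. This is only a sketch: the field names `name`, `section`, `class` and `age` are assumed labels for the school context described above.

```python
# Raw data values: on their own they carry no meaning.
raw_values = ["Raj", "A", 9, 16]

# The same values placed in a school context (field names are assumed).
fields = ["name", "section", "class", "age"]
record = dict(zip(fields, raw_values))

# Relating the values to one another turns data into information.
information = (
    f"{record['name']} is {record['age']} years old and studies in "
    f"class {record['class']}, section {record['section']}."
)
print(information)
```

The values did not change. Only the context attached to them did.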

Data Features
Not every piece of data is the same. In the previous example, Raj and A are text while
9 and 16 are numbers. If you look into various systems, you will find basically three features of
data, which are also called data types:
Characters or individual letters, symbols, marks. E.g. a, A, @, *, ! etc.
Strings of letters, also called text. E.g. “India”, “Ravi”, “House”.
Numbers. E.g. 10, 1, -9, 200.
There are variations to these three basic data types:
Phrases and sentences – variations of strings.
Numbers with decimal places.
Dates.

Data Formats
How data is presented or stored is determined by its format.
For numbers with decimal places, a system may define how many decimal places can be stored.
For example, monetary figures have 2 decimal places while figures in scientific notation
may have more than 10 decimal places.
Dates can be presented in various formats like 12-29-2022, 29-Dec-2022 and 29-12-2022, etc.
Text can be in various cases like UPPERCASE, lowercase, etc.
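
The data types and formats listed above map closely onto values in a programming language. The following Python sketch is only an illustration: the variable names and the sample price are made up, and the month name produced by `%b` assumes the default English locale.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP

# The three basic data types from the text, as Python values.
character = "@"   # Python has no separate character type: a string of length 1
text = "India"    # string
number = -9       # number

# One date value presented in the three formats mentioned in the text.
d = date(2022, 12, 29)
date_formats = [
    d.strftime("%m-%d-%Y"),  # month-day-year
    d.strftime("%d-%b-%Y"),  # day-month name-year
    d.strftime("%d-%m-%Y"),  # day-month-year
]

# A monetary figure stored with exactly 2 decimal places.
price = Decimal("1249.5").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# The same text presented in different cases.
cases = [text.upper(), text.lower()]

print(date_formats, price, cases)
```

In each case the underlying value stays the same and only its presentation changes.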

Complex Data Types


There are some other, complex data types such as audio, video, audio-video, images and 3-dimensional (3D)
objects such as the model of a house. Complex data types come in a variety of formats: audio
is coded as mp3, wav, etc.; video as mpeg, mp4, etc.; images as jpg, gif, png, etc.

AI System and Data


We already know that an artificial intelligence system relies on a lot of data for training the machine
algorithm. A business system which holds bulk data is the most suitable ground for implementing an AI
system.
The bulk data is divided into two parts. One part is used to train the machine algorithm; it is called
training data. The other part is used to test the performance of the AI model, that is, whether it has
been trained as desired; it is called testing data.
So, for an AI system to be trained and tested, bulk data acquisition is a prerequisite.
Let us understand this with a simple example. An AI system can predict the score of a particular batsman
against an opponent team. For this purpose, data on the batsman’s scores in previous matches
against this team, and against other teams as well, can be acquired. This data can be
divided into training and testing data sets. The training data set is used to train the AI model
and, after training, the testing data set is used to evaluate its output performance.
The accuracy of prediction depends directly on the training data: the results obtained on the testing data
reflect the quality and relevance of the training data. For example, if the AI system is trained with a
data set containing only the scores of matches played by that batsman against the opponent team in
question, it may produce more accurate predictions than a system trained with a data set that also
includes scores of matches played by that batsman against other teams.
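
Dividing the acquired data into training and testing sets can be sketched as follows. This is a minimal illustration, not a prescribed method: the scores are made-up values, and the function name `split_data`, the 80/20 ratio and the seed are assumptions.

```python
import random

# Hypothetical scores of one batsman in 10 earlier matches (made-up values).
scores = [45, 12, 78, 0, 102, 33, 56, 8, 61, 27]


def split_data(data, train_fraction=0.8, seed=42):
    """Shuffle a copy of the data and divide it into training and testing sets."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


training_data, testing_data = split_data(scores)
print(len(training_data), "training values,", len(testing_data), "testing values")
```

Shuffling before splitting avoids putting, say, only the most recent matches in the testing set.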

Data Quality
Data which is relevant to the context in which the problem is being solved is said to be quality data or
useful data. Data quality is defined by the following features of data:
Relevance: Data should not be out of context. That is why it is important to identify the
context of the problem during the problem scoping stage. This ensures that only the data values relevant to
the problem are acquired.
Age: Data should be neither too historic nor too recent. There has to be a balance. For example, if data for
test matches is used to train the machine while predictions are to be made for T20 matches, then
the data is too old for the purpose.
Accuracy: Data values should be correct and in the proper format. For example, if scores are to be predicted
against the opponent team from Sri Lanka and the name Sri Lanka is misspelt in several records,
there are chances that those records will be left out of the testing data.
Volume: The higher the volume of data, the better the training of the machine. That is why the AI algorithms
of e-commerce and social media websites get more intelligent day by day, minute by minute: they have a
lot of data to learn from every day.

Richness: Richness refers to the variety of data values in the data set. It relates directly to the volume
of data, since a bulk lot of data is likely to contain a variety of values. For example, values such as
centuries hit, sixes scored, ducks (zero scores) and number of times not out make a richer data set and a
more robust training of the machine than plain total scores.
Format: In many AI applications, different data formats also help in better training of the machine.
For example, an AI system using natural language processing needs letters, text, symbols and voice in
different notations, accents, tones, semantics, etc., and a face recognition AI system needs a variety of
image shots of the same person, while our example of score prediction needs only numbers and, occasionally,
text (name of the country, bowler, etc.).
Data source: Data accuracy depends on the source from which the data is collected. For example, data
collected from the public domain, such as the Internet, may not be authentic, while data collected from an
authorised source such as a government or certified organisation is more likely to be authentic.
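
The accuracy feature can be illustrated with the Sri Lanka example. In the sketch below, the records and the helper names `exact_match`, `normalise` and `cleaned_match` are hypothetical, and the clean-up rule (ignoring case and spaces) is only one simple assumption about how spelling variants might be handled.

```python
# Hypothetical match records; two of them spell the opponent's name differently.
records = [
    {"opponent": "Sri Lanka", "score": 54},
    {"opponent": "Srilanka", "score": 71},
    {"opponent": "Australia", "score": 23},
    {"opponent": "sri lanka ", "score": 12},
]


def exact_match(rows, team):
    """Select rows whose opponent is spelt exactly like `team`."""
    return [r for r in rows if r["opponent"] == team]


def normalise(name):
    """Lower-case the name and drop spaces so spelling variants compare equal."""
    return name.lower().replace(" ", "")


def cleaned_match(rows, team):
    """Select rows after normalising both sides of the comparison."""
    return [r for r in rows if normalise(r["opponent"]) == normalise(team)]


print(len(exact_match(records, "Sri Lanka")))    # only the correctly spelt record
print(len(cleaned_match(records, "Sri Lanka")))  # all three Sri Lanka records
```

Without the clean-up step, two of the three relevant records would silently drop out of the data set.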
Let us apply our learning so far about the data and its features.

ACTIVITY: DATA FOR THE SOLUTION OF LIBRARY MS PROBLEM

A college library has several thousand books. The college administration observed in the monthly feedback
from the students that it takes them a lot of time to browse through the books and select the
right ones, especially for studies. The college already has an online library management system in the form
of a mobile app where students can search for books. Still, it is difficult to browse several categories of
books and find one. The IT consultant suggested an intelligent book recommendation system that
must “know” the students’ reading habits, preferences and frequency of issuing books and, on
that basis, recommend the top 5 most relevant books to each student. Such a system could
also help the college buy the most useful new books every year and classify the least read books.
The most popular books issued category-wise.
Grouping of students who issued similar books.
The least read books category-wise.
The books which make less than 5% of all the books issued.
In the above scenario, list the sample data values, their data types and possible data formats. Also list
some examples of irrelevant data.

Data Acquisition
After understanding the data, its features and quality features, let us get to the practical aspect of
it – acquiring data to train the machine (machine learning).
Data acquisition has two aspects:
Data sources.
Data acquisition process.
Data sources
Depending on the context of the problem as considered during problem scoping, there could be
different sources that may provide training data. Some common examples are:
Database of the company for which solution is being developed.
Customer reviews and feedback.
Business documents – financial statements, business transactions, agreements, etc.
Web page content.
Live data – video recording, satellite imagery, images captured by webcam, chat text, phone calls,
video chat stream, CCTV feeds, weather data, etc.
Raw, flat files – plain text files, comma separated values (csv) files, spreadsheets, maps, images, hard
copies (books, reports), tables, etc.
Software applications: they generate some data of their own while working, which may sometimes be
useful. For example, a Windows server maintains the log-in information of users in a file. Other
examples are the registry of an operating system, which has details of the software and hardware installed
on the computer system, or simply the virus database of an anti-virus application.
Data acquisition process
Depending on the source of data, there are different methods or processes of acquiring the data. Let us
have a look at different possible ways of data acquisition:
Certain data sources retain the data in such an organised fashion that it can be acquired or collected
very easily. For example, databases store data in tables, which are very well organised. Spreadsheets
also store most of their data in tabular format.
Databases also allow us to generate data sets by applying queries on them. Many software applications
allow data to be exported in various formats which are easier to process.
Data acquisition needs more effort and more sophisticated methods when the data is not organised in a
particular format. For example, images, plain text, audio and video are complex data types to acquire
and need different tools of compilation such as scanners, optical readers, sensors, etc.
Certain kinds of data can be generated directly in the form of hard copy. For example, a call data
printout from an EPBX machine or call connection exchange.
Another way of acquiring data is through online survey, feedback and review forms. All the data
entered in such forms can be collected in the form of a spreadsheet or CSV file.
Live data is acquired via the device involved, such as a webcam, CCTV, chatbot interface, satellite,
sensors in medical equipment, etc.

Programming interfaces are pieces of code which help one application connect with another. Such an
interface is called an Application Programming Interface (API). For example, a
Python program may use a Java API to import data via a Java program.
Web scraping or web harvesting is the technique that lets us collect the data of a website in an
organised format such as a table, CSV file or spreadsheet.
Scanning symbols, bar codes, QR codes, etc.
The simple, traditional method is using pen and paper to collect the data in printed forms filled in
by hand. Such documents can be scanned with an Optical Character Recognition (OCR) device, and the soft
copy can then be processed to extract the data.
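
Two of the acquisition methods listed above, querying a database and exporting the result as a CSV file, can be sketched with Python's standard `sqlite3` and `csv` modules. The table name `issues`, its columns and the sample rows are assumptions made for this illustration. A real library system would have its own database layout.

```python
import csv
import io
import sqlite3

# Build a small in-memory database (table and column names are assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (student TEXT, book TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO issues VALUES (?, ?, ?)",
    [
        ("Raj", "Physics I", "Science"),
        ("Ravi", "Physics I", "Science"),
        ("Raj", "World History", "History"),
    ],
)

# A query turns the stored table into the data set we need.
rows = conn.execute(
    "SELECT book, COUNT(*) AS times_issued FROM issues "
    "GROUP BY book ORDER BY times_issued DESC, book"
).fetchall()
conn.close()

# Export the data set as CSV text, a format most tools can read.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["book", "times_issued"])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The query does the organising work inside the database, so only the rows that matter are exported.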

ACTIVITY: DATA SOURCES AND ACQUISITION IN LIBRARY MS

Consider the Library Management System problem you have read earlier and list possible sources
of data relevant for AI-powered book recommendation algorithm. Also, list some possible ways to
acquire the data.

Case Study: The digital screener


Edusoft Academy, a professional training institute, is going to conduct an all-India admission test.
In the last admission test, they faced the problem of unauthorised candidates appearing for the
test at various centres. They need a solution that raises an alarm if the person
entering the examination hall is neither an authorised candidate nor an authorised staff member. The
academy has the photographs of all the candidates appearing for the admission test and of all its
authorised staff. The photographs are in hard copy (printed).
Consider the above case study and answer the following questions:
1. List the possible data for the digital screener, with possible data types (features) and formats.
Ans. Photographs of candidates and authorised personnel of Edusoft Academy.
2. List the possible data sources which provide the data for the digital screener.
Ans. Hard copies of the candidates' photographs and digitally captured photographs of Edusoft
Academy personnel.
3. List the process or method of data acquisition for machine learning.
Ans. Scanning the candidates' photographs with a scanner.

System Map of an AI System
When we analyse a problem area for scoping, we identify various elements that comprise the context of the
problem area. A system map is a visual tool that shows the relationships among the various elements of a
problem area in graphical form. It helps us think of a possible solution to the problem easily. A system
map shows the interconnections of the elements of a system and helps us understand complex issues
easily. The System Map for the Library MS scenario is given here:

Interpreting the System Map


A system map depicts “cause and its effect” on the elements in the following terms:
I. How much time the change takes to occur (shorter or longer): depicted by a short arrow
(shorter time delay) or a long arrow (longer time delay).
For example, in our Library MS scenario, as a reader (student) gets a book issued, the “number
of issues” for that book increases immediately and its “popularity index” also increases
immediately. In the figure, these relations are shown with short arrows. On the other
hand, “New book purchase” and “Least read books” are determined after a period of time,
hence their arrows are longer.
II. How elements are related to each other (directly or inversely): depicted by a + sign (direct
relation, which means that if one element increases the other increases too) or a - sign (inverse
relation, which means that if one element increases the other decreases).
For example, in our Library MS scenario, the more the readers, the more the book issues and the higher the
popularity index, and vice versa (the relationship is shown with a + sign). The higher the popularity
index, the fewer the “least read books” (inverse relation, i.e. - sign).
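
The + and - relations of a system map can also be written down as data. The sketch below is only an illustration and does not show how LOOPY works internally: the dictionary `links` and the function `net_relation` are hypothetical names, and combining the signs along a chain (two direct relations stay direct, while one inverse relation flips the result) is an assumption that goes a step beyond the text.

```python
# Each link of the system map: (cause, effect) -> "+" (direct) or "-" (inverse).
# The links follow the Library MS example in the text.
links = {
    ("readers", "book issues"): "+",
    ("book issues", "popularity index"): "+",
    ("popularity index", "least read books"): "-",
}


def net_relation(path):
    """Combine the signs along a chain of elements into one overall sign."""
    sign = 1
    for cause, effect in zip(path, path[1:]):
        sign *= 1 if links[(cause, effect)] == "+" else -1
    return "+" if sign > 0 else "-"


print(net_relation(["readers", "book issues", "popularity index"]))
print(net_relation(["readers", "book issues", "popularity index",
                    "least read books"]))
```

Read this way, more readers lead to a higher popularity index, and through it to fewer least read books.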

How to Draw a System Map?


The system map of Library MS is created using the online tool LOOPY ([Link]/loopy). If you have
worked with any basic drawing software, you will be able to use LOOPY within a few moments.
• To draw circles, click on the Pencil tool and drag in the desired place on the canvas.
• To move circles and arrows, click on the Move tool and drag and drop with the mouse.
• To type text, select the Text tool, click in the desired place on the canvas and then type the text in
the text area provided in the properties panel on the right-hand side.
• Objects can be deleted using the Eraser tool or the Delete button in the right-hand properties panel.
You can look at the 3 examples given on the LOOPY website to understand the concept better.

Case Study: The digital screener
Create a simple System Map of the Edusoft Academy scenario.
