AI & ML Question Bank for 5th Sem
A heuristic function in informed search strategies estimates the cost of reaching a goal from a given node. It guides the search towards more promising paths by preferring nodes with a lower estimated cost, rather than relying only on the path cost incurred so far. For example, when the A* algorithm is used for route planning, the heuristic might be the straight-line (Euclidean) distance between the current location and the destination, so that routes which appear to shorten the remaining travel distance are explored first.
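A minimal sketch of a straight-line-distance heuristic is given below. The place names and coordinates are hypothetical, chosen only to illustrate how the estimate ranks candidate nodes.

```python
import math

# Hypothetical (x, y) map coordinates, for illustration only.
coords = {"A": (0, 0), "B": (3, 4), "C": (6, 8), "Goal": (6, 0)}

def h(node, goal="Goal"):
    """Straight-line (Euclidean) distance from node to goal."""
    (x1, y1), (x2, y2) = coords[node], coords[goal]
    return math.hypot(x1 - x2, y1 - y2)

# Among the candidate next nodes, prefer the one with the lowest estimate.
neighbors = ["B", "C"]
best = min(neighbors, key=h)
print(best, h(best))
```

The heuristic is zero at the goal and never exceeds the true road distance, since no route can be shorter than the straight line.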
The four approaches to defining AI are: thinking humanly, acting humanly, thinking rationally, and acting rationally. Thinking humanly involves cognitive modeling aimed at achieving human-like reasoning. Acting humanly involves enabling machines to mimic human behavior, such as passing the Turing Test. Thinking rationally is concerned with representing knowledge and reasoning about it logically. Acting rationally involves creating agents that make decisions to maximize their performance based on available information. These approaches differ mainly in whether they emphasize thought processes or behavioral outcomes, and in whether success is measured against human performance or against an ideal of rationality.
Preprocessing in machine learning involves preparing raw data for modeling, which can significantly affect model performance. This includes handling missing values, scaling features, encoding categorical variables, and splitting data into training and testing sets. Proper preprocessing can enhance model accuracy by ensuring the algorithm doesn't make biased assumptions due to inconsistent data scales or incomplete data. For instance, normalizing data to eliminate scale differences among features is crucial when using distance-based models like K-NN, as it prevents dominance of features with larger ranges over smaller ones.
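The effect of scaling on a distance-based method can be sketched with a small example. The data (age in years, income in currency units) is hypothetical, and min-max scaling is only one of several possible normalization choices.

```python
import math

def min_max_scale(rows):
    """Rescale every column of rows to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def nearest(rows, query_index):
    """Index of the row closest (Euclidean) to rows[query_index]."""
    q = rows[query_index]
    others = [i for i in range(len(rows)) if i != query_index]
    return min(others, key=lambda i: math.dist(rows[i], q))

# Hypothetical records: [age, income]
people = [[25, 50_000], [60, 50_500], [27, 52_000], [40, 90_000]]

raw_nn = nearest(people, 0)                    # income dominates the distance
scaled_nn = nearest(min_max_scale(people), 0)  # both features contribute
print(raw_nn, scaled_nn)
```

On the raw data the 25-year-old's nearest neighbor is the 60-year-old, because a 500-unit income gap outweighs a 35-year age gap; after scaling, the 27-year-old is nearest.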
The Knowledge Pyramid (Data-Information-Knowledge-Wisdom, or DIKW) illustrates the transformation of raw data into actionable wisdom. It starts with data, which is processed to form information, then contextualized into knowledge, and finally applied as wisdom. Machine learning plays a crucial role in this transformation process by automating the extraction of patterns and insights from vast data sets, converting information into knowledge. This becomes increasingly important in fields where data volume is vast and manual analysis is impractical. For instance, machine learning algorithms can detect complex patterns in clinical data that inform medical decisions.
In AI problem-solving, the vacuum world is formulated as a set of states, the actions available to the agent, a goal, and path costs. Each state specifies the agent's location and whether each room contains dirt. The actions are moving left, moving right, and sucking up dirt; the goal is a state in which every room is clean; and a cost can be assigned to each move or cleaning action. The problem is well-defined because it has a clear state representation, specified actions, an explicit goal, and measurable costs, which makes it suitable for search strategies that find optimal solutions.
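The formulation above can be sketched in code. The version below assumes the common two-room variant (rooms named "A" and "B", which are illustrative labels) with a unit cost per action, and uses breadth-first search to find a shortest plan.

```python
from collections import deque

ACTIONS = ("Left", "Right", "Suck")

def result(state, action):
    """Transition model. A state is (location, dirt_in_A, dirt_in_B)."""
    loc, dirt_a, dirt_b = state
    if action == "Left":
        return ("A", dirt_a, dirt_b)
    if action == "Right":
        return ("B", dirt_a, dirt_b)
    # "Suck" removes dirt from the current room only.
    if loc == "A":
        return (loc, False, dirt_b)
    return (loc, dirt_a, False)

def is_goal(state):
    """Goal test: neither room contains dirt."""
    return not state[1] and not state[2]

def solve(start):
    """Breadth-first search; every action is assumed to cost 1."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if is_goal(state):
            return plan
        for action in ACTIONS:
            nxt = result(state, action)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None

plan = solve(("A", True, True))
print(plan)  # ['Suck', 'Right', 'Suck']
```

With two rooms there are only 2 x 2 x 2 = 8 states, so exhaustive search is trivial; the same formulation scales to larger worlds where the choice of search strategy starts to matter.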
Class imbalance in machine learning occurs when one class has far more examples than another, which can lead to biased models that perform poorly on the minority class. For example, in fraud detection there are far fewer fraudulent transactions than legitimate ones, so a model can achieve high accuracy simply by always predicting the majority class while failing to detect any actual fraud. Similarly, in medical diagnostics rare diseases are underrepresented, leading to classifiers that mostly predict negative results and miss crucial diagnoses. This necessitates techniques such as re-sampling and cost-sensitive learning, together with performance metrics that account for imbalance, such as precision, recall, and the F1 score.
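The accuracy problem and one simple remedy can be illustrated as follows. The labels are synthetic (5 fraud cases in 1000 transactions), the "model" is a stand-in that always predicts the majority class, and random oversampling is shown only as one example of re-sampling.

```python
import random

# Synthetic labels: 1 = fraud, 0 = legitimate.
y_true = [1] * 5 + [0] * 995
y_pred = [0] * 1000  # stand-in model: always predicts "legitimate"

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model detected."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0

def random_oversample(labels, minority=1, seed=0):
    """Return sample indices with minority examples repeated until balanced."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    extra = [rng.choice(minority_idx)
             for _ in range(len(majority_idx) - len(minority_idx))]
    return minority_idx + majority_idx + extra

print(accuracy(y_true, y_pred))  # 0.995
print(recall(y_true, y_pred))    # 0.0

balanced = random_oversample(y_true)
```

Accuracy looks excellent while recall on the fraud class is zero, which is why accuracy alone is misleading here. Oversampling should be applied to the training split only, so that duplicated examples do not leak into the test set.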
Machine learning is intertwined with data mining and artificial intelligence (AI), each contributing distinct yet complementary aspects. Data mining focuses on discovering patterns in large datasets, leveraging machine learning methods for automated pattern recognition. AI uses machine learning to enhance its ability to mimic human decision-making through experience-based learning. The synergy lies in how machine learning supplies predictive models for AI applications and more capable algorithms for data mining, ultimately expanding the scope for intelligent automation across industries, as in intelligent personal assistants and autonomous vehicles.
Breadth-First Search (BFS) uses a queue to explore neighbors level by level, visiting all nodes at a given depth before moving on to the next. Its time complexity and space complexity are both O(b^d), where b is the branching factor and d is the depth of the shallowest goal. Depth-First Search (DFS), on the other hand, uses a stack or recursion to follow a branch as far as possible before backtracking, with a space complexity of O(bm) and a time complexity of O(b^m), where m is the maximum depth of the search tree. BFS is typically used for finding the shortest path in unweighted graphs, while DFS is used in applications such as topological sorting and solving puzzles with unique solutions.
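The difference in visiting order can be seen in a short sketch. The small tree below is a made-up example; the two functions differ only in using a queue versus a stack.

```python
from collections import deque

# Hypothetical tree given as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F"],
    "D": [], "E": [], "F": [],
}

def bfs_order(graph, start):
    """Visit nodes level by level using a FIFO queue."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nb in graph[node]:
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return order

def dfs_order(graph, start):
    """Follow one branch to the end before backtracking, using a LIFO stack."""
    order, seen, stack = [], set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Push neighbors in reverse so the first-listed one is explored first.
        stack.extend(reversed(graph[node]))
    return order

print(bfs_order(graph, "A"))  # ['A', 'B', 'C', 'D', 'E', 'F']
print(dfs_order(graph, "A"))  # ['A', 'B', 'D', 'E', 'C', 'F']
```

BFS keeps an entire level in its queue, which is where the O(b^d) space cost comes from, whereas the DFS stack only holds the unexplored siblings along the current branch.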
The A* search algorithm uses both the cost incurred to reach a node, g(n), and a heuristic estimate of the remaining cost, h(n), ranking nodes by f(n) = g(n) + h(n) in order to find the least-cost path to a goal. Greedy Best-First Search ranks nodes by the heuristic h(n) alone, aiming to reach the goal as quickly as possible without considering the path cost. A* is guaranteed to return an optimal solution as long as the heuristic is admissible (it never overestimates the cost to reach the goal), because it balances the heuristic value against the real path cost. Greedy Best-First Search offers no such guarantee and can commit to suboptimal paths because it ignores path costs.
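The contrast can be sketched with a single best-first routine whose priority is either g(n) + h(n) (A*) or h(n) alone (greedy). The graph, edge costs, and heuristic values are hypothetical; the heuristic was chosen to be admissible yet misleading about which first step is better.

```python
import heapq

# Hypothetical weighted graph: node -> {neighbor: edge cost}.
graph = {
    "S": {"A": 2, "B": 1},
    "A": {"G": 2},
    "B": {"G": 9},
    "G": {},
}
# Hypothetical admissible heuristic (never above the true remaining cost).
h = {"S": 2, "A": 2, "B": 1, "G": 0}

def best_first(graph, h, start, goal, use_path_cost):
    """A* if use_path_cost is True, greedy best-first search otherwise."""
    frontier = [(h[start], 0, start, [start])]  # (priority, g, node, path)
    best_g = {}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        # Skip nodes already expanded via a path that was at least as cheap.
        if node in best_g and best_g[node] <= g:
            continue
        best_g[node] = g
        for nb, cost in graph[node].items():
            g2 = g + cost
            priority = (g2 if use_path_cost else 0) + h[nb]
            heapq.heappush(frontier, (priority, g2, nb, path + [nb]))
    return None, float("inf")

print(best_first(graph, h, "S", "G", use_path_cost=True))   # (['S', 'A', 'G'], 4)
print(best_first(graph, h, "S", "G", use_path_cost=False))  # (['S', 'B', 'G'], 10)
```

Greedy search heads for B because its heuristic value is lowest and then pays the expensive B-to-G edge, while A* notices that the total estimate through A is smaller and returns the cheaper route.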
Data visualization techniques translate complex datasets into graphical formats, making trends and patterns more understandable to enhance analysis. Common methods include histograms for showing frequency distributions, line charts for tracking changes over time, and scatter plots for identifying relationships between variables. For example, a scatter plot might reveal correlations between variables such as sales and advertising spend, while a histogram could highlight distribution skewness in customer age demographics. Such visualizations make it easier to spot anomalies and insights that raw data tables might obscure.
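As a rough illustration of the histogram idea, the sketch below bins a made-up list of customer ages and prints a text histogram. A plotting library would normally be used instead; plain text is used here only to keep the example dependency-free.

```python
from collections import Counter

def text_histogram(values, bin_width):
    """Return one line per bin, with one '#' per value falling in that bin."""
    bins = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    for start in range(min(bins), max(bins) + bin_width, bin_width):
        label = f"{start}-{start + bin_width - 1}"
        lines.append(f"{label:<6}| {'#' * bins[start]}")
    return lines

# Made-up customer ages, for illustration only.
ages = [22, 25, 27, 29, 31, 33, 34, 38, 41, 45, 52, 67]

lines = text_histogram(ages, bin_width=10)
print("\n".join(lines))
```

The bars shrink towards the older age groups, showing a right-skewed distribution that is harder to see in the raw list of numbers.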