System Software and Operating System

1. System Software: Machine, Assembly and High-Level Languages;


Compilers and Interpreters; Loading, Linking and Relocation; Macros,
Debuggers.
System software acts as an intermediary between the hardware and the application software.
It manages and controls the computer hardware and provides a platform for application
software to run.

1.1. Machine Language


●​ Definition: The lowest-level programming language, directly understood by the
computer's CPU. It consists of binary code (0s and 1s).
●​ Characteristics:
○​ Hardware Dependent: Each CPU architecture has its own unique machine language
instruction set.
○​ Difficult to Program: Extremely tedious and error-prone for humans to write and
understand.
○​ Fast Execution: No translation required, leading to very fast execution.
●​ Example: 00101100 00000001 00000010 (could represent adding two numbers,
depending on the architecture).

1.2. Assembly Language


●​ Definition: A low-level programming language that uses mnemonics (short, symbolic
codes) to represent machine language instructions. It's a symbolic representation of
machine code.
●​ Characteristics:
○​ Hardware Dependent: Still specific to a particular CPU architecture.
○​ Easier than Machine Language: More readable and writable for humans than
machine code.
○​ Assembler Required: Needs an assembler to translate assembly code into machine
code.
○​ Fine-grained Control: Allows direct manipulation of hardware registers and memory.
● Example (x86 assembly):
MOV AX, 5   ; Move the value 5 into register AX
ADD AX, BX  ; Add the value in register BX to AX

1.3. High-Level Languages


●​ Definition: Programming languages that are more abstract and closer to human
language, using natural language elements and mathematical notation. They are
designed to be platform-independent.
●​ Characteristics:
○​ Hardware Independent (mostly): Programs can be written once and run on
different architectures with a suitable compiler/interpreter.
○​ Easier to Program: Much easier for humans to write, read, and maintain.
○​ Higher Abstraction: Abstract away many low-level hardware details.
○​ Require Translation: Need a compiler or interpreter to translate into machine code.
●​ Examples: C, C++, Java, Python, JavaScript.
● Example (C):
int a = 5;
int b = 10;
int sum = a + b;

1.4. Compilers and Interpreters


Both compilers and interpreters translate high-level language code into a form that the
computer can understand.
●​ Compiler:
○​ Process: Reads the entire source code program at once, translates it into machine
code (or an intermediate code), and then generates an executable file.
○​ Execution: The executable file can be run independently of the compiler.
○​ Speed: Generally faster execution because the entire code is translated beforehand.
○​ Error Reporting: Reports all errors found in the code after compilation.
○​ Examples: C, C++, Java (compiles to bytecode), Go.
●​ Interpreter:
○​ Process: Translates and executes the source code line by line (or statement by
statement).
○​ Execution: Requires the interpreter to be present every time the program runs.
○​ Speed: Generally slower execution due to the line-by-line translation overhead.
○​ Error Reporting: Stops execution and reports an error as soon as it encounters one.
○​ Examples: Python, JavaScript, Ruby, PHP.

1.5. Loading, Linking and Relocation


These are crucial steps in the process of getting an executable program loaded into memory
and ready for execution.
●​ Compilation/Assembly: Source code is converted into object files (.obj or .o), which
contain machine code but might have unresolved references to functions/variables
defined in other files or libraries.
●​ Linking:
○​ Definition: The process of combining multiple object files (compiled from different
source files) and library routines (pre-compiled code, e.g., for printf) into a single
executable file.
○​ Static Linking: All necessary library code is copied directly into the executable file.
■​ Pros: Self-contained executable, no external dependencies at runtime.
■​ Cons: Larger executable files, updates to libraries require recompiling the entire
program.
○​ Dynamic Linking: Library code is not copied into the executable. Instead, the
executable contains references to the shared library files (.dll on Windows, .so on
Linux). The actual linking happens at runtime when the program is loaded.
■​ Pros: Smaller executable files, easier library updates, multiple programs can
share the same library in memory.
■​ Cons: Program depends on the presence of shared libraries, potential "DLL hell"
(version conflicts).
●​ Loading:
○​ Definition: The process of copying the executable file from secondary storage (e.g.,
hard disk) into the computer's main memory (RAM) so that the CPU can execute its
instructions.
○​ Responsibilities of the Loader:
■​ Allocating memory space for the program.
■​ Reading the executable file into the allocated memory.
■​ Setting up necessary CPU registers (e.g., program counter to the starting
address).
●​ Relocation:
○​ Definition: The process of adjusting addresses within a program's code and data
segments to reflect its actual physical memory location when loaded. This is
necessary because the loader might not load the program at the same absolute
memory address every time.
○​ Static Relocation: Addresses are fixed at link time. Requires the program to be
loaded at a specific address or re-linked if moved. (Less common in modern OS).
○​ Dynamic Relocation: Addresses are adjusted at load time or even during execution
(using a relocation register or memory management unit (MMU)). This allows the
program to be loaded anywhere in memory.

1.6. Macros
●​ Definition: A named sequence of instructions or commands that can be defined once
and then invoked multiple times by its name. In programming, macros are often used for
code expansion or text substitution during the pre-processing phase of compilation or
assembly.
●​ Types:
○​ Assembly Language Macros: Used to define a sequence of assembly instructions
that can be inserted into the code with a single macro call.
○​ C/C++ Preprocessor Macros (#define): Perform simple text substitution before
compilation.
●​ Purpose:
○​ Code Reusability: Avoids repetitive typing of common code sequences.
○​ Parameterization: Can accept parameters, making them more flexible.
○​ Conditional Compilation: In C/C++, macros can be used with #ifdef, #ifndef for
including/excluding code based on conditions.
●​ Disadvantages (C/C++ macros):
○​ Debugging Difficulty: Macro expansion happens before the compiler sees the code,
making debugging harder.
○​ No Type Checking: Macros are text substitution, so they don't perform type
checking, which can lead to subtle bugs.
○​ Side Effects: Can lead to unexpected side effects due to repeated evaluation of
arguments.
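The repeated-evaluation pitfall is easy to demonstrate. A minimal sketch in C, using a hypothetical SQUARE macro:

#include <stdio.h>

#define SQUARE(x) ((x) * (x))   /* pure text substitution, no type checking */

int main(void) {
    int i = 3;
    /* Expands to ((i++) * (i++)): the argument is evaluated twice,
       which is undefined behavior in C -- the classic macro side effect. */
    printf("%d\n", SQUARE(i++));

    int j = 3;
    printf("%d\n", SQUARE(j + 1));  /* the extra parentheses keep this one correct: 16 */
    return 0;
}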
1.7. Debuggers
●​ Definition: Software tools that assist programmers in finding and fixing bugs (errors) in
their programs.
●​ Functionality:
○​ Breakpoints: Allow the programmer to pause program execution at specific lines of
code.
○​ Step-by-step Execution: Execute code line by line, or step over/into function calls.
○​ Variable Inspection: View and modify the values of variables and memory contents
at runtime.
○​ Call Stack Trace: Show the sequence of function calls that led to the current
execution point.
○​ Register Inspection: View the contents of CPU registers.
○​ Conditional Breakpoints: Break only when a certain condition is met.
●​ How they work: Debuggers typically insert "hooks" or modify the executable temporarily
to gain control over the program's execution, allowing them to monitor its state.
●​ Examples: GDB (GNU Debugger), Visual Studio Debugger, Eclipse Debugger, PyCharm
Debugger.

2. Basics of Operating Systems: Operating System Structure,


Operations and Services; System Calls, Operating-System Design and
Implementation; System Boot.
An Operating System (OS) is software that manages computer hardware and software
resources and provides common services for computer programs.

2.1. Operating System Structure


Different OS structures exist, each with its own advantages and disadvantages in terms of
complexity, efficiency, and security.
●​ Monolithic Structure:
○​ Description: The entire OS runs as a single program in kernel mode. All services (file
system, process management, memory management, device drivers) are tightly
integrated and run in the same address space.
○​ Advantages: High performance due to no overhead of inter-process communication.
○​ Disadvantages:
■​ Lack of Modularity: Difficult to develop, debug, and maintain.
■​ Less Robust: A bug in one component can crash the entire system.
■​ Security Risk: If one part is compromised, the entire kernel is vulnerable.
○​ Example: Early Unix, MS-DOS, Linux (though Linux has modular aspects with kernel
modules).
●​ Layered Structure:
○​ Description: The OS is divided into layers, with each layer built on top of lower
layers. A layer can only interact with the layers directly above and below it.
○​ Advantages:
■​ Modularity: Easier to design, implement, and debug.
■​ Abstraction: Each layer provides services to the layer above without revealing
implementation details.
○​ Disadvantages:
■​ Performance Overhead: Each layer crossing adds overhead.
■​ Difficulty in Defining Layers: Hard to strictly define and assign functionalities to
specific layers.
○​ Example: THE multiprogramming system, early Multics.
●​ Microkernel Structure:
○​ Description: The kernel is kept as small as possible, providing only essential services
like inter-process communication (IPC), basic memory management, and process
scheduling. Other OS services (file systems, device drivers, network protocols) run as
user-level processes.
○​ Advantages:
■​ Modularity and Extensibility: Easier to add new services without modifying the
kernel.
■​ Robustness: Failure in a user-level service doesn't crash the entire system.
■​ Security: Services run in user mode, limiting their access to critical kernel data.
■​ Portability: Easier to port to different hardware.
○​ Disadvantages:
■​ Performance Overhead: Increased overhead due to more context switching and
IPC between user-level processes and the kernel.
○​ Examples: Mach, QNX, MINIX.
●​ Modular Structure (Hybrid Kernel):
○​ Description: Combines aspects of monolithic and microkernel architectures. The
core kernel is monolithic, but it allows for dynamic loading and unloading of modules
(e.g., device drivers, file systems) as needed.
○​ Advantages: Flexibility of microkernel with performance closer to monolithic.
○​ Example: Linux (uses loadable kernel modules), Windows.

2.2. Operating System Operations and Services


The OS provides various services to users and programs, making the computer system
convenient to use and efficient.
●​ Operations:
○​ Program Execution: Loading a program into memory and running it.
○​ I/O Operations: Handling input/output devices (keyboard, mouse, printer, disk).
○​ File-system Manipulation: Creating, deleting, reading, writing files and directories.
○​ Communications: Facilitating communication between processes or between
computers.
○​ Error Detection: Detecting and handling hardware or software errors.
○​ Resource Allocation: Allocating CPU time, memory, storage, and I/O devices to
processes.
○​ Accounting: Keeping track of resource usage by users or processes.
○​ Protection and Security: Protecting system resources from unauthorized access
and ensuring data integrity.
●​ Services:
○​ User Interface (UI): Provides a way for users to interact with the OS (CLI, GUI,
Batch).
○​ Program Execution: Loads a program into memory and runs it.
○​ I/O Operations: Manages I/O devices, allowing programs to perform input/output
without direct hardware interaction.
○​ File-system Manipulation: Provides services for creating, deleting, reading, and
writing files and directories.
○​ Communications: Enables processes to exchange information (e.g., shared memory,
message passing).
○​ Error Detection: Monitors the system for errors (hardware, software, user program)
and takes appropriate action.
○​ Resource Allocation: Distributes resources among multiple users or processes.
○​ Accounting: Tracks which users use how much and what kinds of computer
resources.
○​ Protection and Security: Controls access to system resources and protects against
external threats.

2.3. System Calls


●​ Definition: The programmatic interface to the services provided by the operating
system. They are the only way for user-level programs to request services from the
kernel.
●​ Mechanism:
1.​ A user program executes a trap instruction (a special software interrupt).
2.​ This trap switches the CPU from user mode to kernel mode.
3.​ The OS identifies the requested service (via a system call number or table lookup).
4.​ The OS executes the requested service in kernel mode.
5.​ Upon completion, the OS returns control to the user program, switching back to user
mode.
●​ Categories of System Calls:
○​ Process Control: fork(), exec(), exit(), wait(), kill()
○​ File Management: open(), close(), read(), write(), delete()
○​ Device Management: ioctl() (for device-specific operations), read(), write()
○​ Information Maintenance: time(), getpid(), getppid()
○​ Communication: pipe(), shmget() (shared memory), socket()
○​ Protection: chmod() (change file permissions)
● System Call Interface: In practice, programs invoke system calls through an API layer
(e.g., the C standard library) that wraps the raw trap instruction, making the services easier
to use. For example, printf() in C ultimately uses the write() system call internally.
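As a minimal sketch of that layering, the following C program bypasses printf() and calls the write() wrapper directly (POSIX is assumed; file descriptor 1 is standard output):

#include <unistd.h>   /* declares the write() system-call wrapper */

int main(void) {
    /* Traps into the kernel: fd 1 (stdout), a buffer, and a byte count. */
    write(1, "hello, kernel\n", 14);
    return 0;
}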

2.4. Operating-System Design and Implementation


●​ Design Goals:
○​ User Goals: Convenience, ease of use, reliability, safety, speed.
○​ System Goals: Efficiency, easy to design, implement, and maintain, flexibility,
reliability, error-free.
●​ Design Approaches:
○​ Bottom-Up Approach: Start with hardware and build up layers of software.
○​ Top-Down Approach: Start with user requirements and break them down into
smaller components.
●​ Implementation:
○​ Languages: Historically, OS kernels were written in assembly language for
performance. Modern OS kernels (like Linux, Windows) are primarily written in C, with
some assembly for performance-critical sections (e.g., context switching, boot
code). Higher-level languages like C++ (for parts of Windows) or Rust (emerging in
Linux) are also used.
○​ Modularization: Breaking the OS into smaller, manageable modules is crucial for
complexity management.
○​ Kernel vs. User Mode: Strict separation between kernel (privileged) and user
(unprivileged) mode is fundamental for security and stability.

2.5. System Boot


The process of starting a computer and loading the operating system.
1.​ Bootstrap Loader (BIOS/UEFI):
○​ When the computer is powered on, the CPU executes a small piece of firmware code
stored in ROM (Read-Only Memory), typically the BIOS (Basic Input/Output
System) or its successor, UEFI (Unified Extensible Firmware Interface).
○​ This firmware performs a Power-On Self-Test (POST) to check basic hardware
components.
○​ It then initializes basic devices.
○​ Finally, it looks for a boot device (e.g., hard drive, SSD, USB) based on a predefined
boot order.
2. Master Boot Record (MBR) / GUID Partition Table (GPT):
○ On a BIOS system, the firmware loads the first sector of the boot device, the Master
Boot Record (MBR), which contains a small piece of boot-loader code plus the
partition table.
○ On a UEFI system, the firmware instead reads the GUID Partition Table (GPT) and
loads a boot-loader application from the EFI System Partition.
3.​ Boot Loader (e.g., GRUB, Windows Boot Manager):
○​ The boot loader's primary job is to find and load the actual operating system kernel
into memory.
○​ It often presents a boot menu, allowing the user to choose which OS to load (in
multi-boot systems).
○​ It also handles initial memory setup and may load other necessary OS components or
drivers.
○​ For example, GRUB (Grand Unified Bootloader) on Linux systems, or Windows Boot
Manager.
4.​ OS Kernel Loading:
○ The boot loader loads the OS kernel (e.g., vmlinuz for Linux, ntoskrnl.exe for
Windows) into RAM.
5.​ Kernel Initialization:
○​ Once the kernel is in memory, it takes control.
○​ It initializes its data structures, sets up memory management, initializes device
drivers, and creates the first user-level process (often init or systemd on Linux,
smss.exe on Windows).
6.​ User-level Services Start:
○​ The initial user-level process then starts other system services, daemons, and
ultimately the login manager or graphical user interface, making the system ready for
user interaction.

3. Process Management: Process Scheduling and Operations;


Interprocess Communication, Communication in Client–Server
Systems, Process Synchronization, Critical-Section Problem,
Peterson’s Solution, Semaphores, Synchronization.
3.1. Process
●​ Definition: A program in execution. It is an active entity, in contrast to a program, which
is a passive entity (just a file on disk).
●​ Components of a Process:
○​ Text Section: The program code.
○​ Data Section: Global variables.
○​ Heap: Dynamically allocated memory during runtime.
○​ Stack: Temporary data (function parameters, return addresses, local variables).
○​ Process Control Block (PCB): A data structure maintained by the OS for each
process, containing information like:
■​ Process state (new, ready, running, waiting, terminated).
■​ Program counter.
■​ CPU registers.
■​ CPU scheduling information.
■​ Memory-management information.
■ Accounting information.
■ I/O status information.

3.2. Process States


A process typically transitions through several states:
●​ New: The process is being created.
●​ Ready: The process is waiting to be assigned to a CPU.
● Running: Instructions are being executed by the CPU.
●​ Waiting: The process is waiting for some event to occur (e.g., I/O completion, receiving a
signal).
●​ Terminated: The process has finished execution.

3.3. Process Scheduling


●​ Definition: The activity of the OS to select which process should run next on the CPU.
●​ Scheduler: The part of the OS responsible for selecting processes.
●​ Types of Schedulers:
○​ Long-term Scheduler (Job Scheduler): Selects processes from the job pool and
loads them into memory for execution (controls the degree of multiprogramming).
○​ Short-term Scheduler (CPU Scheduler): Selects from the processes that are ready
to execute and allocates the CPU to one of them. Executes frequently.
○​ Medium-term Scheduler: Used in time-sharing systems to swap processes out of
memory (and back in) to reduce the degree of multiprogramming or improve the mix
of processes.

3.4. Process Operations


●​ Process Creation (fork()):
○​ A parent process creates a child process.
○​ The child process is a copy of the parent (including code, data, stack, heap), but gets
a new PID (Process ID).
○ The child typically then uses exec() to load a new program (a minimal sketch appears at the end of this section).
●​ Process Termination (exit()):
○​ A process can terminate normally (last statement executed) or abnormally (e.g., error,
kill() system call).
○​ The OS reclaims all resources (memory, open files, etc.) allocated to the process.
●​ Process Swapping: Moving a process (or parts of it) from main memory to secondary
storage (swap space) and vice versa, often done by the medium-term scheduler to
manage memory or improve response time.
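The minimal fork()/exec()/wait() sketch promised above (POSIX; assumes an ls command is on the PATH):

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();                /* create a child: a copy of this process */
    if (pid < 0) {
        perror("fork");
        exit(1);
    } else if (pid == 0) {
        /* Child: replace the copied image with a new program. */
        execlp("ls", "ls", "-l", (char *)NULL);
        perror("execlp");              /* reached only if exec fails */
        exit(1);
    } else {
        wait(NULL);                    /* parent blocks until the child exits */
        printf("child %d finished\n", (int)pid);
    }
    return 0;
}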

3.5. Interprocess Communication (IPC)


Mechanisms that allow independent processes to communicate and synchronize their actions.
●​ Shared Memory:
○​ Concept: Processes agree to share a region of memory. One process writes to this
region, and another reads from it.
○​ Advantages: Extremely fast once set up (no kernel involvement for data transfer).
○​ Disadvantages: Requires careful synchronization to avoid race conditions (e.g.,
using semaphores, mutexes). The OS only facilitates the initial setup.
○ Example: POSIX shared memory (shm_open, mmap), System V shared memory; a minimal sketch follows this list.
●​ Message Passing:
○​ Concept: Processes communicate by exchanging messages. The OS provides send()
and receive() primitives.
○​ Advantages: Simpler to implement for synchronization, inherently provides some
level of synchronization, easier in distributed systems.
○​ Disadvantages: Slower than shared memory due to kernel involvement for each
message transfer.
○​ Implementation:
■​ Direct Communication: Processes explicitly name each other.
■​ Indirect Communication: Messages are sent to and received from
mailboxes/ports.
○​ Buffering:
■​ Zero Capacity: No messages queued (sender waits for receiver).
■​ Bounded Capacity: Fixed size queue.
■​ Unbounded Capacity: Infinite queue.
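A minimal shared-memory sketch: an anonymous mmap() mapping created before fork() is inherited by the child, so both processes see the same physical page (POSIX/Linux assumed; here wait() stands in for real synchronization such as a semaphore):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    /* One page of memory shared between parent and child. */
    char *shared = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED) { perror("mmap"); return 1; }

    if (fork() == 0) {                       /* child writes ... */
        strcpy(shared, "hello from the child");
        return 0;
    }
    wait(NULL);                              /* order the accesses */
    printf("parent read: %s\n", shared);     /* ... parent reads */
    munmap(shared, 4096);
    return 0;
}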

3.6. Communication in Client–Server Systems


Common communication paradigms in distributed or networked systems:
●​ Sockets:
○​ Concept: An endpoint for sending or receiving data across a network. It's a
programming interface for network communication.
○​ Types: Stream sockets (TCP, reliable, connection-oriented) and Datagram sockets
(UDP, unreliable, connectionless).
○​ Usage: Widely used for client-server communication over IP networks.
●​ Remote Procedure Calls (RPC):
○​ Concept: Allows a program to execute a procedure (function) in a different address
space (typically on a remote computer) as if it were a local procedure.
○​ Mechanism:
1.​ Client makes a local call to a stub function.
2.​ The client stub marshals (packs) parameters into a message.
3.​ The message is sent over the network to the server.
4.​ The server stub unmarshals (unpacks) parameters and calls the actual server
procedure.
5.​ The server procedure executes and returns results.
6.​ The server stub marshals results and sends them back to the client.
7.​ The client stub unmarshals results and returns them to the client program.
○​ Advantages: Simplifies distributed programming by abstracting network details.
○​ Disadvantages: Latency, security issues, error handling can be complex.
●​ Pipes:
○​ Concept: A unidirectional (or sometimes bidirectional) communication channel that
allows two related processes (typically parent and child) to communicate.
○​ Types:
■​ Unnamed Pipes: Only for related processes (e.g., | in shell commands).
■​ Named Pipes (FIFOs): Can communicate between unrelated processes and
appear as a file in the file system.
○​ Usage: Simple, often used for streaming data between processes.
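A minimal unnamed-pipe sketch in C (POSIX): the child writes into one end, the related parent reads from the other.

#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int fd[2];                         /* fd[0] = read end, fd[1] = write end */
    char buf[64];

    if (pipe(fd) == -1) { perror("pipe"); return 1; }

    if (fork() == 0) {                 /* child: writer */
        close(fd[0]);
        write(fd[1], "data through the pipe", 21);
        close(fd[1]);
        return 0;
    }
    close(fd[1]);                      /* parent: reader */
    ssize_t n = read(fd[0], buf, sizeof(buf) - 1);
    buf[n > 0 ? n : 0] = '\0';
    printf("parent received: %s\n", buf);
    close(fd[0]);
    wait(NULL);
    return 0;
}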

3.7. Process Synchronization


●​ Concept: The coordination of processes to ensure that they execute correctly and
without unintended interactions, especially when accessing shared resources.
●​ Race Condition: A situation where multiple processes access and manipulate shared
data concurrently, and the final outcome depends on the specific order in which the
accesses take place. This leads to unpredictable and incorrect results.

3.8. Critical-Section Problem


●​ Definition: The problem of designing a protocol that ensures that when one process is
executing in its critical section (the part of the code that accesses shared resources), no
other process can be executing in its critical section.
●​ Requirements for a Solution:
1.​ Mutual Exclusion: Only one process can be in its critical section at any given time.
2. Progress: If no process is in its critical section and some processes want to enter,
then only those processes that are not in their remainder sections can participate in
the decision of which will enter its critical section next, and this selection cannot be
postponed indefinitely.
3. Bounded Waiting: There must be a limit on the number of times other processes
are allowed to enter their critical sections after a process has made a request to
enter its critical section and before that request is granted.
3.9. Peterson’s Solution
●​ Context: A classic software-based solution to the critical-section problem for two
processes.
●​ Variables:
○​ boolean flag[2]; // flag[i] is true if process i wants to enter its critical section.
○​ int turn; // Indicates whose turn it is to enter the critical section.
● Algorithm for Process Pi (Pj is the other process), in C:
do {
    flag[i] = TRUE;               // Pi wants to enter
    turn = j;                     // Give preference to the other process
    while (flag[j] && turn == j)
        ;                         // Wait while Pj wants to enter AND it is Pj's turn
    // Critical Section
    flag[i] = FALSE;              // Indicate that Pi is done with its critical section
    // Remainder Section
} while (TRUE);

●​ Proof: Satisfies all three requirements (Mutual Exclusion, Progress, Bounded Waiting).
● Limitation: Only works for two processes, requires busy waiting, and assumes loads and stores are not reordered (on modern hardware, memory barriers are needed for it to remain correct).

3.10. Semaphores
●​ Definition: A synchronization tool that provides a more sophisticated way for processes
to synchronize than simple flags. It's an integer variable that, apart from initialization, is
accessed only through two standard atomic operations: wait() (or P) and signal() (or V).
●​ Types:
○​ Counting Semaphore: Can take any non-negative integer value. Used for controlling
access to a resource with multiple instances.
○​ Binary Semaphore (Mutex): Can only be 0 or 1. Used for mutual exclusion (like a
lock).
●​ Operations:
○ wait(S), in C:
wait(S) {
    while (S <= 0)
        ;        // busy wait (spin lock) while S is not positive
    S--;
}
(Modern semaphores use a waiting queue instead of busy waiting to avoid wasting
CPU cycles.)
○ signal(S), in C:
signal(S) {
    S++;
}

●​ Usage for Mutual Exclusion:


○​ Initialize mutex = 1.
○​ wait(mutex) before critical section.
○​ signal(mutex) after critical section.
●​ Usage for Synchronization (e.g., Producer-Consumer Problem):
○​ empty (counting semaphore, initialized to buffer size)
○​ full (counting semaphore, initialized to 0)
○​ mutex (binary semaphore, initialized to 1)
○ Producer:
wait(empty);     // wait for a free slot
wait(mutex);
// Add item to buffer
signal(mutex);
signal(full);    // announce a filled slot

○ Consumer:
wait(full);      // wait for a filled slot
wait(mutex);
// Remove item from buffer
signal(mutex);
signal(empty);   // announce a free slot

●​ Disadvantages:
○​ Deadlock potential: Incorrect use can lead to deadlocks.
○​ Programming errors: Easy to misuse (e.g., forgetting signal() or wait(), swapped
order).
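A minimal mutual-exclusion sketch using POSIX semaphores with two Pthreads (Linux assumed; compile with -pthread). Without the sem_wait()/sem_post() pair, the counter updates would race:

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

sem_t mutex;                  /* binary semaphore guarding the counter */
long counter = 0;

void *worker(void *arg) {
    for (int i = 0; i < 100000; i++) {
        sem_wait(&mutex);     /* entry section: the P operation */
        counter++;            /* critical section */
        sem_post(&mutex);     /* exit section: the V operation */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    sem_init(&mutex, 0, 1);   /* initial value 1 gives mutual exclusion */
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld (expected 200000)\n", counter);
    sem_destroy(&mutex);
    return 0;
}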

3.11. Classic Synchronization Problems


●​ Bounded-Buffer Problem (Producer-Consumer Problem): Producers produce items
and put them into a shared buffer, consumers consume items from the buffer. Need to
ensure producers don't write to a full buffer and consumers don't read from an empty
buffer.
●​ Readers-Writers Problem: Multiple readers can access shared data concurrently, but
only one writer at a time can access it. No reader can access if a writer is active.
●​ Dining-Philosophers Problem: Illustrates the problem of deadlocks and resource
allocation in a concurrent system. Five philosophers sit around a table, each needing two
chopsticks to eat.

4. Threads: Multicore Programming, Multithreading Models, Thread


Libraries, Implicit Threading, Threading Issues.
4.1. Threads
●​ Definition: A thread, sometimes called a lightweight process (LWP), is a basic unit of
CPU utilization. It comprises a thread ID, a program counter, a register set, and a stack. It
shares the code section, data section, and other OS resources (like open files and
signals) with other threads belonging to the same process.
●​ Process vs. Thread:
○​ Process: Independent, has its own separate address space, heavy-weight, context
switching is expensive.
○​ Thread: Shares address space with other threads of the same process, lightweight,
context switching is less expensive.
●​ Benefits of Multithreading:
○​ Responsiveness: A program can remain responsive even if part of it is blocked or
performing a long operation (e.g., GUI applications).
○​ Resource Sharing: Threads within the same process share code and data, making
communication easier and faster than IPC.
○​ Economy: Cheaper to create and context-switch threads than processes.
○​ Scalability: Can take advantage of multi-core processors by running threads in
parallel.

4.2. Multicore Programming


●​ Concept: Utilizing multiple CPU cores available on a single chip to achieve parallel
execution.
●​ Challenges:
○​ Dividing Activities: How to partition computation into independent tasks.
○​ Balance: Ensuring tasks perform equal work to avoid one core waiting for another.
○​ Data Splitting: How to divide data accessed by tasks.
○​ Data Dependency: Ensuring tasks that depend on each other are synchronized.
○​ Testing and Debugging: More complex due to non-deterministic execution paths.
●​ Types of Parallelism:
○​ Data Parallelism: Distributing subsets of the same data across multiple cores, each
performing the same operation.
○​ Task Parallelism: Distributing tasks (threads) across multiple cores, each performing
a different operation on the same or different data.

4.3. Multithreading Models


Mapping user-level threads to kernel-level threads.
●​ Many-to-One Model:
○​ Concept: Many user-level threads are mapped to a single kernel thread.
○​ Advantages: Efficient (thread management done by user-level library).
○​ Disadvantages:
■​ If one thread makes a blocking system call, the entire process blocks.
■​ Cannot run in parallel on multi-core systems.
○​ Example: Green threads (early Java), Solaris Green threads.
●​ One-to-One Model:
○​ Concept: Each user-level thread maps to a separate kernel thread.
○​ Advantages:
■​ Allows true concurrency on multi-core systems.
■​ A blocking system call by one thread does not block the entire process.
○​ Disadvantages: Overhead of creating kernel threads (each kernel thread requires
kernel resources).
○​ Examples: Linux (Pthreads), Windows, Solaris 9+.
●​ Many-to-Many Model:
○​ Concept: Multiple user-level threads are mapped to a smaller or equal number of
kernel threads. Allows the OS to create enough kernel threads for the application to
run efficiently.
○​ Advantages: Combines benefits of both: efficient user-level thread management
and concurrency on multi-core systems.
○​ Disadvantages: More complex to implement.
○​ Example: Solaris prior to version 9.

4.4. Thread Libraries


Provide an API for creating and managing threads.
●​ Pthreads (POSIX Threads):
○​ Standard: A POSIX standard for thread creation and synchronization.
○​ Usage: Common in Unix-like operating systems (Linux, macOS).
○​ API: Includes functions for pthread_create(), pthread_exit(), pthread_join(),
pthread_mutex_lock(), pthread_cond_wait(), etc.
○ Implementation: Typically implemented as a one-to-one model (see the minimal sketch after this list).
●​ Windows Threads:
○​ API: Native thread API provided by the Windows operating system.
○​ Usage: Specific to Windows platforms.
○​ API: Includes functions like CreateThread(), ExitThread(), WaitForSingleObject(),
WaitForMultipleObjects(), etc.
○​ Implementation: One-to-one model.
●​ Java Threads:
○​ Concept: Managed by the Java Virtual Machine (JVM).
○​ Implementation: The JVM typically maps Java threads to underlying native OS
threads (one-to-one model).
○​ API: Thread class, Runnable interface, synchronized keyword, wait(), notify(),
notifyAll().
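The minimal Pthreads sketch promised above: create two threads and wait for them (compile with -pthread):

#include <pthread.h>
#include <stdio.h>

/* Each thread runs this function; arg carries its ID. */
void *hello(void *arg) {
    printf("hello from thread %d\n", *(int *)arg);
    return NULL;
}

int main(void) {
    pthread_t tid[2];
    int id[2] = {0, 1};
    for (int i = 0; i < 2; i++)
        pthread_create(&tid[i], NULL, hello, &id[i]);   /* create */
    for (int i = 0; i < 2; i++)
        pthread_join(tid[i], NULL);                     /* wait for exit */
    return 0;
}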

4.5. Implicit Threading


Approaches where the compiler and/or runtime libraries handle the creation and management
of threads, abstracting much of the explicit thread management from the programmer.
●​ Thread Pools:
○​ Concept: A pool of pre-created threads that are waiting for tasks to be assigned to
them. When a task arrives, a thread from the pool is used; when the task is complete,
the thread returns to the pool.
○​ Advantages:
■​ Reduces overhead of thread creation/destruction.
■​ Limits the number of active threads, preventing resource exhaustion.
■​ Better performance for tasks arriving frequently.
○​ Usage: Common in web servers, application servers, and concurrent programming
frameworks.
●​ OpenMP:
○​ Concept: An API for parallel programming on shared-memory multi-core
architectures. It uses compiler directives (pragmas in C/C++) to specify parallel
regions of code.
○​ Usage: For parallelizing loops and sections of code.
○ Example: #pragma omp parallel for (expanded in the sketch after this list).
●​ Grand Central Dispatch (GCD):
○​ Concept: A technology by Apple for managing concurrent operations in macOS and
iOS. It uses dispatch queues to schedule tasks.
○​ Usage: Simplifies concurrent programming by abstracting away the underlying
thread management.
●​ Intel Threading Building Blocks (TBB):
○​ Concept: A C++ template library for parallel programming that focuses on
task-based parallelism.
○​ Usage: Provides high-level constructs for parallel algorithms (e.g., parallel_for,
parallel_reduce).
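The OpenMP sketch promised above: a loop parallelized with a single directive (compile with a flag such as gcc -fopenmp; without it the pragma is ignored and the loop runs serially):

#include <stdio.h>

int main(void) {
    long sum = 0;
    /* The directive splits the iterations across cores; reduction(+:sum)
       gives each thread a private partial sum, combined at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= 1000000; i++)
        sum += i;
    printf("sum = %ld\n", sum);    /* 500000500000 */
    return 0;
}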

4.6. Threading Issues


●​ fork() and exec() System Calls:
○​ fork():
■​ fork() behavior with threads: Some UNIX systems have two versions of fork():
one that duplicates all threads (heavy-weight) and one that duplicates only the
calling thread.
■​ If all threads are duplicated, it can be inefficient. If only the calling thread is
duplicated, the new process might be in an inconsistent state if other threads
held locks or were accessing shared resources.
○​ exec(): When a process calls exec(), the program specified in the parameter to
exec() replaces the entire process, including all threads.
●​ Signal Handling:
○​ Concept: Signals are used to notify a process of an event (e.g., Ctrl+C, illegal
memory access).
○​ Issues in Multithreaded Programs:
■​ To which thread should the signal be delivered? (To the specific thread that
caused the event, to all threads, to certain threads, or to a specific designated
thread).
■​ Synchronous vs. Asynchronous signals.
○​ Solutions: Often, a dedicated thread is assigned to handle all signals for a process,
or signals are delivered to the thread that caused the event.
●​ Thread Cancellation:
○​ Concept: Terminating a thread before it has completed its task.
○​ Asynchronous Cancellation: One thread immediately terminates the target thread.
Can be unsafe if the target thread holds resources or is in a critical section.
○​ Deferred Cancellation: The target thread periodically checks if it should terminate
and terminates itself safely (e.g., releasing resources). This is generally preferred.
●​ Thread-Local Storage (TLS):
○​ Concept: Allows each thread to have its own copy of data, even if that data is
normally shared by all threads in the process. Useful for storing thread-specific
context.
○​ Purpose: Addresses the issue of thread-safety for global or static variables that are
not explicitly protected by synchronization mechanisms.
●​ Scheduler Activations (Many-to-Many Model):
○​ Concept: A communication scheme between the kernel and the user-level thread
library in a many-to-many model.
○​ Purpose: Allows the kernel to notify the user-level library about events (e.g., a kernel
thread blocking), and allows the user-level library to create or activate new kernel
threads as needed, optimizing the mapping.

5. CPU Scheduling: Scheduling Criteria and Algorithms; Thread


Scheduling, Multiple-Processor Scheduling, Real-Time CPU
Scheduling.
5.1. Basic Concepts
●​ CPU-I/O Burst Cycle: Process execution consists of a cycle of CPU execution (CPU
burst) and I/O wait (I/O burst). Most processes alternate between these.
●​ CPU Scheduler (Short-Term Scheduler): Selects a process from the ready queue and
allocates the CPU to it.
●​ Preemptive Scheduling: A running process can be interrupted and moved to the ready
state (e.g., if a higher-priority process arrives, or after a time slice expires).
●​ Non-Preemptive Scheduling: A running process keeps the CPU until it explicitly
releases it (e.g., terminates or enters a waiting state).
●​ Dispatcher: The module that gives control of the CPU to the process selected by the
short-term scheduler. Involves context switching.
●​ Dispatch Latency: The time it takes for the dispatcher to stop one process and start
another.

5.2. Scheduling Criteria


Metrics used to evaluate and compare CPU scheduling algorithms.
●​ CPU Utilization: Keep the CPU as busy as possible (range 0-100%).
●​ Throughput: Number of processes completed per unit time.
●​ Turnaround Time: Total time from submission to completion of a process (including
waiting in ready queue, execution, I/O).
●​ Waiting Time: Total time a process spends waiting in the ready queue.
● Response Time: Time from submission of a request until the first response is produced
(for interactive systems).
●​ Fairness: Ensure each process gets a fair share of the CPU.

5.3. Scheduling Algorithms


●​ First-Come, First-Served (FCFS):
○​ Concept: Processes are served in the order they arrive in the ready queue (FIFO).
○​ Type: Non-preemptive.
○​ Advantages: Simple to implement.
○​ Disadvantages:
■ Convoy Effect: Short processes stuck behind a long process lead to long
average waiting times (see the worked example after this list).
■​ Not suitable for time-sharing systems.
●​ Shortest-Job-First (SJF):
○​ Concept: The CPU is allocated to the process with the smallest next CPU burst.
○​ Type: Can be non-preemptive (once started, runs to completion) or preemptive (if
a new process arrives with a shorter burst than the remaining time of the current
process, the current process is preempted). Preemptive SJF is also called
Shortest-Remaining-Time-First (SRTF).
○​ Advantages: Provably optimal in terms of minimum average waiting time and
minimum average turnaround time.
○​ Disadvantages:
■​ Starvation: Long processes may never get to run if there's a continuous stream
of short processes.
■ Difficulty in Predicting Next CPU Burst: The exact length of the next burst
cannot be known in advance; it is usually estimated with exponential averaging,
τn+1 = α·tn + (1 − α)·τn, where tn is the latest measured burst and τn the previous
estimate.
●​ Priority Scheduling:
○​ Concept: Each process is assigned a priority, and the CPU is allocated to the
process with the highest priority.
○​ Type: Can be preemptive or non-preemptive.
○​ Advantages: Can prioritize important tasks.
○​ Disadvantages:
■​ Starvation (Indefinite Blocking): Low-priority processes may never execute if
there's a continuous stream of higher-priority processes.
■​ Solution: Aging: Gradually increasing the priority of processes that have been
waiting for a long time.
●​ Round Robin (RR):
○​ Concept: Each process gets a small unit of CPU time, called a time quantum (or
time slice), typically 10-100 milliseconds. After this time, the process is preempted
and added to the end of the ready queue.
○​ Type: Preemptive.
○​ Advantages: Fair, provides good response time for interactive systems.
○​ Disadvantages:
■​ High Context-Switching Overhead: If time quantum is too small.
■​ Poor Throughput: If time quantum is too small.
■​ Longer Average Turnaround Time: Compared to SJF.
○​ Choice of Time Quantum: Crucial. Too large degenerates to FCFS. Too small leads
to excessive overhead.
●​ Multilevel Queue Scheduling:
○​ Concept: Divides the ready queue into multiple separate queues, each with its own
scheduling algorithm. Processes are permanently assigned to a queue (e.g.,
foreground/interactive, background/batch).
○​ Example: Foreground queue (RR), Background queue (FCFS).
○​ Scheduling Between Queues: Can be fixed-priority preemptive scheduling (e.g.,
foreground always has priority) or time slicing (e.g., 80% CPU to foreground, 20% to
background).
●​ Multilevel Feedback Queue Scheduling:
○​ Concept: Allows processes to move between queues based on their CPU burst
characteristics. This prevents starvation and allows for different scheduling policies.
○​ Mechanism:
■​ Processes entering the system go to a high-priority queue with a small time
quantum.
■​ If a process uses its entire time quantum, it's moved to a lower-priority queue
with a larger quantum (or FCFS).
■​ Processes that wait too long in a lower-priority queue can be moved to a
higher-priority queue (aging).
○​ Advantages: Most flexible, can be tuned for various workloads.
○​ Disadvantages: Most complex to implement and configure.
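The worked convoy-effect example promised under FCFS: three processes arriving together with CPU bursts of 24, 3, and 3 ms. A minimal sketch in C comparing FCFS order with SJF order:

#include <stdio.h>

int main(void) {
    int fcfs[] = {24, 3, 3};    /* served in arrival order         */
    int sjf[]  = {3, 3, 24};    /* same jobs, shortest burst first */
    int wait_fcfs = 0, wait_sjf = 0, elapsed;

    elapsed = 0;
    for (int i = 0; i < 3; i++) { wait_fcfs += elapsed; elapsed += fcfs[i]; }
    elapsed = 0;
    for (int i = 0; i < 3; i++) { wait_sjf += elapsed; elapsed += sjf[i]; }

    printf("average waiting time, FCFS: %.1f ms\n", wait_fcfs / 3.0);  /* 17.0 */
    printf("average waiting time, SJF : %.1f ms\n", wait_sjf / 3.0);   /*  3.0 */
    return 0;
}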

5.4. Thread Scheduling


●​ Contention Scope:
○​ Process-Contention Scope (PCS): Scheduling competition happens among
threads within the same process. The user-level thread library schedules threads
onto available Lightweight Processes (LWPs) in the Many-to-Many model.
○​ System-Contention Scope (SCS): Scheduling competition happens among all
kernel threads on the system. The operating system's CPU scheduler directly
schedules kernel threads onto available CPUs. This is the default in One-to-One
models (e.g., Pthreads on Linux, Windows threads).

5.5. Multiple-Processor Scheduling


●​ Asymmetric Multiprocessing: One processor (master server) handles all scheduling
decisions, I/O, and other system activities. Other processors only execute user code.
Simple but creates a bottleneck.
●​ Symmetric Multiprocessing (SMP): Each processor is self-scheduling. All processors
have access to the ready queue and schedule processes concurrently.
○​ Challenges:
■​ Synchronization: Need to protect the shared ready queue from race conditions.
■​ Load Balancing: Distributing workload evenly across processors.
■ Push Migration: A periodic task checks the load on each processor and
pushes processes from overloaded processors to idle or less-busy ones.
■​ Pull Migration: Idle processors pull processes from busy processors.
■​ Processor Affinity: A process running on a specific processor populates its
cache. Migrating it to another processor invalidates this cache, leading to
performance penalties.
■​ Soft Affinity: OS tries to keep a process on the same processor but doesn't
guarantee it.
■​ Hard Affinity: A process can specify that it should run only on a specific set
of processors.

5.6. Real-Time CPU Scheduling


●​ Concept: Scheduling for systems where correctness depends not only on the logical
result but also on the time at which the results are produced. Used in industrial control,
robotics, medical imaging.
●​ Characteristics:
○​ Timeliness: Tasks must meet their deadlines.
○​ Predictability: Guarantees on worst-case execution time.
●​ Types:
○​ Hard Real-Time Systems: Strict deadlines. Missing a deadline is a system failure.
Requires static scheduling and guaranteed resources.
○​ Soft Real-Time Systems: Less strict deadlines. Missing a deadline results in
degraded performance, but not system failure. Priorities are typically used.
●​ Real-Time Scheduling Algorithms:
○​ Rate-Monotonic Scheduling (RMS):
■​ Concept: A static-priority algorithm where priorities are assigned inversely
proportional to their periods (tasks with shorter periods have higher priorities).
■​ Preemptive: Always preempts a lower-priority task if a higher-priority task
becomes ready.
■ Utilization Bound: Guarantees schedulability only up to a CPU utilization of
N(2^(1/N) − 1) for N tasks, which approaches ln 2 ≈ 69% as N grows (checked
numerically in the sketch after this list).
○​ Earliest-Deadline-First (EDF) Scheduling:
■​ Concept: A dynamic-priority algorithm where the task with the earliest deadline
has the highest priority.
■​ Optimality: Optimal in the sense that if a set of tasks can be scheduled by any
algorithm, it can be scheduled by EDF.
■​ Overhead: More overhead due to dynamic priority changes.
○​ Priority Inversion:
■​ Concept: A higher-priority task gets blocked by a lower-priority task holding a
resource that the higher-priority task needs.
■​ Solution: Priority-Inheritance Protocol: A lower-priority process inheriting the
priority of a higher-priority process when accessing a shared resource that the
higher-priority process needs.
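The RMS bound check promised above, as a minimal sketch in C with a hypothetical task set (Ci = execution time, Ti = period; compile with -lm):

#include <math.h>
#include <stdio.h>

int main(void) {
    double C[] = {1.0, 2.0, 3.0};     /* assumed execution times */
    double T[] = {4.0, 10.0, 20.0};   /* assumed periods         */
    int n = 3;

    double U = 0.0;
    for (int i = 0; i < n; i++)
        U += C[i] / T[i];                        /* total utilization: 0.60 */

    double bound = n * (pow(2.0, 1.0 / n) - 1);  /* ~0.780 for n = 3 */
    printf("U = %.3f, RMS bound = %.3f -> %s\n", U, bound,
           U <= bound ? "schedulable under RMS" : "no RMS guarantee");
    return 0;
}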

6. Deadlocks: Deadlock Characterization, Methods for Handling


Deadlocks, Deadlock Prevention, Avoidance and Detection; Recovery
from Deadlock.
6.1. Deadlock Characterization
A set of processes is in a deadlock state when every process in the set is waiting for a
resource that is held by another process in the set. For a deadlock to occur, all four of the
following conditions must hold simultaneously:
1. Mutual Exclusion: At least one resource must be held in a non-sharable mode; only one
process at a time can use the resource. If another process requests that resource, the
requesting process must be delayed until the resource has been released.
2. Hold and Wait: A process must be holding at least one resource and waiting to acquire
additional resources that are currently being held by other processes.
3. No Preemption: Resources cannot be preempted; that is, a resource can be released
only voluntarily by the process holding it, after that process has completed its task.
4. Circular Wait: A set of waiting processes {P0, P1, …, Pn} must exist such that P0 is
waiting for a resource held by P1, P1 is waiting for a resource held by P2, ..., Pn−1 is
waiting for a resource held by Pn, and Pn is waiting for a resource held by P0.

6.2. Resource-Allocation Graph


A directed graph used to describe deadlocks.
●​ Nodes: Processes (P) and Resource Types (R).
●​ Edges:
○ Request Edge: Pi → Rj (process Pi is requesting an instance of resource type Rj).
○ Assignment Edge: Rj → Pi (an instance of resource type Rj has been allocated to
process Pi).
● Cycles in the graph indicate potential deadlocks. If a cycle exists and each resource
type has only one instance, then a deadlock exists. If resource types have multiple
instances, a cycle indicates a possibility of deadlock.

6.3. Methods for Handling Deadlocks


There are three main approaches:
1.​ Deadlock Prevention: Design the system to prevent one of the four necessary
conditions from holding.
2.​ Deadlock Avoidance: Require the OS to have some a priori information about resource
requests and allocation to avoid deadlocks.
3.​ Deadlock Detection and Recovery: Allow deadlocks to occur, detect them, and then
recover from them.

6.4. Deadlock Prevention


●​ Prevent Mutual Exclusion:
○​ Not generally possible for truly non-sharable resources (e.g., a printer).
○​ Can be avoided for sharable resources (e.g., read-only files).
●​ Prevent Hold and Wait:
○​ Strategy 1: A process must request and be allocated all its resources before it begins
execution.
■​ Disadvantage: Low resource utilization, possible starvation.
○​ Strategy 2: A process can request resources only when it has none. It must release
all current resources before requesting new ones.
■​ Disadvantage: Increased waiting time, starvation.
●​ Prevent No Preemption:
○​ Strategy 1: If a process holding resources requests another resource that cannot be
immediately allocated, it must release all resources it currently holds.
○​ Strategy 2: If a process requests a resource and that resource is held by another
waiting process, the requested resource is preempted from the waiting process.
○​ Disadvantage: Difficult to implement, especially for resources whose state is hard to
save and restore (e.g., printers).
●​ Prevent Circular Wait:
○ Strategy: Impose a total ordering of all resource types, and require each process to
request resources in an increasing order of enumeration.
○​ Example: Assign numbers to resource types (e.g., R1=1, R2=2, R3=3). A process can
only request Rj if it's not holding Ri where i >= j.
○​ Advantages: Simple to implement.
○​ Disadvantages: May not be efficient, requires knowing all resources upfront.
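A minimal sketch of resource ordering with Pthreads mutexes (the helpers acquire_pair/release_pair are hypothetical): as long as every thread acquires locks in increasing index order, no circular wait can form.

#include <pthread.h>

/* Resources are numbered by their index; the ordering rule is:
   always lock the lower-numbered resource first. */
pthread_mutex_t locks[3] = {
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER
};

void acquire_pair(int a, int b) {
    int lo = a < b ? a : b, hi = a < b ? b : a;
    pthread_mutex_lock(&locks[lo]);   /* lower-numbered resource first */
    pthread_mutex_lock(&locks[hi]);
}

void release_pair(int a, int b) {
    pthread_mutex_unlock(&locks[a]);  /* release order does not matter */
    pthread_mutex_unlock(&locks[b]);
}

int main(void) {
    acquire_pair(2, 0);   /* locks 0 then 2, regardless of argument order */
    release_pair(2, 0);
    return 0;
}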

6.5. Deadlock Avoidance


Requires the system to have some additional information about the future resource requests
of each process. The OS ensures that the system never enters an unsafe state (a state from
which deadlock can potentially occur, even if no deadlock currently exists).
● Safe State: A state is safe if there exists a safe sequence of processes. A safe sequence
is an order of processes ⟨P1, P2, …, Pn⟩ such that for each Pi, the resources that Pi can still
request can be satisfied by the currently available resources plus the resources held by
all Pj where j < i. If the system is in a safe state, there is no deadlock. If it is in an unsafe
state, a deadlock might occur.
●​ Banker's Algorithm:
○​ Concept: A deadlock avoidance algorithm that works for a single instance of each
resource type or multiple instances.
○​ Information required:
■​ Available: Vector indicating the number of available instances of each resource
type.
■​ Max: Matrix defining the maximum demand of each process.
■​ Allocation: Matrix defining resources currently allocated to each process.
■​ Need: Matrix indicating remaining resources needed by each process (Max -
Allocation).
○​ Safety Algorithm: Determines if the current state is safe.
○​ Resource-Request Algorithm: Determines if a request can be granted immediately.
If granting the request leads to a safe state, it's granted; otherwise, the process
waits.
○​ Advantages: Less restrictive than prevention.
○​ Disadvantages: Requires knowing future resource needs, high overhead.
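A minimal sketch of the Banker's safety algorithm in C, run on the classic textbook snapshot (the matrices below are assumed example data, not from this text):

#include <stdbool.h>
#include <stdio.h>

#define P 5   /* processes      */
#define R 3   /* resource types */

int Available[R]     = {3, 3, 2};
int Allocation[P][R] = {{0,1,0},{2,0,0},{3,0,2},{2,1,1},{0,0,2}};
int Need[P][R]       = {{7,4,3},{1,2,2},{6,0,0},{0,1,1},{4,3,1}};

int main(void) {
    int work[R], sequence[P], count = 0;
    bool finish[P] = {false};

    for (int j = 0; j < R; j++) work[j] = Available[j];

    /* Repeatedly find an unfinished process whose Need fits in Work. */
    bool progress = true;
    while (progress) {
        progress = false;
        for (int i = 0; i < P; i++) {
            if (finish[i]) continue;
            bool fits = true;
            for (int j = 0; j < R; j++)
                if (Need[i][j] > work[j]) { fits = false; break; }
            if (fits) {            /* simulate Pi running to completion */
                for (int j = 0; j < R; j++) work[j] += Allocation[i][j];
                finish[i] = true;
                sequence[count++] = i;
                progress = true;
            }
        }
    }

    if (count == P) {              /* prints: safe sequence: P1 P3 P4 P0 P2 */
        printf("safe sequence:");
        for (int i = 0; i < P; i++) printf(" P%d", sequence[i]);
        printf("\n");
    } else {
        printf("state is unsafe\n");
    }
    return 0;
}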

6.6. Deadlock Detection and Recovery


●​ Detection: Allow deadlocks to occur, then periodically run an algorithm to detect if a
deadlock exists.
○ Detection Algorithm: For resource types with a single instance, a cycle in the
wait-for graph (a variant of the resource-allocation graph) means deadlock; for
multiple instances, an algorithm similar to the Banker's safety algorithm is used.
○​ When to invoke: Periodically (e.g., every few hours, or when CPU utilization drops
below a threshold), or whenever a resource request cannot be granted immediately.
●​ Recovery from Deadlock: Once a deadlock is detected, a recovery strategy is needed.
1.​ Process Termination:
■​ Terminate all deadlocked processes: Simple, but all work done by these
processes is lost.
■​ Terminate one process at a time until the deadlock is eliminated: More
complex, requires choosing which process to terminate.
■​ Criteria for choosing: Priority, time consumed, resources held, resources
needed, whether it's interactive or batch.
2.​ Resource Preemption:
■​ Preempt resources from one or more processes: The preempted resources
are given to other processes.
■​ Issues:
■​ Selecting a victim: Which resources/processes to preempt? (Similar criteria
as process termination).
■​ Rollback: If a resource is preempted, the process might need to be rolled
back to a safe state (e.g., checkpoints) and restarted from there. This
requires the ability to save and restore process states.
■​ Starvation: A process might be repeatedly chosen as a victim. (Solution:
Incorporate the number of rollbacks into the cost factor).

7. Memory Management: Contiguous Memory Allocation, Swapping,


Paging, Segmentation, Demand Paging, Page Replacement, Allocation
of Frames, Thrashing, Memory-Mapped Files.
7.1. Basic Concepts
●​ Memory Management Unit (MMU): Hardware device that maps virtual addresses to
physical addresses.
●​ Logical Address (Virtual Address): Address generated by the CPU.
●​ Physical Address: Address seen by the memory unit.
●​ Address Binding: The process of associating logical addresses with physical addresses.
Can happen at compile time, load time, or execution time. Execution time binding
requires hardware support (MMU) and allows for dynamic relocation.

7.2. Contiguous Memory Allocation


●​ Concept: Each process is loaded into a single, contiguous block of memory.
●​ Memory Partitions:
○​ Fixed-Sized Partitions: Memory divided into a fixed number of partitions.
■​ Advantages: Simple.
■​ Disadvantages: Internal fragmentation (unused space within a partition), limits
the number of processes, program size limit.
○​ Variable-Sized Partitions (Dynamic Allocation): Partitions are created dynamically
as processes arrive.
■​ Advantages: No internal fragmentation.
■​ Disadvantages: External fragmentation (small, scattered free blocks that are too
small for new processes), requires compaction.
●​ Memory Allocation Strategies for Variable-Sized Partitions:
○​ First Fit: Allocate the first hole that is big enough.
○​ Best Fit: Allocate the smallest hole that is big enough (leaves the smallest remaining
hole).
○​ Worst Fit: Allocate the largest hole (leaves the largest remaining hole).
○​ External Fragmentation: Total free memory exists to satisfy a request, but it's not
contiguous.
○​ Internal Fragmentation: Allocated memory is slightly larger than requested memory.

7.3. Swapping
●​ Concept: A process can be temporarily swapped out of main memory to a backing store
(secondary storage) and then swapped back into main memory for continued execution.
●​ Purpose: Allows multiple processes to share main memory, supports multiprogramming.
●​ Roll Out, Roll In: Swapping used for priority-based scheduling; lower-priority process is
swapped out to allow a higher-priority process to be loaded.
●​ Considerations:
○​ Swap Space: Dedicated area on disk.
○​ Transfer Time: Major component of swapping time.

7.4. Paging
● Concept: A non-contiguous memory allocation scheme that eliminates external
fragmentation. Physical memory is divided into fixed-sized blocks called frames. Logical
memory is divided into blocks of the same size called pages.
●​ Mechanism:
○​ When a process is loaded, its pages are scattered across available frames in physical
memory.
○​ A page table is used to translate logical addresses (page number, offset) to physical
addresses (frame number, offset).
○​ Page number: Used as an index into the page table.
○​ Page offset: Combined with the frame address to define the physical address.
●​ Advantages:
○​ No external fragmentation.
○​ Easy to allocate memory.
●​ Disadvantages:
○​ Internal Fragmentation: Occurs on the last page of a process (if the process size
isn't a multiple of page size).
○ Overhead of Page Table: Can be large for big address spaces.
● Page-table support structures (mitigating that overhead):
○ Translation Lookaside Buffer (TLB): A small, fast hardware cache for page table
entries, used to speed up address translation.
○ Hierarchical Paging: Breaking the page table into multiple levels to save memory.
○ Hashed Page Tables: For large address spaces (e.g., 64-bit).
○ Inverted Page Tables: One entry per physical frame, mapping frame to (process,
page).
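A minimal sketch of the page-number/offset arithmetic described above, with 4 KiB pages and a tiny hypothetical page table:

#include <stdint.h>
#include <stdio.h>

#define OFFSET_BITS 12                 /* 4 KiB pages -> 12-bit offset */
#define PAGE_SIZE   (1u << OFFSET_BITS)

unsigned page_table[] = {5, 9, 7, 3};  /* page_table[page] = frame number */

int main(void) {
    uint32_t logical = 0x2ABC;                       /* CPU-generated address */
    uint32_t page    = logical >> OFFSET_BITS;       /* page 2 */
    uint32_t offset  = logical & (PAGE_SIZE - 1);    /* offset 0xABC */
    uint32_t frame   = page_table[page];             /* page 2 -> frame 7 */
    uint32_t physical = (frame << OFFSET_BITS) | offset;
    printf("logical 0x%X -> page %u, offset 0x%X -> physical 0x%X\n",
           logical, page, offset, physical);         /* physical 0x7ABC */
    return 0;
}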

7.5. Segmentation
●​ Concept: A memory-management scheme that supports the user's view of memory. A
program is divided into segments, which are logical units (e.g., code, data, stack,
subroutines, arrays). Each segment can have a different size.
●​ Mechanism:
○​ A segment table is used to map 2D logical addresses (segment number, offset) to
physical addresses. Each entry in the segment table has a base address and a limit
(length) for the segment.
●​ Advantages:
○​ Supports user's view of memory.
○​ Allows for protection and sharing at the segment level.
●​ Disadvantages:
○​ External Fragmentation: Can occur because segments are variable-sized.
○​ More complex memory allocation and deallocation.

7.6. Demand Paging


●​ Concept: A variation of paging where pages are loaded into memory only when they are
needed (demanded) during program execution.
●​ Mechanism:
○​ When a process starts, only a few pages (or none) are loaded.
○​ If a process tries to access a page that is not in memory, a page fault occurs.
○​ The OS handles the page fault by bringing the required page from secondary storage
(swap space/disk) into a free frame in main memory.
○​ Pure Demand Paging: Start a process with no pages in memory.
●​ Advantages:
○​ Less I/O needed (only necessary pages are loaded).
○​ Less physical memory required.
○​ Faster response times.
○​ More users can be supported (higher degree of multiprogramming).
●​ Lazy Swapper: Demand paging can be seen as a swapper that never swaps a page into
memory unless it will be used.

7.7. Page Replacement


●​ Concept: When a page fault occurs and there are no free frames available, the OS must
choose a victim page in memory to be replaced (swapped out) to make space for the
new page.
●​ Goal: Minimize the number of page faults.
●​ Page Replacement Algorithms:
○​ First-In, First-Out (FIFO): Replaces the page that has been in memory the longest
(see the fault-counting sketch after this list).
■​ Anomaly: Belady's Anomaly (adding frames can increase the number of page faults).
○​ Optimal Page Replacement (OPT): Replaces the page that will not be used for the
longest period of time in the future.
■​ Advantages: Optimal (minimum page faults).
■​ Disadvantages: Impossible to implement in practice as it requires future
knowledge. Used as a benchmark.
○​ Least Recently Used (LRU): Replaces the page that has not been used for the
longest period of time.
■​ Advantages: Good approximation of OPT, based on the assumption that past
behavior predicts future behavior.
■​ Disadvantages: Requires hardware support (counters or stack) to track usage,
which can be expensive.
○​ Least Frequently Used (LFU): Replaces the page that has been used the least
often.
○​ Most Frequently Used (MFU): Replaces the page that has been used the most
often (based on the argument that a page with a small count was probably just
brought in and has yet to be used).
○​ Approximations of LRU:
■​ Reference Bit (Second-Chance/Clock Algorithm): Uses a reference bit for
each page. When a page is referenced, its bit is set. When a page needs to be
replaced, the OS scans pages; if the bit is 0, it's replaced; if 1, the bit is cleared,
and the page is given a second chance.
■​ Enhanced Second-Chance Algorithm: Considers both reference bit and
modify (dirty) bit to prefer replacing pages that are not modified and not recently
used.
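These algorithms are easiest to compare by replaying a reference string and counting faults. Below is a minimal FIFO fault-counting sketch; the reference string is the classic one used to demonstrate Belady's Anomaly, and the fixed 16-frame buffer is an assumption of this toy code.

Code snippet
#include <stdio.h>

/* Count page faults for FIFO replacement over a reference string.
   Frames form a circular buffer; the oldest page is the victim.
   Supports up to 16 frames (enough for this illustration). */
static int fifo_faults(const int *refs, int n, int nframes) {
    int frames[16], next = 0, faults = 0;
    for (int i = 0; i < nframes; i++) frames[i] = -1;   /* all frames empty */
    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < nframes; j++)
            if (frames[j] == refs[i]) { hit = 1; break; }
        if (!hit) {
            frames[next] = refs[i];          /* evict the oldest page (FIFO) */
            next = (next + 1) % nframes;
            faults++;
        }
    }
    return faults;
}

int main(void) {
    /* Classic reference string that exhibits Belady's Anomaly. */
    int refs[] = {1,2,3,4,1,2,5,1,2,3,4,5};
    int n = sizeof refs / sizeof refs[0];
    printf("3 frames: %d faults\n", fifo_faults(refs, n, 3));  /* prints 9  */
    printf("4 frames: %d faults\n", fifo_faults(refs, n, 4));  /* prints 10 */
    return 0;
}

Note that the run with more frames incurs more faults, which is exactly Belady's Anomaly from the FIFO item above.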

7.8. Allocation of Frames


●​ Concept: How to distribute the fixed number of available frames among competing
processes.
●​ Strategies:
○​ Fixed Allocation: Each process gets a fixed number of frames.
■​ Equal Allocation: All processes get an equal share of frames.
■​ Proportional Allocation: Processes get frames proportional to their size or
priority.
○​ Priority Allocation: If a process has a higher priority, it can steal frames from a
lower-priority process.
○​ Global vs. Local Replacement:
■​ Global Replacement: A process can select a replacement victim from all frames
in the system (i.e., from any process). Generally results in higher throughput.
■​ Local Replacement: A process can only select a replacement victim from its
own allocated frames. More consistent performance for individual processes.

7.9. Thrashing
●​ Definition: A phenomenon in virtual memory systems where a process spends more time
paging (swapping pages between memory and disk) than executing instructions. It
occurs when a process does not have enough frames to hold its actively used pages (its
working set).
●​ Symptoms: High page fault rate, low CPU utilization, a sharp drop in system throughput.
●​ Causes: Too many processes competing for memory, or a process requiring more frames
than it's currently allocated.
●​ Solution:
○​ Provide more frames: Increase the physical memory or decrease the degree of
multiprogramming.
○​ Working-Set Model: Tracks the set of pages a process is actively using (its working
set). The OS tries to ensure a process has enough frames for its working set before
allowing it to run.
○​ Page-Fault Frequency (PFF): If PFF is too high, allocate more frames; if too low,
deallocate frames.

7.10. Memory-Mapped Files


●​ Concept: Allows a part of a file on disk to be treated as if it were a region of main
memory. The file's contents are directly mapped into the process's virtual address space.
●​ Mechanism:
○​ The OS creates a virtual memory region corresponding to a file.
○​ When the process accesses an address within this region, the OS automatically
handles the page faults to bring the relevant file content into memory.
○​ Writes to the memory-mapped region are automatically flushed back to the disk file
(eventually).
●​ Advantages:
○​ Simplified I/O: Can read/write files using simple memory operations, avoiding explicit
read()/write() system calls.
○​ Efficient IPC: Can be used for interprocess communication by mapping the same file
into multiple processes' address spaces.
○​ Reduced Copying: Avoids multiple copies of data between user buffer, kernel buffer,
and disk.
●​ Usage: Often used for loading dynamic libraries, for efficient file I/O, and for
shared-memory IPC; a minimal mmap() sketch follows.
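Below is a minimal sketch of memory-mapped file I/O using the POSIX mmap() call. The file name data.txt is a hypothetical example, and error handling is kept to a minimum.

Code snippet
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.txt", O_RDONLY);      /* hypothetical input file */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);

    /* Map the whole file read-only into this process's address space. */
    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Plain memory reads now trigger demand paging of file contents;
       no explicit read() calls are needed. */
    fwrite(p, 1, st.st_size, stdout);

    munmap(p, st.st_size);
    close(fd);
    return 0;
}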

8. Storage Management: Mass-Storage Structure, Disk Structure,


Scheduling and Management, RAID Structure.
8.1. Mass-Storage Structure
●​ Overview: Secondary storage devices (hard drives, SSDs) are essential for long-term
storage of data and programs, as main memory is volatile.
●​ Characteristics: Non-volatile, cheaper per bit than main memory, slower access than
main memory.
●​ Types of Mass Storage:
○​ Hard Disk Drives (HDDs): Magnetic disks, spinning platters, read/write heads.
○​ Solid State Drives (SSDs): Flash memory, no moving parts, faster access times,
more durable, more expensive.
○​ Magnetic Tapes: Used for backup and archival storage, sequential access only.
○​ Optical Disks (CD/DVD/Blu-ray): Read-only, write-once, or rewritable; slower access.

8.2. Disk Structure


●​ Magnetic Disks:
○​ Platters: Circular metal or glass plates coated with magnetic material.
○​ Spindle: Rotates the platters at high speed (RPM).
○​ Tracks: Concentric circles on the surface of each platter.
○​ Sectors: Tracks are divided into sectors, which are the smallest unit of data transfer.
○​ Cylinders: A set of tracks that are at the same radial position on all platters.
○​ Read/Write Heads: One head per platter surface, mounted on a common arm,
moving in unison across the platters.
●​ Disk Addressing: Traditionally addressed by Cylinder, Head, Sector (CHS). Modern disks
use Logical Block Addressing (LBA) to simplify addressing; the disk controller maps
LBAs to physical CHS locations (a conversion sketch follows this list).
●​ Disk Performance Parameters:
○​ Seek Time: Time for the disk arm to move the heads to the cylinder containing the
desired sector.
○​ Rotational Latency: Time for the desired sector to rotate under the read/write head.
○​ Transfer Time: Time to actually transfer the data.
○​ Disk Bandwidth: Total number of bytes transferred divided by the total time
between the first request and the completion of the last transfer.
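The classic CHS-to-LBA mapping is LBA = (C × heads_per_cylinder + H) × sectors_per_track + (S − 1). Below is a minimal sketch of that formula; the geometry values (16 heads, 63 sectors per track) are assumptions for illustration, since real controllers hide the true geometry.

Code snippet
#include <stdio.h>

int main(void) {
    /* Assumed geometry, for illustration only. */
    const int HPC = 16;                   /* heads per cylinder   */
    const int SPT = 63;                   /* sectors per track    */

    int c = 2, h = 3, s = 10;             /* sector numbers are 1-based */

    /* LBA = (C * HPC + H) * SPT + (S - 1) */
    long lba = ((long)c * HPC + h) * SPT + (s - 1);
    printf("CHS(%d,%d,%d) -> LBA %ld\n", c, h, s, lba);   /* prints 2214 */
    return 0;
}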

8.3. Disk Scheduling and Management


Disk Scheduling: The process of deciding the order in which disk I/O requests are serviced to
minimize seek time and improve overall performance.
●​ Algorithms:
○​ FCFS (First-Come, First-Served): Serves requests in the order they arrive. Simple
but may result in long seek times (compared against SSTF in the sketch after this list).
○​ SSTF (Shortest-Seek-Time-First): Selects the request with the minimum seek time
from the current head position.
■​ Advantages: Good performance.
■​ Disadvantages: Can lead to starvation for requests far from the head.
○​ SCAN (Elevator Algorithm): The disk arm starts at one end of the disk and moves
toward the other end, servicing requests along the way. When it reaches the other
end, it reverses direction.
■​ Advantages: Prevents starvation, good for heavy load.
○​ C-SCAN (Circular SCAN): Similar to SCAN, but when the arm reaches one end, it
immediately returns to the beginning of the disk without servicing requests on the
return trip.
■​ Advantages: More uniform wait time compared to SCAN.
○​ LOOK and C-LOOK: Variations of SCAN and C-SCAN where the arm only goes as far
as the furthest request in the current direction, then reverses immediately (or jumps
back to the start for C-LOOK). Avoids unnecessary travel to the very end of the disk.
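The difference between these policies can be quantified as total head movement. The sketch below compares FCFS and SSTF on the classic textbook request queue with the head starting at cylinder 53; the queue values are an assumption chosen for illustration.

Code snippet
#include <stdio.h>
#include <stdlib.h>

/* Total head movement for FCFS: service requests in arrival order. */
static int fcfs(const int *req, int n, int head) {
    int moved = 0;
    for (int i = 0; i < n; i++) {
        moved += abs(req[i] - head);
        head = req[i];
    }
    return moved;
}

/* Total head movement for SSTF: always pick the closest pending request. */
static int sstf(const int *req, int n, int head) {
    int done[64] = {0}, moved = 0;        /* supports up to 64 requests */
    for (int served = 0; served < n; served++) {
        int best = -1;
        for (int i = 0; i < n; i++)
            if (!done[i] && (best < 0 ||
                abs(req[i] - head) < abs(req[best] - head)))
                best = i;
        moved += abs(req[best] - head);
        head = req[best];
        done[best] = 1;
    }
    return moved;
}

int main(void) {
    int req[] = {98, 183, 37, 122, 14, 124, 65, 67};
    int n = sizeof req / sizeof req[0];
    printf("FCFS: %d cylinders\n", fcfs(req, n, 53));   /* prints 640 */
    printf("SSTF: %d cylinders\n", sstf(req, n, 53));   /* prints 236 */
    return 0;
}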

Disk Management:
●​ Disk Formatting (Low-Level Formatting): Divides the disk into sectors, creating the
physical structure.
●​ Partitioning: Dividing the disk into one or more logical partitions, each appearing as a
separate disk.
●​ Logical Formatting (File System Creation): Creating a file system (e.g., FAT, NTFS,
ext4) on a partition, which involves creating data structures like free-space maps and
directory structures.
●​ Boot Block: Contains a small program (bootstrap loader) that loads the operating
system.
●​ Bad Blocks: Sectors that are permanently damaged.
○​ Bad-block recovery: Sector sparing/forwarding (remapping bad sectors to good
ones), sector slipping.

8.4. RAID Structure (Redundant Array of Independent Disks)


●​ Concept: A technology that combines multiple physical disk drives into a single logical
unit for data redundancy, performance improvement, or both.
●​ Goal: Improve reliability and/or performance over a single disk.
●​ Levels of RAID:
○​ RAID 0 (Striping):
■​ Concept: Data is split into blocks and written across multiple disks in parallel.
■​ Advantages: High performance (read/write), increased storage capacity.
■​ Disadvantages: No redundancy; failure of one disk means loss of all data.
○​ RAID 1 (Mirroring):
■​ Concept: Data is duplicated (mirrored) on two or more disks.
■​ Advantages: High data redundancy (fault tolerance), excellent read
performance.
■​ Disadvantages: High cost (requires double the disk space), slower write
performance.
○​ RAID 5 (Striping with Distributed Parity):
■​ Concept: Data is striped across multiple disks, and parity information is
distributed among all disks. If one disk fails, its data can be reconstructed using
the parity (see the sketch after this list).
■​ Advantages: Good balance of performance, capacity, and redundancy.
■​ Disadvantages: Write performance can be slower due to parity calculation and
updates.
○​ RAID 6 (Striping with Dual Parity):
■​ Concept: Similar to RAID 5, but with two independent parity blocks.
■​ Advantages: Can tolerate the failure of two disks simultaneously, higher data
reliability.
■​ Disadvantages: Higher overhead than RAID 5 (more disk space for parity, slower
writes).
○​ RAID 10 (RAID 1+0):
■​ Concept: A nested RAID level combining mirroring and striping. Data is striped
across mirrored pairs of disks.
■​ Advantages: High performance and high fault tolerance (can tolerate multiple
disk failures as long as they are not in the same mirrored pair).
■​ Disadvantages: High cost (requires double the disk space like RAID 1).
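RAID 5's single-failure recovery rests on XOR parity: if P = D0 ^ D1 ^ D2, then any one missing block equals the XOR of the surviving blocks. Below is a minimal sketch of that reconstruction; the 8-byte stripe size and block contents are assumptions for illustration, not a real controller implementation.

Code snippet
#include <stdio.h>
#include <string.h>

#define STRIPE 8   /* bytes per block, just for illustration */

/* XOR all surviving blocks together to rebuild the missing one:
   since P = D0 ^ D1 ^ D2, any single block is the XOR of the rest. */
static void rebuild(unsigned char *out,
                    unsigned char blocks[][STRIPE], int nblocks, int lost) {
    memset(out, 0, STRIPE);
    for (int b = 0; b < nblocks; b++)
        if (b != lost)
            for (int i = 0; i < STRIPE; i++)
                out[i] ^= blocks[b][i];
}

int main(void) {
    unsigned char d[4][STRIPE] = {   /* 3 data blocks + 1 parity block */
        "AAAAAAA", "BBBBBBB", "CCCCCCC", {0}
    };
    for (int i = 0; i < STRIPE; i++)             /* compute parity block */
        d[3][i] = d[0][i] ^ d[1][i] ^ d[2][i];

    unsigned char recovered[STRIPE];
    rebuild(recovered, d, 4, 1);                 /* pretend disk 1 failed */
    printf("recovered: %.7s\n", (char *)recovered);  /* prints BBBBBBB */
    return 0;
}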

9. File and Input/Output Systems: Access Methods, Directory and Disk


Structure; File-System Mounting, File Sharing, File-System Structure
and Implementation; Directory Implementation, Allocation Methods,
Free-Space Management, Efficiency and Performance; Recovery, I/O
Hardware, Application I/O Interface, Kernel I/O Subsystem,
Transforming I/O Requests to Hardware Operations.
9.1. File Concepts
●​ File: A named collection of related information that is recorded on secondary storage. It's
the smallest unit of logical data that can be written to secondary storage.
●​ File Attributes: Name, type, location, size, time created/modified/accessed, protection
(permissions).
●​ File Operations: Create, write, read, reposition (seek), delete, truncate.

9.2. Access Methods


How applications access data within a file.
●​ Sequential Access:
○​ Concept: Data is accessed in order, one record after another. Like reading a tape.
○​ Operations: read_next(), write_next(), reset().
○​ Usage: Common for batch processing, text files, logs.
●​ Direct Access (Random Access):
○​ Concept: Any record can be accessed directly without reading preceding records
(see the lseek() sketch after this list).
○​ Operations: read(n), write(n), seek(n) (n is block/record number).
○​ Usage: Databases, random data access.
●​ Indexed Sequential Access:
○​ Concept: Builds an index to the file, which allows for both sequential access and
direct access to specific records by searching the index.
○​ Usage: Large files where both direct access and sequential access are needed.
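Below is a minimal sketch of direct access over fixed-size records using the POSIX lseek() and read() calls. The file name records.dat and the 64-byte record size are assumptions chosen for illustration.

Code snippet
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define RECSIZE 64   /* assumed fixed record length */

int main(void) {
    int fd = open("records.dat", O_RDONLY);   /* hypothetical data file */
    if (fd < 0) { perror("open"); return 1; }

    char rec[RECSIZE];
    int n = 5;                                /* read the 6th record (index 5) */

    /* Direct access: jump straight to record n without
       reading records 0..n-1 first. */
    lseek(fd, (off_t)n * RECSIZE, SEEK_SET);
    if (read(fd, rec, RECSIZE) == RECSIZE)
        printf("record %d read directly\n", n);

    close(fd);
    return 0;
}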

9.3. Directory and Disk Structure


●​ Directory: A collection of nodes containing information about all files.
●​ Directory Operations: Search for a file, create a file, delete a file, list a directory, rename
a file, traverse the file system.
●​ Directory Structures:
○​ Single-Level Directory: All files in one directory. Simple but problems with naming
and grouping.
○​ Two-Level Directory: Each user has their own separate directory. Solves naming
conflicts, but no sharing between users.
○​ Tree-Structured Directories: Most common. Hierarchical structure (root directory,
subdirectories, files).
■​ Path names: Absolute (/a/b/c) and relative (b/c).
■​ Current Directory: The directory the user is currently "in."
○​ Acyclic-Graph Directories: Allows files/directories to have multiple parent
directories (using links).
■​ Hard Links: Multiple directory entries point to the same inode (file data).
■​ Soft (Symbolic) Links: A special file that contains the path to another file.
■​ Cycles: Links can introduce cycles, making directory traversal and garbage
collection more complex.
○​ General Graph Directory: Allows arbitrary links, including cycles. Most flexible but
hardest to manage.

9.4. File-System Mounting


●​ Concept: The process by which an operating system makes files and directories on a
storage device (or network share) available for users to access through the computer's
file system.
●​ Mount Point: An empty directory in the existing file system where the new file system is
attached.
●​ Mechanism: The OS records which device is mounted at which mount point. When a
path is resolved, if it enters a mount point, the OS redirects the lookup to the root of the
mounted file system.
●​ Unmounting: Detaching a file system from its mount point.

9.5. File Sharing


●​ Local File Sharing:
○​ User IDs (UIDs) and Group IDs (GIDs): Determine ownership and access
permissions.
○​ Access Control Lists (ACLs): More fine-grained permissions than traditional UNIX
permissions (rwx).
●​ Remote File Sharing (Network File Systems):
○​ NFS (Network File System): A distributed file system protocol developed by Sun
Microsystems, allowing users to access files over a network as if they were local.
○​ CIFS (Common Internet File System) / SMB (Server Message Block): Protocols
primarily used by Windows for network file sharing.
○​ Distributed File Systems: Provide transparent access to files stored on remote
servers (e.g., Google File System, HDFS).

9.6. File-System Structure and Implementation


●​ On-Disk Structures:
○​ Boot Control Block (per volume): Contains information needed to boot an OS from
that volume.
○​ Volume Control Block (superblock/master file table): Contains volume details
(number of blocks, block size, free-block count, etc.).
○​ Directory Structure: Organizes file names and their corresponding inodes/file
control blocks.
○​ File Control Block (FCB) / Inode (per file): Contains file attributes (permissions,
dates), file size, and pointers to data blocks.
●​ In-Memory Structures:
○​ Mount Table: Information about mounted file systems.
○​ System-Wide Open File Table: Contains a copy of the FCB for each open file, plus
an open count.
○​ Per-Process Open File Table: Contains a pointer to the entry in the system-wide
table, plus current file pointer.
●​ Layered File System Structure:
○​ Application Programs: Access files.
○​ Logical File System: Manages file control blocks, directory operations, protection.
○​ File-Organization Module: Maps logical blocks to physical blocks.
○​ Basic File System: Handles I/O requests for physical blocks.
○​ I/O Control: Device drivers.
○​ Devices: Hardware.

9.7. Directory Implementation


●​ Linear List: A list of file names and pointers to data blocks.
○​ Advantages: Simple.
○​ Disadvantages: Slow for searching, especially for large directories.
●​ Hash Table: A hash table is used to organize directory entries.
○​ Advantages: Faster lookups.
○​ Disadvantages: Collisions, fixed size.

9.8. Allocation Methods


How disk blocks are allocated to files.
●​ Contiguous Allocation:
○​ Concept: Each file occupies a set of contiguous blocks on the disk.
○​ Advantages: Simple, excellent for sequential access, good for direct access.
○​ Disadvantages: External fragmentation, difficult to grow files, requires contiguous
space.
●​ Linked Allocation:
○​ Concept: Each file is a linked list of disk blocks. Each block contains a pointer to the
next block.
○​ Advantages: No external fragmentation, files can grow dynamically.
○​ Disadvantages: Slow for direct access, unreliable (if a pointer is lost, the rest of the
file is lost), space overhead for pointers.
○​ File Allocation Table (FAT): A variation where pointers are stored in a separate table
(FAT) at the beginning of the volume. Improves direct access somewhat.
●​ Indexed Allocation:
○​ Concept: Each file has a separate index block that contains pointers to all its data
blocks.
○​ Advantages: Supports direct access, no external fragmentation, files can grow.
○​ Disadvantages: Space overhead for index blocks, potential for large index blocks
(solved by linked scheme for index blocks, multi-level indexing, or combined
scheme).
○​ Combined Scheme (e.g., UNIX Inodes): Uses a small number of direct pointers,
then single indirect, double indirect, and triple indirect pointers to handle very large
files; a worked capacity calculation follows this list.

9.9. Free-Space Management


How the OS keeps track of available disk blocks.
●​ Bit Map (Bit Vector): A bit vector where each bit represents a disk block: 0 for free, 1 for
allocated (a first-fit scan sketch follows this list).
○​ Advantages: Simple, efficient for finding contiguous blocks.
○​ Disadvantages: Can be large for large disks.
●​ Linked List (Free List): A linked list of all free blocks. Each free block contains a pointer
to the next free block.
○​ Advantages: Little space overhead, since the list pointers are stored inside the free blocks themselves.
○​ Disadvantages: Inefficient for finding contiguous blocks.
●​ Grouping: Store the addresses of a group of n free blocks in the first free block. The last
block in the group contains the address of the next group.
●​ Counting: Store the address of the first free block and the number of contiguous free
blocks following it.
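Below is a minimal sketch of a bitmap scan for the first free block, following the 0-free / 1-allocated convention used above; the bitmap contents are arbitrary example data.

Code snippet
#include <stdio.h>
#include <stdint.h>

/* One bit per disk block: 0 = free, 1 = allocated (as described above). */
static uint8_t bitmap[8] = { 0xFF, 0xFF, 0xDF, 0xFF, 0, 0, 0, 0 };

/* Return the number of the first free block, or -1 if none is free. */
static int first_free(const uint8_t *map, int nblocks) {
    for (int b = 0; b < nblocks; b++)
        if (!(map[b / 8] & (1u << (b % 8))))   /* bit clear -> block free */
            return b;
    return -1;
}

int main(void) {
    printf("first free block: %d\n", first_free(bitmap, 64));  /* prints 21 */
    return 0;
}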

9.10. Efficiency and Performance


●​ Disk Caching: Use main memory as a cache for disk blocks.
●​ Disk Buffers: Read-ahead caching, write-behind caching.
●​ Block Size: Larger blocks reduce internal fragmentation but increase I/O transfer size.
●​ File System Mounting Options: noatime, async, sync.
●​ Defragmentation: Reorganizing files on disk to make them contiguous and reduce seek
times.
●​ Metadata: Cached in memory to speed up file system operations.

9.11. Recovery (File System Consistency)


●​ Consistency Checking:
○​ Concept: Tools (e.g., fsck on UNIX, chkdsk on Windows) check for inconsistencies in
file system metadata (e.g., inconsistent free-block counts, invalid pointers).
○​ Mechanism: Scans directory and block allocation information to build a consistent
view and repair discrepancies.
●​ Journaling (Write-Ahead Logging):
○​ Concept: Before applying changes to the actual file system, the OS writes a record
of the intended changes to a special log (journal) on disk.
○​ Advantages: Fast recovery after a crash (only need to replay/rollback the journal, not
scan the entire file system).
○​ Types: Metadata journaling (journaling only metadata changes), data journaling
(journaling data and metadata).
●​ Snapshots: Creating a point-in-time copy of a file system, allowing for quick recovery to
a previous state.

9.12. I/O Hardware


●​ Device Controllers: Electronic circuits for managing a specific type of I/O device. They
have local buffer, registers, and specific logic to interact with the device.
●​ Ports: Interface for connecting devices to the system bus.
●​ Buses: Shared communication pathways (e.g., PCIe, USB, SATA).
●​ Memory-Mapped I/O: Device registers are mapped to memory addresses, allowing the
CPU to interact with devices using regular memory load/store instructions.
●​ I/O Instructions: Special instructions for interacting with I/O ports.
●​ Polling (Busy Waiting): CPU repeatedly checks a device's status register. Simple but
inefficient.
●​ Interrupts: Device notifies the CPU when I/O is complete or an error occurs.
○​ Interrupt Request Line (IRQ): Hardware line.
○​ Interrupt Vector: Table of pointers to interrupt service routines (ISRs).
○​ Interrupt Chaining: Multiple devices share an IRQ.
●​ Direct Memory Access (DMA):
○​ Concept: A special-purpose processor (DMA controller) handles data transfers
between I/O devices and main memory directly, without CPU intervention.
○​ Advantages: Frees up the CPU for other tasks, essential for high-speed I/O.
○​ Mechanism: CPU programs the DMA controller, then the DMA controller manages
the transfer, interrupting the CPU only when the entire transfer is complete.

9.13. Application I/O Interface


●​ System Calls: The primary way for applications to request I/O services from the OS (e.g.,
open(), close(), read(), write()).
●​ Device Drivers: Software modules that bridge the gap between the OS and specific
hardware devices. They present a uniform interface to the OS kernel, abstracting
device-specific details.
●​ Blocking I/O: A process requesting I/O is blocked until the I/O operation completes.
Simple to program.
●​ Non-Blocking I/O: I/O call returns immediately, even if the I/O is not yet complete. The
application must poll for completion.
●​ Asynchronous I/O: I/O call returns immediately, and the OS notifies the application (e.g.,
via a signal or callback) when the I/O completes. Allows concurrent computation and I/O.
●​ Buffering: Using memory regions to temporarily store data during I/O transfers,
smoothing out differences in data transfer rates.
●​ Caching: Storing copies of data in a faster storage medium (e.g., RAM for disk data) for
quicker access.
●​ Spooling: Holding output for a device (e.g., printer) that can only handle one job at a
time. A daemon places output in a buffer, and the device retrieves it from there.
9.14. Kernel I/O Subsystem
The part of the OS that manages and orchestrates I/O operations.
●​ Scheduling: Ordering I/O requests for optimal performance (e.g., disk scheduling).
●​ Buffering: Managing buffers for I/O data.
●​ Caching: Managing caches for I/O data.
●​ Spooling and Device Reservation: Handling print queues, exclusive device access.
●​ Error Handling: Detecting and recovering from I/O errors.
●​ Device Drivers: Loading, managing, and interacting with device-specific drivers.
●​ Device-Status Table: Tracks the state of each I/O device.
●​ Open-File Table: Keeps track of all open files.

9.15. Transforming I/O Requests to Hardware Operations


A high-level I/O request (e.g., read(file_descriptor, buffer, count)) goes through multiple layers
of abstraction to reach the hardware:
1.​ Application: Calls read() system call.
2.​ User-level Library: Standard C library (libc) might provide a wrapper for read().
3.​ System Call Interface: Traps to kernel mode.
4.​ Kernel I/O Subsystem:
○​ Checks permissions.
○​ Translates file descriptor to file control block (FCB).
○​ Determines logical block addresses from FCB.
○​ May involve buffering/caching.
○​ Calls appropriate device driver.
5.​ Device Driver:
○​ Translates logical block addresses to physical device addresses (e.g., cylinder, head,
sector for disk).
○​ Generates commands specific to the device controller.
○​ Programs the device controller registers.
○​ Initiates DMA transfer (if applicable).
6.​ Device Controller:
○​ Receives commands.
○​ Interacts with the actual hardware (e.g., moves disk arm, activates read/write heads).
○​ Transfers data to/from its internal buffer.
7.​ Hardware Device: Performs the physical operation.
8.​ Interrupt/Completion: Device controller generates an interrupt upon completion (or
error).
9.​ Interrupt Handler: Kernel's interrupt service routine processes the interrupt, updates
status, and potentially wakes up the blocked application process.
10.​Return to Application: Data is now in the user's buffer, and the read() system call
returns.

10. Security: Protection, Access Matrix, Access Control, Revocation of


Access Rights, Program Threats, System and Network Threats;
Cryptography as a Security Tool, User Authentication, Implementing
Security Defenses.
10.1. Protection
●​ Concept: Mechanisms that control access to resources (files, memory, CPU, devices) in a
computer system. It ensures that processes only access what they are authorized to
access.
●​ Goal: Enforce policies regarding resource usage, prevent accidental or malicious
interference, and maintain system integrity and data confidentiality.
●​ Domain of Protection: A set of (resource, access-right) pairs. Each process operates
within a protection domain, which specifies the resources it can access and the
operations it can perform.

10.2. Access Matrix


●​ Concept: A general model for representing protection in a system. It's a matrix where:
○​ Rows: Represent domains (subjects, processes, users).
○​ Columns: Represent objects (resources, files, devices, processes).
○​ Entries: Contain the set of access rights that a domain has for an object (e.g., read,
write, execute, delete); a toy lookup sketch follows this list.
●​ Implementation:
○​ Global Table: A single table storing all (domain, object, rights) triples. Too large and
sparse.
○​ Access Lists (for objects): Each object has a list of (domain, rights) pairs that
specify who can access it and what operations they can perform. Easier for
revocation. (Used for file permissions).
○​ Capability Lists (for domains): Each domain has a list of (object, rights) pairs that it
possesses. Easier to manage delegation of rights. (Less common in general-purpose
OS).
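A toy dense-matrix lookup is sketched below to make the model concrete. Real systems store the matrix sparsely as access lists or capability lists, and the domains, objects, and rights here are arbitrary examples.

Code snippet
#include <stdio.h>

/* Access rights as bit flags. */
enum { R = 1, W = 2, X = 4 };

#define NDOMAINS 2
#define NOBJECTS 3

/* Rows = domains, columns = objects; entries = granted rights. */
static const int matrix[NDOMAINS][NOBJECTS] = {
    { R | W, R,     0 },   /* domain 0 */
    { R,     R | X, W },   /* domain 1 */
};

/* Check whether a domain holds a given right on an object. */
static int allowed(int domain, int object, int right) {
    return (matrix[domain][object] & right) != 0;
}

int main(void) {
    printf("domain 0 write object 0: %s\n", allowed(0, 0, W) ? "yes" : "no");
    printf("domain 1 write object 1: %s\n", allowed(1, 1, W) ? "yes" : "no");
    return 0;
}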

10.3. Access Control


●​ Definition: The process of granting or denying specific requests to obtain or use a
resource.
●​ Discretionary Access Control (DAC):
○​ Concept: The owner of a resource can grant or revoke access rights to other users.
Most common model (e.g., Unix file permissions).
○​ Flexibility: High flexibility.
○​ Potential Issue: A malicious user can grant access to unauthorized users.
●​ Mandatory Access Control (MAC):
○​ Concept: The system enforces access policies based on security labels (e.g., "top
secret," "confidential") and clearances. No user or program can override these
policies.
○​ Usage: High-security environments (military, government).
○​ Strictness: More rigid, less flexible than DAC.
●​ Role-Based Access Control (RBAC):
○​ Concept: Permissions are associated with roles, and users are assigned to roles.
Users inherit the permissions of the roles they are assigned to.
○​ Advantages: Simplified management for large systems, easier to comply with
compliance policies.
○​ Usage: Enterprise systems.

10.4. Revocation of Access Rights


How to remove access rights from a user or process.
●​ Immediate Revocation: The access right is removed immediately.
○​ Access List: Simply delete the entry from the access list of the object.
○​ Capability List: More complex, as capabilities might be widely distributed. Requires
global invalidation, reacquisition, or selective revocation.
●​ Delayed Revocation: Access remains until the capability expires or the system is
rebooted.

10.5. Program Threats


Malicious programs designed to cause harm or gain unauthorized access.
●​ Virus: Attaches itself to legitimate programs and spreads to others, often causing
damage or disruption.
●​ Worm: Self-replicating malicious program that spreads across networks without human
intervention, often consuming network bandwidth.
●​ Trojan Horse: A seemingly legitimate program that performs malicious actions (e.g.,
backdoors, data theft) in addition to its advertised function.
●​ Logic Bomb: Malicious code embedded in a program that triggers when certain
conditions are met (e.g., specific date, user action).
●​ Time Bomb: A type of logic bomb activated by a specific date or time.
●​ Trap Door (Backdoor): A hidden entry point in a program that allows unauthorized
access, often left by developers for debugging or maintenance.
●​ Rootkit: A set of software tools that hides the presence of malware and unauthorized
access on a computer system.
●​ Spyware: Gathers information about a user's activity without their knowledge or
consent.
●​ Ransomware: Encrypts a user's files and demands a ransom for decryption.

10.6. System and Network Threats


Threats targeting the entire system or network infrastructure.
●​ Denial of Service (DoS) / Distributed DoS (DDoS): Overwhelming a system or network
with traffic to make it unavailable to legitimate users.
●​ Port Scanning: Scanning a network to identify open ports and services, looking for
vulnerabilities.
●​ Sniffing: Intercepting and analyzing network traffic to capture sensitive information.
●​ Man-in-the-Middle Attack: An attacker secretly relays and possibly alters the
communication between two parties who believe they are communicating directly.
●​ Spoofing: Faking identity (e.g., IP address, email address) to trick users or systems.
●​ Buffer Overflow: Exploiting a programming error where a program tries to write more
data into a buffer than it can hold, overwriting adjacent memory and potentially executing
malicious code.
●​ Privilege Escalation: An attacker gaining higher-level access than they are authorized
for.
●​ Social Engineering: Manipulating people into divulging confidential information or
performing actions that compromise security.

10.7. Cryptography as a Security Tool


●​ Concept: The practice and study of techniques for secure communication in the
presence of adversaries.
●​ Encryption: The process of converting information or data into a code to prevent
unauthorized access.
●​ Decryption: The process of converting encrypted data back into its original form.
●​ Algorithms:
○​ Symmetric-key Cryptography: Uses the same key for both encryption and
decryption (e.g., AES, DES). Fast but key distribution is a challenge.
○​ Asymmetric-key (Public-key) Cryptography: Uses a pair of keys: a public key for
encryption (and verification) and a private key for decryption (and signing). (e.g.,
RSA, ECC). Slower but solves key distribution and enables digital signatures.
●​ Hashing:
○​ Concept: A one-way function that takes an input and produces a fixed-size string of
characters (hash value/digest). Computationally infeasible to reverse.
○​ Usage: Data integrity verification (detect changes), password storage (store hash,
not actual password).
●​ Digital Signatures: Uses public-key cryptography to verify the authenticity and integrity
of a message or document.
●​ Certificates: Digital documents that bind a public key to an entity (e.g., person,
organization), issued by a trusted Certificate Authority (CA). Used to verify identity.

10.8. User Authentication


●​ Concept: Verifying the identity of a user or process.
●​ Methods:
○​ Passwords: Most common, but vulnerable to guessing, brute force, dictionary
attacks. Needs strong policies (length, complexity, regular changes).
○​ Two-Factor Authentication (2FA) / Multi-Factor Authentication (MFA): Requires
two or more distinct types of credentials (e.g., something you know (password) +
something you have (phone, token) + something you are (biometrics)).
○​ Biometrics: Using unique physical or behavioral characteristics (fingerprints, facial
recognition, iris scan).
○​ Smart Cards / Tokens: Physical devices that store cryptographic keys or generate
one-time passwords.
●​ Password Hashing: Passwords should always be stored as hashes (preferably with salt)
to prevent them from being recovered if the database is compromised.

10.9. Implementing Security Defenses


●​ Prevention:
○​ Software Design: Secure coding practices (input validation, avoiding buffer
overflows).
○​ Operating System Security: Access control mechanisms, memory protection,
kernel hardening.
○​ Firewalls: Network security system that monitors and controls incoming and
outgoing network traffic based on predefined security rules.
○​ Antivirus/Anti-malware: Detects and removes malicious software.
○​ Intrusion Prevention Systems (IPS): Actively block detected threats.
●​ Detection:
○​ Intrusion Detection Systems (IDS): Monitor network or system activities for
malicious activity or policy violations and generate alerts.
○​ Auditing and Logging: Recording security-related events for later review and
analysis.
●​ Response:
○​ Incident Response Plan: A predefined plan for handling security incidents.
○​ Backup and Recovery: Regular backups to restore data after an attack.
●​ System Updates and Patching: Regularly applying security patches to fix known
vulnerabilities.
●​ Principle of Least Privilege: Granting users and processes only the minimum necessary
permissions to perform their tasks.
●​ Security Auditing: Regular reviews of system configurations and security logs.

11. Virtual Machines: Types of Virtual Machines and Implementations;


Virtualization.
11.1. Virtual Machines
●​ Concept: An emulation of a computer system. Virtual machines (VMs) provide a
complete, isolated execution environment, including a virtual CPU, memory, storage, and
network interfaces.
●​ Purpose:
○​ Isolation: Each VM is isolated from others and from the host system, improving
security and stability.
○​ Resource Management: Resources can be allocated and managed dynamically.
○​ Portability: VMs can be easily moved between physical hosts.
○​ Consolidation: Run multiple OS instances on a single physical machine.
○​ Development & Testing: Create isolated environments for testing new software,
operating systems.
○​ Legacy Applications: Run older software requiring specific OS versions.

11.2. Virtualization
●​ Definition: The creation of a virtual (rather than actual) version of something, such as an
operating system, a server, a storage device, or network resources.
●​ Hypervisor (Virtual Machine Monitor - VMM):
○​ Concept: A layer of software that sits between the hardware and the virtual
machines. It creates and manages the virtual machines and allocates resources to
them.
○​ Types:
■​ Type 1 Hypervisor (Bare-Metal Hypervisor):
■​ Concept: Runs directly on the host hardware, managing the hardware
resources and providing a platform for VMs. No underlying host OS.
■​ Advantages: High performance, low overhead, greater security.
■​ Examples: VMware ESXi, Microsoft Hyper-V, Citrix XenServer.
■​ Type 2 Hypervisor (Hosted Hypervisor):
■​ Concept: Runs as an application on top of a conventional host operating
system. The host OS provides hardware access for the hypervisor.
■​ Advantages: Easier to set up, good for development/testing on a desktop.
■​ Disadvantages: Performance overhead due to host OS layer, less secure
than Type 1.
■​ Examples: VMware Workstation, VirtualBox, Parallels Desktop.

11.3. Implementations of Virtual Machines


●​ Process Virtual Machine (Application Virtualization):
○​ Concept: Provides a platform-independent runtime environment for a single
application. It creates a virtual instruction set and memory space for the application.
○​ Purpose: Allows software to run on different operating systems without modification.
○​ Examples: Java Virtual Machine (JVM), .NET Common Language Runtime (CLR).
●​ System Virtual Machine (Hardware Virtualization):
○​ Concept: Emulates an entire hardware system, allowing a complete operating system
(Guest OS) to run on it.
○​ Techniques:
■​ Full Virtualization:
■​ Concept: The hypervisor completely simulates the hardware, making it
unnecessary to modify the guest OS. The guest OS runs unaware that it's
virtualized.
■​ Advantages: Guest OS doesn't need modification, wide compatibility.
■​ Disadvantages: Performance overhead due to trapping and emulation,
especially for privileged instructions.
■​ Hardware-Assisted Virtualization: Modern CPUs (Intel VT-x, AMD-V)
provide hardware support to accelerate full virtualization by handling
privileged instructions directly, significantly reducing overhead.
■​ Paravirtualization:
■​ Concept: The guest OS is modified (ported) to include calls (hypercalls) to
the hypervisor, allowing it to directly communicate with the hypervisor for
privileged operations, rather than relying on emulation.
■​ Advantages: Near-native performance.
■​ Disadvantages: Requires modification of the guest OS.
■​ Examples: Xen (originally), KVM (Linux Kernel-based Virtual Machine) uses
both hardware-assisted and paravirtualization techniques.
■​ Operating-System-Level Virtualization (Containerization):
■​ Concept: Not a true VM, but provides isolation at the OS level. Multiple
isolated user-space instances (containers) run on a single shared kernel.
■​ Advantages: Very lightweight, fast startup, high density (many containers on
one host), low overhead.
■​ Disadvantages: Less isolation than full VMs (shared kernel), all containers
must run the same OS kernel.
■​ Examples: Docker, LXC (Linux Containers), Kubernetes (orchestration).
12. Linux Operating Systems: Design Principles, Kernel Modules,
Process Management, Scheduling, Memory Management, File
Systems, Input and Output; Interprocess Communication, Network
Structure.
12.1. Design Principles
●​ Monolithic Kernel (with Loadable Modules): While the core Linux kernel is monolithic
(most services run in kernel space), it supports dynamic loading and unloading of kernel
modules (e.g., device drivers, file system drivers). This provides flexibility without
sacrificing too much performance.
●​ Open Source: Developed and distributed under the GNU General Public License (GPL).
Community-driven development.
●​ UNIX-like: Adheres to POSIX standards, providing a familiar environment for Unix users
and compatibility for Unix applications.
●​ Portability: Designed to run on a wide range of hardware architectures.
●​ Multitasking and Multiuser: Supports multiple processes running concurrently and
multiple users logged in simultaneously.
●​ Modularity: Despite being monolithic, it's highly modular internally.
●​ Security: Strong emphasis on security through permissions, user/group management,
and security modules like SELinux/AppArmor.

12.2. Kernel Modules


●​ Concept: Pieces of code that can be loaded into and unloaded from the kernel on the fly,
without requiring a system reboot.
●​ Purpose: Extend kernel functionality (e.g., new device drivers, file systems, network
protocols).
●​ Advantages:
○​ Reduces kernel size.
○​ Dynamic addition/removal of features.
○​ Easier development and debugging of new drivers.
●​ Tools: insmod (insert module), rmmod (remove module), lsmod (list loaded modules),
modprobe (intelligent module loading/unloading). A minimal module skeleton is sketched below.
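The skeleton below is a minimal loadable module that only logs messages at load and unload time; it assumes a standard out-of-tree module Makefile and the matching kernel headers (not shown).

Code snippet
/* hello.c - minimal loadable kernel module skeleton. */
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Minimal example module");

static int __init hello_init(void)   /* runs at insmod/modprobe time */
{
    pr_info("hello: module loaded\n");
    return 0;                        /* 0 = success */
}

static void __exit hello_exit(void)  /* runs at rmmod time */
{
    pr_info("hello: module unloaded\n");
}

module_init(hello_init);
module_exit(hello_exit);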

12.3. Process Management


●​ Process Representation: Each process is represented by a task_struct data structure in
the kernel.
●​ Process Creation: Uses the fork() system call (creates a copy of the parent process)
followed by execve() (replaces the process image with a new program); see the sketch at
the end of this subsection.
●​ Process States: Similar to general OS concepts (running, waiting, stopped, zombie,
exited).
●​ Process Hierarchy: Processes form a tree-like hierarchy, with init (or systemd) as the
root.
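Below is a minimal sketch of the fork-then-exec pattern described above; the choice of ls -l as the new program image is arbitrary, and it uses the execlp() wrapper from the exec family rather than the raw execve() call.

Code snippet
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();              /* duplicate the calling process */
    if (pid < 0) { perror("fork"); exit(1); }

    if (pid == 0) {                  /* child: replace image with ls -l */
        execlp("ls", "ls", "-l", (char *)NULL);
        perror("execlp");            /* reached only if exec fails */
        _exit(127);
    }

    waitpid(pid, NULL, 0);           /* parent waits for the child to exit */
    return 0;
}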

12.4. Scheduling
●​ Scheduler: The Linux scheduler has evolved over time.
○​ O(1) Scheduler: Optimized for large numbers of processes, but had issues with
interactivity.
○​ Completely Fair Scheduler (CFS): (Since kernel 2.6.23)
■​ Concept: Aims to provide fair allocation of CPU time to all processes. It doesn't
use fixed time slices, but rather a "virtual runtime" (vruntime) for each task.
■​ Mechanism: Tasks with lower vruntime (meaning they have run less) are scheduled
next; each task effectively receives a share of CPU time proportional to its weight (priority).
■​ Red-Black Tree: The ready queue is maintained as a red-black tree, allowing
efficient insertion and selection of the task with the smallest vruntime.
●​ Priorities: Linux uses both static (nice values) and dynamic priorities. Lower nice values
mean higher priority.
●​ Preemptive: Linux scheduler is preemptive.

12.5. Memory Management


●​ Virtual Memory: Uses demand paging. Each process has its own virtual address space.
●​ Paging: Divides logical memory into pages and physical memory into frames. Uses a
multi-level page table.
●​ Swap Space: Uses dedicated swap partitions or swap files on disk to extend physical
memory.
●​ Page Replacement: Primarily uses a variation of the LRU algorithm, often based on two
lists (active and inactive lists) to manage pages.
●​ Buddy System: A kernel memory allocation algorithm used for allocating physical
memory pages to kernel components.
●​ Slab Allocator: Used for efficient allocation and deallocation of frequently used kernel
data structures.

12.6. File Systems


●​ Virtual File System (VFS):
○​ Concept: A software layer in the kernel that provides a common interface for
different concrete file systems. It abstracts the underlying file system
implementations.
○​ Purpose: Allows Linux to support a wide variety of file systems (ext4, XFS, Btrfs,
NTFS, FAT, NFS, etc.) transparently.
●​ Common File Systems:
○​ ext2/ext3/ext4: Extended file systems, default for many Linux distributions. ext3 and
ext4 add journaling for faster recovery.
○​ XFS: High-performance, journaling file system, good for large files and directories.
○​ Btrfs (B-tree file system): Modern file system with advanced features like
snapshots, checksums, volume management, and self-healing.
●​ Inode: A data structure that stores information about a file or directory (metadata) in a
Unix-like file system. It contains pointers to data blocks.
●​ Mounting: File systems are mounted at specific mount points in the directory tree.

12.7. Input and Output


●​ Device Files: All devices are represented as files in the /dev directory (character devices
for sequential access, block devices for random access).
●​ Device Drivers: Kernel modules that control specific hardware devices.
●​ Buffered I/O: Data is buffered in kernel memory during I/O operations.
●​ DMA: Used for high-speed data transfers between devices and memory.

12.8. Interprocess Communication (IPC)


●​ Pipes: Unnamed pipes (for related processes) and named pipes (FIFOs, for unrelated
processes); a minimal pipe() sketch follows this list.
●​ Message Queues: Allow processes to exchange messages through a kernel-managed
queue.
●​ Shared Memory: Allows processes to share a region of memory, providing fast
communication (requires explicit synchronization).
●​ Semaphores: For synchronization.
●​ Sockets: For network-based communication, both local and remote.
●​ Signals: For notifying processes of events.
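Below is a minimal sketch of an unnamed pipe between a parent and child process; the message text is arbitrary and error handling is abbreviated.

Code snippet
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int fd[2];                       /* fd[0] = read end, fd[1] = write end */
    char buf[32];

    if (pipe(fd) == -1) { perror("pipe"); return 1; }

    if (fork() == 0) {               /* child: writer */
        close(fd[0]);                /* close unused read end */
        write(fd[1], "hello", 6);    /* 6 bytes includes the terminator */
        close(fd[1]);
        _exit(0);
    }

    close(fd[1]);                    /* parent: reader; close write end */
    read(fd[0], buf, sizeof buf);
    printf("parent received: %s\n", buf);
    close(fd[0]);
    wait(NULL);                      /* reap the child */
    return 0;
}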

12.9. Network Structure


●​ TCP/IP Stack: Implements the full TCP/IP protocol suite.
●​ Network Interfaces: Managed by device drivers.
●​ Sockets: The primary API for network programming.
●​ Routing: Kernel handles routing of network packets.
●​ Netfilter (iptables/nftables): Firewall framework within the kernel for packet filtering
and manipulation.

13. Windows Operating Systems: Design Principles, System


Components, Terminal Services and Fast User Switching; File System,
Networking.
13.1. Design Principles
●​ Hybrid Kernel: Combines aspects of monolithic and microkernel design. Most core
services run in kernel mode for performance, while other components (such as
environment subsystems and some user-mode drivers) run as user-mode processes for robustness.
●​ Layered Architecture: Structurally designed in layers, though not strictly enforced as in
a classic layered model.
●​ Object-Oriented: Designed around objects (processes, threads, files, etc.) and object
managers for consistency and security.
●​ Portability: Designed to run on various CPU architectures (initially Intel, now primarily
x86/x64, ARM).
●​ Extensibility: Designed to be easily extended through new APIs and drivers.
●​ Reliability & Security: Strong emphasis on stability, robustness, and a comprehensive
security model from the ground up.
●​ Compatibility: Backward compatibility with older Windows applications.

13.2. System Components


●​ Hardware Abstraction Layer (HAL):
○​ Concept: A layer of software that hides hardware differences from the kernel and
device drivers.
○​ Purpose: Improves portability by abstracting platform-specific hardware details.
●​ Kernel (ntoskrnl.exe): The core of the operating system, responsible for:
○​ Executive: Provides core OS services (object manager, process manager, memory
manager, I/O manager, security reference monitor, cache manager, plug-and-play
manager, power manager).
○​ Kernel proper: Basic services like thread scheduling, interrupt handling, and
low-level synchronization.
●​ Environment Subsystems: Provide an API to run applications from different operating
system environments (e.g., Win32 subsystem for native Windows applications, POSIX
subsystem for Unix-like applications, OS/2 subsystem - largely deprecated).
●​ User Mode Processes:
○​ System Processes: Critical processes for OS functioning (e.g., smss.exe - Session
Manager Subsystem, lsass.exe - Local Security Authority Subsystem Service).
○​ Service Processes: Long-running background processes (e.g., Print Spooler,
Network services).
○​ User Applications: User-level programs.
●​ Device Drivers: Loadable modules that interact with hardware, running in kernel mode.

13.3. Terminal Services and Fast User Switching


●​ Terminal Services (Remote Desktop Services):
○​ Concept: Allows multiple users to simultaneously access and run applications on a
single Windows server remotely. Each user gets an independent desktop session.
○​ Purpose: Centralized application deployment, remote work, thin client environments.
○​ Mechanism: Uses the Remote Desktop Protocol (RDP) to transmit screen updates,
keyboard, and mouse input.
●​ Fast User Switching:
○​ Concept: Allows multiple users to be logged on to a single machine concurrently,
switching between active user sessions without requiring one user to log off before
another can log on.
○​ Purpose: Convenience for multiple users sharing a single PC.
○​ Mechanism: Each user's session remains active in memory, allowing quick switching.

13.4. File System (NTFS - New Technology File System)


●​ Key Features:
○​ Journaling: All metadata changes are recorded in a log (journal) before being
applied to the file system, ensuring data integrity and fast recovery after crashes.
○​ Security: Supports fine-grained access control lists (ACLs) for files and directories.
○​ Reliability: Self-healing capabilities, transaction logging.
○​ Large File and Volume Support: Handles very large files and disk volumes
efficiently.
○​ File Compression and Encryption: Built-in support for compressing and encrypting
files.
○​ Hard Links and Junction Points: Similar to Unix hard links and symbolic links.
○​ Alternate Data Streams (ADS): Allows multiple data streams to be associated with a
single file.
○​ Change Journal: Tracks changes to files and directories.
○​ Master File Table (MFT): The core data structure of NTFS, storing metadata about
all files and directories on the volume.
●​ Comparison to FAT: NTFS is far more robust, secure, and feature-rich than the older FAT
(File Allocation Table) file systems.

13.5. Networking
●​ Integrated Networking: Networking capabilities are deeply integrated into the OS.
●​ Network Stack: Implements the TCP/IP protocol suite (and others like NetBEUI, IPX/SPX
historically).
●​ Sockets (Winsock): The primary API for network programming on Windows.
●​ Network Device Interface Specification (NDIS): A standard API for network card
drivers, providing a uniform interface to the network stack.
●​ DNS (Domain Name System): Used for name resolution.
●​ Active Directory: A directory service developed by Microsoft for Windows domain
networks. Manages permissions and network access to resources.
●​ Firewall (Windows Defender Firewall): Built-in firewall for packet filtering.
●​ Remote Access: Supports various VPN protocols and remote access technologies.

14. Distributed Systems: Types of Network based Operating Systems,


Network Structure, Communication Structure and Protocols;
Robustness, Design Issues, Distributed File Systems.
14.1. Distributed Systems
●​ Definition: A collection of independent computers that appears to its users as a single
coherent system. They communicate with each other by passing messages over a
communication network.
●​ Characteristics:
○​ Transparency: Users are unaware of the distribution of resources.
○​ Concurrency: Multiple tasks can execute simultaneously.
○​ Openness: Easy to extend and integrate new components.
○​ Scalability: Can be easily expanded to handle increasing loads.
○​ Fault Tolerance: Can continue to function even if some components fail.
○​ Resource Sharing: Resources are shared across the network.

14.2. Types of Network-Based Operating Systems


●​ Network Operating Systems (NOS):
○​ Concept: Each computer in the network runs its own independent operating system
(e.g., Windows, Linux). The OS is network-aware, providing services for sharing files,
printers, etc., over the network.
○​ Transparency: Low transparency; users explicitly know they are accessing remote
resources.
○​ Management: Management is typically per-machine.
○​ Examples: Most modern OSs when used in a networked environment, such as a
traditional client-server setup where clients run Windows and servers run Windows
Server or Linux.
●​ Distributed Operating Systems (DOS):
○​ Concept: A truly integrated operating system that manages a collection of
networked computers as a single, unified system. Users are unaware of the physical
distribution.
○​ Transparency: High transparency; resources are accessed as if they are local.
○​ Management: Single system image, central management.
○​ Challenges: Extremely complex to design and implement, difficult to achieve true
transparency and fault tolerance.
○​ Examples: Amoeba, Mach (kernel for some distributed OSs). Research systems, less
common in commercial use as fully integrated DOS.
●​ Middleware-Based Systems:
○​ Concept: The most common approach today. Applications run on local OSs, but a
middleware layer (e.g., CORBA, Java RMI, web services, message queues) provides a
uniform interface and abstraction for distributed communication and resource
access.
○​ Advantages: Combines the robustness of individual OSs with the benefits of
distributed computing.
○​ Transparency: Provides a good level of transparency at the application level.
○​ Examples: Cloud computing platforms, enterprise application integration using
technologies like Kafka, RabbitMQ, gRPC.

14.3. Network Structure, Communication Structure and Protocols


●​ Network Structure (Topology):
○​ LAN (Local Area Network): Connects devices in a limited geographical area.
○​ WAN (Wide Area Network): Connects devices over a large geographical area.
○​ Protocols: Ethernet, Wi-Fi.
○​ Topologies: Bus, Star, Ring, Mesh.
●​ Communication Structure:
○​ Client-Server: The most common paradigm. Clients request services, servers
provide them.
○​ Peer-to-Peer (P2P): All nodes are equal, can act as both clients and servers. No
centralized control. (e.g., BitTorrent).
○​ Message Passing: Fundamental communication primitive.
●​ Protocols:
○​ TCP/IP (Transmission Control Protocol/Internet Protocol): The fundamental suite
of protocols for the internet and most modern networks.
■​ TCP: Connection-oriented, reliable, ordered, flow control, congestion control.
■​ UDP (User Datagram Protocol): Connectionless, unreliable, faster.
○​ HTTP (Hypertext Transfer Protocol): For web communication.
○​ RPC (Remote Procedure Call): Allows a program to execute a function on a remote
machine as if it were local.
○​ Message Queues: Asynchronous communication mechanism where messages are
stored in a queue until consumed.

14.4. Robustness
●​ Concept: The ability of a system to cope with errors during execution and gracefully
handle unexpected conditions or failures.
●​ Techniques for Robustness in Distributed Systems:
○​ Redundancy: Duplicating components (data, services, hardware) to provide backup
in case of failure.
■​ Data Redundancy: Replicating data across multiple nodes.
■​ Service Redundancy: Running multiple instances of a service.
○​ Fault Tolerance: Designing systems to continue operating correctly despite faults.
■​ Fail-stop: A component stops cleanly on failure, notifying other components.
■​ Fail-fast: A component quickly detects and reports failure.
○​ Consistency Models: Ensuring data consistency across replicated data in the face
of failures (e.g., strong consistency, eventual consistency).
○​ Failure Detection: Mechanisms to detect failed nodes or processes (e.g.,
heartbeats).
○​ Recovery: Procedures to restore a failed component to a consistent state.
○​ Checkpointing and Rollback: Periodically saving the state of processes/system to
allow rollback to a consistent state after a failure.

14.5. Design Issues in Distributed Systems


●​ Transparency:
○​ Location Transparency: Users don't know where resources are located.
○​ Access Transparency: Accessing local and remote resources is the same.
○​ Migration Transparency: Resources can move without affecting users.
○​ Concurrency Transparency: Users don't know about concurrent access to shared
resources.
○​ Failure Transparency: Hide failures and recover from them.
●​ Concurrency Control: Managing concurrent access to shared resources across multiple
nodes to maintain consistency.
○​ Distributed Mutual Exclusion: Ensuring only one process can access a shared
resource at a time across a distributed system.
○​ Distributed Deadlock: Detecting and resolving deadlocks that span multiple nodes.
●​ Fault Tolerance and Recovery: As discussed above, crucial for reliability.
●​ Security: Secure communication, authentication, authorization in a distributed
environment.
●​ Scalability: Designing the system to handle increasing loads and number of nodes.
●​ Heterogeneity: Dealing with different hardware, OS, and network types.
●​ Clock Synchronization: Ensuring consistent time across distributed nodes (important
for ordering events).
●​ Load Balancing: Distributing workload evenly across nodes.

14.6. Distributed File Systems (DFS)


●​ Concept: A file system that allows multiple clients to access files stored on servers
across a computer network. Provides transparent access to remote files.
●​ Goals:
○​ Transparency: Access to remote files should be indistinguishable from local files.
○​ Availability: Accessible even if some servers fail.
○​ Reliability: Data integrity, fault tolerance.
○​ Performance: Fast access.
○​ Scalability: Handle large numbers of users and files.
●​ Components:
○​ Client Module: Intercepts file system calls and redirects them to the appropriate
server.
○​ Server Module: Manages the actual storage and responds to client requests.
●​ Naming and Transparency:
○​ Location Transparency: File names don't reveal their physical location.
○​ Location Independence: Files can be moved without changing their names.
●​ Caching:
○​ Client-side Caching: Clients cache file data locally to reduce network traffic and
improve performance.
○​ Cache Consistency: A major challenge; ensuring that cached data is up-to-date
with the master copy on the server.
■​ Write-through: Writes are immediately propagated to the server.
■​ Write-back: Writes are cached locally and propagated later.
■​ Leasing: Granting a client a "lease" on a file, allowing it to cache, and notifying
the client if the file changes.
●​ Stateful vs. Stateless Servers:
○​ Stateless: Server doesn't maintain any state about its clients (e.g., NFS). Easier for
recovery, but requires clients to resend context with each request.
○​ Stateful: Server maintains information about client interactions (e.g., SMB). Can offer
better performance but more complex recovery.
●​ Replication: Storing multiple copies of files on different servers for fault tolerance and
improved read performance.
●​ Examples:
○​ NFS (Network File System): Widely used, stateless protocol.
○​ SMB/CIFS: Common in Windows environments, stateful.
○​ AFS (Andrew File System): Focuses on scalability and strong client-side caching.
○​ HDFS (Hadoop Distributed File System): Designed for very large files and batch
processing in big data environments.
○​ Google File System (GFS): Proprietary, designed for large-scale data storage by
Google.
