Machine Learning

The document introduces various data repository sites such as Kaggle, UCI Machine Learning Repository, and Google Dataset Search, explaining their features and types. It also covers the basics of Python programming, including its applications, data types, file handling, and provides examples of basic Python scripts. Overall, it serves as a guide for understanding data repositories and learning Python programming.

Uploaded by

sanjanabhola11
Copyright
© All Rights Reserved

Program – 01

Program : Introduction to different repository sites like Kaggle, UCI, etc.


Data repository :

A data repository is a centralized storage location where datasets are collected, stored, and
managed. It is used for organizing and sharing data for research, machine learning, and
analysis.
Types of Data Repositories :

1. Public Repositories – Open to everyone, e.g., Kaggle, UCI ML Repository, Google Dataset Search.
2. Private Repositories – Used by organizations for internal data storage.
3. Cloud-Based Repositories – Hosted on platforms like AWS, Google Cloud, and Azure.
Features of a Data Repository :

• Stores structured and unstructured data.


• Provides metadata (information about the dataset).
• Supports data sharing and collaboration.
• Ensures data security and integrity.
1. Kaggle

• One of the most popular platforms for data science and ML.

• Provides datasets, code notebooks, and competitions.

• Has an active community for discussions and learning.

• Example: Titanic dataset for classification problems.


2. UCI Machine Learning Repository

• A collection of well-documented datasets for research and education.

• Datasets are manually curated and often used in academic papers.

• Example: Iris dataset for classification.
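
Datasets from repositories such as UCI are commonly distributed as CSV files. As a minimal sketch, assuming a file in the Iris layout (four measurements followed by a class label), Python's standard csv module can read it. The column names, the load_rows helper, and the three inline sample rows are illustrative assumptions standing in for a downloaded file.

```python
import csv
import io
from collections import Counter

# Assumed column names for this sketch; the file is treated as having no header row.
COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]

# Inline sample standing in for a downloaded file (illustrative rows only).
SAMPLE = """5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

def load_rows(text_file):
    """Parse rows into dicts, converting the four measurements to float."""
    rows = []
    for record in csv.DictReader(text_file, fieldnames=COLUMNS):
        for name in COLUMNS[:-1]:
            record[name] = float(record[name])
        rows.append(record)
    return rows

rows = load_rows(io.StringIO(SAMPLE))
class_counts = Counter(row["species"] for row in rows)
print(len(rows), "rows;", dict(class_counts))
```

With a real download, the io.StringIO object would be replaced by a file opened with open(), and the class counts give a quick check that the labels were read correctly.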

3. Google Dataset Search

• A search engine to find datasets from various sources.

• Helps discover datasets from government sites, research institutes, etc.


• Shows dataset details (size, format, source, date).

• Saves time in finding reliable data.

• Useful for all levels (beginner to advanced).

• No account required.
4. AWS Open Data Registry (Registry of Open Data on AWS)

• The open dataset collection of Amazon Web Services (AWS).

• Provides free, large-scale public datasets.

• Data is stored on the AWS cloud (S3 buckets).

• Designed for big data, cloud computing, AI, and research projects.

5. Papers with Code


Papers with Code (PwC) is a free platform that connects research papers with their official
implementations (code), datasets, and benchmarks.
• Provides datasets along with research papers and code implementations.

• Good for state-of-the-art ML research.


Program – 02

Program : To study the basics of Python.

Python is a versatile, high-level, general-purpose programming language known for its
readability and ease of use, making it a popular choice for beginners and experienced
developers alike. It is used in various fields, including web development, data analysis,
and machine learning.

Why Learn Python?

• Beginner-Friendly: Its simple syntax and clear error messages make it an excellent
language for those new to programming.

• Career Opportunities: Python is in high demand across various industries, offering
ample career opportunities for developers.

• Wide Range of Applications: From web development and data science to automation
and scripting, Python's versatility allows you to tackle a wide range of problems.

• Active Community: The large and supportive Python community provides ample
resources, tutorials, and support for learners and developers.

Data Types in Python :

Data types in Python are a way to classify data items. They represent the kind of value, which
determines what operations can be performed on that data. Since everything is an object in
Python programming, Python data types are classes and variables are instances (objects) of
these classes.

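i = 50                          # int
f = 60.5                        # float
s = "Hello World"               # str (string)
l = ["Hello", "World"]          # list
t = ("Python", "is", "easy")    # tuple
# Named d rather than dict so the built-in dict type is not shadowed
d = {"name": "Aryan", "Branch": "CSE"}
print(type(i))  # <class 'int'>
print(type(f))  # <class 'float'>
print(type(s))  # <class 'str'>
print(type(l))  # <class 'list'>
print(type(t))  # <class 'tuple'>
print(type(d))  # <class 'dict'>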
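
The statement that a data type determines which operations are allowed can be illustrated with a short sketch. The variable names here are chosen only for illustration.

```python
# The type of a value decides which operations are valid on it.
count = 50            # int
price = 60.5          # float
greeting = "Hello"    # str

total = count + price       # int + float gives a float
repeated = greeting * 2     # str * int repeats the string

try:
    greeting + count        # str + int is not defined
except TypeError as error:
    mismatch = type(error).__name__

print(type(total).__name__, repeated, mismatch)
print(isinstance(count, int), isinstance(greeting, str))
```

Mixing a string and an integer with + raises TypeError, so the integer would first need an explicit conversion such as str(count).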

File Handling :
File handling in Python involves a standard process: open a file, perform read or write
operations, and then close it. The most common and recommended approach uses the with
statement, which automatically handles file closure, even if errors occur.
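
A minimal sketch of the open, read/write, close cycle with the with statement follows. The file name notes.txt is hypothetical, and a temporary directory is used so that no real file is touched.

```python
import os
import tempfile

# Hypothetical file name inside a temporary directory, so nothing real is overwritten.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")

    # 'w' creates the file; the with block closes it on exit.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("first line\n")
    closed_after_write = handle.closed

    # Reopen for reading; the file is closed again when the block ends.
    with open(path, "r", encoding="utf-8") as handle:
        content = handle.read()

print(closed_after_write, repr(content))
```

The handle.closed check shows that no explicit close() call is needed once the with block has ended.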

File Modes :

When opening a file, you specify a mode to indicate how you intend to use it. Common
modes include:

• 'r' (Read): Default mode for reading.

• 'w' (Write): For writing, overwriting existing content or creating a new file.
• 'a' (Append): For adding data to the end of a file.

• 'x' (Exclusive Creation): Creates a file only if it doesn't already exist.

• 't' (Text): Default mode for text files.

• 'b' (Binary): For non-text files.

• '+' (Update): Allows both reading and writing.
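
The modes listed above can be tried out in a short sketch. The file name log.txt is hypothetical and lives in a temporary directory that is deleted afterwards.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "log.txt")  # hypothetical file name

    with open(path, "x") as f:      # 'x' creates the file, fails if it exists
        f.write("one\n")
    with open(path, "a") as f:      # 'a' adds to the end
        f.write("two\n")
    with open(path, "r") as f:      # 'r' reads the existing content
        after_append = f.read()

    exclusive_failed = False
    try:
        open(path, "x")             # a second exclusive create is refused
    except FileExistsError:
        exclusive_failed = True

    with open(path, "w") as f:      # 'w' discards the old content
        f.write("three\n")
    with open(path, "rb") as f:     # 'b' returns bytes instead of str
        raw = f.read()

print(repr(after_append), exclusive_failed, raw)
```

Note that 'w' silently replaces the earlier lines, which is why 'a' or 'x' is the safer choice when existing data must be preserved.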


Python Applications:

• Web Development: Building websites and web applications using frameworks like
Django and Flask.

• Data Analysis: Analysing and visualizing data using libraries like NumPy, Pandas,
and Matplotlib.

• Machine Learning: Developing and deploying machine learning models using
libraries like Scikit-learn and TensorFlow.

• Automation: Automating tasks and processes using Python scripts.

• Game Development: Creating simple and complex games using libraries like
Pygame.
Program – 03

Program : Write some basic scripts in Python.

• Script – 1: A Basic Calculator
num1 = float(input("Enter first number: "))
num2 = float(input("Enter second number: "))
print("Sum:", num1 + num2)
print("Difference:", num1 - num2)
print("Product:", num1 * num2)
# Guard against division by zero before computing the quotient
print("Quotient:", num1 / num2 if num2 != 0 else "Undefined (division by zero)")

Output :

• Script – 2 : For checking whether a number is odd or even

num = int(input("Enter a number: "))
# A number is even when it leaves no remainder on division by 2
if num % 2 == 0:
    print(f"{num} is Even")
else:
    print(f"{num} is Odd")

Output :
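• Script – 3 : For finding the factorial of a number using a function

def factorial(n):
    # Recursive definition: 0! = 1 and n! = n * (n - 1)!
    return 1 if n == 0 else n * factorial(n - 1)

num = int(input("Enter a number: "))
# A negative input would never reach the base case, so reject it first
if num < 0:
    print("Factorial is not defined for negative numbers")
else:
    print(f"Factorial of {num} is {factorial(num)}")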


Output :

• Script – 4 : For printing the multiplication table of a given number n using loops

n = int(input("Enter n: "))
# Print the multiples of n from n x 1 to n x 10
for i in range(1, 11):
    print(f"{n} x {i} = {n * i}")
Output :

• Script – 5 : For generating a random number .


import random
# randint includes both end points, so 1 and 100 are possible results
print("Random number between 1 and 100:", random.randint(1, 100))

Output :
