Live API Starter

This project is a starter kit for building applications that interact with the Gemini API in real time. It supports audio and video input and includes a set of function tools that the model can use to interact with the user's system.

Installation

Install the required dependencies:
```
pip install -r requirements.txt
```
Rename the .env.example file to .env.
Obtain a Gemini API key from Google AI Studio.
Replace your_api_key_here in .env with your actual API key.
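
Before the first run, it can help to confirm that the placeholder in .env was actually replaced. The following is a minimal sketch that uses only the standard library. It assumes .env holds plain KEY=VALUE lines; the helper names `read_env_file` and `find_unset_keys` are hypothetical and are not part of this project, which may load the file differently (for example through a dotenv library).

```python
from pathlib import Path

# Placeholder text shipped in .env.example, as named in the steps above.
PLACEHOLDER = "your_api_key_here"


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_unset_keys(values):
    """Return the names whose value is empty or still the placeholder."""
    return [key for key, value in values.items() if value in ("", PLACEHOLDER)]
```

Calling `find_unset_keys(read_env_file(".env"))` returns an empty list once every entry has a real value.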

Important: Use headphones when running the script to prevent audio feedback loops.

Usage

To run the script:

python main.py

The --mode flag selects the video source and can be "camera", "screen", or "none". The default is "screen". To request screen sharing explicitly, run:

python main.py --mode screen

You can also specify the modality to use with the --modality flag, which can be "AUDIO" or "TEXT". The default is "AUDIO".
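
The two flags and their defaults can be modeled with argparse roughly as follows. This is an illustrative sketch of the command-line interface described above, not the actual code in main.py; the function name `build_parser` and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (sketch, not main.py itself)."""
    parser = argparse.ArgumentParser(description="Live API Starter (CLI sketch)")
    parser.add_argument(
        "--mode",
        choices=["camera", "screen", "none"],
        default="screen",
        help="video source to stream",
    )
    parser.add_argument(
        "--modality",
        choices=["AUDIO", "TEXT"],
        default="AUDIO",
        help="response modality",
    )
    return parser


# Equivalent to: python main.py --mode camera --modality TEXT
args = build_parser().parse_args(["--mode", "camera", "--modality", "TEXT"])
print(args.mode, args.modality)
```

Restricting both flags with `choices` makes argparse reject any value outside the documented set before the script starts streaming.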

Function Tools

The function_tools directory contains Python scripts that the Gemini model can call to act on the user's system. It includes the following files:

click_mouse.py: Performs a mouse click at the current cursor position.
copy_and_paste.py: Inputs text to the screen by simulating typing.
copy_to_clipboard.py: Copies text to the system clipboard.
execute_js_in_brave.py: Executes JavaScript code in the currently active Chromium-based browser window.
function_hub.py: Manages and executes the available function tools.
get_clipboard.py: Retrieves the current text from the system clipboard.
move_mouse.py: Moves the mouse cursor to specified coordinates.
output_text_to_screen.py: Displays a message on the screen using an alert box.
press_keys.py: Simulates pressing a single key or a combination of keys.
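
One common way to manage and execute tools like these is a name-to-function registry. The sketch below shows that pattern only; it is not the contents of function_hub.py, and `register_tool`, `call_tool`, and the stand-in `move_mouse` body are hypothetical names written for illustration. The stand-in returns a string instead of moving the real cursor.

```python
from typing import Any, Callable, Dict

# Maps a tool name (as the model would request it) to a Python callable.
_TOOLS: Dict[str, Callable[..., Any]] = {}


def register_tool(func):
    """Decorator that registers a function under its own name."""
    _TOOLS[func.__name__] = func
    return func


def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Look up a tool by name, call it with keyword arguments, and wrap the outcome."""
    if name not in _TOOLS:
        return {"error": f"unknown tool: {name}"}
    try:
        return {"result": _TOOLS[name](**arguments)}
    except Exception as exc:  # report failures to the caller instead of crashing
        return {"error": str(exc)}


@register_tool
def move_mouse(x: int, y: int) -> str:
    # Stand-in body: a real tool would move the cursor here.
    return f"moved to ({x}, {y})"


print(call_tool("move_mouse", {"x": 10, "y": 20}))
```

Returning errors as data rather than raising keeps a live session running when the model requests an unknown tool or passes the wrong arguments.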

dmdavidkov/computerusegemini

Folders and files

Latest commit

History

Repository files navigation

Live API Starter

Installation

Usage

Function Tools

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages