PendriveGPT Information

About

PendriveGPT is a portable, self-contained artificial intelligence environment. It enables the execution of Large Language Models (LLMs) directly from a high-speed USB flash drive.

The primary objective is to provide advanced AI capabilities with complete data privacy, no dependence on external networks, and full hardware portability.

How does PendriveGPT work?

The system operates through three primary components:

- A local inference server that loads the model and answers requests.
- The quantized model weights, stored as a `.gguf` file in the `/models` directory.
- A browser-based interface that holds the conversation in the RAM of the active tab.

Memory and Conversation History

Does PendriveGPT remember my previous conversations?
No. PendriveGPT does not retain historical conversation data after session termination.
How does PendriveGPT memory work?
Memory operates exclusively within the volatile RAM of the active browser tab. The system stores dialogue turns in a temporary data array. This array is transmitted to the local server during each request to maintain context.
Why is that?
This architecture guarantees privacy by design. Reloading the page, closing the tab, or issuing the reset command immediately purges all conversation data. Because no local database files are ever written, there is nothing left on disk for anyone to access.
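The volatile memory model described above can be sketched in a few lines. This is a minimal illustration, not PendriveGPT's actual code; the class name and the OpenAI-style `messages` payload shape are assumptions.

```python
# Minimal sketch of session-only conversation memory. The history lives
# in a plain in-memory list (analogous to the browser tab's RAM) and is
# never written to disk; the full array is resent with every request.

class SessionMemory:
    def __init__(self):
        self._turns = []  # volatile: lost when the session ends

    def add(self, role, content):
        """Record one dialogue turn (role is 'user' or 'assistant')."""
        self._turns.append({"role": role, "content": content})

    def build_payload(self, user_message):
        """Payload for the local server: all prior turns plus the new message."""
        return {"messages": self._turns + [{"role": "user", "content": user_message}]}

    def reset(self):
        """Equivalent of a page reload or the reset command: immediate purge."""
        self._turns.clear()
```

Because `reset()` simply clears the list, there is no persistent store to scrub afterward, which mirrors the privacy guarantee described above.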

System Attributes: Safe, Offline, Portable, Free

AI Model and Licensing

What LLM model runs PendriveGPT?
The default configuration utilizes the Meta Llama 3.1 8B Instruct model. Model weights are quantized to 4-bit format (GGUF) for optimal execution on consumer-grade hardware.
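A back-of-envelope calculation shows why 4-bit quantization matters for consumer hardware. The figures below cover raw weights only; real GGUF files are somewhat larger because some layers keep higher precision and the file carries metadata.

```python
# Rough memory footprint of an 8B-parameter model at different precisions.
params = 8_000_000_000

fp16_bytes_per_param = 2.0   # 16-bit floats
q4_bytes_per_param = 0.5     # 4-bit quantized weights

fp16_gb = params * fp16_bytes_per_param / 1e9
q4_gb = params * q4_bytes_per_param / 1e9

print(f"FP16: {fp16_gb:.0f} GB, Q4: {q4_gb:.0f} GB")  # → FP16: 16 GB, Q4: 4 GB
```

A roughly 4x reduction is what makes it feasible to hold the model in ordinary system RAM rather than requiring a dedicated GPU.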
What does the license allow?
Usage is governed by the Meta Llama 3.1 Community License, which permits both research and commercial application. Commercial use is restricted only if a product exceeds 700 million monthly active users, in which case a separate license from Meta is required.

System Requirements

Hardware specifications for optimal inference generation:

- A USB 3.0 (or faster) flash drive with enough free space for the model file (roughly 5 GB for the default 4-bit 8B model).
- At least 8 GB of system RAM, so the quantized weights and conversation context fit in memory.
- A modern multi-core CPU; no dedicated GPU is required.

FAQ (Frequently Asked Questions)

Can I upgrade the AI model?
Yes. Replace the `.gguf` file in the `/models` directory to swap the neural network, then update the launcher script parameters to match the new file name.
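The upgrade step above can be sketched as follows. This is a hypothetical helper, not PendriveGPT's launcher: the `--model` and `--ctx-size` flag names assume a llama.cpp-style server and may differ in the real script.

```python
from pathlib import Path

def find_models(models_dir):
    """List the .gguf model files available in the models directory."""
    return sorted(p.name for p in Path(models_dir).glob("*.gguf"))

def launcher_args(models_dir, model_name):
    """Build server arguments for the chosen model file.

    Raises if the named file is missing, so a typo in the launcher
    configuration fails fast instead of starting with no model.
    """
    model_path = Path(models_dir) / model_name
    if not model_path.exists():
        raise FileNotFoundError(f"No such model: {model_path}")
    return ["--model", str(model_path), "--ctx-size", "4096"]
```

Validating the file name before launch is the safeguard the FAQ implies: after dropping a new `.gguf` into `/models`, the launcher parameters must point at exactly that file.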
Why does the system fan speed increase during use?
Inference is computationally intensive. Sustained high CPU utilization generates heat, which triggers the system's active cooling.