About
PendriveGPT is a portable, self-contained artificial intelligence environment. It enables the execution of Large Language Models (LLMs) directly from a high-speed USB flash drive.
The primary objective is the provision of advanced AI capabilities with complete data privacy, no dependence on external networks, and full hardware portability.
How does PendriveGPT work?
The system operates through three primary components:
- Hardware: High-speed NVMe-based USB interface facilitates rapid model weight transfer to system RAM.
- Inference Engine: Utilizes a compiled C/C++ engine (llama.cpp) to execute tensor calculations directly on the host CPU. No dedicated GPU is required.
- Model Weights: Employs quantized GGUF format files. This compression technique reduces the memory footprint with minimal loss of model accuracy.
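As a rough illustration of why quantization matters, the sketch below estimates the raw weight footprint of an 8-billion-parameter model at half precision versus roughly 4 bits per weight. These are back-of-envelope figures that ignore GGUF metadata, KV-cache, and runtime overhead; the exact file sizes differ.

```python
# Back-of-envelope estimate of model weight memory footprint.
# Ignores GGUF metadata, KV-cache, and per-layer overhead.

def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the raw weights in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 8e9  # 8B-parameter model

fp16 = weight_footprint_gb(N_PARAMS, 16)   # unquantized half precision
q4 = weight_footprint_gb(N_PARAMS, 4.5)    # ~4-bit quantization plus scaling factors

print(f"fp16: ~{fp16:.1f} GB, 4-bit: ~{q4:.1f} GB")
```

The roughly 3-4x reduction is what makes CPU-only inference on a 16 GB machine practical.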
Memory and Conversation History
Does PendriveGPT remember my previous conversations?
No. PendriveGPT does not retain historical conversation data after session termination.
How does PendriveGPT's memory work?
Memory operates exclusively within the volatile RAM of the active browser tab. The system stores dialogue turns in a temporary data array. This array is transmitted to the local server during each request to maintain context.
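A minimal sketch of this pattern is shown below. The class and field names are illustrative, not PendriveGPT's actual code: the dialogue turns live in a plain in-memory list, the full history is resent with every request, and a reset simply clears the list.

```python
# Illustrative sketch of in-memory conversation context.
# Names are hypothetical, not PendriveGPT's actual implementation.

class SessionMemory:
    """Holds dialogue turns for the lifetime of one session only."""

    def __init__(self):
        self.turns = []  # lives only in RAM; nothing is written to disk

    def add(self, role: str, content: str) -> None:
        self.turns.append({"role": role, "content": content})

    def build_request(self, user_message: str) -> dict:
        # The full history is retransmitted with every request, so the
        # model keeps context without any server-side storage.
        self.add("user", user_message)
        return {"messages": list(self.turns)}

    def reset(self) -> None:
        # Equivalent to a page reload or the reset command: purge all turns.
        self.turns.clear()

mem = SessionMemory()
payload = mem.build_request("Hello")
mem.add("assistant", "Hi there!")
mem.reset()
print(len(mem.turns))  # history is gone after reset
```

Because nothing is ever serialized, closing the tab leaves no artifact for a later user of the machine to recover.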
Why is that?
This architecture guarantees absolute privacy. Page reload, tab closure, or execution of the reset command initiates an immediate data purge. The elimination of local database files prevents unauthorized access to user interactions.
System Attributes: Safe, Offline, Portable, Free
- 100% Safe: Zero telemetry or data extraction. Information processing occurs exclusively on the host hardware. External data transmission is disabled.
- Offline: The server process binds to the local loopback address (`localhost`). Internet connectivity is not required for inference generation.
- Portable: The engine binaries, HTML user interface, and model weights reside entirely on the USB storage device. Host operating system installation is unnecessary.
- Free: The architecture utilizes open-source software and open-weights models. Operation incurs zero subscription fees or API costs.
AI Model and Licensing
What LLM model runs PendriveGPT?
The default configuration utilizes the Meta Llama 3.1 8B Instruct model. Model weights are quantized to 4-bit format (GGUF) for optimal execution on consumer-grade hardware.
What does the license allow?
The Meta Llama 3.1 Community License governs usage. It permits both research and commercial application; a separate license from Meta is required only for products or services exceeding 700 million monthly active users.
System Requirements
Hardware specifications for optimal inference generation:
- Processor (CPU): Multi-core processor with AVX2 instruction set support.
- Memory (RAM): 16 GB system memory recommended.
- Interface: USB-C 3.2 Gen 2 (10 Gbps) or Thunderbolt 3/4 port.
- Operating System: Windows 10/11 (64-bit) or macOS (11.0 or newer).
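A quick way to check whether a host meets the RAM recommendation is sketched below. This is a Unix-only illustration using POSIX `sysconf` values; a Windows check would need a different mechanism (e.g. `ctypes` with `GlobalMemoryStatusEx`).

```python
# Unix-only sketch: report total physical RAM against the 16 GB recommendation.
import os

def total_ram_gb() -> float:
    """Total physical memory in gigabytes via POSIX sysconf (Linux/macOS)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count / 1e9

ram = total_ram_gb()
print(f"Detected ~{ram:.1f} GB RAM; meets 16 GB recommendation: {ram >= 16}")
```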
FAQ (Frequently Asked Questions)
Can I upgrade the AI model?
Yes. Replacing the `.gguf` file within the `/models` directory swaps the model. The launcher script parameters must then be modified to reference the new file name.
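The step above can be sketched as follows. The llama.cpp server binary accepts the model path via its `-m` flag; the directory layout, file name, and port below are illustrative assumptions, not PendriveGPT's actual launcher.

```python
# Sketch of rebuilding the launcher command after swapping the model file.
# The -m and --port flags are llama.cpp server flags; the paths and file
# name here are illustrative only.
from pathlib import Path

def launcher_command(models_dir: str, gguf_name: str) -> list:
    """Build the server invocation for a given .gguf model file."""
    model_path = Path(models_dir) / gguf_name
    if model_path.suffix != ".gguf":
        raise ValueError("model file must be a .gguf")
    return ["llama-server", "-m", str(model_path), "--port", "8080"]

cmd = launcher_command("models", "my-new-model.Q4_K_M.gguf")
print(" ".join(cmd))
```

Keeping the model name in one parameter like this means an upgrade only requires editing a single value.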
Why does the system fan speed increase during use?
Inference generation requires intensive mathematical calculation. High CPU utilization generates thermal output, prompting active cooling mechanisms.