mirror of
https://github.com/ParisNeo/ollama_proxy_server.git
synced 2025-09-05 20:40:07 +00:00
Compare commits
32 Commits
boosted_ve
...
dbc9e6bfb1
| SHA1 |
|---|
| dbc9e6bfb1 |
| cfcdf96882 |
| 5d43b26c4c |
| cc0a93127f |
| 7797677b15 |
| b71d853e22 |
| 92192a0263 |
| 9f9f4b68ef |
| 98805b7991 |
| 5ae23dab5f |
| 0a17f97a84 |
| bfc87eda85 |
| f4336890cd |
| 9d15a040e9 |
| 3ecac22486 |
| 3076bbf392 |
| 2691533b43 |
| 7f6faadc4d |
| c4c0de3f4d |
| bbc343a8e1 |
| 52c2568060 |
| a8e50f83b6 |
| 61989c7db9 |
| 37d8c3f865 |
| 6b63597b8b |
| 72699065a1 |
| 4a320f0929 |
| 618cb57dc9 |
| 6880c40d7a |
| 28ebc14020 |
| c923a7860e |
| 349cd117b8 |
4
.gitignore
vendored
@@ -159,4 +159,6 @@ cython_debug/
|
||||
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
|
||||
#.idea/
|
||||
|
||||
.vscode
|
||||
.vscode
|
||||
config.ini
|
||||
authorized_users.txt
|
||||
|
@@ -21,5 +21,8 @@ COPY authorized_users.txt .
|
||||
# Start the proxy server as entrypoint
|
||||
ENTRYPOINT ["ollama_proxy_server"]
|
||||
|
||||
# Do not buffer output, e.g. logs to stdout
|
||||
ENV PYTHONUNBUFFERED=1
|
||||
|
||||
# Set command line parameters
|
||||
CMD ["--config", "./config.ini", "--users_list", "./authorized_users.txt", "--port", "8080"]
|
||||
|
258
README.md
@@ -1,75 +1,225 @@
|
||||
# Ollama Proxy Server
|
||||
|
||||
Ollama Proxy Server is a lightweight reverse proxy server designed for load balancing and rate limiting. It is licensed under the Apache 2.0 license and can be installed using pip. This README covers setting up, installing, and using the Ollama Proxy Server.
|
||||
[](LICENSE)
|
||||
[](https://www.python.org/downloads/release/python-311/)
|
||||
[](https://github.com/ParisNeo/ollama_proxy_server)
|
||||
|
||||
## Prerequisites
|
||||
Make sure you have Python (>=3.8) and Apache installed on your system before proceeding.
|
||||
Ollama Proxy Server is a lightweight, secure proxy server designed to add a security layer to one or multiple Ollama servers. It routes incoming requests to the backend server with the lowest load, minimizing server strain and improving responsiveness. Built with Python, this project is ideal for managing distributed Ollama instances with authentication and logging capabilities.
|
||||
|
||||
**Author:** ParisNeo
|
||||
|
||||
**License:** Apache 2.0
|
||||
|
||||
**Repository:** [https://github.com/ParisNeo/ollama_proxy_server](https://github.com/ParisNeo/ollama_proxy_server)
|
||||
|
||||
## Features
|
||||
|
||||
* **Load Balancing:** Routes requests to the Ollama server with the fewest ongoing requests.
|
||||
* **Security:** Implements bearer token authentication using a `user:key` format.
|
||||
* **Asynchronous Logging:** Logs access and errors to a CSV file without blocking request handling.
|
||||
* **Connection Pooling:** Uses persistent HTTP connections for faster backend communication.
|
||||
* **Streaming Support:** Properly forwards streaming responses from Ollama servers.
|
||||
* **Command-Line Tools:** Includes utilities to run the server and manage users.
|
||||
* **Cross-Platform:** Runs on any OS supporting Python 3.11.
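The load-balancing rule from the first bullet can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes each backend is a `(name, settings)` pair whose settings carry an `active_requests` counter (the field name mirrors the one used in `main.py`), and the helper `pick_least_loaded` and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

# A backend is a (name, settings) pair, as in the list built from config.ini.
Server = Tuple[str, Dict[str, object]]


def pick_least_loaded(servers: List[Server]) -> Server:
    """Return the backend with the fewest active requests.

    Ties go to the backend listed first, because min() keeps the first
    minimum it encounters.
    """
    if not servers:
        raise ValueError("no backend servers configured")
    return min(servers, key=lambda server: server[1]["active_requests"])


# Hypothetical sample data for illustration only.
servers = [
    ("DefaultServer", {"url": "http://localhost:11434", "active_requests": 3}),
    ("SecondaryServer", {"url": "http://localhost:3002", "active_requests": 1}),
]

name, settings = pick_least_loaded(servers)
print(name, settings["url"])
```

The selection is only as good as the counter: whichever code forwards a request has to increment `active_requests` before the call and decrement it afterwards, otherwise every backend looks idle and the first one is always chosen.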
|
||||
|
||||
## Project Structure
|
||||
|
||||
```plaintext
|
||||
ollama_proxy_server/
|
||||
|- add_user.py # Script to add users to the authorized list
|
||||
|- main.py # Main proxy server script
|
||||
example.authorized_users.txt # Example authorized users file
|
||||
example.config.ini # Example configuration file
|
||||
.gitignore # Git ignore file
|
||||
Dockerfile # Docker configuration
|
||||
LICENSE # Apache 2.0 license text
|
||||
requirements.txt # Runtime dependencies
|
||||
requirements_dev.txt # Development dependencies
|
||||
setup.py # Setup script for installation
|
||||
README.md # This file
|
||||
```
|
||||
|
||||
## Installation
|
||||
1. Clone or download the `ollama_proxy_server` repository from GitHub: https://github.com/ParisNeo/ollama_proxy_server
|
||||
2. Navigate to the cloned directory in the terminal and run `pip install -e .`
|
||||
|
||||
## Installation using Dockerfile
|
||||
1. Clone this repository as described above.
|
||||
2. Build your container image with the Dockerfile provided by this repository.
|
||||
### Prerequisites
|
||||
|
||||
### Podman
|
||||
`cd ollama_proxy_server`
|
||||
`podman build -t ollama_proxy_server:latest .`
|
||||
* Python 3.11 or higher
|
||||
* Git (optional, for cloning the repository)
|
||||
|
||||
### Docker
|
||||
`cd ollama_proxy_server`
|
||||
`docker build -t ollama_proxy_server:latest .`
|
||||
### Option 1: Install from PyPI (Not Yet Published)
|
||||
|
||||
Once published, install using pip:
|
||||
|
||||
```bash
|
||||
pip install ollama_proxy_server
|
||||
```
|
||||
|
||||
### Option 2: Install from Source
|
||||
|
||||
Clone the repository:
|
||||
|
||||
```bash
|
||||
git clone https://github.com/ParisNeo/ollama_proxy_server.git
|
||||
cd ollama_proxy_server
|
||||
```
|
||||
|
||||
Install dependencies:
|
||||
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
Install the package:
|
||||
|
||||
```bash
|
||||
pip install .
|
||||
```
|
||||
|
||||
### Option 3: Use Docker
|
||||
|
||||
Build the Docker image:
|
||||
|
||||
```bash
|
||||
docker build -t ollama_proxy_server .
|
||||
```
|
||||
|
||||
Run the container:
|
||||
|
||||
```bash
|
||||
docker run -p 8080:8080 -v $(pwd)/config.ini:/app/config.ini -v $(pwd)/authorized_users.txt:/app/authorized_users.txt ollama_proxy_server
|
||||
```
|
||||
|
||||
Test that it works:
|
||||
|
||||
```bash
|
||||
curl localhost:8080 -H "Authorization: Bearer user1:0XAXAXAQX5A1F"
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
### Servers configuration (config.ini)
|
||||
Create a file named `config.ini` in the same directory as your script, containing server configurations:
|
||||
```makefile
|
||||
[DefaultServer]
|
||||
### `config.ini`
|
||||
|
||||
Copy `config.ini.example` to `config.ini` and edit it:
|
||||
|
||||
```ini
|
||||
[server0]
|
||||
url = http://localhost:11434
|
||||
queue_size = 5
|
||||
|
||||
[SecondaryServer]
|
||||
url = http://localhost:3002
|
||||
queue_size = 3
|
||||
|
||||
# Add as many servers as needed, in the same format as [DefaultServer] and [SecondaryServer].
|
||||
# Add more servers as needed
|
||||
# [server1]
|
||||
# url = http://another-server:11434
|
||||
```
|
||||
Replace `http://localhost:11434/` with the URL and port of the first server. The `queue_size` value indicates the maximum number of requests that can be queued at a given time for this server.
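To make the file format concrete, here is a small sketch of how such a file can be read with the standard-library `configparser`. It is an illustration under assumptions: the function name `load_servers` is hypothetical, and the fallback of 100 for `queue_size` is taken from the default used in the updated `main.py`.

```python
import configparser

# Inline sample in the same format as config.ini.
SAMPLE = """
[DefaultServer]
url = http://localhost:11434
queue_size = 5

[SecondaryServer]
url = http://localhost:3002
"""


def load_servers(text, default_queue_size=100):
    """Parse INI text into {section_name: {"url": ..., "queue_size": ...}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        name: {
            "url": parser[name]["url"],
            # getint() converts the value and falls back when the key is absent.
            "queue_size": parser[name].getint("queue_size", fallback=default_queue_size),
        }
        for name in parser.sections()
    }


for name, settings in load_servers(SAMPLE).items():
    print(name, settings)
```

A section without `queue_size` simply picks up the fallback, so the key can be omitted for backends that do not need a custom limit.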
|
||||
|
||||
### Authorized users (authorized_users.txt)
|
||||
Create a file named `authorized_users.txt` in the same directory as your script, containing one `user:key` pair per line:
|
||||
```text
|
||||
user1:key1
|
||||
user2:key2
|
||||
```
|
||||
Replace `user1`, `key1`, `user2`, and `key2` with the desired username and API key for each user.
|
||||
You can also use the `ollama_proxy_add_user` utility to add a user and generate a key automatically:
|
||||
```makefile
|
||||
ollama_proxy_add_user --users_list [path to the `authorized_users.txt` file]
|
||||
* `url`: The URL of an Ollama backend server.
|
||||
|
||||
### `authorized_users.txt`
|
||||
|
||||
Copy `authorized_users.txt.example` to `authorized_users.txt` and edit it:
|
||||
|
||||
```plaintext
|
||||
user:key
|
||||
another_user:another_key
|
||||
```
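The `user:key` format is simple enough to parse by hand. The sketch below is one possible reading of it, not the code from `main.py`: the function name `parse_authorized_users` is hypothetical, and splitting on the first colon only (so that keys may themselves contain colons) is an assumption of this example.

```python
def parse_authorized_users(lines):
    """Build a {user: key} mapping from an iterable of 'user:key' lines.

    Blank lines and entries missing a user, a colon, or a key are skipped.
    """
    users = {}
    for raw_line in lines:
        line = raw_line.strip()  # drops the trailing newline as well
        if not line:
            continue
        # partition() splits on the first colon only.
        user, separator, key = line.partition(":")
        if not separator or not user or not key:
            continue  # malformed entry
        users[user] = key
    return users


sample_lines = ["user:key\n", "\n", "another_user:another_key\n", "broken_entry\n"]
print(parse_authorized_users(sample_lines))
```

Stripping each line before testing it for emptiness matters: lines returned by `readlines()` still carry their newline, so a bare comparison with `""` never matches a blank line.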
|
||||
|
||||
## Usage
|
||||
### Starting the server
|
||||
Start the Ollama Proxy Server by running the following command in your terminal:
|
||||
```bash
|
||||
python3 ollama_proxy_server/main.py --config [configuration file path] --users_list [users list file path] --port [port number to access the proxy]
|
||||
```
|
||||
The server will listen on port 808x, with x being the number of available ports starting from 0 (e.g., 8080, 8081, etc.). The first available port will be automatically selected if no other instance is running.
|
||||
|
||||
### Client requests
|
||||
To send a request to the server, use the following command:
|
||||
```bash
|
||||
curl -X <METHOD> -H "Authorization: Bearer <USER_KEY>" http://localhost:<PORT>/<PATH> [--data <POST_DATA>]
|
||||
```
|
||||
Replace `<METHOD>` with the HTTP method (GET or POST), `<USER_KEY>` with a valid user:key pair from your `authorized_users.txt`, `<PORT>` with the port number of your running Ollama Proxy Server, and `<PATH>` with the target endpoint URL (e.g., "/api/generate"). If you are making a POST request, include the `--data <POST_DATA>` option to send data in the body.
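The same request can be built from Python with the standard library alone. This is a hedged sketch: `build_request` is a hypothetical helper, the port, credentials, and payload are placeholders, and the actual network call is left commented out because it needs a running proxy.

```python
import json
import urllib.request


def build_request(port, path, user, key, payload=None):
    """Build a request carrying the 'Bearer user:key' header the proxy expects.

    A payload turns the request into a JSON POST; without one it is a GET.
    """
    url = f"http://localhost:{port}{path}"
    headers = {"Authorization": f"Bearer {user}:{key}"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


request = build_request(
    8080,
    "/api/generate",
    "user1",
    "key1",
    {"model": "mixtral:latest", "prompt": "Once upon a time,", "stream": False},
)
print(request.get_method(), request.full_url)

# To send it against a running proxy:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Serializing the body with `json.dumps` avoids the quoting mistakes that are easy to make when writing JSON inline in a shell command.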
|
||||
### Running the Server
|
||||
|
||||
For example:
|
||||
```bash
|
||||
curl -X POST -H "Authorization: Bearer user1:key1" http://localhost:8080/api/generate --data '{"model": "mixtral:latest", "prompt": "Once upon a time,", "stream": false, "temperature": 0.3, "max_tokens": 1024}'
|
||||
```
|
||||
### Starting the server using the created Container-Image
|
||||
To start the proxy in the background with the image built above, use either:
|
||||
1) docker: `docker run -d --name ollama-proxy-server -p 8080:8080 ollama_proxy_server:latest`
|
||||
2) podman: `podman run -d --name ollama-proxy-server -p 8080:8080 ollama_proxy_server:latest`
|
||||
python main.py --config config.ini --users_list authorized_users.txt
|
||||
```
|
||||
|
||||
### Managing Users
|
||||
|
||||
Use the `add_user.py` script to add new users.
|
||||
|
||||
```bash
|
||||
python add_user.py <username> <key>
|
||||
```
|
||||
|
||||
Alternatively, you can use the newly created `ops` command:
|
||||
|
||||
```bash
|
||||
sudo ops add_user username:password
|
||||
```
|
||||
|
||||
## Setup as a Service
|
||||
|
||||
### Using `setup_service.sh`
|
||||
|
||||
The repository includes a script called `setup_service.sh` to set up Ollama Proxy Server as a systemd service. This allows it to run in the background and start on boot.
|
||||
|
||||
1. **Download the Repository:**
|
||||
|
||||
```bash
|
||||
git clone https://github.com/ParisNeo/ollama_proxy_server.git
|
||||
cd ollama_proxy_server
|
||||
```
|
||||
|
||||
2. **Make `setup_service.sh` Executable:**
|
||||
|
||||
```bash
|
||||
chmod +x setup_service.sh
|
||||
```
|
||||
|
||||
3. **Run the Script with sudo Privileges:**
|
||||
|
||||
```bash
|
||||
sudo ./setup_service.sh /path/to/working/directory
|
||||
```
|
||||
|
||||
Replace `/path/to/working/directory` with the path where you want to set up your proxy server.
|
||||
|
||||
4. **Follow Prompts:**
|
||||
- You will be prompted to provide a port number (default is 11534) and log path.
|
||||
- You'll also add users and their passwords which will populate `/etc/ops/authorized_users.txt`.
|
||||
|
||||
5. **Start the Service:**
|
||||
|
||||
```bash
|
||||
sudo systemctl start ollama-proxy-server
|
||||
```
|
||||
|
||||
6. **Enable the Service to Start on Boot:**
|
||||
|
||||
```bash
|
||||
sudo systemctl enable ollama-proxy-server
|
||||
```
|
||||
|
||||
7. **Check the Status of the Service:**
|
||||
|
||||
```bash
|
||||
sudo journalctl -u ollama-proxy-server -f
|
||||
```
|
||||
|
||||
### Managing Users with `ops` Command
|
||||
|
||||
After setting up the service, you can add more users using the new `ops` command:
|
||||
|
||||
```bash
|
||||
sudo ops add_user username:password
|
||||
```
|
||||
|
||||
## Contributing
|
||||
|
||||
Contributions are welcome! Please follow these steps:
|
||||
|
||||
1. Fork the repository.
|
||||
2. Create a feature branch (`git checkout -b feature/your-feature`).
|
||||
3. Commit your changes (`git commit -am 'Add your feature'`).
|
||||
4. Push to the branch (`git push origin feature/your-feature`).
|
||||
5. Open a Pull Request.
|
||||
|
||||
See `CONTRIBUTING.md` for more details (to be added).
|
||||
|
||||
## License
|
||||
|
||||
This project is licensed under the Apache License 2.0. See the `LICENSE` file for details.
|
||||
|
||||
## Acknowledgments
|
||||
|
||||
Built by ParisNeo.
|
||||
|
||||
Thanks to the open-source community for tools like `requests` and `ascii_colors`.
|
||||
|
||||
See you soon!
|
||||
|
@@ -1,10 +1,12 @@
|
||||
[DefaultServer]
|
||||
url = http://localhost:11434
|
||||
queue_size = 5
|
||||
max_parallel_connections = 4
|
||||
queue_size = 100
|
||||
|
||||
[SecondaryServer]
|
||||
url = http://localhost:3002
|
||||
queue_size = 3
|
||||
max_parallel_connections = 3
|
||||
queue_size = 100
|
||||
|
||||
# Add more servers as you need.
|
||||
|
@@ -17,11 +17,24 @@ from ascii_colors import ASCIIColors
|
||||
from pathlib import Path
|
||||
import csv
|
||||
import datetime
|
||||
from ascii_colors import ASCIIColors, trace_exception
|
||||
|
||||
def get_config(filename):
|
||||
config = configparser.ConfigParser()
|
||||
config.read(filename)
|
||||
return [(name, {'url': config[name]['url'], 'queue': Queue()}) for name in config.sections()]
|
||||
return [
|
||||
(
|
||||
name,
|
||||
{
|
||||
'url': config[name]['url'],
|
||||
'max_parallel_connections': int(config[name].get('max_parallel_connections', 10)),
|
||||
'queue_size': int(config[name].get('queue_size', 100)), # Default queue size of 100
|
||||
'queue': Queue(maxsize=int(config[name].get('queue_size', 100))),
|
||||
'active_requests': 0
|
||||
}
|
||||
)
|
||||
for name in config.sections()
|
||||
]
|
||||
|
||||
# Read the authorized users and their keys from a file
|
||||
def get_authorized_users(filename):
|
||||
@@ -29,47 +42,66 @@ def get_authorized_users(filename):
|
||||
lines = f.readlines()
|
||||
authorized_users = {}
|
||||
for line in lines:
|
||||
if line=="":
|
||||
if line == "":
|
||||
continue
|
||||
try:
|
||||
user, key = line.strip().split(':')
|
||||
authorized_users[user] = key
|
||||
except:
|
||||
ASCIIColors.red(f"User entry broken:{line.strip()}")
|
||||
ASCIIColors.red(f"User entry broken: {line.strip()}")
|
||||
return authorized_users
|
||||
def display_config(args, servers, authorized_users):
|
||||
print("\n🌟 Current Configuration 🌟")
|
||||
ASCIIColors.blue(f"📁 Config File: {args.config}")
|
||||
ASCIIColors.blue(f"🗄️ Log Path: {args.log_path}")
|
||||
ASCIIColors.blue(f"👤 Users List: {args.users_list}")
|
||||
ASCIIColors.blue(f"🔢 Port Number: {args.port}")
|
||||
ASCIIColors.yellow(f"⚠️ Deactivate Security: {'Yes 🚫' if args.deactivate_security else 'No ✅'}")
|
||||
|
||||
# Additional config details
|
||||
if servers:
|
||||
print("\n🌐 Servers Configuration:")
|
||||
for server in servers:
|
||||
ASCIIColors.green(f" {server[0]}: {server[1]}")
|
||||
|
||||
|
||||
print("\n🔑 Authorized Users:")
|
||||
for user in authorized_users:
|
||||
ASCIIColors.yellow(f" - 👤 {user}")
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument('--config', default="config.ini", help='Path to the authorized users list')
|
||||
parser.add_argument('--config', default="config.ini", help='Path to the config file')
|
||||
parser.add_argument('--log_path', default="access_log.txt", help='Path to the access log file')
|
||||
parser.add_argument('--users_list', default="authorized_users.txt", help='Path to the config file')
|
||||
parser.add_argument('--port', type=int, default=8000, help='Port number for the server')
|
||||
parser.add_argument('--users_list', default="authorized_users.txt", help='Path to the authorized users list')
|
||||
parser.add_argument('--port', type=int, default=11534, help='Port number for the server (default is 100 + default ollama port number)')
|
||||
parser.add_argument('-d', '--deactivate_security', action='store_true', help='Deactivates security')
|
||||
args = parser.parse_args()
|
||||
servers = get_config(args.config)
|
||||
authorized_users = get_authorized_users(args.users_list)
|
||||
deactivate_security = args.deactivate_security
|
||||
ASCIIColors.red("Ollama Proxy server")
|
||||
ASCIIColors.red("Author: ParisNeo")
|
||||
|
||||
args = parser.parse_args()
|
||||
servers = get_config(args.config)
|
||||
authorized_users = get_authorized_users(args.users_list)
|
||||
|
||||
ASCIIColors.red("Ollama Proxy Server")
|
||||
ASCIIColors.multicolor(["Author:", "ParisNeo"], [ASCIIColors.color_red, ASCIIColors.color_magenta])
|
||||
|
||||
# Display the current configuration
|
||||
display_config(args, servers, authorized_users)
|
||||
class RequestHandler(BaseHTTPRequestHandler):
|
||||
def add_access_log_entry(self, event, user, ip_address, access, server, nb_queued_requests_on_server, error=""):
|
||||
log_file_path = Path(args.log_path)
|
||||
|
||||
if not log_file_path.exists():
|
||||
with open(log_file_path, mode='w', newline='') as csvfile:
|
||||
try:
|
||||
if not log_file_path.exists():
|
||||
with open(log_file_path, mode='w', newline='') as csvfile:
|
||||
fieldnames = ['time_stamp', 'event', 'user_name', 'ip_address', 'access', 'server', 'nb_queued_requests_on_server', 'error']
|
||||
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
|
||||
writer.writeheader()
|
||||
|
||||
with open(log_file_path, mode='a', newline='') as csvfile:
|
||||
fieldnames = ['time_stamp', 'event', 'user_name', 'ip_address', 'access', 'server', 'nb_queued_requests_on_server', 'error']
|
||||
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
|
||||
writer.writeheader()
|
||||
|
||||
with open(log_file_path, mode='a', newline='') as csvfile:
|
||||
fieldnames = ['time_stamp', 'event', 'user_name', 'ip_address', 'access', 'server', 'nb_queued_requests_on_server', 'error']
|
||||
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
|
||||
row = {'time_stamp': str(datetime.datetime.now()), 'event':event, 'user_name': user, 'ip_address': ip_address, 'access': access, 'server': server, 'nb_queued_requests_on_server': nb_queued_requests_on_server, 'error': error}
|
||||
writer.writerow(row)
|
||||
|
||||
row = {'time_stamp': str(datetime.datetime.now()), 'event': event, 'user_name': user, 'ip_address': ip_address, 'access': access, 'server': server, 'nb_queued_requests_on_server': nb_queued_requests_on_server, 'error': error}
|
||||
writer.writerow(row)
|
||||
except Exception as ex:
|
||||
trace_exception(ex)
|
||||
def _send_response(self, response):
|
||||
self.send_response(response.status_code)
|
||||
for key, value in response.headers.items():
|
||||
@@ -105,7 +137,7 @@ def main():
|
||||
return False
|
||||
token = auth_header.split(' ')[1]
|
||||
user, key = token.split(':')
|
||||
|
||||
|
||||
# Check if the user and key are in the list of authorized users
|
||||
if authorized_users.get(user) == key:
|
||||
self.user = user
|
||||
@@ -115,10 +147,10 @@ def main():
|
||||
return False
|
||||
except:
|
||||
return False
|
||||
|
||||
|
||||
def proxy(self):
|
||||
self.user = "unknown"
|
||||
if not deactivate_security and not self._validate_user_and_key():
|
||||
if not args.deactivate_security and not self._validate_user_and_key():
|
||||
ASCIIColors.red(f'User is not authorized')
|
||||
client_ip, client_port = self.client_address
|
||||
# Extract the bearer token from the headers
|
||||
@@ -126,16 +158,15 @@ def main():
|
||||
if not auth_header or not auth_header.startswith('Bearer '):
|
||||
self.add_access_log_entry(event='rejected', user="unknown", ip_address=client_ip, access="Denied", server="None", nb_queued_requests_on_server=-1, error="Authentication failed")
|
||||
else:
|
||||
token = auth_header.split(' ')[1]
|
||||
token = auth_header.split(' ')[1]
|
||||
self.add_access_log_entry(event='rejected', user=token, ip_address=client_ip, access="Denied", server="None", nb_queued_requests_on_server=-1, error="Authentication failed")
|
||||
self.send_response(403)
|
||||
self.end_headers()
|
||||
return
|
||||
return
|
||||
url = urlparse(self.path)
|
||||
path = url.path
|
||||
get_params = parse_qs(url.query) or {}
|
||||
|
||||
|
||||
if self.command == "POST":
|
||||
content_length = int(self.headers['Content-Length'])
|
||||
post_data = self.rfile.read(content_length)
|
||||
@@ -143,20 +174,28 @@ def main():
|
||||
else:
|
||||
post_params = {}
|
||||
|
||||
|
||||
# Find the server with the lowest number of queue entries.
|
||||
min_queued_server = servers[0]
|
||||
# Find the server with the lowest number of active requests.
|
||||
min_active_server = servers[0]
|
||||
for server in servers:
|
||||
cs = server[1]
|
||||
if cs['queue'].qsize() < min_queued_server[1]['queue'].qsize():
|
||||
min_queued_server = server
|
||||
if cs['active_requests'] < min_active_server[1]['active_requests']:
|
||||
min_active_server = server
|
||||
|
||||
# Apply the queuing mechanism only for a specific endpoint.
|
||||
if path == '/api/generate' or path == '/api/chat' or path == '/v1/chat/completions':
|
||||
que = min_queued_server[1]['queue']
|
||||
cs = min_active_server[1]
|
||||
client_ip, client_port = self.client_address
|
||||
self.add_access_log_entry(event="gen_request", user=self.user, ip_address=client_ip, access="Authorized", server=min_queued_server[0], nb_queued_requests_on_server=que.qsize())
|
||||
que.put_nowait(1)
|
||||
try:
|
||||
# Try to acquire the queue slot for this request.
|
||||
cs['queue'].put_nowait(1)
|
||||
self.add_access_log_entry(event="gen_request", user=self.user, ip_address=client_ip, access="Authorized", server=min_active_server[0], nb_queued_requests_on_server=cs['active_requests'])
|
||||
except Queue.Full:
|
||||
# If the queue is full, log and return a 503 Service Unavailable response.
|
||||
self.add_access_log_entry(event="gen_error", user=self.user, ip_address=client_ip, access="Authorized", server=min_active_server[0], nb_queued_requests_on_server=cs['active_requests'], error="Queue is full")
|
||||
self.send_response(503)
|
||||
self.end_headers()
|
||||
return
|
||||
|
||||
try:
|
||||
post_data_dict = {}
|
||||
|
||||
@@ -164,22 +203,19 @@ def main():
|
||||
post_data_str = post_data.decode('utf-8')
|
||||
post_data_dict = json.loads(post_data_str)
|
||||
|
||||
response = requests.request(self.command, min_queued_server[1]['url'] + path, params=get_params, data=post_params, stream=post_data_dict.get("stream", False))
|
||||
response = requests.request(self.command, cs['url'] + path, params=get_params, data=post_params, stream=post_data_dict.get("stream", False))
|
||||
self._send_response(response)
|
||||
except Exception as ex:
|
||||
self.add_access_log_entry(event="gen_error",user=self.user, ip_address=client_ip, access="Authorized", server=min_queued_server[0], nb_queued_requests_on_server=que.qsize(),error=ex)
|
||||
finally:
|
||||
que.get_nowait()
|
||||
self.add_access_log_entry(event="gen_done",user=self.user, ip_address=client_ip, access="Authorized", server=min_queued_server[0], nb_queued_requests_on_server=que.qsize())
|
||||
cs['queue'].get_nowait()
|
||||
self.add_access_log_entry(event="gen_done", user=self.user, ip_address=client_ip, access="Authorized", server=min_active_server[0], nb_queued_requests_on_server=cs['active_requests'])
|
||||
else:
|
||||
# For other endpoints, just mirror the request.
|
||||
response = requests.request(self.command, min_queued_server[1]['url'] + path, params=get_params, data=post_params)
|
||||
response = requests.request(self.command, min_active_server[1]['url'] + path, params=get_params, data=post_params)
|
||||
self._send_response(response)
|
||||
|
||||
class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
|
||||
pass
|
||||
|
||||
|
||||
print('Starting server')
|
||||
server = ThreadedHTTPServer(('', args.port), RequestHandler) # Set the entry port here.
|
||||
print(f'Running server on port {args.port}')
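Two details of the queue handling in this file are easier to see in isolation. In the standard library, the exception raised by `put_nowait` on a full queue is `queue.Full`, a module-level class; assuming `Queue` is imported with `from queue import Queue`, the class itself has no `Full` attribute. In addition, the hunks shown here read `active_requests` but do not show where it is updated. The sketch below illustrates one way to handle both, under those assumptions; the `Backend` class and its method names are hypothetical and are not part of the repository.

```python
import queue
import threading


class Backend:
    """Tracks queue slots and in-flight requests for one backend server."""

    def __init__(self, url, queue_size=100):
        self.url = url
        self.slots = queue.Queue(maxsize=queue_size)
        self.active_requests = 0
        self._lock = threading.Lock()  # handler threads update the counter concurrently

    def try_acquire(self):
        """Reserve a slot; return False when the queue is already full."""
        try:
            self.slots.put_nowait(1)
        except queue.Full:  # module-level exception class
            return False
        with self._lock:
            self.active_requests += 1
        return True

    def release(self):
        """Free the slot taken by try_acquire()."""
        self.slots.get_nowait()
        with self._lock:
            self.active_requests -= 1


backend = Backend("http://localhost:11434", queue_size=2)
results = [backend.try_acquire() for _ in range(3)]
print(results)  # [True, True, False]
backend.release()
print(backend.active_requests)  # 1
```

A caller would answer with 503 when `try_acquire()` returns `False` and call `release()` in a `finally` block only after a successful acquire, so a rejected request never frees a slot it did not take.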
|
||||
|
43
pyproject.toml
Normal file
@@ -0,0 +1,43 @@
|
||||
[build-system]
|
||||
requires = ["setuptools", "wheel"]
|
||||
build-backend = "setuptools.build_meta"
|
||||
|
||||
[project]
|
||||
name = "ollama_proxy_server"
|
||||
version = "7.1.0"
|
||||
description = "A fastapi server for petals decentralized text generation"
|
||||
readme = { file = "README.md", content-type = "text/markdown" }
|
||||
authors = [
|
||||
{ name = "ParisNeo", email = "parisneoai@gmail.com" },
|
||||
]
|
||||
dependencies = [
|
||||
"ascii-colors>=0.11.3",
|
||||
"certifi==2024.7.4",
|
||||
"charset-normalizer==3.3.2",
|
||||
"configparser==6.0.1",
|
||||
"idna==3.6",
|
||||
"queues==0.6.3",
|
||||
"requests==2.31.0",
|
||||
"urllib3==2.2.1"
|
||||
]
|
||||
requires-python = ">=3.11"
|
||||
keywords = ["fastapi", "petals"]
|
||||
classifiers = [
|
||||
"Programming Language :: Python :: 3.11",
|
||||
"License :: OSI Approved :: Apache Software License",
|
||||
"Operating System :: OS Independent",
|
||||
]
|
||||
|
||||
[project.urls]
|
||||
Homepage = "https://github.com/ParisNeo/ollama_proxy_server"
|
||||
|
||||
[tool.setuptools.package-data]
|
||||
"*" = ["*"] # Include all package data
|
||||
|
||||
[project.scripts]
|
||||
ollama_proxy_server = "ollama_proxy_server.main:main"
|
||||
ollama_proxy_add_user = "ollama_proxy_server.add_user:main"
|
||||
|
||||
[project.optional-dependencies]
|
||||
dev = [
|
||||
]
|
7
run.sh
Normal file
@@ -0,0 +1,7 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Activate the virtual environment
|
||||
source ./venv/bin/activate
|
||||
|
||||
# Run the Python script with all passed arguments
|
||||
python ollama_proxy_server/main.py "$@"
|
192
setup_service.sh
Normal file
@@ -0,0 +1,192 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Configuration with parameters
|
||||
SERVICE_NAME="ollama-proxy-server"
|
||||
USER="ops"
|
||||
|
||||
if [ "$#" -ne 1 ]; then
|
||||
echo "Usage: $0 <working_directory>"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
WORKING_DIR=$1
|
||||
LOG_DIR="$WORKING_DIR/logs"
|
||||
SCRIPT_PATH="$WORKING_DIR/ollama-proxy-server/main.py"
|
||||
CONFIG_FILE="/etc/ops/config.ini"
|
||||
AUTHORIZED_USERS_FILE="/etc/ops/authorized_users.txt"
|
||||
|
||||
# Default port and log path; these can be customized by the user
|
||||
DEFAULT_PORT=11534
|
||||
DEFAULT_LOG_PATH="$LOG_DIR/server.log"
|
||||
|
||||
echo "Setting up Ollama Proxy Server..."
|
||||
|
||||
# Create dedicated user if it doesn't exist already
|
||||
if ! id "$USER" &>/dev/null; then
|
||||
echo "Creating user $USER..."
|
||||
sudo useradd -r -s /bin/false "$USER"
|
||||
fi
|
||||
|
||||
# Ensure the working directory is writable by the dedicated user
|
||||
sudo mkdir -p "$WORKING_DIR"
|
||||
sudo cp -r * "$WORKING_DIR/"
|
||||
sudo chown -R "$USER:$USER" "$WORKING_DIR"
|
||||
|
||||
# Set permissions for logs and reports directories
|
||||
echo "Setting up directories and files..."
|
||||
sudo mkdir -p "$LOG_DIR"
|
||||
sudo mkdir -p "$WORKING_DIR/reports"
|
||||
sudo chown -R "$USER:$USER" "$LOG_DIR"
|
||||
|
||||
# Create systemd service file
|
||||
echo "Creating systemd service..."
|
||||
|
||||
read -p "Enter the port number (default: $DEFAULT_PORT): " PORT
|
||||
PORT=${PORT:-$DEFAULT_PORT}
|
||||
|
||||
read -p "Enter the log path (default: $DEFAULT_LOG_PATH): " LOG_PATH
|
||||
LOG_PATH=${LOG_PATH:-$DEFAULT_LOG_PATH}
|
||||
|
||||
sudo tee /etc/systemd/system/$SERVICE_NAME.service > /dev/null << EOF
|
||||
[Unit]
|
||||
Description=Ollama Proxy Server
|
||||
After=network.target
|
||||
Wants=network.target
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
User=$USER
|
||||
Group=$USER
|
||||
WorkingDirectory=$WORKING_DIR
|
||||
ExecStart=/bin/bash $WORKING_DIR/run.sh --log_path $LOG_PATH --port $PORT --config $CONFIG_FILE --users_list $AUTHORIZED_USERS_FILE
|
||||
Restart=always
|
||||
RestartSec=10
|
||||
StandardOutput=journal
|
||||
StandardError=journal
|
||||
|
||||
# Environment
|
||||
Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
|
||||
Environment=PYTHONUNBUFFERED=1
|
||||
|
||||
# Security settings
|
||||
NoNewPrivileges=true
|
||||
PrivateTmp=true
|
||||
ProtectHome=true
|
||||
ProtectSystem=strict
|
||||
ReadWritePaths=$WORKING_DIR $LOG_DIR
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
EOF
|
||||
|
||||
# Install Python dependencies with proper permissions and environment variables preserved
|
||||
echo "Installing Python dependencies..."
|
||||
sudo -u "$USER" python3 -m venv $WORKING_DIR/venv
|
||||
sudo chown -R "$USER:$USER" $WORKING_DIR/venv
|
||||
|
||||
# Activate the virtual environment and install dependencies as user without --user flag
|
||||
echo "Activating virtualenv and installing Python packages..."
|
||||
sudo -H -u "$USER" bash << EOF
|
||||
source $WORKING_DIR/venv/bin/activate && pip install --no-cache-dir $WORKING_DIR
|
||||
EOF
|
||||
|
||||
# Create logrotate config
|
||||
echo "Setting up log rotation..."
|
||||
sudo tee /etc/logrotate.d/$SERVICE_NAME > /dev/null << EOF
|
||||
$LOG_DIR/*.log {
|
||||
daily
|
||||
rotate 15
|
||||
compress
|
||||
delaycompress
|
||||
missingok
|
||||
notifempty
|
||||
create 644 $USER $USER
|
||||
postrotate
|
||||
systemctl reload-or-restart $SERVICE_NAME
|
||||
endscript
|
||||
}
|
||||
EOF
|
||||
|
||||
# Create and populate config.ini and authorized_users.txt files
|
||||
echo "Creating configuration files..."
|
||||
sudo mkdir -p /etc/ops
|
||||
sudo tee $CONFIG_FILE > /dev/null << EOF
|
||||
[DefaultServer]
|
||||
url = http://localhost:11434
|
||||
EOF
|
||||
sudo chown $USER:$USER $CONFIG_FILE
|
||||
|
||||
echo "Adding authorized users to the list. Type 'done' when finished."
|
||||
while true; do
|
||||
read -p "Enter user:password or type 'done': " input
|
||||
if [ "$input" == "done" ]; then
|
||||
break
|
||||
fi
|
||||
echo "$input" | sudo tee -a $AUTHORIZED_USERS_FILE > /dev/null
|
||||
sudo chown $USER:$USER $AUTHORIZED_USERS_FILE
|
||||
done
|
||||
|
||||
echo "You can add more users to the authorized_users.txt file if needed."
|
||||
|
||||
# Create ops command script
|
||||
echo "Creating 'ops' command..."
|
||||
sudo tee /usr/local/bin/ops > /dev/null << 'EOF'
|
||||
#!/bin/bash
|
||||
|
||||
# Define usage function to display help message
|
||||
usage() {
|
||||
echo "Usage: $0 add_user username:password"
|
||||
exit 1
|
||||
}
|
||||
|
||||
# Check that exactly two arguments are provided and the first is 'add_user'
|
||||
if [ "$#" -ne 2 ] || [ "$1" != "add_user" ]; then
|
||||
usage
|
||||
fi
|
||||
|
||||
USER_PAIR="$2"
|
||||
|
||||
# Extract the user and password from the input
|
||||
IFS=':' read -r USER PASSWORD <<< "$USER_PAIR"
|
||||
if [ -z "$USER" ] || [ -z "$PASSWORD" ]; then
|
||||
echo "Invalid username:password format."
|
||||
usage
|
||||
fi
|
||||
|
||||
AUTHORIZED_USERS_FILE="/etc/ops/authorized_users.txt"
|
||||
|
||||
# Check if the authorized_users file exists, create it otherwise
|
||||
sudo mkdir -p /etc/ops
|
||||
if [ ! -f "$AUTHORIZED_USERS_FILE" ]; then
|
||||
sudo touch $AUTHORIZED_USERS_FILE
|
||||
fi
|
||||
|
||||
# Append the new user:password pair to the file
|
||||
echo "$USER:$PASSWORD" | sudo tee -a $AUTHORIZED_USERS_FILE > /dev/null
|
||||
|
||||
# Ensure correct permissions for the file
|
||||
sudo chown ops:ops $AUTHORIZED_USERS_FILE
|
||||
|
||||
echo "User '$USER' added successfully."
|
||||
EOF
|
||||
|
||||
# Make ops command executable
|
||||
sudo chmod +x /usr/local/bin/ops
|
||||
|
||||
# Reload systemd and enable service
|
||||
echo "Enabling service..."
|
||||
sudo systemctl daemon-reload
|
||||
sudo systemctl enable "$SERVICE_NAME"
|
||||
|
||||
echo "Service setup complete!"
|
||||
echo ""
|
||||
echo "Commands:"
|
||||
echo " Start: sudo systemctl start $SERVICE_NAME"
|
||||
echo " Stop: sudo systemctl stop $SERVICE_NAME"
|
||||
echo " Status: sudo systemctl status $SERVICE_NAME"
|
||||
echo " Logs: sudo journalctl -u $SERVICE_NAME -f"
|
||||
echo " Reports: ls $WORKING_DIR/reports/"
|
||||
|
||||
echo ""
|
||||
echo "How to use the new 'ops' command:"
|
||||
echo " To add a user, run: ops add_user username:password"
|