Tutorial: Deploy on-premises server (via Docker)

Welcome to the tutorial on deploying on-premises BeeCR AI and API servers. This guide will walk you through the process of downloading and running these servers on your own infrastructure.

📄 Prerequisites

To deploy the solution on a server, the following is required:

A server based on Linux (ideally Ubuntu or Debian).
A graphics card with at least 24 GB of memory.
NVIDIA driver.
Hint: You can confirm that the driver is installed and working by running nvidia-smi (the command should list the available GPUs).
Docker along with Docker Compose.
NVIDIA Container Toolkit.
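The checklist above can be verified with a small script before you continue. The following is a minimal sketch, not an official BeeCR tool: the function names have and check_prerequisites are made up for this example, and it assumes the NVIDIA Container Toolkit is detectable through its nvidia-ctk command-line tool. It only checks that the tools are on PATH, not that the GPU has enough memory.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a command is available on PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: print OK/MISSING for each tool the tutorial relies on.
# The return code is the number of missing tools (0 means all were found).
check_prerequisites() {
  missing=0
  for tool in nvidia-smi docker nvidia-ctk; do
    if have "$tool"; then
      echo "OK       $tool"
    else
      echo "MISSING  $tool"
      missing=$((missing + 1))
    fi
  done
  # Docker Compose may be a standalone binary ("docker-compose")
  # or a Docker CLI plugin ("docker compose").
  if have docker-compose || { have docker && docker compose version >/dev/null 2>&1; }; then
    echo "OK       docker compose"
  else
    echo "MISSING  docker compose"
    missing=$((missing + 1))
  fi
  return "$missing"
}
```

After defining the functions, run check_prerequisites and install whatever is reported as MISSING before moving on.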

1️⃣ Get a Docker Compose file

Download the Docker Compose configuration file to a convenient location:

curl -fsSLO 'https://docs.beecr.ai/on-premises/compose.yml'

2️⃣ Prepare a license

This tutorial assumes that you have a license file issued by our team. If you do not have one yet, please contact us to obtain or purchase a BeeCR license.

Once you have obtained the license file, please place it in the same directory where you downloaded compose.yml in the previous step.
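A quick way to confirm the layout is to check that both files sit side by side. This is a sketch under one assumption: the license file is named beecr.lic, which is the default path the provided compose.yml expects (see step 4). The function name check_license is hypothetical.

```shell
# Hypothetical helper: verify that compose.yml and the license file
# exist and are non-empty in the given directory (default: current one).
check_license() {
  dir=${1:-.}
  for f in compose.yml beecr.lic; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      return 1
    fi
  done
  echo "found compose.yml and beecr.lic in $dir"
}
```

Run check_license in the download directory. If you keep the license elsewhere, adjust the file name or skip this check.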

3️⃣ Pull Docker images (optional)

If your target server has an Internet connection, then you may skip this step because Docker will pull images automatically. Otherwise, you may want to pull and save images on a computer with an Internet connection, transfer them to the target server, and load them.

To pull the latest images on a computer with an Internet connection and save them:

docker pull "cr.yandex/crpnppdjigsa1876k1eq/beecr/ai"
docker pull "cr.yandex/crpnppdjigsa1876k1eq/beecr/api"
docker save "cr.yandex/crpnppdjigsa1876k1eq/beecr/ai" -o "beecr-ai.tar"
docker save "cr.yandex/crpnppdjigsa1876k1eq/beecr/api" -o "beecr-api.tar"

Transfer the image archives using a convenient method, such as the scp command (the trailing colon makes scp copy the files to the remote user's home directory):

scp beecr*.tar USER@SERVER:

On the target server, load the Docker images from the archives:

docker load -i "beecr-ai.tar"
docker load -i "beecr-api.tar"
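The image archives are large, so it can be worth confirming that they arrived intact before loading them. The sketch below assumes GNU coreutils (sha256sum) on both machines. The checksum file name beecr-images.sha256 and both function names are made up for this example.

```shell
# Hypothetical helper, run on the machine with Internet access:
# record checksums of the archives in the given directory.
make_checksums() {
  ( cd "$1" && sha256sum beecr-*.tar > beecr-images.sha256 )
}

# Hypothetical helper, run on the target server:
# verify the archives against the recorded checksums.
verify_checksums() {
  ( cd "$1" && sha256sum -c beecr-images.sha256 )
}
```

Call make_checksums . before the transfer, copy beecr-images.sha256 together with the archives, and call verify_checksums . on the target server before docker load.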

4️⃣ Edit Docker Compose configuration (optional)

The default options set in the provided compose.yml satisfy the needs of most users. However, you may wish to adjust some settings.

For example, by default, the provided Docker Compose configuration exposes the API on port 8000. If you wish to use another port, please make changes in the relevant section of compose.yml.

Another example of an option you may want to change is the path to the license file. As mentioned above, the expected location for this file is the same directory as the one containing compose.yml. However, if you wish to keep it somewhere else, simply change ./beecr.lic to the appropriate path on your file system or Docker volume.
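If you prefer to script the port change, something like the following may work. It is only a sketch and rests on an assumption about the file's contents: that compose.yml publishes the API with a mapping of the form HOST:80, such as "8000:80" (the API logs later in this tutorial show the server binding to port 80 inside the container). Open the file and confirm the actual mapping first. The function name set_host_port is hypothetical.

```shell
# Hypothetical helper: replace the host side of a "HOST:80" port mapping.
# A backup of the original file is kept as <file>.bak.
set_host_port() {
  compose_file=$1
  new_port=$2
  sed -i.bak -E "s/[0-9]+:80([^0-9]|\$)/${new_port}:80\\1/" "$compose_file"
}

# Usage (assumes the mapping described above):
#   set_host_port compose.yml 9000
```

Compare compose.yml with compose.yml.bak afterwards to make sure only the intended line changed.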

Links:

If you need more information about the provided configuration, please refer to the documentation for Code Review Docker Compose configuration.
If you are not familiar with Docker Compose, consider reading "Getting started" and the Docker Compose documentation.
Here is the relevant Docker documentation on volumes and bind mounts: Volumes, Bind Mounts.

5️⃣ Run containers

Run the following command in the directory containing compose.yml to start the AI and API containers in detached mode (with the Docker Compose v2 plugin, the equivalent command is docker compose up -d):

docker-compose up -d
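Before reading the logs, you can confirm that both containers are actually up. This sketch assumes the containers are named beecr-api and beecr-ai, the names used by the docker logs commands in the next step. The function names are hypothetical.

```shell
# Hypothetical helper: succeed if the named container exists and is running.
is_running() {
  [ "$(docker inspect -f '{{.State.Running}}' "$1" 2>/dev/null)" = "true" ]
}

# Hypothetical helper: report the state of both BeeCR containers.
# Returns non-zero if at least one of them is not running.
check_containers() {
  status=0
  for name in beecr-api beecr-ai; do
    if is_running "$name"; then
      echo "running: $name"
    else
      echo "NOT running: $name"
      status=1
    fi
  done
  return "$status"
}
```

If check_containers reports a container as not running, its logs (next step) are the first place to look.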

6️⃣ Check the logs

Once you have started the containers, it is a good idea to check the logs to ensure everything is functioning correctly.

Check the logs of the container with the BeeCR API server:

docker logs "beecr-api"

The output should resemble the following:

...
{"loglevel": "info", "workers": 4, "bind": "0.0.0.0:80", "graceful_timeout": 120, "timeout": 3600, "keepalive": 5, "errorlog": "-", "accesslog": "-", "workers_per_core": 1.0, "use_max_workers": 4, "host": "0.0.0.0", "port": "80"}
🟢 License check passed. License info: GPU based license.
INFO  [BeeCR] Starting gunicorn 20.1.0
INFO  [BeeCR] Listening at: http://0.0.0.0:80 (1)
INFO  [BeeCR] Using worker: app.gunicorn_conf.MyUvicornWorker
INFO  [BeeCR] Booting worker with pid: 10
INFO  [BeeCR] Booting worker with pid: 12
...

Pay close attention to the logs related to the license check. If everything is working correctly, you should see a 🟢 License check passed log message with additional details about the license.
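The same check can be scripted by searching the log output for the message quoted above. This is a minimal sketch: the function name license_check_passed is made up, and the match relies on the message text shown in the sample output, which may change between versions.

```shell
# Hypothetical helper: read log text on stdin and succeed
# if it contains the "License check passed" message.
license_check_passed() {
  grep -qF 'License check passed'
}

# Usage:
#   docker logs "beecr-api" 2>&1 | license_check_passed && echo "license OK"
```

The 2>&1 redirection matters because docker logs replays the container's stderr stream on stderr.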

Also, check the logs of the container that serves the AI model:

docker logs "beecr-ai"

Seeing the GPU name in the logs is a good sign: it indicates that the AI server has detected the GPU and is using it.

...
time=2024-06-06T15:49:27.338Z level=INFO source=types.go:98 msg="inference compute" id=GPU-96b9ff66-b2f6-e4c5-123a-eac0416a2789 library=cuda compute=8.9 driver=12.2 name="NVIDIA GeForce RTX 4090" total="23.6 GiB" available="23.3 GiB"
...
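To pull just the GPU name out of a long log, you can filter for the line shown above. The sketch assumes the log line keeps the format from the sample (an msg="inference compute" entry followed by a name="..." field). The function name gpu_names is hypothetical.

```shell
# Hypothetical helper: read AI server logs on stdin and print the GPU
# name from every "inference compute" line.
gpu_names() {
  sed -n 's/.*msg="inference compute".*name="\([^"]*\)".*/\1/p'
}

# Usage:
#   docker logs "beecr-ai" 2>&1 | gpu_names
```

Empty output means no such line was found, which may indicate that the container does not see the GPU.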

7️⃣ Check if the API server is responding

Lastly, let's verify that the API server is responding by sending a GET HTTP request to the /version endpoint.

⚠️ Note: It is advisable to execute this request from outside the server to confirm that your firewall (if configured) permits incoming requests.

Use the following command to send the request:

curl 'http://SERVER:8000/version'

Replace SERVER:8000 in the example above with the address and port of your target server.

If everything is functioning correctly, you should receive a text response displaying the API server's version, for example:

v1.27.1
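The API may need a little time after startup before it answers, so a single request can fail even when the deployment is fine. The sketch below retries the request a few times. The function name retry is made up, and the attempt count and delay in the usage line are arbitrary values, not BeeCR recommendations.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts. Succeeds as soon as the command succeeds.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage (replace SERVER:8000 with your address and port):
#   retry 30 2 curl -fsS 'http://SERVER:8000/version'
```

The -f flag makes curl return a non-zero exit code on HTTP errors, so the loop keeps retrying until the endpoint returns a successful response.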

👍 Conclusion

You have now deployed the BeeCR AI and API servers on-premises. Along the way you set up the required environment, obtained a license, pulled the Docker images, adjusted the Docker Compose settings, started the containers, and checked the logs to confirm that the servers are running as expected. If you run into any issues or have further questions, feel free to reach out for assistance. Thank you for following along with this tutorial.