Using an LLM in Godot (llama.cpp)

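Addendum

I have written an article on using llamafile instead of llama.cpp.

It is probably easier than building llama.cpp, so I recommend that approach.

I have also written about displaying the output in real time (incrementally). That article probably supersedes this one.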

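Introduction

Have you ever wanted to make a game that uses an LLM?

You can do so by calling an API such as ChatGPT or Gemini.
But then the game cannot run offline, and distributing it freely becomes difficult.

So in this article I look at using llama.cpp to run a local LLM from Godot.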

What this article does not do

godot-cpp

As the name suggests, llama.cpp is written in C++.

GitHub - ggerganov/llama.cpp: LLM inference in C/C++

Godot, in turn, has godot-cpp, a set of C++ bindings for writing extensions.

GitHub - godotengine/godot-cpp: C++ bindings for the Godot script API

Combining the two, it is possible in theory to call llama.cpp from a Godot object.

In fact, a quick search shows that someone has already tried this.

GitHub - opyate/godot-llm-experiment: Getting an LLM to work with Godot.

However, to use it on Windows you have to write the SConstruct file yourself.
On top of that, building godot-cpp is quite a hassle, and combining the two produced a stream of build errors.

godot-python

Looking for another approach, I found Python bindings for Godot.


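Combining llama-cpp-python with the godot-python bindings above might make it possible to drive llama.cpp from Python and use that from Godot.

However, development appears to have stalled: the master branch still targets Godot 3.
Two branches for Godot 4 exist, godot4 and godot-meson, but I could not get either of them to work.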

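What this article does

After trying various approaches, the sticking point turned out to be using Godot from another language.
Compiling llama.cpp on its own, by contrast, has been done by many people before, so it is manageable without help.

So this time I compile llama.cpp and call the resulting executable (main.exe) from GDScript.
This is easy to implement, and requests such as changing the program or swapping the model can be handled without recompiling.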

Step1. Building llama.cpp

I used Microsoft Visual Studio 2022 Build Tools for the build.

Visual Studio Tools downloads - free installers for Windows, Mac, and Linux

Clone llama.cpp into a directory of your choice.

git clone https://github.com/ggerganov/llama.cpp

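Keep this directory separate from the Godot project directory. If it sits inside the project, Godot's resource loading becomes slower.

Move into the clone and compile. CMake is used so that the build outputs end up in one place.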

cd llama.cpp
mkdir build
cd build
cmake ..
cmake --build . --config Release

Install the llama.cpp build output into the Godot project directory. My project directory is E:\godot_projects\godot_llama_test.

 cmake --install . --prefix E:\godot_projects\godot_llama_test\llama_cpp

This copies the llama.cpp executables into the Godot project directory.
Only main.exe is used here, so copying just that file is also fine.

Step2. Placing the model

Download a model file and put it in place.
I used E:\godot_projects\godot_llama_test\llama_cpp\models as the model directory.

The model used here is the Q4_K_M quantization of phi-2.

TheBloke/phi-2-GGUF at main

Create the directory E:\godot_projects\godot_llama_test\llama_cpp\models and put the downloaded gguf file in it.

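Step3. Calling llama.cpp from GDScript

Create any scene in Godot and attach a script to it.

OS.execute runs an external program, and here it is used to run main.exe. The call blocks until the process exits.
The destination for the output is passed as an Array argument; the process's standard output is appended to it as a single String.
The command-line arguments for the executable are passed as a PackedStringArray.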

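extends Node2D

# Called when the node enters the scene tree for the first time.
func _ready():
	var prompt := 'What is best of movie?'
	# Path to the gguf file placed in Step2 (adjust if you stored it elsewhere).
	var model_path := 'llama_cpp/models/phi-2.Q4_K_M.gguf'
	# Command-line arguments passed to main.exe.
	var arguments : PackedStringArray = [
		'-m',
		model_path,
		'-p',
		prompt,
		'-n',
		'10'
	]
	# OS.execute appends the process's standard output to this array.
	var output := []
	OS.execute('llama_cpp/bin/main.exe', arguments, output)
	print(output)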

Argument reference
– -m : path to the model file
– -p : input prompt for the LLM
– -n : maximum number of tokens to generate
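The call above relies on relative paths and ignores the value returned by OS.execute. Below is a minimal sketch of a more defensive variant, assuming Godot 4, that the project is run from the editor, and that the files sit where Step1 and Step2 put them. The function name run_llama and its parameters are hypothetical and not part of the original script.

```gdscript
extends Node2D

# Hypothetical helper (not from the original article): runs main.exe and
# returns the raw text it printed, or "" when something goes wrong.
func run_llama(prompt: String, max_tokens: int = 10) -> String:
	# Assumed layout from Step1/Step2. globalize_path turns res:// paths into
	# absolute OS paths, so the call does not depend on the working directory.
	var exe_path := ProjectSettings.globalize_path("res://llama_cpp/bin/main.exe")
	var model_path := ProjectSettings.globalize_path("res://llama_cpp/models/phi-2.Q4_K_M.gguf")
	if not FileAccess.file_exists(exe_path) or not FileAccess.file_exists(model_path):
		push_error("main.exe or the model file was not found")
		return ""
	var arguments := PackedStringArray(["-m", model_path, "-p", prompt, "-n", str(max_tokens)])
	var output := []
	# OS.execute blocks until the process exits and returns its exit code,
	# or -1 if the process could not be started.
	var exit_code := OS.execute(exe_path, arguments, output)
	if exit_code != 0 or output.is_empty():
		push_error("llama.cpp exited with code %d" % exit_code)
		return ""
	return str(output[0])

func _ready():
	print(run_llama("What is best of movie?"))
```

Because OS.execute blocks, the scene freezes while the model is generating. That is acceptable for a quick test like this one, but worth keeping in mind for an actual game.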

Running the scene prints

["What is best of movie?\r\nHow can I increase my energy level?\r\nHow do you make a girl laugh when you\'re in a relationship?\r\nIs a good marriage all about equality?\r\nWhat is an example of a good relationship?\r\nWhat are the key ingredients of"]

to the output log, so the LLM ran successfully from within Godot.
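As the log shows, the returned string begins with the prompt itself, followed by the generated text. A minimal sketch for keeping only the generated part is below, assuming the output always starts with the exact prompt as it does in the log above. strip_prompt is a hypothetical helper name, and the sample string in _ready is a shortened copy of that log.

```gdscript
extends Node2D

# Hypothetical helper: removes the echoed prompt from the start of the raw
# output and trims the surrounding whitespace (including "\r\n").
func strip_prompt(raw: String, prompt: String) -> String:
	var text := raw.strip_edges()
	if text.begins_with(prompt):
		text = text.substr(prompt.length())
	return text.strip_edges()

func _ready():
	var prompt := "What is best of movie?"
	# Shortened copy of the output shown above, used here instead of a real run.
	var raw := "What is best of movie?\r\nHow can I increase my energy level?"
	print(strip_prompt(raw, prompt))
```

If the prompt is not found at the start, the function returns the trimmed text unchanged, so it is safe to call on output that does not echo the prompt.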