{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Retriever Evaluation with MLflow"
]
},
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "f8938de9-7fae-41cd-ad6b-7ee26c288eab",
"showTitle": false,
"title": ""
}
},
"source": [
"In MLflow 2.8.0, we introduced a new model type \"retriever\" to the `mlflow.evaluate()` API. It helps you to evaluate the retriever in a RAG application. It contains two built-in metrics `precision_at_k` and `recall_at_k`. In MLflow 2.9.0, `ndcg_at_k` is available.\n",
"\n",
"This notebook illustrates how to use `mlflow.evaluate()` to evaluate the retriever in a RAG application. It has the following steps:\n",
"\n",
"* Step 1: Install and Load Packages\n",
"* Step 2: Evaluation Dataset Preparation\n",
"* Step 3: Calling `mlflow.evaluate()`\n",
"* Step 4: Result Analysis and Visualization"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 1: Install and Load Packages"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "5bf12edb-2498-4edd-aeff-b4844451850f",
"showTitle": false,
"title": ""
}
},
"outputs": [
],
"source": [
"# !pip install mlflow==2.9.0 langchain==0.0.339 openai faiss-cpu gensim nltk pyLDAvis tiktoken\n",
"# !pip install faiss-cpu gensim nltk pyLDAvis tiktoken\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "414eb948-7f7a-411b-8308-facadb0bdde8",
"showTitle": false,
"title": ""
}
},
"outputs": [],
"source": [
"import ast\n",
"import os\n",
"import pprint\n",
"from typing import List\n",
"\n",
"import pandas as pd\n",
"from langchain.docstore.document import Document\n",
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import FAISS\n",
"\n",
"import mlflow\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"key-here\"\n",
"\n",
"CHUNK_SIZE = 100 # WARNING! MAKE SURE ITS SAME. CHUNK_SIZE = 100 # Warning : Change in other notebook as well. # IRRELEVANT SINCE THE ALREAYD BUILT DATABAES IS USED\n",
"\n",
"# Assume running from https://github.com/mlflow/mlflow/blob/master/examples/llms/rag\n",
"OUTPUT_DF_PATH = \"/projects/bcjp/marshad/agllm/agllm-data/evaluation/question_answer_source_agllm.csv\"\n",
"SCRAPPED_DOCS_PATH = \"None\"\n",
"EVALUATION_DATASET_PATH = \"/projects/bcjp/marshad/agllm/agllm-data/evaluation/static_evaluation_dataset.csv\" # This is where/how the dataset will be saved when it is prepared for evaluation\n",
"DB_PERSIST_DIR = \"faiss_index\""
]
},
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "eebcf0d9-6634-47d9-808d-e79c5a50fbbf",
"showTitle": false,
"title": ""
}
},
"source": [
"## Step 2: Evaluation Dataset Preparation\n",
"The evaluation dataset should contain three columns: questions, ground truth doc IDs, retrieved relevant doc IDs. A \"doc ID\" is a unique string identifier of the documents in you RAG application. For example, it could be the URL of a documentation web page, or the file path of a PDF document.\n",
"\n",
"If you have a list of questions that you would like to evaluate, please see 1.1 Manual Preparation. If you do not have a question list yet, please see 1.2 Generate the Evaluation Dataset.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "f8a690cc-7672-4f24-8518-8faabfc9afea",
"showTitle": false,
"title": ""
}
},
"source": [
"### Manual Preparation\n",
"\n",
"When evaluating a retriever, it's recommended to save the retrieved document IDs into a static dataset represented by a Pandas Dataframe or an MLflow Pandas Dataset containing the input queries, retrieved relevant document IDs, and the ground-truth document IDs for the evaluation.\n",
"\n",
"#### Concepts\n",
"\n",
"A \"document ID\" is a string that identifies a document.\n",
"\n",
"A list of \"retrieved relevant document IDs\" are the output of the retriever for a specific input query and a `k` value.\n",
"\n",
"A list of \"ground-truth document IDs\" are the labeled relevant documents for a specific input query.\n",
"\n",
"#### Expected Data Format\n",
"\n",
"For each row, the retrieved relevant document IDs and the ground-truth relevant document IDs should be provided as a tuple of document ID strings.\n",
"\n",
"The column name of the retrieved relevant document IDs should be specified by the `predictions` parameter, and the column name of the ground-truth relevant document IDs should be specified by the `targets` parameter.\n",
"\n",
"Here is a simple example dataset that illustrates the expected data format. The doc IDs are the paths of the documentation pages."
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "1a61b1b2-582e-49d5-864d-b58d2b6c3392",
"showTitle": false,
"title": ""
}
},
"outputs": [],
"source": [
"# # THIS IS NO BEING USED INSTEAD : QUESTION_ANSWER_SOURCE (GENERATED FROM LAST NOTEBOOK IS BEING USED)\n",
"# data = pd.DataFrame(\n",
"# {\n",
"# \"questions\": [\n",
"# \"What is MLflow?\",\n",
"# \"What is Databricks?\",\n",
"# \"How to serve a model on Databricks?\",\n",
"# \"How to enable MLflow Autologging for my workspace by default?\",\n",
"# ],\n",
"# \"retrieved_context\": [\n",
"# [\n",
"# \"mlflow/index.html\",\n",
"# \"mlflow/quick-start.html\",\n",
"# ],\n",
"# [\n",
"# \"introduction/index.html\",\n",
"# \"getting-started/overview.html\",\n",
"# ],\n",
"# [\n",
"# \"machine-learning/model-serving/index.html\",\n",
"# \"machine-learning/model-serving/model-serving-intro.html\",\n",
"# ],\n",
"# [],\n",
"# ],\n",
"# \"ground_truth_context\": [\n",
"# [\"mlflow/index.html\"],\n",
"# [\"introduction/index.html\"],\n",
"# [\n",
"# \"machine-learning/model-serving/index.html\",\n",
"# \"machine-learning/model-serving/llm-optimized-model-serving.html\",\n",
"# ],\n",
"# [\"mlflow/databricks-autologging.html\"],\n",
"# ],\n",
"# }\n",
"# )"
]
},
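{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before calling `mlflow.evaluate()`, it can help to confirm that both columns hold lists or tuples of doc ID strings. The following is a minimal sketch, not part of MLflow: `is_doc_id_sequence` is a hypothetical helper name, and the toy values are made up for illustration.\n",
"\n",
"```python\n",
"from typing import Any\n",
"\n",
"\n",
"def is_doc_id_sequence(value: Any) -> bool:\n",
"    # Hypothetical helper (not an MLflow API): True for a list or tuple of strings.\n",
"    if not isinstance(value, (list, tuple)):\n",
"        return False\n",
"    return all(isinstance(item, str) for item in value)\n",
"\n",
"\n",
"# Toy rows in the expected format; an empty list means nothing was retrieved.\n",
"retrieved = [[\"mlflow/index.html\", \"mlflow/quick-start.html\"], []]\n",
"# A common mistake: a list serialized to a string, e.g. after a CSV round trip.\n",
"serialized = \"['mlflow/index.html']\"\n",
"\n",
"print(all(is_doc_id_sequence(row) for row in retrieved))  # True\n",
"print(is_doc_id_sequence(serialized))  # False\n",
"```\n",
"\n",
"With a DataFrame, the same check can be applied per column, for example `data[\"source\"].apply(is_doc_id_sequence).all()`."
]
},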
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "f740b47c-71ee-4633-944c-172887ff5081",
"showTitle": false,
"title": ""
}
},
"source": [
"### Generate the Evaluation Dataset\n",
"There are two steps to generate the evaluation dataset: generate questions with ground truth doc IDs and retrieve relevant doc IDs. "
]
},
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "f6beddae-85e2-44e7-8ec6-7ca2f02bc16b",
"showTitle": false,
"title": ""
}
},
"source": [
"\n",
"#### Generate Questions with Ground Truth Doc IDs\n",
"If you don't have a list of questions to evaluate, you can generate them using LLMs. The [Question Generation Notebook](https://mlflow.org/docs/latest/llms/rag/notebooks/question-generation-retrieval-evaluation.html) provides an example way to do it. Here is the result of running that notebook."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "98bf55c7-3e58-4fff-bc0e-1af58d64839f",
"showTitle": false,
"title": ""
}
},
"outputs": [],
"source": [
"generated_df = pd.read_csv(OUTPUT_DF_PATH)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "17baa097-457f-46df-9e25-56061972785f",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
question
\n",
"
answer
\n",
"
chunk
\n",
"
chunk_id
\n",
"
source
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
When does peak adult flight of stalk borer occur?
\n",
"
Peak adult flight of stalk borer occurs during...
\n",
"
11/10/23, 9:03 AM\\nStart Scouting for Stalk Bo...
\n",
"
0
\n",
"
agllm-data/Start Scouting for Stalk Borer _ In...
\n",
"
\n",
"
\n",
"
1
\n",
"
What are the distinguishing features of stalk ...
\n",
"
Stalk borer larvae have three pairs of true le...
\n",
"
Description. Stalk borer larvae have three pai...
\n",
"
0
\n",
"
agllm-data/Start Scouting for Stalk Borer _ In...
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" question \\\n",
"0 When does peak adult flight of stalk borer occur? \n",
"1 What are the distinguishing features of stalk ... \n",
"\n",
" answer \\\n",
"0 Peak adult flight of stalk borer occurs during... \n",
"1 Stalk borer larvae have three pairs of true le... \n",
"\n",
" chunk chunk_id \\\n",
"0 11/10/23, 9:03 AM\\nStart Scouting for Stalk Bo... 0 \n",
"1 Description. Stalk borer larvae have three pai... 0 \n",
"\n",
" source \n",
"0 agllm-data/Start Scouting for Stalk Borer _ In... \n",
"1 agllm-data/Start Scouting for Stalk Borer _ In... "
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"generated_df.head(3)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "93165dc5-aff9-46f9-83ab-e6dbfcbbc32b",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
question
\n",
"
source
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
When does peak adult flight of stalk borer occur?
\n",
"
[agllm-data/Start Scouting for Stalk Borer _ I...
\n",
"
\n",
"
\n",
"
1
\n",
"
What are the distinguishing features of stalk ...
\n",
"
[agllm-data/Start Scouting for Stalk Borer _ I...
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" question \\\n",
"0 When does peak adult flight of stalk borer occur? \n",
"1 What are the distinguishing features of stalk ... \n",
"\n",
" source \n",
"0 [agllm-data/Start Scouting for Stalk Borer _ I... \n",
"1 [agllm-data/Start Scouting for Stalk Borer _ I... "
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Prepare dataframe `data` with the required format\n",
"data = pd.DataFrame({})\n",
"data[\"question\"] = generated_df[\"question\"].copy(deep=True)\n",
"data[\"source\"] = generated_df[\"source\"].apply(lambda x: [x])\n",
"data.head(3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "3eabe651-28be-45bb-94ad-58e6bc582137",
"showTitle": false,
"title": ""
}
},
"source": [
"#### Retrieve Relevant Doc IDs\n",
"\n",
"Once we have a list of questions with ground truth doc IDs from 1.1, we can collect the retrieved relevant doc IDs. In this tutorial, we use a LangChain retriever. You can plug in your own retriever as needed."
]
},
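{
"cell_type": "markdown",
"metadata": {},
"source": [
"Any retriever can be used as long as it can be wrapped in a function that maps a question string to a list of doc ID strings. As a minimal sketch of that contract, the toy retriever below ranks documents by word overlap with the question. The corpus, the name `toy_retrieve_doc_ids`, and the scoring rule are illustrative assumptions, not the LangChain retriever used in this notebook.\n",
"\n",
"```python\n",
"from typing import Dict, List\n",
"\n",
"# Hypothetical two-document corpus keyed by doc ID.\n",
"TOY_CORPUS: Dict[str, str] = {\n",
"    \"docs/stalk-borer.pdf\": \"stalk borer larvae damage corn plants\",\n",
"    \"docs/moths.pdf\": \"fuzzy brown moths are abundant this week\",\n",
"}\n",
"\n",
"\n",
"def toy_retrieve_doc_ids(question: str, k: int = 2) -> List[str]:\n",
"    # Return up to k doc IDs ranked by word overlap with the question.\n",
"    question_words = set(question.lower().split())\n",
"    scored = []\n",
"    for doc_id, text in TOY_CORPUS.items():\n",
"        overlap = len(question_words & set(text.split()))\n",
"        if overlap > 0:\n",
"            scored.append((overlap, doc_id))\n",
"    # Highest overlap first; ties are broken by doc ID for a stable order.\n",
"    scored.sort(key=lambda pair: (-pair[0], pair[1]))\n",
"    return [doc_id for _, doc_id in scored[:k]]\n",
"\n",
"\n",
"print(toy_retrieve_doc_ids(\"what do stalk borer larvae eat\"))\n",
"# ['docs/stalk-borer.pdf']\n",
"```\n",
"\n",
"A function with this shape can be applied to the `question` column in the same way as the retriever function defined later in this section."
]
},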
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "9817f671-f2fd-4b2e-abe9-3bc9afd9ce3c",
"showTitle": false,
"title": ""
}
},
"source": [
"First, we build a FAISS retriever from the docs saved at https://github.com/mlflow/mlflow/blob/master/examples/llms/question_generation/mlflow_docs_scraped.csv. See the [Question Generation Notebook](https://mlflow.org/docs/latest/llms/rag/notebooks/question-generation-retrieval-evaluation.html) for how to create this csv file."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "178b45b4-11f9-47ca-9564-c8caa32d2504",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/u/marshad/.conda/envs/agllm-env1/lib/python3.9/site-packages/langchain_core/_api/deprecation.py:117: LangChainDeprecationWarning: The class `langchain_community.embeddings.openai.OpenAIEmbeddings` was deprecated in langchain-community 0.0.9 and will be removed in 0.2.0. An updated version of the class exists in the langchain-openai package and should be used instead. To use it run `pip install -U langchain-openai` and import as `from langchain_openai import OpenAIEmbeddings`.\n",
" warn_deprecated(\n"
]
}
],
"source": [
"embedding = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "e5a113bb-11b8-4d1a-a21b-b59b523f3525",
"showTitle": false,
"title": ""
}
},
"outputs": [],
"source": [
"# scrapped_df = pd.read_csv(SCRAPPED_DOCS_PATH)\n",
"# list_of_documents = [\n",
"# Document(page_content=row[\"text\"], metadata={\"source\": row[\"source\"]})\n",
"# for i, row in scrapped_df.iterrows()\n",
"# ]\n",
"# text_splitter = CharacterTextSplitter(chunk_size=CHUNK_SIZE, chunk_overlap=0)\n",
"# docs = text_splitter.split_documents(list_of_documents)\n",
"# db = FAISS.from_documents(docs, embeddings)\n",
"\n",
"# # Save the db to local disk\n",
"# db.save_local(DB_PERSIST_DIR)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "bace7c63-e3d5-42f3-bf6a-00ef1842baae",
"showTitle": false,
"title": ""
}
},
"outputs": [],
"source": [
"# Load the db from local disk : Already stored database that is being used in agllm should come here\n",
"# db = FAISS.load_local(DB_PERSIST_DIR, embeddings)\n",
"from langchain.vectorstores import Chroma\n",
"\n",
"persist_directory = 'db3'\n",
"vectordb = Chroma(persist_directory=persist_directory, \n",
" embedding_function=embedding)\n",
"\n",
"retriever = vectordb.as_retriever()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "c06bcb3c-58c8-454c-bf5b-e29ec227991f",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Control. To prevent stand loss, scout and determine the percent of infested plants.\\nThe use of an economic threshold (Table 1), first developed by ISU entomologist\\nLarry Pedigo, will help determine justifiable insecticide treatments based on market\\nvalue and plant stage. Young plants have a lower threshold because they are more\\neasily killed by stalk borer larvae.\\nTable 1. Economic thresholds for stalk borer in corn, based on plant\\nstage, expected yield and market value.', metadata={'author': '', 'creationDate': \"D:20231110150309+00'00'\", 'creator': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36', 'file_path': 'agllm-data/Start Scouting for Stalk Borer _ Integrated Crop Management.pdf', 'format': 'PDF 1.4', 'keywords': '', 'matched_specie_0': 'Hypagyrtis unipunctata', 'matched_specie_1': 'Papaipema nebris', 'modDate': \"D:20231110150309+00'00'\", 'page': 2, 'producer': 'Skia/PDF m118', 'source': 'agllm-data/Start Scouting for Stalk Borer _ Integrated Crop Management.pdf', 'subject': '', 'title': '', 'total_pages': 6, 'trapped': ''}),\n",
" Document(page_content='This has been the week of the fuzzy brown moths or FBMs (as entomologists not-so-\\ntechnically call the hundreds of moth species that fit this description). There is a wide\\nvariety of species that can be called FBMs and it seems we are experiencing several.', metadata={'author': '', 'creationDate': \"D:20231110150311+00'00'\", 'creator': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36', 'file_path': 'agllm-data/Moths Abundant Around Iowa _ Integrated Crop Management.pdf', 'format': 'PDF 1.4', 'keywords': '', 'matched_specie_0': 'Nomophila nearctica', 'matched_specie_1': 'Agrotis ipsilon', 'matched_specie_2': 'Euxoa auxiliaris', 'modDate': \"D:20231110150311+00'00'\", 'page': 0, 'producer': 'Skia/PDF m118', 'source': 'agllm-data/Moths Abundant Around Iowa _ Integrated Crop Management.pdf', 'subject': '', 'title': '', 'total_pages': 4, 'trapped': ''}),\n",
" Document(page_content='Dr Laura Jesse Iles directs the North Central IPM Center.\\xa0 \\xa0Dr. Iles has\\nearned B.S. (Animal Ecology), M.S. (Entomology), and Ph.D. (Co-major\\nin Entomology and Ecology and Evolutionary Biology) degrees, all from\\nIowa State University.\\xa0 In addit...\\nErin Hodgson Professor\\nDr. Erin Hodgson started working in the Department of Entomology,\\nnow the Department of Plant Pathology, Entomology, and Microbiology,\\nat Iowa State University in 2009. She is a professor with extension and', metadata={'author': '', 'creationDate': \"D:20231110150311+00'00'\", 'creator': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36', 'file_path': 'agllm-data/Moths Abundant Around Iowa _ Integrated Crop Management.pdf', 'format': 'PDF 1.4', 'keywords': '', 'matched_specie_0': 'Nomophila nearctica', 'matched_specie_1': 'Agrotis ipsilon', 'matched_specie_2': 'Euxoa auxiliaris', 'modDate': \"D:20231110150311+00'00'\", 'page': 3, 'producer': 'Skia/PDF m118', 'source': 'agllm-data/Moths Abundant Around Iowa _ Integrated Crop Management.pdf', 'subject': '', 'title': '', 'total_pages': 4, 'trapped': ''}),\n",
" Document(page_content='For more information on stalk borer biology and management, read a recent Journal\\nof Integrated Pest Management article by Rice and Davis (2010), called \"Stalk borer\\necology and IPM in corn.\"\\nErin Hodgson is an assistant professor of entomology with extension and research\\nresponsibilities; contact at ewh@iastate.edu or phone 515-294-2847. Adam Sisson is\\nan Integrated Pest Management program assistant. Sisson can be contacted by\\nemail at ajsisson@iastate.edu or by calling 515-294-5899', metadata={'author': '', 'creationDate': \"D:20231110150309+00'00'\", 'creator': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36', 'file_path': 'agllm-data/Start Scouting for Stalk Borer _ Integrated Crop Management.pdf', 'format': 'PDF 1.4', 'keywords': '', 'matched_specie_0': 'Hypagyrtis unipunctata', 'matched_specie_1': 'Papaipema nebris', 'modDate': \"D:20231110150309+00'00'\", 'page': 4, 'producer': 'Skia/PDF m118', 'source': 'agllm-data/Start Scouting for Stalk Borer _ Integrated Crop Management.pdf', 'subject': '', 'title': '', 'total_pages': 6, 'trapped': ''})]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Test the retriever with a query\n",
"retrieved_docs = retriever.get_relevant_documents(\n",
" \"What is the purpose of the MLflow Model Registry?\"\n",
")\n",
"(retrieved_docs)"
]
},
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "2ec7b458-c248-4ec0-9d85-0e447d6b4ecd",
"showTitle": false,
"title": ""
}
},
"source": [
"After building a retriever, we define a function that takes a question string as input and returns a list of relevant doc ID strings."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "bc688e4b-3389-4804-b7bf-159bce4f9db8",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
question
\n",
"
source
\n",
"
retrieved_doc_ids
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
When does peak adult flight of stalk borer occur?
\n",
"
[agllm-data/Start Scouting for Stalk Borer _ I...
\n",
"
[agllm-data/Start Scouting for Stalk Borer _ I...
\n",
"
\n",
"
\n",
"
1
\n",
"
What are the distinguishing features of stalk ...
\n",
"
[agllm-data/Start Scouting for Stalk Borer _ I...
\n",
"
[agllm-data/Start Scouting for Stalk Borer _ I...
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" question \\\n",
"0 When does peak adult flight of stalk borer occur? \n",
"1 What are the distinguishing features of stalk ... \n",
"\n",
" source \\\n",
"0 [agllm-data/Start Scouting for Stalk Borer _ I... \n",
"1 [agllm-data/Start Scouting for Stalk Borer _ I... \n",
"\n",
" retrieved_doc_ids \n",
"0 [agllm-data/Start Scouting for Stalk Borer _ I... \n",
"1 [agllm-data/Start Scouting for Stalk Borer _ I... "
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Define a function to return a list of retrieved doc ids\n",
"def retrieve_doc_ids(question: str) -> List[str]:\n",
" docs = retriever.get_relevant_documents(question)\n",
" return [doc.metadata[\"source\"] for doc in docs]\n",
"\n",
"data[\"retrieved_doc_ids\"] = data[\"question\"].apply(retrieve_doc_ids)\n",
"data.head(3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "330a3336-6ca2-455f-a1ae-5bd842a4d2bb",
"showTitle": false,
"title": ""
}
},
"source": [
"We can store the retrieved doc IDs in the dataframe as a column \"retrieved_doc_ids\"."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "f96ec69b-bea3-4023-8cd3-6bee1e327ff0",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'data' is not defined",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m/projects/bcjp/marshad/agllm/retriever-evaluation-tutorial.ipynb Cell 24\u001b[0m line \u001b[0;36m1\n\u001b[0;32m----> 1\u001b[0m data[\u001b[39m\"\u001b[39m\u001b[39mquestion\u001b[39m\u001b[39m\"\u001b[39m][\u001b[39m0\u001b[39m]\n",
"\u001b[0;31mNameError\u001b[0m: name 'data' is not defined"
]
}
],
"source": [
"data[\"question\"][0]"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['agllm-data/Start Scouting for Stalk Borer _ Integrated Crop Management.pdf',\n",
" 'agllm-data/Start Scouting for Stalk Borer _ Integrated Crop Management.pdf',\n",
" 'agllm-data/Start Scouting for Stalk Borer _ Integrated Crop Management.pdf',\n",
" 'agllm-data/Start Scouting for Stalk Borer _ Integrated Crop Management.pdf']"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data.iloc[0][\"retrieved_doc_ids\"]"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "5e5c4cd1-38c3-4709-8d41-6e319fb8a924",
"showTitle": false,
"title": ""
}
},
"outputs": [],
"source": [
"# Persist the static evaluation dataset to disk\n",
"data.to_csv(EVALUATION_DATASET_PATH, index=False)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "deabd8f0-44cf-409f-a27b-e82dd4d99940",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
question
\n",
"
source
\n",
"
retrieved_doc_ids
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
When does peak adult flight of stalk borer occur?
\n",
"
[agllm-data/Start Scouting for Stalk Borer _ I...
\n",
"
[agllm-data/Start Scouting for Stalk Borer _ I...
\n",
"
\n",
"
\n",
"
1
\n",
"
What are the distinguishing features of stalk ...
\n",
"
[agllm-data/Start Scouting for Stalk Borer _ I...
\n",
"
[agllm-data/Start Scouting for Stalk Borer _ I...
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" question \\\n",
"0 When does peak adult flight of stalk borer occur? \n",
"1 What are the distinguishing features of stalk ... \n",
"\n",
" source \\\n",
"0 [agllm-data/Start Scouting for Stalk Borer _ I... \n",
"1 [agllm-data/Start Scouting for Stalk Borer _ I... \n",
"\n",
" retrieved_doc_ids \n",
"0 [agllm-data/Start Scouting for Stalk Borer _ I... \n",
"1 [agllm-data/Start Scouting for Stalk Borer _ I... "
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Load the static evaluation dataset from disk and deserialize the source and retrieved doc ids\n",
"data = pd.read_csv(EVALUATION_DATASET_PATH)\n",
"data[\"source\"] = data[\"source\"].apply(ast.literal_eval)\n",
"data[\"retrieved_doc_ids\"] = data[\"retrieved_doc_ids\"].apply(ast.literal_eval)\n",
"data.head(3)"
]
},
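{
"cell_type": "markdown",
"metadata": {},
"source": [
"The load cell above applies `ast.literal_eval` because a CSV file stores each list as its string representation. The following is a minimal standard-library sketch of that round trip; the in-memory buffer and the row are made up for illustration and stand in for `EVALUATION_DATASET_PATH` and the real dataset.\n",
"\n",
"```python\n",
"import ast\n",
"import csv\n",
"import io\n",
"\n",
"# Made-up row for illustration only.\n",
"row = {\"question\": \"What is MLflow?\", \"source\": [\"mlflow/index.html\"]}\n",
"\n",
"# Write the row to an in-memory CSV; the list is stored as its string form.\n",
"buffer = io.StringIO()\n",
"writer = csv.DictWriter(buffer, fieldnames=[\"question\", \"source\"])\n",
"writer.writeheader()\n",
"writer.writerow(row)\n",
"\n",
"# Read it back: every field is now a plain string.\n",
"buffer.seek(0)\n",
"loaded = next(csv.DictReader(buffer))\n",
"print(type(loaded[\"source\"]).__name__)  # str\n",
"\n",
"# ast.literal_eval parses the string back into a Python list.\n",
"restored = ast.literal_eval(loaded[\"source\"])\n",
"print(restored == row[\"source\"])  # True\n",
"```\n",
"\n",
"Skipping the deserialization step would leave each cell as a single string instead of a list of doc IDs."
]
},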
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "c62b5306-9c0d-4ce8-8c4a-23cb1ecc7f66",
"showTitle": false,
"title": ""
}
},
"source": [
"## Step 3: Calling `mlflow.evaluate()`"
]
},
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "bdeebdcc-b4e7-4f9d-8fdc-366f9c13ed20",
"showTitle": false,
"title": ""
}
},
"source": [
"### Metrics Definition\n",
"\n",
"There are three built-in metrics provided for the retriever model type. Click the metric name below to see the metrics definitions.\n",
"\n",
"1. [mlflow.metrics.precision_at_k(k)](https://mlflow.org/docs/latest/python_api/mlflow.metrics.html#mlflow.metrics.precision_at_k)\n",
"1. [mlflow.metrics.recall_at_k(k)](https://mlflow.org/docs/latest/python_api/mlflow.metrics.html#mlflow.metrics.recall_at_k)\n",
"1. [mlflow.metrics.ndcg_at_k(k)](https://mlflow.org/docs/latest/python_api/mlflow.metrics.html#mlflow.metrics.ndcg_at_k) \n",
"\n",
"All metrics compute a score between 0 and 1 for each row representing the corresponding metric of the retriever model at the given `k` value.\n",
"\n",
"The `k` parameter should be a positive integer representing the number of retrieved documents\n",
"to evaluate for each row. `k` defaults to 3.\n",
"\n",
"When the model type is `\"retriever\"`, these metrics will be calculated automatically with the\n",
"default `k` value of 3.\n"
]
},
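{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the first two metrics concrete, here is a minimal pure-Python sketch of precision@k and recall@k for a single row. The function names are hypothetical, and the code follows the usual textbook definitions; MLflow's built-in metrics may treat duplicates and empty lists differently, so the API docs linked above remain the reference.\n",
"\n",
"```python\n",
"from typing import List\n",
"\n",
"\n",
"def toy_precision_at_k(retrieved: List[str], ground_truth: List[str], k: int) -> float:\n",
"    # Fraction of the top-k retrieved doc IDs that are labeled relevant.\n",
"    top_k = retrieved[:k]\n",
"    if not top_k:\n",
"        return 0.0  # Assumption: nothing retrieved scores 0.\n",
"    relevant = set(ground_truth)\n",
"    hits = sum(1 for doc_id in top_k if doc_id in relevant)\n",
"    return hits / len(top_k)\n",
"\n",
"\n",
"def toy_recall_at_k(retrieved: List[str], ground_truth: List[str], k: int) -> float:\n",
"    # Fraction of the labeled relevant doc IDs found in the top-k retrieved doc IDs.\n",
"    relevant = set(ground_truth)\n",
"    if not relevant:\n",
"        return 0.0  # Assumption: a row without labels scores 0.\n",
"    return len(relevant & set(retrieved[:k])) / len(relevant)\n",
"\n",
"\n",
"# Third row of the example dataset from the Manual Preparation section.\n",
"retrieved = [\n",
"    \"machine-learning/model-serving/index.html\",\n",
"    \"machine-learning/model-serving/model-serving-intro.html\",\n",
"]\n",
"ground_truth = [\n",
"    \"machine-learning/model-serving/index.html\",\n",
"    \"machine-learning/model-serving/llm-optimized-model-serving.html\",\n",
"]\n",
"\n",
"print(toy_precision_at_k(retrieved, ground_truth, k=3))  # 0.5\n",
"print(toy_recall_at_k(retrieved, ground_truth, k=3))  # 0.5\n",
"```\n",
"\n",
"Only one of the two retrieved documents is relevant, and only one of the two relevant documents was retrieved, so both scores are 0.5."
]
},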
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "25a32237-b2ef-4f4a-9e5a-4537e7e43012",
"showTitle": false,
"title": ""
}
},
"source": [
"### Basic usage\n",
"\n",
"There are two supported ways to specify the retriever's output:\n",
"\n",
"* Case 1: Save the retriever's output to a static evaluation dataset\n",
"* Case 2: Wrap the retriever in a function"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['agllm-data/Start Scouting for Stalk Borer _ Integrated Crop Management.pdf']\n",
"['agllm-data/Start Scouting for Stalk Borer _ Integrated Crop Management.pdf', 'agllm-data/Start Scouting for Stalk Borer _ Integrated Crop Management.pdf', 'agllm-data/Start Scouting for Stalk Borer _ Integrated Crop Management.pdf', 'agllm-data/Start Scouting for Stalk Borer _ Integrated Crop Management.pdf']\n"
]
}
],
"source": [
"print(data.iloc[0][\"source\"])\n",
"print(data.iloc[0][\"retrieved_doc_ids\"])"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "0390728a-a6cf-4c84-867a-0c6832114471",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/u/marshad/.conda/envs/agllm-env1/lib/python3.9/site-packages/mlflow/data/digest_utils.py:26: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.\n",
" string_columns = trimmed_df.columns[(df.applymap(type) == str).all(0)]\n",
"/u/marshad/.conda/envs/agllm-env1/lib/python3.9/site-packages/mlflow/models/evaluation/base.py:414: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.\n",
" data = data.applymap(_hash_array_like_element_as_bytes)\n",
"2024/04/19 15:45:38 WARNING mlflow.data.pandas_dataset: Failed to infer schema for Pandas dataset. Exception: Unable to map 'object' type to MLflow DataType. object can be mapped iff all values have identical data type which is one of (string, (bytes or byterray), int, float).\n",
"2024/04/19 15:45:39 INFO mlflow.models.evaluation.base: Evaluating the model with the default evaluator.\n",
"2024/04/19 15:45:39 INFO mlflow.models.evaluation.default_evaluator: Testing metrics on first row...\n",
"2024/04/19 15:45:39 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: precision_at_3\n",
"2024/04/19 15:45:39 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: recall_at_3\n",
"2024/04/19 15:45:39 INFO mlflow.models.evaluation.default_evaluator: Evaluating builtin metrics: ndcg_at_3\n"
]
}
],
"source": [
"# Case 1: Evaluating a static evaluation dataset\n",
"with mlflow.start_run() as run:\n",
" evaluate_results = mlflow.evaluate(\n",
" data=data,\n",
" model_type=\"retriever\",\n",
" targets=\"source\",\n",
" predictions=\"retrieved_doc_ids\",\n",
" evaluators=\"default\",\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "70aa6719-f69d-4fda-8a67-ac4e0d8ea6d8",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
question
\n",
"
source
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
When does peak adult flight of stalk borer occur?
\n",
"
[agllm-data/Start Scouting for Stalk Borer _ I...
\n",
"
\n",
"
\n",
"
1
\n",
"
What are the distinguishing features of stalk ...
\n",
"
[agllm-data/Start Scouting for Stalk Borer _ I...
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" question \\\n",
"0 When does peak adult flight of stalk borer occur? \n",
"1 What are the distinguishing features of stalk ... \n",
"\n",
" source \n",
"0 [agllm-data/Start Scouting for Stalk Borer _ I... \n",
"1 [agllm-data/Start Scouting for Stalk Borer _ I... "
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question_source_df = data[[\"question\", \"source\"]]\n",
"question_source_df.head(3)"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "00672280-3dfc-4c00-9ae2-bea50732ef8b",
"showTitle": false,
"title": ""
}
},
"outputs": [],
"source": [
"# # Case 2: Evaluating a function\n",
"# def retriever_model_function(question_df: pd.DataFrame) -> pd.Series:\n",
"# return question_df[\"question\"].apply(retrieve_doc_ids)\n",
"\n",
"\n",
"# with mlflow.start_run() as run:\n",
"# evaluate_results = mlflow.evaluate(\n",
"# model=retriever_model_function,\n",
"# data=question_source_df,\n",
"# model_type=\"retriever\",\n",
"# targets=\"source\",\n",
"# evaluators=\"default\",\n",
"# )"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "cb24318b-6149-4703-ad06-731c8a75866f",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{ 'ndcg_at_3/mean': 1.0,\n",
" 'ndcg_at_3/p90': 1.0,\n",
" 'ndcg_at_3/variance': 0.0,\n",
" 'precision_at_3/mean': 1.0,\n",
" 'precision_at_3/p90': 1.0,\n",
" 'precision_at_3/variance': 0.0,\n",
" 'recall_at_3/mean': 1.0,\n",
" 'recall_at_3/p90': 1.0,\n",
" 'recall_at_3/variance': 0.0}\n"
]
}
],
"source": [
"pp = pprint.PrettyPrinter(indent=4)\n",
"pp.pprint(evaluate_results.metrics)"
]
},
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "7a9b83c5-1544-4b0e-81f6-7abc1fafa258",
"showTitle": false,
"title": ""
}
},
"source": [
"### Try different k values\n",
"To use another `k` value, use the `evaluator_config` parameter\n",
"in the `mlflow.evaluate()` API as follows: `evaluator_config={\"retriever_k\": }`.\n",
"\n",
"\n",
"```python\n",
"# Case 1: Specifying the model type\n",
"evaluate_results = mlflow.evaluate(\n",
" data=data,\n",
" model_type=\"retriever\",\n",
" targets=\"ground_truth_context\",\n",
" predictions=\"retrieved_context\",\n",
" evaluators=\"default\",\n",
" evaluator_config={\"retriever_k\": 5}\n",
" )\n",
"```\n",
"\n",
"Alternatively, you can directly specify the desired metrics\n",
"in the `extra_metrics` parameter of the `mlflow.evaluate()` API without specifying a model\n",
"type. In this case, the `k` value specified in the `evaluator_config` parameter will be\n",
"ignored.\n",
"\n",
"\n",
"```python\n",
"# Case 2: Specifying the extra_metrics\n",
"evaluate_results = mlflow.evaluate(\n",
" data=data,\n",
" targets=\"ground_truth_context\",\n",
" predictions=\"retrieved_context\",\n",
" extra_metrics=[\n",
" mlflow.metrics.precision_at_k(4),\n",
" mlflow.metrics.precision_at_k(5)\n",
" ],\n",
" )\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "4b7174aa-0aa2-497d-aaa5-842121fcf270",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/u/marshad/.conda/envs/agllm-env1/lib/python3.9/site-packages/mlflow/data/digest_utils.py:26: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.\n",
" string_columns = trimmed_df.columns[(df.applymap(type) == str).all(0)]\n",
"/u/marshad/.conda/envs/agllm-env1/lib/python3.9/site-packages/mlflow/models/evaluation/base.py:414: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.\n",
" data = data.applymap(_hash_array_like_element_as_bytes)\n",
"2024/04/19 15:46:08 WARNING mlflow.data.pandas_dataset: Failed to infer schema for Pandas dataset. Exception: Unable to map 'object' type to MLflow DataType. object can be mapped iff all values have identical data type which is one of (string, (bytes or byterray), int, float).\n",
"2024/04/19 15:46:08 INFO mlflow.models.evaluation.base: Evaluating the model with the default evaluator.\n",
"2024/04/19 15:46:08 INFO mlflow.models.evaluation.default_evaluator: Testing metrics on first row...\n",
"2024/04/19 15:46:08 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: precision_at_1\n",
"2024/04/19 15:46:08 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: precision_at_2\n",
"2024/04/19 15:46:08 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: precision_at_3\n",
"2024/04/19 15:46:08 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: recall_at_1\n",
"2024/04/19 15:46:08 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: recall_at_2\n",
"2024/04/19 15:46:08 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: recall_at_3\n",
"2024/04/19 15:46:08 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: ndcg_at_1\n",
"2024/04/19 15:46:08 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: ndcg_at_2\n",
"2024/04/19 15:46:08 INFO mlflow.models.evaluation.default_evaluator: Evaluating metrics: ndcg_at_3\n"
]
}
],
"source": [
"with mlflow.start_run() as run:\n",
" evaluate_results = mlflow.evaluate(\n",
" data=data,\n",
" targets=\"source\",\n",
" predictions=\"retrieved_doc_ids\",\n",
" evaluators=\"default\",\n",
" extra_metrics=[\n",
" mlflow.metrics.precision_at_k(1),\n",
" mlflow.metrics.precision_at_k(2),\n",
" mlflow.metrics.precision_at_k(3),\n",
" mlflow.metrics.recall_at_k(1),\n",
" mlflow.metrics.recall_at_k(2),\n",
" mlflow.metrics.recall_at_k(3),\n",
" mlflow.metrics.ndcg_at_k(1),\n",
" mlflow.metrics.ndcg_at_k(2),\n",
" mlflow.metrics.ndcg_at_k(3),\n",
" ],\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "d57c201b-3718-43af-b8c2-ef22bfa2c15b",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAeoAAAEiCAYAAAA21pHjAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8fJSN1AAAACXBIWXMAAA9hAAAPYQGoP6dpAABDRklEQVR4nO3dd1gUV9sG8HtZmkixoKCAgL0jakREpYRALIi9RgFLbCiGxIixGyOo0eibmBiJhSQWIsYWjagI2GNPlESNAkFREAugqLQ93x9+bNzsorAusML9u669ZGfOnHlmdtxnZ+bMORIhhAARERFpJZ2KDoCIiIiKx0RNRESkxZioiYiItBgTNRERkRZjoiYiItJiTNRERERajImaiIhIizFRExERaTEmaiIiIi3GRE0a5e/vD4lEguTk5IoOhUooOTkZEokE/v7+FR1KpbNx40ZIJBJs3LhRad6WLVvQvn17mJiYQCKRYNq0aSWaR1UPE7UWK/oClUgksLS0REFBgcpyf/31l7ycnZ3da61z/vz5kEgkiIuLe616tMG1a9cwZcoUtGrVCqampjAwMICNjQ0GDhyI7du3QyaTVXSIVI7i4uIgkUgwf/58tZZ78WVsbAwbGxv06NEDYWFhuH37dqnqPHnyJEaMGIHs7GxMnDgR8+bNw7vvvvvKeW8aiUQCNzc3tZZr3ry5ynlRUVEwMDBAzZo1ceLEideM8M2gW9EB0Kvp6uoiPT0d+/btQ58+fZTmr1u3Djo62vGbKzQ0FCEhIbCysqrQOJYvX44ZM2ZAJpOha9eueOedd2BkZISbN2/i0KFD2L59O0aPHo1169ZVaJzawMrKCn/99RfMzMwqOhSt1qFDB/Tu3RsA8OTJE6SlpeHEiRPYv38/FixYgKVLl2LKlCkKy/Tr1w+dO3dGvXr1FKbv3bsXQgh8//336NKlS4nnVXXh4eGYMGECLCwsEB0djTZt2lR0SOWCifoN0KVLF/z+++9Yv369UqIuKCjAjz/+CE9PT8THx1dQhP+qV6+e0pdSeVu7di0++ugj2NnZYfv27Wjfvr3C/IKCAkRERODo0aMVFKF20dPTK/bshf7VsWNHlWfju3btwpgxYzB16lRUr14do0ePls8zMzNT+QOo6Ay8fv36pZpXlS1ZsgQhISFo2LAhDh48iIYNG1Z0SOVHkNZKSkoSAIS3t7cYP3680NXVFenp6QplduzYIQCIrVu3CgMDA2Fra6tUj0wmE+vWrRNdunQRJiYmolq1aqJDhw5i3bp1CuVcXV0FAKXXi3Xa2toKW1tb8fDhQzF58mRhbW0tpFKp2LBhgxBCCD8/PwFAJCUlKcURHx8vfH19Rd26dYW+vr6wtrYW/fr1E0ePHpWXefr0qfj8889F27ZthampqTAyMhK2trZi0KBB4uLFi6/cZw8fPhSmpqZCX19fJCQkvLTss2fPFN4/fvxYzJ07VzRr1kwYGBiImjVrip49e4pjx44pLTtv3jwBQMTGxor169eL1q1bC0NDQ2FnZydWrVolhHi+3z///HPRtGlTYWBgIBo3biwiIiKU6iraZzdu3BBLliwRjRs3FgYGBsLOzk4sWLBA5OXlKZTPzc0V//vf/4SXl5ewtrYW+vr6ok6dOqJfv37i/PnzSvVv2LBBABAbNmwQu3fvFl26dBHGxsbyz7XoOPPz81NY7vbt22Lq1KmicePGwtDQUJiZmYnmzZuL8ePHi8zMTIWyGRkZIigoSNjZ2cnjGTRokLh06VKx25uYmChWrVolmjVrJvT19UWDBg3E/PnzRWFhofKHVYx169aJPn36CFtbW/ln5uXlJQ4fPqxQrujzUvVSday+KDY2VgAQ48ePf2UZc3Nz8fjxY/n0F/f9i+VUvYrKvirGxMREMWbMGGFjYyP09fWFpaWl8PPzE8nJyUpxARCurq7i1q1bYuTIkcLCwkJIJBIRGxsrLxMfHy969+4tateuLfT19UXjxo
3FrFmzRE5OjsptnDdvnjhz5ozw9PQUxsbGwtTUVPTt21chxldt56sAEM2aNZO/nz59ugAg2rRpI+7cuaNU/nW/N7Qdz6jfEKNHj8a3336LH374AR9++KF8+vr161GrVi307dtX5XJCCIwYMQJbtmxBkyZNMHz4cOjr6+PgwYMYM2YM/vzzT3z++ecAIG9MFB8fDz8/P/n97ho1aijUmZubCw8PDzx+/Bh9+vSBrq4uLCwsXhr/qlWr8MEHH6BatWro168fGjRogNTUVBw7dgxRUVHo2rUrAMDPzw8//fQT2rZti4CAABgYGODmzZuIjY3FmTNn4ODg8NL1REVFITs7G8OHD0fLli1fWtbAwED+97Nnz+Dh4YHTp0+jffv2mDZtGtLT0xEZGYno6Ghs2bIFgwYNUqpj5cqViIuLg6+vLzw8PLB9+3YEBQXByMgIFy5cwPbt29G7d2+8/fbb2Lp1q3y/du/eXamuadOm4fjx4xg8eDCMjY2xZ88ezJs3D3/88QeioqLk5R48eIBp06ahW7du6NmzJ2rWrInExETs3r0bv/76K44cOYK33npLqf5t27bhwIED6N27NyZNmoTs7Oxi982TJ0/g4uKC5ORkeHl5oV+/fsjLy0NSUhJ++OEHfPTRR/IzxYyMDDg7O+PGjRtwc3PD0KFDkZSUhKioKOzduxfR0dHyz/dF06dPR3x8PHr37g1vb2/s3LkT8+fPR15eHj777LOXfnZFJk+eDAcHB3h6eqJOnTpITU3Fzp074enpiZ9//hm+vr4AADc3NyQnJyMiIgKurq4K903/e3yrw83NDd26dcPRo0dx+PBh+Pj4qCxnZ2eHefPmYefOnfj9998RFBQkX3+7du2KnVf072+//QZvb2/k5OSgd+/eaNKkCZKTk7Fp0yb8+uuvOHnypNKZ5v379+Hs7IxatWph6NChePbsGUxNTQEA33zzDSZPnowaNWrAx8cHdevWxdmzZ/HZZ58hNjYWsbGx0NfXV6jvzJkzWLp0Kdzd3TF+/HhcuHABO3fuxKVLl3D58mUYGhrKt3PBggWwtbVVaKjYrl27Eu/XwsJCjB8/HuvWrYOLiwt++eUXlZ/X635vaL2K/qVAxXvxjFoIIVq3bi1atWoln3/nzh2hq6srpkyZIoQQKs+o165dKwCIgIAAhTOz3Nxc4ePjIwCIs2fPyqe/eKaoiq2trTymJ0+eKM1XdUZ98eJFoaOjI+rXr6909iKTyURqaqoQQojMzEwhkUhEhw4dREFBgUK5goIC8fDhQ5Uxvcjf318AEN99990ry75owYIFAoAYMWKEkMlk8unnz58X+vr6okaNGiI7O1s+vWg/1apVS9y4cUM+PSUlRejr6wszMzPRtGlTcffuXfm8U6dOCQDCx8dHYd1F+6xOnTri5s2b8um5ubmie/fuAoCIioqST3/27Jm4deuW0jZcvnxZGBsbC09PT4XpRWdqOjo64uDBg0rLqTqj3r17twAgpk2bplT+0aNHClcjAgICBAAxc+ZMhXJ79+4VAETjxo0VzpKLttfe3l7cvn1bPj0jI0PUqFFDmJiYiNzcXKX1qpKYmKg07fbt26J+/fqiSZMmCtNfPCMsjZKcUQshxJw5cwQAMWfOHPm0/55RF3nZlafi5uXl5Qk7OzthYmKidOXk6NGjQiqVit69eytMx/+fxQYEBCj9n0pISBC6urrCwcFB3Lt3T2FeaGioACA+//xzpf2A/7+C96KRI0cKAGLLli1K63d1dVXaxlcpOj4GDBggAIgePXooneEX0cT3hrbTjhZIVCKjR49GQkICfvvtNwBAREQECgoKFO6J/ddXX32F6tWrY/Xq1dDT05NP19fXl5+1bNmypdSxLF26FNWqVStR2W+//RYymQyLFi1SapUukUjk9+IkEgmEEDA0NFRqHCeVSkt05pOWlgYAsLa2LlFsRSIiIqCnp4ewsDBIJBL5dEdHR/j5+SEzMxM7d+5UWi4oKEjhDM
bGxgZdu3ZFVlYWZs2ahTp16sjnOTk5oWHDhvj9999VxhAUFKQQ94uf0YuP9xgYGKhsrNeqVSu4u7vjyJEjyM/PV5rv6+sLT0/P4neCCqo+Y2NjY/nViLy8PGzZsgW1a9fG7NmzFcr17NkT77zzDq5fv47jx48r1TNnzhyF9gzm5ubw9fXFo0ePcPXq1RLFZ29vrzStXr16GDBgAP7++2/8888/JapHE4qO43v37pVJ/b/88guSk5Mxffp0ODo6Kszr2rUrfH19sW/fPqUrJfr6+li6dCmkUqnC9G+//RYFBQX48ssvUbt2bYV5H3/8MerUqaPyu6F79+4YMmSIwrSi76AzZ86ovX3/lZSUhO3bt8PW1hY7duyAkZGRynKa+N7Qdrz0/QZ57733MGPGDKxfvx5OTk7YsGEDHB0di72U9OTJE1y6dAn169fHkiVLlOYXfZlfuXKlVHEYGhqWqrXl6dOnAQBeXl4vLWdqaoqePXti3759aN++PQYNGgQ3Nze89dZbCj8yNC07OxuJiYlo0aKFygTv7u6O8PBwXLx4ESNHjlSYp2rfFyWf4uYV/dD6r27duilNc3Z2hq6uLi5cuKAw/eLFi1i6dCmOHTuGtLQ0pcR87949pUZ9nTp1UrleVbp374569eohLCwMv//+O3r37g1XV1e0aNFC4YfMlStX8OzZM7i7u6v8InV3d8fBgwdx8eJFpe3r0KGDUvmi/Z+ZmVmiOBMTExEaGorDhw8jNTUVubm5CvNv374NW1vbEtWl7U6dOgUAuHr1qspGbWlpaZDJZLh27Ro6duwon25vbw9zc/Ni64uOjkZMTIzSfD09PZXfDZr43Eqifv36qFmzJhISEjB58mSEh4crHHtFKup7ozwxUb9B6tSpAx8fH2zduhWDBg3C1atX8eWXXxZb/uHDhxBCIDU1FQsWLCi2XE5OTqniqFu3rsr/MMXJysqCRCIpUWvwbdu2YfHixdi8eTNmzZoF4Pl/xICAACxevLjYX9VFLC0tAQCpqakljq/oDKS4++xFcau6p1t0r+9Furq6L51X3PPwqtYvlUpRu3ZtZGVlyaedOHECHh4eAJ7/+GnSpAmMjY0hkUjk9zf/m7CKq784ZmZmOHXqFObOnYs9e/Zg3759AJ5fMQgJCcGkSZMAlN2+KywsfGWM169fR6dOnZCdnQ13d3f4+PjA1NQUOjo6iIuLQ3x8vMr9UFaKWmu/eBVFkx48eAAA2LRp00vL/ff/c3GfTVF9JW0PUOR1P7eSMjExQWxsLN5++22sW7cOMpkM3333ncpHUV/3e0Pb8dL3G2bMmDHIzs6Gv78/DA0NMWLEiGLLFv2H6tChA4QQxb5iY2NLFUNpkjTwvCGMEAJ37tx5ZVkjIyMsWrQIiYmJSExMxLp169CsWTN5Y7RXcXFxAQCVZwjFKdpP6enpKucXXU5X9QWlSarWX1hYiPv37ys84vPZZ58hNzcXhw4dwu7du7F8+XIsWLAA8+fPl/9QUaW0n1uDBg2wceNGZGRk4MKFC1iyZAlkMhkmT54svyRakfvuiy++wMOHD7Fx40YcPHgQK1euxMKFCzF//vwKedysqJMgVQ35NKFoH+7Zs+el/59dXV0Vlivucy+qLzs7+6X1VaQ6derg8OHDcHBwwIYNGxAQEKCyo6LX/d7QdkzUbxhvb29YWVkhNTUVffv2Rc2aNYsta2JighYtWuCvv/4q8SWpovtYmvxlXHTJ9cCBA6Vazt7eHqNHj0Z8fDyMjY2xe/fuVy4zcOBAmJqaYvv27a+8pF90tmVqaoqGDRvi+vXrKs/Ei76AS9NaVR2qnus+efIkCgoKFO5J3rhxA7Vq1VJqSf3kyROcP39e43Hp6OigXbt2+Pjjj+UJuuizaN68OQwNDXHmzBk8efJEadmy3Hc3btwAAHnL7iJCCJX3xMvi2C4SHx+Po0ePom7duvKrHZrm5OQE4Pkxocn6ii6BlwUdHZ3X3t/m5uY4fPgwHB
0d8f3332PUqFEvrVOd7w1tx0T9hpFKpdi5cyd27NiB0NDQV5afOnUqnjx5gnHjxqm8xJ2UlKTQL3etWrUAADdv3tRYzBMmTIBUKsXs2bOVGvcIIeSXDDMyMnD58mWl5R8+fIjc3FwYGhq+cl01atTAsmXLkJubi169euHixYtKZQoLCxEREYEJEybIp/n5+SE/Px8zZ85UOIv4448/sHHjRpiZmRX7CJymrFq1Crdu3ZK/z8vLk1/Ge/HxFltbWzx8+BAJCQnyaYWFhfjoo4+QkZGhkVgSEhJUniUXTSv6LPT19TFs2DDcu3dP6Xjcv38/oqOj0bhxY/mVDk0quvd87NgxhelhYWEqj6OyOLaB52e4AwYMAPC8U46yuszq6+uLBg0aYMWKFThy5IjS/Pz8fKV98TKTJk2Crq4upkyZgpSUFKX5mZmZSm0jSqtWrVoKx/Tr1BMTE4MOHTpg06ZNeO+99+TJWhPfG9qO96jfQB07dlRoLPIy48ePx6lTpxAREYHjx4/D09MT9evXR3p6Oq5cuYLffvsNmzdvlrfGdnd3h0QiwSeffIKEhASYmZmhRo0aCAwMVDveNm3aYOXKlZg6dSpatWqFvn37wtbWFmlpaThy5Ah69eqFlStXIjU1FY6OjnBwcEDbtm1hZWWF+/fvY9euXcjPz8dHH31UovW9//77yM7ORkhICNq3b4/u3bvD0dER1apVQ2pqKmJiYpCamoqxY8fKl/n444+xd+9e/PDDD/jrr7/w9ttv4+7du4iMjERBQQHCw8NhYmKi9j4oic6dO8PBwQFDhgxB9erVsWfPHly9ehX9+/eXJwIAmDJlCg4cOICuXbti8ODBMDQ0RFxcHFJTU+Hm5qaRftoPHjyI6dOnw8XFBU2bNkXt2rXlz2obGhpi8uTJ8rJLlixBfHw8Fi1ahBMnTsDJyQnJycnYtm0bjIyMsGHDhjLp4nbChAnYsGEDBgwYgMGDB6N27do4deoUzp8/j169emHv3r0K5Zs3b4769etj69atMDAwgLW1NSQSCaZMmVKi7lPPnj0rb8T17Nkz3LlzBydOnMD169dRrVo1rF69ukwHNjEwMEBUVBR69OgBV1dXeHh4oE2bNpBIJPjnn39w9OhR1K5du8SNQ1u3bo2vv/4aEydORLNmzdCzZ080atQIjx49QmJiIuLj4+Hv7481a9aoHbOHhwd++ukn9O3bF46OjpBKpejTpw/atm1b6rpq1qyJQ4cOwdvbG1u3boVMJsOmTZs09r2h1crnKTBSx3+fo36V4nomE0KIyMhI4enpKWrWrCn09PSElZWVcHNzE8uXLxcZGRkKZTdu3CjatGkjDAwMiu2ZrDgvez40NjZW9O7dW9SqVUveM9mAAQPE8ePHhRDPexWbP3++6N69u6hXr57Q19cX9evXF++++6749ddfS7QPXnTlyhURGBgoWrZsKYyNjeXb3bdvXxEVFaXwvLQQz3smmzNnjmjatKn82ekePXoo9JxW5GXPm79sHxT1/qaq/I0bN0RYWJho3Lix0NfXF7a2tmL+/PkqnymOiooS7du3F0ZGRsLc3FwMHjxY3LhxQ+W6i3uWt4iq56j//PNPERQUJBwdHUXt2rWFgYGBaNiwofDz81PZ41tGRoaYOnWqsLW1FXp6esLc3FwMHDjwpT2Tqdo/r3qO/79iY2OFi4uLMDExETVq1BA9e/YU586dK7aeU6dOCVdXV2FiYlLqnslefBkZGQlra2vh7e0twsLCFJ4Hf5Emn6MucuvWLREUFCSaNGkiDAwMhKmpqWjRooUYO3asiImJUSiLEjzHfPr0aTF06FBRv359+WfXvn17ERISIv766y+l/aDqOfTiere7c+eOGDx4sDA3Nxc6Ojpq90z2oqysLNG5c2cBQAwYMEDcvXtXo98b2kgiRAW3FiCq4vz9/REREYGkpKTXHv2MiCof3qMmIiLSYkzUREREWoyJmoiISIvxHjUREZEW4xk1ERGRFm
OiJiIi0mJVvsMTmUyG27dvw8TEpNR9IRMREalDCIFHjx6hfv36r+wQqMon6tu3b8PGxqaiwyAioiro5s2bKofXfVGVT9RF3ULevHmzzEdHIiIiAp6PWmZjY1OiromrfKIuutxtamrKRE1EROWqJLdc2ZiMiIhIizFRExERaTGtuvR95MgRLFu2DOfOncOdO3ewY8eOV44BHBcXh+DgYCQkJMDGxgazZ88u06HmiIi0RWFhIfLz8ys6DHoJPT09SKXS16pDqxJ1Tk4OHBwcMHr0aPTv3/+V5ZOSktCrVy9MmDABmzZtQkxMDMaOHYt69erB29u7HCImIqoYjx8/xq1bt8DOJbWbRCKBtbU1jI2N1a5DqxJ1jx490KNHjxKXX7NmDezt7bF8+XIAQIsWLXDs2DF88cUXTNREVGkVFhbi1q1bMDIyQp06ddgHhJYSQiAjIwO3bt1CkyZN1D6z1qpEXVonT56Ep6enwjRvb29MmzatYgIiIioH+fn5EEKgTp06qFatWkWHQy9Rp04dJCcnIz8/v2om6rS0NFhYWChMs7CwQHZ2Np4+faryAM7NzUVubq78fXZ2dpnHSURUFngmrf008Rm90YlaHaGhoViwYEGZ1e/z5bEyq3uP/qwyq3uIlWWZ1R3ZO7LM6q6qeJwp43GmeX+nPyqzupvopJaq/Nyw/6FZY3uMGOhTbJmxH8zGiAE+aODe9XXDw2/Hf8OP3/2I61euQ09fD81aNsPYwLHo6dJTXsbf3x/t2rUr86u4b3SitrS0RHp6usK09PR0mJqaFns5aObMmQgODpa/L+odhojoTVYWP96e5Rfi6xHtNVpnQUEBdHVLn3oWhkx9ZZnvvlgEALhR6toV/W/J/3DpwiVMnTEVrRxaQUdHBxfOXMD86fPx7KNnJWrsrElv9HPUzs7OiImJUZh28OBBODs7F7uMgYGBvBcy9kZGRKQZTS1N8UXYQvh6doVXF0fs3h6pMG/V0s/Q39sVyz+bj8ePH2Fc8Bx08h6Mtq6+eP/DucjLywMApN5Jx8DRQWjj2gdtXX0xJ2wVAMB/ykys/DYCALAnOhZtXX3Rzr0fWnf3wa5fn+cBt76jsHPfIQDA/Yz7mDRyEnp26Ykezj2wZcMWeTyubV2xcvFKDPQaCDcHN6z+fLV8XsyvMUi+noy1W9aijWMb+YAZjm85Yn3UeqxYsQL3799X2v6jR4+iZcuWOHv2rCZ3KwAtO6N+/Pgxrl+/Ln+flJSEixcvolatWmjQoAFmzpyJ1NRUfP/99wCACRMm4KuvvsLHH3+M0aNH4/Dhw/jpp5+wd+/eitoEIqIqSyKRYNehY0j5JwkDvF3R/q3OsG5gCwCQSqX4OToeADD7o6no5dQB4Ss+hRAC44LnYNXaHzA9cAzem/QxvNxcELX+eYLOuPdAaT2zQ1fh28/nw/ktR8hkMmQ/eqxUZsGMBbBvbI+vf/ga9zPuw9fNF81bN4fjW44AgOysbEQdiMKD+w/g4eiBAcMHwLK+Jb5f+z1WrF0BiUSCZQuW4cihI2jYpCEKCgowa/EsBAUFYdOmTZg69d8z/MjISISGhmLv3r2wt7fX+H7VqkR99uxZuLu7y98XXaL28/PDxo0bcefOHaSkpMjn29vbY+/evfjggw+watUqWFtb47vvvuOjWUREFWDQcD8AQANbe3Ts7IIzp47LE/XAYSPl5Q79+gv+PHccK9Y8P0N++uwZpDpSPH6cg2O/nUd0ZLi8bB3zWkrrebtbZwTNDsXA3l7wcnNBuzYtlMqciDuBnXE7AQC169SGt483TsSfkCdqn/+/112rdi3Y2Nng1j+3YFnfErnPclG7Tm3EHYjDtT+v4efDPyM1JRV9XPtAJpOhVatWiI+Pl6/nhx9+gFQqRWxsLGrWrPk6u69YWpWo3dzcXvrw/saNG1Uuc+HChTKMioiI1PFii2ej6tXlfwshsH39KjRtpHj2+fhxTonqXfFpCBKu/I3Y46fhN2UmRg
zojY+njC1xLABgYGgg/1sqlaKgsOB5OZ3n5a79dQ1uXm7Q09ODXSM7NGneBABw584d1KtXT75s27ZtcfToUVy6dAndu3cvUfyl9UbfoyYiIu2xfeuPAIBbKf/g3G8n0NGpi8pynj16Y8mX36Gg4HlyfJiZheuJ/8DYuDq6O3fE8m82ysuquvR95e9EtGreBIFjRmCi/1CcOve7Upkubl0QGfH8Pvn9e/cRvScaLm4ur9wGXV1dZD7MRNMWTXEk5ggKCgrwT9I/+PvK38h8mIklS5bgvffek5d3cHDAnj17MHr0aOzfv/+V9auDiZqIiDRCJiuEr2dXjB7aF7MXLZVf9v6vTxaGopqhIdp59ENbV1+8PSAAyTefP671w+olOHvxMlp164127v3w1bpNyst/9gVadesNR4/++GHbbsyfHqhUZm7YXNy4dgM9u/TEez7vYdKHk9CuY7tXbsOwgGEImxsGNy83NG7WGP3c+2HFpyvg2dMTXy39CqGhobC1VdyuFi1aIDo6GkFBQdi+fXsJ9lTpSEQV7yg2OzsbZmZmyMrK0kgLcD7fqozPt2oejzNlVek4e/bsGZKSkmBvbw9DQ8MyW09pnqNuammKs1dTYGpWo0TlS/scdWnc0NN7reXD5oQh/U46gj4Jgl1DOwghcO3Pa7hw5gJmTSvd/4/iPqvS5B6tukdNRERU0UI+DcHRw0cRNicMqTdTUZBfgBZtWsB/gn+FxMNETUREr+1aWuXqjrmbRzd08+hW0WEA4D1qIiIircZETUREpMWYqImIiLQYEzUREZEWY6ImIiLSYmz1TURUGXzrqvEqbQpkuDlwn8brpdLhGTUREWm9F4ewfHHIyyKJyTcRGPIpHNz6wqebD8YOHouYXxWHQd6+eTsmjJhQbjFrChM1ERFpVFEf3uXl15gjGPJ+MLzcXHA6+ifsOboH8z+fjwO/HEDonNByjaUsMFETEdFra2ppilVLP0N/b1cs/2w+Hj9+hFkfTsGAd93g4+6M2R9NRV5eHgAg7c5tDBwdhDaufdDW1Rdzwp6PPb15+y9wencIHD36w8GtL/ZEx75yvbfT7mLuki8Rs30D+rzrAQMDfQCAdQNrLFm9BI8fPcbRw0eVlku/k45+Hv2w7cdtGtwLZYP3qImISCOkUil+jn4+VvPsj6aio5MzPlv+JYQQmPXhFHwf/g3GTg7C9Mnj4OveEVHrnyfoohGyvN1dMKx/L0gkEiSnpKJzj6H453yMPPmq8m1EJGZNGw9TE2P8tOtXLF65FtXNa6Fpi6ZwfMsRwbOCMffDuQq9jF1NuIqgsUGY9dksrel97GWYqImISCMGDhsp//vQr7/g4tnT2PDtagBA7rOnkEqlyMl5jHOnT+LIT1/Jy9YxrwUASEpJxYiJH+PWnTToSnXxIDMLSSm30LxJw2LXee6PBARP9Mf9Bw8xO3QVju3ZhDRTEwzvPRwt2rRA7Tq1kfkwU17+7yt/Y/zw8fjmx2/Qok0LDe+BssFETUREGmFUvbr8byEEvlz3A+wbNVEok5PzuNjlh77/IcLmBGOgjzcAoFbTznj2LPeV65Xq6OBq8k04tmmBunVq45GeHlxcn489/fTJU1SrVk1etq5lXeTl5eHk0ZNvTKLmPWoiItI4zx69Ef7VSnnDsqzMh/gn6QaqVzdGx84uWP7NRnnZokvfD7OyYN/AGgDw47bdeJiZ9cr1tGvdHPEnz6CRnQ1+T7iKe/cf4knOE5w4cgL5eflYtmAZ+g/vLy9vVsMMET9H4ODeg/hy6Zca3OKywzNqIqLKYHy8xqu8WYrxqP/rk4Wh+HzRPPi+7QIdHR1IdXUxfc5C2No3wrKv1uKLWYFo1a039HT14PuuBxbMmIJViz7BwDFBqGFqCo+uTmhgXe+V63l/5GAMGvsBDm/fgAUfB+KdQWNgVLsmOrl0wp7tezDUbyh69u2psIyxiTE2RG3AxJETETYnDCGfhqi9neVBIoQQFR1ERSrN4N
0l4fPlMQ1Epdoe/dINWF4aQ6wsy6zuyN6RZVZ3VcXjTFlVOs6ePXuGpKQk2Nvbw9DQsMzW8/drJOpXaaKTqrG6fv7lAFasiUDY7GC4OLVHor4+0m6n4cAvB9BvaD+YmJpobF2NajQqVfniPqvS5B6eURMR0Rutf28vtGzWGMu/2YCg2YvxKC8f9azqYeB7AzWapCsKEzUREb3xmjdpiPAVnwIAbujpVXA0msXGZERERFqMiZqIiEiLMVETERFpMSZqIiIqV/29XBF3/PRr1/PiiFnt3Puh1/DxSv2Dv6kjZr2IjcmIiCqBIb8M0XidufkyLOuyXuP1asKvMUcwd8mXmBM8EcsXzICBgT6SU1Kx4PPV2PXbOcz8dGZFh6gxWndGvXr1atjZ2cHQ0BBOTk44ffrlv7pWrlyJZs2aoVq1arCxscEHH3yAZ8+elVO0REQEPB8965tVn2PAu27weKsNtm/5UT7v/Jnf0OdtF/RydUJI0EQUFP47DGbqnXSVI2ndSb8Lr0Fj0LJrb3gNGoOh7wdj/tLn/YMXN2KWXQMrbPjf4koxYtaLtOqMOjIyEsHBwVizZg2cnJywcuVKeHt74+rVq6hbt65S+c2bNyMkJATr169Hly5dcO3aNfj7+0MikWDFihUVsAVERFWXvr4Btu+Pw42/r2Hgu27wHTQUMpkM08b7I3Tl13Dp7o5jcTH4OXKTfJn3Jn0MLzcXpZG0pn6yGM4d22HBjClIS89AO4/+aN74+eAcqkbMMq9VA61bNIFzx3aVYsSsF2nVGfWKFSswbtw4BAQEoGXLllizZg2MjIywfr3qSy8nTpyAi4sLhg8fDjs7O3h5eWHYsGGvPAsnIiLN6zNgMACgUZOmkOrq4t7ddCRevwZdqS5cursDALq6vQ0bWzsAwOPHOTj223l8ONFfXkfRSFoxR09h9P/30W1pUQe9vVzlZc79kQD3rk7yEbMO/PQddkZ8haOnzuHps9xiR8z6Yu0Xb1ySBrQoUefl5eHcuXPw9PSUT9PR0YGnpydOnjypcpkuXbrg3Llz8sScmJiIffv2oWfPnirLExFR2TEwMJD/rSPVkQ/I8V8SiaTUdf93GamODm68MGKWsXF1eHZ3BqB6xKzadWvj5FHVuUTbaU2ivnfvHgoLC2FhYaEw3cLCAmlpaSqXGT58OBYuXIiuXbtCT08PjRo1gpubGz755JNi15Obm4vs7GyFFxERlY2GjZuioLAAp44dAQAcPxKLlOQkAICxcXV0d+6ociQtj65O2Lh1JwAg/e49/HIgTl5G1YhZOTlPEHP0FPIqyYhZL9KaRK2OuLg4LF68GF9//TXOnz+Pn3/+GXv37sWnn35a7DKhoaEwMzOTv2xsbMoxYiKiqkVfXx8rv92IxfNmordbZ/zy8zY0b9VGPv+H1Utw9uJltOrWG+3c++Grdc/vX6/67BMcPXUWLbv2xoiJ0+HU3gE1zJ732/3+yMFYuPwbGBoYyEfM8h01Ga7Ob2HLjr1wfMux2BGzzp8+j7A5YeW3AzRAaxqTmZubQyqVIj09XWF6eno6LC1Vj7gzZ84cjBw5EmPHjgUAtGnTBjk5OXj//fcxa9Ys6Ogo/w6ZOXMmgoOD5e+zs7OZrInojVcWo4eVZvSsa2mKVydP/5ks/7v9W07YHXNcYX7R6Fn1Leti+4b/KdVXq4YZ9keGQ1dXF/cfPETnHkMxM2gcAKCBdX3MCBwD7yHjEDY7GOdjtkMikSD1Tjp27DuErl5u8noGDB+AAcMHAAAMqxliQ9SGEm+TtlA7URcWFmLbtm2IjY3F3bt3sXDhQrRp0wZZWVmIiYmBi4uL0mXsl9HX10eHDh0QExODvn37AgBkMhliYmIQGBiocpknT54oJWOpVAoAKG70TgMDA4X7KEREpH3+TvwHowJDIIRAXn4+JgUMg1MHB/n8/46YlZuXB5v69TB6eP9KMWLWi9RK1JmZmXj33Xdx+vRpGBsbIycnB1OmTAEAGB
sbY+rUqRg1ahQWL15cqnqDg4Ph5+eHjh07olOnTli5ciVycnIQEBAAABg1ahSsrKwQGhoKAPDx8cGKFSvg6OgIJycnXL9+HXPmzIGPj488YRMR0ZunbatmuBi746VlXhwx60U3yiqoCqJWog4JCUFCQgKio6Ph6Oio8IyzVCrFwIEDsW/fvlIn6iFDhiAjIwNz585FWloa2rVrh/3798vPzFNSUhTOoGfPng2JRILZs2cjNTUVderUgY+PDz777DN1NouIiEjrqJWod+7ciSlTpuCdd97B/fv3leY3bdoUGzduVCugwMDAYi91x8XFKbzX1dXFvHnzMG/ePLXWRUREpO3UavWdlZUFe3v7Yufn5+cX+/wcERFpRnFtcUh7aOIzUuuMulGjRjh//nyx8w8cOICWLVuqHRQRERVPT08PEokEGRkZqFOnjlodiJREYX5emdQLAM90ZGVWd6EoLLO6SzOWhBACGRkZkEgk0NPTU3udaiXqsWPHYsaMGXBzc8Pbb78N4HmvMbm5uVi4cCH279+PtWvXqh0UEREVTyqVwtraGrdu3UJycnKZredudtkNcCQkmWVWd4Zu2TUmLqxWuh8BEokE1tbWr9XAWa1EHRQUhISEBAwbNgw1atQA8LyXsPv376OgoADjx4/HmDFj1A6KiIheztjYGE2aNEF+fn6ZrWPpj+fKrO5v9L4us7r/V9e8zOr+wv2LUpXX09N77aeQ1ErUEokE4eHh8PPzQ1RUFP7++2/IZDI0atQIgwcPRvfu3V8rKCIiejWpVFqmj6Lee1p298ANC++WWd0PCsuu001DQ8Myq7s4r9UzWdeuXdG1a1dNxUJERET/8Ub39U1ERFTZqXVGbW9v/8pWhhKJBDduVLb+YYiIiMqXWona1dVVKVEXFhbin3/+wfHjx9G6dWs4OjpqJEAiIqKqTK1E/bJex37//Xd4e3tjxIgR6sZERERE/0/j96gdHBwwfvx4zJgxQ9NVExERVTll0pjMwsICf/75Z1lUTUREVKVoPFHfv38f69atg7W1taarJiIiqnLUukft4eGhcnpmZiauXLmCvLw8/PDDD68VGBEREamZqGUymVKrb4lEAnt7e3h6emL06NFo3ry5RgIkIiKqytRK1P8dF5qIiIjKBnsmIyIi0mIlOqP+/vvv1ap81KhRai1HREREz5UoUfv7+5e6YolEwkRNRET0mkqUqJOSkso6DiIiIlKhRIna1ta2rOMgIiIiFdiYjIiISIup9XgWAKSlpWHdunU4f/48srKyIJPJFOZLJBLExMS8doBERERVmVqJ+o8//oCbmxuePn2KZs2a4dKlS2jZsiUyMzORmpqKRo0awcbGRtOxEhERVTlqXfoOCQmBsbExrl69ikOHDkEIgVWrVuHmzZuIjIzEw4cPERYWpulYiYiIqhy1EvXx48cxfvx4NGjQADo6z6souvQ9aNAgjBgxAtOnT9dclERERFWUWolaJpPBwsICAFCjRg1IpVI8ePBAPr9NmzY4d+6cZiIkIiKqwtRK1Pb29vJnq3V0dGBvb49Dhw7J5584cQI1atRQK6DVq1fDzs4OhoaGcHJywunTp19aPjMzE5MnT0a9evVgYGCApk2bYt++fWqtm4iISNuolai9vLywbds2+fuJEyfiu+++g6enJ95++21ERERg+PDhpa43MjISwcHBmDdvHs6fPw8HBwd4e3vj7t27Ksvn5eXhnXfeQXJyMqKionD16lWEh4fDyspKnc0iIiLSOiVu9f3w4UPUrFkTADBr1iwMGzYM+fn50NPTw7Rp05CTk4Pt27dDKpVizpw5+OSTT0odzIoVKzBu3DgEBAQAANasWYO9e/di/fr1CAkJUSq/fv16PHjwACdOnICenh4AwM7OrtTrJSIi0lYlPqO2tLREv379EBUVBSMjI3To0EGeHCUSCWbPno0LFy7g7NmzmD9/PvT19UsVSF5eHs6dOwdPT89/g9PRgaenJ06ePKlymd
27d8PZ2RmTJ0+GhYUFWrdujcWLF6OwsLBU6yYiItJWJU7UAwcOxKFDhzBkyBBYWFhg9OjRiImJgRBCI4Hcu3cPhYWF8kZqRSwsLJCWlqZymcTERERFRaGwsBD79u3DnDlzsHz5cixatKjY9eTm5iI7O1vhRUREpK1KnKg3bdqEu3fv4scff0S3bt2wadMmeHl5wcrKCh9++GGFtPKWyWSoW7cu1q5diw4dOmDIkCGYNWsW1qxZU+wyoaGhMDMzk7/YMQsREWmzUjUmq1atGoYNG4Y9e/YgLS0NX3/9NZo0aYKVK1eiU6dOaN68ORYtWoTExMRSB2Jubg6pVIr09HSF6enp6bC0tFS5TL169dC0aVNIpVL5tBYtWiAtLQ15eXkql5k5cyaysrLkr5s3b5Y6ViIiovKi9qAcNWvWxPjx4xEfH4+UlBSEhYXByMgIc+fORZMmTdClS5dS1aevr48OHToo9A8uk8kQExMDZ2dnlcu4uLjg+vXrCv2MX7t2DfXq1Sv2HrmBgQFMTU0VXkRERNpKI6NnWVlZYfr06YiIiICvry+EEPjtt99KXU9wcDDCw8MRERGBv/76CxMnTkROTo68FfioUaMwc+ZMefmJEyfiwYMHCAoKwrVr17B3714sXrwYkydP1sRmERERVTi1R88qkpKSgs2bN2PLli24fPkyhBDo0qULRowYUeq6hgwZgoyMDMydOxdpaWlo164d9u/fL29glpKSIu+yFABsbGwQHR2NDz74AG3btoWVlRWCgoIwY8aM190sIiIiraBWor537x5++uknbN68GSdPnoQQAs2bN8fChQsxYsSI13qWOTAwEIGBgSrnxcXFKU1zdnbGqVOn1F4fERGRNitxos7JycGOHTuwefNmxMTEID8/H/Xq1cO0adMwYsQItG/fvizjJCIiqpJKnKjr1q2LZ8+ewdjYGMOHD8eIESPg4eGhcCmaiIiINKvEidrT0xMjRoxAnz59YGhoWJYxERER0f8rcaLetWtXWcZBREREKvC6NRERkRZjoiYiItJiTNRERERajImaiIhIizFRExERaTG1EvWWLVvg7+9f7PyAgAD89NNP6sZERERE/0+tRP3FF1/AwMCg2PnVqlXDF198oXZQRERE9Jxaifrq1atwdHQsdr6DgwOuXLmidlBERET0nFqJWgiBzMzMYuc/fPgQ+fn56sZERERE/0+tRO3o6IgtW7YgLy9PaV5ubi42b9780jNuIiIiKhm1EnVISAguX74Md3d37NmzB4mJiUhMTMTu3bvh5uaGhIQEhISEaDpWIiKiKket8ah79OiBdevWISgoCH379pVPF0LAxMQE4eHh6NWrl6ZiJCIiqrLUStQA4O/vj/79++PgwYO4ceMGAKBRo0bw8vKCiYmJxgIkIiKqytRO1ABgamqKAQMGaCoWIiIi+o8SJeqUlBQAQIMGDRTev0pReSIiIlJPiRK1nZ0dJBIJnj59Cn19ffn7VyksLHztAImIiKqyEiXq9evXQyKRQE9PT+E9ERERla0SJer/9uv9sn6+iYiISHNK/Rz1kydPULt2bSxbtqws4iEiIqIXlDpRGxkZQVdXF9WrVy+LeIiIiOgFavVMNmDAAERFRUEIoel4iIiI6AVqPUc9dOhQTJo0Ce7u7hg3bhzs7OxQrVo1pXLt27d/7QCJiIiqMrUStZubm/zvo0ePKs0XQkAikfDxLCIiotekVqIu68ezVq9ejWXLliEtLQ0ODg748ssv0alTp1cut3XrVgwbNgy+vr7YuXNnmcVHRERUXtRK1GX5eFZkZCSCg4OxZs0aODk5YeXKlfD29sbVq1dRt27dYpdLTk7GRx99hG7dupVZbEREROVNrcZkHh4eiImJKXZ+bGwsPDw81ApoxYoVGDduHAICAtCyZUusWbMGRkZGWL9+fbHLFBYWYsSIEViwYAEaNmyo1nqJiIi0kVqJOi4uDunp6cXOv3v3LuLj40tdb15eHs6dOwdPT8
9/A9TRgaenJ06ePFnscgsXLkTdunUxZsyYUq+TiIhIm6k9etbL7lFfv35draEu7927h8LCQlhYWChMt7CwwJUrV1Quc+zYMaxbtw4XL14s0Tpyc3ORm5srf5+dnV3qOImIiMpLiRN1REQEIiIi5O8XLVqE8PBwpXKZmZn4448/0LNnT81E+BKPHj3CyJEjER4eDnNz8xItExoaigULFpRxZERERJpR4kT95MkTZGRkyN8/evQIOjqKV84lEgmqV6+OCRMmYO7cuaUOxtzcHFKpVOmyenp6OiwtLZXK37hxA8nJyfDx8ZFPk8lkAABdXV1cvXoVjRo1Ulhm5syZCA4Olr/Pzs6GjY1NqWMlIiIqDyVO1BMnTsTEiRMBAPb29li1ahX69Omj0WD09fXRoUMHxMTEoG/fvgCeJ96YmBgEBgYqlW/evDkuXbqkMG327Nl49OgRVq1apTIBGxgYwMDAQKNxExERlRW17lEnJSVpOg654OBg+Pn5oWPHjujUqRNWrlyJnJwcBAQEAABGjRoFKysrhIaGwtDQEK1bt1ZYvkaNGgCgNJ2IiOhNpHZjssLCQmzbtg2xsbG4e/cuFi5ciDZt2iArKwsxMTFwcXFRahRWEkOGDEFGRgbmzp2LtLQ0tGvXDvv375fXlZKSonTJnYiIqLJSK1FnZmbi3XffxenTp2FsbIycnBxMmTIFAGBsbIypU6di1KhRWLx4sVpBBQYGqrzUDTx/NOxlNm7cqNY6iYiItJFap6YhISFISEhAdHQ0EhMTFUbRkkqlGDhwIPbt26exIImIiKoqtRL1zp07MWXKFLzzzjsqn6du2rQpkpOTXzc2IiKiKk+tRJ2VlQV7e/ti5+fn56OgoEDtoIiIiOg5tRJ1o0aNcP78+WLnHzhwAC1btlQ7KCIiInpOrUQ9duxYrF+/HpGRkfL70xKJBLm5uZg1axb279+P8ePHazRQIiKiqkitVt9BQUFISEjAsGHD5M8tDx8+HPfv30dBQQHGjx/PATKIiIg0QK1ELZFIEB4eDj8/P0RFReHvv/+GTCZDo0aNMHjwYHTv3l3TcRIREVVJand4AgBdu3ZF165dNRULERER/Qe7+CIiItJiJT6jLu0AHBKJBLt27Sp1QERERPSvEifqX375BYaGhrC0tFToiaw4qjpCISIiotIpcaK2srJCamoqzM3NMXz4cAwdOlTlGNFERESkOSW+R33z5k3ExsbC0dERn376KWxsbODp6YkNGzbg0aNHZRkjERFRlVWqxmSurq749ttvkZaWhqioKNSuXRuBgYGoW7cu+vfvj6ioKOTm5pZVrERERFWOWq2+9fT04Ovri8jISKSnp8uT95AhQ7B06VJNx0hERFRlvdbjWbm5uYiOjsauXbtw4cIFGBoaws7OTkOhERERUakTtUwmQ3R0NPz9/WFhYYFhw4bh6dOnCA8Px927dzFy5MiyiJOIiKhKKnGr7xMnTmDz5s3Ytm0b7t+/j86dO2Px4sUYPHgwzM3NyzJGIiKiKqvEibpr166oVq0aevbsiWHDhskvcaekpCAlJUXlMu3bt9dIkERERFVVqfr6fvr0KbZv346ff/75peWEEJBIJCgsLHyt4IiIiKq6EifqDRs2lGUcREREpEKJE7Wfn19ZxkFEREQqcPQsIiIiLcZETUREpMWYqImIiLQYEzUREZEWY6ImIiLSYlqZqFevXg07OzsYGhrCyckJp0+fLrZseHg4unXrhpo1a6JmzZrw9PR8aXkiIqI3idYl6sjISAQHB2PevHk4f/48HBwc4O3tjbt376osHxcXh2HDhiE2NhYnT56EjY0NvLy8kJqaWs6RExERaZ7WJeoVK1Zg3LhxCAgIQMuWLbFmzRoYGRlh/fr1Kstv2rQJkyZNQrt27dC8eXN89913kMlkiImJKefIiYiINE+rEnVeXh7OnTsHT09P+TQdHR14enri5MmTJarjyZMnyM/PR61atcoqTCIionJTqr6+y9
q9e/dQWFgICwsLhekWFha4cuVKieqYMWMG6tevr5DsX5Sbm4vc3Fz5++zsbPUDJiIiKmNadUb9usLCwrB161bs2LEDhoaGKsuEhobCzMxM/rKxsSnnKImIiEpOqxK1ubk5pFIp0tPTFaanp6fD0tLypct+/vnnCAsLw4EDB9C2bdtiy82cORNZWVny182bNzUSOxERUVnQqkStr6+PDh06KDQEK2oY5uzsXOxyS5cuxaeffor9+/ejY8eOL12HgYEBTE1NFV5ERETaSqvuUQNAcHAw/Pz80LFjR3Tq1AkrV65ETk4OAgICAACjRo2ClZUVQkNDAQBLlizB3LlzsXnzZtjZ2SEtLQ0AYGxsDGNj4wrbDiIiIk3QukQ9ZMgQZGRkYO7cuUhLS0O7du2wf/9+eQOzlJQU6Oj8eyHgm2++QV5eHgYOHKhQz7x58zB//vzyDJ2IiEjjtC5RA0BgYCACAwNVzouLi1N4n5ycXPYBERERVRCtukdNREREipioiYiItBgTNRERkRZjoiYiItJiTNRERERajImaiIhIizFRExERaTEmaiIiIi3GRE1ERKTFmKiJiIi0GBM1ERGRFmOiJiIi0mJM1ERERFqMiZqIiEiLMVETERFpMSZqIiIiLcZETUREpMWYqImIiLQYEzUREZEWY6ImIiLSYkzUREREWoyJmoiISIsxURMREWkxJmoiIiItxkRNRESkxZioiYiItBgTNRERkRbTykS9evVq2NnZwdDQEE5OTjh9+vRLy2/btg3NmzeHoaEh2rRpg3379pVTpERERGVL6xJ1ZGQkgoODMW/ePJw/fx4ODg7w9vbG3bt3VZY/ceIEhg0bhjFjxuDChQvo27cv+vbti8uXL5dz5ERERJqndYl6xYoVGDduHAICAtCyZUusWbMGRkZGWL9+vcryq1atwrvvvovp06ejRYsW+PTTT9G+fXt89dVX5Rw5ERGR5mlVos7Ly8O5c+fg6ekpn6ajowNPT0+cPHlS5TInT55UKA8A3t7exZYnIiJ6k+hWdAAvunfvHgoLC2FhYaEw3cLCAleuXFG5TFpamsryaWlpKsvn5uYiNzdX/j4rKwsAkJ2d/Tqhy+U/zdFIPapkFxaUWd35T/LLrG5N7Vv6F48zZTzONI/HmTJNHWdF9QghXllWqxJ1eQgNDcWCBQuUptvY2FRANKVjVtEBqGkHdlR0CFQKPM6oPPA4e+7Ro0cwM3v53tCqRG1ubg6pVIr09HSF6enp6bC0tFS5jKWlZanKz5w5E8HBwfL3MpkMDx48QO3atSGRSF5zC7RHdnY2bGxscPPmTZiamlZ0OFRJ8Tij8lAZjzMhBB49eoT69eu/sqxWJWp9fX106NABMTEx6Nu3L4DniTQmJgaBgYEql3F2dkZMTAymTZsmn3bw4EE4OzurLG9gYAADAwOFaTVq1NBE+FrJ1NS00hzYpL14nFF5qGzH2avOpItoVaIGgODgYPj5+aFjx47o1KkTVq5ciZycHAQEBAAARo0aBSsrK4SGhgIAgoKC4OrqiuXLl6NXr17YunUrzp49i7Vr11bkZhAREWmE1iXqIUOGICMjA3PnzkVaWhratWuH/fv3yxuMpaSkQEfn38bqXbp0webNmzF79mx88sknaNKkCXbu3InWrVtX1CYQERFpjESUpMkZvXFyc3MRGhqKmTNnKl3qJ9IUHmdUHqr6ccZETUREpMW0qsMTIiIiUsRETUREpMWYqImIiLQYE3Ulc+TIEfj4+KB+/fqQSCTYuXNnRYdElUxoaCjeeustmJiYoG7duujbty+uXr1a0WFRJfPNN9+gbdu28mennZ2d8euvv1Z0WBWCibqSycnJgYODA1avXl3RoVAlFR8fj8mTJ+PUqVM4ePAg8vPz4eXlhZycsusXmqoea2trhIWF4dy5czh79iw8PDzg6+uLhISEig6t3LHVdyUmkUiwY8cOeS9vRGUhIyMDdevWRXx8PLp3717R4VAlVqtWLS
xbtgxjxoyp6FDKldZ1eEJEb5aiEehq1apVwZFQZVVYWIht27YhJyen2O6hKzMmaiJSm0wmw7Rp0+Di4sLeAEnjLl26BGdnZzx79gzGxsbYsWMHWrZsWdFhlTsmaiJS2+TJk3H58mUcO3asokOhSqhZs2a4ePEisrKyEBUVBT8/P8THx1e5ZM1ETURqCQwMxC+//IIjR47A2tq6osOhSkhfXx+NGzcGAHTo0AFnzpzBqlWr8O2331ZwZOWLiZqISkUIgSlTpmDHjh2Ii4uDvb19RYdEVYRMJkNubm5Fh1HumKgrmcePH+P69evy90lJSbh48SJq1aqFBg0aVGBkVFlMnjwZmzdvxq5du2BiYoK0tDQAz8fWrVatWgVHR5XFzJkz0aNHDzRo0ACPHj3C5s2bERcXh+jo6IoOrdzx8axKJi4uDu7u7krT/fz8sHHjxvIPiCodiUSicvqGDRvg7+9fvsFQpTVmzBjExMTgzp07MDMzQ9u2bTFjxgy88847FR1auWOiJiIi0mLsmYyIiEiLMVETERFpMSZqIiIiLcZETUREpMWYqImIiLQYEzUREZEWY6ImIiLSYkzUREREWoyJmohKZf78+ZBIJLh3715Fh0JUJTBRExERaTEmaiIiIi3GRE1ERKTFmKiJ6LX9888/aNy4MVq3bo309PSKDoeoUmGiJqLXcuPGDXTv3h0mJiaIi4uDhYVFRYdEVKkwUROR2q5cuYLu3bvDwsIChw8fhrm5eUWHRFTpMFETkVouX74MV1dX2NnZ4dChQ6hZs2ZFh0RUKTFRE5FafHx8YGJigujoaJiamlZ0OESVFhM1EallwIABuHHjBjZt2lTRoRBVaroVHQARvZmWLVsGXV1dTJo0CSYmJhg+fHhFh0RUKTFRE5FaJBIJ1q5di0ePHsHPzw/Gxsbo06dPRYdFVOnw0jcRqU1HRwc//vgjvLy8MHjwYBw+fLiiQyKqdJioiei16OnpISoqCp07d4avry9+++23ig6JqFKRCCFERQdBREREqvGMmoiISIsxURMREWkxJmoiIiItxkRNRESkxZioiYiItBgTNRERkRZjoiYiItJiTNRERERajImaiIhIizFRExERaTEmaiIiIi3GRE1ERKTFmKiJiIi02P8BLa9Mrn12vk0AAAAASUVORK5CYII=",
"text/plain": [
"<Figure size 500x300 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"\n",
"# Prepare data\n",
"metrics = [\"precision\", \"recall\", \"ndcg\"]\n",
"k_values = [1, 2, 3]\n",
"bar_width = 0.15\n",
"opacity = 0.8\n",
"\n",
"# Create subplots\n",
"fig, ax = plt.subplots(figsize=(5, 3))\n",
"\n",
"# Plotting each metric\n",
"for i, metric_name in enumerate(metrics):\n",
"    y = [evaluate_results.metrics[f\"{metric_name}_at_{k}/mean\"] for k in k_values]\n",
"    x = np.arange(len(k_values)) + i * bar_width\n",
"    ax.bar(x, y, width=bar_width, alpha=opacity, label=f\"{metric_name}@k\")\n",
"\n",
"# Adding labels and title\n",
"ax.set_xlabel(\"k\", fontsize=12)\n",
"ax.set_ylabel(\"Metric Value\", fontsize=12)\n",
"ax.set_title(\"Metrics Comparison at Different Ks\", fontsize=14)\n",
"\n",
"# Setting x-axis ticks\n",
"ax.set_xticks(np.arange(len(k_values)) + bar_width)\n",
"ax.set_xticklabels(k_values)\n",
"\n",
"# Add legend and adjust layout\n",
"ax.legend(fontsize=8)\n",
"fig.tight_layout()\n",
"\n",
"# Display the plot\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "cac23d4b-bece-4274-836f-9ca2b7c3860d",
"showTitle": false,
"title": ""
}
},
"source": [
"### Corner case handling\n",
"\n",
"A few corner cases are handled specially for each built-in metric."
]
},
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "e05a4ede-db44-46d2-bce8-752b0ce5d807",
"showTitle": false,
"title": ""
}
},
"source": [
"#### Empty retrieved document IDs\n",
"\n",
"When the retriever returns no document IDs for a query:\n",
"\n",
"- `mlflow.metrics.precision_at_k(k)` is defined as:\n",
"  * 0 if the ground-truth doc IDs are non-empty\n",
"  * 1 if the ground-truth doc IDs are also empty\n",
"\n",
"- `mlflow.metrics.ndcg_at_k(k)` is defined as:\n",
"  * 0 if the ground-truth doc IDs are non-empty\n",
"  * 1 if the ground-truth doc IDs are also empty"
]
},
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "931a32e7-29cb-4a22-b94e-ea2bf4f0b1a7",
"showTitle": false,
"title": ""
}
},
"source": [
"#### Empty ground-truth document IDs\n",
"\n",
"When no ground-truth document IDs are provided for a query:\n",
"\n",
"- `mlflow.metrics.recall_at_k(k)` is defined as:\n",
"  * 0 if the retrieved doc IDs are non-empty\n",
"  * 1 if the retrieved doc IDs are also empty\n",
"\n",
"- `mlflow.metrics.ndcg_at_k(k)` is defined as:\n",
"  * 0 if the retrieved doc IDs are non-empty\n",
"  * 1 if the retrieved doc IDs are also empty"
]
},
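{
"cell_type": "markdown",
"metadata": {},
"source": [
"The empty-input conventions above can be sketched in plain Python. This is a minimal illustration, not MLflow's implementation: the helpers `precision_at_k` and `recall_at_k` below are hypothetical local functions that only borrow the metric names, and they assume binary relevance (a doc ID is either in the ground truth or not).\n",
"\n",
"```python\n",
"# Hypothetical helpers for illustration only; these are not the mlflow.metrics functions.\n",
"def precision_at_k(retrieved, relevant, k):\n",
"    # Fraction of the top-k retrieved doc IDs that appear in the ground truth.\n",
"    top_k = list(retrieved)[:k]\n",
"    relevant_set = set(relevant)\n",
"    if not top_k:\n",
"        # Nothing retrieved: 1 if there was nothing to find, otherwise 0.\n",
"        return 1.0 if not relevant_set else 0.0\n",
"    hits = sum(1 for doc_id in top_k if doc_id in relevant_set)\n",
"    return hits / len(top_k)\n",
"\n",
"\n",
"def recall_at_k(retrieved, relevant, k):\n",
"    # Fraction of the ground-truth doc IDs that appear among the top-k retrieved doc IDs.\n",
"    top_k = set(list(retrieved)[:k])\n",
"    relevant_set = set(relevant)\n",
"    if not relevant_set:\n",
"        # No ground truth: 1 if nothing was retrieved either, otherwise 0.\n",
"        return 1.0 if not top_k else 0.0\n",
"    return len(top_k & relevant_set) / len(relevant_set)\n",
"```\n",
"\n",
"Under these assumptions, an empty retrieval scored against a non-empty ground truth gets precision 0, while a query with neither retrieved nor ground-truth IDs gets 1 for both metrics, so rows with nothing to find do not drag down the mean."
]
},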
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "5a1453f6-a62d-43da-b230-955841c66651",
"showTitle": false,
"title": ""
}
},
"source": [
"#### Duplicate retrieved document IDs\n",
"\n",
"It is common for the retriever in a RAG system to return multiple chunks from the same document for a given query. In this case, `mlflow.metrics.ndcg_at_k(k)` is calculated as follows:\n",
"\n",
"If the duplicate doc IDs are in the ground truth, each occurrence is treated as a different doc. For example, if the ground-truth doc IDs are [1, 2] and the retrieved doc IDs are [1, 1, 1, 3], the score is equivalent to that for ground-truth doc IDs [10, 11, 12, 2] and retrieved doc IDs [10, 11, 12, 3].\n",
"\n",
"If the duplicate doc IDs are not in the ground truth, the NDCG score is calculated as normal."
]
},
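{
"cell_type": "markdown",
"metadata": {},
"source": [
"The duplicate-handling rule above can be illustrated with a small binary-relevance NDCG sketch. This is not MLflow's implementation: `ndcg_at_k` here is a hypothetical local helper, and it assumes that duplicates are counted over the full retrieved list before truncating to the top k and that every relevant doc has gain 1.\n",
"\n",
"```python\n",
"import math\n",
"from collections import Counter\n",
"\n",
"\n",
"# Hypothetical helper for illustration only; this is not mlflow.metrics.ndcg_at_k.\n",
"def ndcg_at_k(retrieved, relevant, k):\n",
"    retrieved = list(retrieved)\n",
"    relevant_set = set(relevant)\n",
"    if not retrieved or not relevant_set:\n",
"        # Corner cases: 1 only when both lists are empty, otherwise 0.\n",
"        return 1.0 if (not retrieved and not relevant_set) else 0.0\n",
"\n",
"    # Binary gain per rank: 1 if the retrieved doc ID is in the ground truth.\n",
"    gains = [1.0 if doc_id in relevant_set else 0.0 for doc_id in retrieved]\n",
"\n",
"    # Each repeated hit on a relevant doc ID counts as a distinct relevant doc,\n",
"    # so the ideal ranking grows by one slot per extra occurrence.\n",
"    hit_counts = Counter(doc_id for doc_id in retrieved if doc_id in relevant_set)\n",
"    n_relevant = sum(max(1, hit_counts[doc_id]) for doc_id in relevant_set)\n",
"\n",
"    dcg = sum(gain / math.log2(rank + 2) for rank, gain in enumerate(gains[:k]))\n",
"    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(min(k, n_relevant)))\n",
"    return dcg / idcg\n",
"\n",
"\n",
"with_duplicates = ndcg_at_k([1, 1, 1, 3], [1, 2], k=4)\n",
"with_unique_ids = ndcg_at_k([10, 11, 12, 3], [10, 11, 12, 2], k=4)\n",
"```\n",
"\n",
"In this sketch `with_duplicates` and `with_unique_ids` come out equal, which mirrors the equivalence described in the example above. Duplicates that are not in the ground truth contribute zero gain at their ranks either way, so they need no special treatment."
]
},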
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "525ccc10-3a60-4dc9-804e-083cfa313349",
"showTitle": false,
"title": ""
}
},
"source": [
"## Step 4: Result Analysis and Visualization\n",
"\n",
"You can view the per-row scores in the logged \"eval_results_table.json\" artifact by either loading it into a pandas DataFrame (shown below) or visiting the MLflow run comparison UI."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "32f3d5b3-245c-46b7-87ce-d85e261eac28",
"showTitle": true,
"title": ""
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Downloading artifacts: 100%|██████████| 1/1 [00:00<00:00, 574.25it/s] "
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>question</th>\n",
"      <th>source</th>\n",
"      <th>retrieved_doc_ids</th>\n",
"      <th>precision_at_1/score</th>\n",
"      <th>precision_at_2/score</th>\n",
"      <th>precision_at_3/score</th>\n",
"      <th>recall_at_1/score</th>\n",
"      <th>recall_at_2/score</th>\n",
"      <th>recall_at_3/score</th>\n",
"      <th>ndcg_at_1/score</th>\n",
"      <th>ndcg_at_2/score</th>\n",
"      <th>ndcg_at_3/score</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>0</th>\n",
"      <td>What are her research responsibilities in corn...</td>\n",
"      <td>[Moths Abundant Around Iowa _ Integrated Crop ...</td>\n",
"      <td>[agllm-data/Moths Abundant Around Iowa _ Integ...</td>\n",
"      <td>0</td>\n",
"      <td>0</td>\n",
"      <td>0</td>\n",
"      <td>0</td>\n",
"      <td>0</td>\n",
"      <td>0</td>\n",
"      <td>0</td>\n",
"      <td>0.000000</td>\n",
"      <td>0.306574</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1</th>\n",
"      <td>What important degree day benchmark did some p...</td>\n",
"      <td>[Start Scouting for Stalk Borer _ Integrated C...</td>\n",
"      <td>[agllm-data/Start Scouting for Stalk Borer _ I...</td>\n",
"      <td>0</td>\n",
"      <td>0</td>\n",
"      <td>0</td>\n",
"      <td>0</td>\n",
"      <td>0</td>\n",
"      <td>0</td>\n",
"      <td>0</td>\n",
"      <td>0.386853</td>\n",
"      <td>0.530721</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" question \\\n",
"0 What are her research responsibilities in corn... \n",
"1 What important degree day benchmark did some p... \n",
"\n",
" source \\\n",
"0 [Moths Abundant Around Iowa _ Integrated Crop ... \n",
"1 [Start Scouting for Stalk Borer _ Integrated C... \n",
"\n",
" retrieved_doc_ids precision_at_1/score \\\n",
"0 [agllm-data/Moths Abundant Around Iowa _ Integ... 0 \n",
"1 [agllm-data/Start Scouting for Stalk Borer _ I... 0 \n",
"\n",
" precision_at_2/score precision_at_3/score recall_at_1/score \\\n",
"0 0 0 0 \n",
"1 0 0 0 \n",
"\n",
" recall_at_2/score recall_at_3/score ndcg_at_1/score ndcg_at_2/score \\\n",
"0 0 0 0 0.000000 \n",
"1 0 0 0 0.386853 \n",
"\n",
" ndcg_at_3/score \n",
"0 0.306574 \n",
"1 0.530721 "
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"eval_results_table = evaluate_results.tables[\"eval_results_table\"]\n",
"eval_results_table.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "cf18dd29-1017-4245-9f3b-923dbd46f742",
"showTitle": false,
"title": ""
}
},
"source": [
"With the evaluation results table, you can further visualize the well-answered and poorly answered questions using topic analysis techniques."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "b1d9e40a-ccf6-4d6a-b24c-8cf41bbfa005",
"showTitle": true,
"title": "Utility functions"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[nltk_data] Downloading package punkt to\n",
"[nltk_data] /Users/liang.zhang/nltk_data...\n",
"[nltk_data] Package punkt is already up-to-date!\n",
"[nltk_data] Downloading package stopwords to\n",
"[nltk_data] /Users/liang.zhang/nltk_data...\n",
"[nltk_data] Package stopwords is already up-to-date!\n"
]
}
],
"source": [
"from typing import List\n",
"\n",
"import nltk\n",
"import pyLDAvis.gensim_models as gensimvis\n",
"from gensim import corpora, models\n",
"from nltk.corpus import stopwords\n",
"from nltk.tokenize import word_tokenize\n",
"\n",
"# Initialize NLTK resources\n",
"nltk.download(\"punkt\")\n",
"nltk.download(\"stopwords\")\n",
"\n",
"\n",
"def topical_analysis(questions: List[str]):\n",
"    stop_words = set(stopwords.words(\"english\"))\n",
"\n",
"    # Tokenize and remove stop words\n",
"    tokenized_data = []\n",
"    for question in questions:\n",
"        tokens = word_tokenize(question.lower())\n",
"        filtered_tokens = [word for word in tokens if word not in stop_words and word.isalpha()]\n",
"        tokenized_data.append(filtered_tokens)\n",
"\n",
"    # Create a dictionary and corpus\n",
"    dictionary = corpora.Dictionary(tokenized_data)\n",
"    corpus = [dictionary.doc2bow(text) for text in tokenized_data]\n",
"\n",
"    # Apply LDA model\n",
"    lda_model = models.LdaModel(corpus, num_topics=5, id2word=dictionary, passes=15)\n",
"\n",
"    # Get topic distribution for each question\n",
"    topic_distribution = []\n",
"    for i, ques in enumerate(questions):\n",
"        bow = dictionary.doc2bow(tokenized_data[i])\n",
"        topics = lda_model.get_document_topics(bow)\n",
"        topic_distribution.append(topics)\n",
"        print(f\"Question: {ques}\\nTopic: {topics}\")\n",
"\n",
"    # Print all topics\n",
"    print(\"\\nTopics found are:\")\n",
"    for idx, topic in lda_model.print_topics(-1):\n",
"        print(f\"Topic: {idx} \\nWords: {topic}\\n\")\n",
"    return lda_model, corpus, dictionary"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "e892d804-a4d8-468c-93e2-acc4a5fbcf2c",
"showTitle": false,
"title": ""
}
},
"outputs": [],
"source": [
"filtered_df = eval_results_table[eval_results_table[\"precision_at_1/score\"] == 1]\n",
"hit_questions = filtered_df[\"question\"].tolist()\n",
"filtered_df = eval_results_table[eval_results_table[\"precision_at_1/score\"] == 0]\n",
"miss_questions = filtered_df[\"question\"].tolist()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "7c178b69-37d4-4a6b-9737-b93e7f3d75c5",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Question: What is the purpose of the MLflow Model Registry?\n",
"Topic: [(0, 0.0400703), (1, 0.040002838), (2, 0.040673085), (3, 0.04075462), (4, 0.8384991)]\n",
"Question: What is the purpose of registering a model with the Model Registry?\n",
"Topic: [(0, 0.0334267), (1, 0.033337697), (2, 0.033401005), (3, 0.033786207), (4, 0.8660484)]\n",
"Question: What can you do with registered models and model versions?\n",
"Topic: [(0, 0.04019648), (1, 0.04000775), (2, 0.040166058), (3, 0.8391777), (4, 0.040452003)]\n",
"Question: How can you add, modify, update, or delete a model in the Model Registry?\n",
"Topic: [(0, 0.025052568), (1, 0.025006149), (2, 0.025024023), (3, 0.025236268), (4, 0.899681)]\n",
"Question: How can you deploy and organize models in the Model Registry?\n",
"Topic: [(0, 0.033460867), (1, 0.033337582), (2, 0.033362914), (3, 0.8659808), (4, 0.033857808)]\n",
"Question: What method do you use to create a new registered model?\n",
"Topic: [(0, 0.028867528), (1, 0.028582651), (2, 0.882546), (3, 0.030021703), (4, 0.029982116)]\n",
"Question: How can you deploy and organize models in the Model Registry?\n",
"Topic: [(0, 0.033460878), (1, 0.033337586), (2, 0.033362918), (3, 0.8659798), (4, 0.03385884)]\n",
"Question: How can you fetch a list of registered models in the MLflow registry?\n",
"Topic: [(0, 0.0286206), (1, 0.028577656), (2, 0.02894385), (3, 0.88495284), (4, 0.028905064)]\n",
"Question: What is the default channel logged for models using MLflow v1.18 and above?\n",
"Topic: [(0, 0.02862059), (1, 0.028577654), (2, 0.028883327), (3, 0.8851736), (4, 0.028744776)]\n",
"Question: What information is stored in the conda.yaml file?\n",
"Topic: [(0, 0.050020963), (1, 0.051287953), (2, 0.051250603), (3, 0.7968765), (4, 0.05056402)]\n",
"Question: How can you save a model with a manually specified conda environment?\n",
"Topic: [(0, 0.02862434), (1, 0.02858204), (2, 0.02886313), (3, 0.8851747), (4, 0.028755778)]\n",
"Question: What are inference params and how are they used during model inference?\n",
"Topic: [(0, 0.86457103), (1, 0.03353862), (2, 0.033417325), (3, 0.034004394), (4, 0.034468662)]\n",
"Question: What is the purpose of model signatures in MLflow?\n",
"Topic: [(0, 0.040070876), (1, 0.04000346), (2, 0.040688124), (3, 0.040469088), (4, 0.8387685)]\n",
"Question: What is the API used to set signatures on models?\n",
"Topic: [(0, 0.033873636), (1, 0.033508822), (2, 0.033337757), (3, 0.035357967), (4, 0.8639218)]\n",
"Question: What components are used to generate the final time series?\n",
"Topic: [(0, 0.028693806), (1, 0.8853218), (2, 0.028573763), (3, 0.02862714), (4, 0.0287835)]\n",
"Question: What functionality does the configuration DataFrame submitted to the pyfunc flavor provide?\n",
"Topic: [(0, 0.02519801), (1, 0.025009492), (2, 0.025004204), (3, 0.025004204), (4, 0.8997841)]\n",
"Question: What is a common configuration for lowering the total memory pressure for pytorch models within transformers pipelines?\n",
"Topic: [(0, 0.93316424), (1, 0.016669936), (2, 0.016668117), (3, 0.016788227), (4, 0.016709473)]\n",
"Question: What does the save_model() function do?\n",
"Topic: [(0, 0.10002145), (1, 0.59994656), (2, 0.10001026), (3, 0.10001026), (4, 0.10001151)]\n",
"Question: What is an MLflow Project?\n",
"Topic: [(0, 0.06667001), (1, 0.06667029), (2, 0.7321751), (3, 0.06711196), (4, 0.06737265)]\n",
"Question: What are the entry points in a MLproject file and how can you specify parameters for them?\n",
"Topic: [(0, 0.02857626), (1, 0.88541776), (2, 0.02868285), (3, 0.028626908), (4, 0.02869626)]\n",
"Question: What are the project environments supported by MLflow?\n",
"Topic: [(0, 0.040009078), (1, 0.040009864), (2, 0.839655), (3, 0.040126894), (4, 0.040199146)]\n",
"Question: What is the purpose of specifying a Conda environment in an MLflow project?\n",
"Topic: [(0, 0.028579442), (1, 0.028580135), (2, 0.8841217), (3, 0.028901232), (4, 0.029817443)]\n",
"Question: What is the purpose of the MLproject file?\n",
"Topic: [(0, 0.05001335), (1, 0.052611485), (2, 0.050071735), (3, 0.05043289), (4, 0.7968705)]\n",
"Question: How can you pass runtime parameters to the entry point of an MLflow Project?\n",
"Topic: [(0, 0.025007373), (1, 0.025498485), (2, 0.8993807), (3, 0.02504522), (4, 0.025068246)]\n",
"Question: How does MLflow run a Project on Kubernetes?\n",
"Topic: [(0, 0.04000677), (1, 0.040007353), (2, 0.83931196), (3, 0.04012452), (4, 0.04054937)]\n",
"Question: What fields are replaced when MLflow creates a Kubernetes Job for an MLflow Project?\n",
"Topic: [(0, 0.022228329), (1, 0.022228856), (2, 0.023192631), (3, 0.02235802), (4, 0.90999216)]\n",
"Question: What is the syntax for searching runs using the MLflow UI and API?\n",
"Topic: [(0, 0.025003674), (1, 0.02500399), (2, 0.02527212), (3, 0.89956146), (4, 0.025158761)]\n",
"Question: What is the syntax for searching runs using the MLflow UI and API?\n",
"Topic: [(0, 0.025003672), (1, 0.025003988), (2, 0.025272164), (3, 0.8995614), (4, 0.025158769)]\n",
"Question: What are the key parts of a search expression in MLflow?\n",
"Topic: [(0, 0.03334423), (1, 0.03334517), (2, 0.8662702), (3, 0.033611353), (4, 0.033429127)]\n",
"Question: What are the key attributes for the model with the run_id 'a1b2c3d4' and run_name 'my-run'?\n",
"Topic: [(0, 0.05017508), (1, 0.05001634), (2, 0.05058142), (3, 0.7985237), (4, 0.050703418)]\n",
"Question: What information does each run record in MLflow Tracking?\n",
"Topic: [(0, 0.03333968), (1, 0.033340227), (2, 0.86639804), (3, 0.03349555), (4, 0.033426523)]\n",
"Question: What are the two components used by MLflow for storage?\n",
"Topic: [(0, 0.0334928), (1, 0.033938777), (2, 0.033719826), (3, 0.03357158), (4, 0.86527705)]\n",
"Question: What interfaces does the MLflow client use to record MLflow entities and artifacts when running MLflow on a local machine with a SQLAlchemy-compatible database?\n",
"Topic: [(0, 0.014289577), (1, 0.014289909), (2, 0.94276434), (3, 0.014325481), (4, 0.014330726)]\n",
"Question: What is the default backend store used by MLflow?\n",
"Topic: [(0, 0.033753525), (1, 0.03379533), (2, 0.033777602), (3, 0.86454684), (4, 0.0341267)]\n",
"Question: What information does autologging capture when launching short-lived MLflow runs?\n",
"Topic: [(0, 0.028579954), (1, 0.02858069), (2, 0.8851724), (3, 0.029027484), (4, 0.028639426)]\n",
"Question: What is the purpose of the --serve-artifacts flag?\n",
"Topic: [(0, 0.06670548), (1, 0.066708855), (2, 0.067003354), (3, 0.3969311), (4, 0.40265122)]\n",
"\n",
"Topics found are:\n",
"Topic: 0 \n",
"Words: 0.059*\"inference\" + 0.032*\"models\" + 0.032*\"used\" + 0.032*\"configuration\" + 0.032*\"common\" + 0.032*\"transformers\" + 0.032*\"total\" + 0.032*\"within\" + 0.032*\"pytorch\" + 0.032*\"pipelines\"\n",
"\n",
"Topic: 1 \n",
"Words: 0.036*\"file\" + 0.035*\"mlproject\" + 0.035*\"used\" + 0.035*\"components\" + 0.035*\"entry\" + 0.035*\"parameters\" + 0.035*\"specify\" + 0.035*\"final\" + 0.035*\"points\" + 0.035*\"time\"\n",
"\n",
"Topic: 2 \n",
"Words: 0.142*\"mlflow\" + 0.066*\"project\" + 0.028*\"information\" + 0.028*\"use\" + 0.028*\"record\" + 0.028*\"run\" + 0.015*\"key\" + 0.015*\"running\" + 0.015*\"artifacts\" + 0.015*\"client\"\n",
"\n",
"Topic: 3 \n",
"Words: 0.066*\"models\" + 0.066*\"model\" + 0.066*\"mlflow\" + 0.041*\"using\" + 0.041*\"registry\" + 0.028*\"api\" + 0.028*\"registered\" + 0.028*\"runs\" + 0.028*\"syntax\" + 0.028*\"searching\"\n",
"\n",
"Topic: 4 \n",
"Words: 0.089*\"model\" + 0.074*\"purpose\" + 0.074*\"mlflow\" + 0.046*\"registry\" + 0.031*\"used\" + 0.031*\"signatures\" + 0.017*\"kubernetes\" + 0.017*\"fields\" + 0.017*\"job\" + 0.017*\"replaced\"\n",
"\n"
]
}
],
"source": [
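"# Run the topic analysis on the questions the retriever got right at rank 1\n",
"lda_model, corpus, dictionary = topical_analysis(hit_questions)\n",
"# Prepare the data for the interactive pyLDAvis view\n",
"vis_data = gensimvis.prepare(lda_model, corpus, dictionary)"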
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "a0587a0f-b35d-488d-9054-55435a9585bf",
"showTitle": false,
"title": ""
}
},
"outputs": [],
"source": [
"# Uncomment the following line to render the interactive widget\n",
"# pyLDAvis.display(vis_data)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "1375250d-9818-4503-87ec-f14020d87c81",
"showTitle": false,
"title": ""
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Question: What is the purpose of the mlflow.sklearn.log_model() method?\n",
"Topic: [(0, 0.0669118), (1, 0.06701085), (2, 0.06667974), (3, 0.73235476), (4, 0.06704286)]\n",
"Question: How can you fetch a specific model version?\n",
"Topic: [(0, 0.83980393), (1, 0.040003464), (2, 0.04000601), (3, 0.040101767), (4, 0.040084846)]\n",
"Question: How can you fetch the latest model version in a specific stage?\n",
"Topic: [(0, 0.88561153), (1, 0.028575428), (2, 0.028578365), (3, 0.0286214), (4, 0.028613236)]\n",
"Question: What can you do to promote MLflow Models across environments?\n",
"Topic: [(0, 0.8661927), (1, 0.0333396), (2, 0.03362743), (3, 0.033428304), (4, 0.033411972)]\n",
"Question: What is the name of the model and its version details?\n",
"Topic: [(0, 0.83978903), (1, 0.04000637), (2, 0.04001106), (3, 0.040105395), (4, 0.040088095)]\n",
"Question: What is the purpose of saving the model in pickled format?\n",
"Topic: [(0, 0.033948876), (1, 0.03339717), (2, 0.033340737), (3, 0.86575514), (4, 0.033558063)]\n",
"Question: What is an MLflow Model and what is its purpose?\n",
"Topic: [(0, 0.7940762), (1, 0.05068333), (2, 0.050770763), (3, 0.053328265), (4, 0.05114142)]\n",
"Question: What are the flavors defined in the MLmodel file for the mlflow.sklearn library?\n",
"Topic: [(0, 0.86628276), (1, 0.033341788), (2, 0.03334801), (3, 0.03368498), (4, 0.033342462)]\n",
"Question: What command can be used to package and deploy models to AWS SageMaker?\n",
"Topic: [(0, 0.89991224), (1, 0.025005225), (2, 0.025009066), (3, 0.025006713), (4, 0.025066752)]\n",
"Question: What is the purpose of the --build-image flag when running mlflow run?\n",
"Topic: [(0, 0.033957016), (1, 0.033506736), (2, 0.034095332), (3, 0.034164555), (4, 0.86427635)]\n",
"Question: What is the relative path to the python_env YAML file within the MLflow project's directory?\n",
"Topic: [(0, 0.02243), (1, 0.02222536), (2, 0.022470985), (3, 0.9105873), (4, 0.02228631)]\n",
"Question: What are the additional local volume mounted and environment variables in the docker container?\n",
"Topic: [(0, 0.022225259), (1, 0.9110914), (2, 0.02222932), (3, 0.022227468), (4, 0.022226628)]\n",
"Question: What are some examples of entity names that contain special characters?\n",
"Topic: [(0, 0.028575381), (1, 0.88568854), (2, 0.02858065), (3, 0.028578246), (4, 0.028577149)]\n",
"Question: What type of constant does the RHS need to be if LHS is a metric?\n",
"Topic: [(0, 0.028575381), (1, 0.8856886), (2, 0.028580645), (3, 0.028578239), (4, 0.028577147)]\n",
"Question: How can you get all active runs from experiments IDs 3, 4, and 17 that used a CNN model with 10 layers and had a prediction accuracy of 94.5% or higher?\n",
"Topic: [(0, 0.015563371), (1, 0.015387185), (2, 0.015389071), (3, 0.015427767), (4, 0.9382326)]\n",
"Question: What is the purpose of the 'experimentIds' variable in the given paragraph?\n",
"Topic: [(0, 0.040206533), (1, 0.8384999), (2, 0.040013183), (3, 0.040967643), (4, 0.040312726)]\n",
"Question: What is the MLflow Tracking component used for?\n",
"Topic: [(0, 0.8390845), (1, 0.04000697), (2, 0.040462855), (3, 0.04014182), (4, 0.040303845)]\n",
"Question: How can you create an experiment in MLflow?\n",
"Topic: [(0, 0.050333958), (1, 0.0500024), (2, 0.7993825), (3, 0.050153885), (4, 0.05012722)]\n",
"Question: How can you create an experiment using MLflow?\n",
"Topic: [(0, 0.04019285), (1, 0.04000254), (2, 0.8396381), (3, 0.040091105), (4, 0.04007539)]\n",
"Question: What is the architecture depicted in this example scenario?\n",
"Topic: [(0, 0.04000523), (1, 0.040007014), (2, 0.040012203), (3, 0.04000902), (4, 0.83996654)]\n",
"\n",
"Topics found are:\n",
"Topic: 0 \n",
"Words: 0.078*\"model\" + 0.059*\"mlflow\" + 0.059*\"version\" + 0.041*\"models\" + 0.041*\"fetch\" + 0.041*\"specific\" + 0.041*\"used\" + 0.022*\"command\" + 0.022*\"deploy\" + 0.022*\"sagemaker\"\n",
"\n",
"Topic: 1 \n",
"Words: 0.030*\"local\" + 0.030*\"container\" + 0.030*\"variables\" + 0.030*\"docker\" + 0.030*\"mounted\" + 0.030*\"environment\" + 0.030*\"volume\" + 0.030*\"additional\" + 0.030*\"special\" + 0.030*\"names\"\n",
"\n",
"Topic: 2 \n",
"Words: 0.096*\"experiment\" + 0.096*\"create\" + 0.096*\"mlflow\" + 0.051*\"using\" + 0.009*\"purpose\" + 0.009*\"model\" + 0.009*\"method\" + 0.009*\"file\" + 0.009*\"version\" + 0.009*\"used\"\n",
"\n",
"Topic: 3 \n",
"Words: 0.071*\"purpose\" + 0.039*\"file\" + 0.039*\"mlflow\" + 0.039*\"yaml\" + 0.039*\"directory\" + 0.039*\"relative\" + 0.039*\"within\" + 0.039*\"path\" + 0.039*\"project\" + 0.039*\"format\"\n",
"\n",
"Topic: 4 \n",
"Words: 0.032*\"purpose\" + 0.032*\"used\" + 0.032*\"model\" + 0.032*\"prediction\" + 0.032*\"get\" + 0.032*\"accuracy\" + 0.032*\"active\" + 0.032*\"layers\" + 0.032*\"higher\" + 0.032*\"experiments\"\n",
"\n"
]
}
],
"source": [
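"# Run the topic analysis on the questions the retriever missed at rank 1\n",
"lda_model, corpus, dictionary = topical_analysis(miss_questions)\n",
"# Prepare the data for the interactive pyLDAvis view\n",
"vis_data = gensimvis.prepare(lda_model, corpus, dictionary)"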
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {},
"inputWidgets": {},
"nuid": "724db985-5382-43a6-ada5-0ac1c2d49c18",
"showTitle": false,
"title": ""
}
},
"outputs": [],
"source": [
"# Uncomment the following line to render the interactive widget\n",
"# pyLDAvis.display(vis_data)"
]
}
],
"metadata": {
"application/vnd.databricks.v1+notebook": {
"dashboards": [],
"language": "python",
"notebookMetadata": {
"pythonIndentUnit": 4
},
"notebookName": "retriever-evaluation-tutorial",
"widgets": {}
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.19"
}
},
"nbformat": 4,
"nbformat_minor": 1
}