
Gemini + Moorcheh

This integration uses Google Gemini to generate embeddings and Moorcheh vector namespaces to store and search them with ITS ranking. Gemini embedding models can map text, images, video, audio, and PDFs (including interleaved combinations) into a unified vector space. This page focuses on the gemini-embedding-2-preview model with text; the same pattern extends to files via the Gemini Embedding API. Use this approach when you want full control over the embedding model and want to upload pre-computed vectors directly to Moorcheh.

Architecture

  • Embedding generation: generate vectors with Gemini gemini-embedding-2-preview, using task types such as RETRIEVAL_DOCUMENT and RETRIEVAL_QUERY.
  • Vector storage: store the vectors in Moorcheh vector namespaces.
  • Semantic retrieval: search by vector query for high-relevance results.
  • Model flexibility: tune output dimensionality to balance quality against storage.

Prerequisites

Install dependencies:
pip install google-genai moorcheh-sdk python-dotenv
The PyPI package name is google-genai (with a hyphen). That provides the Python module google.genai (with a dot). If you see ModuleNotFoundError: No module named 'google.genai', run the pip install line above in the same environment you use to run the script.
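A quick way to confirm the modules resolve in the active environment is a small import probe. This is a generic sketch; module_available is a hypothetical helper, not part of either SDK:

```python
import importlib.util

def module_available(name: str) -> bool:
    # True if the named module can be found by the current interpreter.
    # find_spec returns None for a missing top-level package instead of raising.
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "google") is itself missing.
        return False

if __name__ == "__main__":
    for mod in ("google.genai", "moorcheh_sdk", "dotenv"):
        print(f"{mod}: {'ok' if module_available(mod) else 'MISSING'}")
```

Run this with the same interpreter you use for the script; a MISSING line points at the environment that needs the pip install above.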

.env file

MOORCHEH_API_KEY=your_moorcheh_key
GEMINI_API_KEY=your_gemini_key

Task types

The Gemini Embedding API accepts a task_type that optimizes vectors for the intended use. Common choices for retrieval:
  • RETRIEVAL_DOCUMENT: chunks or documents you index for search.
  • RETRIEVAL_QUERY: user queries at search time.
Other supported types include SEMANTIC_SIMILARITY, CLASSIFICATION, CLUSTERING, CODE_RETRIEVAL_QUERY, QUESTION_ANSWERING, and FACT_VERIFICATION. Use the same model and dimension settings for both indexing and querying.
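One way to keep index-time and query-time settings consistent is to centralize them in a small mapping. This is an illustrative sketch: the task_type strings are the Gemini Embedding API values listed above, while the role names and the task_type_for helper are made up here:

```python
# Role-to-task_type mapping; the values are Gemini Embedding API task types,
# the role keys are hypothetical names used only in this sketch.
TASK_TYPES = {
    "index": "RETRIEVAL_DOCUMENT",
    "query": "RETRIEVAL_QUERY",
    "similarity": "SEMANTIC_SIMILARITY",
    "cluster": "CLUSTERING",
}

def task_type_for(role: str) -> str:
    # Fail loudly on an unknown role rather than silently defaulting.
    if role not in TASK_TYPES:
        raise ValueError(f"unknown role {role!r}; expected one of {sorted(TASK_TYPES)}")
    return TASK_TYPES[role]
```

Routing every embed_content call through one mapping like this makes it harder to index with one task type and query with another by accident.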

Vector dimensions

By default, gemini-embedding-2-preview returns 3072 dimensions. You can set output_dimensionality (for example 768 or 1536) to reduce storage. The Moorcheh namespace vector_dimension must match the size you produce at index and query time.
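If you reduce output_dimensionality, it is commonly recommended to L2-normalize the truncated vectors before indexing, since only the full-size output is guaranteed to be unit length (verify this against the current Gemini docs for your model). A minimal sketch:

```python
import math
from typing import List

def l2_normalize(vec: List[float]) -> List[float]:
    # Scale the vector to unit length; leave an all-zero vector unchanged.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec
```

Apply the same normalization to both document and query vectors so stored and query-time geometry match.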

End-to-end example

The following example loads keys from .env, embeds document chunks with RETRIEVAL_DOCUMENT, uploads them to Moorcheh, embeds a query with RETRIEVAL_QUERY, and runs vector search.
import os
import textwrap
from typing import List

from dotenv import load_dotenv
from google.genai import Client, types
from moorcheh_sdk import MoorchehClient

load_dotenv()

MOORCHEH_API_KEY = os.getenv("MOORCHEH_API_KEY", "").strip()
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY", "").strip()
if not MOORCHEH_API_KEY or not GEMINI_API_KEY:
    raise SystemExit("Set MOORCHEH_API_KEY and GEMINI_API_KEY.")
os.environ.setdefault("GEMINI_API_KEY", GEMINI_API_KEY)

NAMESPACE = "gemini-embed-demo"
VECTOR_DIMENSION = 3072  # Match default for gemini-embedding-2-preview; or set output_dimensionality and use that size
CHUNK_SIZE = 900
CHUNK_OVERLAP = 180


def to_float_vector(values) -> List[float]:
    return [float(x) for x in values]


def chunk_text(text: str, chunk_size: int = CHUNK_SIZE, overlap: int = CHUNK_OVERLAP) -> List[str]:
    chunks: List[str] = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end].strip())
        if end == len(text):
            break
        start = max(end - overlap, 0)
    return [c for c in chunks if c]


def extract_text(result: dict) -> str:
    if result.get("text"):
        return str(result["text"])
    metadata = result.get("metadata") or {}
    if isinstance(metadata, dict):
        return str(metadata.get("text") or metadata.get("raw_text") or metadata.get("content") or "")
    return ""


def clean_text(text: str) -> str:
    return " ".join(str(text).split())


def print_result(idx: int, result: dict) -> None:
    metadata = result.get("metadata") or {}
    text_value = clean_text(extract_text(result))
    wrapped = textwrap.fill(text_value, width=100)
    print(f"[{idx}] id={result.get('id')}")
    print(f"score={result.get('score')} label={result.get('label')}")
    print(f"section={metadata.get('section')} source_doc_id={metadata.get('source_doc_id')}")
    print("text:")
    print(wrapped if wrapped else "(no text returned)")
    print("-" * 120)


# 1) Clients (pass the key explicitly rather than relying on environment lookup)
gemini_client = Client(api_key=GEMINI_API_KEY)
mc = MoorchehClient(api_key=MOORCHEH_API_KEY)

# 2) Create vector namespace once (ignore if it already exists)
try:
    mc.namespaces.create(
        namespace_name=NAMESPACE,
        type="vector",
        vector_dimension=VECTOR_DIMENSION,
    )
except Exception:
    pass

# 3) Sample documents and chunking
source_documents = [
    {
        "id": "guide-vector-namespaces",
        "section": "vector-namespace-best-practices",
        "text": (
            "Moorcheh vector namespaces support bring-your-own-embedding workflows. "
            "When using Gemini gemini-embedding-2-preview, the namespace dimension must match the embedding output size. "
            "Each vector item should include a stable id and the original chunk text so results can be shown without a second fetch."
        ),
    },
    {
        "id": "guide-search-tuning",
        "section": "semantic-search-tuning",
        "text": (
            "For better relevance, use RETRIEVAL_DOCUMENT for stored chunks and RETRIEVAL_QUERY for queries. "
            "Keep chunk sizes coherent and use overlap to preserve context across chunk boundaries."
        ),
    },
]

documents = []
for doc in source_documents:
    parts = chunk_text(doc["text"])
    for idx, chunk in enumerate(parts):
        documents.append(
            {
                "id": f"{doc['id']}-chunk-{idx}",
                "text": chunk,
                "source_doc_id": doc["id"],
                "section": doc["section"],
                "chunk_index": idx,
                "total_chunks": len(parts),
            }
        )

texts = [d["text"] for d in documents]

# 4) Embed documents (index-time)
doc_result = gemini_client.models.embed_content(
    model="gemini-embedding-2-preview",
    contents=texts,
    config=types.EmbedContentConfig(task_type="RETRIEVAL_DOCUMENT"),
)

# 5) Upload to Moorcheh
mc.vectors.upload(
    namespace_name=NAMESPACE,
    vectors=[
        {
            "id": documents[i]["id"],
            "vector": to_float_vector(doc_result.embeddings[i].values),
            "text": documents[i]["text"],
            "source": "gemini-embedding-2-preview",
            "model": "gemini-embedding-2-preview",
            "task_type": "RETRIEVAL_DOCUMENT",
            "section": documents[i]["section"],
            "source_doc_id": documents[i]["source_doc_id"],
            "chunk_index": documents[i]["chunk_index"],
            "total_chunks": documents[i]["total_chunks"],
        }
        for i in range(len(documents))
    ],
)

# 6) Query embedding + search
query = (
    "How should I set RETRIEVAL_DOCUMENT vs RETRIEVAL_QUERY for Gemini embeddings in Moorcheh?"
)
query_result = gemini_client.models.embed_content(
    model="gemini-embedding-2-preview",
    contents=query,
    config=types.EmbedContentConfig(task_type="RETRIEVAL_QUERY"),
)

query_vec = to_float_vector(query_result.embeddings[0].values)

results = mc.similarity_search.query(
    namespaces=[NAMESPACE],
    query=query_vec,
    top_k=5,
    kiosk_mode=True,
    threshold=0.15,
)

print(f"namespace={NAMESPACE} total_results={len(results.get('results', []))}")
print("=" * 120)
for idx, r in enumerate(results.get("results", []), start=1):
    print_result(idx, r)

Embedding PDFs and other files

You can pass binary parts to embed_content (for example a PDF) using types.Part.from_bytes:
with open("filename.pdf", "rb") as f:
    pdf_bytes = f.read()

pdf_part = types.Part.from_bytes(data=pdf_bytes, mime_type="application/pdf")

gemini_client.models.embed_content(
    model="gemini-embedding-2-preview",
    contents=[pdf_part],
    config=types.EmbedContentConfig(task_type="RETRIEVAL_DOCUMENT"),
)
Chunk or split large documents as needed before upload; store each resulting vector in Moorcheh with metadata that points back to the source file.
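One way to structure that metadata is a per-chunk item carrying file provenance. This is a sketch: file_chunk_item and the field names beyond id, vector, and text are hypothetical, shaped like the upload payload in the example above:

```python
from typing import Dict, List

def file_chunk_item(file_path: str, chunk_index: int,
                    vector: List[float], text: str = "") -> Dict:
    # Package one chunk's vector with metadata pointing back to the source file.
    return {
        "id": f"{file_path}#chunk-{chunk_index}",  # stable, file-derived id
        "vector": vector,
        "text": text,
        "source_file": file_path,
        "chunk_index": chunk_index,
    }
```

Items built this way can be passed straight to mc.vectors.upload, and search results will carry enough metadata to locate the originating file and chunk.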

Runnable demo script

See integrations/gemini/gemini_moorcheh_demo.py.

Important notes

The namespace vector_dimension must exactly match the length of vectors you upload. If you use output_dimensionality on the Gemini side, create the namespace with that same size.
Use RETRIEVAL_DOCUMENT (or equivalent) for indexed content and RETRIEVAL_QUERY for search queries.
Include text in each uploaded vector object so search results can return the original chunk without an extra lookup.
Use the same model, task types, and dimension settings for indexing and querying.

Troubleshooting

  • No vector namespace found: Create the namespace first with type="vector".
  • Dimension mismatch: Recreate the namespace with the correct vector_dimension or align Gemini output_dimensionality with the namespace.
  • Auth errors: Confirm GEMINI_API_KEY is set and valid for the Gemini API.
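For the dimension-mismatch case, a pre-upload guard can catch the problem before the API does. A generic sketch; VECTOR_DIMENSION mirrors the constant in the example above:

```python
from typing import List

VECTOR_DIMENSION = 3072  # must equal the namespace's vector_dimension

def assert_dimension(vec: List[float], expected: int = VECTOR_DIMENSION) -> None:
    # Raise locally before uploading rather than letting the API reject the batch.
    if len(vec) != expected:
        raise ValueError(f"vector has {len(vec)} dims; namespace expects {expected}")
```

Calling this on each vector right after embed_content turns a remote batch failure into an immediate, local error message.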