Repositories QA

In this tutorial, we will show you how to create a simple program using Stable Code Instruct and LangChain to ask questions about a repository.

Note: This tutorial has been adapted from LangChain’s official tutorial.

First, let’s go ahead and install our dependencies.

%pip install -qU tree_sitter tree_sitter_languages sentence_transformers langchain langchain-community faiss-cpu GitPython

Next, let’s clone a repository to work with. We’ll use the langchain repository as an example and GitPython to clone it.

from git import Repo

repo_path = "/content/langchain"
repo = Repo.clone_from("https://github.com/langchain-ai/langchain", to_path=repo_path)

Now that we have the repository, we need to get it into a form that we can easily ask questions about. To do this, we will create a vector database that matches questions to the code in the repository.

We will use tree-sitter to parse the code in our repository into chunks that we can retrieve when we ask questions.

from langchain_community.document_loaders.generic import GenericLoader
from langchain_community.document_loaders.parsers import LanguageParser
from langchain_text_splitters import Language, RecursiveCharacterTextSplitter

loader = GenericLoader.from_filesystem(
    repo_path,
    glob="**/*",
    suffixes=[".py"],
    exclude=["**/non-utf8-encoding.py"],
    parser=LanguageParser(),
)
docs = loader.load()

python_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON, chunk_size=2000, chunk_overlap=200  # Tune these parameters to your liking
)
n = 1_000  # Keep only the first 1,000 chunks to speed up indexing
texts = python_splitter.split_documents(docs)[:n]
len(texts)
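
Before building the index, it can help to peek at what the splitter produced. A minimal sanity check, assuming the docs and texts variables from the cells above:

# Inspect how many documents were loaded and preview the first chunk
print(len(docs), "documents loaded")
print(texts[0].metadata["source"])
print(texts[0].page_content[:300])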

With the chunks of code in hand, we can now vectorize them and store them in a database. For vectorization, we will use the awesome MiniLM model from the sentence-transformers library, and for the database, we will use FAISS.

from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings

# Embed the chunks with MiniLM and index them in FAISS
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
db = FAISS.from_documents(texts, embeddings)
retriever = db.as_retriever(
    search_type="mmr",  # Also test "similarity"
    search_kwargs={"k": 8},
)
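
Before wiring in the LLM, you can check that retrieval works on its own. A quick sketch, where the query string is just an example:

# Retrieve the chunks most relevant to a sample question
docs_found = retriever.invoke("How is a retriever created from a vector store?")
for d in docs_found:
    print(d.metadata["source"])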

We will use Ollama to serve our model and answer questions about the repository, so we need to install it first. If you are on Linux or using Colab, you can install it by running the following:

!curl -fsSL https://ollama.com/install.sh | sh
!command -v systemctl >/dev/null && sudo systemctl stop ollama

Next, we need to set up the Ollama server to run in the background:

import subprocess

# Start a subprocess in the background
process = subprocess.Popen(["ollama", "serve"], stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
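
The server can take a few seconds to become available. An optional wait loop, assuming Ollama’s default port 11434:

import time
import urllib.request

# Poll the local Ollama endpoint until it responds, for up to ~30 seconds
for _ in range(30):
    try:
        urllib.request.urlopen("http://127.0.0.1:11434")
        print("Ollama server is up")
        break
    except Exception:
        time.sleep(1)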

Finally, we can chat with our repository.

Note: The first time running this will take a while since you need to download the necessary model weights to run the model.
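
If the weights have not been downloaded yet, you can pull them explicitly before running the chain, using the stable-code:instruct tag referenced below:

!ollama pull stable-code:instruct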

from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationSummaryMemory
from langchain_community.chat_models import ChatOllama

llm = ChatOllama(model="stable-code:instruct")
memory = ConversationSummaryMemory(
    llm=llm, memory_key="chat_history", return_messages=True
)
qa = ConversationalRetrievalChain.from_llm(llm, retriever=retriever, memory=memory)

question = "How can I initialize a ReAct agent?"
result = qa.invoke({"question": question})
print(result["answer"])