Advanced Retrieval-Augmented Generation: From Theory to LlamaIndex Implementation | By Leonie Monigatti | February 2024

By thedailyposting.com | February 19, 2024

Continue reading here for more ideas on how to improve the performance of your RAG pipeline and make it production-ready.

This section describes the packages and API keys you need to follow along with this article.

Required packages

This article shows you how to use LlamaIndex in Python to implement simple and advanced RAG pipelines.

pip install llama-index

This article uses LlamaIndex v0.10. If you are upgrading from an older LlamaIndex version, you need to run the following commands to reinstall LlamaIndex so it works properly.

pip uninstall llama-index
pip install llama-index --upgrade --no-cache-dir --force-reinstall
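
To confirm which version ends up installed after the reinstall, you can check the package metadata; a quick sketch using only the Python standard library:

from importlib.metadata import version

# Confirm that LlamaIndex v0.10 (or later) is installed
print(version("llama-index"))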

LlamaIndex provides an option to save vector embeddings locally to JSON files for persistent storage, which is great for quickly prototyping ideas. However, because advanced RAG techniques are aimed at production-ready applications, this article uses a vector database for persistent storage.
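
As a point of reference, here is a minimal sketch of that local-persistence option, assuming an index built from the data directory used later in this article and an arbitrary ./storage folder:

from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

# Build an index from local documents and persist it as JSON files under ./storage
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
index.storage_context.persist(persist_dir="./storage")

# Later, reload the index from disk instead of re-embedding everything
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)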

Because we need metadata storage and hybrid search capabilities in addition to storing the vector embeddings, we use the open source vector database Weaviate (v3.26.2), which supports these features.

pip install weaviate-client llama-index-vector-stores-weaviate

API key

We use Weaviate Embedded, which is free to use without registering for an API key. However, this tutorial uses an embedding model and LLM from OpenAI, so you will need an OpenAI API key. To get one, you need an OpenAI account and then create a new secret key under API keys.

Then create a local .env file in your root directory and define your API key in it.

OPENAI_API_KEY="<YOUR_OPENAI_API_KEY>"

You can then load your API key using the following code:

# !pip install python-dotenv
import os
from dotenv import load_dotenv,find_dotenv

load_dotenv(find_dotenv())
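
If the key was loaded correctly, it is now available as an environment variable; a quick, purely illustrative sanity check:

import os

# Confirm the key was picked up from the .env file
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY was not loaded from .env"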

This section describes how to implement a simple RAG pipeline using LlamaIndex. You can find the entire simple RAG pipeline in this Jupyter Notebook. For an implementation using LangChain, you can continue with this article (Simple RAG Pipeline with LangChain).

Step 1: Define the embedding model and LLM

First, define the embedding model and LLM in a global settings object. By doing this, you don’t have to explicitly specify the models again in your code.

  • Embedding model: Used to generate vector embeddings of document chunks and queries.
  • LLM: Used to generate answers based on user queries and relevant context.
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core.settings import Settings

Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
Settings.embed_model = OpenAIEmbedding()

Step 2: Load the data

Next, create a local directory named data in your root directory and download the sample data from the LlamaIndex GitHub repository (MIT license).

!mkdir -p 'data'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham_essay.txt'

The data can then be loaded for further processing.

from llama_index.core import SimpleDirectoryReader

# Load data
documents = SimpleDirectoryReader(
    input_files=["./data/paul_graham_essay.txt"]
).load_data()

Step 3: Split the document into nodes

Because the entire document is too large to fit into the LLM’s context window, we need to split it into smaller text chunks, which are called Nodes in LlamaIndex. To parse the loaded documents into nodes, we use the SimpleNodeParser with a chunk size of 1024.

from llama_index.core.node_parser import SimpleNodeParser

node_parser = SimpleNodeParser.from_defaults(chunk_size=1024)

# Extract nodes from documents
nodes = node_parser.get_nodes_from_documents(documents)

Step 4: Build the index

Next, we build an index to store all external knowledge in Weaviate, an open source vector database.

First, you need to connect to your Weaviate instance. In this case, we are using Weaviate Embedded, which allows you to experiment in notebooks for free without an API key. For a production-ready solution, we recommend deploying Weaviate yourself, for example via Docker, or using a managed service.

import weaviate

# Connect to your Weaviate instance
client = weaviate.Client(
    embedded_options=weaviate.embedded.EmbeddedOptions(),
)
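
If you deploy Weaviate yourself instead, for example via Docker as mentioned above, the connection with the v3 client would look roughly like this (the URL is a placeholder for your own instance):

import weaviate

# Connect to a self-hosted Weaviate instance, e.g. running locally via Docker
client = weaviate.Client("http://localhost:8080")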

Next, build a VectorStoreIndex from the Weaviate client to store your data in and interact with it.

from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.vector_stores.weaviate import WeaviateVectorStore

index_name = "MyExternalContext"

# Construct vector store
vector_store = WeaviateVectorStore(
    weaviate_client=client,
    index_name=index_name
)

# Set up the storage for the embeddings
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Setup the index
# build VectorStoreIndex that takes care of chunking documents
# and encoding chunks to embeddings for future retrieval
index = VectorStoreIndex(
    nodes,
    storage_context=storage_context,
)

Step 5: Set up the query engine

Finally, set up the index as the query engine.

# The QueryEngine class is equipped with the generator
# and facilitates the retrieval and generation steps
query_engine = index.as_query_engine()

Step 6: Run a simple RAG query on your data

Now you can run simple RAG queries on your data, as shown below.

# Run your naive RAG query
response = query_engine.query(
    "What happened at Interleaf?"
)
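
To see the generated answer and, optionally, the retrieved context it was based on, you can inspect the response object; a small sketch assuming the response returned above:

# Print the generated answer
print(str(response))

# Inspect the retrieved chunks that were passed to the LLM as context
for node_with_score in response.source_nodes:
    print(node_with_score.score, node_with_score.node.get_content()[:100])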

This section describes some simple adjustments you can make to turn the simple RAG pipeline above into an advanced RAG pipeline. This tutorial covers the following advanced RAG techniques:

  • Sentence window retrieval
  • Hybrid search
  • Reranking

We’ll only highlight the changes here, but you can find a complete end-to-end advanced RAG pipeline in this Jupyter Notebook.

The sentence window retrieval technique requires two adjustments. First, you need to adjust how your data is stored and post-processed: instead of the SimpleNodeParser, use the SentenceWindowNodeParser.

from llama_index.core.node_parser import SentenceWindowNodeParser

# create the sentence window node parser w/ default settings
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)

The SentenceWindowNodeParser does two things:

  1. Split the document into single sentences and embed them.
  2. A context window is created for each sentence. If you specify window_size=3, the resulting window is three sentences long, starting at the sentence before the embedded sentence and spanning the sentence after it. The window is saved as metadata (see the sketch after this list).
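
To make this concrete, here is a small sketch, assuming the documents loaded in Step 2, that shows how the original sentence and its window end up in a node’s metadata (the node index 10 is arbitrary):

# Parse the documents into per-sentence nodes with window metadata
nodes = node_parser.get_nodes_from_documents(documents)

# Each node stores the embedded sentence plus its surrounding window
print(nodes[10].metadata["original_text"])  # the single embedded sentence
print(nodes[10].metadata["window"])         # the surrounding three-sentence window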

During retrieval, the sentences that best match the query are returned. After retrieval, you need to replace each sentence with its entire window from the metadata by defining a MetadataReplacementPostProcessor and using it in the list of node_postprocessors.

from llama_index.core.postprocessor import MetadataReplacementPostProcessor

# The target key defaults to `window` to match the node_parser's default
postproc = MetadataReplacementPostProcessor(
    target_metadata_key="window"
)

...

query_engine = index.as_query_engine(
    node_postprocessors=[postproc],
)

Implementing hybrid search in LlamaIndex only requires a couple of parameter changes to the query_engine, provided the underlying vector database supports hybrid search queries. The alpha parameter specifies the weighting between vector search and keyword-based search, where alpha=0 means pure keyword-based search and alpha=1 means pure vector search.

query_engine = index.as_query_engine(
    ...,
    vector_store_query_mode="hybrid",
    alpha=0.5,
    ...
)

Adding a reranker to your advanced RAG pipeline takes just three simple steps:

  1. First, define a reranker model. Here, we use BAAI/bge-reranker-base from Hugging Face.
  2. In the query engine, add the reranker model to the list of node_postprocessors.
  3. Increase the similarity_top_k of the query engine to retrieve more context passages, which can be reduced to top_n after reranking.
# !pip install torch sentence-transformers
from llama_index.core.postprocessor import SentenceTransformerRerank

# Define reranker model
rerank = SentenceTransformerRerank(
    top_n=2,
    model="BAAI/bge-reranker-base"
)

...

# Add reranker to query engine
query_engine = index.as_query_engine(
    similarity_top_k=6,
    ...,
    node_postprocessors=[rerank],
    ...,
)
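
Putting the pieces together, here is a minimal sketch of the full advanced query engine, combining the sentence window post-processor, hybrid search, and the reranker defined above (parameter values are the ones used in this article):

# Advanced RAG query engine: sentence window replacement + hybrid search + reranking
query_engine = index.as_query_engine(
    similarity_top_k=6,
    vector_store_query_mode="hybrid",
    alpha=0.5,
    node_postprocessors=[postproc, rerank],
)

# Run an advanced RAG query on your data
response = query_engine.query("What happened at Interleaf?")
print(response)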
