Posts

Showing posts from March, 2024

KB: RAG - OpenAI (Retrieval-Augmented Generation)

What is RAG (Retrieval-Augmented Generation)?

Retrieval-Augmented Generation (RAG) is the process of optimizing the output of a large language model so that it references an authoritative knowledge base outside of its training data sources before generating a response. Large Language Models (LLMs) are trained on vast volumes of data and use billions of parameters to generate original output for tasks like answering questions, translating languages, and completing sentences. RAG extends the already powerful capabilities of LLMs to specific domains or an organization's internal knowledge base, all without the need to retrain the model. It is a cost-effective approach to improving LLM output so it remains relevant, accurate, and useful in various contexts.

Related Blogs: KB-OpenAI

References:
What Is RAG?
Best Practices in Retrieval Augmented Generation
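The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration: the in-memory knowledge base, the word-overlap relevance score, and the prompt template are all made up for demonstration, and the final LLM call is left out.

```python
# Minimal RAG sketch: retrieve relevant context from a small in-memory
# knowledge base, then prepend it to the user's question before the
# prompt would be sent to an LLM. Toy scoring, not a production retriever.

def retrieve(query: str, docs: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query (toy relevance score)."""
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Augment the query with retrieved context before generation."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

knowledge_base = [
    "Redis Cluster has 16384 hash slots.",
    "OPENQUERY runs a pass-through query on the remote server.",
]

prompt = build_prompt("How many hash slots does Redis Cluster have?", knowledge_base)
print(prompt)
```

In a real system the retriever would use embedding similarity against a vector store rather than word overlap, but the shape of the flow (retrieve, augment, generate) is the same.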

KB: OpenAI

Embeddings

What are embeddings: OpenAI’s text embeddings measure the relatedness of text strings. Embeddings are commonly used for:
Search (where results are ranked by relevance to a query string)
Clustering (where text strings are grouped by similarity)
Recommendations (where items with related text strings are recommended)
Anomaly detection (where outliers with little relatedness are identified)
Diversity measurement (where similarity distributions are analyzed)
Classification (where text strings are classified by their most similar label)

An embedding is a vector (list) of floating point numbers. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and large distances suggest low relatedness.

What is the difference between an embedding and a vector database? Embeddings encode all types of data into vectors that capture the meaning and context of an asset. This allows us to find similar assets by searching for neighborin...
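A small sketch of the "distance measures relatedness" idea using cosine similarity. The 3-dimensional vectors here are invented for illustration; real OpenAI embeddings have far more dimensions (e.g. 1536 for text-embedding-3-small).

```python
# Measuring relatedness of two embedding vectors with cosine similarity.
# Toy 3-dimensional vectors; real embeddings are much higher-dimensional.
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

cat = [0.9, 0.1, 0.3]
kitten = [0.85, 0.15, 0.35]
car = [0.1, 0.9, 0.2]

# Higher similarity (smaller distance) suggests more related text.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, car))  # True
```

Cosine distance is simply 1 minus cosine similarity, so "small distance" and "high similarity" describe the same relationship.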

KB: Redis

Redis is single-threaded (per shard)

Single-threaded nature of Redis: Redis uses a mostly single-threaded design. This means that a single process serves all the client requests, using a technique called multiplexing. Redis can therefore serve a single request at any given moment, so all requests are served sequentially. This approach helps Redis maintain consistency and avoid race conditions, even in concurrent and multi-client environments.

How data is sharded in Redis: Redis Cluster does not use consistent hashing, but a different form of sharding where every key is conceptually part of what we call a hash slot. There are 16384 hash slots in Redis Cluster, and to compute the hash slot for a given key, we simply take the CRC16 of the key modulo 16384.

Sharding:
Question: What’s the solution to fix the split-brain situation?
Answer: Maintain an odd number of primary shards and two replicas per primary shard. Here is the detailed solution to this problem. To prevent s...
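The CRC16-modulo-16384 slot computation above can be sketched directly. This uses the XMODEM CRC16 variant (polynomial 0x1021) that the Redis Cluster specification references; hash tags ({...}) are omitted for brevity, though real clients hash only the tag when one is present.

```python
# Sketch of Redis Cluster's key-to-slot mapping: CRC16 (XMODEM variant,
# polynomial 0x1021) of the key bytes, modulo 16384.

def crc16(data: bytes) -> int:
    """CRC16-XMODEM, as referenced by the Redis Cluster specification."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    """Map a key to one of the 16384 Redis Cluster hash slots."""
    return crc16(key.encode()) % 16384

# The Redis Cluster spec's reference CRC16 yields 0x31C3 for "123456789".
print(hex(crc16(b"123456789")))  # 0x31c3
print(hash_slot("user:1000"))
```

Each primary shard owns a contiguous range of these slots, which is why resharding in Redis Cluster means moving hash slots (and their keys) between nodes.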

KB: SQL openquery vs linked server

Good read: Openquery vs Linked Server

With a linked server query, SQL Server makes the decisions for you on how it combines all the data and returns the result set. By default, when you run a distributed query using a linked server, the query is processed locally. This may or may not be efficient, depending on how much data must be sent from the remote server to the local server for processing. Sometimes it is more efficient to pass through the query so that it runs on the remote server; that way, if the query must process many rows, it can process them on the remote server and return only the results to the local server. The OPENQUERY function is used to specify that a distributed query be processed on the remote server instead of the local server. The alternative to using linked servers is the OPENQUERY statement, also known as a pass-through query. When using an OPENQUERY statement, the WHERE clause gets executed at the remote server and the ...
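A side-by-side sketch of the two styles. The linked server name REMOTESRV and the table Sales.dbo.Orders are hypothetical placeholders; substitute your own.

```sql
-- Linked server (four-part name): SQL Server may pull remote rows to the
-- local server and apply the WHERE clause locally.
SELECT OrderID, Total
FROM REMOTESRV.Sales.dbo.Orders
WHERE OrderDate >= '2024-03-01';

-- OPENQUERY (pass-through): the whole query, including the WHERE clause,
-- executes on the remote server; only matching rows travel back.
SELECT *
FROM OPENQUERY(REMOTESRV,
    'SELECT OrderID, Total FROM Sales.dbo.Orders
     WHERE OrderDate >= ''2024-03-01''');
```

Note that OPENQUERY takes the remote query as a string literal, so embedded quotes must be doubled and parameters cannot be passed directly.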