
(IJACSA) International Journal of Advanced Computer Science and Applications,

Vol. 15, No. 6, 2024

Integrating Advanced Language Models and Vector Database for Enhanced AI Query Retrieval in Web Development
Xiaoli Huan¹, Hong Zhou²
¹Department of Computer Science, Troy University, Troy, Alabama, USA
²Department of Mathematics and Computer Science, University of Saint Joseph, West Hartford, Connecticut, USA

Abstract—In the dynamic field of web development, the integration of sophisticated AI technologies for query processing has become increasingly crucial. This paper presents a framework that significantly improves the relevance of web query responses by leveraging cutting-edge technologies such as Hugging Face, FAISS, Google PaLM, Gemini, and LangChain. We explore and compare the performance of PaLM and Gemini, two powerful LLMs, to identify strengths and weaknesses in the context of web development query retrieval. Our approach capitalizes on the synergistic combination of these freely accessible tools, ultimately leading to a more efficient and user-friendly query processing system.

Keywords—LLM (Large Language Model); vector databases; retrieval-augmented generation

I. INTRODUCTION

In the rapidly evolving landscape of web development, the quest for efficient and accurate query retrieval systems has become a cornerstone of enhancing user experience and information accessibility. While effective to a certain extent, traditional query processing methods often fall short in coping with the complexity and dynamism of user-generated queries in real-time web environments.

Generative AI models, including the GPT series [1], Google PaLM, and Gemini, have demonstrated remarkable capabilities in generating human-like text and answering queries in a contextually relevant manner. These models leverage large-scale transformer architectures to understand and generate complex language, making them highly suitable for sophisticated query processing tasks.

Retrieval-Augmented Generation (RAG) [2] is a cutting-edge approach that combines retrieval-based and generative models to enhance the accuracy and relevance of responses. RAG models retrieve relevant documents or pieces of information from a database and use these as context to generate more precise and contextually aware answers. This technique has been particularly effective in scenarios where the generative model alone might lack the necessary contextual knowledge to provide accurate responses [3].

Our approach integrates freely accessible tools such as Hugging Face [4], FAISS [5], Google PaLM [6], Gemini [7], and LangChain [8]. Each tool brings its strengths to the table, contributing to a more robust query processing framework.

We explore and compare the performance of both PaLM and Gemini, two powerful Large Language Models (LLMs), to identify which is more effective in the context of web development query retrieval. This comparative analysis provides valuable insights into the strengths and weaknesses of each model for this specific task. By combining these cost-free technologies, we create a query processing system that is not only more efficient but also delivers significantly more relevant responses to user queries. This cost-effectiveness allows for the development of sophisticated AI-driven solutions without the burden of API usage fees or proprietary restrictions.

This research contributes novel insights to web development by:

• Highlighting the potential of combining sophisticated open-source AI models and advanced methodologies like RAG for improved user query handling.

• Providing a comparative analysis of PaLM and Gemini, offering valuable insights into their effectiveness for web development query retrieval.

• Emphasizing accessibility and cost-effectiveness through the utilization of freely available tools.

The following sections delve into the technical architecture, implementation details, and performance evaluation of the system (Sections II through VI), providing a comprehensive understanding of its capabilities; Section VII concludes with its potential impact on the future of web development.

II. EVOLUTION OF LANGUAGE MODELS

The evolution of language models is central to query processing in natural language processing (NLP) and artificial intelligence (AI), and has seen significant advances over the past few decades. This section explores the trajectory of these developments, focusing on how they have revolutionized query processing and understanding.

The journey began with early language models such as n-gram models and statistical language models. These models, like those used in early versions of machine translation and speech recognition systems, relied heavily on statistical probabilities of word sequences. However, their major limitation was the inability to capture long-range dependencies and contextual


nuances in language, leading to suboptimal performance in complex query processing [9].

The introduction of neural network-based models marked a significant shift. Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks began to address the shortcomings of traditional models by better capturing sequential information and context [10]. Despite their improvements, these models still struggled with processing longer sequences and required substantial computational resources.

The introduction of the transformer architecture [11] in 2017 marked a significant transformation in language modeling. Distinct from earlier models, the transformer employs self-attention mechanisms to analyze entire text sequences at once, allowing for more effective context capturing. This innovative framework serves as the foundational structure for models such as Google's BERT (Bidirectional Encoder Representations from Transformers) and the GPT (Generative Pre-trained Transformer) series developed by OpenAI.

BERT [12] was groundbreaking due to its bidirectional training, enabling it to comprehend the context of a word by considering all of its surrounding words. This feature made it particularly effective for tasks like question answering and language inference.

The GPT series demonstrated remarkable capabilities in generating human-like text and answering queries in a contextually relevant manner. Its large-scale transformer model, trained on vast amounts of data, could generate coherent and contextually relevant text over extended passages.

The most recent advancements, such as Google's PaLM (Pathways Language Model) and Gemini, have pushed the boundaries further. PaLM, with its even larger scale and more sophisticated training, has shown capabilities in not just understanding but also generating complex and nuanced language, making it highly effective for sophisticated query processing tasks. Gemini, on the other hand, showcases strength in its multimodality, seamlessly processing text, images, and code. This versatility could prove advantageous in web development scenarios where queries might incorporate screenshots or snippets of code alongside textual information.

The impact of these advancements on query processing has been profound. Language models have transitioned from simply predicting the next word in a sequence to understanding and generating human-like responses to complex queries. This evolution has enabled the development of more sophisticated AI-driven applications, such as virtual assistants, chatbots, and advanced search engines, capable of understanding and responding to user queries with unprecedented accuracy and relevance.

III. VECTOR DATABASES

Integrating vector databases into information retrieval significantly advances AI and web development. This section reviews the evolution and application of vector databases, particularly focusing on their role in enhancing information retrieval capabilities in AI systems.

Vector databases, fundamentally different from traditional relational databases, are designed to store and retrieve high-dimensional vector data efficiently. This capability is crucial in handling the outputs of advanced AI models, especially in the context of natural language processing and machine learning (ML). The early conceptualization and use of vector spaces in information retrieval set the stage for the development of these databases [13].

The development of technologies like FAISS (Facebook AI Similarity Search) marked a significant milestone for vector databases. FAISS is designed for efficient similarity search and clustering of dense vectors. Its ability to handle billions of vectors makes it particularly suitable for large-scale AI applications, including those in web development and query processing [14].

Vector databases have found extensive application in AI-driven systems, particularly in enhancing the efficiency and accuracy of information retrieval. For instance, their integration into recommendation systems and search engines has significantly improved the relevance of results based on user queries and preferences [15].

Despite these advancements, vector databases face challenges, particularly in scalability and real-time processing in web environments. Future research is directed toward optimizing these databases for more efficient real-time query processing and integration with evolving AI models [16].

IV. SYSTEM ARCHITECTURE

This section outlines the system architecture of our AI-driven web application, emphasizing the integration of Streamlit and various AI components to enhance user experience and query processing efficiency.

The front-end of our system is built using Streamlit [17], an innovative framework that allows for rapid development and deployment of data applications. Streamlit's simplicity and efficiency make it an ideal choice for integrating complex AI models into web applications.

Key components of the front-end include:

• User Interaction: It presents a web interface where users can type questions in a text box.

• Visual Elements: Streamlit elements like headers, subheaders, and text boxes are used to create a user-friendly interface.

• Feedback Mechanism: The script provides feedback to the user by displaying placeholders that change state based on the application's progress. Initially, it displays "Awaiting your question..." and transitions to "Processing..." when the user submits a question. Finally, it displays "Answer" if a successful response is generated or an error message if the process fails.

• Communication: The front-end interacts with the back-end by passing the user's question as input and displaying the generated response.

Fig. 1 shows the web application's initial input interface.
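Before moving to the back-end details, it helps to picture the core operation FAISS performs in this architecture: nearest-neighbour search over embedding vectors. The brute-force NumPy sketch below illustrates the idea only; the vectors and query are toy values, not real Instruct embeddings, and `cosine_top_k` is an illustrative helper rather than part of the paper's code.

```python
import numpy as np

# Toy 4-dimensional "embeddings" standing in for real Instruct embeddings
# of three stored FAQ prompts (illustrative values only).
faq_vectors = np.array([
    [0.9, 0.1, 0.0, 0.1],   # "What degrees do you offer?"
    [0.1, 0.8, 0.2, 0.0],   # "Can you give Python tutorials?"
    [0.0, 0.1, 0.9, 0.2],   # "How big are the graduate classes?"
])

def cosine_top_k(query: np.ndarray, db: np.ndarray, k: int = 1) -> list[int]:
    """Return the indices of the k database vectors most similar to the query."""
    sims = db @ query / (np.linalg.norm(db, axis=1) * np.linalg.norm(query))
    return np.argsort(-sims)[:k].tolist()

query = np.array([0.85, 0.15, 0.05, 0.1])   # close to the first FAQ entry
print(cosine_top_k(query, faq_vectors))      # → [0]
```

FAISS performs the same lookup, but with index structures that keep it fast when the database holds millions of vectors instead of three.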


Fig. 1. The web application interface.

The front-end acts as the communication layer for users, while the back-end handles the core functionalities of the application. It serves as the engine that processes user queries, interacts with the data and AI models, and generates the final response.

In Fig. 2, the init_google_palm_model() function initializes the Google PaLM model. It retrieves an API key [18], which is used to authenticate with Google's services. The model is initialized with a temperature parameter of 0.1, which influences the randomness of the model's output (lower temperature values result in more deterministic outputs). The changes needed in the init_google_gemini_model() function to leverage Gemini instead of PaLM are relatively straightforward: import the necessary libraries for Gemini and modify the line that initializes the model. Instead of GooglePaLM, use ChatGoogleGenerativeAI and specify the model="gemini-pro" argument to indicate the Gemini model variant.

Fig. 2. Functions to initialize the PaLM and Gemini models.

In Fig. 3, the init_hf_embeddings() function initializes embeddings from Hugging Face's InstructEmbeddings [19] model. These embeddings convert text into numerical vectors, which represent the semantic meaning of the text and can be used for similarity search and vector-based analysis.

Fig. 3. The init_hf_embeddings() function.

The setup_vector_database() function in Fig. 4 sets up a vector database using FAISS. It loads question-and-answer pairs from a CSV file (prompt_answer.csv) using LangChain's CSVLoader. The function then initializes the Hugging Face embeddings (via init_hf_embeddings) to convert the loaded FAQ data into vector embeddings. These embeddings are then used to create a FAISS database with the FAISS.from_documents method. Finally, the FAISS database is saved locally using the provided file path, allowing the application to quickly retrieve relevant answers through similarity searches in the future.

Fig. 4. The setup_vector_database() function.

Fig. 5 shows samples of the provided data file, prompt_answer.csv. The dataset comprises a collection of questions and corresponding responses intended for the computer science department. The spreadsheet is structured with two primary columns: prompt and response. The prompt column contains a variety of questions that might be asked by students, faculty, or other stakeholders, while the response column provides the appropriate answers. The data can be easily updated to reflect new queries or changes in the information provided, ensuring that the dataset remains current and helpful in addressing the diverse inquiries directed towards the CS department.

Fig. 5. The prompt_answer.csv data file.

After the front-end invokes the setup_qa_chain() function and passes in the user question, setup_qa_chain() (Fig. 6) first initializes embeddings using init_hf_embeddings. These embeddings are then used to load the local FAISS database from the file path generated by setup_vector_database().

Fig. 6. The setup_qa_chain() function.

In line 45 of Fig. 6, a retriever object is created from the FAISS database with a score threshold of 0.7, meaning it will only consider results that meet or exceed this similarity score. This retriever is used to fetch relevant context for incoming questions.
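The retrieval behaviour described above (similarity search over stored prompt/response pairs, filtered by a score threshold) can be mimicked without FAISS or LangChain in a short standard-library sketch. The bag-of-words "embedding" below is a toy stand-in for the Instruct embeddings, and the data, threshold helper, and `retrieve` function are illustrative assumptions, not the paper's implementation.

```python
import csv
import io
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding": a stand-in for real dense embeddings.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Stand-in for prompt_answer.csv (prompt,response columns).
csv_text = """prompt,response
What degrees do you offer?,We offer BS and MS degrees in Computer Science.
How big are the graduate classes?,About 20 students per class.
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))
index = [(embed(r["prompt"]), r) for r in rows]

def retrieve(question: str, threshold: float = 0.7) -> list[dict]:
    """Return rows whose prompt similarity meets or exceeds the threshold."""
    q = embed(question)
    return [r for vec, r in index if cosine(q, vec) >= threshold]

hits = retrieve("What degrees do you offer?")
print(hits[0]["response"])   # the matching question clears the 0.7 threshold
```

In the real system the same three steps (load CSV rows, embed prompts, filter candidates by similarity score) are handled by CSVLoader, the Instruct embeddings, and the FAISS retriever respectively.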


The code then defines a prompt_template, a structured text template for generating prompts to be used with a large language model. Finally, the function initializes an LLM (the Google PaLM or Gemini model) and generates a QA chain that takes a query as input, uses the retriever to fetch relevant context based on the query, and then generates an answer using the LLM and the structured prompt.

Throughout the program, robust error-handling mechanisms are in place to manage potential failures and processing errors.

V. TECHNICAL DETAILS

The Google PaLM or Google Gemini models generate responses to user queries. The models' behavior is customized (e.g., by setting the temperature) to tailor response generation. At a high level, temperature controls the likelihood distribution over the words or tokens the model might select at each step in generating text. A lower temperature makes the model more confident and conservative in its choices, leading to more predictable text. A higher temperature increases randomness, making the model more likely to produce varied and sometimes more creative or less likely outputs. Choosing the right temperature is a balancing act: too low, and the model might generate dull, repetitive text; too high, and its outputs might become too random and less coherent. The optimal setting often depends on the specific application and desired user experience. For a technical query retrieval system, a slightly lower temperature might be preferred to ensure the reliability and relevance of the information provided.

LangChain is the backbone that connects the various AI and machine learning components, ensuring seamless interaction. The library assembles components (the Google PaLM or Gemini language model, HuggingFaceInstructEmbeddings, and the FAISS vector database) into a cohesive QA chain. This chain orchestrates the process of receiving a query, processing it through the model, and fetching relevant answers. The CSVLoader component in LangChain is used to load data from CSV files. The CSV file contains two columns: one for the prompts (or questions) and another for the corresponding answers or information. These pairs are used to build a knowledge base for the FAISS vector database, allowing the system to retrieve relevant answers based on the embeddings generated from user queries.

The system uses the following prompt_template: "Please provide an answer to the question below, ensuring that your response is derived solely from the provided context. Focus on using the text from the 'response' section of the source document, altering it as little as possible. If the context does not contain the information necessary to answer the question, simply reply with 'I am not sure. Please call: 1-334-808-6576' to avoid creating or inferring any information not explicitly stated in the context. Exceptions: Answer all computer science questions using your own knowledge and give tutorials and explain in detail".

The model is instructed to base its responses solely on the provided context, specifically focusing on the 'response' section of the source documents. This restriction ensures factual accuracy and reduces the risk of the model generating misleading or fabricated information (hallucination). Prompt engineering, which involves carefully crafting the input text, helps achieve this objective. When the context lacks sufficient information to answer a department question, the model is instructed to provide a clear default message ("I am not sure..."). This transparency helps manage user expectations and prevents frustration in cases where a definitive answer cannot be found.

There is also a crucial exception for computer science questions. In these cases, instead of being confined solely to the provided department context, the model can draw on its own knowledge base, delivering comprehensive tutorials and detailed explanations for computer science topics. This exception caters to students seeking deeper understanding and learning resources beyond the basic department questions in the CSV file.

The PaLM model app can be accessed at this link: https://troy-cs-ai-assistant.streamlit.app/.

The Gemini model app can be accessed at this link: https://troy-ai-gemini.streamlit.app/.

VI. RESULTS ANALYSIS

Fig. 7 shows a programming tutorial interface facilitated by the AI PaLM assistant, exemplifying an interactive learning environment. It features a question-and-answer dialogue where the user inquiry, "Can you give me C++ tutorial?" is met with a comprehensive response from the AI assistant. The assistant outlines C++'s applicability in system, application, and game development, and offers resources for further learning. This exchange is presented in a clean, structured layout, promoting an engaging and educational user experience.

Fig. 7. Interactive Programming Tutorial via AI PaLM Assistant.
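The temperature behaviour described in Section V can be demonstrated numerically: dividing the token scores by the temperature before applying a softmax makes low temperatures concentrate probability on the top-scoring token, while high temperatures flatten the distribution. The sketch below uses hypothetical scores, not real model outputs.

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw token scores into sampling probabilities at a given temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                          # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                     # hypothetical scores for three tokens

for t in (0.1, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
# Lower temperature concentrates mass on the top-scoring token;
# higher temperature spreads it toward a uniform distribution.
```

At the paper's setting of 0.1, nearly all probability mass sits on the highest-scoring token, which is why the responses are close to deterministic.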

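The prompt assembly described in Section V, where retrieved context and the user question are filled into the prompt_template, amounts to simple string templating. The sketch below paraphrases the template text; build_prompt() and the sample data are hypothetical, not the paper's actual code.

```python
# Illustrative assembly of the RAG prompt: retrieved context plus user question.
PROMPT_TEMPLATE = (
    "Please provide an answer to the question below, ensuring that your "
    "response is derived solely from the provided context.\n\n"
    "CONTEXT:\n{context}\n\nQUESTION:\n{question}\n\n"
    "If the context does not contain the information necessary to answer "
    "the question, simply reply with 'I am not sure. Please call: 1-334-808-6576'."
)

def build_prompt(retrieved_responses: list[str], question: str) -> str:
    """Join the retrieved 'response' texts into the template's context slot."""
    context = "\n".join(retrieved_responses)
    return PROMPT_TEMPLATE.format(context=context, question=question)

prompt = build_prompt(
    ["We offer BS and MS degrees in Computer Science."],
    "What degrees do you offer?",
)
print(prompt)
```

The LLM then sees only this single string, which is how the restriction to CSV-derived context and the fixed fallback message are enforced.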

Fig. 8. Departmental Inquiry Response by AI Gemini Assistant.

Fig. 8 illustrates the AI Gemini assistant answering a prospective student's inquiry, "What degrees do you offer?" The assistant provides a detailed response listing the degrees offered by the university, facilitating ease of information retrieval for prospective students or interested parties.

Forty unique questions were used to test the PaLM and Gemini models, assessing their accuracy and capabilities in answering questions such as the cost of the master's program, Python tutorials, identifying the computer science faculty, differences between Python and C++, potential jobs post-graduation, and details about the degree concentrations.

Performance:

1) PaLM demonstrated better accuracy, correctly answering 37 out of 40 questions. For example, when asked, "Can I get a bachelor's degree online?" PaLM provided a clear and informative response, highlighting the university's online program and its benefits.

2) In contrast, Gemini answered 32 questions correctly. While some answers were accurate, others showed limitations, such as the response to "Can I get a bachelor's degree online?". In this instance, Gemini expressed uncertainty and directed the user to contact the department or instructor, potentially hindering the user's ability to get a quick answer. This difference in handling specific questions highlights areas for improvement in Gemini's knowledge base and its ability to provide definitive answers.

3) User feedback indicates response time differences between the PaLM and Gemini models. Users report that the PaLM web application loads noticeably faster in both Safari and Google Chrome browsers.

Contextual Understanding and Prompting:

1) PaLM excels at following the provided prompt and adheres to strict guidelines. The prompt template specifically instructs the model to "give tutorials and explain in detail" for computer science questions. PaLM effectively obeys this direction, retrieving relevant information from its knowledge base.

2) Gemini's strength lies in its multimodality. It can handle various data formats like text, images, and code. This versatility makes it a good candidate for web development scenarios where user queries might incorporate diverse information, such as screenshots or code snippets alongside textual content. However, Gemini is less adept when working with pre-defined CSV data for tasks requiring strict adherence to specific prompts. Its focus might shift towards pattern matching within the CSV content, potentially leading to less comprehensive responses or failure to follow the prompt's instructions. This could be particularly true for tasks requiring in-depth explanations or generating new content, such as computer science tutorials. However, in domains like the medical field, where factual accuracy and information retrieval from structured data are crucial, Gemini's ability to handle diverse formats and find relevant patterns could prove advantageous.

Data Restriction:

1) Both models rely heavily on the information stored in the CSV files. Gemini's ability to generate content would be inherently limited if those files lack rich information or detailed explanations for departmental or computer science concepts.

2) PaLM might be better at drawing from external knowledge sources or generating explanations from its own understanding, even if the CSV data is somewhat limited.

Table I compares the performance of PaLM and Gemini on a set of sample questions. The table categorizes the questions into three sections: scenarios where both systems failed due to limitations set by the prompt, instances where PaLM provided more accurate or helpful responses, and examples where Gemini outperformed PaLM. The examples highlight the strengths and weaknesses of each system in handling specific types of inquiries, such as basic greetings, specific program information, and technical support questions.

TABLE I. MODEL PERFORMANCE COMPARISON ON SAMPLE QUESTIONS

A sample question that neither PaLM nor Gemini can address, because the prompt restricts responses to the CSV data and the question asks for information outside it:
Q: What is the best website to find sports data?
PaLM: I am not sure. Please call 1-334-808-6576…
Gemini: I am not sure. Please call 1-334-808-6576…

Examples of questions where PaLM performed better than Gemini:
Q: Hello?
PaLM: Hello, how can I help you?
Gemini: I am not sure. Please call: 1-334-808-6576…
Q: How much does the master's of computer science program cost?
PaLM: Please check: https://www.troy.edu/scholarships-costs-aid/costs/
Gemini: I am not sure. Please call: 1-334-808-6576…
Q: How big are the graduate classes?
PaLM: The average class size for graduate computer science courses at Troy University is approximately 20 students.
Gemini: I am not sure. Please call: 1-334-808-6576…
Q: Can you give Python tutorials?
PaLM: Python is a general-purpose programming language that is designed to be easy to read and write…
Gemini: I am not sure. Please call: 1-334-808-6576…

Examples of questions where Gemini outperformed PaLM:
Q: What are the admission requirements for graduate students?
PaLM: To be considered for admission to a graduate program at Troy University, you must meet the following requirements…
Gemini: Please check: https://www.troy.edu/applications-admissions/admissions-process-requirements/graduate/index.html


Q: Where can I find additional help on programming?
PaLM: An error occurred: list index out of range
Gemini: You can contact the course instructors, tutors, or use resources such as ChatGPT or other generative AI (GenAI).

VII. CONCLUSION AND FUTURE WORK

This paper presented a framework that integrates advanced AI models and vector databases to significantly enhance the effectiveness of query retrieval in web development. Our system leverages freely available tools, making it cost-effective and accessible for developers. The comparative analysis between PaLM and Gemini revealed their unique strengths: PaLM can learn from a few examples, which might be helpful for limited datasets, while Gemini focuses on factual accuracy and aims to reduce factual errors and hallucinations. Future work will involve comprehensive testing and evaluation of the system's performance across diverse user scenarios to ensure scalability and robustness. Because the system's effectiveness is highly dependent on the quality and comprehensiveness of the CSV data, future work will also explore techniques for continuous knowledge base improvement, including strategies for automatic data augmentation, user feedback integration, and potentially incorporating external knowledge sources to enrich the information available to the LLMs.

REFERENCES
[1] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal and A. Neelakantan, "Language models are few-shot learners," in Advances in Neural Information Processing Systems, Curran Associates, Inc., 2020, pp. 1877-1901.
[2] P. Lewis, E. Perez, A. Piktus, F. Petroni and V. Karpukhin, "Retrieval-augmented generation for knowledge-intensive NLP tasks," in Proceedings of the 34th International Conference on Neural Information Processing Systems, 2020.
[3] V. Karpukhin, B. Oguz, S. Min, P. Lewis, L. Wu, S. Edunov, D. Chen and W. Yih, "Dense passage retrieval for open-domain question answering," in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020.
[4] "hkunlp/instructor-large," Hugging Face, [Online]. Available: https://huggingface.co/hkunlp/instructor-large. [Accessed 2024].
[5] "Facebook AI Similarity Search," Meta, [Online]. Available: https://ai.meta.com/tools/faiss/. [Accessed 2024].
[6] A. Chowdhery, S. Narang, J. Devlin and M. Bosma, "PaLM: scaling language modeling with pathways," The Journal of Machine Learning Research, vol. 24, no. 1, pp. 11324-11436, 2022.
[7] S. Pichai and D. Hassabis, "Introducing Gemini: our largest and most capable AI model," 6 December 2023. [Online]. Available: https://blog.google/technology/ai/google-gemini-ai/#sundar-note.
[8] "Applications that can reason. Powered by LangChain," [Online]. Available: https://www.langchain.com/. [Accessed 2024].
[9] D. Jurafsky and J. H. Martin, Speech and Language Processing, 2nd ed., Prentice Hall, 2009.
[10] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[11] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser and I. Polosukhin, "Attention is all you need," in Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017.
[12] J. Devlin, M. W. Chang, K. Lee and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019.
[13] G. Salton, A. Wong and C. Yang, "A vector space model for automatic indexing," Communications of the ACM, vol. 18, no. 11, pp. 613-620, 1975.
[14] J. Johnson, M. Douze and H. Jégou, "Billion-scale similarity search with GPUs," IEEE Transactions on Big Data, vol. 7, no. 3, pp. 535-547, 2019.
[15] Y. Wang, X. Chen, J. Fang, Z. Meng and S. Liang, "Enhancing conversational recommendation systems with representation fusion," ACM Transactions on the Web, vol. 17, no. 1, pp. 1-34, 2023.
[16] Y. Han, C. Liu and P. Wang, "A comprehensive survey on vector database: storage and retrieval technique, challenge," arXiv, [Online]. Available: https://arxiv.org/pdf/2310.11703.
[17] "A faster way to build and share data apps," Streamlit, [Online]. Available: https://streamlit.io/. [Accessed 2024].
[18] "Get API key," Google, [Online]. Available: https://aistudio.google.com/app/apikey. [Accessed 2024].
[19] "Instruct Embeddings on Hugging Face," [Online]. Available: https://python.langchain.com/v0.1/docs/integrations/text_embedding/instruct_embeddings/. [Accessed 2024].

