The Random Walk Blog

2024-06-26

Tiny Pi, Mighty AI: How to Run an LLM on a Raspberry Pi 4

Using Large Language Models (LLMs) in businesses presents challenges, including high computational resource requirements, concerns about data privacy and security, and the potential for bias in outputs. These issues can hinder effective implementation and raise ethical considerations in decision-making processes.

Running LLMs locally on small computers is one solution to these challenges. This approach lets businesses operate offline, strengthen data privacy, cut costs, and customize LLM functionality to meet specific operational requirements.

Our goal was to run an LLM on a small, affordable computer, demonstrating that capable models can run on modest hardware. We used Raspberry Pi OS together with Ollama to achieve this.

The Raspberry Pi is a compact, low-cost single-board computer that enables people to explore computing and learn how to program. It has its own processor, memory, and graphics processor, and runs Raspberry Pi OS, a Linux variant. Beyond core functionalities like internet browsing, high-definition video streaming, and office productivity applications, the device empowers users to delve into creative digital maker projects. Despite its small size, it makes an excellent platform for AI and machine learning experiments.

Choosing and Setting Up the Raspberry Pi

We used the Raspberry Pi 4 Model B, with 8GB of RAM, to balance performance and cost. This model provides enough memory to handle the demands of AI tasks while remaining cost-effective.

First, we set up Raspberry Pi OS by downloading the Raspberry Pi Imager and flashing the 64-bit Lite edition of the OS onto a microSD card. This step is crucial for ensuring the system runs smoothly and efficiently. To prepare the system for deployment, we completed the OS installation, network configuration, and system updates to ensure optimal functionality and security:

sudo apt update
sudo apt upgrade
sudo apt install python3-pip

Downloading and Setting Up Ollama

Ollama is an open-source tool for running language models locally with efficient inference. Its lightweight footprint makes it well suited to resource-constrained devices like the Raspberry Pi.

  • Downloading Ollama: We installed the Linux build of Ollama by running its official install script, which selects a build matching the Raspberry Pi's ARM architecture. This step ensures that the software runs effectively on the device.

curl -fsSL https://ollama.com/install.sh | sh
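Once the script completes, the install can be sanity-checked from the shell (the exact version string will vary by release):

ollama --version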

  • Configuring Ollama: With Ollama installed, we selected and integrated an appropriate model, as described in the next section. This involves setting the right parameters and ensuring the system can handle the computational load.

Choosing the Model

The Ollama library offers many models, making it challenging to choose the best one for the Raspberry Pi, given its 8GB RAM limitation. Large or even medium-sized LLMs could overload the system. We therefore settled on the phi3 mini model (Microsoft's 3.8-billion-parameter Phi-3 Mini), which is regularly updated and has a small storage footprint. It is a good fit for the Raspberry Pi, balancing performance against resource usage.
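With Ollama installed, fetching the model is a single command; phi3:mini is the model's tag in the Ollama library, and the quantized download is roughly 2 GB:

ollama pull phi3:mini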

Setting Up the phi3 Mini Model

Setting up the phi3 mini model was straightforward but time-consuming. Since the Raspberry Pi lacks a dedicated graphics card, the model runs entirely in CPU mode. We customized our instance of phi3 mini and named it Jarvis; with a tailored persona it can adapt its responses and act as a versatile virtual AI assistant. Jarvis is designed to handle a variety of tasks and queries, making it a useful tool for natural language processing (NLP) and semantic understanding.

ollama run phi3:mini
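Running the command above launches the stock model. To give it the Jarvis persona, Ollama lets you package a system prompt and parameters into a Modelfile; the sketch below is illustrative (the prompt wording and temperature are example values, not our exact configuration):

FROM phi3:mini
# Sampling temperature; higher values give livelier replies (example value)
PARAMETER temperature 0.8
# Persona prompt (illustrative wording)
SYSTEM "You are Jarvis, a helpful AI assistant who answers with dry wit and the occasional touch of sarcasm."

Saved as a file named Modelfile, the custom model is built and launched with:

ollama create jarvis -f Modelfile
ollama run jarvis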

About Jarvis as an AI Assistant

Jarvis, our version of the phi3 mini model, is an advanced AI assistant capable of responding in a human-like manner, infused with humor, sarcasm, and wit. This customization adds a unique personality to the AI assistant. NLP enables Jarvis to analyze user queries by breaking down the input into comprehensible components, identifying key phrases and context. This allows Jarvis to generate relevant and accurate responses, providing a seamless and intuitive user experience.
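Ollama also serves a local HTTP API (port 11434 by default), so Jarvis can be queried from code. A minimal sketch in Python, assuming the jarvis model created above and the requests package (pip install requests):

import requests

# Send one prompt to the local Ollama server and print the reply.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "jarvis", "prompt": "Who are you?", "stream": False},
    timeout=300,  # CPU-only inference on a Pi can take a while
)
print(response.json()["response"])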

Testing and Validation

After thorough testing, we observed that both the stock phi3 mini model and our customized Jarvis variant work as expected and produce satisfactory results. Jarvis handles a variety of queries and tasks efficiently, showcasing the power of LLMs on a modest platform. The testing phase involved running multiple scenarios and queries to ensure Jarvis could handle different types of input and return accurate, relevant responses.
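Much of this testing can be scripted by firing representative prompts at the model non-interactively; a small sketch of that idea (the prompts are illustrative):

# Run a few representative query types against Jarvis non-interactively
for prompt in "What is a Raspberry Pi?" "What is 17 multiplied by 23?" "Tell me a joke about computers."; do
    ollama run jarvis "$prompt"
done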

Enhancing Jarvis as an AI Assistant

To enhance Jarvis further, we plan to install additional Python packages, create a more interactive environment, build a user-friendly interface, and integrate more functionality. This includes expanding Jarvis's ability to understand more complex queries and provide more detailed responses. Future enhancements could also involve integrating Jarvis with other systems and platforms to broaden its utility.
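As a first step toward that interactive environment, a short Python script can hold a running conversation with Jarvis through Ollama's /api/chat endpoint, which accepts the message history on every call; a minimal sketch under the same assumptions as above:

import requests

# Terminal chat loop that keeps the conversation history across turns.
history = []
while True:
    user_input = input("You: ").strip()
    if user_input.lower() in {"exit", "quit"}:
        break
    history.append({"role": "user", "content": user_input})
    reply = requests.post(
        "http://localhost:11434/api/chat",
        json={"model": "jarvis", "messages": history, "stream": False},
        timeout=300,
    ).json()["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    print("Jarvis:", reply)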

Challenges Encountered

Throughout the development, we encountered several challenges:

  • Network Configuration: Initially, we faced network configuration issues stemming from a booting problem, which was resolved by using a dedicated Raspberry Pi power adapter.

  • Coding Issues: Several coding challenges emerged but were resolved through debugging and community support. The Raspberry Pi community proved invaluable for troubleshooting and finding solutions.

  • Overheating: With no GPU to share the load, inference runs entirely on the CPU, and the Raspberry Pi ran hot under sustained use. We managed this by adding heat sinks and a cooling fan, which kept the system running smoothly (a quick temperature check is shown below).
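On Raspberry Pi OS, the SoC temperature can be read from the shell, which is handy to watch while the model is generating:

vcgencmd measure_temp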

Building an LLM setup on a Raspberry Pi with Ollama has been both challenging and rewarding. This initiative showcases the potential of low-cost, low-power hardware for wider adoption of LLMs and for innovation in business use cases. As these advancements continue, the future promises even greater integration of AI into everyday operations.

Related Blogs

1-bit LLMs: The Future of Efficient and Accessible Enterprise AI

As data grows, enterprises face challenges in managing their knowledge systems. While Large Language Models (LLMs) like GPT-4 excel in understanding and generating text, they require substantial computational resources, often needing hundreds of gigabytes of memory and costly GPU hardware. This poses a significant barrier for many organizations, alongside concerns about data privacy and operational costs. As a result, many enterprises find it difficult to utilize the AI capabilities essential for staying competitive, as current LLMs are often technically and financially out of reach.

GuideLine: RAG-Enhanced HRMS for Smarter Workflows

Human Resources Management Systems (HRMS) often struggle with efficiently managing and retrieving valuable information from unstructured data, such as policy documents, emails, and PDFs, while ensuring the integration of structured data like employee records. This challenge limits the ability to provide contextually relevant, accurate, and easily accessible information to employees, hindering overall efficiency and knowledge management within organizations.

Linking Unstructured Data in Knowledge Graphs for Enterprise Knowledge Management

Enterprise knowledge management systems are vital for enterprises managing growing data volumes. They help capture, store, and share knowledge, improving decision-making and efficiency. A key challenge is linking unstructured data, which includes emails, documents, and media, unlike the structured data found in spreadsheets or databases. Gartner estimates that 80% of today's data is unstructured and often untapped by enterprises. Without integrating this data into the knowledge ecosystem, businesses miss valuable insights. Knowledge graphs address this by linking unstructured data, improving search, decision-making, and efficiency, and fostering innovation.

LLMs and Edge Computing: Strategies for Deploying AI Models Locally

Large language models (LLMs) have transformed natural language processing (NLP) and content generation, demonstrating remarkable capabilities in interpreting and producing text that mimics human expression. LLMs are often deployed on cloud computing infrastructures, which can introduce several challenges. For example, for a 7 billion parameter model, memory requirements range from 7 GB to 28 GB, depending on precision, with training demanding four times this amount. This high memory demand in cloud environments can strain resources, increase costs, and cause scalability and latency issues, as data must travel to and from cloud servers, leading to delays in real-time applications. Bandwidth costs can be high due to the large amounts of data transmitted, particularly for applications requiring frequent updates. Privacy concerns also arise when sensitive data is sent to cloud servers, exposing user information to potential breaches. These challenges can be addressed using edge devices that bring LLM processing closer to data sources, enabling real-time, local processing of vast amounts of data.

Measuring ROI: Key Metrics for Your Enterprise AI Chatbot

The global AI chatbot market is rapidly expanding, projected to reach $9.4 billion by 2024. This growth reflects the increasing adoption of enterprise AI chatbots, which not only promise up to 30% cost savings in customer support but also align with user preferences: 69% of consumers favor them for quick communication. Measuring the right metrics is essential for assessing the ROI of your enterprise AI chatbot and ensuring it delivers real business benefits.
