In this article, we'll explore how to integrate the DeepSeek R1 Distilled Llama-8B model with Cursor, a popular AI code editor, to create a private coding assistant. DeepSeek R1 is a powerful open-source language model whose efficient inference and low cost make it particularly attractive to developers and researchers. Cursor can rely on different LLM backends for its code assistance tasks, and while language models trained specifically for coding have shown excellent results, we've found that DeepSeek R1's coding capabilities are impressive in their own right.
By integrating DeepSeek R1 with Cursor, you can run a private LLM backend locally and enjoy an efficient coding experience without sending your code to a third-party service.

WasmEdge is a WebAssembly-based runtime hosted by CNCF under the Linux Foundation. Why use WasmEdge/LlamaEdge-powered Gaia to run large models?
- Lightweight: The runtime is only about 30MB, has no external dependencies, requires no root privileges, and runs as a single cross-platform binary with no daemon process, which also makes it well suited to embedded applications.
- Secure and extensible: The Wasm sandbox provides isolation, multimodal models (such as vision/speech/image) are supported, and developers can customize deeply through the Rust API, making it a good fit for building AI-native applications.
- Cloud-native integration: Models and runtime can be packaged together via Docker, and Wasm fits naturally into K8s and other cloud architectures. For details, see this article.
Gaia nodes are lightweight, portable LLM inference tools built on the WasmEdge runtime. Since connecting to Cursor requires an HTTPS endpoint, we'll use Gaia to run the DeepSeek R1 distilled model.
1. Running DeepSeek R1 Distilled Llama-8B Model
First, we need to run the DeepSeek R1 Distilled Llama-8B model on our local device and create an OpenAI-compatible API service. We'll use Gaia to run the model.
Hardware Requirements
Recommended: a Mac with 16GB RAM, or a machine with an NVIDIA GPU, Huawei Ascend NPU, etc. Minimum: a machine with 16GB RAM.
Step 1: Install the Gaia Software with the command below
curl -sSfL 'https://github.com/GaiaNet-AI/gaianet-node/releases/latest/download/install.sh' | bash
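The installer places the gaianet CLI under $HOME/gaianet and updates your shell profile; a quick sanity check (a sketch — the exact profile file depends on your shell):
```bash
# Reload the shell profile so the gaianet CLI is on your PATH
# (use ~/.zshrc or another profile file if that's what your shell reads)
source ~/.bashrc

# Confirm the CLI is reachable
which gaianet
```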
Step 2: Download and Initialize the DeepSeek R1 Distilled Llama-8B model
You can adjust the parameters in the configuration file as needed.
gaianet init --config https://raw.githubusercontent.com/GaiaNet-AI/node-configs/main/deepseek-r1-distill-llama-8b/config.json
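By default, gaianet init writes the node configuration to $HOME/gaianet/config.json, where you can inspect or edit the parameters mentioned above. A minimal sketch, assuming the default install location:
```bash
# View the generated configuration (model URL, prompt template,
# context size, port, etc.)
cat ~/gaianet/config.json
```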
Step 3: Run DeepSeek R1 Distilled Llama-8B Model
Use the following command to start Gaia and run the DeepSeek R1 Distilled Llama-8B model.
gaianet start
After successful startup, you'll receive an HTTPS URL similar to https://ids.us.gaia.domains. This is an OpenAI-compatible API URL.
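Before wiring it into Cursor, you can verify the endpoint with a plain OpenAI-style chat request; a sketch, substituting the URL printed by your own node:
```bash
# Replace ids.us.gaia.domains with the URL printed by `gaianet start`
curl -X POST https://ids.us.gaia.domains/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
        "model": "DeepSeek-R1-Distill-Llama-8B",
        "messages": [
          {"role": "user", "content": "Write a hello-world program in Rust."}
        ]
      }'
```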
Note: By default, we start the DeepSeek R1 model with an 8k context window. If your machine has more GPU memory (e.g., 64GB), you can increase the context size to 128k. A larger context window is particularly useful for coding tasks, where large source files often have to fit into the prompt to complete complex changes.
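For example, the context size can be raised through the gaianet CLI and a re-initialization; a sketch, assuming the --chat-ctx-size flag in current Gaia releases:
```bash
# Stop the node, raise the chat context window to 128k tokens,
# then re-initialize and restart (flag name assumed from the Gaia docs)
gaianet stop
gaianet config --chat-ctx-size 131072
gaianet init
gaianet start
```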
2. Integrating DeepSeek R1 Distilled Llama-8B with Cursor
Cursor is an LLM-powered AI code editor that supports multiple programming languages. With DeepSeek R1 as Cursor's backend, you can run a private code assistant locally.
Step 4: Configure Cursor to Use Local DeepSeek R1 API
- Find the LLM Backend configuration option in Cursor's settings.
- Set the Base API URL to the HTTPS URL provided by your Gaia node, e.g., https://ids.us.gaia.domains.
- Set the Model Name to DeepSeek-R1-Distill-Llama-8B.
- You can leave the API Key empty or fill it with any value, such as GAIA.
Step 5: Start Using Cursor + DeepSeek R1 for Coding
After configuration, you can use the DeepSeek R1 model in Cursor for code generation and assistance tasks, such as asking the model to generate a web search page or explain how a piece of code works. In the demo video below, we used DeepSeek R1 Distilled Llama-8B to build a simple To-Do web app.
3. Final Thoughts
By integrating the DeepSeek R1 Distilled Llama-8B model with Cursor, you can run a private code assistant on your local device and enjoy efficient, effortless coding. The strong inference capabilities of DeepSeek R1 combined with Cursor's convenient editing experience give developers a capable private toolchain.
The other DeepSeek models on this page work just as well, so give them a try in your Cursor! If you find this interesting or run into any issues, please star our GitHub repo or open an issue.