Run DeepSeek locally on Mac (works with other LLMs, too)

ChatGPT, Google’s Gemini and Apple Intelligence are powerful, but they all share one major drawback — they need constant access to the internet to work. If you value privacy and want better performance, running a large language model like DeepSeek, Google’s Gemma or Meta’s Llama locally on your Mac is a great alternative.

Think it sounds complicated? It's easier than you might expect. With the right tools, you can run DeepSeek or any other popular LLM locally on your Mac with minimal effort.

Why run DeepSeek or other LLMs locally on your Mac?

Privacy and security are the biggest reasons to run an LLM locally. Because the model runs entirely on your machine, your prompts and data never get sent to external servers, giving you complete control when working with private and sensitive information.

Running AI models locally can also be faster and cheaper, since you avoid recurring API fees. Plus, by running an LLM locally, you can train it with proprietary data, tailoring its responses to better suit your needs. This approach will also save precious time in the long run if you intend to use DeepSeek or other LLMs for work purposes.

If you are a developer, you might want to run LLMs locally and experiment with them to see how they fit into your workflow. With the necessary technical know-how, you can also use these local models to build agentic tools for your work.

How to run LLMs like DeepSeek and Llama locally on your Mac

You might think that running LLMs locally requires access to a high-end Mac with plenty of RAM. But that’s not the case. It is possible to run an LLM locally if you have an Apple silicon-powered Mac with at least 16GB of system memory. Even 8GB of memory works, but your system performance will be negatively affected.

Note that LLMs are available in several variations with different parameter counts. The higher the parameter count, the more capable the model tends to be, and the more storage and system resources it needs to run. For example, Meta's Llama comes in several sizes, including a 70 billion-parameter version. But to run that model, you need a Mac with more than 40GB of free storage and more than 48GB of system memory.

Ideally, start with an LLM like DeepSeek in a 7 billion- or 8 billion-parameter variant. That should run smoothly on a Mac with 16GB of system memory. If you have access to a more powerful Mac, use any model that suits your requirements.
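If you want a rough sense of how much memory a given model needs, a common rule of thumb is parameter count multiplied by the bytes each parameter takes at the quantization level you download. Here is a minimal back-of-the-envelope sketch in Python; the numbers are approximations and ignore the extra memory that the context window, the runtime and macOS itself consume.

    # Back-of-the-envelope estimate of the memory an LLM's weights occupy.
    # This is an approximation: actual usage also includes the context (KV) cache,
    # activations and runtime overhead, so treat the result as a lower bound.

    def estimate_weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
        """Approximate weight size in GB for a parameter count and quantization level."""
        bytes_per_param = bits_per_param / 8
        return params_billions * 1e9 * bytes_per_param / 1024**3

    # A 7B-parameter model quantized to 4 bits: roughly 3.3GB of weights,
    # which is why it fits comfortably on a 16GB Mac.
    print(f"7B @ 4-bit:  ~{estimate_weight_memory_gb(7, 4):.1f} GB")

    # An 8-bit version roughly doubles that, and a 70B model at 4 bits needs
    # on the order of 33GB, which is why it calls for a high-memory Mac.
    print(f"7B @ 8-bit:  ~{estimate_weight_memory_gb(7, 8):.1f} GB")
    print(f"70B @ 4-bit: ~{estimate_weight_memory_gb(70, 4):.1f} GB")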

It is important to download the right model for your use case. Some LLMs excel at reasoning, while others are better at coding queries. Some work better for STEM tasks, while others shine at multiturn conversations and long-context coherence.

LM Studio is the easiest way to run LLMs locally on your Mac

[Screenshot: LM Studio's onboarding process on Mac. LM Studio makes running LLMs on your Mac easy. Rajesh Pandey/CultOfMac]

If you want to run LLMs like DeepSeek and Llama locally on your Mac, LM Studio is the easiest way to get started. The software is free for personal use.

Here’s how to get started:

  • Download and install LM Studio on your Mac, then launch the application.
  • If your primary goal is to run DeepSeek locally, complete the onboarding process and download the model. Otherwise, you can skip the onboarding step.
  • You should see a search bar at the top of LM Studio asking you to Select a model to load. Click it and search for the LLM you want to download and install.
  • Alternatively, browse the list of available LLMs by clicking the Settings cog in the bottom-right corner of LM Studio, then selecting the Model Search tab on the left. You can also open this window directly with the Command + Shift + M keyboard shortcut.
  • You will see all the AI models available for download listed here. The window on the right will provide some insight into a given model, providing a brief description and its token limit.
  • Select DeepSeek, Meta’s Llama, Qwen, phi-4 or any of the other available LLMs. Then click the Download button in the bottom-right corner.
  • Note: While you can download multiple LLMs, LM Studio can only load and run one model at a time.

Use your downloaded LLM

[Screenshot: Customize the context length and resource usage of the LLM before loading it. Rajesh Pandey/CultOfMac]

After your LLM download finishes, close LM Studio’s Mission Control window. Then, click on the top search bar and load the recently downloaded LLM. When loading an AI model, LM Studio will let you configure its context length, CPU thread pool size and other key settings. You can leave these settings as they are if you are unsure what they do.

You can now ask questions or use the LLM however you like.

LM Studio allows you to have multiple separate chats with an LLM. To initiate a new conversation, click the + icon from the toolbar at the top. This is a handy option if you are simultaneously using the LLM to work on multiple projects. You can also create folders and sort your chats in them.
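LM Studio can also expose the model you have loaded through a local, OpenAI-compatible server, which is handy if you are a developer and want to script against the model instead of chatting in the app. Below is a minimal sketch, assuming you have enabled the server from LM Studio and that it is listening on its default port, 1234; the model name used here is a placeholder you should swap for the identifier LM Studio shows for your downloaded model.

    # Minimal sketch: query a model loaded in LM Studio through its local,
    # OpenAI-compatible server. Assumes the server is enabled in LM Studio and
    # listening on the default port (1234). The model name is a placeholder;
    # use the identifier LM Studio shows for your downloaded model.
    import requests

    response = requests.post(
        "http://localhost:1234/v1/chat/completions",
        json={
            "model": "deepseek-r1-distill-qwen-7b",  # placeholder model ID
            "messages": [
                {"role": "user", "content": "In two sentences, why does running an LLM locally help privacy?"}
            ],
            "temperature": 0.7,
        },
        timeout=120,
    )
    response.raise_for_status()

    # The reply follows the standard OpenAI chat-completions format.
    print(response.json()["choices"][0]["message"]["content"])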

How to keep it from using all your Mac’s resources

[Screenshot: LM Studio's model loading guardrails keep an AI model from eating up all your Mac's system resources. Rajesh Pandey/CultOfMac]

If you are worried about the AI model eating up all your Mac’s system resources, bring up LM Studio’s settings using the keyboard shortcut Command + ,. Then, ensure the Model loading guardrails setting is set to Strict. This should ensure the LLM will not overload your Mac.

You can see how much CPU and memory LM Studio and your downloaded LLM are using in the bottom toolbar. If usage is too high, consider switching to a model with fewer parameters.

For the past few weeks, I’ve been using LM Studio to run Llama and DeepSeek LLMs locally on my Mac, and I’ve been impressed by the performance of my M1 Mac mini with 16GB of RAM. It has handled the workload effortlessly. Yes, the LLMs run more smoothly on my M1 Pro MacBook Pro, and performance should be even better on newer M2, M3 or M4 Macs with more system memory.

Still, even if you own an older Apple silicon Mac, don’t assume it can’t run LLMs smoothly. You’ll be surprised by its performance.

Make sure to delete any unwanted LLMs from your Mac to free up space. If you download a few of them to try out, they will eat up your storage in no time.
