Step-by-Step Guidance

Welcome to the Step-by-Step Guidance version of this project. Let's do this!

πŸ“£ If you're EVER stuck - ask the NextWork community. Students like you are already asking questions about this project.

Before we start Step #1... Use DeepSeek On Your Browser

We’ll start with the easiest way to use DeepSeek - its web app.

If you’ve used ChatGPT or any other LLM in your browser, this will feel familiar. But, as we dive into more advanced prompts, you’ll start noticing what makes DeepSeek stand out.


In this step, you're going to:


Create a DeepSeek Account

πŸ™‹β€β™€οΈ I don't want to enter my email
You could consider using a temporary email address instead.

For example, you could use a tool like TempMail to set up a temporary email address that receives mail for 1-2 hours, and deletes itself right after. Just make sure to keep this tab open until the end of the project, so you don't lose the inbox.

πŸ™‹β€β™€οΈ I'm not getting a code
You might not get a code if DeepSeek's registration is busy. You could try signing up with Google instead. If Google isn't an option or isn't working either, you could skip this step for now and head to the next step. The first step is great for easy access to DeepSeek, but you can always come back to it later!


Run a Prompt

Summarize the fall of the Roman Empire using only text abbreviations and emojis.

πŸ’‘ Tip: You can choose another prompt to give DeepSeek. We'd recommend a short and easy request (e.g. ask for a 100 word summary instead of a 1000 word essay), so you won't be waiting too long for an answer!


πŸ’‘ What is DeepThink (R1)?
DeepThink (R1) is DeepSeek's latest AI model. It stands out for displaying its real-time reasoning process before generating a response.

Reasoning is a big deal in LLMs. Unlike older models that simply predict the next word in a sequence, DeepThink - along with OpenAI's o1 model - is one of the first models to actively review its own responses as it generates them.

In R1's thinking process, look out for intermediate thinking steps like self-doubt ("hmm") and verification checks ("wait"), which give you a lot of transparency into its problem-solving approach. Reasoning also makes an LLM much more efficient - it's more likely to solve a problem in one go, without requiring lots of back and forth between you and the LLM.


Test DeepSeek vs ChatGPT on Advanced Reasoning

Let's challenge DeepSeek with a harder prompt that requires more reasoning, and see how it compares with another LLM (e.g. OpenAI's ChatGPT).

Self Host DeepSeek

Now that you have a feel for how DeepSeek works, let's see how we can host it locally without relying on the web app.

πŸ’‘ What are the downsides of using a web app?
Ooo good question! Web-based LLMs, like ChatGPT and DeepSeek online:

  1. Require constant internet connection (you can't go offline)
  2. Introduce latency (web apps run slower when there are lots of people sending requests)
  3. Process queries through external servers - which might raise privacy concerns around how the data is stored and used.


πŸ’‘ What does it mean to host DeepSeek locally?
Running DeepSeek on your own computer means you don’t need the web app at all. Your device does all the processing, so no external servers are involved.

That means you can use DeepSeek offline, keep all your data private, and get faster responses since there's no waiting on the internet.

In this step, you're going to:


Download Ollama

Note: If you already have Ollama installed, you can skip ahead to the next step.

πŸ’‘ What is Ollama?
Ollama is a tool that makes it easy to host LLMs, like DeepSeek, on your own computer. You can start chatting with LLMs over your computer's terminal!

Ollama takes care of downloading, installing, and running the models, so you don't have to worry about the complex setup that comes with hosting an LLM locally.

Ollama also gives you more control around the LLM you're using. We'll experiment with a setting called temperature later in this project to see the benefits of having wider control.


Install Ollama

Next up, installing Ollama! Installation instructions depend on your operating system.

πŸ’‘ Haven't I already installed Ollama?
So far you've just downloaded Ollama's installation files, which means Ollama is like a package that's been delivered to your door - but you haven't opened the package yet.

You'll need to open the package and set up permissions to start using Ollama's software on your computer.

Nice! Now Ollama will take you through the process of installing the software locally.

For the most up-to-date instructions, we'd recommend visiting Ollama's GitHub repository.

curl -fsSL https://ollama.com/install.sh | sh

Manual install

[!NOTE] If you are upgrading from a prior version, you should remove the old libraries with sudo rm -rf /usr/lib/ollama first.

Download and extract the package:

curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
sudo tar -C /usr -xzf ollama-linux-amd64.tgz

Start Ollama:

ollama serve

In another terminal, verify that Ollama is running:

ollama -v

AMD GPU install

If you have an AMD GPU, also download and extract the additional ROCm package:

curl -L https://ollama.com/download/ollama-linux-amd64-rocm.tgz -o ollama-linux-amd64-rocm.tgz
sudo tar -C /usr -xzf ollama-linux-amd64-rocm.tgz

ARM64 install

Download and extract the ARM64-specific package:

curl -L https://ollama.com/download/ollama-linux-arm64.tgz -o ollama-linux-arm64.tgz
sudo tar -C /usr -xzf ollama-linux-arm64.tgz

Adding Ollama as a startup service (recommended)

Create a user and group for Ollama:

sudo useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama
sudo usermod -a -G ollama $(whoami)

Create a service file in /etc/systemd/system/ollama.service:

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=$PATH"

[Install]
WantedBy=default.target

Then start the service:

sudo systemctl daemon-reload
sudo systemctl enable ollama
sudo systemctl start ollama

Then verify the installed version:

ollama --version

Access DeepSeek in the Terminal

Now that we've installed Ollama, how do we use it to access DeepSeek locally?


In this step, you're going to:


Find and Install DeepSeek R1

πŸ’‘ Why can't I find OpenAI's models on Ollama?
Ollama focuses on open-source models like DeepSeek.

OpenAI's models are closed systems, so the underlying architecture, codebase, and datasets used to develop them are confidential. Because of that, it's not possible to run OpenAI's models locally on your machine.

πŸ’‘ What are these different dropdown options?

The different dropdown options represent different model sizes for R1. Think of DeepSeek R1 in the web app as R1 at full capacity - if you wanted to run this version of R1 locally, you would need a computer with enormous processing power and storage space (the dropdown tells us it requires 404GB of storage alone), far beyond what most personal computers have.

Model sizes let you choose a smaller, more accessible version of DeepSeek R1 for local use.

Smaller models (like 1.5b) are faster and require less memory to run locally, while larger models (like 8b) have deeper reasoning abilities and are more accurate. We're installing 1.5b first as a quick start, but we'll use a larger model next to see the difference in performance.

πŸ’‘ What does "1.5b" mean?
In AI models like DeepSeek, "1.5b" means the model has 1.5 billion parameters it uses to learn patterns from data.

Think of parameters as tiny decision-makers inside the model, each helping it recognize patterns, analyze data, and improve reasoning. More parameters generally mean the model can handle more complex tasks, but bigger isn’t always better - it also depends on how well the model is trained.
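As a rough back-of-the-envelope sketch (these are approximations, not official figures), you can estimate the memory a model needs by multiplying its parameter count by the bytes each parameter takes:

```shell
# 1.5 billion parameters at 16-bit precision (2 bytes per parameter):
awk 'BEGIN { printf "%.1f GB\n", 1.5e9 * 2 / 1e9 }'    # ~3.0 GB
# Ollama's builds are often quantized to roughly 4 bits (0.5 bytes per parameter):
awk 'BEGIN { printf "%.1f GB\n", 1.5e9 * 0.5 / 1e9 }'  # ~0.8 GB
```

This is why the 1.5b model runs comfortably on an everyday laptop, while the full 404GB version doesn't.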

πŸ’‘ What does this command do?
This command sets up DeepSeek's smallest model, i.e. the 1.5 billion parameter model, locally on your computer. Because the command uses run, your terminal will also turn into a chat session with DeepSeek R1.
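If you'd rather separate the download from the chat, here's a small sketch (the command -v guard is just there so it's safe to paste even before Ollama is installed):

```shell
# `pull` only downloads the model; `run` downloads it if needed,
# then drops you into an interactive chat session in the terminal.
if command -v ollama >/dev/null 2>&1; then
  ollama pull deepseek-r1:1.5b   # fetch the manifest and model weights
  ollama run deepseek-r1:1.5b    # start chatting with the model
else
  echo "Install Ollama first - see the previous step."
fi
```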

πŸ’‘ Extra for Experts: The terminal response starts by 'pulling manifest' - what does that mean?

When you run the Ollama command, it fetches the DeepSeek model's manifest, which is like a blueprint that tells your computer how to set up and run the model. It includes instructions for downloading and configuring everything correctly.

The actual brain of DeepSeek is the model itself, which gets downloaded after the manifest. Think of the manifest as the setup guide, while the model is the intelligence your computer will use to process prompts and generate responses.


Use Another DeepSeek R1 Model


πŸ™‹β€β™€οΈ How do I know how much storage my computer has?


If you're stuck picking a model size, we'd recommend going for the 8b option. If there are any issues with using it, you can always switch to the 7b option instead.

Test Prompts

πŸ’‘ Why can I still access DeepSeek while offline?
Local hosting through Ollama means you don't need another server to process your prompt. The DeepSeek model is running entirely on your device without needing internet connectivity.


πŸ’‘ What are the <think> tags?
The <think> tags are a terminal version of DeepSeek's real-time reasoning display, so you can still see how DeepSeek is generating its response.

You might've noticed that the <think> tags were empty in your previous request. That's because Hello was a more straightforward prompt, so deep thinking (which triggers this real-time reasoning display) wasn't required.

Use Chatbox with Ollama

While the terminal is great for quick tests, you might miss the look of the web app. It does a much better job of organizing your chats and making conversations user friendly!

No worries - you can use a tool called Chatbox to give your local, terminal-based conversations a web-app-like interface. Let's set that up!


In this step, you're going to:


Install and Configure Chatbox

πŸ™‹β€β™€οΈ How do I find the correct option for my operating system?

πŸ’‘ What does the Model provider setting do?
In Chatbox, the Model provider determines the API that will connect you to the LLM model you want to use. We're using the Ollama API, since Ollama is the tool we're using to run DeepSeek locally.


πŸ’‘ Why are we leaving the API Host as the default?
To connect you with your local LLM, Ollama needs to set up an endpoint, which is like an address within your computer to run DeepSeek. Ollama sets up LLMs at a default location (127.0.0.1:11434), so we'll keep the default value in Chatbox. This setup lets Chatbox communicate directly with your locally-hosted DeepSeek model.
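Chatbox is doing this for you behind the scenes, but you can sketch the same request yourself with curl against Ollama's /api/generate endpoint (this assumes the 1.5b model is already pulled; the guard skips the call if nothing is listening at the default address):

```shell
OLLAMA_HOST="http://127.0.0.1:11434"
if curl -s --max-time 2 "$OLLAMA_HOST" >/dev/null 2>&1; then
  # /api/generate is Ollama's single-prompt endpoint
  curl -s "$OLLAMA_HOST/api/generate" -d '{
    "model": "deepseek-r1:1.5b",
    "prompt": "Say hello in five words.",
    "stream": false
  }'
else
  echo "Nothing listening at $OLLAMA_HOST - run ollama serve first."
fi
```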


Chat with DeepSeek

πŸ™‹β€β™€οΈ DeepSeek made an error!
You might notice that the 1.5b model gave you an incorrect answer! Instead of three r's in strawberry, it only found two.

The 1.5b model is the most lightweight R1 model, so it's less able to analyse text and conduct proper reasoning.


πŸ™‹β€β™€οΈ DeepSeek didn't make an error!
How good is that! It's great if DeepSeek got that right with a smaller model. A comparison where both models produce the right result is still a great experiment. You could always try other problems or prompts to test the limits πŸ”₯

Temperature Test

Let's explore an advanced setting, called temperature, to see how you can customize DeepSeek R1 depending on the use case.


In this step, you're going to:


High Temperature Test

πŸ’‘ What is temperature?
Temperature controls the randomness of an LLM's output.

A higher temperature, like 2, gives you more creative and unpredictable responses, while a lower temperature, like 0, gives more focused and logical responses.

This is a detailed setting that you might not have access to in a web app, but it's available with local hosting and APIs. Chatbox makes it easy to edit and customise your AI model's temperature.
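Under the hood, Chatbox passes this setting to Ollama as options.temperature. Here's a minimal sketch with curl, assuming deepseek-r1:1.5b is pulled and ollama serve is running (treat it as illustrative rather than something to run blind):

```shell
# ask TEMPERATURE - send the same prompt at a given temperature setting
ask() {
  curl -s http://127.0.0.1:11434/api/generate -d "{
    \"model\": \"deepseek-r1:1.5b\",
    \"prompt\": \"Name one dessert ingredient.\",
    \"stream\": false,
    \"options\": { \"temperature\": $1 }
  }"
}
ask 2   # creative and unpredictable
ask 0   # focused and repeatable
```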

Create a recipe for a dessert that includes avocados, chocolate, and sea salt.


In this example, you might notice that the recipe uses orange juice! Interesting ingredient choice...

Low Temperature Test

Create a recipe for a dessert that includes avocados, chocolate, and sea salt.


Set up a Third Chat

πŸ’‘ Why are we opening ChatGPT?
This new chat will act as the judge of the two responses we generate - can another AI tell the difference between high and low temperature responses?

You'll also get to learn a breakdown of how to detect low vs high temperature text along the way.

You are an AI master. I will give you two pieces of generated text that received the same prompt. One was generated with a high temperature, the other was generated with a low temperature. You are to identify which one was generated with a higher temperature setting.

The second response is still generating, but I have the first response ready for you now. Can you read the first response, then wait for the second response after?

Oooo, nice work ChatGPT. You might notice that ChatGPT correctly points out that the first response had the higher temperature setting, and explains why.

πŸ’‘ Why does temperature matter?
Different temperature settings work great for different scenarios.


πŸ’‘ Extra for Experts: I want to try another temperature experiment!
We got you! Challenge your DeepSeek model to generate two responses - one with high temperature (2), one with low temperature (0) - to the following prompt:

Write a short 100 word story set in a world where gravity changes direction every day.

See the differences between a creative high-temperature story, versus a logical low-temperature story!

The Token Efficiency Showdown

Welcome to your 🀫 exclusive 🀫 secret mission! Are you ready for the ultimate test?

Your mission, should you choose to accept it, is to expose how efficiently DeepSeek and OpenAI use tokens. This lets you know which model gives you the most value. Let’s dive in!


In this secret mission, you're going to:

Clean Up

Now that we've explored the world of LLMs with DeepSeek and Ollama, it's time to clean up. This is important to keep your systems tidy.


Resources to delete:

Remove the DeepSeek models from Ollama (optional).

Uninstall Ollama (optional).

Uninstall Chatbox (optional).


If you no longer plan to use the DeepSeek models, you can remove them to free up disk space.

ollama rm deepseek-r1:1.5b
ollama rm deepseek-r1:8b
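To double-check the models are gone, you can list what Ollama still has installed (a small sketch; the guard covers machines where Ollama itself has already been removed):

```shell
if command -v ollama >/dev/null 2>&1; then
  ollama list   # the deepseek-r1 models should no longer appear
else
  echo "ollama not found - the models are gone along with it"
fi
```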


🍎 MacOS

ollama stop
sudo launchctl bootout system /Library/LaunchDaemons/com.ollama.ollama.plist
sudo rm -rf /Library/LaunchDaemons/com.ollama.ollama.plist
sudo rm -rf /usr/local/bin/ollama
sudo rm -rf ~/.ollama

If you've installed Ollama via Homebrew, you can also uninstall it using

brew uninstall ollama
brew cleanup

To check if Ollama is fully removed, run:

which ollama

If it returns ollama not found, Ollama is completely uninstalled.


πŸ–ΌοΈ Windows

The Ollama Windows installer registers an Uninstaller application.

Under Add or remove programs in Windows Settings, you can uninstall Ollama.


🐧 Linux

sudo systemctl stop ollama
sudo systemctl disable ollama
sudo rm /etc/systemd/system/ollama.service
sudo rm $(which ollama)
sudo rm -r /usr/share/ollama
sudo userdel ollama
sudo groupdel ollama

That's a wrap!

You've journeyed into the fascinating world of LLMs, and emerged victorious! πŸ†

You've learned how to:

πŸš€ p.s. Does it say "Still tasks to complete!" at the bottom of the screen?

This means you still have screenshots left to upload, or questions left to answer!

  1. Press Ctrl+F (Windows) or Command+F (Mac) on your keyboard.
  2. Search for the text Return to later.
  3. Jump straight to your incomplete tasks!
  4. πŸ™‹β€β™€οΈ Still stuck? Ask the community!