
What You Should Know About LLMs (As a User and Marketer) — Whiteboard Friday

So, let’s start with the steps that a model like ChatGPT has to go through to give you an answer to a question. Again, like search engines, these models first have to gather the data.

Then they need to save that data in a format they’re able to access, and then they need to give you an answer at the end, which is roughly the equivalent of ranking. If we start with gathering the data, this is the bit that’s closest to the search engines we know and love. They’re accessing web pages and crawling the internet, and if they haven’t visited a web page or another source for a piece of information, they simply don’t know that answer. They’re at a disadvantage here because search engines have been crawling and recording this information for decades, whereas LLMs have only just started.

So they’ve got a lot of catching up to do, and there are a lot of corners of the internet they haven’t been able to visit. One piece of information they can gather that search engines can’t access, though, is chat data. When you’re using these platforms, they’re gathering data about what you put in and how you interact with it, and that feeds back into their training data.

So that’s one thing to be aware of when you’re working with platforms like ChatGPT: if you put private data in there, it’s not necessarily private afterwards. You might want to look at your settings, or look at using the APIs, because the providers tend to promise they don’t train on API data.
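As a quick illustration (not from the video itself), here’s a minimal sketch of what the API route can look like in Python, assuming the official openai client package. The model name and prompt are just placeholders, and you should check your provider’s current data-usage policy yourself rather than treating “we don’t train on API data” as a guarantee.

```python
# Minimal sketch: sending a prompt through the OpenAI API instead of the chat UI.
# Assumes the official `openai` Python package and an OPENAI_API_KEY environment
# variable. Providers generally say API traffic isn't used for training by
# default, but verify that in their current data-usage policy.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use whichever chat model you have access to
    messages=[
        {"role": "user", "content": "Summarise this internal report: ..."},
    ],
)

print(response.choices[0].message.content)
```

The point isn’t the specific library; it’s that routing sensitive prompts through an API (or turning off chat-history training in your settings) keeps you in control of whether your inputs end up in someone else’s training data.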

If we move on to the second stage, saving that information, this is roughly what we’d call indexing in search, and it’s where things diverge a little, although there are still plenty of parallels. In the early days of search engines, the index, the data they’d saved, wasn’t updated live the way we’re used to now. It wasn’t the case that as soon as something appeared on the internet you could be confident it would show up in a search engine. Instead, the index was updated once every few months, because those updates were costly in time and money. We’re in a similar situation with large language models at the moment.

You may have noticed that every so often the providers announce, “Okay, we’ve updated things. The model’s information is now current up to April,” or something like that. That’s because when they want to put more information into a model, they have to retrain the whole thing, which again is very costly. Both of those limitations feed into the answers you get at the end.

I’m sure you’ve seen this: you might be working with ChatGPT, and it simply hasn’t seen the information you’re asking about, or the information it does have is out of date.

Published by Robin Lord