July 12, 2024
The Evolution of Large Language Models in Research
Written by: Research Solutions | Marketing Team
Emerging technologies are accelerating the pace of innovation, and the pace of research along with it. To stay at the forefront of their fields, researchers, scientists, and academics increasingly need to use these technologies effectively.
Large language models (LLMs) have demonstrated their potential to reshape the research landscape. These models can process and generate vast amounts of text, making them invaluable tools with a wide range of applications for making the research process more efficient and robust.
Understanding the current state of LLMs, even compared to just last year, is crucial for making the most of this evolving technology.
LLMs & The Hype Cycle
LLMs are in an interesting period right now. When they emerged a few years ago, they generated quite a bit of chatter. Borrowing concepts from biology, we talked about them in terms of neural networks, along with artificial general intelligence, consciousness, and even a sense of logic.
With so much information and so many conversations happening all at once, the way we collectively talked about LLMs and related technology quickly led to fear-mongering about robots taking over.
Even today, alarmist articles in that vein still appear from time to time.
With the tide seemingly having shifted, what's really going on here? Why is the hype fading?
The easiest way to understand is to look at Gartner's Hype Cycle, a pattern many technologies follow at various points of innovation.
A few years ago, we experienced the Innovation Trigger with LLMs and ChatGPT. Driven by marketing and buzz, we were quickly propelled to the Peak of Inflated Expectations. From there, different people settled at different points on the curve. Many students, for instance, enjoy the benefits of these tools for their daily research tasks, thriving at the peak. Conversely, researchers in regulatory compliance or other higher-stakes fields initially became disillusioned as they slid down the curve, but are gradually becoming more enlightened.
As we build products to enhance research, the Research Solutions team must navigate the peaks and valleys of the Hype Cycle to understand how this technology can boost long-term productivity. We guide researchers through the oscillation between hype and disillusionment, helping them see the light at the end of the tunnel.
LLMs as a Tool
Think of a quintessential tool, like a hammer. When you pick it up, you immediately know its purpose—you can drive nails with it. But you can also do a number of secondary things: remove nails, tap conjoining parts into place, test the integrity of materials, even use it as a bookend on your bookshelf.
It helps to reframe the chatter about LLMs with the question, "how does this help me do my own job faster and better?" One of the classic use cases for a tool like ChatGPT relates to writing.
Think about it: we've been writing for a very long time, and when you consider where we started, we've come a long way.
Many innovations helped us physically write faster and more efficiently. Then we entered the era of personal computing. Many early writing innovations centered on the typewriter aesthetic, whether on a keyboard or a phone. The real breakthroughs, however, came with digital technology, which let us offload cognitive tasks like finding synonyms or definitions, freeing up mental energy for more creative and productive writing.
When considering LLMs and their text generation capabilities, much of their value lies not so much in discovery and comprehension, but rather in aiding writing and ideation processes when you already have a clear idea of what you want to say.
Ultimately, while much is changing rapidly, we still bear the responsibility for doing a lot of the thinking ourselves when all is said and done.
Into The Unknown
Often when we are curious about a new topic, or a topic we'd like to learn more about, our first instinct is to use Google. Once you have your results, the next step is usually judging the integrity and trustworthiness of the pages.
For example, if you were learning about Ozempic and went to search what a GLP-1 agonist is (Ozempic’s drug type), you would review the top results and likely select a page from a recognizably credible source, like the Mayo Clinic. You select this page and receive an informative overview: this is a newer drug that helps with blood sugar, diabetes, and obesity management. You’d also learn that it was approved within the last couple of decades, and researchers are still discovering additional uses and benefits. This suggests that we don't yet fully understand its long-term implications.
While this was helpful, it still leaves lots of open questions.
If you really wanted to dive in deep, these reputable sites usually carry a stamp or notation indicating that the information has been medically reviewed, and a list of references is made available. This process remains cumbersome because specific claims within the article aren't referenced in the text; we're not taken to the part of the cited paper where the authors chose to make the claim. It's also challenging to assess the relevance and reasoning behind these claims:
• Why are they actually making this citation?
• Was there a specific part of the article indicating that this claim is relevant, or is it just the paper in general?
• Is this really an appropriate citation?
As you dig through, you can get bogged down by all these questions about what exactly they were referring to when they made that citation.
Guessing The Teacher’s Password
So, how does this whole process work if we try it with ChatGPT?
You could simply request: Give me an overview of GLP-1 agonists and help me understand what they are, how they work, what they're used for, and whether they're dangerous.
When you ask ChatGPT a question or give it a prompt, it's important to remember that it's a generative tool: what it's doing is predicting, one word at a time, what the next word should be.
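To make the idea concrete, here is a minimal, purely illustrative sketch of next-word prediction using a toy bigram model. The tiny corpus and counting scheme are assumptions for the example; a real LLM uses a neural network trained on web-scale text, but the core task—predict the most likely next word given what came before—is the same.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the web-scale text an LLM trains on.
corpus = (
    "glp-1 agonists help manage blood sugar . "
    "glp-1 agonists help manage obesity . "
    "glp-1 agonists are a newer drug class ."
).split()

# Count which word follows which (a bigram model -- a vastly
# simplified stand-in for a transformer's next-token prediction).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen following `word` in training."""
    return follows[word].most_common(1)[0][0]

print(predict_next("agonists"))  # "help" (seen twice, vs. "are" once)
```

Note that the model has no notion of truth: it outputs "help" only because that pattern was most frequent in its training data, which is exactly why plausible-sounding output is not the same as verified information.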
This relates to a broader concept in education called "guessing the teacher's password." The term is somewhat dated but still very pertinent. It captures a classic problem: you teach someone something, ask a question to test their understanding, they give you an answer, and then you wonder: did they truly grasp it? Or are they simply repeating what they think they should say to check the box and move forward?
You likely already know that someone can memorize their way through a lot of tests and assessments. Much of the intuition around LLMs stems from their ability to memorize vast amounts of information using extensive compute resources. They don't just store information, they also recognize patterns, enabling them to make very quick predictions.
Therefore, a perspective shift is needed.
When we say the tool is sometimes hallucinating or making up an answer, we imply it's usually correct but occasionally wrong. In reality, it's the inverse: the tool is always generating responses based on patterns, and sometimes those responses happen to be correct.
It's nice when the output appears correct and covers a satisfactory amount of information. However, if we lack topical expertise, we can't truly fact-check the response for accuracy or completeness. There's no way to inspect how the LLM generates these claims or find primary citations written by humans that verify the information. Therefore, if you're unfamiliar with the topic and using ChatGPT as a discovery tool, the information may be present, but it's not actionable because you can't fully trust or verify it.
That said, if you're an expert on the given topic and can fact-check the information yourself, this becomes a valuable tool for generating various texts, confirming material, creating iterative drafts, and brainstorming ideas while controlling the entire process.
Ultimately, it depends on your expertise, the tasks at hand, and your familiarity with these tools.
Putting LLMs to Work in Research
With ChatGPT, there are various models and an entire plug-in ecosystem. Whichever tools you use, however, it's important that you can trust their output and have the ability to explore freely.
So, let's try this with Scite Assistant using the same inquiry from above: Give me an overview of GLP-1 agonists and help me understand what they are, how they work, what they're used for, and whether they're dangerous.
Rather than jumping in to simply provide a quick answer, it works out how to deliver a well-supported one. Its search strategy draws on our own database, as Scite maintains its own citation index. It composes an answer by consulting various articles pulled from that database for information on the topic.
Once finished, Scite Assistant fact-checks the composed answer to make sure it's correct, with visibility into exactly how it arrived at this information.
The end product? A thorough answer with in-text referencing baked in.
It's still a generative text answer, similar to ChatGPT, but the in-text citations link the text to human-written research papers, supporting the AI-generated response. We can validate the text generated by the LLM by inspecting the original claims for accuracy using real papers from our database. This can be done directly within your workflow, without bouncing around between different browser tabs or needing to scan an entire paper to find the exact information needed, and eliminates the issue of blind trust.
Harnessing Scite for Confident AI-Powered Research
Researchers may be at different stages on the Hype Cycle regarding generative AI tools, but there's no denying these tools have unlocked countless innovative ways to produce and process information.
Given the inherent limitations and the need for careful scrutiny, especially in academic and research contexts, Scite Assistant sets itself apart by combining AI's generative capabilities with rigorous fact-checking against a comprehensive citation index. This ensures the information you receive is both reliable and verifiable. This hybrid approach allows users to confidently harness the power of AI, guiding researchers along the Slope of Enlightenment towards the Plateau of Productivity.
To see how Scite can elevate your research process, book a demo or download our Scite Assistant Prompt Guidebook and experience the future of research assistance firsthand.