LLMs need new prompting techniques to understand structured data

Large language models have trouble using data from tables






Large Language Models (LLMs) struggle when they have to deal with data from tables, and it is unclear how well they actually comprehend it. To find out, a research team set out to measure the ability of LLMs to understand structured data using a new benchmark, and to identify prompting techniques that improve that understanding.

Can LLMs be trained on structured data?

You can train LLMs on structured data, but the degree to which they actually learn and understand it is still an open question. Researchers are therefore trying to identify the best prompting techniques for handling tabular data and to test which table formats work best.

The researchers created a new benchmark called Structural Understanding Capabilities (SUC) to test how well LLMs understand data from tables, then ran it with a variety of prompting techniques to gather results.

To evaluate the structured data prompting techniques, the researchers ran the benchmark on both GPT-3.5 and GPT-4. They found that results vary with the table format, content order, and partition marks of the input. For table formats, they tested HTML tables, comma-separated values (CSV), and tab-separated values (TSV).
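To make the format comparison concrete, here is a minimal sketch of how the same small table can be serialized in each of the three tested formats before being placed in a prompt. The table contents and helper names are illustrative, not taken from the paper:

```python
import csv
import io

# A small illustrative table (not from the paper).
header = ["City", "Population"]
rows = [["Oslo", "709000"], ["Bergen", "286000"]]

def to_html(header, rows):
    """Serialize the table as an HTML <table>, the format that scored best."""
    cells = lambda tag, vals: "".join(f"<{tag}>{v}</{tag}>" for v in vals)
    body = "".join(f"<tr>{cells('td', r)}</tr>" for r in rows)
    return f"<table><tr>{cells('th', header)}</tr>{body}</table>"

def to_delimited(header, rows, delimiter):
    """Serialize the table as comma- or tab-separated values."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter)
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

# The same table, rendered in each of the three tested formats.
print(to_html(header, rows))             # HTML table
print(to_delimited(header, rows, ","))   # CSV
print(to_delimited(header, rows, "\t"))  # TSV
```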

Results

Of all the formats tested, HTML tables worked best for the LLMs. Even so, the highest accuracy across the seven benchmark tasks was only 65.43%, so LLMs are far from perfect at handling structured data and still have plenty of room for improvement. With the right prompting techniques, however, their performance can be enhanced.

To improve the LLMs' understanding of tables, the researchers combined structured data with self-augmented prompting, which works in three steps: first the model is asked to analyze the table, then it generates its own description of the data, and finally that self-generated description is fed back into the prompt when the model tackles the actual task.
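A rough sketch of that flow is below. The call_llm function is a hypothetical stand-in for whatever chat API you use, and the prompt wording is illustrative rather than the exact prompts from the paper:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a chat-completion API call."""
    raise NotImplementedError("wire this up to your LLM provider")

def self_augmented_answer(table_text: str, question: str) -> str:
    # Steps 1-2: ask the model to analyze the table and produce its own
    # intermediate description of the structure and key values.
    description = call_llm(
        "Analyze the following table and describe its structure, "
        "columns, and any critical values:\n\n" + table_text
    )
    # Step 3: feed that self-generated description back into the prompt
    # alongside the original table when asking the actual question.
    return call_llm(
        f"Table:\n{table_text}\n\n"
        f"Table description:\n{description}\n\n"
        f"Question: {question}"
    )
```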

In a nutshell, LLMs are not yet ready to properly understand structured data. According to the research, though, the right input choices combined with self-augmented prompting can help them handle tabular data better. The researchers will continue to study how integrating structural information into prompts can improve LLM performance across different types of structured data.

This research could also help vision-language models built on LLMs improve their prompt learning.

What are your thoughts? Is this research going to help LLMs? Let us know in the comments.
