Using ChatGPT for Exploratory Data Analysis

Introduction

In this era of data-driven decision-making, we often find ourselves wading through mountains of data in search of useful insights. ChatGPT may change that. Thanks to its natural language processing capabilities, it can help surface patterns and trends in your data that might otherwise go unnoticed. In this article, we’ll look at how data analysis with ChatGPT can change the way we work with data and do business.

ChatGPT

ChatGPT is an AI chatbot developed by OpenAI and released on November 30, 2022. It uses natural language processing to produce conversational text that sounds human. It is built on the GPT (Generative Pre-trained Transformer) architecture, a class of neural network designed for natural language processing tasks. Besides responding to user queries, the language model can produce a variety of written material, such as emails, blog posts, essays, code, and articles.

Prompt Engineering

Prompt engineering is the process of crafting the input text, or prompt, given to a large language model (LLM) so that it returns an accurate response. Prompt engineers adjust the input to get the response they need, writing carefully worded prompts that state explicitly what they want from the AI model and how they want it. The goal is to get the model to answer in a meaningful and accurate way by asking the right questions in the right ways. Prompt engineering is becoming an increasingly necessary skill for communicating effectively with LLMs.

Prompts are essential for using ChatGPT to its fullest ability. Although ChatGPT can handle a wide range of tasks, you get the most out of it by giving clear and thorough instructions. Without well-considered prompts, you are unlikely to get the results you intend.

It will also be interesting to see whether ChatGPT can perform the calculations and give us accurate results. We shall see.

Getting Started with Analyzing Data

Now, let’s test a few ChatGPT prompts for data analysis. Because ChatGPT is blocked in Somalia, I used the free version, logged in with a Turkish account.

Note that I used a random dataset from a pizza shop with 5 variables (4 features and 1 target): DeliveryID, DriverName, DeliveryTime, Distance (the end-to-end delivery distance in miles), and Duration (the delivery duration in minutes). The target variable is DeliveryTime.

Here are some prompts that draw useful insights out of the dataset. You can try them on your own and see how things go!

Prompt 1:

Act as a data analyst while taking consideration of the dataset that I’m about to give you. Provide me with a precise, conclusive response to each question. Please refrain from giving me the answers’ source code. The dataset is given and it’s in a CSV file format. The header appears in the dataset’s first row.

Prompt 2:

Count the columns and rows in the dataset.
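To cross-check ChatGPT’s answer to this prompt locally, here is a minimal sketch using only Python’s standard library. The rows in `SAMPLE_CSV` and the driver names are hypothetical stand-ins, since the article’s actual dataset is not reproduced here, and the DeliveryTime values are placeholders because that column’s format is not described.

```python
import csv
import io

# Hypothetical sample rows (NOT the article's real data); header in the first row.
SAMPLE_CSV = """DeliveryID,DriverName,DeliveryTime,Distance,Duration
1,Ali,12:05,2.0,10
2,Ali,12:40,3.0,20
3,Hana,13:10,4.0,16
4,Hana,13:55,1.0,4
"""

reader = csv.reader(io.StringIO(SAMPLE_CSV))
header = next(reader)   # the first row holds the column names
rows = list(reader)     # every remaining row is one delivery

n_rows, n_cols = len(rows), len(header)
print(f"{n_rows} rows x {n_cols} columns")
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened CSV file.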

Prompt 3:

Summarize the dataset’s numerical and categorical variables.
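A summary like this can be verified by hand with a few lines of Python. The sketch below assumes Distance and Duration are the numeric columns and DriverName is the categorical one; the sample rows and driver names are hypothetical, and the DeliveryTime values are placeholders.

```python
import csv
import io
from collections import Counter
from statistics import mean, median

# Hypothetical sample rows (NOT the article's real data).
SAMPLE_CSV = """DeliveryID,DriverName,DeliveryTime,Distance,Duration
1,Ali,12:05,2.0,10
2,Ali,12:40,3.0,20
3,Hana,13:10,4.0,16
4,Hana,13:55,1.0,4
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Numeric columns: min, max, mean, and median.
summary = {}
for col in ("Distance", "Duration"):
    values = [float(r[col]) for r in rows]
    summary[col] = {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }

# Categorical column: how often each driver appears.
driver_counts = Counter(r["DriverName"] for r in rows)

print(summary)
print(driver_counts)
```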

Prompt 4:

Convert the data into a table.

Prompt 5:

Give the value count of the data using the DriverName.

Prompt 6:

Check any missing values in the dataset.
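One way to confirm a missing-value report is to count blank cells per column. This is a minimal sketch that treats an empty or whitespace-only cell as missing, which is an assumption; real data may also mark gaps with values such as "NA". The sample rows are hypothetical, with a few cells deliberately left blank.

```python
import csv
import io

# Hypothetical sample rows (NOT the article's real data); three cells are blank.
SAMPLE_CSV = """DeliveryID,DriverName,DeliveryTime,Distance,Duration
1,Ali,12:05,2.0,10
2,Ali,12:40,,18
3,Hana,13:10,4.0,
4,,13:55,1.0,5
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# For each column, count the cells that are empty after stripping whitespace.
missing = {
    col: sum(1 for r in rows if not (r[col] or "").strip())
    for col in rows[0]
}
print(missing)
```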

Prompt 7:

Examine any outliers in the dataset.
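The prompt does not say how outliers should be defined, so ChatGPT’s choice of method may vary. As a reference point, the sketch below applies the common 1.5 × IQR rule to the Duration column. The rule, the column choice, and the sample rows are all assumptions made for illustration.

```python
import csv
import io
from statistics import quantiles

# Hypothetical sample rows (NOT the article's real data); delivery 8 is unusually slow.
SAMPLE_CSV = """DeliveryID,DriverName,DeliveryTime,Distance,Duration
1,Ali,12:05,2.0,10
2,Ali,12:40,2.5,12
3,Hana,13:10,2.2,11
4,Hana,13:55,2.8,13
5,Omar,14:20,2.4,12
6,Omar,15:00,3.0,14
7,Ali,15:30,2.1,11
8,Hana,16:10,2.6,60
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
durations = [float(r["Duration"]) for r in rows]

# Quartiles and the conventional 1.5 * IQR fences.
q1, _, q3 = quantiles(durations, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Keep (DeliveryID, Duration) for every delivery outside the fences.
outliers = [
    (r["DeliveryID"], float(r["Duration"]))
    for r in rows
    if not low <= float(r["Duration"]) <= high
]
print(outliers)
```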

Prompt 8:

What are the key determinants that affect DeliveryTime?

Prompt 9:

Compute the average of DeliveryTime.

Prompt 10:

Who is the fastest driver in terms of average speed?
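“Average speed” can be computed in more than one way, so it is worth checking which one ChatGPT used. The sketch below takes each driver’s total distance divided by total time, converting Duration from minutes to hours to get miles per hour. The sample rows and driver names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows (NOT the article's real data).
# Distance is in miles and Duration is in minutes, as described in the article.
SAMPLE_CSV = """DeliveryID,DriverName,DeliveryTime,Distance,Duration
1,Ali,12:05,2.0,10
2,Ali,12:40,3.0,20
3,Hana,13:10,4.0,16
4,Hana,13:55,1.0,4
"""

miles = defaultdict(float)
minutes = defaultdict(float)
for row in csv.DictReader(io.StringIO(SAMPLE_CSV)):
    miles[row["DriverName"]] += float(row["Distance"])
    minutes[row["DriverName"]] += float(row["Duration"])

# Total miles over total hours; multiplying by 60 converts minutes to hours.
speed_mph = {name: miles[name] * 60 / minutes[name] for name in miles}

fastest = max(speed_mph, key=speed_mph.get)
print(fastest, speed_mph[fastest])
```

Averaging the per-delivery speeds instead would weight short and long trips equally and can rank drivers differently, so the two definitions should not be mixed.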

Prompt 11:

How many deliveries did each driver make?

Prompt 12:

What is the total distance covered by each driver?

Prompt 13:

Calculate the average duration of deliveries for each driver.
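Prompts 11 to 13 are all per-driver aggregations, so one grouping pass is enough to check ChatGPT’s answers to all three. This is a minimal sketch with hypothetical sample rows and driver names; the keys in `per_driver` are illustrative labels, not part of the dataset.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows (NOT the article's real data).
SAMPLE_CSV = """DeliveryID,DriverName,DeliveryTime,Distance,Duration
1,Ali,12:05,2.0,10
2,Ali,12:40,3.0,20
3,Hana,13:10,4.0,16
4,Hana,13:55,1.0,4
5,Hana,14:30,2.5,13
"""

# Group (distance, duration) pairs by driver.
by_driver = defaultdict(list)
for row in csv.DictReader(io.StringIO(SAMPLE_CSV)):
    by_driver[row["DriverName"]].append((float(row["Distance"]), float(row["Duration"])))

per_driver = {
    name: {
        "deliveries": len(trips),                      # Prompt 11
        "total_miles": sum(d for d, _ in trips),       # Prompt 12
        "avg_minutes": mean([m for _, m in trips]),    # Prompt 13
    }
    for name, trips in by_driver.items()
}

for name, stats in per_driver.items():
    print(name, stats)
```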

Prompt 14:

Using the dataset, produce insightful conclusions.

Conclusion

Ultimately, ChatGPT was able to produce thoughtful conclusions from the data, and it met our expectations in this test.

In this article, we’ve shown how to use ChatGPT to analyze data in a short period of time. We also saw the value of prompts in ChatGPT and how to use the right ones to get exploratory data analysis (EDA) results.

I hope you found this article interesting and that you can apply these skills in your daily workflow with your own datasets.

Mr. Abdullahi Mohamud Osman, Emerging Technologies Lead, SIMAD iLab, SIMAD University
