These are rough notes from my reading of a great article: https://medium.com/data-science-at-microsoft/automating-data-analytics-with-chatgpt-827a51eaa2c
ChatGPT excels at providing the high-level knowledge component. Professionals can turn to ChatGPT for advice on solving business problems, doing data analysis, and writing code, for example.
What if we could teach ChatGPT to leverage the tools that professionals use, and the thought process behind them, to analyze problems within specific domains, particularly business analytics? By exploring this possibility, we can potentially expand ChatGPT’s capabilities and transform it into a valuable tool for data analytics professionals.
Goal
The objective is to allow users to ask complex analytical questions of business data, which often exists in structured SQL databases. Ultimately, the goal is for ChatGPT to deliver the answers in the best possible format, complete with rich visualizations that make it easier for users to comprehend the results. By achieving this, users can derive valuable insights from business data without needing to possess advanced technical skills.
Approach
1) Use ChatGPT’s broad knowledge of data and business analytics to plan execution at both high and detailed levels, with help from the context that we provide.
2) Guide ChatGPT to break a complex problem or question into addressable steps. For this we can use a popular technique in LLM prompt engineering known as chain of thought (CoT). Additionally, we can augment this approach with a more advanced CoT technique called ReAct, which enables ChatGPT to re-evaluate the planned approach by observing the results of intermediate steps.
3) Give ChatGPT the necessary tools to perform data retrieval and data analysis. Here we can take advantage of the capabilities of ChatGPT to write SQL queries and Python data analysis code in designing the tool.
4) Design the prompt to instruct ChatGPT to perform the specific action at each step.
5) As ChatGPT is merely the “brain,” supplement the approach with inter-system communication.
6) Build the end user application.
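The ReAct idea in step 2 can be sketched as a small loop: the model emits a thought and an action, the application runs the matching tool, and the observation is appended to the prompt before the model is asked again. This is only a minimal sketch, not the article's implementation. The `Action: tool[input]` and `Final Answer:` text formats, the `react_loop` function, the `sum_numbers` tool, and `fake_llm` (a scripted stand-in for a real ChatGPT call) are all assumptions made for illustration.

```python
import re
from typing import Callable, Dict


def sum_numbers(arg: str) -> str:
    """Hypothetical tool: add up a comma-separated list of integers."""
    return str(sum(int(part) for part in arg.split(",")))


def fake_llm(transcript: str) -> str:
    """Scripted stand-in for a ChatGPT call (assumption, no network access).

    It answers as soon as the transcript contains an observation,
    otherwise it asks for the sum_numbers tool.
    """
    observations = re.findall(r"Observation:\s*(.*)", transcript)
    if observations:
        return f"Thought: I have the total.\nFinal Answer: {observations[-1]}"
    return "Thought: I should add up the sales.\nAction: sum_numbers[120, 80, 50]"


def react_loop(
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    """Alternate model replies and tool observations until a final answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"

        final = re.search(r"Final Answer:\s*(.*)", reply)
        if final:
            return final.group(1).strip()

        action = re.search(r"Action:\s*(\w+)\[(.*)\]", reply)
        if action is None:
            transcript += "Observation: no action found, use Action: tool[input]\n"
            continue

        name, arg = action.group(1), action.group(2)
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool {name}"
        # Feed the result back so the next reply can react to it.
        transcript += f"Observation: {observation}\n"
    raise RuntimeError("no final answer within max_steps")
```

Calling `react_loop(fake_llm, {"sum_numbers": sum_numbers}, "What are total sales?")` walks through one action and one observation before the scripted model produces its final answer. In a real application, `fake_llm` would be replaced by a call to the chat model.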
Process
Similar to a manual analytic process, the automated analytic application process is designed with three main stages:
Data acquisition: This stage involves retrieving the data from the source system to answer the business question. The automation of this stage requires knowledge of the data schema of the source system and the requisite business knowledge for selecting the right data.
Analytical computation: This stage covers everything from simple computations such as aggregations to statistical analysis and machine learning.
Presentation: This stage involves visualizing and presenting the data to the user.
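The three stages can be sketched as three small functions chained together. This is a rough illustration under stated assumptions rather than the article's code: the `sales` table, its columns, and the function names `acquire`, `compute`, and `present` are hypothetical, an in-memory SQLite database stands in for the source system, and a text bar chart stands in for a rich visualization.

```python
import sqlite3
from statistics import mean
from typing import Dict, List, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with a hypothetical sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 100.0), ("north", 140.0), ("south", 60.0)],
    )
    return conn


def acquire(conn: sqlite3.Connection, sql: str) -> List[Tuple[str, float]]:
    """Data acquisition: run a query against the source system."""
    return conn.execute(sql).fetchall()


def compute(rows: List[Tuple[str, float]]) -> Dict[str, float]:
    """Analytical computation: average amount per region."""
    grouped: Dict[str, List[float]] = {}
    for region, amount in rows:
        grouped.setdefault(region, []).append(amount)
    return {region: mean(values) for region, values in grouped.items()}


def present(summary: Dict[str, float]) -> str:
    """Presentation: render one text bar per region, one '#' per 10 units."""
    lines = []
    for region, avg in sorted(summary.items()):
        bar = "#" * round(avg / 10)
        lines.append(f"{region:<6}{avg:>7.1f} {bar}")
    return "\n".join(lines)
```

A typical run would be `print(present(compute(acquire(build_demo_db(), "SELECT region, amount FROM sales"))))`. In the automated application described here, the SQL string and the analysis code would be written by ChatGPT instead of being fixed in advance.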
ChatGPT agents
The ChatGPT agents perform tasks on their own by using the tools, observing the results, and adjusting their actions based on their observations. There are two agents in this design:
Data engineer: This agent is responsible for performing data acquisition from the source system (in this case a SQL database). The data engineer agent receives instructions from the data scientist agent.
Data scientist: The main agent in this solution, responsible for producing the end result as prompted by a human user’s request. The data scientist agent can request that the data engineer agent acquire the necessary data, and then the data scientist agent uses tools to perform data analytics to produce a final answer.
The use of two separate agents follows the design thinking of dividing a potentially complex task into multiple sub-tasks that are easier for ChatGPT to work on.
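The division of labor between the two agents might look like the sketch below. It is illustrative only: the class names, the `orders` table, and the `stub_write_sql` function (a fixed stand-in for ChatGPT translating a request into SQL) are assumptions, and the data scientist's analysis is hardcoded where a real agent would choose and run its own tools.

```python
import sqlite3
from typing import Callable, List, Tuple


class DataEngineerAgent:
    """Acquires data from the SQL source on request."""

    def __init__(self, conn: sqlite3.Connection, write_sql: Callable[[str], str]):
        self.conn = conn
        self.write_sql = write_sql  # stand-in for an LLM call that writes SQL

    def acquire(self, request: str) -> List[Tuple[str, float]]:
        sql = self.write_sql(request)
        return self.conn.execute(sql).fetchall()


class DataScientistAgent:
    """Main agent: delegates data acquisition, then analyzes the result."""

    def __init__(self, engineer: DataEngineerAgent):
        self.engineer = engineer

    def answer(self, question: str) -> str:
        rows = self.engineer.acquire(f"Data needed for: {question}")
        # Hardcoded analysis for the sketch: pick the row with the largest total.
        product, total = max(rows, key=lambda row: row[1])
        return f"{product} has the highest revenue ({total:.0f})"


def stub_write_sql(request: str) -> str:
    """Hypothetical replacement for ChatGPT: always returns the same query."""
    return "SELECT product, SUM(revenue) FROM orders GROUP BY product"


def build_orders_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (product TEXT, revenue REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("widget", 30.0), ("widget", 50.0), ("gadget", 60.0)],
    )
    return conn
```

Wiring them together is a matter of `DataScientistAgent(DataEngineerAgent(build_orders_db(), stub_write_sql)).answer("Which product earns the most?")`. Keeping the SQL concerns inside the engineer agent means the scientist agent's prompt only has to deal with analysis.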
References:
https://medium.com/data-science-at-microsoft/automating-data-analytics-with-chatgpt-827a51eaa2c