Thursday, April 11, 2024

What is PandasAI?

PandasAI is a Python library that lets you ask questions of your data (CSV, XLSX, PostgreSQL, MySQL, BigQuery, Databricks, Snowflake, etc.) in natural language. It helps you explore, clean, and analyze your data using generative AI.

Beyond querying, PandasAI offers functionalities to visualize data through graphs, cleanse datasets by addressing missing values, and enhance data quality through feature generation, making it a comprehensive tool for data scientists and analysts.

Features

Natural language querying: Ask questions of your data in natural language.

Data visualization: Generate graphs and charts to visualize your data.

Data cleansing: Cleanse datasets by addressing missing values.

Feature generation: Enhance data quality through feature generation.

Data connectors: Connect to various data sources like CSV, XLSX, PostgreSQL, MySQL, BigQuery, Databricks, Snowflake, etc.

How does PandasAI work?

PandasAI uses a generative AI model to interpret natural language queries and translate them into Python code and SQL queries. It then runs that code against the data and returns the results to the user.
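To make that flow concrete, here is a minimal sketch of the question-to-code-to-result loop. It is not PandasAI's actual implementation: `fake_llm`, `answer`, and the `SALES` data are hypothetical names invented for illustration, the "model" is a hardcoded stub that handles only one question, and plain Python lists stand in for a DataFrame so the sketch runs without any dependencies.

```python
# Toy dataset standing in for a DataFrame (plain Python, no dependencies).
SALES = [
    {"country": "United States", "sales": 5000},
    {"country": "Germany", "sales": 4100},
    {"country": "Japan", "sales": 4500},
    {"country": "China", "sales": 7000},
]


def fake_llm(question):
    """Hypothetical stand-in for the LLM call: maps a question to Python source.

    A real model would generate this code; here it is hardcoded for one question.
    """
    q = question.lower()
    if "top" in q and "sales" in q:
        return (
            "ranked = sorted(rows, key=lambda r: r['sales'], reverse=True)\n"
            "result = [r['country'] for r in ranked[:2]]\n"
        )
    raise ValueError("this stub only understands one kind of question")


def answer(question, rows):
    """Generate code for the question, run it against the data, return `result`."""
    code = fake_llm(question)
    namespace = {"rows": rows}
    # Executing generated code is the risky step; a real tool would sandbox it.
    exec(code, namespace)
    return namespace["result"]


print(answer("Which are the top 2 countries by sales?", SALES))
# prints ['China', 'United States']
```

The point of the sketch is the shape of the pipeline: the model never sees the whole computation as text to answer from, it only produces code, and the answer comes from running that code on the real data.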

How to get started with PandasAI?

# Using poetry (recommended)

poetry add pandasai

# Using pip

pip install pandasai

import os

import pandas as pd

from pandasai import Agent

# Sample DataFrame

sales_by_country = pd.DataFrame({

    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],

    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]

})


# By default, unless you choose a different LLM, it will use BambooLLM.

# You can get a free API key by signing up at https://pandabi.ai (you can also set it in your .env file)

os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"


agent = Agent(sales_by_country)

agent.chat('Which are the top 5 countries by sales?')

## Output

# China, United States, Japan, Germany, Australia


You can also use an OpenAI API key directly with PandasAI:

import os

from pandasai import SmartDataframe

import pandas as pd

from pandasai.llm import OpenAI


# pandas dataframe

sales_by_country = pd.DataFrame({

    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],

    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]

})


llm = OpenAI(api_token="sk-UndamporiPazhamporiSavalaVada")

sdf = SmartDataframe(sales_by_country, config={"llm": llm})


response = sdf.chat('Which are the top 5 countries by sales?')

print(response)

# Output: China, United States, Japan, Germany, Australia
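Hardcoding a key in source, as in the example above, is easy to leak by accident. One common alternative is to read it from an environment variable. The sketch below assumes the key is stored in a variable named `OPENAI_API_KEY`; both that name and the `load_api_key` helper are illustrative choices, not something PandasAI requires.

```python
import os


def load_api_key(var_name="OPENAI_API_KEY"):
    """Return the key stored in the given environment variable, or raise if unset.

    `var_name` defaults to an assumed, conventional name; adjust it to your setup.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key
```

The returned string can then be passed as `api_token` when constructing the `OpenAI` LLM object, in place of the literal key.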

