Introduction
Hey there, everyone! Welcome to our cozy corner of the internet, where we’re digging into ChatGPT reviews.
That’s right, we’re exploring what people are saying about ChatGPT and how it fares among AI chatbots. Grab a drink, get settled in, and let’s jump right in!
Also, check Machine Learning projects:
- Machine Learning Project 1: Honda Motor Stocks best Prices analysis
- Machine Learning Project 2: Diversity in Tech Companies Best EDA Analysis
- Machine Learning Project 3: Exploring Indian Cuisine Best Analysis
- Machine Learning Project 4: Exploring Video Game Data
- Machine Learning Project 5: Best Students Performance EDA
- Machine Learning Project 6: Obesity type Best EDA and classification
- Machine Learning Project 7: Machine Learning Engineer Salary in 2024
Why Does It Matter?
Ever been curious about how ChatGPT is holding up in the vast online world? Well, we’ve got the inside scoop. We’re breaking it down:
- Average Perplexity: Sounds fancy, right? It measures how surprised a language model is by a piece of text. Lower perplexity means more predictable text.
- Burstiness Scores: You know that friend who talks a mile a minute and then suddenly goes silent? That’s burstiness: how much the rhythm and length of sentences vary.
- Predictability: How easy is it to guess what ChatGPT will say next? Spoiler alert: it’s not always what you anticipate!
In-Depth Analysis of ChatGPT Reviews
Importing Libraries 📥
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import re
import string
import emoji
import nltk
import spacy
from tqdm import tqdm
from nltk.corpus import stopwords
from gensim.models import Word2Vec
from tensorflow.keras.preprocessing.text import Tokenizer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.preprocessing import MinMaxScaler, LabelEncoder, OrdinalEncoder
Importing Data
Source Notebook Link: https://www.kaggle.com/code/zain280/in-depth-analysis-of-chatgpt-reviews (the dataset is mounted at /kaggle/input/chatgpt-reviews-daily-updated)
df = pd.read_csv('/kaggle/input/chatgpt-reviews-daily-updated/chatgpt_reviews.csv')
df = df.sample(20000)
df.head()
| | reviewId | userName | content | score | thumbsUpCount | reviewCreatedVersion | at | appVersion |
---|---|---|---|---|---|---|---|---|
19863 | d7f901e2-3032-4e7c-9ba0-cd8d1443b32d | Arshavir Mandegar | kinda weird and scary but fascinating at the s… | 5 | 0 | 1.2024.073 | 2024-03-23 03:08:40 | 1.2024.073 |
67801 | e05d7901-c968-4f5a-bd00-fbaf4a87e355 | ayan khalid | great app, always been helpin me out. | 5 | 0 | 1.0.0039 | 2023-09-04 16:39:37 | 1.0.0039 |
71308 | 76e6efef-5daa-4a2f-8bc8-aa698f9d5048 | Archana Rajput | kep it in well done | 5 | 0 | NaN | 2023-12-17 16:16:52 | NaN |
122762 | 9a1d150d-ff0b-4686-b017-499d46047db3 | Rakibul Hasan | Good app | 5 | 0 | NaN | 2023-12-08 17:03:22 | NaN |
50227 | cf94f0c6-186a-4787-8bee-bcd0b5216345 | Aasim Saquafi | Sometime it doesn’t work | 1 | 0 | 1.2023.313 | 2024-01-05 15:41:41 | 1.2023.313 |
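One thing to keep in mind: `df.sample(20000)` draws a different random subset on every run, so the row labels and counts shown in this post will not reproduce exactly. Here is a minimal sketch of a repeatable draw, assuming a small made-up DataFrame (`reviews` and its values are hypothetical stand-ins, not the Kaggle file):

```python
import pandas as pd

# Hypothetical stand-in for the reviews file so the sketch runs anywhere.
reviews = pd.DataFrame({
    "content": [f"review {i}" for i in range(100)],
    "score": [5] * 80 + [1] * 20,
})

# Fixing random_state makes the same rows come back on every run.
sample_a = reviews.sample(n=10, random_state=42)
sample_b = reviews.sample(n=10, random_state=42)

print(sample_a.index.equals(sample_b.index))
```

The seed value itself is arbitrary; what matters is that it stays fixed between runs.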
Header View of ChatGPT Reviews
pd.DataFrame(df['content']).head()
| | content |
---|---|
19863 | kinda weird and scary but fascinating at the s… |
67801 | great app, always been helpin me out. |
71308 | kep it in well done |
122762 | Good app |
50227 | Sometime it doesn’t work |
Overview of Data
Distribution of Rows and Columns
df.shape
(20000, 8)
Data Information
df.info()
<class 'pandas.core.frame.DataFrame'>
Index: 20000 entries, 19863 to 108190
Data columns (total 8 columns):
 #   Column                Non-Null Count  Dtype
---  ------                --------------  -----
 0   reviewId              20000 non-null  object
 1   userName              19999 non-null  object
 2   content               19998 non-null  object
 3   score                 20000 non-null  int64
 4   thumbsUpCount         20000 non-null  int64
 5   reviewCreatedVersion  18313 non-null  object
 6   at                    20000 non-null  object
 7   appVersion            18313 non-null  object
dtypes: int64(2), object(6)
memory usage: 1.4+ MB
Data Description
df.describe()
| | score | thumbsUpCount |
---|---|---|
count | 20000.000000 | 20000.000000 |
mean | 4.502300 | 0.477100 |
std | 1.083953 | 12.118301 |
min | 1.000000 | 0.000000 |
25% | 5.000000 | 0.000000 |
50% | 5.000000 | 0.000000 |
75% | 5.000000 | 0.000000 |
max | 5.000000 | 1193.000000 |
Sum of Null Values
df.isnull().sum()
reviewId                   0
userName                   1
content                    2
score                      0
thumbsUpCount              0
reviewCreatedVersion    1687
at                         0
appVersion              1687
dtype: int64
Data Cleaning 🧹
Dropping Duplicates
df = df.drop_duplicates()
Dropping Rows with Null Values
df = df.dropna()
Revised Data Shape
df.shape
(18214, 8)
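The null counts above show 1,687 missing values in each version column but only 2 in `content`, so a blanket `dropna()` likely discards many reviews whose text is perfectly usable. Here is a small sketch of the alternative, dropping rows only when the column you actually need is missing. The `toy` frame is invented for illustration:

```python
import numpy as np
import pandas as pd

# Invented rows: one review has no text, two have no app version.
toy = pd.DataFrame({
    "content": ["good app", None, "crashes a lot", "love it"],
    "score": [5, 4, 1, 5],
    "appVersion": ["1.2024.073", "1.2024.073", np.nan, np.nan],
})

strict = toy.dropna()                      # drops a row if ANY column is missing
lenient = toy.dropna(subset=["content"])   # drops only rows without review text

print(len(strict), len(lenient))
```

Whether the lenient version is the right call depends on whether the version columns are needed as features later on.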
Score Value Counts
df['score'].value_counts()
score
5    14008
4     2218
1      969
3      714
2      305
Name: count, dtype: int64
ThumbsUpCount Value Counts
df['thumbsUpCount'].value_counts()
thumbsUpCount
0      17444
1        420
2        118
3         47
5         24
...
128        1
338        1
126        1
47         1
152        1
Name: count, Length: 66, dtype: int64
Visualizing Data
Histogram of Numerical Columns
df.hist(figsize=(15, 5))
plt.show()
Scatter Plot of Numerical Columns
sns.scatterplot(x='score', y='thumbsUpCount', data=df)
plt.show()
Feature Engineering with Machine Learning
Selecting Relevant Features
df.drop(columns=['reviewId', 'userName', 'at'], inplace=True)
df.head()
| | content | score | thumbsUpCount | reviewCreatedVersion | appVersion |
---|---|---|---|---|---|
19863 | kinda weird and scary but fascinating at the s… | 5 | 0 | 1.2024.073 | 1.2024.073 |
67801 | great app, always been helpin me out. | 5 | 0 | 1.0.0039 | 1.0.0039 |
50227 | Sometime it doesn’t work | 1 | 0 | 1.2023.313 | 1.2023.313 |
1650 | thank you chatgpt | 5 | 0 | 1.2024.122 | 1.2024.122 |
63566 | It is the best chat AI ..👌👌👌 but no pic or videos | 4 | 0 | 1.2023.263 | 1.2023.263 |
Encoding Columns
le = LabelEncoder()
df['reviewCreatedVersion'] = le.fit_transform(df['reviewCreatedVersion'])
oe = OrdinalEncoder()
df['appVersion'] = oe.fit_transform(df[['appVersion']])
df.head()
| | content | score | thumbsUpCount | reviewCreatedVersion | appVersion |
---|---|---|---|---|---|
19863 | kinda weird and scary but fascinating at the s… | 5 | 0 | 41 | 41.0 |
67801 | great app, always been helpin me out. | 5 | 0 | 8 | 8.0 |
50227 | Sometime it doesn’t work | 1 | 0 | 22 | 22.0 |
1650 | thank you chatgpt | 5 | 0 | 48 | 48.0 |
63566 | It is the best chat AI ..👌👌👌 but no pic or videos | 4 | 0 | 12 | 12.0 |
Applying NLP to Review Content
df['content']
19863   kinda weird and scary but fascinating at the s...
67801   great app, always been helpin me out.
50227   Sometime it doesn't work
1650    thank you chatgpt
63566   It is the best chat AI ..👌👌👌 but no pic or videos
...
38261   this app is very helpful for containing study ...
78273   it's the Best
118182  mind blowing... 👍
97082   awesome
108190  good
Name: content, Length: 18214, dtype: object
Converting to Lowercase
df['content'] = df['content'].str.lower()
df['content']
19863   kinda weird and scary but fascinating at the s...
67801   great app, always been helpin me out.
50227   sometime it doesn't work
1650    thank you chatgpt
63566   it is the best chat ai ..👌👌👌 but no pic or videos
...
38261   this app is very helpful for containing study ...
78273   it's the best
118182  mind blowing... 👍
97082   awesome
108190  good
Name: content, Length: 18214, dtype: object
Removing HTML Tags
def remove_html_tags(text):
    clean_text = re.sub('<.*?>', '', text)
    return clean_text
df['content'] = df['content'].apply(remove_html_tags)
df['content']
19863   kinda weird and scary but fascinating at the s...
67801   great app, always been helpin me out.
50227   sometime it doesn't work
1650    thank you chatgpt
63566   it is the best chat ai ..👌👌👌 but no pic or videos
...
38261   this app is very helpful for containing study ...
78273   it's the best
118182  mind blowing... 👍
97082   awesome
108190  good
Name: content, Length: 18214, dtype: object
Removing URLs
def remove_urls(text):
    url_pattern = re.compile(r'https?://\S+|www\.\S+')
    clean_text = re.sub(url_pattern, '', text)
    return clean_text
df['content'] = df['content'].apply(remove_urls)
df['content']
19863   kinda weird and scary but fascinating at the s...
67801   great app, always been helpin me out.
50227   sometime it doesn't work
1650    thank you chatgpt
63566   it is the best chat ai ..👌👌👌 but no pic or videos
...
38261   this app is very helpful for containing study ...
78273   it's the best
118182  mind blowing... 👍
97082   awesome
108190  good
Name: content, Length: 18214, dtype: object
Removing Punctuation
def remove_punctuation(text):
    punctuation = string.punctuation
    clean_text = text.translate(str.maketrans('', '', punctuation))
    return clean_text
df['content'] = df['content'].apply(remove_punctuation)
df['content']
19863   kinda weird and scary but fascinating at the s...
67801   great app always been helpin me out
50227   sometime it doesnt work
1650    thank you chatgpt
63566   it is the best chat ai 👌👌👌 but no pic or videos
...
38261   this app is very helpful for containing study ...
78273   its the best
118182  mind blowing 👍
97082   awesome
108190  good
Name: content, Length: 18214, dtype: object
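The order of these cleaning steps matters: tags and URLs have to go before punctuation, otherwise the `<`, `>`, `:` and `/` characters that the regexes look for are already gone. Here is a small sketch of that, reusing the same patterns as above on an invented sample string (the helper names and the `squeeze` whitespace step are additions for illustration):

```python
import re
import string

def strip_html(text):
    return re.sub(r"<.*?>", "", text)

def strip_urls(text):
    return re.sub(r"https?://\S+|www\.\S+", "", text)

def strip_punct(text):
    return text.translate(str.maketrans("", "", string.punctuation))

def squeeze(text):
    # Collapse the double spaces left behind by removed tokens.
    return " ".join(text.split())

sample = "Great app!<br> See https://example.com for more..."

# Same order as the post: lowercase, HTML, URLs, then punctuation.
cleaned = squeeze(strip_punct(strip_urls(strip_html(sample.lower()))))

# Punctuation first: the tag and the URL can no longer be recognised.
wrong_order = squeeze(strip_urls(strip_html(strip_punct(sample.lower()))))

print(cleaned)
print(wrong_order)
```

In the second result, the leftovers of the tag and the URL get glued onto neighbouring words instead of being removed.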
Chat Word Treatment
# Slang and abbreviations mapped to their expanded forms (duplicate "np" entry removed)
chat_words_mapping = {
    "lol": "laughing out loud", "brb": "be right back", "btw": "by the way",
    "afk": "away from keyboard", "rofl": "rolling on the floor laughing",
    "ttyl": "talk to you later", "np": "no problem", "thx": "thanks",
    "omg": "oh my god", "idk": "I don't know", "gg": "good game",
    "g2g": "got to go", "b4": "before", "cu": "see you", "yw": "you're welcome",
    "wtf": "what the f*ck", "imho": "in my humble opinion", "jk": "just kidding",
    "gf": "girlfriend", "bf": "boyfriend", "u": "you", "r": "are",
    "2": "to", "4": "for", "b": "be", "c": "see", "y": "why", "tho": "though",
    "smh": "shaking my head", "lolz": "laughing out loud", "h8": "hate",
    "luv": "love", "pls": "please", "sry": "sorry", "tbh": "to be honest",
    "omw": "on my way", "omw2syg": "on my way to see your girlfriend",
}

def expand_chat_words(text):
    # Replace each whitespace-separated word that appears in the mapping
    words = text.split()
    expanded_words = [chat_words_mapping.get(word.lower(), word) for word in words]
    return ' '.join(expanded_words)
df['content'] = df['content'].apply(expand_chat_words)
df['content']
19863   kinda weird and scary but fascinating at the s...
67801   great app always been helpin me out
50227   sometime it doesnt work
1650    thank you chatgpt
63566   it is the best chat ai 👌👌👌 but no pic or videos
...
38261   this app is very helpful for containing study ...
78273   its the best
118182  mind blowing 👍
97082   awesome
108190  good
Name: content, Length: 18214, dtype: object
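One side effect worth watching: the mapping treats the bare digits "2" and "4" as slang for "to" and "for", which also rewrites genuine numbers such as star counts. Here is a quick sketch with a cut-down, illustrative mapping (`chat_words` and `safe_words` are hypothetical names, and the sample sentences are made up):

```python
# Cut-down mapping for illustration; the digit key mirrors the "4": "for" entry above.
chat_words = {"thx": "thanks", "pls": "please", "u": "you", "4": "for"}

def expand(text, mapping):
    # Word-by-word lookup, same idea as expand_chat_words.
    return " ".join(mapping.get(word, word) for word in text.split())

slang = expand("thx pls fix it 4 me", chat_words)
rating = expand("i rate it 4 stars", chat_words)

# One possible guard: leave purely numeric tokens alone.
safe_words = {k: v for k, v in chat_words.items() if not k.isdigit()}
rating_safe = expand("i rate it 4 stars", safe_words)

print(slang)
print(rating)
print(rating_safe)
```

Dropping the digit keys trades a few missed slang expansions for keeping numeric ratings intact, which seems like the safer default for review text.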
Removing Stop Words
# The NLTK tokenizer and stop-word data must be downloaded once beforehand,
# for example with nltk.download('punkt') and nltk.download('stopwords').
stop_words = set(stopwords.words('english'))  # built once instead of once per review

def remove_stop_words(text):
    tokens = nltk.word_tokenize(text)
    filtered_tokens = [token for token in tokens if token not in stop_words]
    preprocessed_text = ' '.join(filtered_tokens)
    return preprocessed_text
df['content'] = df['content'].apply(remove_stop_words)
df['content']
19863   kinda weird scary fascinating time
67801   great app always helpin
50227   sometime doesnt work
1650    thank chatgpt
63566   best chat ai 👌👌👌 pic videos
...
38261   app helpful containing study material types qu...
78273   best
118182  mind blowing 👍
97082   awesome
108190  good
Name: content, Length: 18214, dtype: object
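NLTK’s English stop-word list includes negations such as "not" and "no", so removing every stop word can flip the meaning of a review. Row 63566 above shows it: "but no pic or videos" became "pic videos". Here is a minimal sketch of keeping negations, using a tiny hand-written stop-word set in place of the NLTK list so it runs without any downloads (`STOP_WORDS` and `NEGATIONS` are illustrative assumptions):

```python
# Tiny illustrative stop-word set; the real NLTK list is much longer.
STOP_WORDS = {"it", "is", "the", "a", "this", "does", "not", "no"}
NEGATIONS = {"not", "no"}

def remove_stop_words(text, stop_words):
    # Whitespace tokenisation is enough for this sketch.
    return " ".join(token for token in text.split() if token not in stop_words)

review = "this is not a good app"

all_removed = remove_stop_words(review, STOP_WORDS)
negations_kept = remove_stop_words(review, STOP_WORDS - NEGATIONS)

print(all_removed)
print(negations_kept)
```

For a model that predicts the star rating, keeping the negation is probably the more useful choice.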
Replacing Emojis with Meanings
def replace_emojis_with_meanings(text):
    def replace(match):
        # Convert the matched emoji run to its text alias, e.g. :thumbs_up:
        emoji_char = match.group()
        emoji_meaning = emoji.demojize(emoji_char)
        return emoji_meaning

    # Character ranges that may contain emoji (duplicate range entry removed)
    emoji_pattern = re.compile(
        "["
        u"\U0001F600-\U0001F64F"
        u"\U0001F300-\U0001F5FF"
        u"\U0001F680-\U0001F6FF"
        u"\U0001F1E0-\U0001F1FF"
        u"\U00002500-\U00002BEF"
        u"\U00002702-\U000027B0"
        u"\U000024C2-\U0001F251"
        u"\U0001f926-\U0001f937"
        u"\U00010000-\U0010ffff"
        u"\u2640-\u2642"
        u"\u2600-\u2B55"
        u"\u200d"
        u"\u23cf"
        u"\u23e9"
        u"\u231a"
        u"\ufe0f"
        u"\u3030"
        "]+",
        flags=re.UNICODE,
    )
    text_with_meanings = emoji_pattern.sub(replace, text)
    return text_with_meanings
df['content'] = df['content'].apply(replace_emojis_with_meanings)
df['content']
19863   kinda weird scary fascinating time
67801   great app always helpin
50227   sometime doesnt work
1650    thank chatgpt
63566   best chat ai :OK_hand::OK_hand::OK_hand: pic v...
...
38261   app helpful containing study material types qu...
78273   best
118182  mind blowing :thumbs_up:
97082   awesome
108190  good
Name: content, Length: 18214, dtype: object
Word Tokenization
def word_tokenization(text):
    return nltk.word_tokenize(text)
df['token_content'] = df['content'].apply(word_tokenization)
df['content']
19863   kinda weird scary fascinating time
67801   great app always helpin
50227   sometime doesnt work
1650    thank chatgpt
63566   best chat ai :OK_hand::OK_hand::OK_hand: pic v...
...
38261   app helpful containing study material types qu...
78273   best
118182  mind blowing :thumbs_up:
97082   awesome
108190  good
Name: content, Length: 18214, dtype: object
POS Tagging
nlp = spacy.load('en_core_web_sm', disable=['ner', 'textcat'])

def batch_pos_tagging(texts):
    # Tag a batch of texts and return (token, part-of-speech) pairs per text
    docs = list(nlp.pipe(texts, batch_size=50))
    return [[(token.text, token.pos_) for token in doc] for doc in docs]

batch_size = 50
num_batches = len(df) // batch_size + 1
pos_tags = []
for i in tqdm(range(num_batches)):
    start = i * batch_size
    end = start + batch_size
    # .iloc makes the positional slice explicit, since the index holds the original row labels
    batch_texts = df['content'].iloc[start:end].tolist()
    pos_tags.extend(batch_pos_tagging(batch_texts))
df['POS_Tags'] = pos_tags
100%|██████████| 365/365 [00:18<00:00, 19.53it/s]
df['POS_Tags']
19863   [(kinda, INTJ), (weird, ADJ), (scary, ADJ), (f...
67801   [(great, ADJ), (app, NOUN), (always, ADV), (he...
50227   [(sometime, ADV), (does, AUX), (nt, PART), (wo...
1650    [(thank, VERB), (chatgpt, NOUN)]
63566   [(best, ADJ), (chat, NOUN), (ai, VERB), (:, PU...
...
38261   [(app, PROPN), (helpful, ADJ), (containing, VE...
78273   [(best, ADJ)]
118182  [(mind, NOUN), (blowing, VERB), (:, PUNCT), (t...
97082   [(awesome, ADJ)]
108190  [(good, ADJ)]
Name: POS_Tags, Length: 18214, dtype: object
df.head()
| | content | score | thumbsUpCount | reviewCreatedVersion | appVersion | token_content | POS_Tags |
---|---|---|---|---|---|---|---|
19863 | kinda weird scary fascinating time | 5 | 0 | 41 | 41.0 | [kinda, weird, scary, fascinating, time] | [(kinda, INTJ), (weird, ADJ), (scary, ADJ), (f… |
67801 | great app always helpin | 5 | 0 | 8 | 8.0 | [great, app, always, helpin] | [(great, ADJ), (app, NOUN), (always, ADV), (he… |
50227 | sometime doesnt work | 1 | 0 | 22 | 22.0 | [sometime, doesnt, work] | [(sometime, ADV), (does, AUX), (nt, PART), (wo… |
1650 | thank chatgpt | 5 | 0 | 48 | 48.0 | [thank, chatgpt] | [(thank, VERB), (chatgpt, NOUN)] |
63566 | best chat ai :OK_hand::OK_hand::OK_hand: pic v… | 4 | 0 | 12 | 12.0 | [best, chat, ai, :, OK_hand, :, :OK_hand, :, :… | [(best, ADJ), (chat, NOUN), (ai, VERB), (:, PU… |
Bag of Words
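# Join any list-valued cells back into strings so CountVectorizer can read them
df['content'] = df['content'].apply(lambda x: ' '.join(x) if isinstance(x, list) else x)
df['token_content'] = df['token_content'].apply(lambda x: ' '.join(x) if isinstance(x, list) else x)
df['POS_Tags'] = df['POS_Tags'].apply(lambda x: ' '.join(str(i) for i in x) if isinstance(x, list) else x)

# Bigram counts for each text column
vectorizer = CountVectorizer(ngram_range=(2, 2))
bow_c = vectorizer.fit_transform(df['content'])
bow_t = vectorizer.fit_transform(df['token_content'])
bow_pos = vectorizer.fit_transform(df['POS_Tags'])

# Caution: each right-hand side is a 2-D (reviews x bigrams) array written into one column.
# The df.head() output below shows a single integer per cell, so the full bigram
# matrix does not appear to survive as model features.
df['content'] = bow_c.toarray()
df['token_content'] = bow_t.toarray()
df['POS_Tags'] = bow_pos.toarray()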
df.head()
| | content | score | thumbsUpCount | reviewCreatedVersion | appVersion | token_content | POS_Tags |
---|---|---|---|---|---|---|---|
19863 | 0 | 5 | 0 | 41 | 41.0 | 0 | 0 |
67801 | 0 | 5 | 0 | 8 | 8.0 | 0 | 0 |
50227 | 0 | 1 | 0 | 22 | 22.0 | 0 | 0 |
1650 | 0 | 5 | 0 | 48 | 48.0 | 0 | 0 |
63566 | 0 | 4 | 0 | 12 | 12.0 | 0 | 0 |
Predictive Modeling
Train-Test Split of Data
X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns=['score']), df['score'], test_size=0.2, random_state=41
)
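Since roughly three out of four reviews are 5-star, a plain random split can leave the rare classes unevenly represented between train and test. Here is a small sketch of a stratified split on invented labels (the `X` and `y` lists are toy data, not the review features):

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Toy labels with the same kind of imbalance as the review scores.
y = [5] * 80 + [1] * 10 + [4] * 10
X = [[i] for i in range(len(y))]

# stratify=y keeps each class's share the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=41, stratify=y
)

print(Counter(y_test))
```

Stratifying only fixes the split. To push the models toward the minority classes, `class_weight='balanced'` on `DecisionTreeClassifier` or `RandomForestClassifier` is one option to try.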
Decision Tree Model and Evaluation
dt = DecisionTreeClassifier()
dt.fit(X_train, y_train)
DecisionTreeClassifier()
y_pred = dt.predict(X_test)
print(classification_report(y_test, y_pred, zero_division=0))
              precision    recall  f1-score   support

           1       0.34      0.07      0.12       202
           2       0.00      0.00      0.00        67
           3       0.00      0.00      0.00       161
           4       0.00      0.00      0.00       430
           5       0.77      0.99      0.87      2783

    accuracy                           0.76      3643
   macro avg       0.22      0.21      0.20      3643
weighted avg       0.61      0.76      0.67      3643
cm = confusion_matrix(y_test, y_pred)
cm_display = ConfusionMatrixDisplay(confusion_matrix=cm)
fig, ax = plt.subplots(figsize=(5, 5))
cm_display.plot(ax=ax)
plt.title('Confusion Matrix')
plt.show()
Random Forest Model and Evaluation
rf = RandomForestClassifier()
rf.fit(X_train, y_train)
RandomForestClassifier()
y_pred = rf.predict(X_test)
print(classification_report(y_test, y_pred, zero_division=0))
              precision    recall  f1-score   support

           1       0.37      0.05      0.09       202
           2       0.25      0.03      0.05        67
           3       0.00      0.00      0.00       161
           4       0.00      0.00      0.00       430
           5       0.77      0.99      0.87      2783

    accuracy                           0.76      3643
   macro avg       0.28      0.21      0.20      3643
weighted avg       0.61      0.76      0.67      3643
cm = confusion_matrix(y_test, y_pred)
cm_display = ConfusionMatrixDisplay(confusion_matrix=cm)
fig, ax = plt.subplots(figsize=(5, 5))
cm_display.plot(ax=ax)
plt.title('Confusion Matrix')
plt.show()
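To put the 0.76 accuracy in context, it helps to ask how well a model would do by always answering "5". This is a short worked check using only the support column from the two reports above (the variable names are ours):

```python
# Test-set class sizes, copied from the "support" column of the reports above.
support = {1: 202, 2: 67, 3: 161, 4: 430, 5: 2783}

total = sum(support.values())
majority_class = max(support, key=support.get)

# Accuracy of a model that predicts the majority class for every review.
baseline_accuracy = support[majority_class] / total

print(majority_class, total, round(baseline_accuracy, 3))
```

The baseline comes out at roughly the same accuracy both models report, which fits their 0.99 recall on class 5 and zero recall on classes 3 and 4: in practice they are mostly predicting the majority class.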
ChatGPT reviews
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import warnings
warnings.filterwarnings('ignore')
df = pd.read_csv('/kaggle/input/chatgpt-reviews-daily-updated/chatgpt_reviews.csv')
df
| | reviewId | userName | content | score | thumbsUpCount | reviewCreatedVersion | at | appVersion |
---|---|---|---|---|---|---|---|---|
0 | 1ea528a6-6d5d-4c9a-b266-9df306f20ed7 | abdulwaheed aminat | amazing app,easy to navigate. | 5 | 0 | 1.2024.101 | 2024-05-12 23:38:52 | 1.2024.101 |
1 | 9df43688-8a80-419e-b36d-61c95fd17d2a | Benedette Morison | The app is recommendable and reliable, especia… | 5 | 0 | 1.2024.115 | 2024-05-12 23:35:02 | 1.2024.115 |
2 | 80b9806a-c5eb-44da-9f0d-6cd864c1f4cf | Android Trigger | Superb ai app | 5 | 0 | 1.2024.122 | 2024-05-12 23:27:05 | 1.2024.122 |
3 | ec5ea9e5-86bc-4de0-b170-f2871833ce74 | Brian Peters | Best thing that ever happened to me. | 5 | 0 | 1.2024.122 | 2024-05-12 23:17:46 | 1.2024.122 |
4 | 1e396118-3934-4ce6-8390-b2d56771e343 | Gautam kumar Patel | this is very good app | 5 | 0 | 1.2024.108 | 2024-05-12 23:12:56 | 1.2024.108 |
… | … | … | … | … | … | … | … | … |
113904 | 462686ff-e500-413c-a6b4-2badc2e3b21d | m.santhosh Kumar | Update 2023 | 5 | 0 | NaN | 2023-07-27 16:26:31 | NaN |
113905 | f10e0d48-ecb6-42db-b103-46c0046f9be9 | Andrew Bourgeois | its grear | 5 | 0 | NaN | 2023-09-23 16:25:18 | NaN |
113906 | df909a49-90b5-4dac-9b89-c4bd5a7c2f75 | Dern Bob | Funtastic App | 5 | 0 | NaN | 2023-11-08 13:57:14 | NaN |
113907 | abe43878-973f-4e96-a765-c4af5c7f7b20 | Abdur rahman arif | hi all | 5 | 0 | NaN | 2023-07-25 15:32:57 | NaN |
113908 | 0151001d-b81c-41b5-8927-f56738989625 | Tushar Deran | expert application | 5 | 0 | NaN | 2023-11-30 18:11:41 | NaN |
113909 rows × 8 columns
df.head()
| | reviewId | userName | content | score | thumbsUpCount | reviewCreatedVersion | at | appVersion |
---|---|---|---|---|---|---|---|---|
0 | 1ea528a6-6d5d-4c9a-b266-9df306f20ed7 | abdulwaheed aminat | amazing app,easy to navigate. | 5 | 0 | 1.2024.101 | 2024-05-12 23:38:52 | 1.2024.101 |
1 | 9df43688-8a80-419e-b36d-61c95fd17d2a | Benedette Morison | The app is recommendable and reliable, especia… | 5 | 0 | 1.2024.115 | 2024-05-12 23:35:02 | 1.2024.115 |
2 | 80b9806a-c5eb-44da-9f0d-6cd864c1f4cf | Android Trigger | Superb ai app | 5 | 0 | 1.2024.122 | 2024-05-12 23:27:05 | 1.2024.122 |
3 | ec5ea9e5-86bc-4de0-b170-f2871833ce74 | Brian Peters | Best thing that ever happened to me. | 5 | 0 | 1.2024.122 | 2024-05-12 23:17:46 | 1.2024.122 |
4 | 1e396118-3934-4ce6-8390-b2d56771e343 | Gautam kumar Patel | this is very good app | 5 | 0 | 1.2024.108 | 2024-05-12 23:12:56 | 1.2024.108 |
df.columns
Index(['reviewId', 'userName', 'content', 'score', 'thumbsUpCount', 'reviewCreatedVersion', 'at', 'appVersion'], dtype='object')
df.describe().T
| | count | mean | std | min | 25% | 50% | 75% | max |
---|---|---|---|---|---|---|---|---|
score | 113909.0 | 4.494544 | 1.094733 | 1.0 | 5.0 | 5.0 | 5.0 | 5.0 |
thumbsUpCount | 113909.0 | 0.611277 | 13.717219 | 0.0 | 0.0 | 0.0 | 0.0 | 1193.0 |
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 113909 entries, 0 to 113908
Data columns (total 8 columns):
 #   Column                Non-Null Count   Dtype
---  ------                --------------   -----
 0   reviewId              113909 non-null  object
 1   userName              113908 non-null  object
 2   content               113905 non-null  object
 3   score                 113909 non-null  int64
 4   thumbsUpCount         113909 non-null  int64
 5   reviewCreatedVersion  103913 non-null  object
 6   at                    113909 non-null  object
 7   appVersion            103913 non-null  object
dtypes: int64(2), object(6)
memory usage: 7.0+ MB
df.isna().sum()
reviewId                   0
userName                   1
content                    4
score                      0
thumbsUpCount              0
reviewCreatedVersion    9996
at                         0
appVersion              9996
dtype: int64
sns.heatmap(df.isna())
plt.show()

plt.figure(figsize=(10, 6))
sns.histplot(df['score'], bins=20, kde=True, color='skyblue')
plt.title('Distribution of Scores')
plt.xlabel('Score')
plt.ylabel('Frequency')
plt.show()

from wordcloud import WordCloud
reviews_text = ' '.join(df['content'].dropna())
wordcloud = WordCloud(width=800, height=400, background_color='white').generate(reviews_text)
plt.figure(figsize=(12, 6))
plt.imshow(wordcloud, interpolation='bilinear')
plt.title('Word Cloud of Reviews')
plt.axis('off')
plt.show()
ChatGPT Chronicles: Daily Dive into User Opinions
This analysis report provides an in-depth examination of user reviews for the ChatGPT Android App. The dataset consists of user reviews that are updated daily, containing valuable information like review ID, user name, review content, score, thumbs up count, review creation version, timestamp, and app version.
The main objective of this analysis is to extract valuable insights regarding user sentiment, identify patterns, and gain a better understanding of user satisfaction levels. These findings will serve as a basis for potential app improvements and enhancements.
The analysis uncovered several important discoveries. To begin with, the distribution of ratings shows that most users give high scores, suggesting they are generally satisfied with the app. However, there are also instances of lower ratings, indicating areas that could be improved.
Additionally, when looking at average scores for each app version, we can see differences in user satisfaction. Understanding these differences can help prioritize bug fixes and enhancements.
Furthermore, by conducting a correlation analysis between factors such as score, thumbs up count, and review length, we can gain insights into what influences user satisfaction and engagement. Lastly, analyzing scores over time using a time series approach can reveal trends and fluctuations in user sentiment.
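The time series idea can be sketched in a few lines. This is a minimal example that assumes a handful of invented rows using the dataset’s `at` and `score` column names (the timestamps and scores in `toy` are made up): parse the timestamps, then average the scores per calendar month.

```python
import pandas as pd

# Invented rows using the dataset's column names.
toy = pd.DataFrame({
    "at": ["2023-09-04 16:39:37", "2023-09-20 10:00:00",
           "2023-12-08 17:03:22", "2023-12-17 16:16:52",
           "2024-01-05 15:41:41"],
    "score": [5, 3, 5, 5, 1],
})

# The column is stored as text, so convert it to real datetimes first.
toy["at"] = pd.to_datetime(toy["at"])

# Mean score per calendar month.
monthly = toy.groupby(toy["at"].dt.to_period("M"))["score"].mean()

print(monthly)
```

On the full dataset the same grouping could be plotted as a line chart to see whether sentiment drifts after particular releases.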
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import warnings
warnings.filterwarnings('ignore')
data = pd.read_csv("/kaggle/input/chatgpt-reviews-daily-updated/chatgpt_reviews.csv")
data.head()
| | reviewId | userName | content | score | thumbsUpCount | reviewCreatedVersion | at | appVersion |
---|---|---|---|---|---|---|---|---|
0 | 1ea528a6-6d5d-4c9a-b266-9df306f20ed7 | abdulwaheed aminat | amazing app,easy to navigate. | 5 | 0 | 1.2024.101 | 2024-05-12 23:38:52 | 1.2024.101 |
1 | 9df43688-8a80-419e-b36d-61c95fd17d2a | Benedette Morison | The app is recommendable and reliable, especia… | 5 | 0 | 1.2024.115 | 2024-05-12 23:35:02 | 1.2024.115 |
2 | 80b9806a-c5eb-44da-9f0d-6cd864c1f4cf | Android Trigger | Superb ai app | 5 | 0 | 1.2024.122 | 2024-05-12 23:27:05 | 1.2024.122 |
3 | ec5ea9e5-86bc-4de0-b170-f2871833ce74 | Brian Peters | Best thing that ever happened to me. | 5 | 0 | 1.2024.122 | 2024-05-12 23:17:46 | 1.2024.122 |
4 | 1e396118-3934-4ce6-8390-b2d56771e343 | Gautam kumar Patel | this is very good app | 5 | 0 | 1.2024.108 | 2024-05-12 23:12:56 | 1.2024.108 |
plt.figure(figsize=(10, 6))
sns.histplot(data['score'], bins=20, kde=True, color='skyblue')
plt.title('Distribution of Scores')
plt.xlabel('Score')
plt.ylabel('Frequency')
plt.show()

avg_score_by_version = data.groupby('appVersion')['score'].mean().reset_index()
plt.figure(figsize=(12, 6))
sns.barplot(x='appVersion', y='score', data=avg_score_by_version, palette='viridis')
plt.title('Average Score by App Version')
plt.xlabel('App Version')
plt.ylabel('Average Score')
plt.xticks(rotation=90)
plt.show()
plt.figure(figsize=(12, 6))
sns.boxplot(x='appVersion', y='score', data=data, palette='pastel')
plt.title('Boxplot of Scores by App Version')
plt.xlabel('App Version')
plt.ylabel('Score')
plt.xticks(rotation=90)
plt.show()

# Select only numeric columns
numeric_data = data.select_dtypes(include=[np.number])
plt.figure(figsize=(10, 8))
corr = numeric_data.corr()
sns.heatmap(corr, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Correlation Heatmap')
plt.show()

plt.figure(figsize=(10, 6))
sns.histplot(data['thumbsUpCount'], bins=30, kde=True, color='green')
plt.title('Distribution of Thumbs Up Count')
plt.xlabel('Thumbs Up Count')
plt.ylabel('Frequency')
plt.show()

plt.figure(figsize=(10, 6))
sns.scatterplot(x='score', y='thumbsUpCount', data=data, color='purple', alpha=0.5)
plt.title('Score vs Thumbs Up Count')
plt.xlabel('Score')
plt.ylabel('Thumbs Up Count')
plt.show()
review_count_by_user = data['userName'].value_counts().reset_index()
review_count_by_user.columns = ['User Name', 'Review Count']
plt.figure(figsize=(12, 6))
sns.barplot(x='Review Count', y='User Name', data=review_count_by_user.head(10), palette='magma')
plt.title('Top 10 Users by Review Count')
plt.xlabel('Review Count')
plt.ylabel('User Name')
plt.show()

data['review_length'] = data['content'].apply(lambda x: len(str(x)))
plt.figure(figsize=(10, 6))
sns.histplot(data['review_length'], bins=50, kde=True, color='brown')
plt.title('Distribution of Review Length')
plt.xlabel('Review Length')
plt.ylabel('Frequency')
plt.show()
from wordcloud import WordCloud
reviews_text = ' '.join(data['content'].dropna())
wordcloud = WordCloud(width=800, height=400, background_color='white').generate(reviews_text)
plt.figure(figsize=(12, 6))
plt.imshow(wordcloud, interpolation='bilinear')
plt.title('Word Cloud of Reviews')
plt.axis('off')
plt.show()
Conclusion
We’ve come to the end of our ChatGPT review journey.
Main Takeaways:
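- Ratings skew high: 14,008 of the 18,214 reviews in the cleaned sample are 5-star, and the mean score across all 113,909 reviews is about 4.49.
- Engagement is sparse: 17,444 of those 18,214 reviews have zero thumbs up, even though the maximum is 1,193.
- The models lean on the majority class: both the decision tree and the random forest report 0.76 accuracy, with 0.99 recall on 5-star reviews and zero recall on 3- and 4-star reviews.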
User feedback reveals the ups and downs of ChatGPT. It serves as a useful assistant and a fun chat companion. These observations shed light on the future of AI and how it’s enhancing our online interactions.
Thanks for being a part of this journey. More AI adventures coming soon!