ML Hyper-Trainer

gamified machine learning

6 Challenges

MVP

Sentiment Analysis 🎭 · Target: 83% accuracy

Data Preview

Features

Scaling

Train

Dataset Overview

25,000 rows

Total columns: 19

Numeric features: 13

Columns with missing values: 3

Columns with outliers: 7

Data Quality Issues Detected

Review Text: 0.2% missing values

Avg Word Sentiment: 2.1% missing values

User Rating: 15.5% missing values

Word Count: contains outliers

Exclamation Count: contains outliers

Question Count: contains outliers

ALL CAPS Ratio: contains outliers

Review Age (Days): contains outliers

Helpful Votes: contains outliers

Total Votes: contains outliers
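The checks behind a report like the one above can be sketched in a few lines of Python. This is an illustration only: the page does not say how it detects outliers, so the 1.5 × IQR rule used here is an assumption, and the `helpful_votes` sample is made-up data, not rows from the dataset.

```python
import statistics

def missing_rate(values):
    """Fraction of entries that are None."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR].

    The IQR rule is an assumed stand-in for whatever rule the page uses.
    """
    present = sorted(v for v in values if v is not None)
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in present if v < low or v > high]

# Hypothetical toy column: mostly small counts, one extreme value, one gap.
helpful_votes = [0, 1, 2, 3, 1, None, 2, 4, 250, 1]

print(f"missing: {missing_rate(helpful_votes):.1%}")
print("outliers:", iqr_outliers(helpful_votes))
```

A column would be reported as "contains outliers" when the second function returns a non-empty list, and with a missing percentage when the first returns a value above zero.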

| Column | Description | Type | Sample Values | Distribution | Missing | Outliers | Importance |
|---|---|---|---|---|---|---|---|
| Sentiment (TARGET) | Target: 0 = Negative, 1 = Positive | Target | 1, 0 | | None | No | |
| Review Text | Raw movie review text (bag-of-words TF-IDF) | Text | "loved it", "boring" | | 0.2% | No | 92% |
| Word Count | Number of words in review | Numeric | 12864245 | μ=230 σ=174 | None | Yes | 41% |
| TF-IDF Top Features | Top 5000 TF-IDF weighted word features | Numeric | 0.4200.71 | μ=0.08 σ=0.21 | None | No | 87% |
| Avg Word Sentiment | Average lexicon sentiment score of words | Numeric | 0.78, -0.62, 0.85 | μ=0.12 σ=0.54 | 2.1% | No | 69% |
| Exclamation Count | Number of '!' in review | Numeric | 037 | μ=0.8 σ=2.3 | None | Yes | 28% |
| Question Count | Number of '?' in review | Numeric | 012 | μ=0.4 σ=1.1 | None | Yes | 15% |
| ALL CAPS Ratio | Percentage of words in ALL CAPS | Numeric | 0.01, 0.15, 0.05 | μ=0.04 σ=0.08 | None | Yes | 34% |
| Positive Adjectives | Count of positive adjectives | Numeric | 305 | μ=2.5 σ=2.1 | None | No | 72% |
| Negative Adjectives | Count of negative adjectives | Numeric | 041 | μ=2.1 σ=2.4 | None | No | 76% |
| User Rating | Optional star rating left by user | Numeric | 5, 1, 4 | μ=3.4 σ=1.4 | 15.5% | No | 95% |
| Is Verified Purchase | Whether reviewer bought the product | Binary | 1, 0 | | None | No | 11% |
| Review Age (Days) | Days since review was posted | Numeric | 1436542 | μ=420 σ=650 | None | Yes | 2% |
| Helpful Votes | Number of people who found review helpful | Numeric | 12045 | μ=5.2 σ=45.3 | None | Yes | 25% |
| Total Votes | Total votes on the review | Numeric | 15250 | μ=8.4 σ=52.1 | None | Yes | 18% |
| Has Spoilers | User flagged review as containing spoilers | Binary | 0, 1 | | None | No | 8% |
| Flesch Reading Ease | Readability score of text | Numeric | 75, 82, 65 | μ=72 σ=15 | None | No | 14% |
| Random Tag 1 | Meaningless metadata tag | Categorical | XYZ | | None | No | 0% |
| Random Tag 2 | Meaningless metadata tag | Categorical | ABC | | None | No | 0% |

💡 Review the data carefully: understanding your features helps you make better preprocessing choices.

── PIPELINE SCORE ────

C (71/100)

Accuracy modifier: ×1.05

Features: 100

Scaling: 65

Outliers: 30

Architect: 75

Remove low-importance features (<25%) to reduce noise.

Some features are highly skewed; try Log or Sqrt normalization.

You have outlier columns; consider clipping or imputing them.
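The three hints above map onto small preprocessing steps, sketched here in plain Python. The importance scores are the ones listed for each column on this page; everything else is an assumption made for illustration: the function names are hypothetical, the clipping bounds are picked by hand, and the page does not document how its own Log, clipping, or imputation options are implemented.

```python
import math
import statistics

# Importance scores (%) as listed for each column on this page.
IMPORTANCE = {
    "Review Text": 92, "Word Count": 41, "TF-IDF Top Features": 87,
    "Avg Word Sentiment": 69, "Exclamation Count": 28, "Question Count": 15,
    "ALL CAPS Ratio": 34, "Positive Adjectives": 72, "Negative Adjectives": 76,
    "User Rating": 95, "Is Verified Purchase": 11, "Review Age (Days)": 2,
    "Helpful Votes": 25, "Total Votes": 18, "Has Spoilers": 8,
    "Flesch Reading Ease": 14, "Random Tag 1": 0, "Random Tag 2": 0,
}

def select_features(importance, threshold=25):
    """Drop features whose importance is below `threshold` percent."""
    return [name for name, score in importance.items() if score >= threshold]

def log_transform(values):
    """log(1 + x) compresses a long right tail; assumes every x >= 0."""
    return [math.log1p(v) for v in values]

def clip(values, low, high):
    """Pull values outside [low, high] back to the nearest bound."""
    return [min(max(v, low), high) for v in values]

def median_impute(values):
    """Replace None entries with the median of the observed values."""
    median = statistics.median(v for v in values if v is not None)
    return [median if v is None else v for v in values]

kept = select_features(IMPORTANCE)
print(len(kept), "features kept:", kept)

# Hypothetical sample values, not taken from the dataset.
print(log_transform([0, 9, 99]))
print(clip([-5, 3, 250], low=0, high=10))
print(median_impute([5, None, 1, 4]))
```

With the 25% cut-off applied as "strictly below 25% is removed", Helpful Votes (25%) is kept and ten features remain. Whether the page treats the boundary the same way is not stated.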

Step 1 of 3