ML Hyper-Trainer

gamified machine learning

6 Challenges

MVP

Titanic Survival 🚢 · Target: 80% accuracy

Data Preview

Features

Scaling

Train

Dataset Overview

Rows: 891
Total columns: 23
Numeric features: 10
Columns with missing values: 7
Columns with outliers: 7

Data Quality Issues Detected

• Age: 19.9% missing values
• Fare: 0.2% missing values
• Embarked: 0.2% missing values
• Cabin: 77.1% missing values
• Fare per Person: 0.2% missing values
• Deck: 77.1% missing values
• Age Group: 19.9% missing values
• Age: contains outliers
• SibSp: contains outliers
• Parch: contains outliers
• Fare: contains outliers
• Family Size: contains outliers
• Fare per Person: contains outliers
• Name Length: contains outliers
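The two kinds of issue listed above can be reproduced with a few lines of code. The sketch below is illustrative only: the page does not say how it flags outliers, so the 1.5 × IQR rule used here is an assumption, and the helper names `missing_pct` and `has_outliers` are hypothetical. The data are toy values, not the real columns.

```python
from statistics import quantiles


def missing_pct(values):
    """Percentage of entries that are None, rounded to one decimal place."""
    return round(100 * sum(v is None for v in values) / len(values), 1)


def has_outliers(values, k=1.5):
    """True if any non-missing value lies outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the IQR rule stands in for whatever test the app really uses.
    """
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return any(v < low or v > high for v in present)


# Toy column with two missing entries out of ten.
toy_age = [22, 38, 26, None, 35, 54, 2, 27, 14, None]
print(missing_pct(toy_age))  # 20.0

# Toy column with one value far from the rest.
toy_fare = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(has_outliers(toy_fare))  # True
```

As a sanity check on the arithmetic, 177 missing entries out of 891 rows would round to the 19.9% shown for Age.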

| Column | Description | Type | Sample Values | Distribution | Missing | Outliers | Importance |
|---|---|---|---|---|---|---|---|
| Survived (TARGET) | Target: 0 = No, 1 = Yes | Target | 0, 1 | - | None | No | - |
| Pclass | Ticket class (1st, 2nd, 3rd) | Categorical | 1, 3 | μ=2.3 σ=0.84 | None | No | 72% |
| Sex | Passenger gender | Binary | male, female | - | None | No | 88% |
| Age | Age in years | Numeric | 22, 38, 26 | μ=29.7 σ=14.5 | 19.9% | Yes | 58% |
| SibSp | # siblings / spouses aboard | Numeric | 0, 1 | μ=0.52 σ=1.1 | None | Yes | 31% |
| Parch | # parents / children aboard | Numeric | 0, 1 | μ=0.38 σ=0.81 | None | Yes | 22% |
| Fare | Passenger fare (USD) | Numeric | 7.25, 71.28, 7.92 | μ=32.2 σ=49.7 | 0.2% | Yes | 65% |
| Embarked | Port of embarkation (C/Q/S) | Categorical | S, C, Q | - | 0.2% | No | 19% |
| Cabin | Cabin number (mostly missing) | Categorical | C85, C123 | - | 77.1% | No | 8% |
| Ticket | Ticket number (high cardinality) | Text | A/5 21171, PC 17599, STON/O2 | - | None | No | 5% |
| Is Alone | 1 if traveling alone, 0 otherwise | Binary | 0, 1 | - | None | No | 15% |
| Family Size | Total family members aboard | Numeric | 0, 1, 2 | μ=1.9 σ=1.6 | None | Yes | 35% |
| Title | Passenger title (Mr, Mrs, Miss, etc.) | Categorical | Mr, Mrs, Miss | - | None | No | 65% |
| Fare per Person | Fare divided by family size | Numeric | 3.6, 35.6, 7.9 | μ=19.9 σ=35.6 | 0.2% | Yes | 45% |
| Deck | Extracted from Cabin | Categorical | ?, C | - | 77.1% | No | 25% |
| Ticket Length | Length of ticket string | Numeric | 9, 8, 7 | μ=6.8 σ=2.7 | None | No | 5% |
| Name Length | Length of passenger name | Numeric | 23, 51, 22 | μ=26.9 σ=9.2 | None | Yes | 12% |
| Random Noise 1 | Randomly generated noise feature | Numeric | 0.4, 0.1, 0.9 | μ=0.5 σ=0.28 | None | No | 1% |
| Random Noise 2 | Randomly generated noise feature | Numeric | 12, 45, 88 | μ=50 σ=28 | None | No | 2% |
| Random Noise 3 | Random categorical feature | Categorical | A, B, C | - | None | No | 0% |
| Age Group | Age binned into categories | Categorical | Young Adult, Adult | - | 19.9% | No | 45% |
| Is Child | 1 if Age < 16, else 0 | Binary | 0, 1 | - | None | No | 38% |
| Has Cabin | 1 if Cabin is known, else 0 | Binary | 0, 1 | - | None | No | 28% |
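Several columns in the table are derived from the raw ones. The sketch below shows one plausible way to compute them for a single passenger record. It is not the app's actual code: the function name `derive_features` is hypothetical, the formula Family Size = SibSp + Parch + 1 is an assumption (it is consistent with the listed means, 0.52 + 0.38 + 1 = 1.9), and the title regex assumes names in the "Surname, Title. Given names" form. A raw `Name` field is also assumed, since the table only lists Name Length and Title.

```python
import re


def derive_features(row):
    """Return derived columns for one passenger record given as a dict."""
    # Assumption: family size counts the passenger plus relatives aboard.
    family_size = row["SibSp"] + row["Parch"] + 1
    cabin = row.get("Cabin")
    age = row.get("Age")
    fare = row.get("Fare")
    # Assumption: the title sits between the comma and the first period.
    title = re.search(r",\s*([^.]+)\.", row["Name"])
    return {
        "Family Size": family_size,
        "Is Alone": int(family_size == 1),
        "Title": title.group(1).strip() if title else None,
        "Fare per Person": None if fare is None else fare / family_size,
        "Deck": cabin[0] if cabin else "?",  # "?" marks an unknown cabin
        "Has Cabin": int(bool(cabin)),
        "Is Child": int(age is not None and age < 16),
        "Name Length": len(row["Name"]),
        "Ticket Length": len(row["Ticket"]),
    }


passenger = {
    "Name": "Braund, Mr. Owen Harris",
    "Age": 22, "SibSp": 1, "Parch": 0,
    "Ticket": "A/5 21171", "Fare": 7.25, "Cabin": None,
}
features = derive_features(passenger)
print(features["Title"], features["Family Size"], features["Fare per Person"])
# Mr 2 3.625
```

Treating a missing Age as "not a child" keeps Is Child free of missing values, which matches the table; a different pipeline might prefer to propagate the missing value instead.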

💡 Review the data carefully: understanding your features helps you make better preprocessing choices.

── PIPELINE SCORE ──

Grade: C
Score: 71/100
Accuracy modifier: ×1.05

Features: 100
Scaling: 65
Outliers: 30
Architect: 75

⚡ Remove low-importance features (<25%) to reduce noise.

⚡ Some features are highly skewed; try Log or Sqrt normalization.

⚡ You have outlier columns; consider clipping or imputing them.
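The three tips above correspond to simple preprocessing steps. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the importance values are a subset copied from the table, the fare values are toy numbers, and the clipping bounds are arbitrary. The page does not say how the app itself implements these steps.

```python
import math

# Importance percentages copied from the table (subset only).
importance = {
    "Sex": 88, "Pclass": 72, "Fare": 65, "Age": 58, "Deck": 25,
    "Embarked": 19, "Cabin": 8, "Random Noise 3": 0,
}


def keep_important(scores, threshold=25):
    """Keep features whose importance is at least `threshold` percent."""
    return [name for name, pct in scores.items() if pct >= threshold]


def log_transform(values):
    """log(1 + x) compresses a long right tail; requires x >= 0."""
    return [math.log1p(v) for v in values]


def sqrt_transform(values):
    """Square root is a milder alternative; also requires x >= 0."""
    return [math.sqrt(v) for v in values]


def clip(values, low, high):
    """Clamp every value into [low, high]."""
    return [min(max(v, low), high) for v in values]


print(keep_important(importance))
# ['Sex', 'Pclass', 'Fare', 'Age', 'Deck']

toy_fares = [7.25, 71.28, 7.92, 500.0]  # the last value is made up
print(clip(toy_fares, 0.0, 100.0))
# [7.25, 71.28, 7.92, 100.0]
print([round(v, 2) for v in log_transform(toy_fares)])
```

In practice the clipping bounds would usually come from the data, for example from chosen percentiles, rather than being fixed by hand.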

Step 1 of 3