Data Analysis For Question Answering

by Ismail Ouahbi & Hamza Khalid, Jun 13th, 2022

Data analysis is a process of inspecting, cleansing, transforming, and modelling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.[1] Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, and is used in different business, science, and social science domains.[2] In today's business world, data analysis plays a role in making decisions more scientific and helping businesses operate more effectively. (Credit: Wikipedia)

Step by Step

As a starting point, I've taken the "Titanic" data set, a well-known data set that is a good place to start your data science journey. You can download it from this website:

Kaggle

Problem Introduction

introduction

Data Set description

data description

Hypothesis & Questions

Questions to answer

Data statistics

statistical measurements
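
The statistical measurements appear in the original only as a screenshot. As a minimal sketch, assuming the data sits in a pandas DataFrame, `describe()` reports the count, mean, standard deviation, minimum, quartiles, and maximum of each numeric column. The small `df` below is made-up illustration data that reuses the Titanic column names; it is not the real file.

```python
import pandas as pd

# Made-up rows that reuse the Titanic column names; not the real data.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0],
    "Pclass": [3, 1, 3, 2],
    "Age": [22.0, 38.0, 26.0, 35.0],
    "Fare": [7.25, 71.28, 7.92, 13.0],
})

# One row per statistic, one column per numeric feature.
stats = df.describe()
print(stats)
```

With the real data, the same call on the DataFrame loaded from the downloaded CSV would produce the equivalent table.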

Data preprocessing


Null Values

null values

Null Values (sum)

null values sum
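
The null-value check is shown above as an image only. One common way to get the same kind of summary, sketched here on made-up rows (the column names follow the Titanic data set, the values do not), is to build a boolean mask with `isnull()` and sum it per column:

```python
import numpy as np
import pandas as pd

# Made-up rows with deliberately missing entries; not the real data.
df = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, np.nan],
    "Fare": [7.25, 71.28, np.nan, 13.0],
    "Cabin": [None, "C85", None, None],
    "Pclass": [3, 1, 3, 2],
})

null_mask = df.isnull()        # True wherever a value is missing
null_counts = null_mask.sum()  # number of missing values per column
null_share = null_mask.mean()  # fraction of missing values per column
print(null_counts)
```

The per-column share is often more useful than the raw count when deciding whether to fill a column or drop it.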

Dealing With Missing Values


For Fare column

dealing with missing data(Fare column)
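
The exact treatment of `Fare` is not recoverable from the image. As an assumption for illustration, a simple option for a numeric column with few gaps is to fill them with the column median, which is less sensitive to very expensive tickets than the mean:

```python
import numpy as np
import pandas as pd

# Made-up fares with one gap; not the real data.
df = pd.DataFrame({"Fare": [7.25, 71.28, np.nan, 13.0]})

# median() skips missing values by default.
fare_median = df["Fare"].median()
df["Fare"] = df["Fare"].fillna(fare_median)
print(df)
```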

For Cabin column

dealing with missing data(Cabin column)
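
How `Cabin` was handled is again only visible in the image. When a text column is mostly empty, one possible approach (an assumption here, not necessarily the one used in the project) is to keep only the information "was a cabin recorded or not" and drop the raw values. The column name `HasCabin` is hypothetical:

```python
import pandas as pd

# Made-up cabin values; not the real data.
df = pd.DataFrame({"Cabin": [None, "C85", None, "E46"]})

# 1 if a cabin was recorded, 0 otherwise (HasCabin is a made-up name).
df["HasCabin"] = df["Cabin"].notnull().astype(int)

# Drop the sparse original column once the indicator exists.
df = df.drop(columns=["Cabin"])
print(df)
```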

For Age column

dealing with missing data(Age column)
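
For `Age`, the original step is also only an image. A sketch of one frequently used alternative to a single global value is to fill each gap with the median age of passengers in the same `Pclass`. Grouping by `Pclass` is an assumption made for this example:

```python
import numpy as np
import pandas as pd

# Made-up rows; not the real data.
df = pd.DataFrame({
    "Pclass": [1, 1, 1, 3, 3, 3],
    "Age": [38.0, np.nan, 54.0, 22.0, 26.0, np.nan],
})

# Median age of each class, aligned row by row with the original frame.
class_median = df.groupby("Pclass")["Age"].transform("median")

# Only the missing ages are replaced; known ages stay untouched.
df["Age"] = df["Age"].fillna(class_median)
print(df)
```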

Analysis Part


Heatmap(correlation matrix)

heatmap with annotations
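
The heatmap itself is an image in the original. The matrix behind such a plot can be sketched as follows, assuming pandas and made-up rows that reuse the Titanic column names. Non-numeric columns are excluded first, because a correlation is only defined for numbers:

```python
import pandas as pd

# Made-up rows; not the real data.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1],
    "Pclass": [3, 1, 2, 3, 1],
    "Fare": [7.25, 71.28, 13.0, 8.05, 53.1],
    "Name": ["a", "b", "c", "d", "e"],
})

# Keep numeric columns only, then compute pairwise Pearson correlations.
numeric = df.select_dtypes(include="number")
corr = numeric.corr()
print(corr.round(2))
```

If seaborn is installed, `seaborn.heatmap(corr, annot=True)` draws this matrix as an annotated heatmap.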

PCA Part

New dataset after PCA
PCA plot
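
The PCA results are shown above as images only. As a minimal sketch of the technique, and not necessarily the implementation used in the project, PCA can be computed with NumPy by standardizing the numeric columns and taking a singular value decomposition. The matrix `X` below is made up, and keeping two components is an assumption chosen so the result can be drawn as a 2-D plot:

```python
import numpy as np

# Made-up numeric features: rows are passengers, columns are features.
X = np.array([
    [3.0, 22.0, 7.25],
    [1.0, 38.0, 71.28],
    [3.0, 26.0, 7.92],
    [1.0, 35.0, 53.10],
    [2.0, 30.0, 13.00],
])

# Standardize each column so that no feature dominates because of its scale.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# SVD of the standardized data; the rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(X_std, full_matrices=False)

n_components = 2                    # assumption: keep two components
components = Vt[:n_components]
X_pca = X_std @ components.T        # projected data, shape (5, 2)

# Share of the total variance carried by each principal component.
explained_variance_ratio = (S ** 2) / np.sum(S ** 2)
print(X_pca.shape, explained_variance_ratio.round(3))
```

The two columns of `X_pca` are what a PCA scatter plot would show on its axes.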

Stay tuned for more advanced projects, and don't forget to download my entire project paper for more details and code (using Python and R). To do so, click the button below:

French version : download all code
English version : download all code

Thanks for reading.
