Welcome to the world of Python! Beautiful is better than ugly!
Hey folks!
Welcome back. I am pursuing a further specialization in business analytics, and this blog is about working with Python on the RadishSurvey data set. Just sit back, relax, and run through the simple steps to find some amazing analysis ;p It's like reading a story.
We all love food. In today's analysis we have a survey of people who like radishes of different varieties. We would like to understand who loves which variety, plus a few more insights.
OK! Ready? Let's start.
1) Data Loading
Today let us understand how to work with strings. We have a data set with 300 lines of survey data; each line consists of a name and a radish variety.
Source of data: http://opentechschool.github.io/python-data-intro/files/radishsurvey.txt
2) Reading
the data
Data
can be read through the following code below:
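Here is a minimal sketch of the reading step. It assumes each line looks like "Name - Variety" with a hyphen and a space on each side; the sample lines and the helper names parse_line and print_votes are made up for illustration and are not taken from the real survey file.

```python
# Made-up sample lines in the assumed "Name - Variety" format.
sample_lines = [
    "Person A - White Icicle\n",
    "Person B - Red King\n",
]


def parse_line(line):
    """Split one survey line into a (name, variety) pair."""
    # strip() drops the trailing newline before we split on the hyphen.
    parts = line.strip().split(" - ")
    return parts[0], parts[1]


def print_votes(lines):
    """Print one 'name voted for variety' message per survey line."""
    for line in lines:
        name, vote = parse_line(line)
        print(name + " voted for " + vote)


print_votes(sample_lines)

# With the real file, change the path to wherever you saved it:
# with open("radishsurvey.txt") as survey:
#     print_votes(survey)
```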
Brief understanding: we read the survey data and, using the split function, split each line into the name and the variety at the hyphen (-). We then print the result as 'name' voted for 'Radish Variety'. Please see the snapshot below for the same. Do not forget to change the file path in your code when you try it yourself.
Suppose we want the names of, and the number of votes from, the people who chose the White Icicle variety.
Code: for counting and listing the names of the people who like the White Icicle radish. We read the survey using the open statement and separate the data with the split function. Compared to the earlier scenario, this time we make use of multiple assignment: the statement name, vote = parts assigns the two items in parts to name and vote in one go.
Meanwhile, the strip() function removes the newline (and any other surrounding whitespace) from the original line. Suppose the line was "hello-mam\n"; it would become "hello-mam".
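A tiny sketch of these two ideas on their own. The survey-style string is a made-up example, and the " - " separator is an assumption about the file format.

```python
# strip() removes leading and trailing whitespace, including the newline.
line = "hello-mam\n"
cleaned = line.strip()
print(repr(cleaned))  # 'hello-mam'

# Multiple assignment: the two items in parts go to name and vote.
parts = "Sample Person - White Icicle".split(" - ")
name, vote = parts
print(name)
print(vote)
```

Note that multiple assignment only works here because split gives back exactly two items; a line with a different number of pieces would raise a ValueError.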
In the code we use an if statement to compare the vote with "White Icicle". If they are the same, we increment the count variable by 1; count was initialized to 0 at the start of the program.
We get the final count of people who like White Icicle through the print function.
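A minimal sketch of this counting step. The sample lines are made up, the "Name - Variety" format is an assumption, and count_white_icicle is a name chosen just for this illustration.

```python
def count_white_icicle(lines):
    """Print each White Icicle fan and return how many there are."""
    count = 0  # start the tally at zero
    for line in lines:
        line = line.strip()
        name, vote = line.split(" - ")
        if vote == "White Icicle":
            print(name + " likes White Icicle!")
            count = count + 1
    return count


# Made-up sample lines, not the real survey data.
sample_lines = [
    "Person A - White Icicle\n",
    "Person B - Red King\n",
    "Person C - White Icicle\n",
]

total = count_white_icicle(sample_lines)
print("White Icicle votes in the sample:", total)
```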
As seen, there are 59 people who like White Icicle.
Moving forward, we know there are a lot of radish varieties. The major problem is that, to get the count for any other variety, we need to repeat the above code and change the name of the variety. Such a tedious task! So what do we do? Follow step 4.
4) Making a generic function to count the number of people who like the different varieties of radishes
You can change the argument to whichever variety you want the count for.
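A minimal sketch of such a generic function. The name count_votes and its parameters are assumptions made for this illustration, and the sample lines are made up.

```python
def count_votes(lines, radish):
    """Return how many lines voted for the given radish variety."""
    count = 0
    for line in lines:
        name, vote = line.strip().split(" - ")
        if vote == radish:
            count = count + 1
    return count


# Made-up sample lines, not the real survey data.
sample_lines = [
    "Person A - White Icicle\n",
    "Person B - Red King\n",
    "Person C - White Icicle\n",
]

print(count_votes(sample_lines, "White Icicle"))
print(count_votes(sample_lines, "Red King"))
```

The variety is now just an argument, so the same loop serves every variety instead of being copied and edited each time.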
What is the problem here? Scratch your head!
Actually, unless you know the name of a variety, you cannot find its count. Memorizing all the names is a tedious and difficult task, so we need an alternative: we create a dictionary that stores every variety name along with its count.
5) Counting the votes for all the categories
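A minimal sketch of counting every variety with a dictionary, so that no variety name has to be known in advance. The function name count_all_votes and the sample lines are made up for illustration.

```python
def count_all_votes(lines):
    """Return a dictionary mapping each variety to its number of votes."""
    counts = {}
    for line in lines:
        name, vote = line.strip().split(" - ")
        if vote not in counts:
            counts[vote] = 1  # first vote seen for this variety
        else:
            counts[vote] = counts[vote] + 1
    return counts


# Made-up sample lines; note the two spellings of the same variety.
sample_lines = [
    "Person A - White Icicle\n",
    "Person B - Red King\n",
    "Person C - red King\n",
]

counts = count_all_votes(sample_lines)
for variety in counts:
    print(variety + ": " + str(counts[variety]))
```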
Moving ahead, there are a few problems with the code. For example, "Red King" and "red King" are counted as two different categories when they are actually one, so we need to clean the data. We use a function named str.capitalize(), which makes the first letter of a string uppercase and the rest lowercase, so both spellings become "Red king". Find the code below.
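A minimal sketch of the cleaning step, assuming stray spaces and inconsistent capitals are the only problems to fix. The helper name clean_vote is made up for illustration.

```python
def clean_vote(vote):
    """Normalize a variety name so different spellings compare equal."""
    # strip() trims stray spaces; capitalize() uppercases the first
    # letter and lowercases everything after it.
    return vote.strip().capitalize()


print(clean_vote("Red King"))   # Red king
print(clean_vote("red King"))   # Red king
print(clean_vote(" red king "))  # Red king
```

Calling clean_vote on each vote before it goes into the dictionary puts all of these spellings in a single category.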
We are getting close to the end of the solution.
We still have an issue with double voting. Please find below the code to tackle it.
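A minimal sketch of one way to handle double voting. It assumes that only a person's first vote should count and that later votes from the same name are ignored; the function name count_unique_votes and the sample lines are made up for illustration.

```python
def count_unique_votes(lines):
    """Count one vote per person and report who tried to vote again."""
    counts = {}
    voted = set()        # names we have already seen
    repeat_voters = []   # names that showed up more than once
    for line in lines:
        name, vote = line.strip().split(" - ")
        name = name.strip()
        vote = vote.strip().capitalize()
        if name in voted:
            # Assumption: the first vote stands, later ones are skipped.
            repeat_voters.append(name)
            continue
        voted.add(name)
        counts[vote] = counts.get(vote, 0) + 1
    return counts, repeat_voters


# Made-up sample lines: Person A votes twice.
sample_lines = [
    "Person A - White Icicle\n",
    "Person B - red King\n",
    "Person A - Red King\n",
    "Person C - Red King\n",
]

counts, repeat_voters = count_unique_votes(sample_lines)
print(counts)
print(repeat_voters)
```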
This shows us which people voted twice and gives a fresh count for each radish category.
6) Find out the winner
Everyone is interested in finding the winner. The Winner Stands Alone! That is one of the famous books by Paulo Coelho. Let's see the code below:
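A minimal sketch of a winner-finding loop. The counts in sample_counts are made up, and the starting values "No winner" and 0 are assumptions; with the real data you would pass in the cleaned dictionary of counts.

```python
def find_winner(counts):
    """Return the variety with the most votes and its vote count."""
    winner_name = "No winner"
    winner_votes = 0
    for name in counts:
        # A strictly higher count replaces the current winner, so on a
        # tie the variety seen first keeps the title.
        if counts[name] > winner_votes:
            winner_votes = counts[name]
            winner_name = name
    return winner_name, winner_votes


# Made-up counts, not the real survey results.
sample_counts = {"Variety A": 2, "Variety B": 5, "Variety C": 3}

winner_name, winner_votes = find_winner(sample_counts)
print("The winner is " + winner_name + " with " + str(winner_votes) + " votes")
```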
The loop shown above keeps track of one name, winner_name, and the number of votes cast for it. On each iteration through the loop it checks whether there is a name with more votes than the current winner, and updates the winner if so.
Please check the second blog to find out how we worked with charts on the same data set.
Happy Analytics :)