Commit b51046bb authored by rtugade

Updated css for better printing

parent 8ce8f65b
......@@ -4,90 +4,185 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Bike Sharing Hourly Count Prediction Using Machine Learning Models"
"<p style=\"text-align:center; font-size: 24px; font-family: Times New Roman, Times, serif\">\n",
"<b>Analyzing Bike Sharing Systems: A Comparison of Machine Learning Models</b>\n",
"</p>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Prince Joseph Erneszer Javier\n",
"<br>Reynaldo Tugade Jr.\n",
"<br>MS Data Science\n",
"<br>Asian Institute of Management"
"<p style=\"text-align:center; font-size: 14px; font-family: Times New Roman, Times, serif\">\n",
"Prince Joseph Erneszer Javier, Reynaldo Tugade Jr.\n",
"</p>\n",
"\n",
"<p style=\"text-align:center; font-size: 12px; font-family: Times New Roman, Times, serif\">\n",
"MS Data Science<br>\n",
"Asian Institute of Management\n",
"</p>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Abstract\n",
"<div style=\"width: 90%\">\n",
"<p style=\"text-align:left; font-size: 20px; font-family: Times New Roman, Times, serif;\">\n",
"Abstract\n",
"</p>\n",
"\n",
"<p style=\"text-align:justify\">\n",
"<p style=\"text-align:justify; font-family: Times New Roman, Times, serif\">\n",
"In performing data analysis, a common task is to search the most appropriate algorithm(s) to best resemble a given system. In this report, we demonstrate the suitability of using a neural network in predicting the potential number of users using combined historical rental and weather information. The idea is to augment previous machine learning models and discover the possibility of getting better test accuracy. We used K-Nearest Neighbor, Linear Regression, Ridge Regression, Lasso Regression, Linear Support Vector Machine, Decision Trees, Random Forest, and Gradient Boosting Method as baseline models for machine learning. We used a 3 layer fully-connected feed-forward network with 56 hidden nodes. This report shows that such configuration works well the most with 74.6% accuracy compared to GBM and RF with 73.4% and 72.7% respectively. \n",
"</p>"
"</p>\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction\n",
"<div style=\"width: 90%\">\n",
"\n",
"<p style=\"text-align:left; font-size: 20px; font-family: Times New Roman, Times, serif\">\n",
"Introduction\n",
"</p>\n",
"\n",
"<p style=\"text-align:justify\">\n",
"<p style=\"text-align:justify; font-family: Times New Roman, Times, serif\">\n",
"Bicycle-sharing systems are paid services where an individual can rent an available bicycle on a short-term basis. Used to be considered as a service only available in small and closed communities (e.g. campuses, private subdivisions), bicycle-sharing systems are becoming mainstream modes for public-transport in several countries. Few of these systems include Paris' \"Vellib\" which started operating in 2005, Hangzhou's bicycle hub in China which houses more than 50,000 bicycles and even locally with Asian Development Bank's (ADB) Sustainable Transport Initiative program. \n",
"</p>\n",
"\n",
"<p style=\"text-align:justify\">\n",
"<p style=\"text-align:justify; font-family: Times New Roman, Times, serif\">\n",
"From a sustainability perspective, bicycle-sharing systems have its benefits. One, it promotes better flexible mobility. Bicycle stations can be placed anywhere especially in areas where there's a perceived concentration of people traveling. Second, it impacts emission reduction due to no fuel use and reduces congestion. Third, it's relatively cheap and very convenient specifically since it helps improve multimodal transport connections. \n",
"</p>\n",
"\n",
"<p style=\"text-align:justify\">\n",
"In this regard, we wish to explore interesting behavior found in bicycle-sharing systems. The richness of data involved in bicycle-sharing systems can provide more information from a mobility sensing perspective. In this report, we will explore potential users based on previous rentals and weather data. We will leverage on learned Machine Learning and Neural network techniques to contrast and compare which among these models best resemble the bicycle-sharing system.</p>"
"<p style=\"text-align:justify; font-family: Times New Roman, Times, serif\">\n",
"In this regard, we wish to explore interesting behavior found in bicycle-sharing systems. The richness of data involved in bicycle-sharing systems can provide more information from a mobility sensing perspective. In this report, we will explore potential users based on previous rentals and weather data. We will leverage on learned Machine Learning and Neural network techniques to contrast and compare which among these models best resemble the bicycle-sharing system.</p>\n",
"\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data\n",
"\n",
"The original dataset comes from [Capital Bikeshare](https://archive.ics.uci.edu/ml/datasets/bike+sharing+dataset) which is compiled by the Laboratory of Artificial Intelligence and Decision Support (LIAAD), University of Porto and placed in [UCL](https://archive.ics.uci.edu/ml/datasets/bike+sharing+dataset). The dataset contains time-series rental and weather information. The following list consists of the available features found in the dataset:\n",
"\n",
"|Feature Variable|Possible values|<p align=\"left\">Description</p>|\n",
"|:-|:-|:-|\n",
"|instant|0to17378|<p align=\"left\">Index of the record</p>|\n",
"|dteday|2011-01-01 to 2012-12-31|<p align=\"left\">Date</p>|\n",
"|season|1,2,3,4|<p align=\"left\">Season (Spring,Summer,Fall,Winter)</p>|\n",
"|yr|0,1|<p align=\"left\">Year of occurrence</p>|\n",
"|mnth|1,2,3,...,12|<p align=\"left\">Month</p>|\n",
"|hr|0,1,2,...,23|<p align=\"left\">Hour</p>|\n",
"|holiday|0,1|<p align=\"left\">Whether current day is a holiday.</p>|\n",
"|weekday|0,1,2,...,6|<p align=\"left\">Day of the week</p>|\n",
"|weathersit|1,2,3,4|<p align=\"left\">Weather information based meteorological events</p>|\n",
"|_|_|<p align=\"left\">(1) Clear,Fewclouds,Partlycloudy</p>|\n",
"|_|_|<p align=\"left\">(2) Misty plus still generally cloudy environment</p>|\n",
"|_|_|<p align=\"left\">(3) Light Snow,Light Rain with occasional Thunderstorms,Light Rain with scattered clouds</p>|\n",
"|_|_|<p align=\"left\">(4) Heavy Rain with Ice pellets,Thunderstorm with Mist,Snow with Fog</p>|\n",
"|temp|0.02 to 1.00|<p align=\"left\">Normalized feeling temperature in Celsius</p>|\n",
"|atemp|0.0000 to 1.0000|<p align=\"left\">Normalized feeling temperature in Celsius</p>|\n",
"|hum|0.00 to 1.00|<p align=\"left\">Normalized humidity</p>|\n",
"|windspeed|0.0000 to 0.8507|<p align=\"left\">Normalized windspeed</p>|\n",
"|casual|0 to 367|<p align=\"left\">Count of casual users</p>|\n",
"|registered|0 to 886|<p align=\"left\">Count of registered users</p>|\n",
"|cnt|1 to 977|<p align=\"left\">Count of total rental bikes including both casual and registered</p>|\n",
"\n",
"* Temperature variable _(temp)_ is computed using the following equation: \n",
"<div style=\"width: 90%\">\n",
"\n",
"<p style=\"text-align:left; font-size: 20px; font-family: Times New Roman, Times, serif\">\n",
"Data\n",
"</p>\n",
"\n",
"<p style=\"text-align:justify; font-family: Times New Roman, Times, serif\">\n",
"The original dataset comes from the Capital Bikeshare website which is compiled by the Laboratory of Artificial Intelligence and Decision Support (LIAAD), University of Porto and placed in UCL. The dataset contains time-series rental and weather information. The following list consists of the available features found in the dataset:\n",
"</p>\n",
"<br>\n",
"<div style=\"overflow-x:auto;\">\n",
" <table>\n",
" <tr>\n",
" <td><p style=\"text-align: left\"><b>Feature Variable</b></p></td>\t\n",
" <td><p style=\"text-align: left\"><b>Possible Values</b></p></td>\t\n",
" <td><p style=\"text-align: left\"><b>Description</b></p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">instant</p></td>\t\n",
" <td><p style=\"text-align: left\">0 to 17378</p></td>\t\n",
" <td><p style=\"text-align: left\">Index of the record</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">dteday</p></td>\t\n",
" <td><p style=\"text-align: left\">2011-01-01 to 2012-12-31</p></td>\n",
" <td><p style=\"text-align: left\">Date</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">season</p></td>\n",
" <td><p style=\"text-align: left\">1,2,3,4</p></td>\t\n",
" <td><p style=\"text-align: left\">Season (Spring, Summer, Fall, Winter)</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">yr</p></td>\n",
" <td><p style=\"text-align: left\">0,1</p></td>\t\n",
" <td><p style=\"text-align: left\">Year of occurrence</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">mnth</p></td>\t\n",
" <td><p style=\"text-align: left\">1,2,3,...,12</p></td>\t\n",
" <td><p style=\"text-align: left\">Month</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">hr</p></td>\t\n",
" <td><p style=\"text-align: left\">0,1,2,...,23</p></td>\t\n",
" <td><p style=\"text-align: left\">Hour</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">holiday</p></td>\t\n",
" <td><p style=\"text-align: left\">0,1</p></td>\t\n",
" <td><p style=\"text-align: left\">Whether current day is a holiday. Based from (dc.gov)[Holidays]</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">weekday</p></td>\n",
" <td><p style=\"text-align: left\">0,1,2,...,6</p></td>\t\n",
" <td><p style=\"text-align: left\">Day of the week</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">weathersit</p></td>\t\n",
" <td><p style=\"text-align: left\">1,2,3,4</p></td>\t\n",
" <td><p style=\"text-align: left\">Weather information based meteorological events</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">_</p></td>\t\n",
" <td><p style=\"text-align: left\">_</p></td>\t\n",
" <td><p style=\"text-align: left\">(1) Clear, Few clouds, Partly cloudy</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">_</p></td>\n",
" <td><p style=\"text-align: left\">_</p></td>\t\n",
" <td><p style=\"text-align: left\">(2) Misty plus still generally cloudy environment</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">_</p></td>\n",
" <td><p style=\"text-align: left\">_</p></td>\t\n",
" <td><p style=\"text-align: left\">(3) Light Snow, Light Rain with occasional Thunderstorms,Light Rain with scattered clouds</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">_</p></td>\n",
" <td><p style=\"text-align: left\">_</p></td>\n",
" <td><p style=\"text-align: left\">(4) Heavy Rain with Ice pellets, Thunderstorm with Mist, Snow with Fog</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">temp</p></td>\n",
" <td><p style=\"text-align: left\">0.02 to 1.00</p></td>\n",
" <td><p style=\"text-align: left\">Normalized feeling temperature in Celsius.</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">atemp</p></td>\t\n",
" <td><p style=\"text-align: left\">0.0000 to 1.0000</p></td>\t\n",
" <td><p style=\"text-align: left\">Normalized feeling temperature in Celsius. </p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">humz</p></td>\t\n",
" <td><p style=\"text-align: left\">0.00 to 1.00</p></td>\t\n",
" <td><p style=\"text-align: left\">Normalized humidity</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">windspeed</p></td>\t\n",
" <td><p style=\"text-align: left\">0.0000 to 0.8507</p></td>\t\n",
" <td><p style=\"text-align: left\">Normalized windspeed</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">casual</p></td>\t\n",
" <td><p style=\"text-align: left\">0 to 367</p></td>\t\n",
" <td><p style=\"text-align: left\">Count of casual users</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">registered</p></td>\t\n",
" <td><p style=\"text-align: left\">0 to 886</p></td>\t\n",
" <td><p style=\"text-align: left\">Count of registered users</p></td>\n",
" </tr><tr>\n",
" <td><p style=\"text-align: left\">cnt</p></td>\n",
" <td><p style=\"text-align: left\">1 to 977</p></td>\t\n",
" <td><p style=\"text-align: left\">Count of total rental bikes including both casual and registered</p></td>\n",
" </tr>\n",
" </table>\n",
"</div>\n",
"\n",
"<p style=\"text-align:justify; font-family: Times New Roman, Times, serif\">\n",
"* Temperature variable <i>(temp)</i> is computed using the following equation: \n",
"</p>\n",
"\n",
"\\begin{equation}\n",
"\\frac{t - t_{min}}{t_{max} - t_{min}}, t_{min}= -8, t_{max}= +39\n",
"\\end{equation}\n",
"\n",
"<p style=\"text-align:justify; font-family: Times New Roman, Times, serif\">\n",
"* Absolute temperature variable _(atemp)_: \n",
"</p>\n",
"\n",
"\\begin{equation}\n",
"\\frac{t - t_{min}}{t_{max} - t_{min}}, t_{min}= -16, t_{max}=+50\n",
"\\end{equation}"
"\\end{equation}\n",
"\n",
"</div>"
]
},
{
......@@ -200,7 +295,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## References"
"## References\n",
"\n",
"https://archive.ics.uci.edu/ml/datasets/bike+sharing+dataset"
]
},
{
......
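
The min-max scaling given for `temp` and `atemp` in the Data cell can be illustrated with a short sketch. The bounds below are taken from the two equations in that cell; the function names `min_max_scale` and `min_max_unscale` and the sample temperatures are hypothetical, chosen only for illustration, and are not part of the notebook.

```python
# Hypothetical helpers illustrating the min-max scaling from the Data cell.
# Bounds (in degrees Celsius) come from the two equations in that cell.
TEMP_RANGE = (-8.0, 39.0)    # t_min, t_max for temp
ATEMP_RANGE = (-16.0, 50.0)  # t_min, t_max for atemp


def min_max_scale(t, t_min, t_max):
    """Map a raw value t in [t_min, t_max] onto [0, 1]."""
    return (t - t_min) / (t_max - t_min)


def min_max_unscale(x, t_min, t_max):
    """Invert the scaling: recover the raw value from a normalized x."""
    return x * (t_max - t_min) + t_min


# Illustrative sample values, not taken from the dataset.
temp_norm = min_max_scale(15.5, *TEMP_RANGE)     # (15.5 + 8) / 47
atemp_norm = min_max_scale(17.0, *ATEMP_RANGE)   # (17 + 16) / 66
temp_raw = min_max_unscale(temp_norm, *TEMP_RANGE)

print(temp_norm, atemp_norm, temp_raw)
```

Inverting the scaling this way may help when interpreting model inputs, since the normalized columns can be read back as degrees Celsius.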