
Data Gathering

In the data science life cycle, the step after problem definition is data gathering. This project demonstrates both API retrieval and manual downloads. The use of APIs (Application Programming Interfaces) is demonstrated in both Python and R.







Python API

Python's versatility allows users to work with a variety of file types, such as .csv, .xls, .txt and .json, among others. This makes Python a natural choice for working with APIs, since it can retrieve and handle many kinds of files and data. In this section, a simple API process with Python is demonstrated.

The information shown below is from the Open Data Project of the City of Brussels. The code written to obtain this information can be found here.



This data set contains information about the location of ATMs on the territory of the City of Brussels.
Its columns include the bank that owns the ATM, the ATM's address and its coordinates. All of the information about this data set can be seen here.
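
As an illustration of the kind of request described above, here is a minimal sketch that uses only the Python standard library. It is not the project's own code. It assumes the portal exposes an Opendatasoft-style "records" endpoint that returns JSON with a "results" list; the base URL, the dataset identifier and the function names are assumptions chosen for this example and should be checked against the portal's documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: base URL of an Opendatasoft-style API on the Brussels portal.
BASE_URL = "https://opendata.brussels.be/api/explore/v2.1"
# Hypothetical placeholder; replace with the identifier shown on the portal.
DATASET_ID = "atm-dataset-id"


def build_records_url(dataset_id, limit=20, base_url=BASE_URL):
    """Build the URL that requests up to `limit` records of a data set."""
    query = urlencode({"limit": limit})
    return f"{base_url}/catalog/datasets/{dataset_id}/records?{query}"


def parse_records(payload):
    """Return the list of record dicts from a decoded JSON response.

    Assumes the records sit under a "results" key; returns an empty
    list if that key is missing.
    """
    return list(payload.get("results", []))


def fetch_records(dataset_id, limit=20):
    """Download and parse records (performs a network request)."""
    with urlopen(build_records_url(dataset_id, limit), timeout=30) as response:
        payload = json.load(response)
    return parse_records(payload)
```

Under these assumptions, calling fetch_records(DATASET_ID, limit=5) would return a list of dictionaries, one per ATM, which can then be written to a .csv file or loaded into a data frame for cleaning.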





R API

Although R is not as well suited to working with different types of files and data, it is still very capable, since it is optimized for statistical and mathematical computations.
Shown below is an example of how to use APIs in R, along with its results.


The information shown below is from the Open Data Project of the City of Brussels. The code written to obtain this information can be found here.



This data set contains information about the locations of water fountains on the territory of the City of Brussels.
Its columns include the type of water fountain, its location, and its latitude and longitude. All of the information about this data set can be seen here.



Two additional data sets were downloaded manually rather than retrieved through an API. The first is the Moby Bikes data set, which contains information about a bike-sharing company in Dublin, Ireland.
The second data set contains information about Smart Home.
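
For manually downloaded files, a minimal sketch of loading the data in Python could look like the following. It assumes the downloads are comma-separated files with a header row, which the text does not confirm. The function names are illustrative, and the file path is whatever location the download was saved to.

```python
import csv
import io


def rows_from_text(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_from_file(path):
    """Read a downloaded CSV file into a list of dicts.

    `path` is a placeholder for wherever the file was saved.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Each row is returned as a dictionary keyed by column name, with all values as strings, so numeric columns still need to be converted during data cleaning.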

NETID : MZ569