datavizRun
Dataviz of GPS run traces
A few days ago I started a small data visualisation project. The data are GPS traces from a run club I joined in 2021.
Here is the new repository called datavizRun.
The motivations are:
- to improve my data science skills
- to make sense of data
- to produce nice visuals
- to answer questions about those data
- to have fun playing with data
Status of the project after one week
I think I passed the stages of:
- gathering data
- formatting/cleaning data
- reaching the 20% Pandas level
- interacting with the formatted data and getting answers
- producing basic visualizations
- identifying flaws in my data, e.g. when converting the .gpx files to my data structure revealed that some information is missing
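As an illustration of that last point, here is a minimal sketch of how missing information can be spotted once a run is in a DataFrame. The column names (`lat`, `lon`, `ele`, `time`) and the sample values are assumptions made up for the example, not the actual structure of my data.

```python
import pandas as pd

# Hypothetical track points for one run; column names and values are assumptions.
run = pd.DataFrame(
    {
        "lat": [48.8566, 48.8570, 48.8575],
        "lon": [2.3522, 2.3530, 2.3541],
        "ele": [35.0, None, 36.5],    # one point has no elevation
        "time": [None, None, None],   # timestamps are missing entirely
    }
)

# Count the missing values in each column.
missing_per_column = run.isna().sum()

# List the columns that contain no usable value at all.
empty_columns = [col for col in run.columns if run[col].isna().all()]

print(missing_per_column)
print("Columns with no data:", empty_columns)
```

A check like this can run right after the conversion step, so that incomplete files are flagged before they reach the visualizations.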
Database
I chose to represent my data as:
- a list of DataFrames
- a single DataFrame summarizing information about the list
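A minimal sketch of this layout, assuming one DataFrame per run with one row per track point. The column names (`distance_km`, `n_points`, `total_km`, `run_index`) and the numbers are hypothetical, and `distance_km` is assumed to hold the cumulative distance at each point.

```python
import pandas as pd

# Hypothetical runs: one DataFrame per run, one row per track point.
# "distance_km" is assumed to be the cumulative distance at each point.
runs = [
    pd.DataFrame(
        {
            "lat": [48.85, 48.86, 48.87],
            "lon": [2.35, 2.36, 2.37],
            "distance_km": [0.0, 2.5, 5.0],
        }
    ),
    pd.DataFrame(
        {
            "lat": [48.85, 48.84],
            "lon": [2.35, 2.33],
            "distance_km": [0.0, 8.2],
        }
    ),
]

# Summary DataFrame: one row per run, describing the list above.
summary = pd.DataFrame(
    {
        "run_index": range(len(runs)),
        "n_points": [len(run) for run in runs],
        "total_km": [run["distance_km"].iloc[-1] for run in runs],
    }
)

# Questions about the whole collection are answered from the summary alone.
longest = summary.loc[summary["total_km"].idxmax()]
average_km = summary["total_km"].mean()

print(summary)
print("Longest run index:", int(longest["run_index"]))
print("Average distance (km):", average_km)
```

Keeping the per-run detail in the list and the per-run totals in the summary means most questions only touch the small summary table, while the full traces stay available for plotting.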
Pipeline
For now I have three notebooks:
- one for importing the raw data, i.e. the .gpx files, and saving them as .csv files
- one for data visualization and for querying the database, e.g. what is the longest run? what is the average run distance? This is pretty easy: since the information is in a Pandas DataFrame, I can extract many insights from it
- one for a second cleaning pass
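The import step of the first notebook could look roughly like the sketch below. It is not the code from my notebooks: it assumes GPX 1.1 files, uses the standard library XML parser, and `gpx_to_dataframe` is a hypothetical helper name. The inline GPX document stands in for a real file and its points are made up.

```python
import io
import xml.etree.ElementTree as ET

import pandas as pd

# GPX 1.1 namespace; files in another GPX version would need a different URI.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

# A tiny inline GPX document standing in for a real file (made-up points).
SAMPLE_GPX = """<?xml version="1.0" encoding="UTF-8"?>
<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="48.8566" lon="2.3522"><ele>35.0</ele><time>2021-05-01T08:00:00Z</time></trkpt>
    <trkpt lat="48.8570" lon="2.3530"><time>2021-05-01T08:00:10Z</time></trkpt>
  </trkseg></trk>
</gpx>
"""


def gpx_to_dataframe(source):
    """Read track points from a GPX path or binary file object (hypothetical helper)."""
    root = ET.parse(source).getroot()
    rows = []
    for point in root.iterfind(".//gpx:trkpt", GPX_NS):
        rows.append(
            {
                "lat": float(point.get("lat")),
                "lon": float(point.get("lon")),
                # findtext returns None when the child element is absent.
                "ele": point.findtext("gpx:ele", default=None, namespaces=GPX_NS),
                "time": point.findtext("gpx:time", default=None, namespaces=GPX_NS),
            }
        )
    df = pd.DataFrame(rows)
    df["ele"] = pd.to_numeric(df["ele"])      # missing elevation becomes NaN
    df["time"] = pd.to_datetime(df["time"])   # missing timestamp becomes NaT
    return df


run = gpx_to_dataframe(io.BytesIO(SAMPLE_GPX.encode("utf-8")))

# With no path, to_csv returns the CSV text; pass a filename to write a .csv file.
csv_text = run.to_csv(index=False)
print(csv_text)
```

Missing elements end up as empty cells in the .csv output, which is the kind of gap a later cleaning pass has to deal with.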
What’s next?
For sure, generating more visuals and adding more raw files, which means re-running my notebooks in the right order to update the database. The to-do list is infinite:
- improve my code = get faster answers about the database
- create more visuals
- make the visuals dynamic, as for now everything is static, and ideally put everything into a TouchDesigner project