Players Page Data Archive (Read 110 times)
Rasern
n00b
*
Offline

I'll get those
plumbers!!

6969 days karting
Texas
Gender: male
Players Page Data Archive
06/17/21 at 18:53:24
 
Exactly 1 year ago I needed to pull down a copy of the combined rankings for every course and flap off the players page, but doing this by hand was a pain. I was inspired by basebalkid's Excel macros that do this for you, but they needed a fresh coat of paint. So I created a document that reduced the number of simultaneous web queries from well over 100 down to just 36. This improved performance dramatically, and now it takes just 2-3 minutes to pull down all the data.

So I just kept pulling the data. I'd take the file, do a save as with an appropriate file name, and rerun the query each week. Then I'd mark the file read-only so I don't accidentally muck it up. I figured there might be a future use for it doing historical trends. I'm up to a year's worth of data, so I wanted to share in case someone more brilliant than me could find a use for it. I'll continue to update this each week (assuming I'm not forgetful).
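The weekly "save as, then mark read-only" routine described above could be scripted. What follows is only a minimal sketch in Python rather than VBA: the function name `archive_snapshot`, the dated file-name pattern, and the folder layout are all assumptions for illustration, not how the archive is actually produced.

```python
import shutil
import stat
from datetime import date
from pathlib import Path


def archive_snapshot(workbook: Path, archive_dir: Path, day: date) -> Path:
    """Copy the refreshed workbook to a dated file and mark the copy read-only.

    The naming scheme (<stem>_<YYYY-MM-DD><suffix>) is a hypothetical choice.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{workbook.stem}_{day.isoformat()}{workbook.suffix}"
    if target.exists():
        # Never overwrite an earlier snapshot by accident.
        raise FileExistsError(f"{target} already exists; refusing to overwrite")
    shutil.copy2(workbook, target)
    # Clear every write bit so the snapshot cannot be edited by mistake.
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    target.chmod(target.stat().st_mode & ~write_bits)
    return target
```

Calling `archive_snapshot(Path("players_page.xlsx"), Path("archive"), date.today())` after each refresh would leave one read-only file per run. The weekly trigger itself would still need an OS scheduler, which is not shown here.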

You can find the archive here:
https://drive.google.com/drive/folders/1NBf0P1ZC6L3Xj4QFpWIUWNg0tX0rMYj5?usp=...

Each file contains multiple tabs: one each for AF, ARR, and PRSR, plus the course and flap records for all 16 tracks.

I could further automate the file, but I've been busy/lazy. Some things I would like to add:
1. Automate the process to run the web queries weekly (or upon detection of Alex P's post), save a copy of the file, and make it read-only. Right now I'm at risk of missing a week's worth of data if I forget to run the queries and save a new file once a week. Although it only takes 2 minutes, I've had some close calls...
2. Fix the issue with the web queries failing when they search for a table with no results. For example, if a page like Luigi's Circuit has exactly 1300 records but I attempt to pull down records 1301-1400, the whole query fails. Writing a catch statement would save me from having to manually update the query when a page rolls from having 1300 records to 1301 records (if I don't manually update the query, the final record is truncated).
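The "catch statement" idea in item 2 can be sketched as a loop that keeps requesting blocks of 100 records and stops quietly at the first empty or missing block, so no query range needs editing when a page grows from 1300 to 1301 records. This is an illustration in Python, not Power Query or VBA. `fetch_page` is a hypothetical callable standing in for one web query, and the assumption that it raises `LookupError` when the table is missing is mine.

```python
from typing import Callable, List

PAGE_SIZE = 100  # assumption: each query covers 100 records, e.g. 1301-1400


def fetch_all_records(
    fetch_page: Callable[[int, int], List[dict]],
    max_pages: int = 50,
) -> List[dict]:
    """Collect records block by block until a block comes back empty or missing."""
    records: List[dict] = []
    for page in range(max_pages):
        first = page * PAGE_SIZE + 1
        last = first + PAGE_SIZE - 1
        try:
            rows = fetch_page(first, last)
        except LookupError:
            # Hypothetical failure mode: no table exists for this range.
            # Treat it as "end of data" instead of failing the whole run.
            break
        if not rows:
            break
        records.extend(rows)
        if len(rows) < PAGE_SIZE:
            # A short block means this was the final page.
            break
    return records
```

With this shape, a course with exactly 1300 records ends on the failed request for 1301-1400, and a course with 1301 records ends after a one-row final block. In both cases every record that exists is kept.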

If you know Power Query / VBA and are interested in helping me with these two ideas, I'd appreciate it.

So there you have it. Hopefully this is of use to somebody. Having a secret archive without telling anyone doesn't help anyone. Enjoy!

ETA: Justin Paris's sheet probably pulls the data more elegantly than mine; I forgot it existed. However, the point of this post was to share the archived data.
« Last Edit: 06/17/21 at 19:27:18 by Rasern »  

Remorseless king of evil. Ruler of Dark Land. Supreme leader of the Koopa Troop.
View Profile mariomaster777   IP Logged
Jocelyn SITEK
Karter
**
Offline



seen 3786 mj vids
France
Gender: male
Re: Players Page Data Archive
Reply #1 - 06/19/21 at 03:05:48
 
Hi! Here are some things that come to mind:

1. YES. Having data is always amazing for future analysis!

2. Processing a lot of data with complex logic is a pain in VBA. I think it would be better to use a Python script (or any other language) that runs on your computer. (I remember writing a small Python script 6 years ago that retrieved the difference in times between 2 selected players.) As this website doesn't have any restrictions on web scraping, it should be fast.

3. The storage. Having an Excel file is convenient for many reasons, but for data analysis or database storage purposes, it would be better to store the info in CSV or JSON files.

4. Best would be to host the code in the cloud and have it automatically store the data in a SQL or NoSQL cloud database! But this solution is not free...

5. Be careful with the storage format. For example, store "9" for Titan C instead of "TC", or store "8.0938" instead of the current "8.0938 (Titan B)". It just makes the data easier to use afterwards.

6. Maybe... we can also get older data! Thanks to the Wayback Machine website, we can find some old data. It has some snapshots of this website! Go check it out; you can find very cool things.
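To make points 3 and 5 concrete, here is a small Python sketch that splits a combined cell such as "8.0938 (Titan B)" into a plain number and a separate label, then writes the result as CSV. The cell pattern is inferred only from that one example, and the column names `player`, `value`, and `rank_label` are made up for illustration. Mapping labels to numeric codes (like "9" for Titan C) would be a further step and is not attempted here.

```python
import csv
import io
import re
from typing import Iterable, Tuple

# Assumed cell shape, based only on the "8.0938 (Titan B)" example above.
CELL = re.compile(r"^\s*(?P<value>\d+(?:\.\d+)?)\s*\((?P<label>[^)]+)\)\s*$")


def split_cell(cell: str) -> Tuple[float, str]:
    """Turn '8.0938 (Titan B)' into (8.0938, 'Titan B')."""
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    return float(match.group("value")), match.group("label").strip()


def to_csv(rows: Iterable[Tuple[str, str]]) -> str:
    """Write (player, combined_cell) pairs as CSV with separate columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["player", "value", "rank_label"])  # hypothetical headers
    for player, cell in rows:
        value, label = split_cell(cell)
        writer.writerow([player, value, label])
    return buffer.getvalue()
```

Keeping the number in its own column means it can be sorted, averaged, or diffed between weekly snapshots without any string cleanup later.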

I don't know if I'll ever do that myself, maybe if I feel like it, but at least I shared my thoughts!

If any of you guys need my help, feel free to ping me on the Discord!

MKDD: #127 combined / Hero B