data-loader issue #324

Closed
PaulSteffen-betclic opened this issue Jun 18, 2019 · 7 comments

@PaulSteffen-betclic

Hi,

I followed the installation instructions in issue #232 but ran into a few problems.
The first 9 steps described by @mcekovic work, but the load at the 10th step fails:

  • When I run data-load-1.0-SNAPSHOT\bin\data-load.bat -lt -c 15 -bd=%tennis_atp-data-directory%, I get the following exception:

Loading Tennis Data
Allocating DB connections...............
Loading players
Loading file 'C:\tennis_atp\atp_players.csv'
Exception in thread "main" java.lang.NumberFormatException: For input string: "player_id"

  • I tried removing the header rows from the rankings and players .csv files:
    The atp_players.csv and atp_rankings_**s.csv loads then seem to be OK, but the load fails at the atp_matches_1969.csv step with the following exception:

Loading file 'C:\tennis_atp\atp_matches_1969.csv'
Exception in thread "main" groovy.lang.MissingMethodException: No signature of method: static org.strangeforest.tcb.dataload.MatchLoader.mapLevel() is applicable for argument types: (String, null, String, Short, String) values: [A, null, Hobart, 1968, 713]
Possible solutions: mapLevel(java.lang.String, short, java.lang.String, int, java.lang.String)

  • I tried filling the NA values in the draw_size column with 0. The atp_matches_****.csv loads then seem to be OK, but I got some exceptions like the following:

Loading file 'C:\tennis_atp\atp_matches_1971.csv'
Invalid set: 6-Feb
java.lang.NumberFormatException: For input string: "Feb"

These did not stop the load.

  • Finally, the load fails when scraping results, with the following exception:

Fetching tournament URL 'http://www.atptour.com/en/scores/archive/birmingham/350/1970//results'
Unknown tournament level: null
Exception in thread "main" java.text.ParseException: Unparseable date: "-"
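The manual CSV edits described in the second and third points above (dropping the header row and replacing missing draw_size values with 0) could be scripted roughly as follows. This is only a sketch of that workaround, not part of the data-load tool: the function names clean_rows and clean_csv_text are hypothetical, and treating an empty field or the literal string NA as missing is an assumption.

```python
import csv
import io


def clean_rows(rows, fill_column=None, fill_value="0", na_values=("", "NA")):
    """Yield data rows without the header, optionally filling one column.

    rows: iterable of lists of strings; the first row is assumed to be the header.
    fill_column: header name whose missing values are replaced (e.g. "draw_size").
    na_values: strings treated as missing (an assumption, adjust as needed).
    """
    rows = iter(rows)
    header = next(rows, None)
    if header is None:  # empty input: nothing to yield
        return
    idx = header.index(fill_column) if fill_column in header else None
    for row in rows:
        if idx is not None and idx < len(row) and row[idx].strip() in na_values:
            row = row[:idx] + [fill_value] + row[idx + 1:]
        yield row


def clean_csv_text(text, fill_column=None):
    """Return CSV text with the header removed and fill_column filled."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(clean_rows(reader, fill_column))
    return out.getvalue()


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = "tourney_name,draw_size\nHobart,NA\nBirmingham,32\n"
    print(clean_csv_text(sample, fill_column="draw_size"), end="")
```

Note that rewriting a file through the csv module may change its quoting, so it is safer to run something like this on a copy of the data directory.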

@mcekovic
Owner

It seems there was recently a massive update in Jeff's repository, fixing bugs and updating the data. However, it seems another layer of issues has been introduced :(.
I will only be able to check the data and possibly adapt the UTS loaders to the new changes in August.

@mcekovic
Owner

In the meantime, you could check out Jeff's repository as of Sep 26, 2018 (last commit b1a180f124c65708a65e09ba593b86ba8f13bf79), i.e. revert the last 4 commits from May 2019, and then start the data load.

@PaulSteffen-betclic
Author

Thanks. I'm looking forward to your fixes :)
I tried using the b1a180f124c65708a65e09ba593b86ba8f13bf79 version of Jeff's repository, but the load fails on the atp_rankings_00s.csv file with the following exception:

Exception in thread "main" java.lang.NumberFormatException: For input string: "bioTableWrap bioTableWrapAlt">"

@mcekovic
Owner

Unfortunately, you need to edit the local files and correct the errors:

atp_rankings_00s.csv:
-20070212,1535,104756,"bioTableWrap bioTableWrapAlt"">"
+20070212,1535,104756,1

atp_rankings_10s.csv:
-20160613,1709,,>
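For anyone who prefers not to edit the files by hand, the two corrections above could be applied with a small script along these lines. It is a sketch only: apply_line_fixes and fix_file are hypothetical helper names, and it assumes the bad lines appear in the local files exactly as quoted in the diff above.

```python
def apply_line_fixes(lines, replacements=None, removals=None):
    """Return a copy of lines with exact-match replacements and removals applied.

    Lines are compared without their trailing newline; the original line
    ending is preserved on replaced lines.
    """
    replacements = replacements or {}
    removals = set(removals or ())
    fixed = []
    for line in lines:
        key = line.rstrip("\r\n")
        if key in removals:
            continue  # drop the corrupt line entirely
        if key in replacements:
            ending = line[len(key):]  # keep "\n" or "\r\n" as found
            fixed.append(replacements[key] + ending)
        else:
            fixed.append(line)
    return fixed


def fix_file(path, replacements=None, removals=None):
    """Rewrite the file at path in place with the fixes applied."""
    with open(path, newline="", encoding="utf-8") as f:
        lines = f.readlines()
    with open(path, "w", newline="", encoding="utf-8") as f:
        f.writelines(apply_line_fixes(lines, replacements, removals))


# Corrections taken from the diff above (assumed to match the files verbatim).
RANKINGS_00S_REPLACEMENTS = {
    '20070212,1535,104756,"bioTableWrap bioTableWrapAlt"">"': "20070212,1535,104756,1",
}
RANKINGS_10S_REMOVALS = ["20160613,1709,,>"]
```

A call such as fix_file(r"C:\tennis_atp\atp_rankings_00s.csv", replacements=RANKINGS_00S_REPLACEMENTS) would then patch the first file, and fix_file(r"C:\tennis_atp\atp_rankings_10s.csv", removals=RANKINGS_10S_REMOVALS) the second; the paths here follow the directory shown in the log output earlier in the thread.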

@PaulSteffen-betclic
Author

Thanks!
But I still get the same final exception:

Fetching tournament URL 'http://www.atptour.com/en/scores/archive/birmingham/350/1970//results'
Unknown tournament level: null
Exception in thread "main" java.text.ParseException: Unparseable date: "-"

@mcekovic
Owner

The ATP website now has issues in its generated HTML. Unfortunately, I will only be able to adapt UTS to it in August.

@mcekovic
Owner

The data load should be fixed now, meaning it should complete, but the data quality is not guaranteed (see JeffSackmann/tennis_atp#108).
