Many failing tests #301

Closed
tjburch opened this issue Dec 12, 2022 · 6 comments

Comments

@tjburch
Collaborator

tjburch commented Dec 12, 2022

We've got a bunch of failing tests that we need to address. Running locally, I see failures in:

  • All in test_amateur_draft.py
  • Most in test_league_batting_stats.py
  • Most in test_league_pitching_stats.py
  • One in test_playerid_lookup.py
  • Basically all in test_standings.py
  • One in test_statcast_running.py
  • The only one in test_team_game_logs.py
  • About half in test_team_results.py

I'm guessing it's something like fixtures breaking. Better to get this resolved sooner rather than later.

@BrayanMnz
Contributor

BrayanMnz commented Dec 13, 2022

Hi @tjburch,
if you can add more details on how to reproduce this, I can take a look.

Also, I think we could improve contributing.md and add an issue template, to make things clearer and more concise for anyone who wants to contribute to the project.

@tjburch
Collaborator Author

tjburch commented Dec 13, 2022

Thanks @BrayanMnz. @TheCleric is also starting to look into it when he has time, so keep an eye on this issue.

Clone the repo, run pip install -e . from the top level, then run pytest, and it should light up like a Christmas tree. You should be able to see which tests fail and what the error messages are.

@tjburch
Collaborator Author

tjburch commented Dec 19, 2022

I figured out at least some of the cases: the ones that touch bbref (Baseball-Reference). Basically, we're throwing a lot of requests their way and getting rate limited. Looking directly at the get_soup output in standings.py:

<h2 class="text-gray-600 leading-1.3 text-3xl lg:text-2xl font-light">You are being rate limited</h2>
</header>
<section class="w-240 lg:w-full mx-auto mb-8 lg:px-8">
<div class="w-1/2 md:w-full" id="what-happened-section">
<h2 class="text-3xl leading-tight font-normal mb-4 text-black-dark antialiased" data-translate="what_happened">What happened?</h2>
<p>The owner of this website (www.baseball-reference.com) has banned you temporarily from accessing this website.</p>
</div>

Not sure the best solution here. @TheCleric, any suggestions?
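
One way to make this failure easier to spot is to check the fetched HTML for the rate-limit page before parsing it. The sketch below is only an illustration: is_rate_limited, ensure_not_rate_limited, and RateLimitedError are hypothetical names, not part of pybaseball, and the marker strings are simply copied from the page quoted above, so they will stop matching if that page changes.

```python
# Hypothetical helpers, not part of pybaseball. The marker strings are taken
# from the rate-limit page quoted in this comment and may change over time.
RATE_LIMIT_MARKERS = (
    "You are being rate limited",
    "has banned you temporarily from accessing this website",
)


class RateLimitedError(RuntimeError):
    """Raised when a response looks like the rate-limit page."""


def is_rate_limited(html: str) -> bool:
    """Return True if the HTML contains any known rate-limit marker."""
    return any(marker in html for marker in RATE_LIMIT_MARKERS)


def ensure_not_rate_limited(html: str, url: str = "") -> str:
    """Return the HTML unchanged, or raise if it is the rate-limit page."""
    if is_rate_limited(html):
        raise RateLimitedError(f"Rate limited while fetching {url or 'page'}")
    return html
```

With a check like this, a test would fail with an explicit rate-limit error rather than an unrelated parsing error further down.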

@smoot618

Hey!

So, I just ran into this error, and you'll need to add a wait between requests. Something like the example below should work (it's from a Jupyter notebook), using time.sleep(10):

import numpy as np
import pandas as pd
import time
import seaborn as sns
import pybaseball as pyball
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

from pybaseball import *
from pybaseball import statcast, utils
from pybaseball.plotting import plot_bb_profile

pd.set_option('display.max_columns', None)
%matplotlib inline

We get two teams from the American League and two teams from the National League.

def get_team_names(year):
    # Fetch the schedules for two AL teams and two NL teams
    NYY_df = schedule_and_record(year, 'NYY')
    STL_df = schedule_and_record(year, 'STL')
    BOS_df = schedule_and_record(year, 'BOS')
    NYM_df = schedule_and_record(year, 'NYM')

    NYY_df_teams = NYY_df.Opp.unique()
    NYY_df_teams_list = list(NYY_df_teams)

    STL_df_teams = STL_df.Opp.unique()
    STL_df_teams_list = list(STL_df_teams)

    BOS_df_teams = BOS_df.Opp.unique()
    BOS_df_teams_list = list(BOS_df_teams)

    NYM_df_teams = NYM_df.Opp.unique()
    NYM_df_teams_list = list(NYM_df_teams)

    AL_team = NYY_df_teams_list + BOS_df_teams_list
    NL_team = STL_df_teams_list + NYM_df_teams_list

    # Since not every team plays every other team, we get opponents from 2 separate teams
    # per league and weed out duplicates
    all_team = AL_team + NL_team
    mlb_teams = set(all_team)

    return mlb_teams

def get_schedule_record_all_teams(year, team_names):
    empt_team_schedule_list = []
    for team in team_names:
        print(team)
        team_schedule = schedule_and_record(year, team)
        # Wait between requests so we don't get rate limited
        time.sleep(10)
        empt_team_schedule_list.append(team_schedule)

    schedule_df = pd.concat(empt_team_schedule_list)

    return schedule_df

def main(year):
    team_names = get_team_names(year)
    schedule_df = get_schedule_record_all_teams(year, team_names)

    return team_names, schedule_df

team_names, schedule_df = main(2010)

Honestly, I just ran into this so my account's been temp banned as well, but I'm gonna grab my other computer and attempt with the wait condition.
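
A variation on the fixed time.sleep(10) is a small throttle that only sleeps for whatever is left of the interval since the previous request. This is a minimal sketch under stated assumptions: Throttle and fetch_all are hypothetical names, the 10-second figure is just the value used above rather than a documented limit, and nothing here describes how #296 or pybaseball handles rate limits.

```python
import time
from typing import Callable, Iterable, List, Optional, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class Throttle:
    """Hypothetical helper: enforce a minimum gap between successive calls."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock  # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> float:
        """Sleep for the remainder of the interval; return the seconds slept."""
        delay = 0.0
        if self._last_call is not None:
            elapsed = self._clock() - self._last_call
            delay = max(0.0, self.min_interval - elapsed)
        if delay > 0:
            self._sleep(delay)
        self._last_call = self._clock()
        return delay


def fetch_all(fetch: Callable[[K], V], keys: Iterable[K], throttle: Throttle) -> List[V]:
    """Call fetch(key) for each key, waiting on the throttle before each call."""
    results = []
    for key in keys:
        throttle.wait()
        results.append(fetch(key))
    return results
```

In the notebook above, this might be used as fetch_all(lambda team: schedule_and_record(year, team), team_names, Throttle(10)), so the first request goes out immediately and later ones wait only as long as needed.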

@tjburch
Collaborator Author

tjburch commented Feb 7, 2023

The good news is that #296 took care of the rate limits (thanks @TheCleric). The bad news is that the FG error in #315 is now causing its own failing tests (see: https://github.com/jldbc/pybaseball/actions/runs/4116782091/jobs/7107420647).

@tjburch
Collaborator Author

tjburch commented Feb 13, 2023

Closing per #318 and Bryan Peabody's great detective work.

@tjburch tjburch closed this as completed Feb 13, 2023