Merge pull request #23 from claromes/match-part2
Extract missing data
claromes committed Jan 27, 2024
2 parents e5170b4 + 65d2116 commit b4ec251
Showing 6 changed files with 180 additions and 34 deletions.
48 changes: 46 additions & 2 deletions README.md
@@ -22,6 +22,7 @@ pip install volleystats

# Documentation

- [Extracted Data](#extracted-data)
- [Usage](#usage)
- [Match](#match)
- [Competition Matches](#competition-matches)
@@ -39,6 +40,41 @@ pip install volleystats
- [Troubleshooting](#troubleshooting)
- [Match files collected from batch file](#match-files-collected-from-batch-file)

## Extracted Data

- Competition
- Competition ID
- Home Team
- Guest Team
- Home Points
- Guest Points
- Date
- Location

- Match
- Match ID
- Match date
- Home Team
- Guest Team
- Coach
- Location
- Total Points
- Break Points
- Win-Lost
- Total Serves
- Serve Errors
- Serve Points
- Total Receptions
- Reception Errors
- Positive Pass Percentage (Pos%)
- Excellent/ Perfect Pass Percentage (Exc.%)
- Total Attacks
- Attack Errors
- Blocked Attack
- Attack Points (Exc.)
- Attack Points Percentage (Exc.%)
- Block Points

## Usage

```
@@ -204,23 +240,30 @@ volleystats: finished

- /MatchStatistics?mID=`<Match_ID>`&ID=`<Competition_ID>`

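A minimal sketch of reading the federation acronym, `Match_ID` and `Competition_ID` out of a match URL with Python's standard library. The helper name `ids_from_match_url` is hypothetical and not part of volleystats, and the ID values in the example URL are made up; only the `<acronym>-web.dataproject.com` host pattern and the `mID`/`ID` query keys come from this README.

```python
from urllib.parse import urlparse, parse_qs


def ids_from_match_url(url):
    """Return (fed_acronym, match_id, competition_id) from a match statistics URL."""
    parts = urlparse(url)
    # Hosts follow the pattern '<acronym>-web.dataproject.com'
    fed_acronym = parts.hostname.split('-web.')[0]
    # parse_qs maps each query key to a list of values; take the first of each
    query = parse_qs(parts.query)
    match_id = query['mID'][0]
    competition_id = query['ID'][0]
    return fed_acronym, match_id, competition_id


# Illustrative URL: the ID values are invented for the example
print(ids_from_match_url('https://bvl-web.dataproject.com/MatchStatistics?mID=1234&ID=56'))
```

The IDs are returned as strings, since they are only passed on as identifiers.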
## Federations, Confederations and Leagues Acronym
## Federations, Confederations and Leagues Acronyms

**European Volleyball**

- `fshv`: [Albanian Volleyball Federation](https://fshv-web.dataproject.com/MainHome.aspx)
- `bvl`: [Baltic League](https://bvl-web.dataproject.com/MainHome.aspx)
- `bevl`: [Belgium Volleyball Federation](https://bevl-web.dataproject.com/MainHome.aspx)
- `osbih`: [Bosnia and Herzegovina Volleyball Federation](https://osbih-web.dataproject.com/MainHome.aspx)
- `bvf`: [Bulgarian Volleyball Federation](https://bvf-web.dataproject.com/MainHome.aspx)
- `bvl`: [Baltic League](https://bvl-web.dataproject.com/MainHome.aspx)
- `vbl`: [Bundesliga](https://vbl-web.dataproject.com/MainHome.aspx)
- `hos`: [Croatian Volleyball Federation](https://hos-web.dataproject.com/MainHome.aspx)
- `cvf`: [Czech Volleyball Federation](https://cvf-web.dataproject.com/MainHome.aspx)
- `evf`: [Estonian Volleyball Federation](https://evf-web.dataproject.com/MainHome.aspx)
- `fbf`: [Faroe Islands Volleyball Association](https://fbf-web.dataproject.com/MainHome.aspx)
- `lml`: [Finland Volleyball League](https://lml-web.dataproject.com/MainHome.aspx)
- `eope`: [Hellenic Volleyball Federation](https://eope-web.dataproject.com/MainHome.aspx)
- `hvl`: [Hellenic Volleyball League](https://hvl-web.dataproject.com/MainHome.aspx)
- `hvf`: [Hungary Volleyball Federation](https://hvf-web.dataproject.com/MainHome.aspx)
- `bli`: [Icelandic Volleyball Association](https://bli-web.dataproject.com/MainHome.aspx)
- `iva`: [Israel Volleyball Association](https://iva-web.dataproject.com/MainHome.aspx)
- `fipav`: [Italian Volleyball Federation](https://fipav-web.dataproject.com/MainHome.aspx)
- `vfrk`: [Volleyball Federation of Republic of Kazakhstan](https://vfrk-web.dataproject.com/MainHome.aspx)
- `latvf`: [Latvian Volleyball Federation](https://latvf-web.dataproject.com/MainHome.aspx)
- `lnv`: [Ligue Nationale de Volley](https://lnv-web.dataproject.com/MainHome.aspx)
- `lvf`: [Lithuanian Volleyball Federation](https://lvf-web.dataproject.com/MainHome.aspx)
- `mva`: [Malta Volleyball Association](https://mva-web.dataproject.com/MainHome.aspx)
- `nvbf`: [Norwegian Volleyball Federation](https://nvbf-web.dataproject.com/MainHome.aspx)
@@ -233,6 +276,7 @@ volleystats: finished
- `svbf`: [Swedish Volleyball Federation](https://svbf-web.dataproject.com/MainHome.aspx)
- `swi`: [Swiss Volley](https://swi-web.dataproject.com/MainHome.aspx)
- `tvf`: [Turkish Volleyball Federation](https://tvf-web.dataproject.com/MainHome.aspx)
- `uvf`: [Ukrainian Volleyball Federation](https://uvf-web.dataproject.com/MainHome.aspx)
- `pvlu`: [Professional Volleyball League of Ukraine](https://pvlu-web.dataproject.com/MainHome.aspx)

**South American Volleyball**
41 changes: 19 additions & 22 deletions WIP.md
@@ -14,14 +14,11 @@
- [x] Match date
- [x] Home Team
- [x] Guest Team
- [ ] Coach
- [ ] Location
- [x] Coach
- [x] Location
- [ ] Final result
- [ ] Result per SET

- Vote
- [ ] Vote by player

- Points
- [x] Total Points by player
- [ ] Total Points by player per SET
@@ -30,29 +27,29 @@
- [x] Totals

- Serve
- [ ] Total Serves by player
- [ ] Serve Erros by player
- [ ] Serve Points by player
- [ ] Totals
- [x] Total Serves by player
- [x] Serve Erros by player
- [x] Serve Points by player
- [x] Totals

- Reception
- [ ] Total Receptions by player
- [ ] Reception Erros by player
- [ ] Positive Pass Percentage by player
- [ ] Excellent/ Perfect Pass Percentage by player
- [ ] Totals
- [x] Total Receptions by player
- [x] Reception Erros by player
- [x] Positive Pass Percentage by player
- [x] Excellent/ Perfect Pass Percentage by player
- [x] Totals

- Attack
- [ ] Total Attacks by player
- [ ] Attack Erros by player
- [ ] Blocked Attack by player
- [ ] Attack Points by player
- [ ] Attack Points Percentage by player
- [ ] Totals
- [x] Total Attacks by player
- [x] Attack Erros by player
- [x] Blocked Attack by player
- [x] Attack Points by player
- [x] Attack Points Percentage by player
- [x] Totals

- Block
- [ ] Block Points by player
- [ ] Totals
- [x] Block Points by player
- [x] Totals

- Feds, Confs and Leagues
- [x] European
11 changes: 8 additions & 3 deletions volleystats/spiders/competition.py
@@ -41,14 +41,19 @@ def parse(self, response):
match_date = parse_short_date(match_date_text)

match_location = match.xpath("./div/div/div/p[2]/span[1]/text()").get()

if match_location:
match_location = match_location.lower()

home_team = match.xpath("./div/div/div[5]/p/span/*/text() | ./div/div/div[5]/p/span/text()").get().lower()
home_team = match.xpath("./div/div/div[5]/p/span/*/text() | ./div/div/div[5]/p/span/text()").get()
if home_team:
home_team = home_team.lower()

home_points = match.xpath("./div/div/div[7]/p[1]/span[1]/b/text()").get()

guest_team = match.xpath("./div/div/div[9]/p/span/*/text() | ./div/div/div[9]/p/span/text()").get().lower()
guest_team = match.xpath("./div/div/div[9]/p/span/*/text() | ./div/div/div[9]/p/span/text()").get()
if guest_team:
guest_team = guest_team.lower()

guest_points = match.xpath("./div/div/div[7]/p[1]/span[3]/b/text()").get()

competition = {
Expand Down
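Review note: the hunk above stops chaining `.lower()` onto `.get()`, because `.get()` returns `None` when the XPath matches nothing and `None.lower()` raises `AttributeError`. The same guard is repeated for each field; a minimal sketch of folding it into one helper follows. The name `lower_or_none` is hypothetical and does not exist in the repository.

```python
def lower_or_none(value):
    """Lowercase a string; pass None (or an empty string) through unchanged."""
    return value.lower() if value else value


# A matched field is lowercased, a missing one stays None
print(lower_or_none('Home Team Name'))
print(lower_or_none(None))
```

With such a helper, each field would need a single call around `.get()` instead of a separate `if` block.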
104 changes: 98 additions & 6 deletions volleystats/spiders/match.py
@@ -27,27 +27,73 @@ def parse(self, response):
if enGB == 'EN':
match_date = parse_engb_date(match_date_text)

home_team_string = response.xpath("normalize-space(//span[@id='Content_Main_LBL_HomeTeam']/text())").get().replace(' ', '-').lower()
home_team_string = response.xpath("normalize-space(//span[@id='Content_Main_LBL_HomeTeam']/text())").get().replace(' ', '-')
home_team = re.sub('[^A-Za-z0-9]+', '-', home_team_string)
if home_team:
home_team = home_team.lower()

coach = response.xpath("//span[@id='Content_Main_ctl17_RP_MatchStats_Coach_Home_0']/text()").get()
if coach:
coach = coach.lower()
coach = parse_coach(coach)

location = response.xpath("//span[@id='Content_Main_LB_Stadium']/text()").get()
if location:
location = location.lower()

home_players = response.xpath("//div[@id='Content_Main_ctl17_RP_MatchStats_RPL_MatchStats_0']/div[3]/div/div/table/tbody/tr")

for player in home_players:
player_number = player.xpath('./td[1]/p/span/text()').get()
player_name = player.xpath('./td[2]/p/span/b/text()').get().lower()
player_name = player.xpath('./td[2]/p/span/b/text()').get()
if player_name:
player_name = player_name.lower()

points_tot = player.xpath('./td[8]/p/span/text()').get()
points_BP = player.xpath('./td[9]/p/span/text()').get()
points_WL = player.xpath('./td[10]/p/span/text()').get()

serve_tot = player.xpath('./td[12]/p/span/text()').get()
serve_err = player.xpath('./td[13]/p/span/text()').get()
serve_ace = player.xpath('./td[14]/p/span/text()').get()

reception_tot = player.xpath('./td[15]/p/span/text()').get()
reception_err = player.xpath('./td[16]/p/span/text()').get()
reception_pos = player.xpath('./td[17]/p/span/text()').get()
reception_exec = player.xpath('./td[18]/p/span/text()').get()

attack_tot = player.xpath('./td[21]/p/span/text()').get()
attack_err = player.xpath('./td[22]/p/span/text()').get()
attack_block = player.xpath('./td[23]/p/span/text()').get()
attack_exc = player.xpath('./td[24]/p/span/text()').get()
attack_exc_perc = player.xpath('./td[25]/p/span/text()').get()

block_points = player.xpath('./td[27]/p/span/text()').get()

yield {
'Match ID': self.match_id,
'Match Date': match_date,
'Home Team': home_team,
'Home Coach': coach,
'Stadium': location,
'Number': player_number,
'Name': player_name,
'Total Points': points_tot,
'Break Points': points_BP,
'W-L': points_WL
'W-L': points_WL,
'Total Serve': serve_tot,
'Serve Errors': serve_err,
'Ace': serve_ace,
'Total Receptions': reception_tot,
'Reception Erros': reception_err,
'Positive Pass Percentage': reception_pos,
'Excellent/ Perfect Pass Percentage': reception_exec,
'Total Attacks': attack_tot,
'Attack Erros': attack_err,
'Blocked Attack': attack_block,
'Attack Points (Exc.)': attack_exc,
'Attack Points Percentage (Exc.%)': attack_exc_perc,
'Block Points': block_points
}

self.match_date = match_date
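Review note: the team name is turned into a slug in three steps (spaces to hyphens, every run of non-alphanumeric characters to a hyphen, then lowercase). A minimal standalone sketch of those steps is below; `team_slug` is a hypothetical name and the sample team names are invented. Because the character class is ASCII-only, accented letters are replaced by hyphens as well.

```python
import re


def team_slug(team_name):
    """Apply the same three steps as the hunk above to a raw team name."""
    with_dashes = team_name.replace(' ', '-')
    # Any run of characters outside A-Z, a-z, 0-9 collapses to a single '-'
    slug = re.sub('[^A-Za-z0-9]+', '-', with_dashes)
    return slug.lower()


print(team_slug('Example Volley Club'))  # example-volley-club
print(team_slug('Vôlei Example'))        # v-lei-example
```

Since a space is itself non-alphanumeric, the `re.sub` call alone would also handle it; the explicit `replace` only mirrors the committed code.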
@@ -87,27 +133,73 @@ def parse(self, response):
if enGB == 'EN':
match_date = parse_engb_date(match_date_text)

guest_team_string = response.xpath("normalize-space(//span[@id='Content_Main_LBL_GuestTeam']/text())").get().replace(' ', '-').lower()
guest_team_string = response.xpath("normalize-space(//span[@id='Content_Main_LBL_GuestTeam']/text())").get().replace(' ', '-')
guest_team = re.sub('[^A-Za-z0-9]+', '-', guest_team_string)
if guest_team:
guest_team = guest_team.lower()

coach = response.xpath("//span[@id='Content_Main_ctl17_RP_MatchStats_Coach_Guest_0']/text()").get()
if coach:
coach = coach.lower()
coach = parse_coach(coach)

location = response.xpath("//span[@id='Content_Main_LB_Stadium']/text()").get()
if location:
location = location.lower()

guest_players = response.xpath("//div[@id='Content_Main_ctl17_RP_MatchStats_RPL_MatchStats_0']/div[5]/div/div/table/tbody/tr")

for player in guest_players:
player_number = player.xpath('./td[1]/p/span/text()').get()
player_name = player.xpath('./td[2]/p/span/b/text()').get().lower()
player_name = player.xpath('./td[2]/p/span/b/text()').get()
if player_name:
player_name = player_name.lower()

points_tot = player.xpath('./td[8]/p/span/text()').get()
points_BP = player.xpath('./td[9]/p/span/text()').get()
points_WL = player.xpath('./td[10]/p/span/text()').get()

serve_tot = player.xpath('./td[12]/p/span/text()').get()
serve_err = player.xpath('./td[13]/p/span/text()').get()
serve_ace = player.xpath('./td[14]/p/span/text()').get()

reception_tot = player.xpath('./td[15]/p/span/text()').get()
reception_err = player.xpath('./td[16]/p/span/text()').get()
reception_pos = player.xpath('./td[17]/p/span/text()').get()
reception_exc = player.xpath('./td[18]/p/span/text()').get()

attack_tot = player.xpath('./td[21]/p/span/text()').get()
attack_err = player.xpath('./td[22]/p/span/text()').get()
attack_block = player.xpath('./td[23]/p/span/text()').get()
attack_exc = player.xpath('./td[24]/p/span/text()').get()
attack_exc_perc = player.xpath('./td[25]/p/span/text()').get()

block_points = player.xpath('./td[27]/p/span/text()').get()

yield {
'Match ID': self.match_id,
'Match Date': match_date,
'Guest Team': guest_team,
'Guest Coach': coach,
'Stadium': location,
'Number': player_number,
'Name': player_name,
'Total Points': points_tot,
'Break Points': points_BP,
'W-L': points_WL
'W-L': points_WL,
'Total Serve': serve_tot,
'Serve Errors': serve_err,
'Ace': serve_ace,
'Total Receptions': reception_tot,
'Reception Erros': reception_err,
'Positive Pass Percentage (Pos%)': reception_pos,
'Excellent/ Perfect Pass Percentage (Exc.%)': reception_exc,
'Total Attacks': attack_tot,
'Attack Erros': attack_err,
'Blocked Attack': attack_block,
'Attack Points (Exc.)': attack_exc,
'Attack Points Percentage (Exc.%)': attack_exc_perc,
'Block Points': block_points
}

self.match_date = match_date
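Review note: the home and guest blocks read the same `<td>` positions but yield slightly different keys: the home block uses `'Positive Pass Percentage'` and `'Excellent/ Perfect Pass Percentage'`, while the guest block adds the `(Pos%)` and `(Exc.%)` suffixes. One way to keep both in step is a single column map shared by both loops. The sketch below is an assumption about how that could look, not the committed code; `STAT_COLUMNS`, `stat_xpaths` and `extract_stats` are hypothetical names, and the keys follow the guest block's spelling.

```python
# 1-based <td> positions, copied from the hunks above
STAT_COLUMNS = {
    'Total Points': 8,
    'Break Points': 9,
    'W-L': 10,
    'Total Serve': 12,
    'Serve Errors': 13,
    'Ace': 14,
    'Total Receptions': 15,
    'Reception Erros': 16,
    'Positive Pass Percentage (Pos%)': 17,
    'Excellent/ Perfect Pass Percentage (Exc.%)': 18,
    'Total Attacks': 21,
    'Attack Erros': 22,
    'Blocked Attack': 23,
    'Attack Points (Exc.)': 24,
    'Attack Points Percentage (Exc.%)': 25,
    'Block Points': 27,
}


def stat_xpaths(columns=STAT_COLUMNS):
    """Build the relative XPath for each statistic from its column position."""
    return {name: f'./td[{index}]/p/span/text()' for name, index in columns.items()}


def extract_stats(player, columns=STAT_COLUMNS):
    """Read every statistic from one row; `player` is assumed to be a Scrapy selector for a <tr>."""
    return {name: player.xpath(path).get() for name, path in stat_xpaths(columns).items()}


print(stat_xpaths()['Ace'])  # ./td[14]/p/span/text()
```

Both loops could then yield `{**match_fields, **extract_stats(player)}`, so a renamed column changes in one place only.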
8 changes: 8 additions & 0 deletions volleystats/utils.py
@@ -1,3 +1,4 @@
import re
from datetime import datetime

# '28/10/2022 - 19:30' or '28.10.2022 - 19:30' to 2022-10-28
@@ -20,3 +21,10 @@ def parse_engb_date(date_string):
parsed_engb_date = datetime.strptime(str_2, '%d-%B-%Y').date()

return parsed_engb_date

# (coach: schimtz guilherme) to schimtz guilherme
def parse_coach(coach_name):
    regex = re.compile(r'\(coach: |\)')
    coach = regex.sub('', coach_name)

    return coach
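Review note: `parse_coach` matches the literal lowercase text `(coach: `, which works because both spiders call `coach.lower()` before passing the value in. A sketch of a variant that does not depend on the caller lowercasing first is below. It is a hypothetical alternative, not the committed implementation, and the capitalized input is an invented example, since the raw page text is not shown in this commit.

```python
import re

# Hypothetical variant: case-insensitive, tolerant of extra spaces after the colon
_COACH_WRAPPER = re.compile(r'\(coach:\s*|\)', re.IGNORECASE)


def parse_coach_any_case(coach_name):
    """Strip the '(coach: ...)' wrapper regardless of letter case."""
    return _COACH_WRAPPER.sub('', coach_name).strip()


print(parse_coach_any_case('(Coach: Schimtz Guilherme)'))  # Schimtz Guilherme
print(parse_coach_any_case('(coach: schimtz guilherme)'))  # schimtz guilherme
```

Compiling the pattern once at module level also avoids rebuilding it on every call.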
2 changes: 1 addition & 1 deletion volleystats/version.py
Original file line number Diff line number Diff line change
@@ -1 +1 @@
__version__ = '0.7'
__version__ = '0.8'
