Processing ONIX Feeds
RayBB edited this page May 17, 2024 · 1 revision
Cory McCloud from Bibliometa has been collaborating with us to get ONIX records from approved partners into Open Library's catalog. As a start, ~50,000 book covers and ONIX feed entries have been added to archive.org items for an initial batch of processing led by Salman Shah.
<Product>
<RecordReference>0198520816</RecordReference>
<NotificationType>03</NotificationType>
<RecordSourceName>Cambridge University Press</RecordSourceName>
<ProductIdentifier>
<ProductIDType>02</ProductIDType>
<IDValue>0198520816</IDValue>
</ProductIdentifier>
<ProductIdentifier>
<ProductIDType>03</ProductIDType>
<IDValue>9780198520818</IDValue>
</ProductIdentifier>
<ProductIdentifier>
<ProductIDType>15</ProductIDType>
<IDValue>9780198520818</IDValue>
</ProductIdentifier>
<Barcode>02</Barcode>
<ProductForm>BC</ProductForm>
<ProductFormDetail>B102</ProductFormDetail>
<Series>
<TitleOfSeries>New Surveys in the Classics</TitleOfSeries>
<NumberWithinSeries>34</NumberWithinSeries>
</Series>
<Title>
<TitleType>01</TitleType>
<TitleText>Roman Art</TitleText>
</Title>
<Contributor>
<SequenceNumber>1</SequenceNumber>
<ContributorRole>A01</ContributorRole>
<PersonName>Peter Stewart</PersonName>
<PersonNameInverted>Stewart, Peter</PersonNameInverted>
<NamesBeforeKey>Peter</NamesBeforeKey>
<KeyNames>Stewart</KeyNames>
<ProfessionalAffiliation>
<Affiliation>Courtauld Institute of Art, London</Affiliation>
</ProfessionalAffiliation>
</Contributor>
<NoEdition />
<Language>
<LanguageRole>01</LanguageRole>
<LanguageCode>eng</LanguageCode>
</Language>
<BASICMainSubject>ART015060</BASICMainSubject>
<BASICVersion>2.2</BASICVersion>
<Subject>
<SubjectSchemeIdentifier>12</SubjectSchemeIdentifier>
<SubjectSchemeVersion>2.0</SubjectSchemeVersion>
<SubjectCode>ACG</SubjectCode>
</Subject>
<Subject>
<SubjectSchemeIdentifier>01</SubjectSchemeIdentifier>
<SubjectSchemeVersion>22</SubjectSchemeVersion>
<SubjectCode>709/.37</SubjectCode>
</Subject>
<Subject>
<SubjectSchemeIdentifier>03</SubjectSchemeIdentifier>
<SubjectCode>N5760 .S66 2004</SubjectCode>
</Subject>
<AudienceCode>05</AudienceCode>
<MediaFile>
<MediaFileTypeCode>04</MediaFileTypeCode>
<MediaFileFormatCode>03</MediaFileFormatCode>
<MediaFileLinkTypeCode>01</MediaFileLinkTypeCode>
<MediaFileLink>http://assets.cambridge.org/97801985/20818/cover/9780198520818.jpg</MediaFileLink>
</MediaFile>
<Imprint>
<NameCodeType>01</NameCodeType>
<NameCodeTypeName>Cambridge University Press code</NameCodeTypeName>
<NameCodeValue>OUP</NameCodeValue>
<ImprintName>Oxford University Press</ImprintName>
</Imprint>
<Publisher>
<PublishingRole>01</PublishingRole>
<PublisherName>Oxford University Press</PublisherName>
</Publisher>
<CityOfPublication>Oxford</CityOfPublication>
<CountryOfPublication>GB</CountryOfPublication>
<PublishingStatus>04</PublishingStatus>
<PublicationDate>20040422</PublicationDate>
<SalesRights>
<SalesRightsType>01</SalesRightsType>
<RightsTerritory>ROW</RightsTerritory>
</SalesRights>
<SalesRights>
<SalesRightsType>03</SalesRightsType>
<RightsCountry>KP</RightsCountry>
</SalesRights>
<Measure>
<MeasureTypeCode>01</MeasureTypeCode>
<Measurement>9.25</Measurement>
<MeasureUnitCode>in</MeasureUnitCode>
</Measure>
<Measure>
<MeasureTypeCode>02</MeasureTypeCode>
<Measurement>6.14</Measurement>
<MeasureUnitCode>in</MeasureUnitCode>
</Measure>
<Measure>
<MeasureTypeCode>03</MeasureTypeCode>
<Measurement>.43</Measurement>
<MeasureUnitCode>in</MeasureUnitCode>
</Measure>
<Measure>
<MeasureTypeCode>08</MeasureTypeCode>
<Measurement>.6</Measurement>
<MeasureUnitCode>lb</MeasureUnitCode>
</Measure>
<SupplyDetail>
<SupplierName>Cambridge University Press</SupplierName>
<AvailabilityCode>IP</AvailabilityCode>
<ProductAvailability>21</ProductAvailability>
<PackQuantity>30</PackQuantity>
<Price>
<PriceTypeCode>01</PriceTypeCode>
<BICDiscountGroupCode>ACUPRP </BICDiscountGroupCode>
<DiscountCoded>
<DiscountCodeType>02</DiscountCodeType>
<DiscountCode>P</DiscountCode>
</DiscountCoded>
<PriceStatus>02</PriceStatus>
<PriceAmount>21.99</PriceAmount>
<CurrencyCode>GBP</CurrencyCode>
<TaxRateCode1>Z</TaxRateCode1>
<TaxRatePercent1>0.00</TaxRatePercent1>
<TaxableAmount1>21.99</TaxableAmount1>
<TaxAmount1>0.00</TaxAmount1>
<TaxRateCode2>Z</TaxRateCode2>
<TaxRatePercent2>0.00</TaxRatePercent2>
<TaxableAmount2>0.00</TaxableAmount2>
<TaxAmount2>0.00</TaxAmount2>
</Price>
<Price>
<PriceTypeCode>01</PriceTypeCode>
<BICDiscountGroupCode>ACUPRP </BICDiscountGroupCode>
<DiscountCoded>
<DiscountCodeType>02</DiscountCodeType>
<DiscountCode>P</DiscountCode>
</DiscountCoded>
<PriceStatus>02</PriceStatus>
<PriceAmount>25.66</PriceAmount>
<CurrencyCode>EUR</CurrencyCode>
</Price>
</SupplyDetail>
<SupplyDetail>
<SupplierName>Cambridge University Press</SupplierName>
<ReturnsCodeType>02</ReturnsCodeType>
<ReturnsCode>C</ReturnsCode>
<AvailabilityCode>IP</AvailabilityCode>
<ProductAvailability>22</ProductAvailability>
<PackQuantity>30</PackQuantity>
<Price>
<PriceTypeCode>01</PriceTypeCode>
<ClassOfTrade>P</ClassOfTrade>
<DiscountCoded>
<DiscountCodeType>02</DiscountCodeType>
<DiscountCode>P</DiscountCode>
</DiscountCoded>
<PriceStatus>02</PriceStatus>
<PriceAmount>35.99</PriceAmount>
<CurrencyCode>USD</CurrencyCode>
</Price>
<Price>
<PriceTypeCode>01</PriceTypeCode>
<ClassOfTrade>P</ClassOfTrade>
<DiscountCoded>
<DiscountCodeType>02</DiscountCodeType>
<DiscountCode>P</DiscountCode>
</DiscountCoded>
<PriceStatus>02</PriceStatus>
<PriceAmount>40.95</PriceAmount>
<CurrencyCode>CAD</CurrencyCode>
</Price>
</SupplyDetail>
<SupplyDetail>
<SupplierName>Cambridge University Press</SupplierName>
<ReturnsCodeType>02</ReturnsCodeType>
<ReturnsCode>C</ReturnsCode>
<AvailabilityCode>IP</AvailabilityCode>
<ProductAvailability>22</ProductAvailability>
<PackQuantity>30</PackQuantity>
<Price>
<PriceTypeCode>02</PriceTypeCode>
<PriceStatus>02</PriceStatus>
<PriceAmount>62.95</PriceAmount>
<CurrencyCode>AUD</CurrencyCode>
</Price>
<Price>
<PriceTypeCode>02</PriceTypeCode>
<PriceStatus>02</PriceStatus>
<PriceAmount>69.95</PriceAmount>
<CurrencyCode>NZD</CurrencyCode>
</Price>
</SupplyDetail>
</Product>
- Full Path: /Product/Title/TitleText
- Accompanying Fields: TitleType
Title Type Code Value | Title Type Code Description |
---|---|
00 | Undefined |
01 | Distinctive Title |
02 | ISSN Key Title of Serial |
03 | Title in Original Language |
04 | Title Acronym |
05 | Abbreviated Title |
06 | Title in other language |
07 | Thematic title of journal issue |
08 | Former title |
10 | Distributor's title |
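As a rough illustration of how the `TitleText` and `TitleType` fields could be read, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper names (`extract_titles`, `pick_main_title`) and the rule of preferring type `01` are assumptions made for this example, not part of the actual import bot.

```python
import xml.etree.ElementTree as ET

# A subset of the title type codes from the table above.
TITLE_TYPES = {
    "00": "Undefined",
    "01": "Distinctive Title",
    "03": "Title in Original Language",
    "05": "Abbreviated Title",
}

# Trimmed copy of the Title element from the sample record.
SAMPLE = """
<Product>
  <Title>
    <TitleType>01</TitleType>
    <TitleText>Roman Art</TitleText>
  </Title>
</Product>
"""


def extract_titles(product):
    """Return (title_type_code, title_text) pairs for one Product element."""
    titles = []
    for title in product.findall("Title"):
        code = (title.findtext("TitleType") or "00").strip()
        text = (title.findtext("TitleText") or "").strip()
        if text:
            titles.append((code, text))
    return titles


def pick_main_title(titles):
    """Hypothetical rule: prefer the distinctive title (01), else the first one."""
    for code, text in titles:
        if code == "01":
            return text
    return titles[0][1] if titles else None


product = ET.fromstring(SAMPLE)
titles = extract_titles(product)
main_title = pick_main_title(titles)
print(main_title, "/", TITLE_TYPES.get(titles[0][0], "Unknown"))
```

Keeping every `(code, text)` pair rather than only the main title leaves room to store alternate titles later if that turns out to be useful.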
- Full Path: /Product/Publisher/PublisherName
- Accompanying Fields: PublishingRole
Publishing Role Code Value | Publishing Role Code Description |
---|---|
01 | Publisher |
02 | Co-publisher |
03 | Sponsor |
04 | Publisher of original language version |
05 | Host/distributor of electronic content |
06 | Published for/on behalf of |
07 | Published in association with |
08 | Published on behalf of |
09 | New or acquiring publisher |
- Full Path: /Product/CountryOfPublication
- Full Path: /Product/CityOfPublication
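The publisher, country, and city paths above could be read together along these lines. This is only a sketch: the function name, the output keys, and the decision to keep only publishing roles `01` and `02` are assumptions for illustration and would need to be agreed on before use.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the publisher-related elements from the sample record.
SAMPLE = """
<Product>
  <Publisher>
    <PublishingRole>01</PublishingRole>
    <PublisherName>Oxford University Press</PublisherName>
  </Publisher>
  <CityOfPublication>Oxford</CityOfPublication>
  <CountryOfPublication>GB</CountryOfPublication>
</Product>
"""

# Assumption: only "Publisher" and co-publisher entries are treated as publishers.
ACCEPTED_ROLES = {"01", "02"}


def text_or_none(element, path):
    """Return the stripped text at `path`, or None when it is missing or empty."""
    return (element.findtext(path) or "").strip() or None


def extract_publication_info(product):
    """Collect publisher names plus the country and city of publication."""
    publishers = []
    for pub in product.findall("Publisher"):
        role = text_or_none(pub, "PublishingRole")
        name = text_or_none(pub, "PublisherName")
        if name and role in ACCEPTED_ROLES:
            publishers.append(name)
    return {
        "publishers": publishers,
        "publish_country": text_or_none(product, "CountryOfPublication"),
        "publish_city": text_or_none(product, "CityOfPublication"),
    }


info = extract_publication_info(ET.fromstring(SAMPLE))
print(info)
```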
- Full Path: /Product/MediaFile/MediaFileLink
- Accompanying Fields: MediaFileTypeCode
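Media File Type Code Value | Media File Type Code Description |
---|---|
01 | Whole product |
02 | Application: software demo |
03 | Image: whole cover |
04 | Image: front cover |
05 | Image: whole cover, high quality |
06 | Image: front cover, high quality |
07 | Image: front cover thumbnail |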
- Accompanying Fields: MediaFileFormatCode
Media File Format Code Value | Media File Format Code Description |
---|---|
02 | GIF |
03 | JPEG |
05 | TIF |
- Accompanying Fields: MediaFileLinkTypeCode
Media File Link Type Code Value | Media File Link Type Code Description |
---|---|
01 | URL |
02 | DOI |
03 | PURL |
04 | URN |
05 | FTP Address |
06 | filename |
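To show how the link type and format codes might be used when collecting cover images, here is a small sketch. It assumes we only want entries whose `MediaFileLinkTypeCode` is `01` (URL) and whose `MediaFileFormatCode` is one of the image formats listed above; the function name and return shape are made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the MediaFile element from the sample record.
SAMPLE = """
<Product>
  <MediaFile>
    <MediaFileTypeCode>04</MediaFileTypeCode>
    <MediaFileFormatCode>03</MediaFileFormatCode>
    <MediaFileLinkTypeCode>01</MediaFileLinkTypeCode>
    <MediaFileLink>http://assets.cambridge.org/97801985/20818/cover/9780198520818.jpg</MediaFileLink>
  </MediaFile>
</Product>
"""

# Format codes from the Media File Format Code table.
IMAGE_FORMATS = {"02": "GIF", "03": "JPEG", "05": "TIF"}
# Link type 01 means the link is a plain URL.
URL_LINK_TYPE = "01"


def extract_image_urls(product):
    """Return (format_name, url) pairs for MediaFile entries that are image URLs."""
    urls = []
    for media in product.findall("MediaFile"):
        link_type = (media.findtext("MediaFileLinkTypeCode") or "").strip()
        fmt = (media.findtext("MediaFileFormatCode") or "").strip()
        link = (media.findtext("MediaFileLink") or "").strip()
        if link_type == URL_LINK_TYPE and fmt in IMAGE_FORMATS and link:
            urls.append((IMAGE_FORMATS[fmt], link))
    return urls


image_urls = extract_image_urls(ET.fromstring(SAMPLE))
print(image_urls)
```

Entries with other link types (DOI, FTP address, filename) are skipped here because they cannot be fetched with a simple HTTP request; whether to handle them is an open design choice.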
- Language = Language (there may be more than one)
- Measure = the physical book dimensions which we can add
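A possible way to read these two fields is sketched below. The measure type names (`01` height, `02` width, `03` thickness, `08` weight) follow the ONIX measure type code list as I understand it and should be treated as an assumption to confirm, as should the function names and output shapes.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the Language and Measure elements from the sample record.
SAMPLE = """
<Product>
  <Language>
    <LanguageRole>01</LanguageRole>
    <LanguageCode>eng</LanguageCode>
  </Language>
  <Measure>
    <MeasureTypeCode>01</MeasureTypeCode>
    <Measurement>9.25</Measurement>
    <MeasureUnitCode>in</MeasureUnitCode>
  </Measure>
  <Measure>
    <MeasureTypeCode>08</MeasureTypeCode>
    <Measurement>.6</Measurement>
    <MeasureUnitCode>lb</MeasureUnitCode>
  </Measure>
</Product>
"""

# Assumed meaning of the measure type codes seen in the sample record.
MEASURE_TYPES = {"01": "height", "02": "width", "03": "thickness", "08": "weight"}


def extract_languages(product):
    """Return every LanguageCode in the record (there may be more than one)."""
    codes = []
    for lang in product.findall("Language"):
        code = (lang.findtext("LanguageCode") or "").strip()
        if code:
            codes.append(code)
    return codes


def extract_measures(product):
    """Map each known measure type to a (value, unit) tuple."""
    measures = {}
    for measure in product.findall("Measure"):
        name = MEASURE_TYPES.get((measure.findtext("MeasureTypeCode") or "").strip())
        value = (measure.findtext("Measurement") or "").strip()
        unit = (measure.findtext("MeasureUnitCode") or "").strip()
        if name and value and unit:
            measures[name] = (float(value), unit)
    return measures


product = ET.fromstring(SAMPLE)
languages = extract_languages(product)
measures = extract_measures(product)
print(languages, measures)
```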
Some of the fields will need to be discussed:
- Contributor: We definitely want this data; it will be very valuable for creating new authors and merging authors. But a contributor could be an author, an editor, or many other things. Question: which ContributorRole value means author? We need to ask @cory
- Series: We'll want to ask #openlibrary-g how we want to handle `series` (the title and number)
- ProductIDTypes: We'll need to ask @cory for a list of ProductIDTypes (e.g. which of these is ISBN?). Suffice it to say, a book may have multiple of these IDs and we'll want to add them
- Subject: We'll have to ask where and how to look up the subject code
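While those questions are open, the identifier and contributor data can still be collected in a neutral form. The sketch below assumes the code meanings from the published ONIX code lists (`ProductIDType` `02` = ISBN-10, `03` = GTIN-13/EAN, `15` = ISBN-13; `ContributorRole` `A01` = author); these mappings, the function names, and the output keys are assumptions to be confirmed with @cory, not settled decisions.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the identifier and contributor elements from the sample record.
SAMPLE = """
<Product>
  <ProductIdentifier>
    <ProductIDType>02</ProductIDType>
    <IDValue>0198520816</IDValue>
  </ProductIdentifier>
  <ProductIdentifier>
    <ProductIDType>03</ProductIDType>
    <IDValue>9780198520818</IDValue>
  </ProductIdentifier>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType>
    <IDValue>9780198520818</IDValue>
  </ProductIdentifier>
  <Contributor>
    <SequenceNumber>1</SequenceNumber>
    <ContributorRole>A01</ContributorRole>
    <PersonName>Peter Stewart</PersonName>
  </Contributor>
</Product>
"""

# Assumed code meanings, pending confirmation.
ID_TYPES = {"02": "isbn_10", "03": "ean_13", "15": "isbn_13"}
AUTHOR_ROLES = {"A01"}


def extract_identifiers(product):
    """Group IDValue entries by identifier name; a book may carry several IDs."""
    ids = {}
    for pid in product.findall("ProductIdentifier"):
        key = ID_TYPES.get((pid.findtext("ProductIDType") or "").strip())
        value = (pid.findtext("IDValue") or "").strip()
        if key and value:
            ids.setdefault(key, []).append(value)
    return ids


def extract_contributors(product):
    """Keep every contributor with its raw role code, flagging likely authors."""
    contributors = []
    for contributor in product.findall("Contributor"):
        role = (contributor.findtext("ContributorRole") or "").strip()
        name = (contributor.findtext("PersonName") or "").strip()
        if name:
            contributors.append(
                {"name": name, "role": role, "is_author": role in AUTHOR_ROLES}
            )
    return contributors


product = ET.fromstring(SAMPLE)
identifiers = extract_identifiers(product)
contributors = extract_contributors(product)
print(identifiers, contributors)
```

Storing the raw role code alongside the `is_author` flag means the author rule can be changed later without re-parsing the feed.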
The import strategy is to have a bot parse the XML of each ONIX record. The bot follows these steps while processing an ONIX feed:
- Check the XML file containing the ONIX records for errors.
- Parse each ONIX record, which sits under a `Product` tag. Inside a single `Product` tag, we parse tags like `ProductIdentifier`, `Title`, `Publisher`, `CountryOfPublication`, `CityOfPublication`, `MediaFile`, and `Language` to obtain the required results.
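The two steps above can be sketched as follows. This is a minimal illustration, not the bot's actual code: it assumes the feed wraps its `Product` elements in an `ONIXMessage` root with no XML namespace, that "errors" means the XML is not well-formed, and the names `parse_feed` and `product_to_record` are invented for the example.

```python
import xml.etree.ElementTree as ET

# A tiny hypothetical feed containing one trimmed Product record.
FEED = """
<ONIXMessage>
  <Product>
    <ProductIdentifier>
      <ProductIDType>15</ProductIDType>
      <IDValue>9780198520818</IDValue>
    </ProductIdentifier>
    <Title>
      <TitleType>01</TitleType>
      <TitleText>Roman Art</TitleText>
    </Title>
    <Publisher>
      <PublishingRole>01</PublishingRole>
      <PublisherName>Oxford University Press</PublisherName>
    </Publisher>
    <CityOfPublication>Oxford</CityOfPublication>
    <CountryOfPublication>GB</CountryOfPublication>
    <Language>
      <LanguageRole>01</LanguageRole>
      <LanguageCode>eng</LanguageCode>
    </Language>
  </Product>
</ONIXMessage>
"""


def parse_feed(xml_text):
    """Step 1: return the root element, or raise ValueError if the XML is malformed."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError as err:
        raise ValueError(f"ONIX feed is not well-formed XML: {err}") from err


def product_to_record(product):
    """Step 2: pull the listed tags out of a single Product element."""

    def text(path):
        return (product.findtext(path) or "").strip() or None

    return {
        "identifiers": [
            (pid.findtext("ProductIDType"), pid.findtext("IDValue"))
            for pid in product.findall("ProductIdentifier")
        ],
        "title": text("Title/TitleText"),
        "publisher": text("Publisher/PublisherName"),
        "country": text("CountryOfPublication"),
        "city": text("CityOfPublication"),
        "cover": text("MediaFile/MediaFileLink"),
        "languages": [
            lang.findtext("LanguageCode") for lang in product.findall("Language")
        ],
    }


root = parse_feed(FEED)
records = [product_to_record(product) for product in root.iter("Product")]
print(records)
```

For very large feeds, `ET.iterparse` could replace `ET.fromstring` so that records are processed one at a time instead of loading the whole file into memory.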