GitHub - diverged/USPT-Go: Go package for extracting standardized patent objects from USPTO's bulk patent zip files.

USPTGo - USPTO Bulk Data Processing in Go

A Go package which accepts U.S. Patent and Trademark Office (USPTO) bulk data zip files, and returns standardized objects of structured, formatted patent contents.

For a standalone tool implementation of this package, see USPTO-Bulk-Data-Tool.

At this time, the USPTGo package supports the following USPTO bulk data products:

Patent Grant Full Text Data (No Images) (2004 - Present)
Patent Application Full Text Data (No Images) (2004 - Present)

Usage

func USPTGo(cfg *types.USPTGoConfig) (<-chan *types.USPTGoDoc, <-chan error, error)

Process a bulk data zip by passing an instance of USPTGoConfig to the USPTGo function, which returns two buffered channels, and an error.

type USPTGoConfig struct {
	InputPath         string // Path to the input zip file
	ReturnRawSplitDoc bool   // Optional - returns the raw split XML document in addition to the parsed document.  True by default.  False will save memory.
	Logger            Logger // Optional - provide a logging interface
}

The first channel returned contains individual documents from the inputted zip file:

type USPTGoDoc struct {
	USPTGoMetadata USPTGoMetadata
	RawSplitDoc    []byte // Entire XML document as represented in the originating bulk file
	Patent         Patent
	Trademark      Trademark 
}

type Patent struct {
	XMLName             xml.Name            `xml:"-" json:"-"` // `xml:"us-patent-grant"` OR `xml:"us-patent-application"`
	MetaLang            string              `xml:"lang,attr" json:"lang"`
	MetaDtdVersion      string              `xml:"dtd-version,attr" json:"dtd-version"`
	MetaFileName        string              `xml:"file,attr" json:"file-name"`
	MetaStatus          string              `xml:"status,attr" json:"status"`
	MetaFileType        string              `xml:"id,attr" json:"id"`
	MetaCountry         string              `xml:"country,attr" json:"country"`
	MetaDateProduced    string              `xml:"date-produced,attr" json:"date-produced"`
	MetaDatePubl        string              `xml:"date-publ,attr" json:"date-publ"`
	UsBibliographicData UsBibliographicData `xml:"-" json:"-"` // `xml:"us-bibliographic-data-grant"` OR `xml:"us-bibliographic-data-application"`
	Description         struct {
		Content string `xml:",innerxml"`
	} `xml:"description"`
	Abstract struct {
		Content string `xml:",innerxml"`
	} `xml:"abstract"`
	Claims struct {
		Content string `xml:",innerxml"`
	} `xml:"claims"`
	StructuredClaims []*models.Claim
}

type UsBibliographicData struct {
	XMLName              xml.Name `xml:"-" json:"-"` // `xml:"us-bibliographic-data-grant"` OR `xml:"us-bibliographic-data-application"`
	PublicationReference struct {
		DocumentID struct {
			Country   string `xml:"country"`
			DocNumber string `xml:"doc-number"`
			KindCode  string `xml:"kind"`
			Date      string `xml:"date"`
		} `xml:"document-id"`
	} `xml:"publication-reference"`
	ApplicationReference struct {
		ApplType   string `xml:"appl-type,attr"`
		DocumentID struct {
			Country   string `xml:"country"`
			DocNumber string `xml:"doc-number"`
			Date      string `xml:"date"`
		} `xml:"document-id"`
	} `xml:"application-reference"`
	ClassificationNational struct {
		Country               string `xml:"country"`
		MainClassification    string `xml:"main-classification"`
		FurtherClassification string `xml:"further-classification"`
	} `xml:"classification-national"`
	InventionTitle struct {
		Content string `xml:",innerxml"`
		Text    string `xml:",chardata"`
		ID      string `xml:"id,attr"`
	} `xml:"invention-title"`
	NumberOfClaims int `xml:"number-of-claims"`
}

The second channel contains errors encountered, including information like whether or not a document was skipped.

type USPTGoError struct {
	Err     error  // The error encountered
	Skipped bool   // Whether the file was skipped
	Name    string // Zip name, Index within Zip, Document ID, etc.
	Whence  string // verb phrase, e.g. "opening the file", "reading the file", etc.
	Type    string // Zip, Part of Zip, Patent Doc, etc.
	ZipInfo OriginZip
}

Example

Minimal example:

package main

import (
    "github.com/diverged/uspt-go"
    "github.com/diverged/uspt-go/types"
)

func main() {
    cfg := &types.USPTGoConfig{
        // Initialize your config
    }

    docChan, errChan, err := usptgo.USPTGo(cfg)
    if err != nil {
        // Handle initialization error
    }

    // Example of how to use the returned channels
    for doc := range docChan {
        // Process each document
    }

    for err := range errChan {
        // Handle each error
    }
}

For a more complete example of how to make use of this package, see USPTO-Bulk-Data-Tool.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
internal		internal
types		types
LICENSE		LICENSE
README.md		README.md
go.mod		go.mod
go.sum		go.sum
usptgo.go		usptgo.go

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

USPTGo - USPTO Bulk Data Processing in Go

Usage

Example

License

About

Releases

Packages

Languages

License

diverged/USPT-Go

Folders and files

Latest commit

History

Repository files navigation

USPTGo - USPTO Bulk Data Processing in Go

Usage

Example

License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages