
Middleware

zhengchun edited this page Dec 11, 2017 · 6 revisions

Overview

Middleware is one of Antch's core components for HTTP downloading. A middleware component can change how the web spider's HTTP requests are sent and how its HTTP responses are handled.

Built-in Middleware List

Contribution Middleware

Customize Middleware

This section describes how to write your own middleware.

The Middleware interface:

type HttpMessageHandler interface {
	Send(*http.Request) (*http.Response, error)
}

type Middleware func(HttpMessageHandler) HttpMessageHandler

Create a middleware that sets a custom HTTP User-Agent header:

// UserAgent is a middleware that sets the User-Agent header on every request.
type UserAgent struct {
	next antch.HttpMessageHandler
}

func (ua *UserAgent) Send(req *http.Request) (*http.Response, error) {
	req.Header.Set("User-Agent", "antbot")
	// Pass req on to the next middleware in the chain.
	return ua.next.Send(req)
}

Next, register your middleware with the crawler via UseMiddleware().

useragentMiddleware := func() antch.Middleware {
    return func(next antch.HttpMessageHandler) antch.HttpMessageHandler {
        return &UserAgent{next}
    }
}
crawler := antch.NewCrawler()
crawler.UseMiddleware(useragentMiddleware())