Middleware
zhengchun edited this page Dec 11, 2017
Middleware is one of the core Antch components for HTTP downloading. A middleware
component can change how the web spider handles HTTP requests and responses.
- RFPDupeFilter - Filters out duplicate requests.
This section describes how to write your own middleware.
The Middleware interface:
type HttpMessageHandler interface {
    Send(*http.Request) (*http.Response, error)
}

type Middleware func(HttpMessageHandler) HttpMessageHandler
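To make the shape of these two declarations concrete, here is a minimal sketch of how a `Middleware` wraps one handler in another. It redeclares the two types locally so that it runs without importing antch. `handlerFunc`, `tag`, `chain`, and `traceOrder` are hypothetical helpers written for this illustration, not part of the antch API. The ordering `chain` uses (the first middleware sees the request first) is a choice made for this sketch, not a statement about how antch orders its own middleware.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// Local stand-ins mirroring the declarations above (not imported from antch).
type HttpMessageHandler interface {
	Send(*http.Request) (*http.Response, error)
}

type Middleware func(HttpMessageHandler) HttpMessageHandler

// handlerFunc is a hypothetical adapter that lets a plain function
// satisfy HttpMessageHandler.
type handlerFunc func(*http.Request) (*http.Response, error)

func (f handlerFunc) Send(req *http.Request) (*http.Response, error) {
	return f(req)
}

// tag returns a middleware that records its name in the X-Trace header
// and then delegates to the next handler.
func tag(name string) Middleware {
	return func(next HttpMessageHandler) HttpMessageHandler {
		return handlerFunc(func(req *http.Request) (*http.Response, error) {
			req.Header.Add("X-Trace", name)
			return next.Send(req)
		})
	}
}

// chain wraps final so that mws[0] is the outermost handler.
// This ordering is an assumption of the sketch.
func chain(final HttpMessageHandler, mws ...Middleware) HttpMessageHandler {
	h := final
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// traceOrder sends one request through two middlewares and reports
// the order in which they ran. No network access is involved.
func traceOrder() string {
	var seen []string
	final := handlerFunc(func(req *http.Request) (*http.Response, error) {
		seen = req.Header.Values("X-Trace")
		return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Request: req}, nil
	})
	req, err := http.NewRequest(http.MethodGet, "http://example.com/", nil)
	if err != nil {
		return ""
	}
	resp, err := chain(final, tag("first"), tag("second")).Send(req)
	if err != nil {
		return ""
	}
	defer resp.Body.Close()
	return strings.Join(seen, ",")
}

func main() {
	fmt.Println(traceOrder()) // prints "first,second"
}
```

Each middleware only knows about the `next` handler it was given, so new behavior can be added or removed without touching the other links in the chain.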
Create a middleware that sets a custom HTTP User-Agent header.
type UserAgent struct {
    next antch.HttpMessageHandler
}

func (ua *UserAgent) Send(req *http.Request) (*http.Response, error) {
    req.Header.Set("User-Agent", "antbot")
    // Pass req on to the next handler in the chain.
    return ua.next.Send(req)
}
Next, register your middleware with the crawler via UseMiddleware().
useragentMiddleware := func() antch.Middleware {
    return func(next antch.HttpMessageHandler) antch.HttpMessageHandler {
        return &UserAgent{next}
    }
}

crawler := antch.NewCrawler()
crawler.UseMiddleware(useragentMiddleware())
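Before wiring a middleware into a crawler, it can help to check it in isolation. The sketch below is one possible way to do that. It assumes a locally declared copy of the `HttpMessageHandler` interface, so it compiles without antch. `recorder` and `observedUserAgent` are hypothetical names introduced here: `recorder` stands in for the rest of the download chain and simply captures the request it receives.

```go
package main

import (
	"fmt"
	"net/http"
)

// Local stand-in for antch.HttpMessageHandler so the sketch is self-contained.
type HttpMessageHandler interface {
	Send(*http.Request) (*http.Response, error)
}

// UserAgent is the middleware from this page, written against the stand-in.
type UserAgent struct {
	next HttpMessageHandler
}

func (ua *UserAgent) Send(req *http.Request) (*http.Response, error) {
	req.Header.Set("User-Agent", "antbot")
	// Pass req on to the next handler in the chain.
	return ua.next.Send(req)
}

// recorder is a hypothetical terminal handler: it stores the request it
// receives and returns an empty 200 response instead of using the network.
type recorder struct {
	got *http.Request
}

func (r *recorder) Send(req *http.Request) (*http.Response, error) {
	r.got = req
	return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Request: req}, nil
}

// observedUserAgent returns the User-Agent value that reaches the end
// of the chain after passing through the UserAgent middleware.
func observedUserAgent() (string, error) {
	rec := &recorder{}
	handler := &UserAgent{next: rec}

	req, err := http.NewRequest(http.MethodGet, "http://example.com/", nil)
	if err != nil {
		return "", err
	}
	resp, err := handler.Send(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return rec.got.Header.Get("User-Agent"), nil
}

func main() {
	ua, err := observedUserAgent()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ua) // prints "antbot"
}
```

Because the middleware depends only on the handler interface, the same stub approach should work for any middleware that follows the pattern above.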