How to connect data scraped from a website to Linebot #10

Closed
kuan35 opened this issue May 23, 2024 · 9 comments

kuan35 commented May 23, 2024

When the user types `latest`, the LINE bot should send the scraped article's link, title, and image to the user.

kuan35 commented May 23, 2024

Here is the code:

import` os
import requests
from bs4 import BeautifulSoup
from flask import Flask, request, abort
from linebot.v3 import WebhookHandler
from linebot.v3.exceptions import InvalidSignatureError
from linebot.v3.messaging import Configuration, ApiClient, MessagingApi, ReplyMessageRequest, TextMessage
from linebot.v3.webhooks import MessageEvent, TextMessageContent

line_access_token = os.environ.get('LINE_ACCESS_TOKEN', 'your_default_access_token')
line_secret = os.environ.get('LINE_SECRET', 'your_default_secret')
port = int(os.environ.get('PORT', 5000))  # default to port 5000 if PORT is not set

configuration = Configuration(access_token=line_access_token)
handler = WebhookHandler(line_secret)

app = Flask(__name__)

def get_latest_article():
    url = 'https://ccc.technews.tw/'
    response = requests.get(url)
    if response.status_code != 200:
        return "Unable to access the URL."
    else:
        soup = BeautifulSoup(response.text, 'html.parser')
        latest_article = soup.find('div', class_='content')
        if latest_article:
            title = latest_article.find('h1', class_='entry-title').text.strip()
            link = latest_article.find('a')['href']
            images = latest_article.find_all('img')
            if images:
                last_image = images[-1]['src']
                return f"Latest article title: {title}\nLatest article link: {link}\nLast image: {last_image}"
            else:
                return f"Latest article title: {title}\nLatest article link: {link}\nThis article has no image."
        else:
            return "Latest article not found."

@app.route("/callback", methods=['POST'])
def callback():
    signature = request.headers.get('X-Line-Signature', '')
    body = request.get_data(as_text=True)

    try:
        handler.handle(body, signature)
    except InvalidSignatureError:
        abort(400)
    return 'OK'

@handler.add(MessageEvent, message=TextMessageContent)
def handle_message(event):
    text = event.message.text
    if text.lower() == "latest":
        latest_info = get_latest_article()
        with ApiClient(configuration) as api_client:
            line_bot_api = MessagingApi(api_client)
            line_bot_api.reply_message_with_http_info(
                ReplyMessageRequest(
                    reply_token=event.reply_token,
                    messages=[TextMessage(text=latest_info)]
                )
            )

if __name__ == "__main__":
    app.run(port=port)

kuan35 commented May 23, 2024

Output:

* Serving Flask app '__main__'
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000/
Press CTRL+C to quit
127.0.0.1 - - [23/May/2024 23:07:05] "POST / HTTP/1.1" 404 -

[Screenshot 2024-05-23 233918]

kuan35 commented May 23, 2024

Original scraper code:

import requests
from bs4 import BeautifulSoup

url = 'https://ccc.technews.tw/'

response = requests.get(url)
if response.status_code != 200:
    print("Unable to access the URL.")
else:
    soup = BeautifulSoup(response.text, 'html.parser')
    latest_article = soup.find('div', class_='content')

    if latest_article:
        title = latest_article.find('h1', class_='entry-title').text.strip()
        link = latest_article.find('a')['href']
        print(f"Latest article title: {title}")
        print(f"Latest article link: {link}")

        images = latest_article.find_all('img')
        if images:
            last_image = images[-1]
            print("Last image:", last_image['src'])
        else:
            print("This article has no image.")
    else:
        print("Latest article not found.")
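The scraper above prints its results, whereas the bot needs a value it can send back. Here is a minimal sketch of one way to split extraction from formatting. It assumes the same `div.content` / `h1.entry-title` selectors used above. `SAMPLE_HTML`, `parse_latest_article`, and `format_reply` are hypothetical names made up for illustration, and the sample markup is invented, so the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page may differ.
SAMPLE_HTML = """
<div class="content">
  <h1 class="entry-title"><a href="https://example.com/post-1">Sample post</a></h1>
  <img src="https://example.com/a.jpg">
  <img src="https://example.com/b.jpg">
</div>
"""


def parse_latest_article(html):
    """Return a dict with title, link and image, or None if no article block is found."""
    soup = BeautifulSoup(html, "html.parser")
    article = soup.find("div", class_="content")
    if article is None:
        return None
    # Each lookup may come back empty, so check before dereferencing.
    title_tag = article.find("h1", class_="entry-title")
    link_tag = article.find("a", href=True)
    images = article.find_all("img", src=True)
    return {
        "title": title_tag.get_text(strip=True) if title_tag else None,
        "link": link_tag["href"] if link_tag else None,
        "image": images[-1]["src"] if images else None,
    }


def format_reply(article):
    """Build the text that would be sent back to the user."""
    if article is None:
        return "Latest article not found."
    lines = [f"Title: {article['title']}", f"Link: {article['link']}"]
    if article["image"]:
        lines.append(f"Image: {article['image']}")
    else:
        lines.append("This article has no image.")
    return "\n".join(lines)


print(format_reply(parse_latest_article(SAMPLE_HTML)))
```

Returning a dict rather than a preformatted string keeps the parsing testable against saved HTML without any network access, and leaves the message layout to a separate function.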

kuan35 commented May 23, 2024

The rest of the integration setup is the same as what the teacher showed in class.

joshhu (Owner) commented May 23, 2024

The code is too messy to read. Please format your question first; look up how to write Markdown online. Thanks.

kuan35 commented May 23, 2024

Updated. Thank you for the suggestion.
👍

kuan35 changed the title from "How to connect data scraped from a website to linebot" to "How to connect data scraped from a website to Linebot" on May 23, 2024

joshhu commented May 24, 2024

The first line of the program has an error: there is an extra backtick (`). Also, change the following line:

@app.route("/callback", methods=['POST'])

to

@app.route("/", methods=['POST'])

After these changes the program runs successfully.

testcrawler.zip
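The log posted earlier shows a `POST /` answered with 404, while the only registered route was `/callback`, so the request path and the route path did not match. A minimal sketch of that behaviour using Flask's test client: the handler here is a stub with no LINE signature check, which is an assumption made to keep the example self-contained.

```python
from flask import Flask

app = Flask(__name__)


# Route registered at /callback, as in the code from the question.
@app.route("/callback", methods=["POST"])
def callback():
    return "OK"


client = app.test_client()

# No route is registered for "/", so a POST there gets 404.
root_status = client.post("/").status_code
# A POST to /callback reaches the stub handler.
callback_status = client.post("/callback").status_code

print(root_status, callback_status)
```

Changing the route to `/` makes the first request succeed. Presumably, keeping `/callback` and pointing the webhook URL at that path would work as well, though the thread only confirms the former.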

kuan35 commented May 24, 2024

It worked! Thank you, teacher 👍

joshhu commented May 24, 2024

OK, I'll mark this issue as closed. Thanks.

joshhu closed this as completed on May 24, 2024