How to connect scraped website data to Linebot #10

kuan35 · 2024-05-23T15:18:27Z

When the user types latest, the Linebot should send the scraped article's link, title, and image to the user.

kuan35 · 2024-05-23T15:20:23Z

The code is below.

import` os
import requests
from bs4 import BeautifulSoup
from flask import Flask, request, abort
from linebot.v3 import WebhookHandler
from linebot.v3.exceptions import InvalidSignatureError
from linebot.v3.messaging import Configuration, ApiClient, MessagingApi, ReplyMessageRequest, TextMessage
from linebot.v3.webhooks import MessageEvent, TextMessageContent

line_access_token = os.environ.get('LINE_ACCESS_TOKEN', 'your_default_access_token')
line_secret = os.environ.get('LINE_SECRET', 'your_default_secret')
port = int(os.environ.get('PORT', 5000))  # Default to port 5000 if PORT is not set

configuration = Configuration(access_token=line_access_token)
handler = WebhookHandler(line_secret)

app = Flask(__name__)

def get_latest_article():
    """Scrape the newest article's title, link, and last image from the home page."""
    url = 'https://ccc.technews.tw/'
    response = requests.get(url)
    if response.status_code != 200:
        return "Unable to access the URL."
    else:
        soup = BeautifulSoup(response.text, 'html.parser')
        # The first div with class "content" is treated as the latest article
        latest_article = soup.find('div', class_='content')
        if latest_article:
            title = latest_article.find('h1', class_='entry-title').text.strip()
            link = latest_article.find('a')['href']
            images = latest_article.find_all('img')
            if images:
                last_image = images[-1]['src']
                return f"Latest article title: {title}\nLatest article link: {link}\nLast image: {last_image}"
            else:
                return f"Latest article title: {title}\nLatest article link: {link}\nThis article has no images."
        else:
            return "Could not find the latest article."

@app.route("/callback", methods=['POST'])
def callback():
    signature = request.headers.get('X-Line-Signature', '')
    body = request.get_data(as_text=True)

    try:
        handler.handle(body, signature)
    except InvalidSignatureError:
        abort(400)
    return 'OK'

@handler.add(MessageEvent, message=TextMessageContent)
def handle_message(event):
    text = event.message.text
    if text.lower() == "latest":
        latest_info = get_latest_article()
        with ApiClient(configuration) as api_client:
            line_bot_api = MessagingApi(api_client)
            line_bot_api.reply_message_with_http_info(
                ReplyMessageRequest(
                    reply_token=event.reply_token,
                    messages=[TextMessage(text=latest_info)]
                )
            )

if __name__ == "__main__":
    app.run(port=port)
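
For context on the callback handler above: handler.handle rejects any request whose X-Line-Signature header does not match the request body. A minimal sketch of that check, assuming the documented scheme (a Base64-encoded HMAC-SHA256 digest of the raw body, keyed with the channel secret). compute_signature and is_valid_signature are hypothetical helper names for illustration, not SDK functions.

```python
import base64
import hashlib
import hmac


def compute_signature(channel_secret: str, body: str) -> str:
    """Return the Base64-encoded HMAC-SHA256 digest of the request body."""
    digest = hmac.new(
        channel_secret.encode("utf-8"),
        body.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    return base64.b64encode(digest).decode("utf-8")


def is_valid_signature(channel_secret: str, body: str, signature: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = compute_signature(channel_secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


if __name__ == "__main__":
    secret = "your_default_secret"  # placeholder value, not a real secret
    body = '{"events": []}'
    good = compute_signature(secret, body)
    print(is_valid_signature(secret, body, good))        # True
    print(is_valid_signature(secret, body, "tampered"))  # False
```

A failed check is what raises InvalidSignatureError in the handler, which the route turns into a 400 response.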

kuan35 · 2024-05-23T15:34:15Z

Output

* Serving Flask app '__main__'
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000/
Press CTRL+C to quit
127.0.0.1 - - [23/May/2024 23:07:05] "POST / HTTP/1.1" 404 -

kuan35 · 2024-05-23T15:34:58Z

Original scraper code

import requests
from bs4 import BeautifulSoup

url = 'https://ccc.technews.tw/'

response = requests.get(url)
if response.status_code != 200:
    print("Unable to access the URL.")
else:
    soup = BeautifulSoup(response.text, 'html.parser')

    # The first div with class "content" is treated as the latest article
    latest_article = soup.find('div', class_='content')

    if latest_article:
        title = latest_article.find('h1', class_='entry-title').text.strip()
        link = latest_article.find('a')['href']
        print(f"Latest article title: {title}")
        print(f"Latest article link: {link}")

        images = latest_article.find_all('img')
        if images:
            last_image = images[-1]
            print("Last image:", last_image['src'])
        else:
            print("This article has no images.")
    else:
        print("Could not find the latest article.")
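
The scraper above only produces the image URL as text. If the image itself should reach the user, one option is to reply with a text message plus an image message. A minimal sketch, assuming the Messaging API's JSON message objects (type, text, originalContentUrl, previewImageUrl) and its requirement that image URLs use HTTPS. build_messages is a hypothetical helper, and the paths in the usage example are made up.

```python
from typing import Optional
from urllib.parse import urljoin, urlparse

BASE_URL = "https://ccc.technews.tw/"  # the page the scraper fetches


def build_messages(title: str, link: str, image_src: Optional[str]) -> list:
    """Build message objects (as plain dicts) from the scraped fields."""
    # Resolve relative hrefs against the scraped page; absolute URLs pass through.
    link = urljoin(BASE_URL, link)
    messages = [
        {
            "type": "text",
            "text": f"Latest article title: {title}\nLatest article link: {link}",
        }
    ]
    if image_src:
        image_url = urljoin(BASE_URL, image_src)
        # Skip the image message unless the resolved URL is HTTPS.
        if urlparse(image_url).scheme == "https":
            messages.append(
                {
                    "type": "image",
                    "originalContentUrl": image_url,
                    "previewImageUrl": image_url,
                }
            )
    return messages


if __name__ == "__main__":
    for message in build_messages("Example", "/example-post/", "/wp-content/example.jpg"):
        print(message)
```

With the SDK used in this thread, the same payload would presumably be expressed with the TextMessage and ImageMessage classes rather than raw dicts.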

kuan35 · 2024-05-23T15:35:45Z

The rest of the integration is identical to what the teacher showed in class.

joshhu · 2024-05-23T17:17:55Z

The code is too messy to read. Please format your question first; search online for how to write Markdown. Thanks.

kuan35 · 2024-05-23T18:02:17Z

Updated. Thank you for the suggestion, teacher.
👍

joshhu · 2024-05-24T04:20:10Z

The first line of the program is wrong: it has an extra "`". Also, change the following:

@app.route("/callback", methods=['POST'])

to

@app.route("/", methods=['POST'])

With these changes it runs without problems.

testcrawler.zip
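
The route change matters because the log shows the webhook request arriving as POST / and getting a 404, while the only registered route was /callback. The path of the webhook URL configured for the channel and the Flask route have to agree, so changing either one should work. A small sketch of that check, using made-up webhook URLs; webhook_path and route_matches are hypothetical helpers.

```python
from urllib.parse import urlparse


def webhook_path(webhook_url: str) -> str:
    """Return the path a POST to this URL will hit ('/' when the URL has none)."""
    return urlparse(webhook_url).path or "/"


def route_matches(webhook_url: str, flask_route: str) -> bool:
    """True when the webhook URL's path equals the registered Flask route."""
    return webhook_path(webhook_url) == flask_route


if __name__ == "__main__":
    # Webhook URL without a path, route registered at /callback: mismatch (404).
    print(route_matches("https://example.invalid", "/callback"))           # False
    # Fix 1: register the route at "/".
    print(route_matches("https://example.invalid", "/"))                   # True
    # Fix 2: keep /callback and append it to the webhook URL instead.
    print(route_matches("https://example.invalid/callback", "/callback"))  # True
```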

kuan35 · 2024-05-24T06:48:59Z

It works! Thank you, teacher 👍

joshhu · 2024-05-24T09:09:33Z

OK, I'll mark this issue as closed. Thanks.

kuan35 changed the title ~~How to connect scraped website data to linebot~~ How to connect scraped website data to Linebot May 23, 2024

joshhu closed this as completed May 24, 2024
