-
Notifications
You must be signed in to change notification settings - Fork 10.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Removed firecrawl-py, fixed and improved firecrawl tool #5896
Conversation
Fireclaw-py uses APGL license |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a preliminary review of your code and found a few issues. Could you please address them first? Also, I would prefer if we could use type annotations in the method parameters.
I forgot to make the changes. After running it successfully, I submitted it directly and will make the changes in three hours
---- Replied Message ----
From ***@***.***> Date 07/03/2024 09:03 To ***@***.***> Cc ***@***.***>***@***.***> Subject Re: [langgenius/dify] Removed firecrawl-py, fixed and improved firecrawl tool (PR #5896)
@laipz8200 requested changes on this pull request.
I did a preliminary review of your code and found a few issues. Could you please address them first? Also, I would prefer if we could use type annotations in the method parameters.
In api/core/tools/provider/builtin/firecrawl/firecrawl_appx.py:
class FirecrawlApp: - def __init__(self, api_key=None, base_url=None): + def __init__(self, api_key=None, api_url=None):
There's no need to change the variable names; the original ones are more suitable.
In api/core/tools/provider/builtin/firecrawl/firecrawl_appx.py:
self.api_key = api_key - self.base_url = base_url or 'https://api.firecrawl.dev' - if self.api_key is None and self.base_url == 'https://api.firecrawl.dev': - raise ValueError('No API key provided') + if not self.api_key:
The self-hosted version doesn't require an API key. Making this change could cause errors.
In api/core/tools/provider/builtin/firecrawl/tools/crawl.py:
crawl_result = app.crawl_url( url=tool_parameters['url'], - params=options,
Where are we handling options now?
In api/core/tools/provider/builtin/firecrawl/firecrawl_appx.py:
- def _prepare_headers(self): - return { - 'Content-Type': 'application/json', - 'Authorization': f'Bearer {self.api_key}' - } - - def _post_request(self, url, data, headers): - return requests.post(url, headers=headers, json=data) + def crawl_url(self, url, wait=False, poll_interval=5, idempotency_key=None, **kwargs): + endpoint = f'{self.api_url}/v0/crawl' + headers = self._prepare_headers(idempotency_key) + data = {'url': url, **kwargs} + response = self._request('POST', endpoint, data, headers) + job_id = response['jobId'] # 确保使用正确的键名
Please avoid using non-English comments in the code.
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you authored the thread.Message ID: ***@***.***>
|
Please resolve this comments by clicking those buttons. |
ok |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* refs/heads/fix/dataset_operator: (33 commits) feat: update dataset sort feat: add dataset_permissions tenant_id chore: optimize memory fetch performance (#6039) feat: support moonshot and glm base models for volcengine provider (#6029) Optimize db config (#6011) fix: token count includes base64 string of input images (#5868) chore: skip pip upgrade preparation in api dockerfile (#5999) feat(*): Swtich to dify_config. (#6025) fix: the input field of tool panel not worked as expected (#6003) Add 2 firecrawl tools : Scrape and Search (#6016) test(test_rerank): Remove duplicate test cases. (#6024) chore: optimize memory messages fetch count limit (#6021) Revert "feat: knowledge admin role" (#6018) feat: add Llama 3 and Mixtral model options to ddgo_ai.yaml (#5979) fix: add status_code 304 (#6000) 6014 i18n add support for spanish (#6017) [Feature] Support loading for mermaid. (#6004) fix: update workflow trace query (#6010) Removed firecrawl-py, fixed and improved firecrawl tool (#5896) fix API tool's schema not support array (#6006) ...
Checklist:
Important
Please review the checklist below before submitting your pull request.
dev/reformat
(backend) andcd web && npx lint-staged
(frontend) to appease the lint godsDescription
Improved code, str will appear in the error Fixes #5897
Type of Change
Testing Instructions
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Test A Run docker,use firecrawl,url is https://www.163.com/news/article/J64CN5IM000189FH.html?clickfrom=w_yw
Test B