Surfer is a digital footprint exporter, designed to aggregate all your personal data from various online platforms into a single folder.
Currently, your personal data is scattered across hundreds of platforms, and the companies operating them have no incentive to give it back to you. Surfer solves this problem by navigating to each website and scraping your data from it.
We believe that personal data aggregation is the key to enabling truly useful, universal personal assistants.
- Twitter Posts
- Twitter Bookmarks
- LinkedIn Profile
- GitHub Repositories
- YouTube
- Notion
- ChatGPT History
- Gmail
- iMessages (coming soon!)
- Reddit (coming soon!)
- Click on "Export" to initiate the data extraction process.
- The app waits for the target page to load completely.
- The app checks whether the user is signed in to the platform being scraped.
  - If not signed in, the user is prompted to sign in.
  - If signed in, the process continues.
- Once signed in, the app interacts with the platform's user interface.
- The app then scrapes the user's data from the platform.
- Finally, the extracted data is exported and saved to your local storage.
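The steps above can be sketched as a small async routine. This is a minimal illustration, not Surfer's actual implementation: the `PlatformAdapter` interface, `runExport`, `makeFakePlatform`, and the `save` callback are all hypothetical names introduced here, and the fake platform stands in for a real browser session.

```typescript
// Hypothetical adapter: none of these names come from Surfer's codebase.
interface PlatformAdapter {
  name: string;
  waitForLoad(): Promise<void>;   // resolves once the target page has loaded
  isSignedIn(): Promise<boolean>; // checks the signed-in state
  promptSignIn(): Promise<void>;  // asks the user to sign in
  scrape(): Promise<string[]>;    // interacts with the UI and collects data
}

// Runs the export steps in order and returns the scraped items.
async function runExport(
  platform: PlatformAdapter,
  save: (name: string, items: string[]) => Promise<void>,
): Promise<string[]> {
  await platform.waitForLoad();

  if (!(await platform.isSignedIn())) {
    await platform.promptSignIn();
    // Re-check so we never scrape a logged-out page.
    if (!(await platform.isSignedIn())) {
      throw new Error(`Sign-in to ${platform.name} was not completed`);
    }
  }

  const items = await platform.scrape();
  await save(platform.name, items); // export to local storage
  return items;
}

// In-memory stand-in used to exercise the flow without a browser.
function makeFakePlatform(signedIn: boolean): PlatformAdapter {
  let loggedIn = signedIn;
  return {
    name: "Twitter",
    waitForLoad: async () => {},
    isSignedIn: async () => loggedIn,
    promptSignIn: async () => {
      loggedIn = true;
    },
    scrape: async () => ["Twitter Post 1", "Twitter Post 2"],
  };
}
```

Keeping the platform behind an interface like this is one way to let each new platform supply its own sign-in check and scraping logic while the overall flow stays the same.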
{
  "platform_name": "X Corp",
  "name": "Twitter",
  "runID": "twitter-001-1724267514217",
  "timestamp": 1724267623318,
  "content": [
    "Twitter Post 1",
    "Twitter Post 2",
    "Twitter Post 3",
    ...
  ]
}
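For readers who want to consume these exports, the record above could be modeled roughly as follows. The field names come from the sample; the types are inferred from it. The `runID` layout (lowercased name, three-digit sequence, start time in milliseconds) and the reading of `timestamp` as the completion time are guesses based on the sample values, not documented behavior. `makeRunID` and `buildRecord` are hypothetical helpers.

```typescript
// Field names mirror the sample record; types are inferred from it.
interface ExportRecord {
  platform_name: string;
  name: string;
  runID: string;
  timestamp: number;
  content: string[];
}

// Assumed layout: <lowercased name>-<3-digit sequence>-<start time in ms>.
function makeRunID(name: string, sequence: number, startedAtMs: number): string {
  const digits = String(sequence);
  const padded = digits.length >= 3 ? digits : ("000" + digits).slice(-3);
  return `${name.toLowerCase()}-${padded}-${startedAtMs}`;
}

// Assembles a record; `finishedAtMs` is assumed to be what `timestamp` stores.
function buildRecord(
  company: string,
  name: string,
  sequence: number,
  startedAtMs: number,
  finishedAtMs: number,
  content: string[],
): ExportRecord {
  return {
    platform_name: company,
    name,
    runID: makeRunID(name, sequence, startedAtMs),
    timestamp: finishedAtMs,
    content,
  };
}

const record = buildRecord(
  "X Corp",
  "Twitter",
  1,
  1724267514217,
  1724267623318,
  ["Twitter Post 1", "Twitter Post 2", "Twitter Post 3"],
);

// Pretty-printed JSON, ready to be written to a file in the export folder.
const serialized = JSON.stringify(record, null, 2);
```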
To download the app, head over to https://surfsup.ai or go to the releases page.
For instructions on setting up the app locally and contributing to the project, please refer to the Contributing Guidelines, Helper Functions Documentation, and Guide to Adding New Platforms.
See the open issues for a full list of proposed features (and known issues).
- Data maintained and updated every day
- Scheduled exports
- Obtain a code signing certificate for Windows
- Replace setTimeout with await for script execution to ensure elements exist before scraping
- Implement robust error handling for the scraping process
- Add support for more online platforms
- Add verbosity to runs
- Implement concurrent scraping to allow for multiple scraping jobs to run simultaneously
- Add knowledge graphs, chatting with data, visualizations, etc.
- Add sub-tasks within platforms (e.g. Twitter Bookmarks, LinkedIn Connections Data, etc.)
- Integrate with other agentic frameworks like LangChain for advanced personal AI assistants
- Explore integration with wearable devices for enhanced personal data tracking and acknowledgment
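As an illustration of the roadmap item about replacing setTimeout with await, here is one possible shape for an awaitable wait helper. It is a sketch only: `waitFor` is a hypothetical name, the default timeout and polling interval are arbitrary, and the project may settle on a different approach (for example, a MutationObserver). The helper still uses `setTimeout` internally, but only to space out polls; callers await the result instead of guessing a fixed delay.

```typescript
// Polls `probe` until it returns a non-null value or the timeout elapses.
// Hypothetical helper; defaults are arbitrary.
async function waitFor<T>(
  probe: () => T | null | undefined,
  timeoutMs: number = 5000,
  intervalMs: number = 100,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const value = probe();
    if (value !== null && value !== undefined) {
      return value; // the element (or other value) now exists
    }
    if (Date.now() >= deadline) {
      throw new Error(`Timed out after ${timeoutMs} ms`);
    }
    // Yield to the event loop before probing again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Example use inside a page script (the selector is illustrative):
//   const timeline = await waitFor(() => document.querySelector("article"));
```

Because the helper rejects on timeout, a scraper can catch that error and report a clear failure instead of silently scraping an empty page.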
Distributed under the MIT License. See LICENSE for more information.
Surfer Discord Server - @SahilLalani0 - @JackBlair87 - @T0M_3D
Project Link: https://github.com/CEREBRUS-MAXIMUS/Surfer-Data