Releases · strohne/Facepager

17 Aug 08:54

strohne

v4.5.3

12b0068

Version 4.5.3 Latest


Please be aware of changing access restrictions implemented by the platforms. If something does not work as expected, first read the reference of the specific API. You will find links to the references in the presets.

The latest Mac build is not yet ready; you'll find the previous version below.

Latest changes:

Application rate limit handling in the Facebook module. The speed is throttled to one request per minute as soon as 95% of the rate limit is reached, and returns to full speed once the request rate has calmed down.
Signed installer for Mac. This simplifies the installation, though it is not yet perfect (notarization is still an issue).
Added first, last, min, and max modifiers to pull out IDs from the fetched data. For example, in the pagination setup for Mastodon: *.id|last.
Bug fix in the transfer nodes function. The function can be used in crawling scenarios to create new seed nodes from already fetched nested data.
Bug fix in the re modifier. The re modifier can be used to extract data with regular expressions. For example, in the column setup you can use snippet.description|re:#[^\W]+ to extract hashtags from the snippet.description field.
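To see what the pattern from the re modifier example matches, here is a minimal sketch using Python's standard `re` module. It only illustrates the regular expression itself; the function name and the sample text are made up for this example, and Facepager's own handling of the modifier may differ.

```python
import re

# Pattern taken from the release note: "#" followed by one or more
# word characters ([^\W] is equivalent to \w).
HASHTAG_PATTERN = re.compile(r"#[^\W]+")


def extract_hashtags(text):
    """Return all hashtags found in text (hypothetical helper)."""
    return HASHTAG_PATTERN.findall(text)


# Sample description text, invented for illustration.
description = "New video on #datascience and #web_scraping!"
hashtags = extract_hashtags(description)
print(hashtags)  # ['#datascience', '#web_scraping']
```

Because `\w` includes the underscore, `#web_scraping` is matched as a whole, while the trailing exclamation mark is left out.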

Sorry, the binaries are huge due to lots of functionality from PySide behind the scenes. Feel free to contact me if you have ideas for improving the packages.


28 Aug 11:03

strohne

v4.4.4

793ae72

Version 4.4.4

Note that the binary for Mac is delayed and will be updated later. If this release crashes on your computer, try an older version!

Please be aware of changing access restrictions implemented by the platforms. Preregistered access is volatile. Due to changes in the Facebook API you can't request metadata about pages with the preregistered access anymore. Fetching posts still works.

Latest changes:

Load multiple API docs with different basepaths
Refactored settings handling to support Crowdtangle and Twitter v2 (see the Presets and the API viewer)
Transfer nodes feature for crawling
Timestamp conversion modifier, see https://github.com/strohne/Facepager/wiki/Webscraping#supported-modifiers
Revised token response workflow for OAuth (e.g. for using VKontakte)
Signed paginated URLs in Amazon module
Markdown formatting in the preset window
Bug fixes

Sorry, the binaries are huge due to the awesome builtin web browser (Qt for Python). If you have ideas how to improve the installers, feel free to contact me.


21 Nov 14:55

strohne

v4.3.10

bf29ee5

Version 4.3.10

Please note: the release date shown on GitHub doesn't reflect the latest changes. You will find newer releases, published after November 2020, in the download section below.

Latest changes:

Load multiple API docs with different basepaths
Fix decoding issue
Fix "content already consumed" error
Option to keep complete data in the offcut node (useful for webscraping)
Fixed Facebook login - the red success error message over which some have stumbled should be history :)
Maximum size parameter to cap large downloads
Adjusted the Generic module to support Twitter API v2 (for the academic research track)
Google announced that the YouTube login procedure will change in January 2021. Embedded browsers will not be supported anymore; instead, users must log in using the system browser. You'll find the option "External OAuth2" in the settings. If the login to YouTube fails, try external OAuth2.
New login process. You will be presented with two options: use the preregistered Facepager app or your own app. When using the Facepager app, we need to maintain an anonymized user list due to the API providers' terms. You will find an explanation in the privacy policy of Facepager.
Experimental screenshot and rendered HTML feature (see Preset)
HEAD verb in Generic Module. Useful for resolving shortlinks or redirects. See the preset in the scraping category.
Updated SSL certificate.
Updated internal browser
Under the hood refactorings.
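The idea behind the HEAD verb (resolving shortlinks or redirects without downloading the page body) can be sketched with Python's standard `urllib`. This is not Facepager's implementation; the function names and the timeout value are assumptions chosen for illustration.

```python
import urllib.request


def build_head_request(url):
    """Create a request that asks only for headers, not the body."""
    return urllib.request.Request(url, method="HEAD")


def resolve_final_url(url, timeout=10):
    """Return the URL reached after following redirects (hypothetical helper).

    urllib follows redirects automatically. Depending on the Python
    version, the follow-up requests may be sent as GET instead of HEAD.
    """
    with urllib.request.urlopen(build_head_request(url), timeout=timeout) as response:
        return response.geturl()
```

Calling `resolve_final_url` with a shortlink would return the address it redirects to; it needs network access, so it is not executed here.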

Sorry, the binaries are huge. You always want the latest technology, don't you?

Note: Facepager is under perpetual reconstruction. Keep an eye on the status log. If you encounter any bugs or black cats, update to the latest version and report in the issues section.


14 May 06:59

strohne

v4.2.22

7ea7669

Version 4.2.22

Note: Facepager is under perpetual reconstruction. Keep an eye on the status log. If you encounter any bugs or black cats, update to the latest version and report in the issues section.

Latest changes:

Open the browser with a URL generated from the query settings: hold the control key and click Fetch data
Select style in the settings (maybe try out Fusion style on Mac?)
Move some settings into an extra window; these settings are saved when closing Facepager
Update preset version number
Fix timer function
Minor improvements and bug fixes


29 Mar 00:27

strohne

v4.2.16

3276d4f

Version 4.2.16

Note: Facepager is under perpetual reconstruction. Keep an eye on the status log. If you encounter any bugs or black cats, update to the latest version and report in the issues section.

Latest changes:

Find nodes function
Progress indicator when adding large amounts of nodes
Fixed missing icons
Open db files with command line (Windows: right click the file, open with Facepager)
Faster delete (not that fast, though)
Extract data for multiple nodes
Parse Twitter dates using a new "shortdate"-modifier, e.g. in the column setup add created_at|shortdate. This converts dates such as "Tue Mar 29 08:11:25 +0000 2011" to ISO 8601 dates such as "2011-03-29T08:11:25+00:00"
Parse JavaScript with the js-modifier. This is useful for extracting data from JavaScript that is embedded in HTML. First extract the script tags, then pipe the content into the js-modifier and select the property you want to extract. For example, in the Generic module use the Extract data function with the following key: text|xpath://script/text()|js:fancydata
Fetch data for multiple nodes in Twitter Streaming module (by increasing the threads)
Support multiple identical parameter names
Categories in the preset window are now ordered alphabetically, filenames are changed when a preset is renamed
Minor improvements
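The date conversion described for the shortdate modifier can be reproduced with Python's standard library. This is a minimal sketch of the conversion only, not Facepager's code; the function and constant names are made up, and parsing of the English day and month names assumes the default C/English locale.

```python
from datetime import datetime

# Format of Twitter's created_at field, e.g. "Tue Mar 29 08:11:25 +0000 2011".
TWITTER_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def shortdate_to_iso(value):
    """Convert a Twitter-style date string to ISO 8601 (hypothetical helper)."""
    return datetime.strptime(value, TWITTER_DATE_FORMAT).isoformat()


iso_date = shortdate_to_iso("Tue Mar 29 08:11:25 +0000 2011")
print(iso_date)  # 2011-03-29T08:11:25+00:00
```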


24 Mar 09:30

strohne

v4.2.8

0c4b4cc

Version 4.2.8

Note: At the moment, Facepager is under heavy reconstruction and a bunch of features are under development. Keep an eye on the status log. If you encounter any bugs or black cats, update to the latest version and report in the issues section. After updating to a new version, reinstalling default API definitions from GitHub may be necessary: start Facepager, wait for the message in the status log, restart Facepager.

Latest changes:

Webscraping features in the Generic Module: Set the response format to "text" and you will find the HTML source code of downloaded pages in the text property. Then, you can use CSS selectors, XPath and regular expressions to extract data. See the wiki for a very brief explanation.
Preview in the Extract data dialog. This will greatly help you with webscraping. Type keys such as text|xpath://a and you will directly see the HTML content of all a-elements. Clicking Apply creates new nodes. Or you can develop your keys here and enter them into the column setup or even in the placeholders for further fetching actions.
Renaming of keys in the column setup or when extracting data. Prefix your key with newname=, for example links=text|xpath://@href will save all links contained in the text property under the new key links.
Resume canceled data collection, even with pagination. See the tooltip of the Resume collection checkbox.
Option to stop pagination based on data from a request (e.g. stop if the value "hasnextpage" is empty)
Login using cookies: set Authorization to "header" and Name to "Cookie". Click the settings button next to the login button, choose "Cookie", add the URL of the website, then click Login. After logging into the website, the cookies are transferred to the access token field. Close the login window.
Detect rate limit in Generic module (status 429)
Timestamp modifier |timestamp in keys converts timestamp to date & time.
API key support in YouTube module
Improved error logging (request errors won't stop the whole process, error nodes are created instead)
Separate all connections. Each request gets its own session now.
Support empty keys for extracting the object ID (e.g. to get IDs of Twitter followers or friends).
Pro tip: You can build a pipeline by creating multiple presets in the same category and then apply the category.
Bug fix: bring tool windows to front on OSX (preset window, api viewer, extract data dialog)
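The key syntax used in the items above (an optional newname= prefix, a source property, and modifiers chained with |) can be illustrated with a small parser. This is only a sketch of how such keys decompose, not Facepager's actual parser; the function name and the returned structure are assumptions, and patterns that themselves contain a | character are not handled.

```python
import re

# Optional "newname=" prefix: a plain identifier followed by "=".
NAME_PREFIX = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=(.*)$", re.S)


def parse_key(key):
    """Split a key into new name, source property, and modifiers (hypothetical helper)."""
    name = None
    match = NAME_PREFIX.match(key)
    if match:
        name, key = match.group(1), match.group(2)

    # The first segment is the source property, the rest are modifiers.
    source, *steps = key.split("|")

    modifiers = []
    for step in steps:
        # "xpath://@href" -> ("xpath", "//@href"); "timestamp" -> ("timestamp", "")
        modifier, _, argument = step.partition(":")
        modifiers.append((modifier, argument))

    return {"name": name, "source": source, "modifiers": modifiers}


parsed = parse_key("links=text|xpath://@href")
print(parsed)
```

For the key `links=text|xpath://@href`, the sketch yields the new name `links`, the source property `text`, and one modifier `xpath` with the argument `//@href`.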


02 Jan 00:50

strohne

v4.1.7

3540b7b

Version 4.1.7

Latest changes:

Experimental webscraping features: In the Generic Module, if you set the response format to "text", you will find the HTML source code of downloaded pages in the text property. Then, you can use CSS selectors and XPath to extract data. See the wiki for a very brief explanation.
Improved handling of Facebook app rate limit
Support paging in URL path: set the paging param to the placeholder <page> and use the placeholder in the base path or resource field. Useful for AJAX-based paging of websites (e.g. 'load more' buttons)
Add tooltips in nodes view
Bug fix in Twitter and YouTube paging mechanism
Bug fix in export mechanism
Bug fix in Twitter app-only login (necessary for Premium API)
Improved speed of the file handler (file://), broken releases fixed at v4.1.7
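Paging in the URL path, as described above, amounts to substituting a page number for the <page> placeholder in each request. A minimal sketch, assuming a hypothetical URL template and helper name (Facepager performs this substitution internally):

```python
def expand_paged_urls(template, first_page, last_page, placeholder="<page>"):
    """Return one URL per page by filling in the placeholder (hypothetical helper)."""
    return [
        template.replace(placeholder, str(page))
        for page in range(first_page, last_page + 1)
    ]


# Example template, invented for illustration.
urls = expand_paged_urls("https://example.com/articles/page/<page>", 1, 3)
for url in urls:
    print(url)
```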


31 Oct 11:35

strohne

v4.0.10

a04cac6

Version 4.0.10

Latest changes:

Facepager module supports login for own Facebook pages: 1. click settings, 2. add page ID (last part of the URL), 3. login.
API definitions and presets are both downloaded from GitHub and may be updated with a reload button
Scrape links out of webpages and crawl the web (response format option 'link')
Save response data to files. All functions of the former Files module are incorporated into the Generic module. Just specify download folder and file name, set response format to file. All responses including json are downloaded to the folder.
Removed wide format export option. If you need this, see https://github.com/strohne/Facepager/wiki/Data-Analysis
Load CSV files with additional data as seed nodes
Minor bug fixes


15 Apr 21:27

strohne

v4.0.4

332dfd7

Version 4.0.4

Updated on 2019-08-29.

Changes on the surface:

The Generic module can process XML, e.g. for downloading RSS-feeds. See the arxiv.org preset for an example.
The Generic module can convert HTML to JSON. Thus, Facepager can be used for very simple webscraping tasks.
The Generic module can save arbitrary text data. Why? I needed to download millions of small files, they messed up my file system. Storing them in a database is much more convenient.
Introduced an API viewer, based on OpenAPI. You can plug in your own OpenAPI files.
Faster selection of nodes, fixed hanging interface for large amounts of nodes.
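To give an idea of what processing XML such as an RSS feed into a JSON-like structure involves, here is a minimal sketch using Python's standard library. It is not the conversion Facepager uses; the function name and the sample feed are invented, and XML attributes are ignored for brevity.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Recursively convert an XML element to nested dicts, lists and strings."""
    children = list(element)
    if not children:
        return (element.text or "").strip()

    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # Repeated tags (e.g. several <item> elements) become a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


# Tiny sample feed, invented for illustration.
RSS = """<rss><channel><title>Example feed</title>
<item><title>First</title></item>
<item><title>Second</title></item>
</channel></rss>"""

root = ET.fromstring(RSS)
feed = {root.tag: element_to_dict(root)}
print(json.dumps(feed, indent=2))
```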

Recent changes under the hood:

Updated from Python2 to Python3
Updated from PySide1 to PySide2
Updated documentation to use OpenAPI format

Caveats:

Facebook announced that it will close access to the public pages API. The Facepager module will probably stop working in a few days.
Google introduced restricted and sensitive scopes; restrictions are announced for October.
Packaging the Mac version cost a great deal of nerves. Why is it worth the hassle? One of the primary goals of Facepager is to help people learn about automated methods. The documentation lags a bit, but we are working on it. Feel free to provide presets or to contribute to the wiki. Find out what works and see the limitations. Trial and error.
The resulting OSX file is huge, sorry for that. The internal webbrowser (QtWebView) blows up the file size.


06 Oct 15:22

strohne

v3.10.2

0985ef3

Version 3.10.2

See installation hints on https://github.com/strohne/Facepager#installer

New features in v3.10:

Generic module and Files module come with OAuth2 now.
POST and PUT requests in Generic module and Files module
Upload files with placeholders <filename|file> or <filename|file|base64> (replace "filename" with the filename and select the folder in the settings)
Upload multipart/form-data. How to format the data will soon be documented in the Wiki (JSON with name-value pairs).
Convert XML or HTML responses to JSON. This way Facepager goes beyond JSON APIs, e.g. for using Amazon.
Amazon module (experimental)
Custom categories for presets to improve the ordering.
Improved handling of rate limits (automatic retry)
Some convenience improvements
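The <filename|file|base64> placeholder suggests that the file content is read from the selected folder and Base64-encoded before upload. A minimal sketch of that encoding step, assuming a hypothetical helper name and a temporary demo file (Facepager's own placeholder handling may differ):

```python
import base64
import tempfile
from pathlib import Path


def encode_file_base64(folder, filename):
    """Read a file from folder and return its content as Base64 text (hypothetical helper)."""
    data = (Path(folder) / filename).read_bytes()
    return base64.b64encode(data).decode("ascii")


# Demo with a temporary folder and file, created only for this example.
with tempfile.TemporaryDirectory() as folder:
    (Path(folder) / "hello.txt").write_bytes(b"hello")
    encoded = encode_file_base64(folder, "hello.txt")

print(encoded)  # aGVsbG8=
```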

It took quite a long time to refactor the code for these features. Why is it worth the effort? Facepager is on its way to becoming a versatile cloud computing tool. You can now connect to Google Cloud Console. Try out the Getting Started for speech recognition: https://github.com/strohne/Facepager/wiki/Getting-Started-with-Google-Cloud-Platform

The macOS version is tested with High Sierra and probably will not work with older versions. Sorry for that.

