All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
- Upgrade to Spark 3.2.4
- Docker setup
- Added coinbase transactions to network representations (address = 'coinbase', address_id = 0)
- Added new columns (`total_received_adj`, `total_spent_adj`) to `cluster` table (see #34)
- Fix division by zero on zero fee and value txs
- Upgrade to Spark 3.2.3
- Changed computation of `plainClusterRelations`
- Changed handling of missing exchange rate values: affected blocks/txs are removed instead of being filled with zeros
- sbt scalafmt and scala style plugin
- Standardized dev-makefile
- Added columns to `summary_statistics` table
- Upgraded to Spark 3.2.1
- Update Spark Cassandra connector to version 3.2.0
- Fixed test data ingest script
- GraphFrames based address clustering
- Removed tag handling (see graphsense/graphsense-tagpack-tool)
- Upgrade to Spark 3
- Improved Cassandra schema
- Changed package name
- Added command-line arguments to spark submit script
- Add command-line arguments for coinjoin filtering and address prefixes
- Adapted TagPack schema
- Changed schema of address/cluster relation tables
- Speed-up tests
- Made clustering deterministic with respect to the number of partitions
- Added Dockerfile for `spark-submit` job
- Updated dependencies
- Fixed tests
- Updated data model, field name changed in raw Cassandra schema (graphsense/graphsense-blocksci@c418dab)
- Changed primary key of `address_transactions` table
- Added list of transactions (max 100) to address/cluster relations
- Added tag labels to address/cluster relations table (#15)
- Store table `tag_by_label` in transformed keyspace (graphsense/graphsense-dashboard#98)
- Reintegrated clustering library
- Upgraded Scala (2.12)/Spark (2.4.5) + dependencies
- Allow NULL values in tag categories
- Exchange rates (by height) are stored in transformed keyspace
- Added new columns (`category`, `abuse`) to `cluster_tags` and `cluster` table
- Upgraded dependencies (scallop, scalatest, spark-fast-tests)
- Adjusted Cassandra schema, use integers for cluster IDs in cluster relations
- Changed model of cluster graph; every entity has now integer IDs
- Parser for command-line arguments (scallop)
- Fixed bug in cluster graph computation
- Update to Apache Spark 2.4
- Removed `srcCategory`/`dstCategory` in address/cluster relations
- Fixed data types in case classes/Cassandra schema
- Added argument to remove CoinJoin inputs in clustering
- Added `spark-fast-tests` library
- Added Travis CI
- Changed GraphSense backend to `graphsense-blocksci`
- Moved clustering code to library `graphsense-clustering`
- Added in/out degrees to tables `address`/`cluster`/`cluster_addresses`
- Added keyspace summary statistics (table `summary_statistics`)
- Added table `plain_cluster_relations`