[Bug] 'Found duplicate entities' while migrating theHive from 3.4.0-1 to 4.1.17-1 #2341

Closed
packetvitality opened this issue Feb 7, 2022 · 2 comments
Labels
bug TheHive4 TheHive4 related issues

Comments


Request Type

Bug

Work Environment

| Question | Answer |
|---|---|
| OS version (server) | Docker https://hub.docker.com/r/thehiveproject/thehive4 |
| Virtualized Env. | True |
| Dedicated RAM | 16 GB |
| vCPU | 8 |
| TheHive version / git hash | 4.1.17-1 |
| Package Type | Docker |
| Database | Cassandra |
| Index type | Lucene |
| Attachments storage | MinIO |

Problem Description

I used the migration tool to move from theHive 3.4.0-1 to 4.1.17-1. When I started theHive after the migration, I noticed many log entries reporting duplicates, for example:

[info] o.t.t.s.ImpactStatusIntegrityCheckOps [|1012ed57] Found duplicate entities:
  • ImpactStatus(NotApplicable)
  • ImpactStatus(NotApplicable)
[info] o.t.t.s.ObservableTypeIntegrityCheckOps [|1383fe4a] Found duplicate entities:
  • ObservableType(hash,false)
  • ObservableType(hash,false)
[info] o.t.t.s.ObservableTypeIntegrityCheckOps [|0924e2c1] Found duplicate entities:
  • ObservableType(fqdn,false)
  • ObservableType(fqdn,false)
[info] o.t.t.s.ObservableTypeIntegrityCheckOps [|29d7837d] Found duplicate entities:
  • ObservableType(file,true)
  • ObservableType(file,true)
[info] o.t.t.s.ResolutionStatusIntegrityCheckOps [|4908101d] Found duplicate entities:
  • ResolutionStatus(Other)
  • ResolutionStatus(Other)
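
For anyone comparing runs, it can help to count how many duplicate reports each integrity check produced. The following is a minimal sketch, assuming the log lines have the shape shown above and have been saved to a file; the function name `summarize_duplicates` is made up for illustration and is not part of TheHive.

```shell
# Count "Found duplicate entities" reports per integrity check logger.
# Assumes each report line contains a logger name ending in
# "IntegrityCheckOps", as in the excerpt above.
summarize_duplicates() {
  awk '/Found duplicate entities/ {
         # Scan the fields for the logger name and tally it.
         for (i = 1; i <= NF; i++)
           if ($i ~ /IntegrityCheckOps$/) count[$i]++
       }
       END { for (name in count) print count[name], name }' "$1" \
    | sort -k1,1nr -k2,2
}
```

Calling `summarize_duplicates thehive_migration_duplication_logs.txt` prints one line per check, with the number of reports followed by the logger name, most frequent first.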

In the GUI, these duplicates show up in the observables. The screenshot below shows duplicate selection options when creating new observables.

[Screenshot: duplicate selection options for new observables]

Steps to Reproduce

Start a clean version of theHive with no data, allow it to come up, creating database schemas, etc.

Stop theHive docker container.

Start theHive in docker without theHive service running by adding the following to my docker-compose file:
entrypoint: sleep infinity

Enter theHive as root
docker exec -it --workdir /root --user root thehive bash

Copy the logging configuration file for the migration tool
cp /opt/thehive/conf/logback-migration.xml /etc/thehive/

Start the migration tool

/opt/thehive/bin/migrate \
  --output /etc/thehive/application.conf \
  --main-organisation [org] \
  --es-uri http://[ip]:9200 \
  --es-index the_hive \
  --es-single-type true

I referenced other issues and tried with and without the --es-single-type true option.

I read through similar issues 2331, 2333, and 2334, but I am unsure how to resolve this.

thehive_migration_duplication_logs.txt

@packetvitality packetvitality added bug TheHive4 TheHive4 related issues labels Feb 7, 2022

packetvitality commented Feb 10, 2022

Re-tried using the latest version of the docker container
docker pull thehiveproject/thehive4:latest

REPOSITORY                     TAG       IMAGE ID       CREATED         SIZE
thehiveproject/thehive4        latest    7183a3524059   3 days ago      722MB

I am no longer seeing the 'Found duplicate entities' logs, but I do still see duplicate observable options. Screenshot below.
[Screenshot: duplicate observable options]

@packetvitality

I was able to resolve this by adjusting my options to drop the database while running the migration tool as follows:

/opt/thehive/bin/migrate \
  --drop-database \
  --output /etc/thehive/application.conf \
  --main-organisation [org] \
  --es-uri http://[ip]:9200 \
  --es-index the_hive \
  --es-single-type true

Prior to running the command above I also had to adjust my docker-compose file to mount the parent directory for the index folder. This allows the migration tool to delete the index folder.
./vol/thehive/opt/thp/thehive:/opt/thp/thehive
instead of:
./vol/thehive/index:/opt/thp/thehive/index
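
One possible explanation for why the narrower mount gets in the way (an assumption, not something confirmed in this thread) is that a directory which is itself a mount target generally cannot be removed from inside the container. A minimal sketch for checking this, assuming `mountpoint` from util-linux is available in the container; the function name `is_mount_point` is made up for illustration.

```shell
# Return success if the given path is itself a mount point.
# Relies on `mountpoint` (util-linux) being installed; that is an assumption.
is_mount_point() {
  mountpoint -q "$1"
}
```

Inside the container, `is_mount_point /opt/thp/thehive/index && echo "index is a mount point"` would print the message with the second volume line and stay silent with the first, since there the parent directory is the mount target.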

Once the tool had finished, I changed the ownership on my host so that all of the files created by the tool could be accessed when running as the thehive user. A better approach may have been to run the tool as the thehive user in the first place, but I am not sure whether the tool needs to be run as root.
chown -R 1000:1000 ./vol/thehive/opt/thp/thehive
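
To confirm the ownership change took effect, the directory can be scanned for anything still owned by another user. This is a minimal sketch, assuming (as the chown above does) that the thehive user maps to UID 1000 on the host; the function name `list_foreign_owned` is made up for illustration.

```shell
# Print every file or directory under DIR that is NOT owned by OWNER.
# OWNER may be a user name or a numeric UID.
list_foreign_owned() {
  dir="$1"
  owner="$2"
  find "$dir" ! -user "$owner" -print
}
```

Running `list_foreign_owned ./vol/thehive/opt/thp/thehive 1000` should print nothing once every file belongs to UID 1000.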
