Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Elasticsearch 7.x and 8.x indexing #282

Merged
merged 5 commits into from
Jun 5, 2022

Conversation

JuliusHenke
Copy link
Contributor

@JuliusHenke JuliusHenke commented Jun 3, 2022

Hi, thanks for this cool project. I have added support for indexing pages in ES 7 & 8. I did not update the ACHE frontend search functionality, which likely also needs changes for full compatibility. But I think people will already enjoy the indexing part, since the newer ES versions >= 7.8.0 also come with docker images supporting linux/arm64/v8 OS/architecture (crucial for M1 Macbook users).

I have tested indexing with Elasticsearch version 7.17.0 and 8.2.2.

I used the following docker-compose.yml. Note that the xpack.security config values are important for local development without any security.

version: '2'
services:
  ache:
    image: vidanyu/ache
    command: startCrawl -c /config/ -s /config/docker.seeds -o /data -e crawl-data
    ports:
    - "8080:8080"
    volumes:
    - ./data-ache/:/data
    - ./:/config
    links:
    - elasticsearch
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.2.2
    environment:
      - discovery.type=single-node
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - xpack.security.enabled=false
      - xpack.security.transport.ssl.enabled=false
      - xpack.security.http.ssl.enabled=false
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - ./data-es/:/usr/share/elasticsearch/data # elasticsearch data will be stored at ./data-es/
    ports:
      - "9200:9200"

Until this pull request is accepted you may need to build your own image based on my changes and reference it in the docker-compose.yml instead of vidanyu/ache.

I am open for changes, but currently am not planning on doing the frontend part as well.
Cheers

@aecio
Copy link
Member

aecio commented Jun 3, 2022

Thanks for the contribution. Can you please rebase on the master branch and squash the commits? For some reason, GitHub is not showing the diff correctly.

@aecio
Copy link
Member

aecio commented Jun 3, 2022

Just noticed this is due to commit c97fde2 which changes the line separators. Do you happen to know what were the line separators before and after the commit?

@JuliusHenke
Copy link
Contributor Author

Just noticed this is due to commit c97fde2 which changes the line separators. Do you happen to know what were the line separators before and after the commit?

Before the commit they were CRLF and after it LF. I changed it, since most of the other code I looked at is formatted with LF (Unix and macOS) and my IDE showed a warning. Should I revert the c97fde2 commit or do the squashing?

@aecio aecio merged commit 97f6585 into VIDA-NYU:master Jun 5, 2022
@aecio
Copy link
Member

aecio commented Jun 5, 2022

No need for change. GitHub can squash all commits during the merge. Thanks for the PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants