Implement a simple search feature #8

Open
GoogleCodeExporter opened this issue Apr 26, 2015 · 7 comments
Comments

@GoogleCodeExporter

There is existing code for a simple search feature in the search.java file. 
However, it is not yet connected to the Search menu item, and I'm not yet sure 
if the code works.

Let's discuss the status of the code, and any technical difficulties such as 
the Saint ID problems that Aleks mentioned by email.

Original issue reported on code.google.com by ps008v...@gmail.com on 6 Feb 2015 at 5:05

@lemtom
Contributor

lemtom commented Dec 24, 2020

I'm currently working on this.
[image]
I've changed the search results from text to tabular data. Clicking on a row opens the corresponding commemoration in a new window.

Are there any specific features that should be added?

@mamyt
Collaborator

mamyt commented Dec 28, 2020 via email

@lemtom
Contributor

lemtom commented Dec 28, 2020

I tested with French, and it seems to handle both versions of é fine.

I've implemented a checkbox that strips the accents from both the search term and the saint name. So far I've been testing in French, since that's a language I actually know. With the checkbox unchecked, the search term "Melece" doesn't return "St. Mélèce"; with it checked, it does. I've also added a similar checkbox to ignore capitalization.
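For readers following along, here is a minimal sketch of how the two checkboxes could behave, assuming a plain substring match on the name. The class and method names (`AccentInsensitiveSearchSketch`, `stripDiacritics`, `matches`) are hypothetical and not taken from the project; only `java.text.Normalizer` is mentioned in this thread.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper; names are illustrative and not from the project code.
final class AccentInsensitiveSearchSketch {

    // Decompose to NFD so a precomposed "é" becomes "e" + U+0301,
    // then drop every combining mark that is left.
    static String stripDiacritics(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Assumption: a result matches when the name contains the search term.
    static boolean matches(String term, String name,
                           boolean ignoreDiacritics, boolean ignoreCapitalization) {
        if (ignoreDiacritics) {
            term = stripDiacritics(term);
            name = stripDiacritics(name);
        }
        if (ignoreCapitalization) {
            term = term.toLowerCase(Locale.ROOT);
            name = name.toLowerCase(Locale.ROOT);
        }
        return name.contains(term);
    }

    public static void main(String[] args) {
        String name = "St. M\u00e9l\u00e8ce"; // "St. Mélèce" with precomposed accents
        System.out.println(matches("Melece", name, true, false));
        System.out.println(matches("Melece", name, false, false));
        System.out.println(matches("melece", name, true, true));
    }
}
```

Because the text is decomposed before the marks are removed, precomposed and decomposed spellings of é end up as the same plain "e" when the checkbox is on.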

The library I'm using (java.text.Normalizer) can probably normalize Church Slavonic to some degree, but I'll probably have to find a way to handle the abbreviations (hardcoding them per your list, I guess) and the spelling differences related to word formation.

I'm fairly sure the normalization I've implemented so far can handle diacritical marks in Greek, though I'll have to find some examples to be certain.

[image]

@mamyt
Collaborator

mamyt commented Dec 29, 2020 via email

@lemtom
Contributor

lemtom commented Dec 29, 2020

To easily test the cases you give me, I think I'm gonna extract some of the methods I've written to a utility class and write tests for them. I'll probably try to write tests for some of the existing classes as well later on.

I could use help with the Church Slavonic as well, since I can't even read Cyrillic. (At first I read the і in your equivalent sets as the Latin i and was looking into romanization; I know better now.) Do you know a good source for all the equivalent sets?

I'll implement normalization under the "strip diacritical marks" checkbox in languages that require it, and then the translation strings can be different to indicate it.

Currently I'm only searching for the name, but I can easily add a checkbox to search the getLife() as well.

@mamyt
Collaborator

mamyt commented Dec 30, 2020 via email

@lemtom
Contributor

lemtom commented Dec 30, 2020

The e-mail on my website should work. My spam filter seems a bit overzealous (it caught e-mails from someone from a different project), so it might be prudent to reply here once you've mailed me, so I know when to check.

Searching through the life is now implemented:
[image]

I currently have these test cases, based on your comments and my own test in French:

//First boolean is ignoreDiacritics and the second is ignoreCapitalization
	@Disabled
	@Test
	void chineseCases(){
		assertTrue(searchName("格奥尔吉", "格奧爾吉", "lang", true, false));
	}

	@Test
	void greekCases(){
		assertTrue(searchName("άγιος", "ἅγιος", "gr", true, false));
		assertFalse(searchName("άγιος", "ἅγιος", "gr", false, false));
	}
	
	@Test
	void slavonicCases(){
		assertTrue(searchName("ѻ҆тє́цъ", "пра́ѻтецъ", "cu", true, false));
		assertFalse(searchName("ѻ҆тє́цъ", "пра́ѻтецъ", "cu", false, false));
	}
	
	@Test
	void frenchCases(){
		assertTrue(searchName("melece", "Mélèce", "fr", true, true));
		assertFalse(searchName("melece", "Mélèce", "fr", true, false));
		assertFalse(searchName("melece", "Mélèce", "fr", false, true));
	}

I've had to expand the scope of the characters I'm stripping to catch the "COMBINING CYRILLIC PSILI PNEUMATA", but it's caught now.
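The comment does not say which pattern was used before or after the change, so the following is only a sketch of one plausible reading. It assumes the narrower pattern was the Unicode block `\p{InCombiningDiacriticalMarks}` (U+0300 to U+036F) and the wider one the general category `\p{M}`. U+0486 COMBINING CYRILLIC PSILI PNEUMATA sits in the Cyrillic block, so only the wider pattern removes it. The class and method names are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical comparison of two stripping scopes; not the project's actual code.
final class MarkStrippingScopeSketch {

    // Narrow scope: only marks in the Combining Diacritical Marks block (U+0300..U+036F).
    static String stripBlockOnly(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    // Wide scope: every character whose general category is Mark (Mn, Mc, Me),
    // which includes the Cyrillic combining marks such as U+0486.
    static String stripAllMarks(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // Round omega, U+0486 psili pneumata, te, Ukrainian ie, U+0301 acute, tse, hard sign
        String word = "\u047b\u0486\u0442\u0454\u0301\u0446\u044a";
        System.out.println(stripBlockOnly(word).length()); // U+0486 survives here
        System.out.println(stripAllMarks(word).length());  // both marks are gone here
    }
}
```

The wider category is the simpler choice when several scripts are searched, at the cost of also removing marks that may carry meaning in some languages.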

As expected, there's no easy way to switch from traditional to simplified Chinese and vice versa. There's a library that might handle this, but that seems a bit excessive for such a minor feature (and its documentation is in Chinese).
