Fix author formatter for unchanged names #6552

Merged on Jun 16, 2020 (117 commits)
Changes from 87 commits
Commits
fd405cf
Fix Pattern.compile for frequently used regexes
k3KAW8Pnf7mkmdSMPHz27 May 13, 2020
a6354e3
Fix one additional Pattern.compile
k3KAW8Pnf7mkmdSMPHz27 May 13, 2020
149ed4f
Fix style and unnecessary escape sequences
k3KAW8Pnf7mkmdSMPHz27 May 13, 2020
b57f1b2
Fix invalid index in call to substring
k3KAW8Pnf7mkmdSMPHz27 May 13, 2020
fae093b
Refactor name and javadoc of a regex
k3KAW8Pnf7mkmdSMPHz27 May 13, 2020
5a23a9a
Fix use of compiled regex for matching department
k3KAW8Pnf7mkmdSMPHz27 May 13, 2020
6af8c7e
Fix check for uppercase letter
k3KAW8Pnf7mkmdSMPHz27 May 13, 2020
716f885
Fix usage of uncompiled regex
k3KAW8Pnf7mkmdSMPHz27 May 13, 2020
cdfd56a
Fix readability?
k3KAW8Pnf7mkmdSMPHz27 May 13, 2020
b227edb
Add test cases
k3KAW8Pnf7mkmdSMPHz27 May 13, 2020
ef7f979
Fix `null` appearing as part of author name
k3KAW8Pnf7mkmdSMPHz27 May 13, 2020
9ac3993
Refactor name of capital regex pattern
k3KAW8Pnf7mkmdSMPHz27 May 14, 2020
85c96ce
Merge branch 'master' into fix-for-issue-6459
k3KAW8Pnf7mkmdSMPHz27 May 15, 2020
6ded410
Add debug output for reordering of names in fields
k3KAW8Pnf7mkmdSMPHz27 May 15, 2020
e8c3007
Merge branch 'master' into fix-for-issue-6459
k3KAW8Pnf7mkmdSMPHz27 May 18, 2020
72eb1fe
Add helper methods
k3KAW8Pnf7mkmdSMPHz27 May 18, 2020
2eee8dd
Fix missing negation in "uni" matching
k3KAW8Pnf7mkmdSMPHz27 May 18, 2020
cc54029
Fix test cases for corporate authors
k3KAW8Pnf7mkmdSMPHz27 May 18, 2020
1063ae1
Fix to keep all uppercase letters in abbreviation
k3KAW8Pnf7mkmdSMPHz27 May 19, 2020
ef94758
Fix commented out code
k3KAW8Pnf7mkmdSMPHz27 May 19, 2020
e80bd8b
Fix key for institution's name containing keyword
k3KAW8Pnf7mkmdSMPHz27 May 19, 2020
a0ed455
Fix test case for short institution name
k3KAW8Pnf7mkmdSMPHz27 May 19, 2020
4db0824
Refactor check for institution types
k3KAW8Pnf7mkmdSMPHz27 May 19, 2020
383fc14
Refactor comments and names improving readability?
k3KAW8Pnf7mkmdSMPHz27 May 19, 2020
c3e5f09
Refactor to improve readability and closure
k3KAW8Pnf7mkmdSMPHz27 May 19, 2020
5990c2a
Fix JavaDoc
k3KAW8Pnf7mkmdSMPHz27 May 19, 2020
0df3cdb
Fix JavaDoc typos
k3KAW8Pnf7mkmdSMPHz27 May 19, 2020
f0d9601
Fix preliminary order for authors -> latexfree
k3KAW8Pnf7mkmdSMPHz27 May 19, 2020
6f26a73
Drop logger
k3KAW8Pnf7mkmdSMPHz27 May 19, 2020
9b717c5
Add convenience methods for cached latexfree names
k3KAW8Pnf7mkmdSMPHz27 May 20, 2020
5ed9c54
Add name format method for names containing latex
k3KAW8Pnf7mkmdSMPHz27 May 20, 2020
d6e9e70
Add call to formatNameLatexFree
k3KAW8Pnf7mkmdSMPHz27 May 20, 2020
2dc664d
Fix unclear statement in JavaDoc
k3KAW8Pnf7mkmdSMPHz27 May 20, 2020
f0fd4f1
Fix to only keep the first character of each word
k3KAW8Pnf7mkmdSMPHz27 May 20, 2020
feebf81
Add latexfree Natbib test cases
k3KAW8Pnf7mkmdSMPHz27 May 21, 2020
d4c2ce3
Fix typo in latex-free test cases
k3KAW8Pnf7mkmdSMPHz27 May 21, 2020
04abe5e
Add Natbib test with escaped brackets
k3KAW8Pnf7mkmdSMPHz27 May 22, 2020
3f3ef62
Add Natbib institution test with escaped brackets
k3KAW8Pnf7mkmdSMPHz27 May 22, 2020
f4fbec1
Add test for latex-free comma separated lastnames
k3KAW8Pnf7mkmdSMPHz27 May 22, 2020
a256fa8
Add test for latex-free comma separated first name
k3KAW8Pnf7mkmdSMPHz27 May 22, 2020
4961c53
Add test for latex-free comma separated last name
k3KAW8Pnf7mkmdSMPHz27 May 22, 2020
a443db0
Fix adherence to JavaDoc and readability(?)
k3KAW8Pnf7mkmdSMPHz27 May 22, 2020
dd492e6
Fix readability(?)
k3KAW8Pnf7mkmdSMPHz27 May 22, 2020
71e45d4
Fix CheckStyle issues
k3KAW8Pnf7mkmdSMPHz27 May 22, 2020
2748dcd
Merge branch 'master' into fix-for-issue-6459
k3KAW8Pnf7mkmdSMPHz27 May 22, 2020
3b0dda3
Fix CHANGELOG.md
k3KAW8Pnf7mkmdSMPHz27 May 22, 2020
78b66f7
Fix mistake in BibtexKeyGeneratorTest
k3KAW8Pnf7mkmdSMPHz27 May 22, 2020
55ef8e7
Add test for oxford comma
k3KAW8Pnf7mkmdSMPHz27 May 24, 2020
237dc35
Merge branch 'master' into fix-for-issue-6459
k3KAW8Pnf7mkmdSMPHz27 May 26, 2020
78fade6
Fix miss-capitalization of enum
k3KAW8Pnf7mkmdSMPHz27 May 26, 2020
3de984d
Fix fields not displayed latex-free
k3KAW8Pnf7mkmdSMPHz27 May 26, 2020
ecac673
Fix in-line methods in MainTableNameFormatter
k3KAW8Pnf7mkmdSMPHz27 May 26, 2020
1c4928b
Fix in-line of generateKey() method
k3KAW8Pnf7mkmdSMPHz27 May 26, 2020
55f3f18
Fix separating tests into parsing/representation
k3KAW8Pnf7mkmdSMPHz27 May 26, 2020
e6cda69
Fix cache check and simplify expressions
k3KAW8Pnf7mkmdSMPHz27 May 26, 2020
2e120fd
Drop inlined methods
k3KAW8Pnf7mkmdSMPHz27 May 26, 2020
8cc947c
Fix most abbreviated abbreviations
k3KAW8Pnf7mkmdSMPHz27 May 26, 2020
2fc9e16
Drop old formatName method
k3KAW8Pnf7mkmdSMPHz27 May 27, 2020
b4b3993
Refactor formatNameLatexFree
k3KAW8Pnf7mkmdSMPHz27 May 27, 2020
3cb6232
Refactor new parse tests
k3KAW8Pnf7mkmdSMPHz27 May 27, 2020
b3f0d1b
Add more parse tests
k3KAW8Pnf7mkmdSMPHz27 May 27, 2020
5a27bbc
Drop all test cases containing escaped brackets
k3KAW8Pnf7mkmdSMPHz27 May 27, 2020
c7578b3
Refactor parse with latex tests
k3KAW8Pnf7mkmdSMPHz27 May 27, 2020
b8bf4f3
Fix my own spelling mistakes
k3KAW8Pnf7mkmdSMPHz27 May 27, 2020
cc23e29
Refactor abbreviation name
k3KAW8Pnf7mkmdSMPHz27 May 28, 2020
24398b4
Add latex-free unformatted authors' strings
k3KAW8Pnf7mkmdSMPHz27 May 29, 2020
5308717
Add test for latex-free unformatted authors
k3KAW8Pnf7mkmdSMPHz27 May 29, 2020
b2b523c
Merge branch 'master' into fix-for-issue-6459
k3KAW8Pnf7mkmdSMPHz27 May 29, 2020
94a46e2
Add change to CHANGELOG.md
k3KAW8Pnf7mkmdSMPHz27 May 29, 2020
539c616
Revert "Add change to CHANGELOG.md"
k3KAW8Pnf7mkmdSMPHz27 May 29, 2020
476b5b8
Fix dependence on parse in test cases
k3KAW8Pnf7mkmdSMPHz27 Jun 1, 2020
2642cb9
Add AuthorList cache tests
k3KAW8Pnf7mkmdSMPHz27 Jun 1, 2020
4ecdbf8
Add AuthorList institution cache test
k3KAW8Pnf7mkmdSMPHz27 Jun 1, 2020
b18e60d
Fix readability in a test case
k3KAW8Pnf7mkmdSMPHz27 Jun 1, 2020
7e840bc
Fix readability and memory leak
k3KAW8Pnf7mkmdSMPHz27 Jun 1, 2020
2bfbd71
Merge branch 'master' into fix-for-issue-6459
k3KAW8Pnf7mkmdSMPHz27 Jun 1, 2020
a4cb1ef
Fix int flag, changed to enum
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
ac8759b
Fix readability of TEX_NAMES set?
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
cea4c5e
Fix JavaDoc after flag changed from int to enum
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
03a632c
Fix AuthorList.equals
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
e22a098
Fix abbreviation in JavaDoc
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
0fd46d7
Refactor name of tests for AuthorListParser
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
d50c7cf
Fix re-enable commented out tests
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
4b389a6
Add test for first name starting with umlaut
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
cd68f6c
Fix update JavaDoc
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
f3340b2
Fix unique names in test cases
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
10fff4c
Fix a datastructure in AuthorListParser
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
b5caec8
Fix typos
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
fbe6501
Add missed oxford comma test cases
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
7c4faa3
Add test cases for equals
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
e028993
Add more equals test cases
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
a78460f
Fix typo
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
4abe855
Add tests for `hashCode`
k3KAW8Pnf7mkmdSMPHz27 Jun 3, 2020
551c6e2
Fix removes unnecessary call to `Map.contains`
k3KAW8Pnf7mkmdSMPHz27 Jun 4, 2020
557ffc9
Drop caching of unformatted strings in AuthorList
k3KAW8Pnf7mkmdSMPHz27 Jun 4, 2020
581446b
Fix caching of preferences
k3KAW8Pnf7mkmdSMPHz27 Jun 8, 2020
0020c21
Refactor static import of format preferences enums
k3KAW8Pnf7mkmdSMPHz27 Jun 9, 2020
81c0018
Revert "Drop caching of unformatted strings in..."
k3KAW8Pnf7mkmdSMPHz27 Jun 9, 2020
26cf448
Fix for getting unformatted latex-free names
k3KAW8Pnf7mkmdSMPHz27 Jun 9, 2020
0942e2c
Refactor the fields loop
k3KAW8Pnf7mkmdSMPHz27 Jun 9, 2020
3e562d8
Fix typo in formatFieldValueLatexFree
k3KAW8Pnf7mkmdSMPHz27 Jun 9, 2020
7616fbe
Fix reference of bibDatabaseContext in Formatter
k3KAW8Pnf7mkmdSMPHz27 Jun 10, 2020
4c457f3
Fix access modifier
k3KAW8Pnf7mkmdSMPHz27 Jun 12, 2020
d183705
Fix that removes "LatexFree" when it is implied
k3KAW8Pnf7mkmdSMPHz27 Jun 16, 2020
431811a
Fix that restores class variable `entriesFiltered`
k3KAW8Pnf7mkmdSMPHz27 Jun 16, 2020
8ad3aaf
Fix handling of DisplayStyle.AS_IS
k3KAW8Pnf7mkmdSMPHz27 Jun 16, 2020
a81f1e9
Drop AuthorList caching of unformatted string
k3KAW8Pnf7mkmdSMPHz27 Jun 16, 2020
c55b1b6
Merge branch 'fix-for-issue-6459' of https://github.com/k3KAW8Pnf7mkm…
k3KAW8Pnf7mkmdSMPHz27 Jun 16, 2020
c8380a8
Merge branch 'master' into fix-for-issue-6459
k3KAW8Pnf7mkmdSMPHz27 Jun 16, 2020
6155e14
Add `MainTableFieldValueFormatter` to checkstyle
k3KAW8Pnf7mkmdSMPHz27 Jun 16, 2020
1eeddab
Add MainTableFieldValueFormatter to exclusion list
koppor Jun 16, 2020
2ff9553
Fix JavaDoc for `Author.equals`
k3KAW8Pnf7mkmdSMPHz27 Jun 16, 2020
2396fb2
Revert "Add `MainTableFieldValueFormatter` to ..."
k3KAW8Pnf7mkmdSMPHz27 Jun 16, 2020
b16d457
Merge branch 'fix-for-issue-6459' of https://github.com/k3KAW8Pnf7mkm…
k3KAW8Pnf7mkmdSMPHz27 Jun 16, 2020
a9e4f72
Fix JavaDoc description of AuthorList.equals
k3KAW8Pnf7mkmdSMPHz27 Jun 16, 2020
d7d692e
Merge branch 'fix-for-issue-6459' of https://github.com/k3KAW8Pnf7mkm…
k3KAW8Pnf7mkmdSMPHz27 Jun 16, 2020
0c0e777
Fix JavaDoc adherence with Author.equals
k3KAW8Pnf7mkmdSMPHz27 Jun 16, 2020
@@ -37,7 +37,7 @@ public String formatNameLatexFree(final String nameToFormat) {

         switch (nameFormatPreferences.getDisplayStyle()) {
             case AS_IS:
-                return nameToFormat;
+                return authors.getAsUnformattedLatexFree();
             case NATBIB:
                 return authors.getAsNatbibLatexFree();
             case FIRSTNAME_LASTNAME:
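The one-line change in the `AS_IS` branch can be illustrated with a small sketch. It assumes that a method named `...LatexFree` should never hand back raw LaTeX, so returning the input untouched was the bug. `DisplayStyleSketch`, `asIsBefore`, `asIsAfter` and `stripBraces` are hypothetical names, and the brace stripper is a toy stand-in for whatever conversion JabRef actually performs.

```java
// Hypothetical illustration only; none of these names come from JabRef.
class DisplayStyleSketch {

    // Toy LaTeX remover (assumption): drops grouping braces and nothing else.
    static String stripBraces(String s) {
        return s.replace("{", "").replace("}", "");
    }

    // Behaviour before the change: the raw field value is returned unchanged.
    static String asIsBefore(String nameToFormat) {
        return nameToFormat;
    }

    // Behaviour after the change: name order is kept, LaTeX markup is removed.
    static String asIsAfter(String nameToFormat) {
        return stripBraces(nameToFormat);
    }

    public static void main(String[] args) {
        System.out.println(asIsBefore("{NASA} and Doe, Jane")); // braces still present
        System.out.println(asIsAfter("{NASA} and Doe, Jane"));  // braces removed
    }
}
```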
114 changes: 26 additions & 88 deletions src/main/java/org/jabref/model/entry/AuthorList.java
@@ -1,12 +1,8 @@
 package org.jabref.model.entry;

 import java.util.ArrayList;
-import java.util.Arrays;
-import java.util.Collection;
 import java.util.Collections;
-import java.util.HashSet;
 import java.util.List;
-import java.util.Locale;
 import java.util.Objects;
 import java.util.WeakHashMap;
 import java.util.stream.Collectors;
@@ -123,8 +119,6 @@
 public class AuthorList {

     private static final WeakHashMap<String, AuthorList> AUTHOR_CACHE = new WeakHashMap<>();
-    // Avoid partition where these values are contained
-    private final static Collection<String> AVOID_TERMS_IN_LOWER_CASE = Arrays.asList("jr", "sr", "jnr", "snr", "von", "zu", "van", "der");
     private final List<Author> authors;
     private final String[] authorsFirstFirst = new String[4];
     private final String[] authorsFirstFirstLatexFree = new String[4];
@@ -139,6 +133,8 @@ public class AuthorList {
     private String authorsFirstFirstAnds;
     private String authorsAlph;
     private String authorsNatbibLatexFree;
+    private String authorsUnformatted;
+    private String authorsUnformattedLatexFree;

     /**
      * Creates a new list of authors.
@@ -157,7 +153,7 @@ protected AuthorList(Author author) {
     }

     public AuthorList() {
-        this(new ArrayList<Author>());
+        this(new ArrayList<>());
     }

     /**
@@ -168,60 +164,16 @@ public AuthorList() {
      * @param authors The string of authors or editors in bibtex format to parse.
      * @return An AuthorList object representing the given authors.
      */
-    public static AuthorList parse(String authors) {
+    public static AuthorList parse(final String authors) {
         Objects.requireNonNull(authors);

-        // Handle case names in order lastname, firstname and separated by ","
-        // E.g., Ali Babar, M., Dingsøyr, T., Lago, P., van der Vliet, H.
-        final boolean authorsContainAND = authors.toUpperCase(Locale.ENGLISH).contains(" AND ");
-        final boolean authorsContainOpeningBrace = authors.contains("{");
-        final boolean authorsContainSemicolon = authors.contains(";");
-        final boolean authorsContainTwoOrMoreCommas = (authors.length() - authors.replace(",", "").length()) >= 2;
-        if (!authorsContainAND && !authorsContainOpeningBrace && !authorsContainSemicolon && authorsContainTwoOrMoreCommas) {
-            List<String> arrayNameList = Arrays.asList(authors.split(","));
-
-            // Delete spaces for correct case identification
-            arrayNameList.replaceAll(String::trim);
-
-            // Looking for space between pre- and lastname
-            boolean spaceInAllParts = arrayNameList.stream().filter(name -> name.contains(" ")).collect(Collectors
-                    .toList()).size() == arrayNameList.size();
-
-            // We hit the comma name separator case
-            // Usually the getAsLastFirstNamesWithAnd method would separate them if pre- and lastname are separated with "and"
-            // If not, we check if spaces separate pre- and lastname
-            if (spaceInAllParts) {
-                authors = authors.replaceAll(",", " and");
-            } else {
-                // Looking for name affixes to avoid
-                // arrayNameList needs to reduce by the count off avoiding terms
-                // valuePartsCount holds the count of name parts without the avoided terms
-
-                int valuePartsCount = arrayNameList.size();
-                // Holds the index of each term which needs to be avoided
-                Collection<Integer> avoidIndex = new HashSet<>();
-
-                for (int i = 0; i < arrayNameList.size(); i++) {
-                    if (AVOID_TERMS_IN_LOWER_CASE.contains(arrayNameList.get(i).toLowerCase(Locale.ROOT))) {
-                        avoidIndex.add(i);
-                        valuePartsCount--;
-                    }
-                }
-
-                if ((valuePartsCount % 2) == 0) {
-                    // We hit the described special case with name affix like Jr
-                    authors = buildWithAffix(avoidIndex, arrayNameList).toString();
-                }
-            }
-        }
-
-        AuthorList authorList = AUTHOR_CACHE.get(authors);
-        if (authorList == null) {
+        if (!AUTHOR_CACHE.containsKey(authors)) {
             AuthorListParser parser = new AuthorListParser();
-            authorList = parser.parse(authors);
+            AuthorList authorList = parser.parse(authors);
+            authorList.authorsUnformatted = new String(authors);
             AUTHOR_CACHE.put(authors, authorList);
         }
-        return authorList;
+        return AUTHOR_CACHE.get(authors);
     }

     /**
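The caching in `parse` can be sketched in isolation. A `WeakHashMap` holds its keys weakly but its values strongly, so a cached value that keeps a reference to its own key would stop the entry from ever being collected. That is presumably why the diff stores `new String(authors)` rather than `authors` itself, though the source does not state the reason. The sketch below uses hypothetical names (`AuthorCacheSketch`, `Parsed`) and a single `get` instead of `containsKey` followed by `get`; it is not JabRef's implementation.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical sketch of a weak-keyed parse cache; not JabRef code.
class AuthorCacheSketch {

    /** Stand-in for a parsed author list; it only remembers the raw input. */
    static final class Parsed {
        final String unformatted;

        Parsed(String unformatted) {
            this.unformatted = unformatted;
        }
    }

    // Keys are weakly referenced, values are strongly referenced.
    private static final Map<String, Parsed> CACHE = new WeakHashMap<>();

    static Parsed parse(String authors) {
        Parsed cached = CACHE.get(authors);
        if (cached == null) {
            // Store a copy of the key: if the value referenced the key object
            // itself, the map could never drop this entry.
            cached = new Parsed(new String(authors));
            CACHE.put(authors, cached);
        }
        return cached;
    }

    public static void main(String[] args) {
        String key = new String("Knuth, Donald E.");
        Parsed first = parse(key);
        Parsed second = parse(key);
        System.out.println(first == second);               // same cached instance
        System.out.println(first.unformatted.equals(key)); // same text
        System.out.println(first.unformatted != key);      // but a distinct String object
    }
}
```

A plain `WeakHashMap` is not thread-safe, so a real cache shared between threads would need external synchronization.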
@@ -296,37 +248,6 @@ public static String fixAuthorNatbib(String authors) {
         return AuthorList.parse(authors).getAsNatbib();
     }

-    /**
-     * Builds a new array of strings with stringbuilder.
-     * Regarding to the name affixes.
-     *
-     * @return New string with correct seperation
-     */
-    private static StringBuilder buildWithAffix(Collection<Integer> indexArray, List<String> nameList) {
-        StringBuilder stringBuilder = new StringBuilder();
-        // avoidedTimes needs to be increased by the count of avoided terms for correct odd/even calculation
-        int avoidedTimes = 0;
-        for (int i = 0; i < nameList.size(); i++) {
-            if (indexArray.contains(i)) {
-                // We hit a name affix
-                stringBuilder.append(nameList.get(i));
-                stringBuilder.append(',');
-                avoidedTimes++;
-            } else {
-                stringBuilder.append(nameList.get(i));
-                if (((i + avoidedTimes) % 2) == 0) {
-                    // Hit separation between last name and firstname --> comma has to be kept
-                    stringBuilder.append(',');
-                } else {
-                    // Hit separation between full names (e.g., Ali Babar, M. and Dingsøyr, T.) --> semicolon has to be used
-                    // Will be treated correctly by AuthorList.parse(authors);
-                    stringBuilder.append(';');
-                }
-            }
-        }
-        return stringBuilder;
-    }
-
     /**
      * Returns the number of author names in this object.
      *
@@ -644,13 +565,30 @@ public String getAsFirstLastNamesLatexFree(boolean abbreviate, boolean oxfordCom
         return authorsFirstFirstLatexFree[abbreviationIndex];
     }

+    public String getAsUnformattedLatexFree() {
+        if (authorsUnformattedLatexFree != null) {
+            return authorsUnformattedLatexFree;
+        }
+
+        // The AuthorList was not created with the factory method
+        if (authorsUnformatted == null) {
+            authorsUnformatted = getAsFirstLastNamesWithAnd();
+        }
+        authorsUnformattedLatexFree = LatexToUnicodeAdapter.format(authorsUnformatted);
+        return authorsUnformattedLatexFree;
+    }
+
     /**
      * Compare this object with the given one.
      * <p>
-     * Will return true iff the other object is an Author and all fields are identical on a string comparison.
+     * @return `true` iff the other object is an AuthorList, all Authors are in the same order, and all Authors field
+     * are equal on a string comparison.
      */
     @Override
     public boolean equals(Object o) {
+        if (this == o) {
+            return true;
+        }
         if (!(o instanceof AuthorList)) {
             return false;
         }
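The `getAsUnformattedLatexFree` getter added in this hunk follows a lazy-caching pattern: convert on first request, remember the result, and fall back to a rebuilt string when no raw input was stored. A minimal sketch of that pattern follows. `LazyLatexFreeSketch`, `toLatexFree` and the `fallback` field are hypothetical, and the brace stripper is a toy stand-in for `LatexToUnicodeAdapter.format`, whose behaviour is not reproduced here.

```java
// Hypothetical sketch of a lazily computed, cached getter; not JabRef code.
class LazyLatexFreeSketch {
    private String unformatted;           // null when no raw string was stored
    private String unformattedLatexFree;  // filled in on first request
    private final String fallback;        // stands in for a string rebuilt from parsed authors
    int conversions = 0;                  // counts conversions, for demonstration only

    LazyLatexFreeSketch(String unformatted, String fallback) {
        this.unformatted = unformatted;
        this.fallback = fallback;
    }

    // Toy converter (assumption): removes grouping braces only.
    private String toLatexFree(String s) {
        conversions++;
        return s.replace("{", "").replace("}", "");
    }

    String getAsUnformattedLatexFree() {
        if (unformattedLatexFree != null) {
            return unformattedLatexFree;  // cached result, no new conversion
        }
        if (unformatted == null) {
            unformatted = fallback;       // object was not built from a raw string
        }
        unformattedLatexFree = toLatexFree(unformatted);
        return unformattedLatexFree;
    }

    public static void main(String[] args) {
        LazyLatexFreeSketch list = new LazyLatexFreeSketch("{NASA} and Doe, Jane", "unused");
        System.out.println(list.getAsUnformattedLatexFree());
        System.out.println(list.getAsUnformattedLatexFree());
        System.out.println(list.conversions); // the conversion ran once
    }
}
```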